Author Archives: goodmami

Accumulating dictionaries in Python

I often have a need to count tokens in a corpus. In Python, there are many ways to do this, but currently I most often use defaultdicts:

    d = defaultdict(int)
    for x in sequence:
        d[x] += 1

I would like to get rid of the for-loop and construct such a dictionary at once. I [...]
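As a minimal sketch of one loop-free option (not necessarily the approach the rest of the post describes), the standard library's `collections.Counter` builds the whole count from an iterable in a single call. The sample `sequence` below is made up for illustration.

```python
from collections import Counter

# Hypothetical sample data standing in for a tokenized corpus.
sequence = "the cat saw the other cat".split()

# Counter consumes the whole iterable at once, so no explicit for-loop is needed.
counts = Counter(sequence)

# Like defaultdict(int), looking up an unseen token yields 0 instead of raising
# KeyError, but unlike defaultdict the lookup does not insert the key.
print(counts["the"])
print(counts["dog"])
print(counts.most_common(2))
```

`Counter` is a `dict` subclass, so code that only reads the counts should work with it unchanged.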

Posted in Programming, Software

Hulden and Bischoff, similar implementation

I just came across a paper on non-concatenative morphology with an implementation remarkably similar to my own. This paper by Mans Hulden and Shannon Bischoff uses unification of three different morphotactic constraints (unification, exclusion, coercion) in lexical rules to deal with co-occurrence restrictions and four different operators for modelling morpheme ordering (precedes, immediately precedes, [...]

Posted in phd-updates

Koskenniemi’s Two-Level Morphology

I just read what appears to be Koskenniemi’s most cited article on Two-Level Morphology, and now I am reviewing my understanding before writing a summary in my paper. When Kimmo Koskenniemi worked out two-level morphology, generative phonology via context-sensitive rewrite rules was the common way of describing morphological systems. These kinds of rules were fairly [...]

Posted in phd-updates

Morphology paper: AVM problem fixed, no new text yet

I fixed the problem I was having with avm.sty. It turned out I had some avm environments using an older style of notation which no longer worked with the 1.02 version of avm.sty. The whole document now compiles with xelatex. I also found some articles by Koskenniemi (here and here) and Karttunen, Kaplan, Zaenen (here), [...]

Posted in phd-updates

Morphology paper update

Getting back into the general’s paper on morphology, I found that xelatex failed to compile it with the avm environment (using the 2006 version of avm.sty). So my current goals for the paper are:

- to compile the whole thing (after uncommenting the avms)
- do some reading, then write the background/previous-work section
  - Koskenniemi’s 2-level automata
  - Beesly [...]

Posted in phd-updates

Simple logging in Bash scripts

I couldn’t find much mention of logging utilities for Linux shell scripting (namely Bash), so I wrote my own fairly quickly. I wanted several functions for various levels of logging (info, debug, warning, errors, etc), and a way to adjust what levels can be displayed. I followed the fairly standard convention of using numeric values [...]

Posted in Linux, Software | 2 Comments

Lightweight Music

I was getting tired of the bloat and memory usage of Rhythmbox, so I was searching for a new music player/manager. After trying Muine and Decibel and not being totally satisfied, I finally (re)found MPD, the Music Player Daemon. Once I figured out how to set it up (not too hard, but more work than [...]

Posted in Linux, Software

Tomboy and note-sharing

Other than as a place to quickly jot down ideas, phone numbers, etc., Tomboy’s most common purpose for me is as a study aid for my coursework. The ability to link the different topics, concepts, and people that I learn about is very useful. Of course, I’m not the only student in my classes, and [...]

Posted in Software

Traditional and Simplified Chinese in LaTeX

I’ve been in the habit of using LaTeX’s CJK environment across a whole document to allow me to insert, for example, Japanese anywhere I like. However, if you want to have more than one language (not covered by the same font) in the same document (such as both traditional and simplified Chinese, Japanese and Korean, [...]

Posted in LaTeX | 7 Comments

glot

What started as an attempt to make a desktop application for CEDICT turned into an ambitious attempt to create an omniformat dictionary database and interface. glot aims to be both a backend for managing and querying dictionaries of any (electronic) format (even those over network protocols like DICT) and also an intelligent interface for querying a massive [...]

Posted in Programming | 2 Comments