Things Micha Is Interested In
As an introduction to my research program, I will talk about two (currently disjoint) research questions I plan to work on in the near future. First, I will discuss my recent work on phonetic and lexical acquisition. During early language acquisition, infants must learn both a lexicon and a model of phonetics that explains how lexical items can vary in pronunciation; for instance, "you" might be realized as "you" with a full vowel or reduced to "yeh" with a schwa. Previous models of acquisition have generally tackled these problems in isolation, yet behavioral evidence suggests that infants acquire lexical and phonetic knowledge simultaneously. I will present ongoing research on constructing a Bayesian model that can simultaneously group together phonetic variants of the same lexical item, learn a probabilistic language model predicting the next word in an utterance from its context, and learn a model of pronunciation variability based on articulatory features. Second, I will present some recent work on the plot structure of novels. Here, I attempt to describe plot structure at a high level in terms of relationships between characters. I construct a similarity function between novels in terms of character frequency and emotion over time. Given a corpus of 19th-century novels as training data, the system can accurately distinguish held-out novels in their original form from artificially disordered or reversed surrogates. I will suggest plans for extending this system to better differentiate between types of characters, and discuss potential applications such as summarization.