Research Projects | Department of Linguistics

Research Projects

NSF CompCog: RI: Small: Human-like semantic grammar induction through knowledge distillation from pre-trained language models

NSF #2313140 (9/1/2023–8/31/2026), William Schuler (PI)

The proposed work will develop a broad-coverage semantic grammar induction model that integrates world knowledge into the acquisition process by distilling it from large pre-trained neural language models. This model will be used to evaluate claims about the statistical learnability of grammar.

NSF RI: Small: Comp Cog: Broad-coverage semantic models of human sentence processing

NSF #1816891 (8/15/2018–7/31/2021), William Schuler (PI)

The purpose of this project is to develop a sentence processing model that decodes sentences into meanings using a human-like incremental probabilistic process. This model will then be used to control for frequency effects in neural activation, blood oxygenation and reading time data in order to isolate effects that can be attributed to the mechanical process of constructing and storing complex ideas during language comprehension.

DARPA LORELEI: Cognitively-based unsupervised grammar induction for low-resource languages

DARPA #HR0011-15-2-0022 (6/1/2015–5/31/2019), Lane Schwartz (PI), Timothy Miller, Finale Doshi-Velez, William Schuler

The objective of this research proposal is to develop tractable algorithms for unsupervised grammar induction for any human language, drawing on cognitive models of human language processing.

NSF EAGER: Incremental Semantic Sentence Processing Models

NSF #1551313 (9/1/2015–8/31/2018), William Schuler (PI)

The purpose of this project is to define a complete semantic dependency representation for sentences, including quantifier scope and coreference relationships, even those that cross sentence boundaries. The project will then exploit the graphical nature of these dependency representations by estimating the probability of each analysis as the product of the probabilities of its component dependencies, where each dependency's probability is based on the distributional similarity of its source predicate to the other predicates connected to its destination.
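
As a rough illustration of the product-of-dependencies idea described above, the following minimal Python sketch scores one analysis by multiplying the probabilities of its dependencies (summing logs for numerical stability). Everything here is an assumption made for illustration: the triple format, the function name analysis_log_prob, and the toy probability values are hypothetical and are not taken from the project. In particular, the sketch does not model how each dependency probability would be estimated from distributional similarity.

```python
import math

def analysis_log_prob(dependencies, dep_prob):
    """Log-probability of an analysis, treated as a product of
    independent dependency probabilities (computed as a sum of logs)."""
    return sum(math.log(dep_prob(d)) for d in dependencies)

# Hypothetical toy table: (source predicate, relation, destination predicate)
# mapped to a made-up probability. A real model would estimate these values.
toy_probs = {
    ("eat", "arg1", "dog"): 0.5,
    ("eat", "arg2", "bone"): 0.25,
}

# One analysis is just the list of dependencies it contains.
analysis = list(toy_probs)

log_p = analysis_log_prob(analysis, toy_probs.__getitem__)
prob = math.exp(log_p)  # product of the two toy probabilities
```

Working in log space is a common design choice here, since products of many small probabilities can underflow for longer sentences.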

NSF CAREER/PECASE: Integrating denotational meaning into probabilistic language models

NSF #474906 (5/15/2005–4/30/2011), William Schuler (PI)

The purpose of this project is to develop probabilistic language models, for use in spoken language interfaces to sensor or robotic agents, that efficiently integrate denotational semantic information into syntactic and phonological stages of recognition. This integration is intended to allow information about the meanings or denotations of words in the agent's current environment context to influence the probability estimates it assigns to hypothesized analyses of its input, before any recognition decisions have been made, so that the interfaced agent can favor in its search those analyses that "make sense" in its representation of the current state of the world. Since these models can be trained on the same kinds of examples that may be used to establish word meanings in the agent's lexicon, it is expected that they will be easier to adapt to changing domains than those relying exclusively on word co-occurrence statistics in fixed corpora.