200 Oxley Hall
1712 Neil Ave.
Columbus, OH 43210
Areas of Expertise
- Computational Linguistics
- Natural Language Generation (NLG)
Education
- B.A. English, University of Kentucky, 2013
- B.A. Linguistics, minor in Spanish, University of Kentucky, 2013
My research in Natural Language Processing and Generation (NLP, NLG) centers on the responsible application and advancement of large language models (LLMs), particularly in dialogue settings. I work on key challenges such as mitigating confabulations (or hallucinations) and toxic outputs, with an emphasis on ethical and accurate AI communication. Drawing on my background in linguistic and pragmatic frameworks, particularly the Rational Speech Act (RSA) approach, I aim to improve generation systems by modeling iterative reasoning between speaker and listener, both to verify model outputs and to give models pragmatic nuance. I also work on knowledge distillation, using LLMs to generate training data for smaller, more efficient models and thereby reducing computational overhead. Throughout, my work is committed to fostering responsible AI development, improving accessibility for non-expert users, and ensuring the trustworthy evolution of AI technologies.
Virtual Museum Tour Guide
I'm working on creating a virtual, interactive avatar that can act as a tour guide for the Language Pod at COSI. This project began as an offshoot of the Virtual Patient Project, but has since been revamped into a document-grounded conversational agent that can respond to user questions dynamically and with contextual awareness. I work primarily on the response generation model, where I have been focusing on leveraging the benefits of LLMs (like ChatGPT) while mitigating the risks of confabulations/hallucinations and toxic outputs, particularly through knowledge distillation. The initial stages of this work are described in this paper, which I presented at the Taming LLMs Workshop in Prague, Czech Republic.
Interactive Semantic Parsing
Currently I am working on an interactive semantic parsing system, the goal of which is to make the information stored in databases (DBs) and knowledge bases (KBs) more accessible to novice users. Because this information is often accessed using query languages such as SQL and SPARQL, one must be well-versed in these languages in order to query the DB or KB. Automatic semantic parsers aim to act as translators between natural language and a query language; however, current state-of-the-art systems still have far-from-perfect accuracy. Interactive semantic parsing systems therefore take a human-in-the-loop approach, allowing users to correct incorrect parses. Our system conducts this process in the form of a dialogue, in which the AI agent takes a natural-language query from the user, produces a parse (logical form), decomposes that parse into pieces, translates them into natural language as a sequence of sub-questions, and shows these to the user, who can give corrections. This work, which I presented at ACL 2022 in Dublin, can be found here. A follow-up work, presented by my co-author at InterNLP, can be found here.
In terms of NLG, I have been focused on improving the question generation model in the above system, incorporating Rational Speech Act (RSA) methods to improve the accuracy and clarity of generated questions. I model both a listener and a speaker, which iteratively reason about one another's behavior to select the best utterance. At this stage, it appears that this approach is effective at reducing hallucinations (where the model outputs "extra" information that does not appear in the input) and omissions (where the model output fails to mention key information).
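To make the speaker/listener reasoning concrete, here is a minimal sketch of the textbook RSA recursion on a toy example. It is not the model used in the system described above: the logical forms, candidate questions, compatibility table, and the function names (literal_listener, pragmatic_speaker) are all hypothetical and chosen only for illustration, and a uniform prior over meanings is assumed.

```python
import math

# Toy data (hypothetical): two candidate logical forms and two candidate questions.
M_AGE = "age > 30"
M_AGE_CITY = "age > 30 AND city = 'Columbus'"
U_SHORT = "Do you want people older than 30?"
U_FULL = "Do you want people older than 30 who live in Columbus?"

MEANINGS = [M_AGE, M_AGE_CITY]
UTTERANCES = [U_SHORT, U_FULL]

# Which meanings each question is literally compatible with (assumed by hand).
# U_SHORT omits the city condition, so it fits both meanings.
# U_FULL mentions a city, so using it for M_AGE would add "extra" information.
COMPATIBLE = {
    U_SHORT: {M_AGE, M_AGE_CITY},
    U_FULL: {M_AGE_CITY},
}


def normalize(weights):
    """Turn non-negative weights into a probability distribution."""
    total = sum(weights.values())
    return {k: (v / total if total > 0 else 0.0) for k, v in weights.items()}


def literal_listener(utterance):
    """L0(m | u): uniform over the meanings the utterance is compatible with."""
    return normalize(
        {m: 1.0 if m in COMPATIBLE[utterance] else 0.0 for m in MEANINGS}
    )


def pragmatic_speaker(meaning, alpha=1.0):
    """S1(u | m), proportional to exp(alpha * log L0(m | u)); no utterance cost."""
    scores = {}
    for u in UTTERANCES:
        p = literal_listener(u)[meaning]
        scores[u] = math.exp(alpha * math.log(p)) if p > 0 else 0.0
    return normalize(scores)


# Pick the question the pragmatic speaker prefers for the fuller logical form.
speaker_dist = pragmatic_speaker(M_AGE_CITY)
best_question = max(speaker_dist, key=speaker_dist.get)
print(best_question)
```

In this toy setting the speaker prefers the fully specified question for the two-condition logical form, because the shorter question leaves the literal listener unsure which meaning was intended (an omission). For the one-condition logical form, the longer question gets zero probability, because it mentions a condition that is not in the input (a hallucination).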
Alexa Prize TaskBot Challenge