
Colloquium fest (Ashley Lewis, Angelica Aviles, Kevin Lilley)

Fri, April 30, 2021
3:55 pm - 5:15 pm
Virtual Zoom meeting

Speaker: Ashley Lewis (QP1)

Title: Identifying Inaccurate Paraphrases Using Cycle Consistency

Abstract: When training a natural language processing/generation model, large amounts of training data are typically essential. However, as models improve, less data is required, and the quality of that reduced data becomes even more important. This project explores strategies for data cleaning, particularly automatic methods for identifying potentially troublesome data points. To this end, I use data currently being collected for training an AI dialogue agent, in which Amazon Mechanical Turk workers rephrase highly structured, templated questions into more natural-sounding language. The project focuses on identifying when a paraphrase does not capture the content of the original template by means of cycle consistency (recovering the input from the output). I will present experiments that test whether the rephrased questions match their original meaning representations (MRs), thus ensuring that content has been preserved, or flag problematic rephrasings for further inspection. Specifically, I fine-tune Hugging Face’s implementation of T5 as a seq2seq model to map rephrased questions back to MRs, which can then be compared to the originals. Further, I obtain the 10-best MR predictions and their likelihood scores, along with the score of the target MR. I hypothesize that if the target MR ranks low on (or is absent from) the 10-best list and/or has a low likelihood score, the paraphrase is likely to be poor. This method could allow very quick identification of bad paraphrases and save many hours of manual data cleaning.
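The flagging rule at the end of the abstract — flag a paraphrase when its target MR is absent from, or ranks or scores poorly on, the model's 10-best list — can be sketched as follows. This is an illustrative sketch only; the function name and thresholds are hypothetical, not taken from the project.

```python
def flag_paraphrase(target_mr, nbest, max_rank=10, min_score=0.5):
    """Decide whether a paraphrase should be flagged for manual review.

    target_mr: the original (gold) meaning representation, as a string.
    nbest: list of (predicted_mr, likelihood_score) pairs, best first,
           e.g. from a fine-tuned seq2seq model's n-best decoding.
    Returns True if the paraphrase looks potentially inaccurate.
    The max_rank and min_score thresholds are illustrative assumptions.
    """
    for rank, (mr, score) in enumerate(nbest, start=1):
        if mr == target_mr:
            # Target MR was recovered: flag only if it ranks or scores poorly.
            return rank > max_rank or score < min_score
    # Target MR absent from the n-best list entirely: always flag.
    return True
```

For example, a paraphrase whose target MR is the top prediction with a high score would pass, while one whose target MR never appears among the predictions would be routed to manual inspection.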


Speaker: Angelica Aviles Bosques (QP1)

Title: Are Emojis Scalar?

Abstract: Texting has become a frequent medium for communicating with each other. In comparison to face-to-face conversation, we use alternate strategies, known as “text talk”, to convey meaning, since facial expressions and gestures cannot be observed in text. I focus on one of these strategies: emojis (😠, 👍, 😉, 😊). I am interested in whether emojis trigger scalar implicatures, similarly to scalar adjectives (e.g., Your QP presentation was good implicates that it was not great) or quantifiers (e.g., Some of my students came to class implicates that they did not all come). This QP reports the results of experimental work determining whether emojis are scalar, a requirement for being able to trigger scalar implicatures. The results show that, overall, participants interpret emojis as scalar. However, participants’ agreement about the ordering on the scale varies: while some emojis are clearly ordered similarly by all participants, others, such as (😤, 😳, 😆), have less clear interpretations.


Speaker: Kevin Lilley (QP2)

Title: Perceptual Cues to Non-Native-Directed Speech 

Abstract: Talkers spontaneously adjust their speech based on the communicative needs of a situation, hyperarticulating to promote listeners’ perception. Hyperarticulate, listener-directed speech styles are together called “clear speech”, and researchers have attempted to map talkers’ adjustments in clear speech onto increased intelligibility of the speech signal. This mapping is made difficult by sources of variation in talkers’ speech, including the listener’s linguistic background and the elicitation instructions given to talkers in the lab. Production studies have previously demonstrated acoustic differences between native-directed and non-native-directed speech, and between speech directed toward imagined versus real listeners. Given these acoustic differences in the production of clear speech, the present study investigated their impact on perception. Participants completed a forced-choice identification task in which they attempted to identify the background of a listener from the speech of a talker. The results revealed poor identification accuracy, especially for the most hyperarticulate speech. Participants’ responses were influenced most strongly by speech rate, then by vowel dispersion, but not by stressed vowel duration, even though each cue was found to differ across styles in production. I conclude by discussing how these results inform theories that explain the origins of variation in clear speech styles.

Accommodation statement: If you require an accommodation such as live captioning or interpretation to participate in this event, please contact Ashwini Deo at deo.13@osu.edu. In general, requests made two weeks before the event will allow us to provide seamless access, but the university will make every effort to meet requests made after this date.