Title: What do natural language inference models know about speaker commitment?
Abstract: Natural language inference (NLI) datasets (e.g., MultiNLI) were collected by soliciting hypotheses for a given premise from annotators. Such data collection led to annotation artifacts: systems can identify the premise-hypothesis relationship without observing the premise (e.g., negation in the hypothesis being indicative of contradiction). We address this problem by recasting the CommitmentBank for NLI, which contains items involving reasoning over the extent to which a speaker is committed to complements of clause-embedding verbs under entailment-canceling environments (conditionals, negation, modals, and questions). Instead of being constructed to stand in certain relationships with the premise, hypotheses in the recast CommitmentBank are the complements of the clause-embedding verb in each premise, leaving no annotation artifacts in the hypothesis. A state-of-the-art BERT-based model performs well on the CommitmentBank, reaching 85% F1. However, analysis of model behavior shows that the BERT-based models still do not capture the full complexity of pragmatic reasoning, nor do they encode some of the relevant linguistic generalizations, highlighting room for improvement.
Title: Unexpected affix deletion in Kihehe
Abstract: Languages sometimes put the same morpheme (with the same meaning) more than once in the same word. This is called multiple exponence (Harris, 2017). There are strong typological tendencies that when univerbation happens and results in multiple exponence, the inner or "trapped" morph is lost (Harris and Faarlund, 2006). Against this backdrop, an apparent case of multiple exponence in Kihehe ke constructions where an outer, rather than an inner, affix is lost seems to defy typological expectations, as well as the canonical affix order of the language itself. Fieldwork data from Kihehe shows, however, that in this case the loss of an affix may have actually preceded univerbation, rather than vice versa, explaining why the Kihehe data initially look so unexpected. The Kihehe facts remind us that languages evolve from preexisting systems (Harris, 2008) and that typological generalizations are only generalizations.