Statistical blindness and generalization in the syntax of long distance dependencies: The role of LLMs
Syntactic theory seeks to explain the generalizations that constitute human knowledge of syntax.
LLMs are an exciting technology that reveals generalizations over large swaths of natural language data.
In this talk, I discuss how this technology can be used as a tool to help identify necessary constraints on
the syntax of human languages. Using long distance dependencies as a case study, I show that LLMs
discover generalizations to which humans appear to be blind, while ignoring other generalizations that are
robust in human behavior. Ultimately, I argue that LLMs can contribute to our theories of syntax and our
understanding of language and cognition by identifying mismatches between the language environment
and human behavior.