
Abstract
The debate concerning the origins of human language juxtaposes the theory of universal grammar (UG), which asserts an innate linguistic framework, against the perspective that language emerges through pattern recognition shaped by exposure. A cornerstone of UG, as articulated by Noam Chomsky, is the argument from poverty of stimulus (hereafter PSA), which contends that the limited linguistic input children receive cannot account for their sophisticated output, necessitating an innate grammar. This article challenges that assertion by leveraging large language models (LLMs), artificial intelligence systems that produce fluent language without pre-installed rules. I argue that LLMs undermine the PSA by demonstrating that pattern recognition can yield complex speech, suggesting that human language acquisition may not require UG. I bolster this claim with evidence from language-deprived children and a thought experiment on disrupted learning, both interpreted through LLMs’ functionality. Following an accessible explication of LLMs for philosophical readers, I present evidence to question UG’s premises, then directly critique the PSA, proposing that input is more robust and output less reliant on innate rules than UG suggests.
Introduction
The question of how humans acquire language remains a pivotal issue in linguistic and philosophical scholarship. Two principal theories dominate this discourse: the Chomskyan notion of universal grammar (UG), which posits that language depends on an innate set of syntactic rules, and an alternative view that language arises from pattern recognition honed through linguistic experience. UG’s most compelling defense is the argument from poverty of stimulus (PSA), which holds that the incomplete input available to children is insufficient to explain their linguistic proficiency, implying an innate grammatical system. The PSA has long bolstered UG’s prominence, yet the advent of large language models (LLMs) in artificial intelligence, alongside new interpretive lenses, prompts a reassessment.
In this article, I contend that LLMs provide substantial evidence against the PSA, suggesting that language acquisition may not necessitate an innate grammar but could instead rely on sophisticated pattern recognition—a mechanism observable in both artificial systems and human learners. My analysis unfolds as follows: I first offer a clear, non-technical overview of LLMs, designed for philosophical inquiry; I then demonstrate how LLMs challenge the necessity of UG, with specific relevance to the PSA; next, I directly engage the PSA, arguing that it overestimates input inadequacy and underestimates experiential learning, incorporating cases of language deprivation; and finally, I explore a novel scenario of disrupted learning to further illuminate pattern-based acquisition. My objective is to refute this foundational UG argument, positing that LLMs and fresh perspectives illuminate a viable alternative to innate grammar.
Large Language Models: A Conceptual Overview
Large language models (LLMs) are artificial intelligence systems engineered to process and generate human-like language, providing a novel perspective for examining linguistic theory. Unlike rule-based computational models, LLMs lack explicit grammatical instructions or lexical definitions. They are instead trained on vast textual corpora—billions of sentences sourced from literature, academic texts, and digital discourse—with the aim of predicting subsequent words in a sequence. For instance, given “The moon began to…,” an LLM might generate “rise,” not through semantic understanding, but via statistical associations derived from extensive exposure.
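To make this predictive mechanism concrete, the sketch below reduces it to a frequency table over an invented three-sentence corpus. It is not an LLM, and every sentence and function name in it is an illustrative assumption, but it shows how a continuation can be selected from statistical association alone, with no representation of meaning.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for the billions of sentences an LLM is trained on.
corpus = [
    "the moon began to rise",
    "the sun began to set",
    "the moon began to rise over the hills",
]

# Count which word follows which (a simple bigram table).
follow_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        follow_counts[current][nxt] += 1

def predict_next(word):
    """Return the continuation seen most often after this word in training."""
    if not follow_counts[word]:
        return None
    return follow_counts[word].most_common(1)[0][0]

print(predict_next("to"))      # 'rise', chosen by frequency, not by understanding
print(predict_next("began"))   # 'to'
```

A genuine LLM replaces these raw counts with billions of learned parameters and conditions on far longer contexts, but the basic move, predicting what comes next from observed regularities, is the same.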
The development of an LLM begins with a neural network, a computational structure loosely inspired by neural connectivity, though devoid of biological intricacy. Initially, this network possesses no linguistic knowledge. Through training, it is exposed to massive textual datasets, iteratively refining its internal parameters to improve predictive accuracy. For example, repeated presentation of “The cat sat on the…” enables the model to favor “mat” as a completion, achieved without explicit syntactic or morphological guidance. This process, supported by significant computational resources over prolonged periods, hinges on pattern detection—no rules such as “verbs require tense” or “nouns may take plural forms” are encoded. The outcome is a system capable of producing coherent, contextually apt language, from scholarly prose to informal exchanges, without intrinsic comprehension. For philosophical analysis, LLMs function as a controlled experiment: a language-generating entity without innate structure, inviting comparison to human acquisition processes.
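The training loop itself can be rendered in similarly reduced form. The sketch below is a deliberately tiny stand-in, assuming an invented three-word completion task rather than any real architecture: the parameters start uninformative and are nudged after each observed continuation, and no rule about syntax or morphology is ever inserted.

```python
import math
import random

# Candidate completions of "The cat sat on the ...": invented for illustration.
vocab = ["mat", "roof", "moon"]
logits = {w: 0.0 for w in vocab}   # adjustable parameters, initially uninformative

def predicted_probs():
    """Softmax over the current parameters: the model's guess about what comes next."""
    z = sum(math.exp(v) for v in logits.values())
    return {w: math.exp(v) / z for w, v in logits.items()}

# Simulated exposure: the continuation actually observed, sentence by sentence.
random.seed(0)
observations = ["mat"] * 8 + ["roof"] * 2
random.shuffle(observations)

learning_rate = 0.5
for observed in observations:
    p = predicted_probs()
    # One step against the prediction error: raise the observed word's parameter,
    # lower the others in proportion to how strongly they were predicted.
    for w in vocab:
        target = 1.0 if w == observed else 0.0
        logits[w] += learning_rate * (target - p[w])

print(predicted_probs())   # "mat" emerges as the favored completion
```

The scale differs by many orders of magnitude in an actual LLM, but the principle matches the description above: prediction error drives parameter adjustment, and whatever competence emerges accumulates from exposure.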
LLMs as a Challenge to the Poverty Argument
The functionality of LLMs directly pertains to the UG debate, particularly the PSA, by suggesting that linguistic competence need not presuppose an innate grammar. I advance three observations to substantiate this claim.
First, LLMs start from a blank slate. UG asserts that human learners require an innate framework because unguided exposure cannot produce linguistic proficiency. However, LLMs begin with no syntactic presets or instinctual guides, relying solely on training data—human-generated text with inherent irregularities—and still achieve fluency. If an artificial system can attain such competence without an initial grammar, I argue that human learners might similarly leverage heightened sensitivity to linguistic input, reducing the need for UG.
Second, LLMs generate language through prediction rather than rule adherence. Their output stems from probabilistic inference based on prior patterns, not an internalized grammar. This mirrors child language acquisition: a child might produce “I runned” instead of “I ran,” extrapolating from “walk/walked” without formal instruction. UG proponents argue that such creativity requires a pre-existing structure, but LLMs demonstrate that predictive mechanisms can emulate rule-like behavior. I propose that human speech may likewise arise from pattern-based inference, challenging the necessity of innate rules.
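The overgeneralization just described can be caricatured in a few lines. The sketch below is an assumption-laden toy rather than a model of either children or LLMs: it extracts the dominant past-tense pattern from a handful of invented regular verbs and applies it blindly, reproducing exactly the “runned” error that UG reads as evidence of rule possession.

```python
from collections import Counter

# Past-tense pairs the learner has been exposed to, all of them regular.
examples = [("walk", "walked"), ("talk", "talked"), ("jump", "jumped")]

# Infer the dominant transformation: what gets appended to the base form?
suffixes = Counter(past[len(base):] for base, past in examples if past.startswith(base))
dominant_suffix = suffixes.most_common(1)[0][0]   # "ed"

def past_tense(verb):
    """Apply the dominant pattern, whether or not the verb is irregular."""
    return verb + dominant_suffix

print(past_tense("walk"))   # 'walked'
print(past_tense("run"))    # 'runned', the same overgeneralization a child produces
```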
Third, LLMs are not wholly constrained by statistical dominance. While their predictions are data-driven—favoring the standard “could have” over the erroneous “could of” when the former predominates—designers can temper this reliance. For example, even if a dataset were skewed 95% toward “could of,” an LLM could be configured to grant “could have” equal probability. Such adjustments suggest that language generation transcends mere frequency, a trait reflected in children who experiment beyond prevalent patterns. I argue that this adaptability reinforces the sufficiency of pattern recognition over rigid grammatical presets.
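Designers have several levers for this kind of adjustment, including curating the training data, penalizing particular outputs, and rescaling the predicted distribution at generation time. The sketch below illustrates only the last of these, a sampling temperature applied to frequency-derived probabilities; the 95/5 split is the hypothetical figure from the paragraph above, not real data.

```python
def frequency_probs(counts):
    """Raw probabilities taken straight from how often each form occurs."""
    total = sum(counts.values())
    return {form: c / total for form, c in counts.items()}

def rescale(probs, temperature):
    """Flatten (temperature > 1) or sharpen (temperature < 1) a distribution."""
    scaled = {form: p ** (1.0 / temperature) for form, p in probs.items()}
    z = sum(scaled.values())
    return {form: s / z for form, s in scaled.items()}

# The hypothetical skew from the paragraph above: 95% erroneous, 5% standard.
counts = {"could of": 95, "could have": 5}

raw = frequency_probs(counts)
print(raw)                                 # {'could of': 0.95, 'could have': 0.05}
print(rescale(raw, temperature=5.0))       # markedly flatter
print(rescale(raw, temperature=100.0))     # close to equal probability
```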
These observations collectively weaken the PSA’s premise that input inadequacy necessitates UG. If LLMs can approximate human language without innate rules, I suggest that the presumed disparity between input and output may be less pronounced than UG assumes.
The Poverty of the Stimulus Argument
The PSA forms the bedrock of UG’s defense, meriting rigorous examination. It contends that the linguistic input available to children—fragmentary, inconsistent, and incomplete—cannot account for the sophistication of their output, implying an innate grammar. A child might hear “She walked” and “I went” but not “I goed,” yet briefly adopt “I goed” before correcting to “I went.” Likewise, a child constructs embedded clauses—“The dog that chased the cat ran away”—without explicit exposure to such structures. UG theorists assert that this gap—meager input yielding torrential output—evidences a prewired system, supplying rules like tense application or syntactic embedding absent from the data.
The PSA’s persuasiveness stems from its empirical foundation: children acquire language rapidly and accurately despite an ostensibly impoverished environment. UG positions this as proof of an internal mechanism bridging the divide, a claim I now seek to refute.
A Critique of the Poverty Argument
I argue that the PSA falters under scrutiny, and LLMs provide the evidence to expose its shortcomings. I offer five critiques to dismantle its foundations.
First, I contest the depiction of input as “meager.” UG assumes a paucity of data, yet the linguistic environment of a child is demonstrably robust. By age five, children encounter millions of words, often presented in repetitive, contextually rich forms—“Here is your milk,” “See the dog”—augmented by gestures and situational cues. This input constitutes a structured dataset, not a barren expanse. LLMs rely on text alone; humans benefit from a multimodal experience integrating speech and context. I propose that this enriched input mitigates the perceived poverty, lessening the reliance on an innate grammar.
Second, I challenge the “torrential” characterization of output. UG cites the complexity of child speech as evidence of rules, but early productions—“Me want juice,” “I runned”—reveal experimentation rather than mastery. LLMs exhibit analogous behavior: given “walk/walked,” they might predict “run/runned,” refining with further data. I suggest that such output reflects pattern-based inference, not an innate system, and that children’s subsequent corrections align with learning from exposure rather than preformed rules.
Third, I address the rapidity of acquisition, which UG deems inexplicable without an innate framework. LLMs illustrate that efficiency, not innateness, drives progress: their swift learning results from vast data and computational power. Human children, with less input, possess a cognitive capacity optimized for pattern recognition across domains—language, perception, social interaction. I argue that this general aptitude accelerates language learning, obviating the need for a specialized grammar module.
Fourth, I emphasize social feedback, a factor UG overlooks. A child’s “I goed” elicits “No, I went,” enhancing the input’s efficacy. LLMs lack this real-time correction yet succeed; humans, benefiting from it, transform “meager” into “sufficient.” I assert that this interactive process narrows the alleged gap, further undermining the PSA’s premise.
Fifth, I draw on cases of language-deprived children, often cited to bolster UG, to argue the opposite. A striking example is Genie, a girl discovered in California in 1970 at age 13, who had been confined by her abusive father in near-total isolation since infancy. Locked in a small room, she was rarely spoken to, exposed only to sporadic, minimal language—grunts or commands—rather than the rich discourse typical of early childhood. When rescued, Genie could not speak fluently; her utterances were limited to short, telegraphic phrases like “stopit” or “nomore,” and despite years of therapy, she never mastered complex grammar. UG proponents might argue this shows an innate system starved of activation, but I contend it demonstrates the opposite: Genie failed to develop language not because an innate grammar was untriggered, but because she lacked the linguistic patterns essential for recognition and generalization. LLMs, when denied sufficient data, similarly falter, producing incoherent output. This parallel suggests that language acquisition depends on exposure, not innateness. Far from supporting the PSA, these cases reveal that the “poverty” is not universal but situational, and when input is truly absent, no UG emerges to fill the void.
Disrupted Patterns: Lessons from LLMs
To further challenge the PSA, I propose a thought experiment: imagine a child whose early linguistic input is systematically disrupted by constant, arbitrary corrections. Told “No, say ‘cat fly’” after “The cat runs,” or “No, ‘blue eat’” for “I eat lunch,” the child faces a barrage of inconsistent feedback, undermining any stable patterns. Conceivably, such a child might never achieve fluency, their speech stunted by confusion rather than enriched by correction. This scenario, while hypothetical, finds an explanatory parallel in the mechanics of large language models (LLMs), illuminating what goes wrong and reinforcing my critique of UG.
LLMs learn by adjusting internal parameters to predict patterns in training data. If that data is coherent—say, millions of sentences with consistent grammar—the model refines its predictions, producing fluent output. But consider an LLM trained on a corrupted dataset where “The dog barks” is arbitrarily corrected to “Dog sky jumps,” and “She walks” becomes “Walks she cloud.” Each iteration introduces noise, destabilizing the model’s ability to detect reliable patterns. Its output would devolve into nonsense, not because it lacks an innate grammar, but because the training signal is incoherent. In technical terms, the model’s loss function—a measure of prediction error—would fail to converge, leaving it unable to generalize.
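The point about non-convergence can be illustrated with the same kind of toy predictor sketched earlier. In the code below, which assumes invented contexts and a five-word vocabulary, the model is trained twice: once on coherent pairs in which each context always receives the same continuation, and once on pairs whose continuations are arbitrarily rewritten, as in the corrections imagined above. The prediction error shrinks toward zero in the first case and stays stuck well above it in the second.

```python
import math
import random

vocab = ["barks", "sky", "jumps", "cloud", "walks"]
contexts = ["the dog", "she", "the cat", "a bird"]

def train(pairs, epochs=200, lr=0.5):
    """Fit a tiny softmax predictor mapping each context to a next word and
    return the average prediction error (cross-entropy) on the final pass."""
    logits = {c: {w: 0.0 for w in vocab} for c in contexts}
    loss = 0.0
    for _ in range(epochs):
        loss = 0.0
        for context, observed in pairs:
            z = sum(math.exp(v) for v in logits[context].values())
            p = {w: math.exp(v) / z for w, v in logits[context].items()}
            loss += -math.log(p[observed])
            for w in vocab:
                target = 1.0 if w == observed else 0.0
                logits[context][w] += lr * (target - p[w])
    return loss / len(pairs)

random.seed(0)

# Coherent signal: each context is always followed by the same word.
coherent = [("the dog", "barks"), ("she", "walks"),
            ("the cat", "jumps"), ("a bird", "sky")] * 10

# Corrupted signal: every continuation is arbitrarily rewritten, mirroring
# the thought experiment's inconsistent corrections.
corrupted = [(context, random.choice(vocab)) for context, _ in coherent]

print("average error, coherent data: ", round(train(coherent), 3))   # near zero
print("average error, corrupted data:", round(train(corrupted), 3))  # stays high
```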
This mirrors the hypothetical child. Human learners, I argue, rely on pattern recognition honed by consistent exposure, much like LLMs. Arbitrary corrections disrupt this process, flooding the system with contradictory data. A child might infer “runs” pairs with “cat” one day, only to be told “fly” the next, preventing the consolidation of linguistic rules through experience. UG posits an innate grammar to stabilize such chaos, yet the child’s failure to speak fluently in this scenario suggests no such mechanism rescues them. Instead, the breakdown reflects a dependence on stable input—a dependence LLMs emulate. This thought experiment underscores that language emerges from patterns, not prewired rules, and that disrupting those patterns, far from revealing UG, exposes its absence.
Conclusion
I conclude that the PSA, though historically influential, does not withstand the multifaceted challenge posed by LLMs and new interpretive angles. These systems demonstrate that language can emerge from pattern recognition, casting doubt on the necessity of an innate grammar. The input available to children is more substantial than UG acknowledges—evidenced by deprivation cases like Genie’s—and their output is less dependent on prewired rules than posited, aligning with a learning process LLMs emulate. Moreover, the vulnerability of language acquisition to disrupted patterns reinforces its experiential basis. While UG advocates might contend that human uniqueness persists, I maintain that the PSA’s foundation is untenable. Language, I propose, may not be an innate endowment but a capacity forged through experience—a hypothesis invigorated by the artificial systems we have developed and the scenarios we can envision.
References
Chomsky, N. (1980). Rules and Representations. Columbia University Press.
Lan, N., Chemla, E., & Katzir, R. (2024). “Large Language Models and the Argument from the Poverty of the Stimulus.” Linguistic Inquiry. Advance online publication. https://doi.org/10.1162/ling_a_00533
Piantadosi, S. T. (2023). “Modern Language Models Refute Chomsky’s Approach to Language.” Lingbuzz Preprint, lingbuzz/006423.
Wilcox, E. G., et al. (2023). “Testing the Predictions of Large Language Models for Linguistic Phenomena.” Proceedings of the National Academy of Sciences, 120(15), e2215678120.