From Collocations to Computation: The Lexical Approach and LLMs
Introduction
The way we understand language—how it is acquired, processed, and produced—has shifted considerably over the past few decades. In language teaching, Michael Lewis’s Lexical Approach challenged traditional grammar-first models by emphasizing the importance of word combinations, or collocations, in language use. In artificial intelligence, large language models (LLMs) such as GPT-4 and Claude mark a parallel move away from hand-coded, rule-based systems toward statistical pattern recognition. Though born in different disciplines, both approaches rest on a similar foundation: that fluency and meaning emerge not from abstract rules, but from repeated exposure to patterned language.
This article explores the surprising alignment between Lewis’s pedagogical model and the operational logic of LLMs. By examining the underlying principles of the Lexical Approach, and mapping them onto the behavior of AI text generators, we can better understand not only how machines learn to write, but also how humans become fluent speakers.
Lexical Chunks and Pedagogical Patterns
In his influential 1993 book The Lexical Approach, Michael Lewis argued that vocabulary—not grammar—is the central component of language acquisition. At the heart of his theory lies the concept of collocation: the tendency of certain words to occur together more often than by chance. Expressions like “commit a crime,” “take responsibility,” or “utterly ridiculous” are not generated by assembling individual words via syntactic rules. Instead, they are retrieved from memory as ready-made units.
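The phrase “more often than by chance” can be made precise. Corpus linguists often score candidate collocations with pointwise mutual information (PMI), which compares how often two words actually co-occur with how often they would co-occur if they were statistically independent. The sketch below is a simplified illustration over a toy corpus, not Lewis’s own procedure; the corpus and word pairs are invented for demonstration.

```python
# A simplified sketch of collocation scoring with pointwise mutual
# information (PMI) over adjacent word pairs. Real corpus work uses
# far larger corpora and wider co-occurrence windows.
import math
from collections import Counter

corpus = (
    "commit a crime take responsibility commit a crime "
    "utterly ridiculous take responsibility commit a crime"
).split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
n = len(corpus)

def pmi(w1, w2):
    # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
    p_pair = bigrams[(w1, w2)] / (n - 1)
    p1, p2 = unigrams[w1] / n, unigrams[w2] / n
    return math.log2(p_pair / (p1 * p2))

print(f"PMI('commit', 'a')            = {pmi('commit', 'a'):.2f}")
print(f"PMI('take', 'responsibility') = {pmi('take', 'responsibility'):.2f}")
```

High-PMI pairs are exactly the “ready-made units” Lewis has in mind: combinations that appear together far more often than their individual frequencies would predict.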
Lewis famously declared, “Language consists of grammaticalised lexis, not lexicalised grammar.” In other words, the structure of language arises from recurring lexical patterns rather than from pre-imposed grammatical frameworks. This view reverses the traditional hierarchy where grammar governs word use. Instead, grammar emerges as a side-effect of frequent lexico-semantic combinations.
A key implication of this view is pedagogical. Lewis proposed that learners gain fluency not by mastering grammar tables but by becoming familiar with chunks, semi-fixed expressions, and collocational patterns. The goal of teaching becomes one of input maximization: exposing learners to real-life language, fostering noticing, and encouraging storage of frequently co-occurring items in long-term memory.
Tokens, Probability, and AI Prediction
This lexico-patterned view of language finds a curious analogue in how LLMs operate. These models are not programmed with grammar rules or syntax trees. Instead, they are trained on massive datasets of natural language, learning to predict the most probable next token: a unit that may be a fragment of a word (“un-”, “-ing”) or a whole word (“cat”). Frequent multi-word expressions such as “as a matter of fact” are usually spelled out as several tokens, but the model still learns them as tightly bound sequences, because their parts co-occur so reliably.
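To see what these units look like in practice, the sketch below splits a sentence into tokens. It assumes the open-source tiktoken package is installed; any sub-word tokenizer would illustrate the same point, namely that models operate over pieces rather than dictionary words.

```python
# A minimal sketch, assuming the `tiktoken` package is available
# (pip install tiktoken). The encoding name is one of OpenAI's
# published tokenizer vocabularies.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "As a matter of fact, she burst into tears."
token_ids = enc.encode(text)

# Print each token id alongside the text fragment it stands for.
for tid in token_ids:
    piece = enc.decode([tid])
    print(f"{tid:>6}  {piece!r}")
```

The output typically mixes whole words, word fragments, and punctuation, with leading spaces attached to many pieces; nothing in the vocabulary privileges the orthographic word as a unit.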
Take the sentence:
“She burst into…”
A token-predictive model trained on billions of words will propose likely continuations such as “tears,” “laughter,” or “song.” These are not grammatically calculated but statistically predicted based on how often such patterns appear in the training data. The model learns, implicitly, that “burst into” collocates with emotional or physical outcomes—not by understanding meaning, but by tracking frequency.
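To make the statistics concrete, here is a toy sketch of frequency-based prediction. It is not the architecture of any real model (LLMs use neural networks over sub-word tokens, not raw counts), but the intuition is the same: the likeliest continuations are the ones most often observed after the context. The miniature corpus is invented for illustration.

```python
# Toy illustration: predict what follows "burst into" by counting
# continuations in a tiny corpus and normalizing to probabilities.
from collections import Counter

corpus = [
    "she burst into tears",
    "he burst into laughter",
    "the crowd burst into song",
    "she burst into tears again",
    "they burst into laughter",
]

context = ("burst", "into")
continuations = Counter()

for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 2):
        if (words[i], words[i + 1]) == context:
            continuations[words[i + 2]] += 1

total = sum(continuations.values())
for word, count in continuations.most_common():
    print(f"burst into {word!r}: p = {count / total:.2f}")
# burst into 'tears': p = 0.40
# burst into 'laughter': p = 0.40
# burst into 'song': p = 0.20
```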
This is functionally analogous to what Lewis describes in language learning: fluency stems not from abstract rules, but from internalized patterns. Just as learners must encounter “pay attention” often enough to use it fluently, the LLM must “see” that phrase frequently enough to predict it accurately. In both systems, usage determines form.
What’s more, just as Lewis challenges the sanctity of the “word” as the basic unit of language, LLMs segment text into units that need not coincide with dictionary words: fragments, whole words, and tightly bound multi-word sequences learned as wholes. This aligns with Saussure’s insight that linguistic signs are not fixed atoms but entities defined by their place in a system. These models recognize no essential “wordness,” only probabilistic continuity across variable units.
Pedagogical Lessons and Systemic Insights
This convergence between AI language generation and the Lexical Approach offers several useful insights for educators. First, it reinforces Lewis’s claim that pattern exposure trumps rule instruction. Just as text-generating algorithms produce coherent language by internalizing co-occurrence patterns, learners acquire fluency through rich, repeated input.
Second, it suggests that we should rethink language not as a set of fixed rules, but as a dynamic system of interrelated expressions. The French theorist Roland Barthes once proposed that the author is no longer a source of meaning, but a scriptor—one who arranges existing codes. The token-predictive architecture mirrors this closely: it does not “author” text so much as recombine learned patterns according to their probabilities. Language teaching might benefit from this mindset shift, treating learners less as rule-appliers and more as noticers of form.
Finally, both systems—LLMs and the Lexical Approach—support the idea that fluency arises from familiarity, not construction. The success of statistical text generation lends weight to a long-standing pedagogical hunch: exposure, memory, and repetition matter more than parsing trees or metalanguage.
Conclusion: A Shared Logic of Language
The Lexical Approach and large language models may originate from distinct fields—one educational, the other computational—but they converge on a shared principle: language is fundamentally patterned. Whether in the mind of a learner or the architecture of an LLM, fluency and coherence are byproducts of collocation, chunking, and frequency, not syntactic calculation.
Michael Lewis helped redefine language instruction by urging teachers to “raise learners’ awareness of lexis in use.” Large language models, in their own silent, statistical way, confirm the wisdom of that shift. As educators incorporate AI into classrooms, they might find that machines and students thrive under the same principle: learn the patterns, and the grammar will follow.
Bibliography
Lewis, Michael. The Lexical Approach: The State of ELT and a Way Forward. Hove, England: Language Teaching Publications, 1993.
Ellis, Nick C., ed. Implicit and Explicit Learning of Languages. London: Academic Press, 1994.
Jurafsky, Daniel, and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 3rd ed. Draft version, 2023.
McCarthy, Michael, and Felicity O’Dell. English Collocations in Use: Intermediate. Cambridge: Cambridge University Press, 2005.