The Fluidity of "The Word": Martinet, Writing, and AI-Based Linguistic Processing
Introduction
André Martinet’s article The Word challenges the conventional understanding of words as fixed linguistic units. Across languages, the concept of a "word" varies significantly, yet writing has reinforced the perception of words as stable entities. Mechanical translation further contributed to this illusion by treating words as discrete units with direct one-to-one correspondences across languages. However, modern AI-based language models, which process language in more nuanced ways, lend support to his functionalist perspective. This article explores Martinet’s argument by tracing the evolution of how language is conceptualized—from early scriptio continua, to machine translation, to the contemporary role of AI in text processing. By doing so, we uncover how recent technological advancements reaffirm his claim that linguistic units must be analyzed beyond the apparent boundaries of words.
The Concept of "Word" Across Languages
One of Martinet’s key observations is that "the concept of ‘word’ is not a universal given, but rather a construct shaped by linguistic traditions and practical necessity." Different languages treat words in distinct ways: in Chinese, characters represent units of meaning but do not always correspond to single words; in agglutinative languages such as Turkish, what might seem like a word in English functions more like an entire phrase. These variations challenge the notion of words as stable units, suggesting that linguistic analysis must be flexible enough to accommodate multiple structures. As Martinet states, "any attempt to define the word as an isolated entity meets immediate difficulty when tested against linguistic diversity."
The Role of Writing in Fixing the Word as a Unit
Martinet observes that writing has played a decisive role in shaping our perception of words as discrete entities. Early Latin and Greek texts employed scriptio continua, a writing style devoid of spaces between words. The later introduction of spaces—a convention refined by medieval scribes and printers such as Aldus Manutius—reinforced the illusion that words exist as fixed units. Martinet cautions against using written text as the foundation for linguistic analysis, emphasizing that "it is from speech that one should always start in order to understand the real nature of human language." While writing serves as a powerful tool for preservation and communication, it imposes artificial segmentations that do not necessarily correspond to the structure of spoken language.
It should be noted, however, that this discussion refers specifically to phonetic writing systems and their influence on linguistic analysis. It does not engage with broader philosophical considerations, such as Derrida’s critique of the privileging of speech over writing in Of Grammatology. Derrida’s notion of arche-writing—a fundamental inscription underlying both speech and writing—operates at a different level of analysis. Here, the concern is not whether writing, in some deep ontological sense, precedes or conditions language, but rather how conventional writing systems have shaped linguistic thought.
Mechanical Translation and the Misconception of Words as Fixed Entities
The advent of machine translation in the 1950s and 1960s further entrenched the view that words are isolated, meaning-bearing units. Early translation systems, such as rule-based models, operated under the assumption that words had fixed meanings across languages. This approach led to poor translations, as it ignored syntax, idioms, and contextual meaning. As Martinet cautions, "the fundamental traits of human language are frequently to be found behind the screen of the word." These early technologies inadvertently validated his argument that words alone cannot be the primary units of linguistic analysis.
AI and the Reaffirmation of Martinet’s View
Modern AI-based language models, such as GPT-4 and DeepL, operate on principles that align more closely with Martinet’s linguistic theories. Rather than analyzing words in isolation, these systems process language in light of syntax, pragmatics, and discourse context. They break words into smaller subunits and attend to broader textual structures, making them more adept at capturing linguistic meaning. This is exemplified in the virtual unrolling of the Herculaneum scrolls, where AI reconstructs lost text by analyzing ink density, character patterns, and microstructures—evidence that writing is not merely a sequence of words but contains deeper layers of information. Paradoxically, despite its origins in written text, which lacks the social, subjective nature of oral interaction, AI is helping linguistics recover a more speech-based, functionalist perspective. This technological breakthrough underscores Martinet’s argument that "the necessity of pushing the examination beyond the immediate appearances" remains critical for linguistic research.
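Martinet’s point that meaningful units lie below the level of the written word can be made concrete with subword segmentation, the strategy modern language models use before any further processing. The sketch below is purely illustrative: the toy vocabulary and the greedy longest-match rule are simplified stand-ins for real algorithms such as byte-pair encoding, not the tokenizer of any actual system.

```python
# Illustrative sketch: splitting written "words" into smaller subunits.
# The vocabulary and the greedy longest-match rule are toy assumptions,
# simplified stand-ins for real subword algorithms (e.g., byte-pair encoding).

TOY_VOCAB = {"un", "break", "able", "translat", "ion", "s", "word"}

def segment(word: str) -> list[str]:
    """Greedily split a word into the longest known subunits."""
    pieces, i = [], 0
    while i < len(word):
        # Try the longest possible substring starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in TOY_VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # Unknown fragment: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces

print(segment("unbreakable"))   # ['un', 'break', 'able']
print(segment("translations"))  # ['translat', 'ion', 's']
```

The orthographic word "unbreakable" dissolves into three smaller meaning-bearing units, much as Martinet’s analysis pushes below the apparent boundaries of the written word.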
Conclusion: Martinet and Saussure’s Shared Skepticism of the "Word"
Martinet’s insights into the limitations of the word as a linguistic unit echo earlier concerns raised by Ferdinand de Saussure. Saussure wrote that "words do not answer exactly to our definition of linguistic units" (Course in General Linguistics, p. 158), emphasizing that linguistic meaning arises from relationships rather than isolated terms. The continuity between Martinet’s skepticism and Saussure’s approach to language suggests a broader re-evaluation of how language should be studied. A future article will explore in greater depth the connection between Martinet and Saussure regarding the validity of the word as a linguistic unit, shedding light on how their critiques continue to shape contemporary linguistic thought.
Bibliography
Martinet, André, and Victor A. Velen. "The Word." Diogenes 13, no. 51 (1965): 38-54.
Saussure, Ferdinand de. Course in General Linguistics. Translated and annotated, with a new introduction, by Roy Harris. Bloomsbury, 2013.
Saussure, Ferdinand de. Cours de linguistique générale. Edited by Charles Bally and Albert Sechehaye, with the collaboration of Albert Riedlinger. Genève: Arbre d’Or, 2005.
Poibeau, Thierry. Machine Translation. Cambridge, MA: MIT Press, 2017.
Almahasees, Zakaryia. Analysing English-Arabic Machine Translation: Google Translate, Microsoft Translator, and Sakhr. New York: Routledge, 2022.
Steiner, George. After Babel: Aspects of Language and Translation. Oxford: Oxford University Press, 1998.