<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1307"> <Title>Statistics Learning and Universal Grammar: Modeling Word Segmentation</Title> <Section position="2" start_page="49" end_page="50" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Two facts about language learning are indisputable.</Paragraph> <Paragraph position="1"> First, only a human baby, but not her pet kitten, can learn a language. It is clear, then, that there must be some element in our biology that accounts for this unique ability. Chomsky's Universal Grammar (UG), an innate form of knowledge specific to language, is an account of what this ability is. This position gains support from formal learning theory [13], which sharpens the logical conclusion [4,5] that no (realistically efficient) learning is possible without a priori restrictions on the learning space. Second, it is also clear that no matter how much of a head start the child has through UG, language is learned. Phonology, lexicon, and grammar, while governed by universal principles and constraints, do vary from language to language, and they must be learned on the basis of linguistic experience. In other words (indeed a truism), both endowment and learning contribute to language acquisition, the result of which is an extremely sophisticated body of linguistic knowledge. Consequently, both must be taken into account, explicitly, in a theory of language acquisition [6,7].</Paragraph> <Paragraph position="2"> Controversies arise when it comes to the relative contributions of innate knowledge and experience-based learning. Some researchers, in particular linguists, approach language acquisition by characterizing the scope and limits of the innate principles of Universal Grammar that govern the world's languages. Others, in particular psychologists, tend to emphasize the role of experience and the child's domain-general learning ability.
Such a division of research agendas understandably stems from the division of labor between endowment and learning: plainly, things that are built in needn't be learned, and things that can be garnered from experience needn't be built in.</Paragraph> <Paragraph position="3"> The important paper of Saffran, Aslin, & Newport [8] on statistical learning (SL) suggests that children may be powerful learners after all. Very young infants can exploit transitional probabilities between syllables for the task of word segmentation, with only minimal exposure to an artificial language. Subsequent work has demonstrated SL in other domains, including artificial grammar learning [9], music [10], and vision [11], as well as in other species [12]. This raises the possibility of learning as an alternative to the innate endowment of linguistic knowledge [13].</Paragraph> <Paragraph position="4"> We believe that the computational modeling of psychological processes, with special attention to concrete mechanisms and quantitative evaluations, can play an important role in the endowment vs. learning debate.</Paragraph> <Paragraph position="5"> Linguists' investigations of UG are rarely developmental, even less so corpus-oriented.</Paragraph> <Paragraph position="6"> Developmental psychologists, by contrast, often stop at identifying the components of a cognitive task [14], without an account of how such components work together in an algorithmic manner. On the other hand, if computation is to be of relevance to linguistics, psychology, and cognitive science in general, being merely computational will not suffice. A model must be psychologically plausible, and ready to face its implications in broader empirical contexts [7]. For example, how does it generalize to typologically different languages? How does the model's behavior compare with that of human language learners and processors?
In this article, we present a simple computational model of word segmentation and discuss some of the formal and developmental issues it raises for child language acquisition. Specifically, we show that SL using transitional probabilities cannot reliably segment words when scaled to a realistic setting (e.g., child-directed English). To be successful, it must be constrained by knowledge of phonological structure. Indeed, the model reveals that SL may well be an artifact (an impressive one, nonetheless) that plays no role in actual word segmentation in human children.</Paragraph> <Paragraph position="7"> 2 Statistics does not Refute UG It has been suggested [15, 8] that word segmentation from continuous speech may be achieved by using transitional probabilities (TP) between adjacent syllables A and B, where TP(A→B) = P(AB)/P(A), with P(AB) being the frequency of B following A, and P(A) the total frequency of A.</Paragraph> <Paragraph position="8"> Word boundaries are postulated at local minima, where the TP is lower than that of its neighbors. For example, given a sufficient amount of exposure to English, the learner may establish that, in the four-syllable sequence &quot;prettybaby&quot;, TP(pre→tty) and TP(ba→by) are both higher than TP(tty→ba): a word boundary can be (correctly) postulated. It is remarkable that 8-month-old infants can extract three-syllable words from the continuous speech of an artificial language after only two minutes of exposure [8].</Paragraph> <Paragraph position="9"> To be effective, a learning algorithm (indeed, any algorithm) must have an appropriate representation of the relevant learning data. We thus need to be cautious in interpreting the success of SL, as the authors themselves note [16]. If anything, it seems that the findings strengthen, rather than weaken, the case for (innate) linguistic knowledge.
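The TP-based segmentation procedure just described can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the toy syllabified corpus, the syllable spellings, and the function names are hypothetical, not taken from [8] or [15].

```python
from collections import Counter

def train_tp(utterances):
    """Estimate TP(A -> B) = P(AB) / P(A) from syllabified utterances,
    where P(AB) is the count of B following A and P(A) the count of A."""
    unigrams, bigrams = Counter(), Counter()
    for utt in utterances:
        unigrams.update(utt)
        bigrams.update(zip(utt, utt[1:]))
    return lambda a, b: bigrams[(a, b)] / unigrams[a] if unigrams[a] else 0.0

def segment(stream, tp):
    """Posit word boundaries at local minima of TP between adjacent syllables."""
    tps = [tp(a, b) for a, b in zip(stream, stream[1:])]
    # A boundary falls between syllables i and i+1 when tps[i] is
    # strictly lower than both of its neighboring TPs.
    cuts = {i for i in range(1, len(tps) - 1)
            if tps[i] < tps[i - 1] and tps[i] < tps[i + 1]}
    words, word = [], [stream[0]]
    for i, syl in enumerate(stream[1:]):
        if i in cuts:
            words.append("".join(word))
            word = []
        word.append(syl)
    words.append("".join(word))
    return words

# Toy corpus (hypothetical syllabifications, for illustration only).
corpus = [["pre", "tty", "ba", "by"],
          ["pre", "tty", "ba", "by"],
          ["pre", "tty", "do", "ggy"]]
tp = train_tp(corpus)
# TP(pre->tty) = 1.0 and TP(ba->by) = 1.0, but TP(tty->ba) = 2/3,
# so a boundary is posited between "tty" and "ba".
print(segment(["pre", "tty", "ba", "by"], tp))  # -> ['pretty', 'baby']
```

Note that the sketch presupposes exactly the point made in the text: the input must already be represented as a sequence of syllables before any statistics can be gathered.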
A classic argument for innateness [4, 5, 17] comes from the fact that syntactic operations are defined over specific types of data structures (constituents and phrases), but not over, say, linear strings of words, or numerous other logical possibilities. While infants seem to keep track of statistical information, any conclusion drawn from such findings must presuppose that children know what kind of statistical information to keep track of. After all, an infinite range of statistical correlations exists in the acoustic input: e.g., what is the probability of a syllable rhyming with the next? What is the probability of two adjacent vowels both being nasal? The fact that infants can use SL to segment syllable sequences at all entails that, at a minimum, they know the relevant unit of information over which correlative statistics are gathered: in this case, syllables, rather than segments or front vowels.</Paragraph> <Paragraph position="10"> A host of questions then arises. First, how do they know this? It is quite possible that the primacy of syllables as the basic unit of speech is innately available, as suggested by neonate speech perception studies [18]. Second, where do the syllables come from? While the experiments in [8] used uniformly CV syllables, many languages, including English, make use of a far more diverse range of syllable types. Moreover, the syllabification of speech is far from trivial, and most likely involves both innate knowledge of phonological structure and the discovery of language-specific instantiations [14].</Paragraph> <Paragraph position="11"> All these problems have to be solved before SL for word segmentation can take place.</Paragraph> </Section> </Paper>