<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0110">
  <Title>Segment Predictability as a Cue in Word Segmentation: Application to Modern Greek</Title>
  <Section position="5" start_page="9" end_page="9" type="concl">
    <SectionTitle>
4 Conclusion
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="9" end_page="9" type="sub_section">
      <SectionTitle>
4.1 Comparing the Four Variants
</SectionTitle>
      <Paragraph position="0"> The findings here confirm Brent's (1999a) contention that mutual information is a better measure of predictability than transitional probability, at least for the task of identifying words rather than just boundaries. This is particularly true in the global comparison. Transitional probability finds more word boundaries in the 'local comparison' model, but this advantage does not carry over to the task of pulling out the words themselves, which is arguably the infant's main concern. This result should be kept in mind when interpreting or replicating Saffran et al. (1996) or similar studies. While Brent's 'local comparison' heuristic was unable to pull out one-phoneme-long words, as predicted above, this did not adversely affect it as much as anticipated. On the contrary, both the local and global comparison heuristics tended to postulate too many word boundaries, as Brent had observed. This is not necessarily a bad thing for infants, for several reasons.</Paragraph>
      <Paragraph position="1"> First, infants may have a preference for finding short words, since these will presumably be easier to remember and learn, particularly if the child's phonetic memory is limited. Second, it is probably easier to reject a hypothesized word (for example, on failing to find a consistent semantic cue for it) than to recover a word that was not correctly segmented; hence false positives are less of a problem than false negatives for the child. Third and most importantly, this cue is not likely to operate on its own, but rather as one among many contributing cues. Other cues may act as filters on the boundaries suggested by this cue. One example of this is the distribution of segments before utterance edges, as used by, e.g., Aslin et al. (1996) and Christiansen et al. (1998), which indicates the set of possible word-final segments in the language.</Paragraph>
      <Paragraph position="2">  However, as far as these results go, the word type metric shows that the finite-state model using a global threshold suffered slightly less from this problem than the local comparison model. For the MI variants, both recall and precision for word type were about 2% higher on the global threshold variant. For transitional probability, the precision of the local and global models was roughly equal, but recall for the global comparison model was 5.5% higher. Not only were the global models better at pulling out a variety of words, but they also managed to learn longer ones (especially the global TP variant), including a few four-syllable words. The local model learned no four-syllable words, and relatively few three-syllable words.</Paragraph>
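The two predictability measures and the two boundary heuristics compared above can be illustrated with a minimal sketch. It assumes the usual textbook definitions (transitional probability as the bigram count divided by the count of the first segment, and pointwise mutual information in bits); the function names, the boundary indexing, and the threshold argument are hypothetical and are not taken from the experiments reported here.

```python
import math
from collections import Counter

def bigram_scores(utterances):
    """Return (tp, mi) dicts keyed by adjacent segment pairs (x, y).

    Assumed definitions (not quoted from the paper):
      tp[(x, y)] = count(xy) / count(x followed by anything)
      mi[(x, y)] = log2( p(xy) / (p(x) * p(y)) )
    """
    unigrams, bigrams, firsts = Counter(), Counter(), Counter()
    for utt in utterances:
        unigrams.update(utt)
        pairs = list(zip(utt, utt[1:]))
        bigrams.update(pairs)
        firsts.update(x for x, _ in pairs)
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    tp, mi = {}, {}
    for (x, y), count in bigrams.items():
        tp[(x, y)] = count / firsts[x]
        p_xy = count / n_bi
        p_x, p_y = unigrams[x] / n_uni, unigrams[y] / n_uni
        mi[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return tp, mi

def scores_along(utt, score):
    """Predictability score at each of the len(utt) - 1 internal positions."""
    return [score[pair] for pair in zip(utt, utt[1:])]

def local_boundaries(vals):
    """Local comparison: boundary where the score is lower than both neighbours.

    A returned index b means the boundary falls just before segment b.
    """
    return [i + 1 for i in range(1, len(vals) - 1)
            if vals[i - 1] > vals[i] and vals[i + 1] > vals[i]]

def global_boundaries(vals, threshold):
    """Global comparison: boundary wherever the score is below a fixed threshold."""
    return [i + 1 for i, v in enumerate(vals) if threshold > v]
```

In this sketch two adjacent positions can never both be strict local minima, which is one way to see why a local comparison cannot isolate a one-phoneme word, whereas the global threshold has no such restriction.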
      <Paragraph position="3"> The mixed nature of these results suggests that evaluation depends fairly crucially on which performance metric needs to be optimized. This demands stronger prior hypotheses regarding the process and needed input of a vocabulary-acquiring child. However, it cannot be blindly assumed that children are selecting low points over as short a window as Brent's (1999a) MI and TP models suggest. Quite possibly the best model would involve either a hybrid of local and global comparisons, or a longer window, or even a 'gradient' window where far neighbors count less than near ones in a computed average.</Paragraph>
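One possible reading of the 'gradient' window idea is sketched below, purely as an illustration. It assumes that a boundary is proposed wherever the score at a position is lower than a distance-weighted average of its neighbours; the parameters radius and decay, and the geometric weighting itself, are hypothetical choices that the text above does not specify.

```python
def gradient_window_boundaries(vals, radius=3, decay=0.5):
    """Propose a boundary where a score is below the weighted mean of its neighbours.

    vals   : predictability scores at the internal positions of an utterance
    radius : how many neighbours on each side are consulted (assumed parameter)
    decay  : weight multiplier per extra step of distance (assumed parameter)
    A returned index b means the boundary falls just before segment b.
    """
    boundaries = []
    for i, v in enumerate(vals):
        total, weight_sum = 0.0, 0.0
        for d in range(1, radius + 1):
            w = decay ** (d - 1)  # nearest neighbours get weight 1
            for j in (i - d, i + d):
                if j >= 0 and len(vals) > j:
                    total += w * vals[j]
                    weight_sum += w
        if weight_sum > 0 and total / weight_sum > v:
            boundaries.append(i + 1)
    return boundaries
```

With radius=1 this compares each score with the mean of its immediate neighbours, which is close to, but not the same as, the strict local comparison; larger radii move it toward a global comparison.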
      <Paragraph position="4"> However, further speculation on this point is of less importance than considering how this cue interacts with others known experimentally to be salient to infants. Christiansen et al. (1998) and Johnson and Jusczyk (2001) have already begun simulating and testing these interactions in English. However, more work needs to be done to better understand the nature of these interactions cross-linguistically.</Paragraph>
    </Section>
    <Section position="2" start_page="9" end_page="9" type="sub_section">
      <SectionTitle>
4.2 Further Research
</SectionTitle>
      <Paragraph position="0"> As mentioned above, one obvious area for future research is the interaction between predictability cues like MI and utterance-final information; this is one of the cue combinations explored in Christiansen et al. (1998) in English. Previous research (Rytting, 2004) examined the role of utterance-final information in Greek, and found that this cue performs better than chance on its own. However, it seems that utterance-final information would be more useful as a filter on the heuristics explored here to restrain them from oversegmenting the utterance. Since nearly all Greek words end in /a/, /e/, /i/, /o/, /u/, /n/, or /s/, just restricting word boundaries to positions after these seven phonemes boosts boundary precision considerably with little effect on recall.</Paragraph>
      <Paragraph position="1"> Preliminary testing suggests that this filter boosts both precision and recall at the word level.</Paragraph>
      <Paragraph position="2"> However, a model that incorporates the likelihoods of word boundaries after each of these final segments, properly weighted, may be even more helpful than this simple, unweighted filter.</Paragraph>
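The simple, unweighted filter described above can be sketched as follows. The seven word-final phonemes come from the text; the function name, the boundary indexing, and the toy letter-per-phoneme transcription in the usage note are assumptions made only for illustration.

```python
# The seven segments that, according to the text, end nearly all Greek words.
GREEK_WORD_FINAL = frozenset({"a", "e", "i", "o", "u", "n", "s"})

def filter_boundaries(utt, boundaries, allowed_finals=GREEK_WORD_FINAL):
    """Keep only proposed boundaries that follow an allowed word-final segment.

    utt        : sequence of phoneme symbols for one utterance
    boundaries : indices b, each meaning "a boundary falls just before utt[b]"
    """
    return [b for b in boundaries if utt[b - 1] in allowed_finals]
```

For example, on the toy sequence list("tospitimas") with proposed boundaries [2, 4, 7, 8], the filter keeps [2, 7], since the other two candidates follow /p/ and /m/. A weighted variant would replace the set membership test with a per-segment likelihood, which is not attempted here.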
      <Paragraph position="3"> Another fruitful direction is the exploration of prosodic information such as lexical stress. With the exception of a certain class of clitic groups, Greek words have at most one stress. Hence, at least one word boundary must occur between two stressed vowels. Relations between stress and the beginnings and endings of words, while not predicted to be as robust a cue as in English (see, e.g., Cutler, 1996), should also provide useful information, both alone and in combination with segmental cues.</Paragraph>
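The stress constraint stated above (at least one boundary between any two stressed vowels) could be checked against a proposed segmentation roughly as follows. This is a sketch under stated assumptions: stressed-vowel positions are given as sorted indices, the function name is hypothetical, and the clitic-group exception mentioned in the text is ignored.

```python
def unsplit_stress_pairs(stressed, boundaries):
    """Return consecutive stressed-vowel pairs with no boundary between them.

    stressed   : sorted indices of stressed vowels in the utterance
    boundaries : indices b, each meaning "a boundary falls just before segment b"
    Each returned pair marks a span that must contain at least one more boundary.
    """
    violations = []
    for left, right in zip(stressed, stressed[1:]):
        # A boundary b separates the two vowels when it lies after left and not after right.
        if not any(b > left and right >= b for b in boundaries):
            violations.append((left, right))
    return violations
```

A segmenter could use the returned pairs to force an extra boundary at the least predictable position inside each violating span, although that combination is only one of several conceivable designs.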
      <Paragraph position="4"> Finally, the relationship between these more 'static' cues and the cues that emerge as vocabulary begins to be acquired (as in Brent's main MBDP-1 model and others discussed above) seems not to have received much attention in the literature. As vocabulary is learned, it can help bootstrap these cues by augmenting heuristic cues with actual probabilities derived from its parses. Hence, the combination of, e.g., MBDP-1 and these heuristics may prove more powerful than either approach alone.</Paragraph>
    </Section>
  </Section>
</Paper>