File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/96/c96-1018_concl.xml
Size: 2,138 bytes
Last Modified: 2025-10-06 13:57:27
<?xml version="1.0" standalone="yes"?> <Paper uid="C96-1018"> <Title>Unsupervised Discovery of Phonological Categories through Supervised Learning of Morphological Rules</Title> <Section position="8" start_page="99" end_page="99" type="concl"> <SectionTitle> 8 Conclusion </SectionTitle> <Paragraph position="0"> We have shown by example that machine learning techniques can profitably be used in linguistics as a tool for the comparison of linguistic theories and hypotheses or for the discovery of new linguistic theories in the form of linguistic rules or categories.</Paragraph> <Paragraph position="1"> The case study we presented concerns diminutive formation in Dutch, for which we showed that (i) machine learning techniques can be used to corroborate and falsify some of the existing theories about the phenomenon, and (ii) machine learning techniques can be used to (re)discover interesting linguistic rules (e.g. the rule solving the -etje versus -kje problem) and categories (e.g. the category of bimoraic vowels).</Paragraph> <Paragraph position="2"> The extracted system can of course also be used in language technology as a data-oriented system for solving particular linguistic tasks (in this case diminutive formation). In order to test the usability of the approach for this application, we compared the performance of the extracted rule system to the performance of the hand-crafted rule system proposed by Trommelen. Table 4 shows for each allomorph the number of errors by the C4.5 rules (trained using corpus NC, i.e. only the rhyme of the last syllable) as opposed to an implementation of the rules suggested by Trommelen. One problem with the latter is that they often suggest more than one allomorph (the rules are not mutually exclusive). In those cases where more than one rule applies, a choice was made at random. The comparison shows that C4.5 did a good job of finding an elegant and accurate rule-based description of the problem. 
This rule set is useful both in linguistics (for evaluation, refinement, and discovery of theories) and in language technology.</Paragraph> </Section> </Paper>