<?xml version="1.0" standalone="yes"?> <Paper uid="A00-2040"> <Title>A Finite State and Data-Oriented Method for Grapheme to Phoneme Conversion</Title> <Section position="6" start_page="308" end_page="308" type="concl"> <SectionTitle> 5 Concluding remarks </SectionTitle> <Paragraph position="0"> We have presented a method for grapheme to phoneme conversion that combines a hand-crafted finite state transducer with rules induced by transformation-based learning. An advantage of this method is that it achieves a high level of accuracy using relatively small training sets. Busser (1998), for instance, uses a memory-based learning strategy to achieve 90.1% word accuracy on the same task, but uses 90% of the CELEX data (over 300K words) as the training set and a (character/phoneme) window size of 9. Hoste et al. (2000a) achieve a word accuracy of 95.7% and a phoneme accuracy of 99.5% on the same task, using a combination of machine learning techniques as well as additional data obtained from a second dictionary.</Paragraph> <Paragraph position="1"> Given the result of Roche and Schabes (1997), an obvious next step is to compile the induced rules into an actual transducer and to compose it with the hand-crafted transducer. It should be noted, however, that the number of induced rules is quite large in some of the experiments, so the compilation procedure may require some attention.</Paragraph> </Section> </Paper>