<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1612">
  <Title>Automatic diacritization of Arabic for Acoustic Modeling in Speech Recognition</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> In this study we have investigated different options for automatically diacritizing Arabic text for use in acoustic model training for ASR. A comparison of the different approaches showed that more linguistic information (morphology and syntactic context) in combination with the acoustics provides lower diacritization error rates. However, there is no significant difference among the word error rates of ASR systems trained on data resulting from the different methods. [Caption fragment: ...tion with the baseline system. FBIS1, FBIS2 and FBIS3 correspond to the diacritization procedures described in Sections 4.1, 4.2 and 4.3, respectively. For the first approach we report results using the tagger probabilities with weights 1 and 5.]</Paragraph>
    <Paragraph position="1"> This result suggests that it is possible to use automatically diacritized training data for acoustic modeling, even if the data has a comparatively high diacritization error rate (23% in our case). Note, however, that one reason for this may be that the acoustic models are finally adapted to the accurately transcribed CH-only data. In the future, we plan to apply knowledge-poor diacritization procedures to other dialects of Arabic, for which morphological analyzers do not exist.</Paragraph>
  </Section>
</Paper>