<?xml version="1.0" standalone="yes"?>
<Paper uid="W01-1613">
  <Title>A Study of Automatic Pitch Tracker Doubling/Halving &quot;Errors&quot;</Title>
  <Section position="5" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
4 Future and Related Work
</SectionTitle>
    <Paragraph position="0"> A further step would be to coordinate descriptions of pitch tracking errors with categorizations of laryngealization, such as that of Batliner et al. (1993). A pitch value that falls within a &quot;subharmonic&quot; or a &quot;diplophonic&quot; laryngealization (both categories from MUSLI) may need to be doubled, and context-dependent doubling rules could make use of the MUSLI classification.</Paragraph>
    <Paragraph position="1"> Different kinds of final tone classification can be investigated once the post-processing of pitch measurements has been better established.</Paragraph>
    <Paragraph position="2"> Murray (2001) used automatic doubling rules and a different classification scheme, which resulted in lower performance than in this study.</Paragraph>
    <Paragraph position="3"> Shriberg (1999) mentions laryngealizations in the context of &quot;cut-off&quot; words, i.e., words that a speaker did not complete. In a corpus of human-computer dialogues on air travel planning (ATIS), cut-off words had a form of laryngealization corresponding to creaky voice, usually on the last 20-50 ms of the word. Better recognition of glottal pulses may lead to improved recognition of cut-off words, which are difficult phenomena for an ASR system.</Paragraph>
    <Paragraph position="4"> Brondsted (1997) reported that, for a specific dialect of Danish, the presence of the glottal consonant &quot;stod&quot; can cause a pitch tracker to incorrectly report a halved value. Further use of wideband spectrograms to establish conventions for locating glottal pulses and describing their influence on perceived pitch could assist dialogue research on other languages that have glottalized consonants.</Paragraph>
    <Paragraph position="5"> Black and Campbell (1995) presented a model for generating intonation patterns based on high-level discourse features automatically extracted from dialogue speech. Utterances carrying one particular discourse act label, the so-called &quot;d-yu-Q&quot; label, were reported to rise to significantly higher pitch values than those carrying other discourse act labels. Once pitch halvings and doublings are better understood, additional relationships between pitch changes and discourse acts might be discovered. Lastly, it would be useful to compare these data with the outputs of other pitch trackers, such as Praat (Boersma and Weenink, 2001) or an updated version of &quot;EDWave&quot; (Bunnell, 2001). More sophisticated mathematical models would also be interesting to apply to the final tone classification, especially in combination with different kinds of pitch tracking algorithms.</Paragraph>
  </Section>
</Paper>