
<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-0310">
  <Title>Using Dialogue Representations for Concept-to-Speech Generation</Title>
  <Section position="7" start_page="51" end_page="51" type="relat">
    <SectionTitle>
5 Related Work
</SectionTitle>
    <Paragraph position="0"> Although a number of earlier CTS systems have captured linguistic phenomena that we address in our work, the computation of prosody from dialogue representations is often not as rigorous, detailed or complete as in MIMIC-CTS. Further, while several systems use given/new information status to decide whether to accent or deaccent a lexical item, no system has directly implemented general rules for pitch accent type assignment. Together, MIMIC-CTS's computation of accentuation, pitch accent type and dialogue prosody constitutes the most general and complete implementation of a compositional theory of intonational meaning in a CTS system to date.</Paragraph>
    <Paragraph position="1"> Nevertheless, elements of a handful of previous CTS systems support the approaches taken in MIMIC-CTS toward conveying semantic, task and dialogue level meaning. For example, the Direction Assistant system (Davis and Hirschberg, 1988) mapped a hand-crafted route grammar to a discourse structure for generated directions. The discourse structure determined accentuation, with deaccenting of discourse-old entities realized (by lexically identical morphs) in the current or previous discourse segment. Other material was assigned accentuation based on lexical category information, with the exception that certain contrastive cases of accenting, such as left versus right, were stipulated for the domain.</Paragraph>
    <Paragraph position="2"> Accent assignment in the SUNDIAL travel information system (House and Yond, 1990) also relied on discourse and task models. Mutually known entities, said to be in negative focus, were deaccented; entities in the current task space, in referring focus, received (possibly contrastive) accenting; and entities of the same type as a previously mentioned object, were classified-as in either referring or emphatic focus, depending on the dialogue act~ in the cases of corrective situations or repeated system-intitiated queries, the contrasting or corrective items were emphatically accented.</Paragraph>
    <Paragraph position="3"> The BRIDGE project on speech generation (Zacharski etal., 1992) identified four main factors affecting accentability: linear 0rder, lexical category, semantic weight and givenness. In relatedwork (Monaghan, 1994), word accentability was quantitatively scored by hand-crafted rules based on information status, semantic focus and Word class. The givenness hierarchy of Gundel and'colleagues (1989), which associates lexical forms of expression with information statuses, was divided into four intervals, with scores assigned to each. A binary semantic focus score was based on whether the word occurred in the topic or comment of a sentence. Finally, lexical categories determined word class scores. These scores were combined, and metrical phonological rules then referred to final acce'ntability scores to assign a final accenting pattern.</Paragraph>
    <Paragraph position="4"> To summarize, all of the above CTS systems employ either hand-crafted or heuristic techniques for representing semantic and discourse focus information. Further, only SUNDIAL makes use of dialogue acts.</Paragraph>
  </Section>
class="xml-element"></Paper>