<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1008">
  <Title>Using decision trees to select the grammatical relation of a noun phrase</Title>
  <Section position="8" start_page="71" end_page="72" type="concl">
    <SectionTitle>
6. Conclusion
</SectionTitle>
    <Paragraph position="0"> Natural language generation is typically done under one of two scenarios. In the first scenario, language is generated ex nihilo: a planning component formulates propositions on the basis of a database query, a system event, or some other non-linguistic stimulus.</Paragraph>
    <Paragraph position="1"> Under such a scenario, the discourse status of referents is known, since the planning component has selected the discourse entities to be expressed. More abstract discourse features like [InformationStatus] can therefore be used to guide the linguistic encoding decisions.</Paragraph>
    <Paragraph position="2"> In the second, more typical scenario, natural language generation involves reformulating existing text, e.g. for summarization or machine translation. In this scenario, analysis of the linguistic stimulus will most likely have resulted in only a partial understanding of the source text. Coreference relations (e.g. between a pronoun and its antecedent) may not be fully resolved, discourse relations may be unspecified, and the information status of mentions is unlikely to have been determined. As was shown in section 5.2, the accuracy of the decision trees constructed without the feature [InformationStatus] is comparable to the accuracy that results from using this feature, since superficial elements of the linguistic form of a mention are motivated by the information status of the mention.</Paragraph>
    <Paragraph position="3"> The decision trees that were constructed to model the distribution of NPs in real texts can be used to guide the generation of natural language, especially to guide the selection among alternative grammatical ways of expressing the same propositional content.</Paragraph>
    <Paragraph position="4"> Sentences in which mentions occur in positions that are unlikely given a set of linguistic features should be avoided.</Paragraph>
    <Paragraph position="5"> One interesting problem remains for future research: why do writers occasionally place mentions in statistically unlikely positions? One possibility is that writers do so for stylistic variation. Another intriguing possibility is that statistically unusual occurrences reflect pragmatic markedness, i.e. that writers place NPs in certain positions in order to signal discourse information. Fox (1987), for example, demonstrates that when there is an episode boundary in the discourse, lexical NPs may be used for previously mentioned discourse entities where a pronoun might otherwise be expected. For example, a protagonist in a novel may be reintroduced by name at the beginning of a chapter. In future research we propose to examine the mentions that occur in places not predicted by the models. It may be that this approach to modeling the distribution of mentions, essentially a machine-learning approach that seeks to mine an abstract property of texts, will provide useful insights into issues of discourse structure.</Paragraph>
  </Section>
</Paper>