<?xml version="1.0" standalone="yes"?>
<Paper uid="P95-1006">
  <Title>Robust Parsing Based on Discourse Information: Completing partial parses of ill-formed sentences on the basis of discourse information</Title>
  <Section position="5" start_page="44" end_page="45" type="concl">
    <SectionTitle>
4 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have proposed a method for completing partial parses of ill-formed sentences on the basis of information extracted from complete parses of well-formed sentences in the discourse. Our approach to handling ill-formed sentences differs fundamentally from previous ones: it reanalyzes the part of speech and the modifiee-modifier relationships of each word in an ill-formed sentence by using information extracted from analyses of other sentences in the same text, thus attempting to generate the analysis most appropriate to the discourse. The results of our experiments show the effectiveness of this method; moreover, implementing it in a machine translation system improved the accuracy of the system's translations. Since the method has a simple framework that does not require any extra knowledge resources or inference mechanisms, it is robust and suitable for a practical natural language processing system. Furthermore, the improved parses obtained by using this method together with other disambiguation methods involving discourse information, as shown in another paper (Nasukawa, 1995), shortened the turnaround time (TAT) of the late stages of the translation procedure. This saving compensated for the extra TAT incurred by using the discourse information, provided that the size of the discourse was kept to between 100 and 300 sentences.</Paragraph>
    <Paragraph position="1"> In this paper, the term &quot;discourse&quot; is used to mean the set of words in a text together with the usage of each of those words in that text, namely, its part of speech and its modifiee-modifier relationships with other words. The basic idea of our method is to improve the accuracy of sentence analysis simply by maintaining consistency in the usage of morphologically identical words within the same text. The effectiveness of the method is therefore highly dependent on the source text, since it presupposes that morphologically identical words are likely to be repeated in the same text. However, the results have been encouraging, at least with technical documents such as computer manuals, where words with the same lemma are frequently repeated within a small area of text. Moreover, our method improves translation accuracy especially for frequently repeated phrases, which are usually considered important, and thus leads to an improvement in the overall accuracy of the natural language processing system.</Paragraph>
  </Section>
</Paper>