<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-1104">
  <Title>SYNTACTIC ANALYSIS OF NATURAL LANGUAGE USING LINGUISTIC RULES AND CORPUS-BASED PATTERNS</Title>
  <Section position="7" start_page="632" end_page="632" type="concl">
    <SectionTitle>
4 CONCLUSION
</SectionTitle>
    <Paragraph position="0"> We have discussed combining a linguistic rule-based parser with a corpus-based empirical parser. We divide the parsing process into two parts: applying linguistic information and applying corpus-based patterns. The linguistic rules are regarded as more reliable than the corpus-based generalisations and are therefore applied first.</Paragraph>
    <Paragraph position="1"> The idea is to use reliable linguistic information for as long as possible. After a certain phase it becomes harder and harder to write new linguistic constraints that eliminate the remaining ambiguity. Therefore we use corpus-based patterns to do the remaining disambiguation. The overall success rate of the combination of the linguistic rule-based parser and the corpus-based pattern parser is good. If some unresolvable ambiguity is left pending (like prepositional attachment), the total success rate of our morphological and surface-syntactic analysis is only slightly worse than that of many probabilistic part-of-speech taggers. This is a good result because we do more than just label each word with morphological tags (e.g.</Paragraph>
    <Paragraph position="2"> noun, verb, etc.); we also label them with syntactic function tags (e.g. subject, object, subject complement, etc.).</Paragraph>
    <Paragraph position="3"> Some improvements might be achieved by modifying the syntactic tag set of ENGCG. As discussed above, the (syntactic) tag set of ENGCG is probably not optimal. Some ambiguity is not resolvable (like prepositional attachment) and some distinctions are not made (like subjects of the finite and the non-finite clauses). A better tag set for surface-syntactic parsing is presented in [Voutilainen and Tapanainen, 1993]. We have not modified the present tag set, however, because it is not clear whether small changes would improve the result enough to justify the effort needed.</Paragraph>
    <Paragraph position="4"> Although it is not possible to fully disambiguate the syntax in ENGCG, the rate of disambiguation can be improved by using a more powerful linguistic rule formalism (see [Koskenniemi et al., 1992; Koskenniemi, 1990; Tapanainen, 1991]). The results reported in this study can most likely be improved by writing a syntactic grammar in the finite-state framework. The same kind of pattern parser could then be used for disambiguating the resulting analyses.</Paragraph>
  </Section>
</Paper>