<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-2033">
  <Title>A Debug Tool for Practical Grammar Development</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Functions of willex
</SectionTitle>
    <Paragraph position="0"> To build the new debugging tool, willex, we extended will (Imai et al., 1998), a browser for the parsing results of grammars based on feature structures. Both will and willex are implemented in Java.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Using XML Tagged Corpora
</SectionTitle>
      <Paragraph position="0"> Willex uses the sentence boundaries, word chunks, and POS tags/labels encoded in XML-tagged corpora.</Paragraph>
      <Paragraph position="1"> First, the sentence-boundary and word-chunking information reduces the ambiguity of each sentence, and therefore the ambiguity that arises during parsing. A parser connected to willex is assumed to produce only results consistent with this information.</Paragraph>
      <Paragraph position="2"> An example is shown in Figure 1 (&lt;su&gt; is a sentential tag and &lt;np&gt; is a tag for noun phrases).</Paragraph>
      <Paragraph position="3"> Next, willex compares the POS tags/labels encoded in the XML tags with the parsing results and deletes improper parse trees. This reduces the number of partial parse trees that arise during parsing and must be checked by human debuggers. Human debuggers can also delete partial parse trees manually afterwards. Figure 2 shows a concrete example. (NP and S are labels for noun and sentential</Paragraph>
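      <Paragraph position="4"> The filtering described in this subsection can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the willex implementation: the names TagFilterSketch, Span, crosses, consistent, and filter are hypothetical, spans are half-open word ranges, and a partial parse tree is assumed to be "improper" when it crosses a tagged chunk or covers exactly a tagged range with a different label. <![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names are taken from willex itself.
final class TagFilterSketch {

    /** A labeled, half-open word range [start, end). */
    static final class Span {
        final String label;
        final int start;
        final int end;

        Span(String label, int start, int end) {
            this.label = label;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return label + "[" + start + "," + end + ")";
        }
    }

    /** True if the spans overlap but neither contains the other (crossing brackets). */
    static boolean crosses(Span a, Span b) {
        return (a.start < b.start && b.start < a.end && a.end < b.end)
            || (b.start < a.start && a.start < b.end && b.end < a.end);
    }

    /** Assumed test: a tree is improper if it crosses a tag or relabels a tagged range. */
    static boolean consistent(Span tree, List<Span> corpusTags) {
        for (Span tag : corpusTags) {
            boolean sameRange = tree.start == tag.start && tree.end == tag.end;
            if (crosses(tree, tag) || (sameRange && !tree.label.equals(tag.label))) {
                return false;
            }
        }
        return true;
    }

    /** Returns the partial parse trees that survive the comparison with the corpus tags. */
    static List<Span> filter(List<Span> partialTrees, List<Span> corpusTags) {
        List<Span> kept = new ArrayList<>();
        for (Span tree : partialTrees) {
            if (consistent(tree, corpusTags)) {
                kept.add(tree);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative three-word sentence: an NP chunk over words 0-1 inside one sentence.
        List<Span> tags = Arrays.asList(new Span("NP", 0, 2), new Span("S", 0, 3));
        List<Span> trees = Arrays.asList(
            new Span("NP", 0, 2), new Span("VP", 0, 2), new Span("X", 1, 3),
            new Span("VP", 2, 3), new Span("S", 0, 3));
        System.out.println(filter(trees, tags));
    }
}
```
]]> In this sketch, the VP over the tagged NP range and the constituent that straddles the NP boundary are discarded automatically, so a human debugger would only need to inspect the remaining trees.</Paragraph>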
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Output of Grammar Defects
</SectionTitle>
      <Paragraph position="0"> Willex can write information about grammar defects to a file so that the defect data can be collected and analyzed statistically. The file also serves as a log of debugging sessions, recording which grammar defects were found.</Paragraph>
      <Paragraph position="1"> An example of an output file is shown in Table 1. Each row contains a sentence number, the word range in which parsing failed, and a comment entered by a human debugger. For example, the first row means that sentence #0 contains a coordination of verb phrases at positions #3-#12 that cannot be parsed. &amp;quot;OK&amp;quot; in the second row means that the sentence is parsed correctly (i.e., no grammar defects are found in it). The third row means that word #4 of sentence #2 has no proper lexical entry.</Paragraph>
      <Paragraph position="2"> Human debuggers specify the word ranges using a GUI that shows parsing results as CKY tables and parse trees. They enter comments in natural language or choose them from a list of previous comments. A postprocessing module of willex sorts the error data by comment to support statistical analysis.</Paragraph>
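      <Paragraph position="3"> The postprocessing step can be pictured with the following Java sketch, which sorts defect records by comment and counts how often each comment occurs. It is an illustration under stated assumptions rather than the actual willex module: the names DefectLogSketch, Defect, sortByComment, and countByComment are hypothetical, and the sample rows and comment strings are invented for the example, loosely modeled on the three rows described above. <![CDATA[
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the record layout is assumed, not taken from willex.
final class DefectLogSketch {

    /** One row of the defect file: sentence number, word range, debugger comment. */
    static final class Defect {
        final int sentence;
        final int from;   // first word of the failed range, or -1 if none
        final int to;     // last word of the failed range, or -1 if none
        final String comment;

        Defect(int sentence, int from, int to, String comment) {
            this.sentence = sentence;
            this.from = from;
            this.to = to;
            this.comment = comment;
        }

        @Override
        public String toString() {
            return "#" + sentence + " " + from + "-" + to + " " + comment;
        }
    }

    /** Returns a copy of the log ordered by comment, then by sentence number. */
    static List<Defect> sortByComment(List<Defect> log) {
        List<Defect> sorted = new ArrayList<>(log);
        sorted.sort(Comparator.comparing((Defect d) -> d.comment)
                              .thenComparingInt(d -> d.sentence));
        return sorted;
    }

    /** Counts how many rows carry each comment, as a starting point for statistics. */
    static Map<String, Integer> countByComment(List<Defect> log) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Defect d : log) {
            counts.merge(d.comment, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented sample rows; the comment strings are illustrative only.
        List<Defect> log = Arrays.asList(
            new Defect(0, 3, 12, "coordination of verb phrases"),
            new Defect(1, -1, -1, "OK"),
            new Defect(2, 4, 4, "no lexical entry"),
            new Defect(5, 7, 7, "no lexical entry"));
        for (Defect d : sortByComment(log)) {
            System.out.println(d);
        }
        System.out.println(countByComment(log));
    }
}
```
]]> Reusing comments from the list of previous comments matters for this kind of grouping, because rows are matched on the exact comment string.</Paragraph>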
      <Paragraph position="4"/>
    </Section>
  </Section>
</Paper>