<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2301">
  <Title>Robust Parsing, Error Mining, Automated Lexical Acquisition, and Evaluation</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In our attempts to construct a wide-coverage HPSG parser for Dutch, techniques to improve the overall robustness of the parser are required at various steps in the parsing process.</Paragraph>
    <Paragraph position="1"> Straightforward but important aspects include the treatment of unknown words, and the treatment of input for which no full parse is available.</Paragraph>
    <Paragraph position="2"> Another important means to improve the parser's performance on unexpected input is the ability to learn from its errors. In our methodology, we apply the parser to large quantities of text (preferably from different types of corpora) and then apply error mining techniques to identify potential errors. We furthermore apply machine learning techniques to correct some of those errors (semi-)automatically, in particular errors that are due to missing or incomplete lexical entries.</Paragraph>
    <Paragraph position="3"> Evaluating the robustness of a parser is notoriously hard. We argue against coverage as a meaningful evaluation metric. More generally, we argue against evaluation metrics that do not take accuracy into account. We propose to use the variance of accuracy across sentences (and more generally across corpora) as a measure of robustness.</Paragraph>
  </Section>
</Paper>