<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1039">
  <Title>Hurricane Date (Affected Place) Articles</Title>
  <Section position="6" start_page="308" end_page="309" type="evalu">
    <SectionTitle>
4.2 Results
</SectionTitle>
    <Paragraph position="0"> We evaluated 48 randomly chosen tables and found that 36 of them (75%) were consistent. We also counted the total number of rows that fit each description, shown in Table 3. Table 4 shows the descriptions of the selected tables. The largest consistent table was about hurricanes (Table 5). Although we cannot exactly measure the recall of each table, we tried to estimate it by comparing this hurricane table to a manually created one (Table  Hurricanes Katrina and Longwang, shown in the previous examples, are not included in this table because they appeared before this period.</Paragraph>
    <Paragraph position="2"> The second-largest table (about nominations of officials) is shown in Table 7.</Paragraph>
    <Paragraph position="3"> We reviewed 10 incorrect rows from various tables and found that 4 of them were due to coreference errors and 1 was due to a parse error. Another 4 errors arose when multiple basic patterns, distant from each other in the text, happened to refer to a different event reported in the same cluster. The cause of the one remaining error was obscure. Most inconsistent tables were a mixture of multiple relations, and some of their rows still looked consistent.</Paragraph>
    <Paragraph position="4"> We have a couple of open questions. First, the overall recall of our system might be lower than that of existing IE systems, as we rely on a cluster of comparable articles rather than a single document to discover an event. We might be able to improve this in the future by adjusting the basic clustering algorithm or the weighting scheme of basic patterns. Second, some combinations of basic patterns looked inherently vague. For example, we used the two basic patterns &quot;pitched&quot; and &quot;'s-series&quot; in the following sentence (the patterns are underlined): Ervin Santana pitched 5 1-3 gutsy innings in his postseason debut for the Angels, Adam Kennedy hit a go-ahead triple that sent Yankees outfielders crashing to the ground, and Los Angeles beat New York 5-3 Monday night in the decisive Game 5 of their AL playoff series. It is not clear whether this set of patterns can yield any meaningful relation. We are not sure how much this sort of table can affect overall IE performance.</Paragraph>
  </Section>
</Paper>