<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0909">
  <Title>Surfaces and Depths in Text Understanding: The Case of Newspaper Commentary</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
8 Conclusions
</SectionTitle>
    <Paragraph position="0"> Knowledge-based text understanding and surface-based analysis have in the past largely been perceived as very different enterprises that do not even share the same goals. (Footnote 2: In addition to this &quot;traditional&quot; pipeline approach, Reitter (2003) performed experiments with machine learning techniques, using our MAZ corpus as training data.)</Paragraph>
    <Paragraph position="1"> This paper argued that a synthesis can be useful, and in particular that knowledge-based understanding can benefit from stages of surface-based pre-processing. Given that (a) pre-coded knowledge will almost certainly have gaps when it comes to understanding a &quot;new&quot; text, and (b) surface-based methods yield &quot;some&quot; analysis for any text, however sparse, irrelevant or even wrong that analysis may be, a better notion of robustness is needed, one that explains how language understanding can be &quot;as good (deep) as possible or as necessary&quot;. The proposal is to first employ &quot;defensive&quot; surface-based methods to provide an initial, underspecified representation of text structure that has gaps but is relatively trustworthy. Then, this representation may be enriched with statistical, probabilistic, or heuristic information, which is marked as being less trustworthy. Finally, a &quot;deep&quot; analysis can map everything into a TBox/ABox scheme, possibly again filling some gaps in the text representation (ABox) on the basis of prior knowledge already encoded in the TBox. The deep analysis should not be an all-or-nothing step but should perform as well as possible: if something cannot be understood entirely, it should be content with a partial representation or, in the worst case, with a portion of the surface string.</Paragraph>
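    <Paragraph position="2"> The three stages proposed above can be illustrated with a minimal sketch. It is not the system described in the paper: the cue-word table CUES, the tiny TBOX dictionary, the default guess "elaboration", and all function names are hypothetical and chosen only for illustration. The sketch shows a defensive surface step that leaves gaps, an enrichment step whose guesses are marked as less trustworthy, and a deep step that falls back to the surface string when nothing better is available.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cue-word table and TBox; neither comes from the paper.
CUES = {"but": "contrast", "because": "cause"}
TBOX = {"contrast": "Contrast", "cause": "Causation", "elaboration": "Elaboration"}


@dataclass
class Segment:
    text: str
    relation: Optional[str] = None  # None marks a gap in the representation
    trust: Optional[str] = None     # "surface" or "heuristic"


def surface_stage(sentences):
    """Defensive step: assert a relation only if an explicit cue word opens the sentence."""
    segments = []
    for sentence in sentences:
        words = sentence.split()
        first_word = words[0].lower().strip(",") if words else ""
        if first_word in CUES:
            segments.append(Segment(sentence, CUES[first_word], "surface"))
        else:
            segments.append(Segment(sentence))  # leave the gap open
    return segments


def enrich_stage(segments):
    """Fill remaining gaps with a guess and mark it as less trustworthy."""
    for index, segment in enumerate(segments):
        if segment.relation is None and index > 0:
            segment.relation = "elaboration"  # assumed default guess
            segment.trust = "heuristic"
    return segments


def deep_stage(segments):
    """Map to ABox-style assertions; keep the surface string where no concept applies."""
    abox = []
    for segment in segments:
        concept = TBOX.get(segment.relation)
        if concept is None:
            abox.append(("SurfaceString", segment.text, segment.trust))
        else:
            abox.append((concept, segment.text, segment.trust))
    return abox


def analyse(sentences):
    """Run the three stages in order."""
    return deep_stage(enrich_stage(surface_stage(sentences)))
```

    Under these assumptions, calling analyse on the three sentences "The plan is costly.", "But it is needed." and "It saves money later." would keep the first as a bare surface string, label the second as a contrast with surface-level trust, and label the third as an elaboration with heuristic-level trust. Keeping the trust marker on every assertion is what would let a later stage decide how much weight each piece of the representation deserves.</Paragraph>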
  </Section>
</Paper>