<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0406">
  <Title>Identifying non-referential it: a machine learning approach incorporating linguistically motivated patterns</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> The automatic classification of it as either referential or non-referential has been relatively neglected in the computational linguistics literature, with only a handful of papers describing approaches to the problem. By the term &quot;non-referential it&quot;, we refer to those instances of it which do not introduce a new referent. In the previous literature these have been called &quot;pleonastic&quot;, &quot;expletive&quot;, and &quot;non-anaphoric&quot;. Identifying instances of non-referential it is important for generating the correct semantic interpretation of an utterance. For example, one step of this task is to associate pronouns with their referents. In an automated pronoun resolution system, it is useful to be able to skip over these instances of it rather than attempt an unnecessary search for a referent for them, only to end up with inaccurate results. The authors would like to thank the GE Foundation Faculty for the Future grant for their support of this project. We would also like to thank Detmar Meurers and Erhard Hinrichs for their helpful advice and feedback.</Paragraph>
    <Paragraph position="1"> The task of identifying non-referential it could be incorporated into a part-of-speech tagger or parser, or viewed as an initial step in semantic interpretation.</Paragraph>
    <Paragraph position="2"> We develop a linguistically motivated classification of non-referential it that comprises four types: extrapositional, cleft, weather/condition/time/place, and idiomatic, each of which will be discussed in more detail in Section</Paragraph>
  </Section>
</Paper>