<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1035">
  <Title>CORRECTING ILLEGAL NP OMISSIONS USING LOCAL FOCUS</Title>
  <Section position="3" start_page="273" end_page="273" type="metho">
    <SectionTitle>
2 FOCUS TRACKING
</SectionTitle>
    <Paragraph position="0"> Our focusing algorithm is based on Sidner's focusing algorithm for tracking local and actor foci (Sidner 1979; Sidner 1983). In each sentence, the actor focus (AF) is identified with the (thematic) agent of the sentence. The Potential Actor Focus List (PAFL) contains all NP's that specify an animate element of the database but are not the agent of the sentence.</Paragraph>
    <Paragraph position="1"> Tracking local focus is more complex. The first sentence in a text can be said to be about something. That something is called the current focus (CF) of the sentence and can generally be identified via syntactic means, taking into consideration the thematic roles of the elements in the sentence. In addition to the CF, an initial sentence introduces a number of other items (any of which can become the focus of the next sentence). Thus, these items are recorded in a potential focus list (PFL).</Paragraph>
    <Paragraph position="2"> At any given point in a well-formed text, after the first sentence, the writer has a number of options: * Continue talking about the same thing; in this case, the CF doesn't change.</Paragraph>
    <Paragraph position="3"> * Talk about something just introduced; in this case, the CF is selected from the previous sentence's PFL.</Paragraph>
    <Paragraph position="4"> * Return to a topic of previous discussion; in this case, that topic must have been the CF of a previous sentence.</Paragraph>
    <Paragraph position="5"> * Discuss an item previously introduced, but which was not the topic of previous discussion; in this case, that item must have been on the PFL of a previous sentence.</Paragraph>
    <Paragraph position="6"> The decision (by the reader/hearer/algorithm) as to which of these alternatives was chosen by the speaker is based on the thematic roles (with particular attention to the agent role) held by the anaphora of the current sentence, and whether their co-specification is the CF, a previous CF, or a member of the current PFL or a previous PFL.</Paragraph>
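    <Paragraph> The four options can be sketched as a lookup over the focus data structures. This is only an illustration: the function name, the string labels, and the order in which the structures are checked are assumptions, and the sketch ignores the thematic-role and inferencing checks that the actual decision relies on.

```python
def classify_move(item, cf, pfl, stacked_cfs, stacked_pfls):
    """Name the writer's option implied by what an anaphor co-specifies.

    item         -- the element the anaphor is taken to co-specify
    cf, pfl      -- CF and PFL of the previous sentence
    stacked_cfs  -- CFs of earlier sentences (hypothetical flat list)
    stacked_pfls -- PFLs of earlier sentences (hypothetical list of lists)
    """
    if item == cf:
        return "continue"                 # same thing; CF does not change
    if item in pfl:
        return "just introduced"          # CF taken from the previous PFL
    if item in stacked_cfs:
        return "return to earlier topic"  # a CF of a previous sentence
    if any(item in old_pfl for old_pfl in stacked_pfls):
        return "earlier non-topic item"   # on the PFL of a previous sentence
    return "unresolved"
```

    With the previous CF set to I and MONEY on the previous PFL, an anaphor co-specifying MONEY is classified as "just introduced".</Paragraph>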
    <Paragraph position="7"> Confirmation of co-specifications requires inferencing based on general knowledge and semantics. At each sentence in the discourse, the CF and PFL of the previous sentence are stacked for the possibility of subsequent return. When one of these items is returned to, the stacked CF's and PFL's above it are popped, and are thus no longer available for return.</Paragraph>
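    <Paragraph> The stacking behavior described above might be modeled as follows. The class and method names are hypothetical, and the sketch assumes that the entry being returned to stays on the stack while everything above it is popped; the text only states that the entries above it are removed.

```python
from dataclasses import dataclass, field


@dataclass
class FocusFrame:
    """CF and PFL recorded for one sentence (illustrative names)."""
    cf: str
    pfl: list[str] = field(default_factory=list)


class FocusStack:
    """Earlier sentences' CF/PFL pairs, most recent on top."""

    def __init__(self) -> None:
        self.frames: list[FocusFrame] = []

    def push(self, frame: FocusFrame) -> None:
        """Stack the CF and PFL of the previous sentence."""
        self.frames.append(frame)

    def return_to(self, item: str) -> bool:
        """Return to a stacked CF or PFL member, popping the frames above it."""
        # Search from the top of the stack downwards.
        for depth in range(len(self.frames) - 1, -1, -1):
            frame = self.frames[depth]
            if item == frame.cf or item in frame.pfl:
                del self.frames[depth + 1:]  # frames above are no longer available
                return True
        return False
```

    Returning to an item found in the bottom frame of a three-frame stack leaves only that frame available for later returns.</Paragraph>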
  </Section>
  <Section position="4" start_page="273" end_page="274" type="metho">
    <SectionTitle>
2.1 FILLING IN A MISSING NP
</SectionTitle>
    <Paragraph position="0"> We propose extending this algorithm to identify an illegally omitted NP. To do this, we treat the omitted NP as an anaphor which, like Sidner's treatment of full definite NP's and personal pronouns, co-specifies an element recorded by the focusing algorithm. This approach is based on the belief that an omitted NP is likely to be the topic of a previous sentence. We define preferences among the focus data structures which are similar to Sidner's preferences.</Paragraph>
    <Paragraph position="1"> More specifically, when we encounter an omitted NP that is not the agent, we first try to fill the deleted NP with the CF of the immediately preceding sentence. If syntax, semantics or inferencing based on general knowledge cause this co-specification to be rejected, we then consider members of the PFL of the previous sentence as fillers for the deleted NP. If these too are rejected, we consider stacked CF's and elements of stacked PFL's, taking into account preferences (yet to be determined) among these elements.</Paragraph>
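    <Paragraph> A minimal sketch of this candidate ordering for an omitted non-agent NP is given below. The function name and the `acceptable` callback, which stands in for the syntactic, semantic, and inference-based checks, are hypothetical. Since preferences among stacked elements are yet to be determined, the sketch simply assumes that more recently stacked entries are tried first.

```python
from typing import Callable, Optional


def fill_omitted_non_agent(
    prev_cf: str,
    prev_pfl: list[str],
    stacked: list[tuple[str, list[str]]],
    acceptable: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate filler that the checks do not reject.

    stacked holds (CF, PFL) pairs of earlier sentences, oldest first.
    """
    # 1. CF of the immediately preceding sentence, 2. members of its PFL.
    candidates = [prev_cf, *prev_pfl]
    # 3. Stacked CFs and PFL members (assumed order: most recent first).
    for cf, pfl in reversed(stacked):
        candidates.append(cf)
        candidates.extend(pfl)
    for candidate in candidates:
        if acceptable(candidate):
            return candidate
    return None
```

    If nothing rejects the previous CF, it is chosen as the filler; otherwise the search moves on to the previous PFL and then to the stacks.</Paragraph>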
    <Paragraph position="2"> When we encounter an omitted agent NP, in a simple sentence or a sentence-initial clause, we first test the AF of the previous sentence as co-specifier, then members of the PAFL, the previous CF, and finally stacked AF's, CF's and PAFL's. To identify a missing agent NP in a non-sentence-initial clause, our algorithm will first test the AF of the previous clause, and then follow the same preferences just given. Further preferences are yet to be determined, including those between the stacked AF, stacked PAFL, and stacked CF.</Paragraph>
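    <Paragraph> The preference order for an omitted agent NP can be written out as a candidate list. The function and parameter names below are illustrative, and because the relative order of the stacked AF's, CF's, and PAFL's is still open, the order used for them here is an arbitrary assumption.

```python
def agent_candidates(prev_af, prev_pafl, prev_cf,
                     stacked_afs=(), stacked_cfs=(), stacked_pafls=(),
                     prev_clause_af=None):
    """List co-specifier candidates for an omitted agent NP, best first.

    prev_clause_af is given only for a non-sentence-initial clause.
    """
    order = []
    if prev_clause_af is not None:
        order.append(prev_clause_af)   # AF of the previous clause
    order.append(prev_af)              # AF of the previous sentence
    order.extend(prev_pafl)            # members of the PAFL
    order.append(prev_cf)              # the previous CF
    order.extend(stacked_afs)          # stacked structures (order assumed)
    order.extend(stacked_cfs)
    for pafl in stacked_pafls:
        order.extend(pafl)
    # Drop repeated items while keeping the first (most preferred) position.
    return list(dict.fromkeys(order))
```

    Each candidate would then be tested in turn, and the first one that causes no contradiction fills the omitted agent.</Paragraph>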
    <Section position="1" start_page="273" end_page="274" type="sub_section">
      <SectionTitle>
2.2 COMPUTING THE CF
</SectionTitle>
      <Paragraph position="0"> To compute the CF of a sentence without any illegally omitted NP's, we prefer the CF of the last sentence over members of the PFL, and PFL members over members of the focus stacks. Exceptions to these preferences involve picking a non-agent anaphor co-specifying a PFL member over an agent co-specifying the CF, and preferring a PFL member co-specified by a pronoun to the CF co-specified by a full definite description.</Paragraph>
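      <Paragraph> One possible reading of these preferences, for a sentence with no omitted NP, is sketched below. The names are hypothetical, the focus stacks are left out for brevity, and the way the two exceptions are tested is an interpretation of the description rather than a specification.

```python
from dataclasses import dataclass


@dataclass
class Anaphor:
    cospecifies: str   # element of the focus structures it co-specifies
    is_agent: bool     # holds the thematic agent role
    is_pronoun: bool   # False means a full definite description


def choose_cf(anaphors, last_cf, last_pfl):
    """Pick the new CF from the last CF or the last PFL (stacks omitted)."""
    cf_hits = [a for a in anaphors if a.cospecifies == last_cf]
    pfl_hits = [a for a in anaphors if a.cospecifies in last_pfl]
    if not cf_hits:
        return pfl_hits[0].cospecifies if pfl_hits else None
    for p in pfl_hits:
        # Exception 1: non-agent PFL co-specifier beats an agent CF co-specifier.
        if not p.is_agent and all(c.is_agent for c in cf_hits):
            return p.cospecifies
        # Exception 2: pronoun for a PFL member beats a full description for the CF.
        if p.is_pronoun and not any(c.is_pronoun for c in cf_hits):
            return p.cospecifies
    return last_cf  # default preference: keep the CF of the last sentence
```

    Under this reading, a sentence whose agent pronoun co-specifies the last CF and whose non-agent NP co-specifies a PFL member shifts the CF to that PFL member.</Paragraph>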
      <Paragraph position="1"> To compute the CF of a sentence with an illegally omitted NP, our algorithm treats illegally omitted NP's as anaphora since they (implicitly) co-specify something in the preceding discourse. However, it is important to remember that discourse-oriented languages allow deletions of NP's that are the topic of the discourse. Thus, we prefer a deleted non-agent as the focus, as long as it closely ties to the previous sentence. Therefore, we prefer the co-specifier of the omitted non-agent NP as the (new) CF if it co-specifies either the last CF or a member of the last PFL. If the omitted NP is the thematic agent, we prefer for the new CF to be a pronominal (or, as a second choice, full definite description) non-agent anaphor co-specifying either the last CF or a member of the last PFL (allowing the deleted agent NP to be the AF and keeping the AF and CF different). If no anaphor meets these criteria, then the members of the CF and PFL focus stacks will be considered, testing a co-specifier of the omitted NP before co-specifiers of pronouns and definite descriptions at each stack level.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="274" end_page="274" type="metho">
    <SectionTitle>
3 EXAMPLE
</SectionTitle>
    <Paragraph position="0"> Below, we describe the behavior of the extended algorithm on an example from our collected texts containing both a deleted non-agent and agent.</Paragraph>
    <Paragraph position="1"> Example: &quot;(S1) First, in summer I live at home with my parents. (S2) I can budget money easily.</Paragraph>
    <Paragraph position="2"> (S3) I did not spend lot of money at home because at home we have lot of good foods, I ate lot of foods.</Paragraph>
    <Paragraph position="3"> (S4) While living at college I spend lot of money because_ go out to eat almost everyday. (S5) At home, sometimes my parents gave me some money right away when I need_.&quot; After S1, the AF is I, the CF is I, and the PFL contains SUMMER, HOME, and the LIVE VP. For S2, I is the only anaphor, so it becomes the CF, the PFL contains MONEY and the BUDGET VP, and the focus stack contains I and the previous PFL.</Paragraph>
    <Paragraph position="4"> S3 is a complex sentence using the conjunction &quot;because.&quot; Such sentences are not explicitly handled by Sidner's algorithm. Our analysis so far suggests that we should not split this sentence into two 6, and should prefer elements of the main clause as focus candidates. Thus, we take the CF from the first clause, and rank other elements in that clause before elements in the second clause on the PFL. 7 In this case, we have several anaphora: I, money, at home .... The AF remains I. The CF becomes MONEY since it co-specifies a member of the PFL and since the co-specifier of the last CF is the agent. Ordering the elements of the first clause before the elements in the second results in the PFL containing HOME, the NOT SPEND VP, GOOD FOOD, and the HAVE VP. We stack the CF and the PFL of S2.</Paragraph>
    <Paragraph position="5"> Note that S4 has a missing agent in the second clause. To identify the missing agent in a non-sentence-initial clause, our algorithm will first test the AF of the preceding clause for possible co-specification. Because this co-specification would cause no contradiction, the omitted NP is filled with 'I', which is eventually taken as the AF of S4. The CF is computed by first considering the first clause of S4, since the X clause is the preferred clause of an X BECAUSE Y construct. Since &quot;money&quot; co-specifies the CF of S3, and nothing else in the preferred clause co-specifies a member of the PFL, MONEY remains the CF. The PFL contains COLLEGE, the SPEND VP, EVERY DAY, the TO EAT VP, and the GO OUT TO EAT VP. We stack the CF and PFL of S3.</Paragraph>
    <Paragraph position="6"> S5 contains a subordinate clause with a missing non-agent. Our algorithm first considers the CF, MONEY, as the co-specifier of the omitted NP; syntax, semantics and general knowledge inferencing do not prevent this co-specification, so it is adopted. MONEY is also chosen as the CF since it is the co-specifier of the omitted NP occurring in the verb complement clause, which is the preferred clause in this type of construct.</Paragraph>
    <Paragraph position="7"> 6 If we were to split the sentence up, then the focus would shift away from MONEY when we process the second clause (which contradicts our intuition of what the focus is in this paragraph).</Paragraph>
    <Paragraph position="8"> 7 The appropriateness of placing elements from both clauses in one PFL and ranking them according to clause membership will be further investigated. This construct (&quot;X BECAUSE Y&quot;) is further discussed in section 4.</Paragraph>
  </Section>
  <Section position="6" start_page="274" end_page="275" type="metho">
    <SectionTitle>
4 DISCUSSION OF EXTENSIONS
</SectionTitle>
    <Paragraph position="0"> One of the major extensions needed in Sidner's algorithm is a mechanism for handling complex sentences. Based on a limited analysis of sample texts, we propose computing the CF and PFL of a complex sentence based on a classification of sentence types. For instance, for a sentence of the form &quot;X BECAUSE Y&quot; or &quot;BECAUSE Y, X&quot;, we prefer the expected focus of the effect clause as CF, and order elements of the X clause on the PFL before elements of the Y clause. Analogous PFL orderings apply to other sentence types described here. For a sentence of the form &quot;X CONJ Y&quot;, where X and Y are sentences, and CONJ is &quot;and&quot;, &quot;or&quot;, or &quot;but&quot;, we prefer the expected focus of the Y clause. For a sentence of the form &quot;IF X (THEN) Y&quot;, we prefer the expected focus of the THEN clause, while for &quot;X, IF Y&quot;, we prefer the expected focus of the X clause. Further study is needed to determine other preferences and actions (including how to further order elements on the PFL) for these and other sentence types. These preferences will likely depend on thematic roles and syntactic criteria (e.g., whether an element occurs in the clause containing the expected CF).</Paragraph>
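    <Paragraph> The sentence-type classification above can be summarized as a small table mapping each pattern to its preferred clause. The pattern strings, the table name, and the function are illustrative assumptions, and applying the preferred-clause-first PFL ordering to every pattern is one reading of the "analogous orderings" mentioned in the text.

```python
# Preferred clause per sentence pattern, as described in the text.
PREFERRED_CLAUSE = {
    "X BECAUSE Y": "X",     # effect clause
    "BECAUSE Y, X": "X",    # effect clause
    "X CONJ Y": "Y",        # CONJ is "and", "or", or "but"
    "IF X (THEN) Y": "Y",   # THEN clause
    "X, IF Y": "X",
}


def merge_pfl(pattern, x_items, y_items):
    """Build one PFL, ranking elements of the preferred clause first."""
    if PREFERRED_CLAUSE[pattern] == "X":
        first, second = x_items, y_items
    else:
        first, second = y_items, x_items
    return list(first) + list(second)
```

    For an "X BECAUSE Y" sentence, the elements of the X clause precede those of the Y clause on the resulting PFL.</Paragraph>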
    <Paragraph position="1"> The decisions about how these and other extensions should proceed have been or will be based on analysis of both standard written English and the written English of deaf students. The algorithm will be developed to match the intuitions of native English speakers as to how focus shifts.</Paragraph>
    <Paragraph position="2"> A second difference between our algorithm and Sidner's is that we stack the PFL's as well as the CF's. We think that stacking the PFL's may be needed for processing standard English (and not just for our purposes) since focus sometimes revolves around the theme of one of the clauses of a complex sentence, and later returns to revolve around items of another clause. Further investigation may indicate that we need to add new data structures or enhance existing ones to handle focus shifts related to these and other complex discourse patterns.</Paragraph>
    <Paragraph position="3"> We should note that while we prefer the CF as the co-specifier of an omitted NP, Sidner's recency rule suggests that perhaps we should prefer a member of the PFL if it is the last constituent of the previous sentence (since a null argument seems similar to pronominal reference). However, our studies show that a rule analogous to the recency rule does not seem to be needed for resolving the co-specifier of an omitted NP. In addition, Carter (1987) feels the recency rule leads to unreliable predictions for co-specifiers of pronouns. Thus, we do not expect to change our algorithm to reflect the recency rule. (We also believe we will abandon the recency rule for resolving pronouns.)</Paragraph>
    <Paragraph position="4"> Another task is to specify focus preferences among stacked PFL's and stacked CF's, perhaps using thematic and syntactic information.</Paragraph>
    <Paragraph position="5"> An important question raised by our analysis is how to handle a paragraph-initial, but not discourse-initial, sentence. Do we want to treat it as discourse-initial, or as any other non-discourse-initial sentence? We suggest (based on analysis of samples) that we should treat the sentence as any non-discourse-initial sentence, unless its sentence type matches one of a set of sentence types (which often mark focus movement from one element to a new one). In this latter case, we will treat the sentence as discourse-initial by calculating the CF and PFL in the same manner as a discourse-initial sentence, but we will retain the focus stacks. We have identified a number of sentence types that should be included in the set of types which trigger the latter treatment; we will explore whether other sentence types should be included in this set.</Paragraph>
  </Section>
</Paper>