<?xml version="1.0" standalone="yes"?> <Paper uid="E95-1041"> <Title>An Algorithm to Co-Ordinate Anaphora Resolution and PPS Disambiguation Process</Title> <Section position="3" start_page="0" end_page="284" type="metho"> <SectionTitle> 2 The algorithm </SectionTitle> <Paragraph position="0"> Two of the main principles of the algorithm are: a) The algorithm is applied to the text sentence by sentence, i.e. the ambiguities of the previous sentences have already been considered (resolved or not).</Paragraph> <Paragraph position="1"> b) The anaphora procedure skips the resolution of a given anaphor when this anaphor is preceded by an unattached preposition. This is because the resolution rules may receive an empty role as a parameter as a result of this unattached preposition. The resolution of the anaphor is then postponed to the second phase of anaphora resolution.</Paragraph> <Paragraph position="2"> The proposed procedure is based on successive calls to the anaphora module and to the PP attachment module. The output of each call is a set of CSs that represent the intermediate results exchanged between calls and on which both modules operate in turn. The aim is to fill the roles in the CSs that are left unfilled because of anaphora or unattached PPs. 
To summarize, the algorithm is: 1) Apply the anaphora module first.</Paragraph> <Paragraph position="3"> 2) Apply the PP attachment procedure.</Paragraph> <Paragraph position="4"> 3) If some anaphors are left unresolved, apply the anaphora module again.</Paragraph> <Paragraph position="5"> 4) If there are still unattached PPs, apply the attachment procedure again.</Paragraph> <Paragraph position="6"> 5) Repeat (3) and (4) until all PPs and anaphors are treated.</Paragraph> <Paragraph position="7"> The order in which the two modules are called is based on efficiency considerations, deduced from statistics computed on the COBALT corpora.</Paragraph> <Paragraph position="8"> Three main cases are faced by the algorithm: a) When the anaphor occurs before a given preposition in the sentence, its resolution does not depend on where the preposition is to be attached (except for cataphors, which are quite rare). In this case the anaphora module can be applied before the attachment procedure.</Paragraph> <Paragraph position="9"> Example 1 below shows that the anaphoric pronoun &quot;that&quot; must be resolved first and that the PP starting with &quot;of&quot; can be attached later.</Paragraph> <Paragraph position="10"> (1) The sale of Credito was first proposed last August and that of BCI late last year.</Paragraph> <Paragraph position="11"> b) When the anaphor occurs after one or several unattached prepositions, it could be an intrasentential anaphor (i.e. one referring to an entity in the same sentence), and its resolution may then depend on one of the previous prepositional phrases. 
In this case, the resolution of the anaphor is postponed to a later call of the anaphora module, according to principle b) stated above.</Paragraph> <Paragraph position="12"> c) When the anaphor is included in a PP (a particular case of b), the PP attachment rules need semantic information about the &quot;object&quot; of the PP; when the object is a pronoun, no semantic information is available, so the attachment rules cannot be applied. The anaphoric pronouns have to be resolved first, so as to determine what semantic class they refer to; the PP attachment procedure can then be applied. When a sequence contains more than two such PPs, i.e. PPs with anaphors as objects, the length of a cycle is more than 4.</Paragraph> </Section> <Section position="4" start_page="284" end_page="284" type="metho"> <SectionTitle> 3 An example </SectionTitle> <Paragraph position="0"> (2) UPHB shares have been suspended since October ~g at the firm's request following a surge in its share price on a takeover rumour.</Paragraph> <Paragraph position="1"> - The pronoun &quot;its&quot; cannot be resolved by the anaphora resolution module because it is preceded by unattached PPs; its resolution is skipped.</Paragraph> <Paragraph position="2"> - The PP attachment procedure is then called to determine the attachment of &quot;since&quot; and &quot;at&quot;; the &quot;in&quot; PP is left unattached because its object comprises the anaphoric pronoun &quot;its&quot; (case c).</Paragraph> <Paragraph position="3"> - The anaphora module is called again to resolve the anaphoric pronoun &quot;its&quot;, which is possible in this example since the previous PPs have been attached and there are no anaphors before it.</Paragraph> <Paragraph position="4"> - Finally, the PP attachment procedure has to be called again for the &quot;in&quot; PP.</Paragraph> <Paragraph position="5"> Notice that even if each module is called several times, there is no redundancy in the processing. 
The algorithm should be seen as splitting both the anaphora resolution and PP attachment procedures into several phases, not as repeating each procedure.</Paragraph> </Section> </Paper>
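<!--
Editorial sketch (not part of the original paper): the five-step control loop of Section 2 and the walk-through of Section 3 can be illustrated in Python roughly as follows. Everything in it is a simplifying assumption made for illustration: the Anaphor and PP classes, the function names, and the two toy "modules", which only flip a flag where the real modules would apply resolution and attachment rules to the CSs. In particular, the sketch assumes that the PP containing an anaphor does not itself block that anaphor (case c), and it adds a no-progress guard that the paper does not mention.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass
class Anaphor:
    text: str
    resolved: bool = False


@dataclass
class PP:
    prep: str
    obj: Optional[Anaphor] = None  # anaphor inside the PP object, if any (case c)
    attached: bool = False


Item = Union[Anaphor, PP]


def anaphora_pass(items: List[Item]) -> List[str]:
    """Resolve each anaphor not preceded by an unattached PP (principle b).

    The PP that contains the anaphor is ignored when checking for blockers.
    """
    done = []
    for i, item in enumerate(items):
        if isinstance(item, Anaphor) and not item.resolved:
            blocked = any(
                isinstance(p, PP) and not p.attached and p.obj is not item
                for p in items[:i]
            )
            if not blocked:
                item.resolved = True  # stand-in for the real resolution rules
                done.append(item.text)
    return done


def pp_pass(items: List[Item]) -> List[str]:
    """Attach each PP whose object is not an unresolved anaphor (case c)."""
    done = []
    for item in items:
        if isinstance(item, PP) and not item.attached:
            if item.obj is None or item.obj.resolved:
                item.attached = True  # stand-in for the real attachment rules
                done.append(item.prep)
    return done


def coordinate(items: List[Item]) -> List[Tuple[str, List[str]]]:
    """Alternate the two modules, anaphora first, until nothing is pending."""
    trace = []
    while True:
        progress = False
        if any(isinstance(x, Anaphor) and not x.resolved for x in items):
            done = anaphora_pass(items)
            trace.append(("anaphora", done))
            progress = progress or bool(done)
        if any(isinstance(x, PP) and not x.attached for x in items):
            done = pp_pass(items)
            trace.append(("pp", done))
            progress = progress or bool(done)
        if not progress:  # assumed guard: stop when a full cycle changes nothing
            return trace


# Simplified encoding of example (2): only the PPs discussed in Section 3.
its = Anaphor("its")
sentence = [PP("since"), PP("at"), PP("in", obj=its), its]
trace = coordinate(sentence)
print(trace)
```

Under these assumptions the trace for example (2) contains four calls: an anaphora call that resolves nothing, a PP call that attaches "since" and "at", an anaphora call that resolves "its", and a PP call that attaches "in". Each call handles only items left pending by the previous ones, which is the sense in which the procedures are split into phases rather than repeated.
-->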