<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2150"> <Title>An Estimate of Referent of Noun Phrases in Japanese Sentences</Title> <Section position="4" start_page="0" end_page="912" type="metho"> <SectionTitle> 2 Referential Property of a Noun </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="912" type="sub_section"> <SectionTitle> Phrase </SectionTitle> <Paragraph position="0"> The following is an example of noun phrase anaphora. &quot;OJIISAN (old man)&quot; in the first sentence and &quot;OJIISAN (old man)&quot; in the second sentence refer to the same old man, and they are in an anaphoric relation.</Paragraph> </Section> </Section> <Section position="5" start_page="912" end_page="912" type="metho"> <SectionTitle> OJIISAN TO OBAASAN-GA SUNDEITA. </SectionTitle> <Paragraph position="0"> (an old man) (and) (an old woman) (lived) (There lived an old man and an old woman.)</Paragraph> </Section> <Section position="6" start_page="912" end_page="912" type="metho"> <SectionTitle> OJIISAN-WA YAMA-HE SHIBAKARI-NI ITTA. </SectionTitle> <Paragraph position="0"> (old man) (mountain) (to gather firewood) (go) (The old man went to the mountains to gather firewood.) (2) Indefinite noun phrase: An indefinite noun phrase denotes an arbitrary member of the class of the noun phrase. For example, &quot;INU(dog)&quot; in the following sentence is an indefinite noun phrase.</Paragraph> </Section> <Section position="7" start_page="912" end_page="912" type="metho"> <SectionTitle> INU-GA SANBIKI IRU. </SectionTitle> <Paragraph position="0"> (dog) (three) (there is) (There are three dogs.) (5) An indefinite noun phrase cannot refer to the entity denoted by a noun phrase that has already appeared. When the system analyzes the anaphoric relation of noun phrases like these, the referential properties of noun phrases are important. The referential property of a noun phrase here means how the noun phrase denotes the referent. 
If the system can recognize that the second &quot;OJIISAN (old man)&quot; has the referential property of a definite noun phrase, indicating that the noun phrase refers to a contextually non-ambiguous entity, it will be able to judge that the second &quot;OJIISAN (old man)&quot; refers to the entity denoted by the first &quot;OJIISAN (old man)&quot;. The referential property plays an important role in clarifying the anaphoric relation.</Paragraph> <Paragraph position="1"> We previously classified noun phrases by referential property into the following three types (Murata and Nagao 1993).</Paragraph> <Paragraph position="2"> NP = { generic NP, non-generic NP = { definite NP, indefinite NP } } Generic noun phrase: A noun phrase is classified as generic when it denotes all members of the class described by the noun phrase or the class itself. For example, &quot;INU(dog)&quot; in the following sentence is a generic noun phrase.</Paragraph> <Paragraph position="4"> A generic noun phrase cannot refer to the entity denoted by an indefinite or definite noun phrase. Two generic noun phrases can have the same referent.</Paragraph> <Paragraph position="5"> Definite noun phrase: A noun phrase is classified as definite when it denotes a contextually non-ambiguous member of the class of the noun phrase.</Paragraph> <Paragraph position="6"> For example, &quot;INU(dog)&quot; in the following sentence is a definite noun phrase.</Paragraph> </Section> <Section position="8" start_page="912" end_page="912" type="metho"> <SectionTitle> INU-WA MUKOUHE ITTA. </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> A definite noun phrase can refer to the entity denoted by a noun phrase that has already appeared. 3 How to Determine the Referent of a Noun Phrase To determine referents of noun phrases, we made the following three constraints.</Paragraph> <Paragraph position="3"> 1. Referential property constraint 2. Modifier constraint 3. 
Possessor constraint When two noun phrases which have the same head noun satisfy these three constraints, the system judges that the two noun phrases have the same referent.</Paragraph> <Section position="1" start_page="912" end_page="912" type="sub_section"> <SectionTitle> 3.1 Referential Property Constraint </SectionTitle> <Paragraph position="0"> First, our system estimates the referential property of a noun phrase by using the method described in one of our previous papers (Murata and Nagao 1993). The method estimates a referential property using surface expressions in the sentences. For example, since the second &quot;OJIISAN (old man)&quot; in the following sentences is accompanied by a particle &quot;WA (topic)&quot; and the predicate is in the past tense, it is estimated to be a definite noun phrase.</Paragraph> <Paragraph position="1"> OJIISAN-WA JIMEN-NI KOSHI-WO-OROSHITA.</Paragraph> <Paragraph position="2"> (old man) (ground) (sit down) (The old man sat down on the ground.)</Paragraph> </Section> </Section> <Section position="9" start_page="912" end_page="913" type="metho"> <SectionTitle> YAGATE OJIISAN-WA NEMUTTE-SHIMAIMATTA. </SectionTitle> <Paragraph position="0"> (soon) (old man) (fall asleep) (He soon fell asleep.) (6) Next, our system determines the referent of a noun phrase by using its estimated referential property. When a noun phrase is estimated to be a definite noun phrase, our system judges that the noun phrase refers to the entity denoted by a previous noun phrase which has the same head noun. For example, the second &quot;OJIISAN&quot; in the above sentences is estimated to be a definite noun phrase, and our system judges that it refers to the entity denoted by the first &quot;OJIISAN&quot;.</Paragraph> <Paragraph position="1"> When a noun phrase is not estimated to be a definite noun phrase, it usually does not refer to the entity denoted by a noun phrase that has already been mentioned. 
Our method, however, might estimate the referential property incorrectly, so such a noun phrase might still refer to the entity denoted by a noun phrase that has already been mentioned. Therefore, when a noun phrase is not estimated to be a definite noun phrase, our system gets a possible referent of the noun phrase and determines whether or not the noun phrase refers to it by using the following three kinds of information.</Paragraph> <Paragraph position="2"> (1) The plausibility (P) that the referential property is a definite noun phrase: When our system estimates a referential property, it outputs the score of each category (Murata and Nagao 1993). The value of the plausibility (P) is given by the score.</Paragraph> <Paragraph position="3"> (2) The weight (W) of the salience of a possible referent: The weight (W) of the salience is given by particles such as &quot;WA (topic)&quot; and &quot;GA (subject)&quot;. An entity denoted by a noun phrase with high salience is more likely to be referred to by a later noun phrase.</Paragraph> <Paragraph position="4"> (3) The distance (D) between the estimated noun phrase and a possible referent: The distance (D) is the number of noun phrases between the estimated noun phrase and a possible referent.</Paragraph> <Paragraph position="5"> When the value given by these three kinds of information is higher than a given threshold, our system judges that the noun phrase refers to the possible referent. Otherwise, it judges that the noun phrase does not refer to the possible referent and is an indefinite noun phrase or a generic noun phrase.</Paragraph> <Section position="1" start_page="913" end_page="913" type="sub_section"> <SectionTitle> 3.2 Modifier Constraint </SectionTitle> <Paragraph position="0"> The referential property alone is insufficient to determine the referents of noun phrases.</Paragraph> <Paragraph position="1"> When two noun phrases have different modifiers, they usually do not have the same referent. 
For example, &quot;MIGI(right)-NO HOO(cheek)&quot; and &quot;HIDARI(left)-NO HOO(cheek)&quot; in the following sentences do not have the same referent.</Paragraph> </Section> </Section> <Section position="10" start_page="913" end_page="913" type="metho"> <SectionTitle> KONO OJIISAN-NO KOBU-WA MIGI-NO HOO-NI ATTA. </SectionTitle> <Paragraph position="0"> (this) (old man) (lump) (right) (cheek) (be on) (This old man's lump was on his right cheek.)</Paragraph> </Section> <Section position="11" start_page="913" end_page="914" type="metho"> <SectionTitle> TENGU-WA, KOBU-WO HIDARI-NO HOO-NI TSUKETA. </SectionTitle> <Paragraph position="0"> (tengu) (lump) (left) (cheek) (put on) (The &quot;tengu&quot; put a lump on his left cheek.) (7) (Footnote 2: A tengu is a kind of monster.) Therefore, we made the following constraint:</Paragraph> <Paragraph position="1"> A noun phrase that has a modifier cannot refer to the entity denoted by a noun phrase that does not have the same modifier. A noun phrase that does not have a modifier can refer to the entity denoted by a noun phrase that has any modifier.</Paragraph> <Paragraph position="2"> The constraint is incomplete, and is not truly applicable to all cases. There are some exceptions where a noun can refer to the entity of a noun that has a different modifier. But we use the constraint because we can get a higher precision than if we did not use it.</Paragraph> <Section position="1" start_page="913" end_page="914" type="sub_section"> <SectionTitle> 3.3 Possessor Constraint </SectionTitle> <Paragraph position="0"> When a noun phrase has a semantic marker PAR (a part of a body), our system tries to estimate the possessor of the entity denoted by the noun phrase.</Paragraph> <Paragraph position="1"> We suppose that the possessor of a noun phrase is the subject or the noun phrase's nearest topic that has a semantic marker HUM (human) or a semantic marker ANI (animal). 
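The possessor estimate just described can be illustrated with a minimal Python sketch. It is not the authors' implementation: the NounPhrase record, its position field, and the reading of "nearest" as the smallest difference in phrase positions are assumptions made only for illustration, and topics and subjects are treated alike.

```python
from dataclasses import dataclass

# Semantic markers that qualify a noun phrase as a possessor (human, animal).
POSSESSOR_MARKERS = frozenset({"HUM", "ANI"})


@dataclass(frozen=True)
class NounPhrase:
    """Hypothetical record for one noun phrase in the analyzed text."""
    head: str                        # head noun, e.g. "HOO"
    position: int                    # index of the phrase in the text (assumed)
    markers: frozenset = frozenset() # semantic markers, e.g. {"PAR"}
    is_topic: bool = False
    is_subject: bool = False


def estimate_possessor(target, context):
    """Return the assumed possessor of a body-part noun phrase, or None."""
    # Only noun phrases marked PAR (a part of a body) get a possessor.
    if "PAR" not in target.markers:
        return None
    candidates = [
        cand for cand in context
        if cand is not target
        and (cand.is_topic or cand.is_subject)
        and cand.markers.intersection(POSSESSOR_MARKERS)
    ]
    if not candidates:
        return None
    # Assumption: "nearest" means the smallest difference in phrase positions.
    return min(candidates, key=lambda cand: abs(cand.position - target.position))


ojiisan = NounPhrase("OJIISAN", 0, frozenset({"HUM"}), is_topic=True)
kobu = NounPhrase("KOBU", 1)
hoo = NounPhrase("HOO", 2, frozenset({"PAR"}))
possessor = estimate_possessor(hoo, [ojiisan, kobu, hoo])
print(possessor.head)  # OJIISAN
```

In this hypothetical input the topic OJIISAN, which carries the marker HUM, is chosen as the possessor of HOO, while KOBU is skipped because it is neither a topic nor a subject.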
For example, we examine two instances of &quot;HOO (cheek)&quot; in the following sentences, which have a semantic marker PAR, (He looked as if he had puffed out his cheek.) The possessor of the first &quot;HOO (cheek)&quot; is determined to be &quot;OJIISAN (old man)&quot; because &quot;OJIISAN (old man)&quot;, which has a semantic marker HUM (human), is followed by a particle &quot;NIWA (topic)&quot; and is the topic of the sentence. The possessor of the second &quot;HOO (cheek)&quot; is also determined to be &quot;OJIISAN (old man)&quot; because &quot;OJIISAN (old man)&quot; is the subject of the sentence.</Paragraph> <Paragraph position="2"> We made the following constraint, which is similar to the modifier constraint, by using possessors. When the possessor of a noun phrase is estimated, the noun phrase cannot refer to the entity denoted by a noun phrase that does not have the same possessor. When the possessor of a noun phrase is not estimated, the noun phrase can refer to the entity denoted by a noun phrase that has any possessor. For example, since the two instances of &quot;HOO (cheek)&quot; in the above sentences have the same possessor &quot;OJIISAN (old man)&quot;, our system correctly judges that they have the same referent.</Paragraph> </Section> </Section> <Section position="12" start_page="914" end_page="914" type="metho"> <SectionTitle> 4 Anaphora Resolution System </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="914" end_page="914" type="sub_section"> <SectionTitle> 4.1 Procedure </SectionTitle> <Paragraph position="0"> Before referents are determined, sentences are transformed into a case structure by the case structure analyzer (Kurohashi and Nagao 1994).</Paragraph> <Paragraph position="1"> Referents of noun phrases are determined by using heuristic rules which are made from information such as the three constraints mentioned in Section 3.</Paragraph> <Paragraph position="2"> Using these rules, our system takes possible referents and gives them points. 
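The way points from several rules combine can be pictured with a minimal Python sketch. It is only an illustration under assumptions, not the system itself: the function name, the candidate labels, the point values, and the tie-breaking behaviour (the first candidate seen wins) are all hypothetical.

```python
from collections import defaultdict


def pick_referent(proposals):
    """Sum the points given to each possible referent and return the best one.

    proposals: iterable of (possible_referent, point) pairs, one pair per
    proposal made by a rule. Returns None when no rule made a proposal.
    """
    totals = defaultdict(float)
    for referent, point in proposals:
        totals[referent] += point
    if not totals:
        return None
    # Ties go to the candidate that was proposed first (an assumption).
    return max(totals, key=totals.get)


# Hypothetical proposals: two rules support candidate_a, one supports candidate_b.
proposals = [("candidate_a", 30), ("candidate_b", 10), ("candidate_a", 5)]
print(pick_referent(proposals))  # candidate_a
```

Summing per candidate before comparing lets a rule with a large point value take priority over several weaker rules, which is the effect the point mechanism is meant to provide.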
The system judges that the candidate having the maximum total score is the referent. This is because a number of types of information are combined in anaphora resolution. We can specify which rule takes priority by using points.</Paragraph> <Paragraph position="3"> The heuristic rules are given in the following form.</Paragraph> <Paragraph position="5"> Here, Condition consists of surface expressions, semantic constraints and referential properties. In Possible-Referent, a possible referent, &quot;Indefinite&quot;, &quot;Generic&quot;, or other things are written. &quot;Indefinite&quot; means that the noun phrase is an indefinite noun phrase, and it does not refer to the entity denoted by a previous noun phrase. Point means the plausibility value of the possible referent.</Paragraph> </Section> <Section position="2" start_page="914" end_page="914" type="sub_section"> <SectionTitle> 4.2 Heuristic Rule for Estimating Referents </SectionTitle> <Paragraph position="0"> We made 8 heuristic rules for the resolution of noun phrase anaphora. Some of them are given below.</Paragraph> <Paragraph position="1"> definite noun phrase, { (A noun phrase X which satisfies the modifier and possessor constraints, P + W - D + 4)} The values P, W, D are as defined in Section 3.1.</Paragraph> </Section> </Section> <Section position="13" start_page="914" end_page="915" type="metho"> <SectionTitle> 5 Experiment and Discussion </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="914" end_page="914" type="sub_section"> <SectionTitle> 5.1 Experiment </SectionTitle> <Paragraph position="0"> Before determining the referents of noun phrases, sentences were first transformed into a case structure by the case structure analyzer (Kurohashi and Nagao 1994). The errors made by the case analyzer were corrected by hand. 
Table 1 shows the results of determining the referents of noun phrases.</Paragraph> <Paragraph position="1"> To confirm that the three constraints (referential property, modifier, and possessor) are effective, we experimented under several different conditions and compared them. The results are shown in Table 2.</Paragraph> <Paragraph position="2"> Precision is the fraction of the noun phrases judged to have antecedents whose referents were determined correctly. Recall is the fraction of the noun phrases that actually have antecedents whose referents were determined correctly.</Paragraph> <Paragraph position="3"> In these experiments we used training sentences and test sentences. The training sentences were used to make the heuristic rules in Section 4.2 by hand.</Paragraph> <Paragraph position="4"> The test sentences were used to confirm the effectiveness of these rules.</Paragraph> <Paragraph position="5"> In Table 2, Method 1 is the method mentioned in Section 3 which uses all three constraints. Method 2 is the case in which a noun phrase can refer to the entity denoted by a previous noun phrase only when its estimated referential property is a definite noun phrase; the modifier and possessor constraints are also used. Method 3 does not use a referential property. It only uses information such as distance, topic-focus, modifier, and possessor. Method 4 does not use the modifier and possessor constraints.</Paragraph> <Paragraph position="6"> Several observations follow from Table 2. In Method 1, both the recall and the precision were relatively high in comparison with the other methods. This indicates that the referential property was used properly in the method that is described in this paper. Method 1 was higher than Method 3 in both recall and precision. This indicates that the information of referential property is necessary. In Method 2, the recall was low because there were many noun phrases that were definite but were estimated to be indefinite or generic, and the system therefore judged that these noun phrases could not refer to previous noun phrases. 
In Method 4, the precision was low. Because the modifier and possessor constraints were not used, the many pairs of noun phrases that did not corefer, such as &quot;HIDARI(left)-NO HOO(cheek)&quot; and &quot;MIGI(right)-NO HOO(cheek)&quot;, were incorrectly interpreted as coreferent. This indicates that it is necessary to use the modifier and possessor constraints.</Paragraph> </Section> <Section position="2" start_page="914" end_page="915" type="sub_section"> <SectionTitle> 5.2 Examples of Errors </SectionTitle> <Paragraph position="0"> We found that it was necessary to use modifiers and possessors in the experiments. But there are some cases in which the referent was determined incorrectly because the possessor of a noun was estimated incorrectly. Sometimes a noun can refer to the entity denoted by a noun that has a different modifier. In such cases, the system made an incorrect judgment.</Paragraph> </Section> </Section> <Section position="14" start_page="915" end_page="915" type="metho"> <SectionTitle> OJIISAN-WA CHIKAKU-NO OOKINA SUGI-NO </SectionTitle> <Paragraph position="0"> (old man) (near) (huge) (cedar)</Paragraph> </Section> <Section position="15" start_page="915" end_page="915" type="metho"> <SectionTitle> KI-NO NEMOTO-NI ARU ANA-DE </SectionTitle> <Paragraph position="0"> (tree) (base) (be at) (hole) AMAYADORI-WO SURU-KOTO-NI-SHITA.</Paragraph> <Paragraph position="1"> (take shelter from the rain) (decide to do) (So, he decided to take shelter from the rain in a hole which is at the base of a huge cedar tree nearby.) 
(an omission of the middle part)</Paragraph> </Section> <Section position="16" start_page="915" end_page="915" type="metho"> <SectionTitle> TSUGI-NOHI, KONO OJIISAN-WA YAMA-HE ITTE, </SectionTitle> <Paragraph position="0"> (next day) (this) (old man) (mountain) (go to) (The next day, this old man went to the mountain,)</Paragraph> </Section> <Section position="17" start_page="915" end_page="915" type="metho"> <SectionTitle> SUGI-NO KI-NO NEMOTO-NO ANA-WO MITSUKETA. </SectionTitle> <Paragraph position="0"> (cedar) (tree) (at base) (hole) (found) (and found the hole at the base of the cedar tree.) The two instances of &quot;ANA (hole)&quot; in these sentences refer to the same entity. But our system judged that they do not have the same referent because the modifiers of the two instances of &quot;ANA (hole)&quot; are different. In order to correctly analyze this case, it is necessary to decide whether the two different expressions are equal in meaning.</Paragraph> </Section> </Paper>