<?xml version="1.0" standalone="yes"?>
<Paper uid="W99-0205">
  <Title>Resolution of Indirect Anaphora in Japanese Sentences Using Examples &quot;X no Y (Y of X)&quot;</Title>
  <Section position="4" start_page="33" end_page="34" type="metho">
    <SectionTitle>
3 Anaphora Resolution System
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="33" end_page="33" type="sub_section">
      <SectionTitle>
3.1 Procedure
</SectionTitle>
      <Paragraph position="0"> Before starting the anaphora resolution process, the syntactic structure analyzer transforms sentences into dependency structures (Kurohashi and Nagao, 1994). Antecedents are determined by heuristic rules for each noun from left to right in the sentences.</Paragraph>
      <Paragraph position="1"> Using these rules, our system gives points to possible antecedents, and it takes the possible antecedent with the maximum total score as the desired antecedent. Points are summed in this way because several types of information are combined in anaphora resolution.</Paragraph>
      <Paragraph position="2"> An increase in the points of a possible antecedent corresponds to an increase in its plausibility.</Paragraph>
      <Paragraph position="3"> The heuristic rules are given in the following form: Condition ⇒ { (Possible-Antecedent, Point), (Possible-Antecedent, Point), ... }</Paragraph>
      <Paragraph position="5"> Surface expressions, semantic constraints, referential properties, for example, are written as conditions in the Condition part. A possible antecedent is written in the Possible-Antecedent part. Point refers to the plausibility of the possible antecedent.</Paragraph>
      <Paragraph position="6"> To implement the method mentioned in Section 2, we use the weights W of topics and foci, the distance D, the definiteness P, and the semantic similarity S (in R4 of Section 3.2) to determine points. The weights W of topics and foci are given in Table 3 and Table 4, respectively, in Section 2, and represent how strongly a candidate is preferred as the desired antecedent. In this work, a topic is defined as a theme which is described, and a focus is defined as a word which is stressed by the speaker (or the writer). Because we cannot detect topics and foci exactly, we approximate them as shown in Table 3 and Table 4. The distance D is the number of topics (foci) between the anaphor and a possible antecedent which is a topic (focus).</Paragraph>
      <Paragraph position="7"> The value P is given by the score of the definiteness in referential property analysis (Murata and Nagao, 1993). This is because a definite noun phrase is more likely to have an antecedent than an indefinite one. The value S is the semantic similarity between a possible antecedent and Noun X of &quot;Noun X no Noun Y.&quot; Semantic similarity is given by the level in the thesaurus Bunrui Goi Hyou (NLRI, 1964).</Paragraph>
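      <Paragraph> The scoring procedure described in this section can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the candidate names, and all numeric values are assumptions made for the example, and the expression W - D + P + S is the pattern that appears in the rules of Section 3.2.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Proposal = Tuple[str, float]  # (possible antecedent, points)


def topic_focus_points(w: float, d: int, p: float, s: float) -> float:
    """Points for a topic or focus candidate, following the W - D + P + S pattern."""
    return w - d + p + s


def resolve(proposals: Iterable[Proposal]) -> Tuple[str, Dict[str, float]]:
    """Sum the points each candidate received and return the top-scoring one."""
    totals: Dict[str, float] = defaultdict(float)
    for candidate, points in proposals:
        totals[candidate] += points
    best = max(totals, key=lambda c: totals[c])
    return best, dict(totals)


# Illustrative numbers only; they are not taken from Table 3 or Table 4.
proposals: List[Proposal] = [
    ("candidate_a", topic_focus_points(w=20, d=1, p=5, s=8)),
    ("candidate_b", topic_focus_points(w=15, d=0, p=5, s=2)),
    ("candidate_a", 3.0),  # a second rule also supports candidate_a
]
best, totals = resolve(proposals)
```

Summing the points per candidate before taking the maximum is what lets several rules, each contributing a different type of information, support the same antecedent.</Paragraph>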
    </Section>
    <Section position="2" start_page="33" end_page="34" type="sub_section">
      <SectionTitle>
3.2 Heuristics for determining antecedents
</SectionTitle>
      <Paragraph position="0"> We wrote 15 heuristic rules for noun phrase anaphora resolution. Some of the rules are given below: R1 When the referential property of a noun phrase (an anaphor) is definite, and the same noun phrase A has already appeared, ⇒ { (the noun phrase A, 30)} The referential property is estimated by the method of (Murata and Nagao, 1993). This is a rule for direct anaphora.</Paragraph>
      <Paragraph position="2"> clause of the clause, 23 + P + S)} where the values W, D, P, and S are as defined in Section 3.1.</Paragraph>
      <Paragraph position="3"> R5 When a noun phrase is a verbal noun, ⇒ { (A topic which satisfies the semantic constraint in a verb case frame and has the weight W and the distance D, W - D + P + S), (A focus which satisfies the semantic constraint and has the weight W and the distance D, W - D + P + S), (A subject in a subordinate clause or a main clause of the clause, 23 + P + S)} R6 When a noun phrase is a noun such as ichibu, tonari, and it modifies a noun X, ⇒ { (the same noun as the noun X, 30)}</Paragraph>
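      <Paragraph> One way to picture the rule form used above is as a condition paired with a list of (possible antecedent, point) proposals. The sketch below encodes only R1 under that reading. The class names, the field names, and the sample nouns are hypothetical, and the referential property is passed in as a plain string rather than estimated.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Proposal = Tuple[str, int]  # (possible antecedent, points)


@dataclass
class Anaphor:
    surface: str
    referential_property: str  # e.g. "definite" or "indefinite"


@dataclass
class Rule:
    name: str
    condition: Callable[[Anaphor, List[str]], bool]
    proposals: Callable[[Anaphor, List[str]], List[Proposal]]


def apply_rules(rules: List[Rule], anaphor: Anaphor, preceding: List[str]) -> List[Proposal]:
    """Collect the proposals of every rule whose condition holds."""
    collected: List[Proposal] = []
    for rule in rules:
        if rule.condition(anaphor, preceding):
            collected.extend(rule.proposals(anaphor, preceding))
    return collected


# R1: a definite noun phrase whose surface form has already appeared.
r1 = Rule(
    name="R1",
    condition=lambda a, prev: a.referential_property == "definite" and a.surface in prev,
    proposals=lambda a, prev: [(a.surface, 30)],
)

# Hypothetical input: the nouns are placeholders, not data from the paper.
found = apply_rules([r1], Anaphor("ojiisan", "definite"), ["ojiisan", "yama"])
skipped = apply_rules([r1], Anaphor("ojiisan", "indefinite"), ["ojiisan", "yama"])
```

Here `found` holds the single proposal of R1, and `skipped` is empty because the condition on definiteness fails.</Paragraph>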
    </Section>
    <Section position="3" start_page="34" end_page="34" type="sub_section">
      <SectionTitle>
3.3 Example of analysis
</SectionTitle>
      <Paragraph position="0"> An example of the resolution of an indirect anaphora is shown in Figure 1. Figure 1 shows that the noun koutei buai (official rate) is analyzed correctly. This is explained as follows: The system estimated the referential property of koutei buai (official rate) to be indefinite using the method of (Murata and Nagao, 1993). Following rule R3 (Section 3.2), the system took a candidate &quot;Indefinite,&quot; which means that the candidate is an indefinite noun phrase that does not have an indirect anaphoric referent. Following R4 (Section 3.2), the system took four possible antecedents: nisidoku (West Germany), jikokutuuka (own currency), kyoutyou (cooperation), and dorudaka (dollar's surge). The possible antecedents were given points based on the weight of topics and foci, the distance from the anaphor, and so on. The system properly judged that nisidoku (West Germany), which had the best score, was the desired antecedent.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="34" end_page="36" type="metho">
    <SectionTitle>
4 Experiment and Discussion
</SectionTitle>
    <Paragraph position="0"> Before the antecedents in indirect anaphora were determined, sentences were transformed into a case structure by the case analyzer (Kurohashi and Nagao, 1994). The errors made by the analyzer were corrected by hand. We used the IPAL dictionary (IPAL, 1987) as a verb case frame dictionary. We used the Japanese Co-occurrence Dictionary (EDR, 1995) as a source of examples for &quot;X no Y.&quot; We show the result of anaphora resolution using both &quot;X no Y&quot; and a verb case frame dictionary in Table 6. We obtained a recall rate of 63% and a precision rate of 68% when we estimated indirect anaphora in test sentences. This indicates that the information of &quot;X no Y&quot; is useful to a certain extent even though we cannot make use of a noun case frame dictionary. We also tested the system without any semantic information, by fixing all the semantic similarity values S to 0. The precision and the recall were lower, which indicates that semantic information is necessary.</Paragraph>
    <Paragraph position="1"> In Table 6, the upper row and the lower row show the rates on training sentences and test sentences, respectively. The training sentences were used to set the values given in the rules (Section 3.2) by hand. Training sentences: {example sentences (Walker et al., 1994) (43 sentences), a folk tale Kobutori jiisan (Nakao, 1985) (93 sentences), an essay in Tenseijingo (26 sentences), an editorial (26 sentences)}. Test sentences: {a folk tale Tsuru no ongaeshi (Nakao, 1985) (91 sentences), two essays in Tenseijingo (50 sentences), an editorial (30 sentences)}. Precision is the fraction of correct answers among the noun phrases which the system judged to have antecedents of indirect anaphora. Recall is the fraction of correct answers among the noun phrases which actually have antecedents of indirect anaphora. We use both precision and recall because the system sometimes judges that a noun which is not an antecedent of indirect anaphora is one, and we want to check these errors thoroughly.</Paragraph>
    <Paragraph position="2"> We also estimated the results for the hypothetical use of a noun case frame dictionary. We estimated these results in the following manner: We looked over the errors that had occurred when we used &quot;X no Y&quot; and a verb case frame dictionary. We regarded errors made for one of the following three reasons as right answers: 1. Proper examples do not exist in examples of &quot;X no Y&quot; or in the verb case frame dictionary.</Paragraph>
    <Paragraph position="3"> 2. Wrong examples exist in examples of &quot;X no Y&quot; or in the verb case frame dictionary.</Paragraph>
    <Paragraph position="4"> 3. A noun case frame is different from a verb case frame.</Paragraph>
    <Paragraph position="5"> If we were to make a noun case frame dictionary, it would have some errors, and the success ratio would be lower than the ratio shown in Table 6.</Paragraph>
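    <Paragraph> The precision and recall rates reported in this section can be computed from three counts, as in the sketch below. The function name, the parameter names, and the counts are assumptions made for illustration. The counts are made up and are not the ones behind Table 6.

```python
from typing import Tuple


def precision_recall(correct: int, judged: int, actual: int) -> Tuple[float, float]:
    """Return (precision, recall) from three counts.

    correct: noun phrases whose antecedent the system found correctly
    judged:  noun phrases the system judged to have an antecedent
    actual:  noun phrases that really have an antecedent
    """
    precision = correct / judged if judged else 0.0
    recall = correct / actual if actual else 0.0
    return precision, recall


# Made-up counts for illustration only.
precision, recall = precision_recall(correct=6, judged=8, actual=10)
```

Reporting both values separates two kinds of error: proposing an antecedent for a noun that has none lowers precision, while missing an antecedent that exists lowers recall.</Paragraph>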
    <Paragraph position="6"> Even if we had a noun case frame dictionary, there are certain pairs of nouns in an indirect anaphoric relationship that could not be resolved using our framework. kon'na hidoi hubuki-no naka-wo ittai dare-ga kita-noka-to ibukarinagara, obaasan-wa iimashita.</Paragraph>
    <Paragraph position="7"> (Wondering who could have come in such a heavy snowstorm, the old woman said:) &quot;donata-jana&quot; (&quot;Who is it?&quot;) to-wo aketemiruto, soko-niwa zenshin yuki-de masshironi natta musume-ga tatte orimashita.</Paragraph>
    <Paragraph position="8"> (She opened the door, and there stood before her a girl all covered with snow.) (5) The noun musume has two main meanings: a daughter or a girl. In the above example, musume means &quot;girl&quot; and has no indirect anaphoric relation, but the system incorrectly judged it to be the daughter of obaasan (the old woman). This is a problem of noun role ambiguity and is very difficult to solve.</Paragraph>
    <Paragraph position="9"> The following example also presents a difficult problem: shushou-wa (prime minister) teikou-no (resistance) tsuyoi (very hard) senkyoku-no (electoral district) kaishou-wo (modification) miokutta (give up).</Paragraph>
    <Paragraph position="10"> (The prime minister gave up the modification of some electoral districts where the resistance was very hard.) (6) On the surface, teikou (resistance) appears to refer indirectly to senkyoku (electoral district). But actually teikou (resistance) refers to the candidates of senkyoku (electoral district), not to senkyoku (electoral district) itself. To arrive at this conclusion, in other words, to connect senkyoku (electoral district) and teikou (resistance), it is necessary to use a two-step relation, &quot;an electoral district ⇒ candidates,&quot; &quot;candidates ⇒ resist,&quot; in sequence. It is not easy, however, to change our system so that it can deal with two-step relationships. If we apply two-step relationships to nouns, many nouns which are not in an indirect anaphoric relation will be incorrectly judged as indirect anaphora.</Paragraph>
    <Paragraph position="11"> A new method is required in order to infer two relationships in sequence.</Paragraph>
    <Paragraph position="12"> Table 7 (extracted fragment; the Noun Y label of each row is missing): vanced country), ryoukoku (the two countries), naichi (inland), zenkoku (the whole country), nihon (Japan), soren (the Soviet Union), eikoku (England), amerika (America), suisu (Switzerland), denmaaku (Denmark), sekai (the world) &lt;Human&gt; raihin (visitor) &lt;Organization&gt; gaikoku (a foreign country), kakkoku (each country), poorando (Poland) &lt;Organization&gt; hokkaido (Hokkaido), sekai (the world), gakkou (school), koujou (factory), gasorinsutando (gas station), suupaa (supermarket), jitaku (one's home), honbu (the head office) &lt;Product&gt; kuruma (car), juutaku (housing), ie (house), shinden (temple), genkan (entrance), shinsha (new car) &lt;Phenomenon&gt; midori (green) &lt;Action&gt; kawarabuki (tile-roofed) &lt;Mental&gt; houshiki (method) &lt;Character&gt; keishiki (form) &lt;Animal&gt; zou (elephant) &lt;Nature&gt; fujisan (Mt. Fuji) &lt;Product&gt; imono (an article of cast metal), manshon (an apartment house), kapuseru (capsule), densha (train), hune (ship), gunkan (warship), hikouki (airplane), jettoki (jet plane) &lt;Action&gt; zousen (shipbuilding) &lt;Mental&gt; puran (plan) &lt;Character&gt; unkou (movement) &lt;Human&gt; koushitsu (the Imperial Household), oushitsu (a Royal family), iemoto (the head of a school) &lt;Organization&gt; nouson (an agricultural village), ken (prefecture), nihon (Japan), soren (the Soviet Union), tera (temple), gakkou (school) &lt;Action&gt; shuunin (take up one's post), matsuri (festival), iwai (celebration), junrei (pilgrimage) &lt;Mental&gt; kourei (an established custom), koushiki (formal) &lt;Human&gt; watashi (myself), ningen (human), seishounen (young people), seijika (statesman)</Paragraph>
  </Section>
  <Section position="6" start_page="36" end_page="36" type="metho">
    <SectionTitle>
5 Consideration of Construction of
Noun Case Frame Dictionary
</SectionTitle>
    <Paragraph position="0"> We used &quot;X no Y&quot; (Y of X) to resolve indirect anaphora. But we would achieve a higher accuracy rate if we could utilize a good noun case frame dictionary. Therefore we have to consider how to construct a noun case frame dictionary. A key is to get the detailed meaning of &quot;no (of)&quot; in &quot;X no Y.&quot; If it were automatically obtainable, a noun case frame dictionary could be constructed automatically. Even if the semantic analysis of &quot;X no Y&quot; is not done well, we think that it is still possible to construct the dictionary using &quot;X no Y.&quot; For example, we arrange &quot;noun X no noun Y&quot; by the meaning of &quot;noun Y,&quot; arrange them by the meaning of &quot;noun X,&quot; delete those where &quot;noun X&quot; is an adjective noun, and obtain the results shown in Table 7. In this case, we use the thesaurus dictionary &quot;Bunrui Goi Hyou&quot; (NLRI, 1964) to learn the meanings of nouns. It should not be difficult to construct a noun case frame dictionary by hand using Table 7. We will make a noun case frame dictionary by removing aite (partner) in the line of kokumin (nation), raihin (visitor) in the line of genshu (the head of state), and noun phrases which mean characters and features. When we look over the noun phrases for kokumin (nation), we notice that almost all of them refer to countries. So we will also make the semantic constraint (or the semantic preference) that countries can be connected to kokumin (nation). When we make a noun case frame dictionary, we must remember that examples of &quot;X no Y&quot; are insufficient and we must add examples. For example, in the line of genshu (the head of state) there are few nouns that mean countries. In this case, it is good to add examples from the arranged nouns for kokumin (nation), which is similar to genshu (the head of state). Since examples are arranged by meaning in this method, it will not be very difficult to add examples.</Paragraph>
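    <Paragraph> The arrangement step described above (group examples of "X no Y" by noun Y, then by the meaning of noun X, then remove unsuitable entries by hand) might be sketched as follows. The function names are hypothetical, the small class map stands in for a lookup in Bunrui Goi Hyou, and the pairs and class labels are illustrative assumptions rather than the contents of Table 7.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (noun X, noun Y) taken from an example "X no Y"
Table = Dict[str, Dict[str, List[str]]]  # noun Y, then semantic class, then nouns X


def arrange(pairs: Iterable[Pair], semantic_class: Dict[str, str]) -> Table:
    """Group noun X first by noun Y, then by the semantic class of noun X."""
    table = defaultdict(lambda: defaultdict(list))
    for x, y in pairs:
        table[y][semantic_class.get(x, "Unknown")].append(x)
    return {y: {c: sorted(xs) for c, xs in classes.items()} for y, classes in table.items()}


def prune(table: Table, removals: Set[Pair]) -> Table:
    """Drop hand-picked (noun X, noun Y) entries and any class left empty."""
    pruned: Table = {}
    for y, classes in table.items():
        kept = {c: [x for x in xs if (x, y) not in removals] for c, xs in classes.items()}
        pruned[y] = {c: xs for c, xs in kept.items() if xs}
    return pruned


# Illustrative pairs and an assumed class map (a stand-in for the thesaurus).
pairs = [
    ("nihon", "kokumin"), ("soren", "kokumin"), ("aite", "kokumin"),
    ("poorando", "genshu"), ("raihin", "genshu"),
]
semantic_class = {
    "nihon": "Organization", "soren": "Organization", "aite": "Human",
    "poorando": "Organization", "raihin": "Human",
}
table = arrange(pairs, semantic_class)
frames = prune(table, {("aite", "kokumin"), ("raihin", "genshu")})
```

Because the entries for each noun Y are already grouped by meaning, the classes that remain after pruning can be read as a candidate semantic constraint for that noun.</Paragraph>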
  </Section>
</Paper>