<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2180">
  <Title>CUSTOMIZING AND EVALUATING A MULTILINGUAL DISCOURSE MODULE</Title>
  <Section position="5" start_page="1110" end_page="1110" type="evalu">
    <SectionTitle>
4 EVALUATION RESULTS
</SectionTitle>
    <Paragraph position="0"> In this section, we will report our evaluation results.</Paragraph>
    <Paragraph position="1"> We ran 100 Japanese and 100 English blind test joint venture articles through our text understanding system with and without the discourse module turned on, and scored the results using an automatic scoring program. The scoring program uses a scoring metric from information retrieval, and reports recall and precision for each slot in the templates as well as a single combined score called F-measure 4 for overall performance (cf. [14]).</Paragraph>
    <Paragraph position="2"> 3. A definite plural NP can be expressed in Japanese by a numeral or numerical quantifier plus a classifier, as in &amp;quot;ryousha&amp;quot; (the two companies) and &amp;quot;san-sha&amp;quot; (the three companies).</Paragraph>
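    <Paragraph><![CDATA[ As a rough illustration of the scoring described above, the following Python sketch computes recall, precision, and the combined F-measure for a single template slot. It is a simplified sketch, not the scoring program itself: the function name, the argument names, the zero-division handling, and the counts in the usage line are all assumptions made for illustration, and the real scorer may treat cases such as partial matches differently. The formula is the standard information-retrieval F-measure, with beta expressing the relative importance of recall over precision.

```python
def slot_scores(correct, possible, actual, beta=1.0):
    """Return (recall, precision, f_measure) for one template slot.

    correct:  number of slot fills the system got right
    possible: number of fills in the answer key
    actual:   number of fills the system produced
    beta:     relative importance given to recall over precision
    """
    # Guard against empty keys or empty system output (an assumption of this sketch).
    recall = correct / possible if possible else 0.0
    precision = correct / actual if actual else 0.0
    denom = beta ** 2 * precision + recall
    f_measure = (beta ** 2 + 1.0) * precision * recall / denom if denom else 0.0
    return recall, precision, f_measure


# Hypothetical counts, chosen only to exercise the function.
recall, precision, f = slot_scores(correct=50, possible=100, actual=50)
print(recall, precision, f)
```

With beta set to 1.0, as in this evaluation, the F-measure reduces to the harmonic mean of precision and recall. ]]></Paragraph>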
    <Paragraph position="4"> It should be noted that this evaluation is a blackbox evaluation of the system as used in a particular application task. Consequently, the results do not directly reflect the performance of the discourse module itself. For example, this task does not require all company name anaphora (i.e. aliases) to be reported, but only those which are involved in joint ventures. Also, the causes of task failure or success are sometimes due to the failure or success of system modules other than the discourse module. For instance, the preprocessing system does not always recognize company names which are potential antecedents. On the other hand, the preprocessing module rather than the discourse module sometimes recognizes company acronyms as aliases.</Paragraph>
    <Paragraph position="5"> Thus, the results of the blackbox evaluation reflect more on how the discourse module helps the whole system perform a particular task.</Paragraph>
    <Section position="1" start_page="1110" end_page="1110" type="sub_section">
      <SectionTitle>
4.1 Name Anaphora
</SectionTitle>
      <Paragraph position="0"> It is clear that the performance of name anaphora resolution is directly linked to how well the system fills in the ALIASES slot in the output templates (cf. Figure 3). The 100 Japanese texts required identifying a total of 127 company name aliases. With the discourse module turned on, the recall of the ALIASES slot increased by 38 points and the precision by 16 points. Though the set of KS's used for name anaphora was mostly satisfactory, we found one problem particular to this domain in both languages. Since the texts are in the joint venture domain, it is often the case that the name of a new joint venture company (e.g. &amp;quot;Chrysler Japan&amp;quot;) overlaps the names of its parent companies (e.g. &amp;quot;Chrysler Corp.&amp;quot;).</Paragraph>
      <Paragraph position="1"> When the text uses a name anaphor (e.g. &amp;quot;Chrysler&amp;quot;), it must refer to the parent company even when the joint venture company is mentioned most recently. We are planning to add another orderer which prefers the parent company when there is such a conflict.</Paragraph>
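      <Paragraph><![CDATA[ The planned orderer can be pictured with a small Python sketch. Everything in it is hypothetical: the Company record, the prefix-based alias test, and the function names are assumptions made for illustration, not the module's actual knowledge sources. The sketch first ranks matching candidates by recency and then demotes any joint venture whose parent company is also a candidate.

```python
from dataclasses import dataclass


@dataclass
class Company:
    name: str
    last_mention: int            # index of the sentence with the most recent mention
    parent_names: tuple = ()     # non-empty only for a joint venture company


def matches(anaphor, company):
    # Crude alias test (an assumption): the anaphor is a leading substring of the name.
    return company.name.lower().startswith(anaphor.lower())


def order_by_recency(candidates):
    # Baseline orderer: most recently mentioned candidate first.
    return sorted(candidates, key=lambda c: c.last_mention, reverse=True)


def prefer_parent(candidates):
    # Demote a joint venture when one of its parents is also a candidate.
    # sorted() is stable, so the incoming recency order is kept within each group.
    names = {c.name for c in candidates}
    return sorted(candidates, key=lambda c: any(p in names for p in c.parent_names))


def resolve(anaphor, companies):
    candidates = [c for c in companies if matches(anaphor, c)]
    ranked = prefer_parent(order_by_recency(candidates))
    return ranked[0] if ranked else None


companies = [
    Company("Chrysler Corp.", last_mention=1),
    Company("Chrysler Japan", last_mention=3, parent_names=("Chrysler Corp.",)),
]
best = resolve("Chrysler", companies)
print(best.name)
```

Recency alone would pick the joint venture here, since it was mentioned last; the extra orderer returns the parent company instead. ]]></Paragraph>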
    </Section>
    <Section position="2" start_page="1110" end_page="1110" type="sub_section">
      <SectionTitle>
4.2 Definite NP
</SectionTitle>
      <Paragraph position="0"> We hypothesized that resolving definite NPs affects the extraction of information about which company is performing which &amp;quot;economic activity&amp;quot; in a joint venture (e.g. Company A will manufacture cars while Company B will market them), since such information appears later in an article after companies involved in a joint venture are already introduced into the discourse (e.g. &amp;quot;Publishing rivals Time Inc. and New York Times Co. said they agreed in principle to form a jointly owned national magazine distribution partnership... The joint venture will continue to market magazines currently marketed by Time Distribution...&amp;quot;).</Paragraph>
      <Paragraph position="2"> 4. F-measure is calculated by $F = \frac{(\beta^2 + 1) P R}{\beta^2 P + R}$, where $P$ is precision, $R$ is recall, and $\beta$ is the relative importance given to recall over precision. In this case, $\beta = 1.0$.</Paragraph>
      <Paragraph position="3"> Under the same test condition as above, the precision of the relevant slot (i.e. ACTIVITY-SITE slot in Figure 3) increased by 5 points in Japanese when discourse processing was used. The recall was not affected much by the discourse processing; it increased only by 1 point. In the English test, the changes in both precision and recall were negligible. One of the reasons for this less drastic increase of this slot value is that the sentences expressing economic activities do not always use definite NPs for the agents of such activities. Such agents can be expressed by name anaphora or pronouns or, often in English, by implicit subjects of infinitives, as in &amp;quot;Siemens AG and GTE Corp. agreed to set up a new holding company in West Germany to oversee their telecommunications joint venture...&amp;quot;. In addition, examination of the test results showed that when there is more than one antecedent hypothesis, topic marking (using particle &amp;quot;wa&amp;quot;) plays a more significant role in determining the antecedent of a Japanese &amp;quot;dou&amp;quot; definite NP than recency. At the time of the testing, however, we were not using topic marking information to prefer topicalized antecedent hypotheses. Another finding which is true of both Japanese and English is that definite NP anaphora resolution often requires pragmatic inferencing in order to obtain a fact which is not explicitly stated in the text.
For example, in order to resolve the definite NP in the sentence &amp;quot;Chevron, an oil company, also said it acquired Rhone-Poulenc's 30% interest in Petrosynthese S.A., boosting its holding in the French joint venture to 65%,&amp;quot; the discourse module has to infer either that Petrosynthese S.A. is a French company (perhaps from the company designator?) or that acquiring someone's holding in a company increases one's holding in that company. We are currently adding KS's which make use of topic information and pragmatic inferencing, and also investigating which combinations of KS's will optimize discourse performance.</Paragraph>
      <Paragraph position="4"> Furthermore, we think that very little change in recall is due to the fact that the system assumed the parent companies to be the value of ACTIVITY-SITE when it is undetermined. Thus, this default value kept the recall of the system without discourse processing higher, and therefore the ACTIVITY-SITE slot was not as good an indicator of the discourse module performance as the ALIASES slot.</Paragraph>
      <Paragraph position="5"> It is interesting to note that an approach like Dagan and Itai's [3], which uses statistical data on semantic selectional restriction that is automatically acquired from large corpora to resolve anaphora 5, does not work well in this domain. This is because a typical text in this domain contains at least two possible antecedents (joint venture partners and possibly a joint venture company) of the same semantic type, namely organization, for a definite NP anaphora referring to organizations.</Paragraph>
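      <Paragraph><![CDATA[ The observation earlier in this subsection, that topic marking outweighs recency for Japanese "dou" definite NPs, suggests an ordering like the one in the Python sketch below. It is only one possible reading of that finding: the Hypothesis record, its field names, and the example values are hypothetical, and the sketch assumes that topic marking has already been detected upstream.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    sentence_index: int   # larger means more recent
    is_topic: bool        # True if the NP was marked with the particle "wa"


def order_hypotheses(hypotheses):
    # Topicalized antecedents come first; recency only breaks ties within each group.
    return sorted(hypotheses, key=lambda h: (not h.is_topic, -h.sentence_index))


# Hypothetical antecedent hypotheses for one definite NP.
hyps = [
    Hypothesis("company A", sentence_index=1, is_topic=True),
    Hypothesis("company B", sentence_index=2, is_topic=False),
]
ranked = order_hypotheses(hyps)
print([h.text for h in ranked])
```

A pure recency orderer would rank "company B" first in this example; putting the topic flag ahead of recency in the sort key reverses that. ]]></Paragraph>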
    </Section>
    <Section position="3" start_page="1110" end_page="1110" type="sub_section">
      <SectionTitle>
4.3 Overall Performance
</SectionTitle>
      <Paragraph position="0"> Overall, discourse processing increased the system performance measured by the combination of overall recall and precision scores (i.e. F-measure) by 4 points in Japanese, mostly due to an overall increase in precision.</Paragraph>
      <Paragraph position="1"> Interestingly, the discourse processing also helped in the identification of links between organizations and people, as indicated by the PERSON slot of the &lt;ENTITY&gt; object and the PERSON'S ENTITY slot of the &lt;PERSON&gt; object (cf. Figure 3). With the discourse processing turned on, the recall of both PERSON and PERSON'S ENTITY slots increased by 7 points, and the precision by 10 points and 12 points respectively.</Paragraph>
      <Paragraph position="2"> We think that this is because when a person associated with an organization is mentioned, the company name or the person's name is often an anaphoric form as in &amp;quot;Carlos M. Herrera, president of Preferred,&amp;quot; or &amp;quot;Katzenstein, a former executive with Bomar Resources Inc.&amp;quot;. In order to understand the relation between an organization and a person as in &amp;quot;Eric S. Katzenstein, M&amp;M vice president&amp;quot; (cf. Figure 2), the system has to recognize both the affiliation link between the person and the company implicit in the appositive phrase, and the anaphoric link between the objects under different aliases.</Paragraph>
      <Paragraph position="3"> Our discourse module takes care of both identifying appositive relations (e.g. Eric S. Katzenstein is vice president) and resolving name anaphora (e.g. &amp;quot;M&amp;M&amp;quot; refers to &amp;quot;M&amp;M Ferrous America Ltd.&amp;quot;).</Paragraph>
    </Section>
  </Section>
</Paper>