<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-2183">
  <Title>Automatic Detection of Discourse Structure by Checking Surface Information in Sentences</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Discourse Structure Model
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
and Coherence Relations
</SectionTitle>
      <Paragraph position="0"> Studies of DS have been reported by a large number of researchers (e.g., Cohen 1984; Dahlgren 1988; Grosz and Sidner 1986; Hobbs 1985; Mann 1994; Polanyi and Scha 1984; Reichman 1985; Zadrozny and Jensen 1991). What has been commonly suggested is that the DS resulting from the recursive embedding and sequencing of discourse units has the form of a tree (discourse history parse tree). However, there has been a variety of definitions for discourse units, constituents of the tree, and coherence relations. In this research we have adopted the simplest model in the interest of focusing on how to detect DS automatically. In our model, each sentence is considered a discourse unit, each node of the discourse history parse tree is a sentence, and each link is a coherence relation.1 Coherence relations existing in a text, as Reichman (1985) pointed out, greatly depend on the genre of the text: narrative, argument, news article, conversation, and scientific report. Among the coherence relations suggested so far, we selected the following set of relations, which accounted for intuitions concerning our target texts, namely scientific and technical texts (Si denotes the former sentence and Sj the latter).</Paragraph>
      <Paragraph position="1"> List : Si and Sj involve the same or similar events or states, or the same or similar important constituents,</Paragraph>
      <Paragraph position="2"> like s4-3 and s4-6 in Appendix.</Paragraph>
      <Paragraph position="3"> Contrast : Si and Sj involve contrasting events or states, or contrasting important constituents.
 Topic chaining : Si and Sj have distinct predications about the same topic, like s1-13 and s1-19.</Paragraph>
      <Paragraph position="4"> Topic-dominant chaining : A dominant constituent apart from a given topic in Si becomes a topic in Sj, like s4-4 and s4-5.
 Elaboration : Sj gives details about a constituent introduced in Si, like s1-16 and s1-17.</Paragraph>
      <Paragraph position="5"> Reason : Sj is the reason for Si, like s1-13 and s1-14.</Paragraph>
      <Paragraph position="6"> Cause : Sj occurs as a result of Si, like s1-17 and s1-18.
 1 At present, we regard a sentence marked off by a period as a discourse unit. Coherence relations also exist between clauses in a sentence. We think our approach of examining surface clue information can be adapted to extract those relations, and [...] state or constituent in Si is explained in Sj.</Paragraph>
      <Paragraph position="7"> Question-answer : Sj is the answer to the question in Si, like s4-1 and s4-2.</Paragraph>
      <Paragraph position="8"> The DSs for the sample text in Appendix are shown in Figure 1.</Paragraph>
      <Paragraph position="9"> As in many previous approaches, we also make the following assumption in the DS model: a new sentence coming in can be connected only to a node on the rightmost edge of the DS tree (hereafter, we call a new sentence an NS, and a possible connected sentence on the right edge of the DS tree a CS; Figure 2). This means that, after the detailed explanations for one topic terminate and a new topic is introduced, details of the old topic are hidden in inner nodes and are no longer referred to.</Paragraph>
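      <Paragraph position="10"> The right-edge assumption above can be illustrated with a small sketch. This is not the authors' implementation; the names Node, right_edge and attach, and the sample sentence ids, are hypothetical and chosen only for illustration.
<![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One sentence in the discourse structure (DS) tree."""
    sentence_id: str
    relation: Optional[str] = None  # coherence relation to the parent node
    children: List["Node"] = field(default_factory=list)


def right_edge(root: Node) -> List[Node]:
    """Return the candidate CSs: nodes on the rightmost edge, deepest first."""
    path = [root]
    while path[-1].children:
        path.append(path[-1].children[-1])
    return list(reversed(path))


def attach(cs: Node, ns_id: str, relation: str) -> Node:
    """Connect a new sentence (NS) under the chosen CS with a relation."""
    ns = Node(ns_id, relation)
    cs.children.append(ns)
    return ns


# Hypothetical four-sentence text.
root = Node("start")
s1 = attach(root, "s1", "start")
s2 = attach(s1, "s2", "elaboration")
s3 = attach(s2, "s3", "reason")
# A new topic attaches higher up, so s2 and s3 leave the right edge.
s4 = attach(s1, "s4", "topic chaining")
candidates = [n.sentence_id for n in right_edge(root)]
```
]]>
 After s4 is attached to s1, the nodes s2 and s3 are inner nodes and are no longer offered as CSs for later sentences.</Paragraph>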
    </Section>
  </Section>
  <Section position="3" start_page="0" end_page="1125" type="metho">
    <SectionTitle>
3 Automatic Detection of Discourse Structure
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Outline
</SectionTitle>
      <Paragraph position="0"> Considering our DS model, what the DS analysis should do is clear: for each NS, it tries to find the correct CS and the correct relation between them. In order to estimate them, we have directed our attention to three types of clue information: 1) clue expressions indicating some relations, 2) occurrence of identical/synonymous words/phrases in a topic chaining or topic-dominant chaining relation, 3) similarity between two sentences in a list or contrast relation. By the method described later we can transform such information into reliable scores for some relations.</Paragraph>
      <Paragraph position="1"> As an NS comes in, for each CS we calculate reliable scores for all relations by examining the above three types of clues. As a final result, we choose the CS and the relation having the maximum reliable score (Figure 2).</Paragraph>
      <Paragraph position="2"> As an initial state, a DS has one node, the starting node. We always give a certain score to the special relation, start, between an NS and the starting node. When no other relation to any CS has a larger score for an NS, the NS is connected to the starting node by the start relation. This means that the NS is the starting sentence of a new large segment, like a paragraph or section, in the DS.</Paragraph>
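      <Paragraph position="3"> The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name choose_attachment, the node id "start-node", and the value of START_SCORE are hypothetical (the text only says that "a certain score" is given to the start relation).
<![CDATA[
```python
from typing import Dict, Tuple

START_SCORE = 10.0  # assumed value; the text only says "a certain score"


def choose_attachment(
    scores: Dict[Tuple[str, str], float],
    start_score: float = START_SCORE,
) -> Tuple[str, str, float]:
    """Pick the (CS, relation) pair with the maximum reliable score.

    `scores` maps (cs_id, relation) to the total score gathered from clues.
    The NS falls back to the starting node unless a pair scores strictly higher.
    """
    best = ("start-node", "start", start_score)
    for (cs_id, relation), score in scores.items():
        if score > best[2]:
            best = (cs_id, relation, score)
    return best


# Hypothetical scores for one NS against two candidate CSs.
picked = choose_attachment({("s3", "reason"): 25.0, ("s1", "topic chaining"): 15.0})
fallback = choose_attachment({("s3", "list"): 10.0})
```
]]>
 In the second call no relation scores strictly higher than the start relation, so the NS would open a new segment under the starting node.</Paragraph>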
    </Section>
    <Section position="2" start_page="0" end_page="1124" type="sub_section">
      <SectionTitle>
3.2 Detection of Clue Expressions
</SectionTitle>
      <Paragraph position="0"> We prepared heuristic rules for finding clue expressions by pattern matching and relating them to proper relations with reliable scores. A rule consists of the following parts:
 * condition for rule application:
   - rule applicable range (how far in the sequence of CSs the rule can be applied)
   - relation of CS to its previous DS
   - dependency structure pattern for CS
   - dependency structure pattern for NS
 * corresponding relation and reliable score.</Paragraph>
      <Paragraph position="1"> Patterns for CS and NS are matched not against word sequences but against the dependency structures of both sentences.2 We use a powerful pattern matching facility for dependency structures, in which a wild card matching any partial dependency structure, regular expressions, AND-, OR-, and NOT-operators, etc. are available (Murata and Nagao 1993). We apply each rule to the pair of a CS and an NS.
 2 Input to our system is a sequence of parsed sentences (dependency structures) produced by the parser we developed (Kurohashi and Nagao 1992a). In Japanese, the dependency structure of a sentence consists of head/modifier relations between bunsetsus, each of which is composed of a content word and suffix words.</Paragraph>
      <Paragraph position="3"> where &quot;A&quot; depends on &quot;B&quot;.</Paragraph>
      <Paragraph position="4"> &quot;*&quot; denotes a wild card.</Paragraph>
      <Paragraph position="5"> If the condition of the rule is satisfied, the specified reliable score is given to the corresponding relation between the CS and the NS.</Paragraph>
      <Paragraph position="6"> For example, Rule-1 in Table 1 gives a score to the reason relation between two adjoining sentences (note that the rule applicable range is '1') if the NS starts with the expression &quot;NAZE-NARA (because)&quot;. Rule-2 in Table 1 is applied not only to the neighboring CS but also to farther CSs, by specifying the occurrence of identical words (&quot;X&quot;) in the condition. We can also specify the relation of a CS to its previous DS as a condition, as in Rule-3 in Table 1. This rule reflects the fact that when some examples are introduced by an exemplification-present relation, detailed explanations for them often follow.</Paragraph>
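      <Paragraph position="7"> The rule parts listed in this section can be sketched as a small data structure. The sketch simplifies heavily: sentences are plain token lists rather than dependency structures, a Python predicate stands in for the CS and NS patterns, and the names Rule and apply_rules, both example rules and all score values are hypothetical (Table 1 is not reproduced here).
<![CDATA[
```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Sentence = List[str]  # simplification: tokens, not a dependency structure


@dataclass
class Rule:
    max_range: Optional[int]      # how many CSs back the rule may reach; None = no limit
    cs_relation: Optional[str]    # required relation of the CS to its previous DS, if any
    condition: Callable[[Sentence, Sentence], bool]  # stands in for the CS/NS patterns
    relation: str                 # relation supported when the rule fires
    score: float                  # reliable score given to that relation


def apply_rules(
    rules: List[Rule],
    candidates: List[Tuple[str, Sentence, str]],
    ns: Sentence,
) -> Dict[Tuple[str, str], float]:
    """candidates: (cs_id, cs_tokens, cs_relation) tuples, nearest CS first."""
    totals: Dict[Tuple[str, str], float] = {}
    for distance, (cs_id, cs_tokens, cs_rel) in enumerate(candidates, start=1):
        for rule in rules:
            if rule.max_range is not None and distance > rule.max_range:
                continue
            if rule.cs_relation is not None and rule.cs_relation != cs_rel:
                continue
            if rule.condition(cs_tokens, ns):
                key = (cs_id, rule.relation)
                totals[key] = totals.get(key, 0.0) + rule.score
    return totals


# Loosely modelled on Rule-1: range 1, NS starts with "NAZE-NARA" (score assumed).
reason_rule = Rule(1, None, lambda cs, ns: ns[:1] == ["NAZE-NARA"], "reason", 20.0)
# Purely illustrative long-range rule; it is not Rule-2 from Table 1.
shared_rule = Rule(None, None, lambda cs, ns: bool(set(cs) & set(ns)), "elaboration", 5.0)

candidates = [("s2", ["parser", "fails"], "elaboration"), ("s1", ["the", "parser"], "start")]
ns = ["NAZE-NARA", "parser", "is", "slow"]
totals = apply_rules([reason_rule, shared_rule], candidates, ns)
```
]]>
 The reason rule fires only for the nearest CS because its range is 1, while the range-free rule also reaches the farther CS.</Paragraph>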
    </Section>
    <Section position="3" start_page="1124" end_page="1124" type="sub_section">
      <SectionTitle>
3.3 Detection of Word/Phrase Chain
</SectionTitle>
      <Paragraph position="0"> In general, a sentence can be divided into two parts: a topic part and a non-topic part. When two sentences are in a topic chaining relation, the same topic is maintained through them. Therefore, the occurrence of identical/synonymous words/phrases (the word/phrase chain) in the topic parts of two sentences supports this relation. In the case of a topic-dominant chaining relation, a dominant constituent introduced in a non-topic part of a prior sentence becomes a topic in a succeeding sentence. So, the word/phrase chain from a non-topic part of a prior sentence to a topic part of a succeeding sentence supports this relation. However, since there are many clues for an NS supporting other relations to some CSs, we must not only find such word/phrase chains but also give some reliable score to the topic chaining or topic-dominant chaining relation. In order to do this, we give scores to words/phrases in topic and non-topic parts according to the degree of their importance in sentences; we also give scores to the matching of identical/synonymous words/phrases according to the degree of their agreement. Then we give these relations the sum of the scores of the two chained words/phrases and the score of their matching (Figure 3).</Paragraph>
      <Paragraph position="1"> All of these are done by applying rules consisting of a pattern for a partial dependency structure and a score. For example, by Rule-a and Rule-b in Table 2, words in a phrase whose head word is followed by the topic marking postposition &quot;WA&quot; are given some scores as topic parts. A word in a non-topic part in the sentential style &quot;... GA ARU (there is ...)&quot; is given a large score by Rule-c in Table 2, because this word is important new information in the sentence and topic-dominant chaining relations involving it often occur. Matching of phrases like &quot;A of B&quot; is given a larger score than that of a word like &quot;A&quot; alone by Rule-d and Rule-e in Table 2.3</Paragraph>
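      <Paragraph position="2"> The chain scoring described in this section (score of each chained item plus the score of their matching) can be sketched as below. All numbers are assumptions, since Table 2 is not reproduced here; taking the best single chain per relation is also an assumption, and match_score, best_chain and chain_scores are hypothetical names.
<![CDATA[
```python
from typing import Dict

# Each sentence part maps a word/phrase to its importance score.
Part = Dict[str, float]


def match_score(a: str, b: str, synonyms: Dict[str, str]) -> float:
    """Assumed scores: 10 for identical items, 5 for thesaurus synonyms."""
    if a == b:
        return 10.0
    if synonyms.get(a, a) == synonyms.get(b, b):
        return 5.0
    return 0.0


def best_chain(prior: Part, later: Part, synonyms: Dict[str, str]) -> float:
    """Best chain score: both item scores plus the score of their matching."""
    best = 0.0
    for a, score_a in prior.items():
        for b, score_b in later.items():
            m = match_score(a, b, synonyms)
            if m > 0:
                best = max(best, score_a + score_b + m)
    return best


def chain_scores(cs_topic: Part, cs_non_topic: Part, ns_topic: Part,
                 synonyms: Dict[str, str]) -> Dict[str, float]:
    return {
        # topic part of the CS chained to the topic part of the NS
        "topic chaining": best_chain(cs_topic, ns_topic, synonyms),
        # non-topic part of the CS chained to the topic part of the NS
        "topic-dominant chaining": best_chain(cs_non_topic, ns_topic, synonyms),
    }


# Hypothetical CS "the parser uses a grammar" followed by an NS about "the grammar".
result = chain_scores({"parser": 8.0}, {"grammar": 10.0}, {"grammar": 8.0}, {})
```
]]>
 Here only the non-topic item of the CS recurs as the topic of the NS, so only the topic-dominant chaining relation receives a score (10 + 8 + 10).</Paragraph>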
    </Section>
    <Section position="4" start_page="1124" end_page="1125" type="sub_section">
      <SectionTitle>
3.4 Calculation of Similarity between
Sentences
</SectionTitle>
      <Paragraph position="0"> When two sentences have a list or contrast relation, they have a certain similarity. However, their similarity cannot be detected by rules like the above, which look at relatively small blocks in sentences, because it is not a simple similarity but a similarity in the sequence of words and their grammatical structures as a whole.</Paragraph>
      <Paragraph position="1"> In order to measure such a similarity, we extended our dynamic programming method for detecting the scope of a coordination in a sentence (Kurohashi and Nagao 1992b). This method can calculate the overall similarity value between two word-strings of arbitrary lengths. First, the similarity value between two words is calculated according to exact matching, matching of their parts of speech, and their closeness in a thesaurus dictionary.
 3 One difficult problem is that authors often use subtly different expressions, not identical words/phrases, for such chains. While some of them can be caught by using a thesaurus and by rules like Rule-f in Table 2, there is a wide range of variety in their differences. Their complete treatment will be a target of our future work.</Paragraph>
      <Paragraph position="3"> As for rules for topic/non-topic parts, the score is given to the bunsetsu marked by a square. As for rules for matching, &quot;X&quot; and &quot;x&quot; denote identical words or synonymous words from the Japanese thesaurus &quot;Bunrui Goi Hyou&quot;. So do &quot;Y&quot; and &quot;y&quot;.</Paragraph>
      <Paragraph position="4"> Then, the similarity value between two word-strings is calculated roughly by combining the similarity values between words in the two word-strings.</Paragraph>
      <Paragraph position="5"> While originally we calculated the similarity value between possible conjuncts in a sentence, here we calculate the similarity value between two sentences, a CS and an NS, by this method. This can be done simply by connecting the two sentences and calculating the similarity value between two imitative conjuncts consisting of the two sentences. We give the normalized similarity score between a CS and an NS (divided by their average length) to their list and contrast relations as a reliable score.</Paragraph>
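      <Paragraph position="6"> The normalization step can be sketched as follows. Note that this sketch does not reproduce the coordination-scope method of Kurohashi and Nagao (1992b); it swaps in a plain monotone-alignment dynamic program, omits thesaurus closeness, and uses an assumed word-similarity scale. The names word_sim, alignment_score and normalized_similarity are hypothetical.
<![CDATA[
```python
from typing import List, Tuple

Word = Tuple[str, str]  # (surface form, part of speech)


def word_sim(a: Word, b: Word) -> float:
    """Assumed scale: exact match 2.0, same part of speech 1.0, else 0.0."""
    if a[0] == b[0]:
        return 2.0
    if a[1] == b[1]:
        return 1.0
    return 0.0


def alignment_score(xs: List[Word], ys: List[Word]) -> float:
    """Best monotone alignment score; skipped words contribute nothing."""
    prev = [0.0] * (len(ys) + 1)
    for x in xs:
        cur = [0.0]
        for j, y in enumerate(ys, start=1):
            cur.append(max(prev[j], cur[j - 1], prev[j - 1] + word_sim(x, y)))
        prev = cur
    return prev[-1]


def normalized_similarity(cs: List[Word], ns: List[Word]) -> float:
    """Alignment score divided by the average length of the two sentences."""
    if not cs or not ns:
        return 0.0
    return alignment_score(cs, ns) / ((len(cs) + len(ns)) / 2)


# Hypothetical sentence pair with parallel structure.
cs = [("parser", "N"), ("is", "V"), ("fast", "ADJ")]
ns = [("tagger", "N"), ("is", "V"), ("slow", "ADJ")]
similarity = normalized_similarity(cs, ns)
```
]]>
 Dividing by the average length keeps long sentence pairs from receiving high list or contrast scores merely because they contain more words.</Paragraph>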
    </Section>
  </Section>
</Paper>