<?xml version="1.0" standalone="yes"?> <Paper uid="C00-1060"> <Title>A Hybrid Japanese Parser with Hand-crafted Grammar and Statistics</Title> <Section position="4" start_page="412" end_page="414" type="metho"> <SectionTitle> 3 The Hybrid Parsing Method </SectionTitle> <Paragraph position="0"> This section describes the procedure of parsing with the Triplet/Quadruplet Model. Our hybrid parsing method proceeds as follows: * At the beginning, dependency structures are obtained from trees generated by SLUNG. For each bunsetsu, modification candidates are enumerated, and if there are four or more candidates, they are restricted to three. The heuristic used in this process is described in Section 3.1.</Paragraph> <Paragraph position="1"> * Then, with the Triplet/Quadruplet Model and maximum entropy estimation, probabilities of the dependencies are calculated. Section 3.2 discusses the characteristics and advantages of the model.</Paragraph> <Paragraph position="2"> * Finally, the most preferable trees for the whole sentence are selected.</Paragraph> <Section position="1" start_page="412" end_page="412" type="sub_section"> <SectionTitle> 3.1 Restriction of Modification Candidates </SectionTitle> <Paragraph position="0"> Kanayama et al. (1999) report that when modification candidates are enumerated according to SLUNG, 98.6% of the correct modifiees are in one of the following three positions among the candidates: the nearest one from the modifier, the second nearest one, and the farthest one.</Paragraph> <Paragraph position="1"> As a consequence, we can simplify the problem by considering only these three candidates and discarding the other candidates, with only 1.4% potential errors. We therefore assume that the
number of modification candidates is always three or less.</Paragraph> <Paragraph position="2"> This idea is similar to that of Sekine (2000)'s study, which restricts the candidates to five, but in his case, without a grammar.</Paragraph> </Section> <Section position="2" start_page="412" end_page="414" type="sub_section"> <SectionTitle> 3.2 The Triplet/Quadruplet Model </SectionTitle> <Paragraph position="0"> The Triplet/Quadruplet Model calculates the likelihood of the dependency between bunsetsu i and bunsetsu cn, P(i → cn), with the formulas (8) and (9), where cn denotes the nth candidate among bunsetsu i's candidates; Φi denotes some attributes of i; and Φcn denotes attributes of cn (including attributes between i and cn).</Paragraph> <Paragraph position="1"> P(i → cn) = P(n | Φi, Φc1, Φc2) (8) (when i has two candidates), P(i → cn) = P(n | Φi, Φc1, Φc2, Φc3) (9) (when i has three candidates).</Paragraph> <Paragraph position="2"> As (8) and (9) suggest, the model considers attributes of the modifier bunsetsu and attributes of all modification candidates simultaneously in the conditional parts of the probabilities. Moreover, what is calculated is not the probability of &quot;whether the dependency is correct (T, see Formula (6))&quot;, but the probability of &quot;which of the given candidates is chosen as the modifiee (n = 1, 2, or 3)&quot;. These characteristics imply the following two advantages.</Paragraph> <Paragraph position="3"> Advantage 1: A new distance metric. The correct modifiee can be chosen by considering the relative position among grammatically licensed candidates, instead of the absolute distance between bunsetsus.</Paragraph> <Paragraph position="4"> Advantage 2: Treating alternative trees. The candidates are taken into consideration simultaneously. But because the modification candidates are restricted to at most three, we considerably avoid data-sparseness problems.</Paragraph> <Paragraph position="5"> Below we discuss these advantages in order. 
These advantages clarify the differences from previous models described in Section 2.1, and are empirically confirmed through the experiments in Section 4.</Paragraph> <Paragraph position="6"> 3.2.1 Advantage 1: A new distance metric As discussed in Section 2.1, the distance metric Ai,j used in previous statistical methods was obtained simply by counting the intervening words or bunsetsus between i and j. On the other hand, we use the relative position among the modification candidates as the distance metric. The following examples illustrate a difference between these two types of metric. The correct modifiee of kare-ga is hashiru-no-wo in both (10a) and (10b).</Paragraph> <Paragraph position="7"> (10)a. kare-ga hashiru-no-wo mita koto he-SUBJ run see fact (the fact that I saw him run) b. kare-ga yukkuri hashiru-no-wo mita koto he-SUBJ slowly run see fact (the fact that I saw him run slowly) In previous models, (10a) and (10b) would yield Pa(kare-ga → hashiru-no-wo) = P(T | kare-ga, hashiru-no-wo, A1) and Pb(kare-ga → hashiru-no-wo) = P(T | kare-ga, hashiru-no-wo, A2) respectively, where A1 = 1 and A2 = 2. Then, the two probabilities above do not have the same value in general.</Paragraph> <Paragraph position="8"> Our grammar does not allow the dependency &quot;kare-ga → yukkuri&quot; for (10b). The modification candidates of kare-ga are hashiru-no-wo and mita, hence (8) gives the probability between kare-ga and hashiru-no-wo as P(1 | kare-ga, hashiru-no-wo, mita) in both examples.</Paragraph> <Paragraph position="10"> Thus, P(kare-ga → hashiru-no-wo) has the same value for both examples. Our interpretation of this difference is summarized as follows. The word yukkuri is an adverb modifying the verb hashiru. Our linguistic intuition tells us that the presence of such an adverb should not affect the strength of the dependency between kare-ga and hashiru-no-wo. According to this intuition, the existence of the adverb should be considered as noise. 
Our model allows us to ignore such noise in learning from the annotated corpus, while previous models are affected by such noisy elements.</Paragraph> <Paragraph position="11"> Contrary to the previous examples, Taro-no in (11) modifies different modification candidates. In example (11a), &quot;Taro-no → musume&quot; is the correct dependency, while &quot;Taro-no → musume&quot; is not correct in (11b). This difference is caused by the bunsetsu between Taro-no and musume: kawaii (Adj) in (11a) and yuujin-no (NP) in (11b). Actually, the grammar allows Taro-no to depend on either of these types of words. Thus, in our model, P(Taro-no → musume) = P(2 | Taro-no, kawaii, musume) for (11a), and P(Taro-no → musume) = P(2 | Taro-no, yuujin-no, musume) for (11b).</Paragraph> <Paragraph position="13"> Then, P(Taro-no → musume) has different values for the two examples. In the annotated corpus, P(2 | Taro-no, kawaii, musume) tends to have a high value since kawaii is an adjective. However, since yuujin-no is an NP, P(2 | Taro-no, yuujin-no, musume) tends to have a low value.</Paragraph> <Paragraph position="14"> Now consider previous models.</Paragraph> <Paragraph position="16"> Then, contrary to our model, P(Taro-no → musume) has exactly the same value for both examples. The outcome is determined by the comparison between P(T | Taro-no, kawaii, 1) (or P(T | Taro-no, yuujin-no, 1)) and P(T | Taro-no, musume, 2). In text corpora, P(T | Taro-no, yuujin-no, 1) tends to be high, and consequently, P(T | Taro-no, musume, 2) is very small. These values will make the correct prediction for (11b), as yuujin-no will be favored over musume. However, for (11a), these models are likely to incorrectly favor kawaii over musume. 
This is because P(T | Taro-no, musume, 2), being very small, is likely to be smaller than P(T | Taro-no, kawaii, 1).</Paragraph> </Section> </Section> <Section position="5" start_page="414" end_page="414" type="metho"> <SectionTitle> 4 Experiments and Discussion </SectionTitle> <Paragraph position="0"> This section reports a series of parsing experiments with our model, and gives some discussion.</Paragraph> <Section position="1" start_page="414" end_page="414" type="sub_section"> <SectionTitle> 4.1 Environments </SectionTitle> <Paragraph position="0"> We used the EDR Japanese Corpus (EDR, 1996) for training and evaluation of parsing accuracy. The EDR Corpus is a Japanese treebank which consists of 208,157 sentences from newspapers and magazines. We used 192,778 sentences for training, 6,744 for pre-analysis (as reported in Section 3.1), and 3,372 for testing 3.</Paragraph> <Paragraph position="1"> With the triplets constituted of a modifier and two modification candidates extracted from the learning corpus, the Triplet Model is constructed. With the quadruplets constituted of a modifier and three candidates, the Quadruplet Model is constructed.</Paragraph> <Paragraph position="2"> These models are estimated by the ChoiceMaker Maximum Entropy Estimator (Borthwick, 1999).</Paragraph> <Paragraph position="3"> The features for the estimation are listed in Table 1. The feature values partially follow other studies, e.g. Uchimoto et al. (1999), and JUMAN's outputs are used for POS classification. Mainly the head of the bunsetsu (the rightmost morpheme in a bunsetsu, except for those whose major POS is &quot;peculiar&quot;, &quot;auxiliary verb&quot;, &quot;particle&quot;, &quot;suffix&quot; or &quot;copula&quot;) and the type of the bunsetsu (the rightmost morpheme in a bunsetsu, except for those whose major POS is &quot;peculiar&quot;) are used as the attributes. 
We show the meaning of some features below.</Paragraph> <Paragraph position="4"> POS: JUMAN's minor POS (for both &quot;head&quot; and &quot;type&quot;).</Paragraph> <Paragraph position="5"> particle, adverb: Frequent words: 26 particles and 69 adverbs.</Paragraph> <Paragraph position="6"> head lex: 2,614 lexical forms regardless of their POS.</Paragraph> <Paragraph position="7"> type lex: 70 suffixes or auxiliary verbs.</Paragraph> <Paragraph position="8"> inflection: 6 types of inflection: &quot;normal&quot;, &quot;adverbial&quot;, &quot;adnominal&quot;, &quot;te-form&quot;, &quot;ta-form&quot;, and &quot;others&quot;.</Paragraph> <Paragraph position="9"> The column &quot;variation&quot; in Table 1 denotes the number of possible values for the feature. &quot;Valid features&quot; indicates the number of features which appeared three times or more in the training corpus.</Paragraph> </Section> </Section> <Section position="6" start_page="414" end_page="414" type="metho"> <SectionTitle> 4.2 Results </SectionTitle> <Paragraph position="0"> With our model and the features described above, the accuracy shown in Table 2 is achieved. We evaluate the following two types of accuracy. (Footnote 3: 5,263 sentences were removed because the order of the words in the annotation differed from that in the original sentences.)</Paragraph> <Paragraph position="1"> (Table 1 note: ... related to the modifiee, thus they are considered for each candidate. Features from 19 to 27 are combination features.) Bunsetsu accuracy: The percentage of bunsetsus whose modifiee is correctly identified. The denominator includes all bunsetsus except for the last bunsetsu of a sentence.</Paragraph> <Paragraph position="2"> Sentence accuracy: The percentage of sentences whose dependencies are perfectly correct.</Paragraph> <Paragraph position="3"> &quot;In-coverage sentences&quot; is the accuracy for the sentences for which SLUNG could generate parse trees. 
We give the accuracy for &quot;All sentences&quot; too, by partially parsing the sentences which SLUNG fails to parse. The coverage of SLUNG is about 99%, thus high accuracy is achieved even for &quot;All sentences&quot;. Moreover, we conducted a series of experiments in order to evaluate the contribution of each characteristic of our parsing model. The parsing schemes used are the four in Figure 3. The major differences among them are (I) whether a grammar is used, (II) whether modification candidates are restricted to three, and (III) whether a previous pair model with Formula (6) or the Triplet/Quadruplet Model with Formulas (8), (9) was used.</Paragraph> <Paragraph position="4"> W/O Grammar Model: This model does not use a grammar. (Figure 3 caption: &quot;G&quot; indicates whether the grammar is used, &quot;R&quot; indicates whether the modification candidates are restricted to three, and &quot;F&quot; denotes the formula; &quot;P&quot; is the pair formula (6), and &quot;T&quot; is the Triplet/Quadruplet formula (8), (9).)</Paragraph> <Paragraph position="5"> Likelihood values for dependencies are calculated for all bunsetsus that follow a modifier bunsetsu. Formula (6) is used, and as the distance metric Ai,j, the number of bunsetsus between the modifier and the modifiee 4 is combined with all features. In general lines, this model corresponds to models such as (Fujio and Matsumoto, 1998; Haruno et al., 1998; Uchimoto et al., 1999).</Paragraph> </Section> <Section position="7" start_page="414" end_page="414" type="metho"> <SectionTitle> W/O Restriction Model </SectionTitle> <Paragraph position="0"> Modification candidates are restricted by SLUNG. 
The remaining is the same as the W/O Grammar Model.</Paragraph> <Paragraph position="1"> Pair Model: Modification candidates are restricted to three, in the way described in Section 3.1.</Paragraph> <Paragraph position="2"> The remaining is the same as the W/O Grammar Model.</Paragraph> <Paragraph position="3"> Triplet/Quadruplet Model: This is the model proposed in the paper. Modification candidates are restricted to three, and Formula (8) or (9) is used.</Paragraph> <Paragraph position="4"> From the results shown in Table 3, we can say our method contributes to the improvement of our parser, for the following reasons:</Paragraph> </Section> <Section position="8" start_page="414" end_page="416" type="metho"> <SectionTitle> * The Triplet/Quadruplet Model outperforms the </SectionTitle> <Paragraph position="0"> Pair Model by 0.9%. Both of them restrict modification candidates to three, but the accuracy got higher when all candidates are considered simultaneously. This is because of the two advantages described in Section 3.2.</Paragraph> <Paragraph position="1"> * The Pair Model outperforms the W/O Restriction Model by 0.3%. Thus the restriction of modification candidates does not reduce the accuracy. * The W/O Restriction Model outperforms the W/O Grammar Model by 0.7%. This means that the use of a grammar as a preprocessor works well to pick out possible modifiees.</Paragraph> <Paragraph position="2"> We found that many structures similar to the ones described in Section 3.2 appeared in the EDR corpus. (Footnote 4: Three values, &quot;1&quot;, &quot;from 2 to 5&quot;, and &quot;6 or more&quot;, are distinguished.) Our Triplet/Quadruplet Model could treat these structures precisely as we intended. 
This is the main factor that contributed to the improvement of the overall parsing accuracy.</Paragraph> <Paragraph position="3"> Based on the above experiments, we can say that our approach of using the grammar as a preprocessor before calculating the probabilities is appropriate for improving parsing accuracy.</Paragraph> <Section position="1" start_page="414" end_page="414" type="sub_section"> <SectionTitle> 4.3 Comparison to other models </SectionTitle> <Paragraph position="0"> There are several works which use the EDR corpus for evaluation. The decision tree model (Haruno et al., 1998) achieves around 85%, the integrated model of lexical/syntactic information (Shirai et al., 1998) achieves around 86%, and the lexicalized statistical model (Fujio and Matsumoto, 1999) achieves 86.8% in bunsetsu accuracy. Our model outperforms all of them by 2 or 3%.</Paragraph> <Paragraph position="1"> Shirai et al. (1998) used the Kyoto University text corpus (Kurohashi and Nagao, 1997) for evaluation and achieved around 86%. Uchimoto et al. (2000) also used the Kyoto Corpus, and their accuracy was 87.9%. For comparison, we applied our method to the same 1,246 sentences that Uchimoto et al. (2000) used. The result is shown in Table 4.</Paragraph> <Paragraph position="2"> Our result is worse than theirs. The reason is thought to be as follows: * We use the EDR corpus for training. Although we used around 24 times the amount of training data that Uchimoto et al. used, our training data lead to errors in the analysis of the Kyoto Corpus, because of differences in the annotation schemes adopted.</Paragraph> <Paragraph position="3"> * Uchimoto et al. used the correct morphological analyses, but we used JUMAN. 
Sometimes this may cause errors.</Paragraph> <Paragraph position="4"> * The grammar SLUNG was designed for the EDR corpus, and some types of structures in the Kyoto Corpus are not allowed.</Paragraph> <Paragraph position="5"> Clearly, our parser should be improved to overcome these problems and compared with other works directly.</Paragraph> </Section> <Section position="2" start_page="414" end_page="416" type="sub_section"> <SectionTitle> 4.4 Discussion and Future Work </SectionTitle> <Paragraph position="0"> The following are some observations about the speed of our parser. Existing statistical parsers are quite efficient compared to grammar-based systems. In particular, our system used an HPSG-based grammar, whose speed is said to be slow. However, recent advances in HPSG parsing (Torisawa et al., 2000) enabled us to obtain a unique parse tree with our system in 0.5 sec. on average for sentences in the EDR corpus.</Paragraph> <Paragraph position="1"> Future work shall extend SLUNG so that semantic representations are produced. Carroll et al. (1998) discussed the precision of argument structures. We believe that the focus of our study will shift from a shallow level to such a deeper level for our final aim, the realization of intelligent natural language processing systems.</Paragraph> </Section> </Section> </Paper>
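The candidate-restriction heuristic of Section 3.1 can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name and the index-based representation of candidates are invented for the example.

```python
# Toy sketch of the candidate-restriction heuristic in Section 3.1
# (illustrative; the function name and list representation are invented).
# SLUNG enumerates the grammatically licensed modification candidates of
# a modifier bunsetsu; if there are four or more, only the nearest, the
# second nearest, and the farthest are kept (losing at most 1.4% of
# correct modifiees, per Kanayama et al., 1999).

def restrict_candidates(candidates):
    """Keep at most three candidates: nearest, second nearest, farthest.

    `candidates` is assumed ordered from nearest to farthest.
    """
    if len(candidates) <= 3:
        return list(candidates)
    return [candidates[0], candidates[1], candidates[-1]]

# Five licensed candidates, given as bunsetsu positions:
print(restrict_candidates([2, 4, 5, 7, 9]))  # [2, 4, 9]
```

With three or fewer candidates the list is returned unchanged, which is why the paper can treat "three or less" as an invariant downstream.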
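The contrast between the pair model and the Triplet/Quadruplet selection of Formulas (8) and (9) can be sketched as a toy classifier over candidate positions. The score table below is invented for illustration (a hand-set stand-in for the trained maximum entropy model over the Table 1 features); it only mirrors the (11a)/(11b) discussion and is not the authors' estimator.

```python
import math

# Toy contrast with the pair model: Formulas (8)/(9) assign ONE
# distribution over candidate positions n = 1..k, conditioned on the
# modifier and ALL of its candidates at once.
# NOTE: the score table is invented for illustration only.

def candidate_distribution(modifier, candidates, score):
    """P(n | modifier, candidates) as a softmax over per-pair scores."""
    s = [score.get((modifier, c), 0.0) for c in candidates]
    z = sum(math.exp(v) for v in s)
    return [math.exp(v) / z for v in s]

# Hand-set scores mirroring examples (11a)/(11b): Taro-no attaches to
# musume across an adjective (kawaii), but to an adjacent NP (yuujin-no).
score = {("Taro-no", "kawaii"): 0.5,
         ("Taro-no", "musume"): 2.0,
         ("Taro-no", "yuujin-no"): 2.5}

p_11a = candidate_distribution("Taro-no", ["kawaii", "musume"], score)
p_11b = candidate_distribution("Taro-no", ["yuujin-no", "musume"], score)
# In (11a) the second candidate (musume) gets the higher probability;
# in (11b) the first candidate (yuujin-no) does.
```

Because the distribution is conditioned on the whole candidate set, the same pair (Taro-no, musume) receives different probabilities in the two sentences, which is exactly the point of Advantage 2.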
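The two evaluation measures of Section 4.2 can be sketched as follows, assuming a parse is represented as a list of modifiee indices with a placeholder for the sentence-final bunsetsu (an invented representation for illustration).

```python
# Toy sketch of the two accuracy measures in Section 4.2 (illustrative;
# the parse representation is invented). A parse is a list with one
# modifiee index per bunsetsu; the sentence-final bunsetsu has no
# modifiee (None) and is excluded from the bunsetsu-accuracy denominator.

def bunsetsu_accuracy(gold_parses, sys_parses):
    """Fraction of non-final bunsetsus whose modifiee is correct."""
    correct = total = 0
    for gold, sys in zip(gold_parses, sys_parses):
        for g, s in zip(gold[:-1], sys[:-1]):  # skip sentence-final bunsetsu
            total += 1
            correct += (g == s)
    return correct / total

def sentence_accuracy(gold_parses, sys_parses):
    """Fraction of sentences whose dependencies are all correct."""
    ok = sum(g[:-1] == s[:-1] for g, s in zip(gold_parses, sys_parses))
    return ok / len(gold_parses)

gold = [[1, 2, 3, None], [1, 2, None]]   # two gold sentences
pred = [[1, 3, 3, None], [1, 2, None]]   # one attachment error in sent. 1
print(bunsetsu_accuracy(gold, pred))     # 4 of 5 bunsetsus -> 0.8
print(sentence_accuracy(gold, pred))     # 1 of 2 sentences -> 0.5
```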