File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/94/c94-1011_metho.xml
Size: 7,240 bytes
Last Modified: 2025-10-06 14:13:38
<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1011"> <Title>Portable Knowledge Sources for Machine Translation</Title> <Section position="3" start_page="86" end_page="87" type="metho"> <SectionTitle> 4. Disambiguation Method </SectionTitle> <Paragraph position="0"> The basis of disambiguation of the three types of ambiguity discussed earlier is to prefer the best PKS in the document list that matches the ambiguity, and to interpret the PKS as a rule for selecting a word sense, phrasal attachment, or word translation.</Paragraph> <Paragraph position="1"> If there is no matching PKS, either the ambiguity is properly handled by the system, which results in no user correction in the documents, or it is new to the system. In the former case, the user will probably be satisfied with the translation by the system. In the latter case, the translation may have to be corrected by the user, but the interaction will be recorded as a new PKS and used for future disambiguation. The matching algorithm for PK1 and PK3 rules is a simple exact matching of words and lexical features. If two or more PKS rules match the ambiguous word, the ages of the rules and the ordering of documents in the document list uniquely determine the most preferable PKS. 
The PK2 rules, however, can be used with a more flexible matching algorithm [?, 12], since the coverage of PK2 rules would be very limited if two phrases (the modifier and modifiee phrases) had to match the rule exactly.</Paragraph> <Paragraph position="2"> Once the document list has been given, the PK1 and PK3 rules can be converted in polynomial time into a simple lookup table, where the key is an ambiguous word, and only the most preferable rules are stored or retrieved (footnote 5). PK2 rules can be organized similarly as a ternary lookup table.</Paragraph> <Paragraph position="3"> It should be noted that sentences in a document list can be utilized as an example base [9], since the documents in the document list have already been translated, and the translation of the source sentence is readily available. Indeed, the conventional matching algorithm for a flat example base has to be extended into a hierarchical one, where the latest translation has the highest priority, and PKSs must be equally taken into consideration.</Paragraph> </Section> <Section position="4" start_page="87" end_page="87" type="metho"> <SectionTitle> 5. Knowledge Source Compilation </SectionTitle> <Paragraph position="0"> When a set of documents in one domain grows considerably, or when the MT system is to be transported to a different environment, it is convenient to be able to compile PKSs into a single, portable user dictionary.</Paragraph> <Paragraph position="1"> The compilation is similar to the creation of lookup tables, described in the previous section. The numbers of conflicting arcs and paths should be carefully examined to see whether a given document list yields a consistent user dictionary. The user can rearrange the ordering of documents, and choose the most preferable among conflicting PKS rules to make the optimal user dictionary for the domain.</Paragraph> <Paragraph position="2"> The rearrangement of documents in the document list does not change the resulting PKS graphs. 
It just changes the preferences among the conflicting arcs or paths. Therefore, the optimal construction of a user dictionary does not have to consider an exponential number of possible document orderings, but only a polynomial number of the following pairwise constraints: * If there are conflicting arcs $a_1, a_2, \ldots, a_k$ in the PK1 (or PK3) graph, and the most preferable arc is $a_i$, the document $d_i$ having the PK1 (or PK3) rule for $a_i$ must be preceded by each document $d_j$ having the arc $a_j$ ($j = 1, \ldots, k$; $j \neq i$).</Paragraph> <Paragraph position="3"> * If there are conflicting paths $p_1, p_2, \ldots, p_k$ in the PK2 graph, and the most preferable path is $p_i$, the document $d_i$ having the PK2 rule for the first arc in $p_i$ must be preceded by each document $d_j$ having the PK2 rule for the first arc in $p_j$ ($j = 1, \ldots, k$; $j \neq i$).</Paragraph> <Paragraph position="4"> Footnote 5: Alternatively, all the conflicting PKS rules can be stored to give the user as many candidates as possible.</Paragraph> <Paragraph position="5"> It is decidable in polynomial time whether there is an ordering of documents that satisfies all of the above constraints. An ordering of documents exists iff the constraints are not cyclic (that is, iff no document D must precede itself).</Paragraph> <Paragraph position="6"> Even if there is no linear ordering of such documents, the user dictionary can still be created from the user-selected arcs and paths. In this case, however, there is no natural correspondence between the user dictionary and a document list. Such a correspondence is indispensable if the user wishes to update the user dictionary when a new document is added at an arbitrary position in a document list. If the user dictionary is equivalently reducible to a document list, recompilation of the document list into the user dictionary is straightforward. 
When no such equivalent list exists, a document may only be added to the tail of the list, thus overruling all the conflicting PKSs.</Paragraph> </Section> <Section position="5" start_page="87" end_page="88" type="metho"> <SectionTitle> 6. Alternative Views of Knowledge Organization </SectionTitle> <Paragraph position="0"> In Section 3, we took a simplified view of the PKS organization, which we may call an &quot;optimistic organization.&quot; It was implicitly assumed that only the elements in PKSs can conflict with each other. However, the system's default choice of word senses may have satisfied a user, but may conflict with a PKS newly added to the document list. Thus, PKSs need to be more carefully organized if the user thinks that the translation by the system is adequate without the PKSs. This view may be called a &quot;pessimistic organization&quot; of PKSs. The optimistic organization is easier to implement, while the pessimistic organization can provide users with more consistent translation.</Paragraph> <Paragraph position="1"> In the pessimistic organization, the conflicting knowledge has to be defined in terms of PKSs and the choices of word sense, phrasal attachment, and word translation by the system. Technically, it means that for every PKS in the document list, we have to examine each rule in the PKS to determine whether it conflicts with a preceding PKS rule or a choice by the system. This is often very time-consuming. One way to deal with this pessimistic view is to keep track of the document list with which a new document is translated. It can easily be shown that time-consuming checking of PKS conflicts can be avoided by employing a monotonously growing sequence of document lists, $\{d_1\}, \{d_1, d_2\}, \ldots, \{d_1, d_2, \ldots, d_k\}$, such that each document $d_i$ in the list has been translated by using the PKSs in the document list $\{d_1, d_2, \ldots, d_{i-1}\}$.</Paragraph> </Section> </Paper>
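The pairwise ordering constraints of Section 5 can be illustrated with a minimal sketch. The paper states only that the existence of a consistent document ordering is polynomially decidable and does not name an algorithm; the sketch below assumes a standard topological sort (Kahn's algorithm) as one way to perform the check. The function names `pairwise_constraints` and `order_documents`, and the representation of a conflict as a (preferred document, other documents) pair, are hypothetical and chosen for illustration only.

```python
from collections import defaultdict, deque


def pairwise_constraints(conflicts):
    """Turn conflicts into (earlier, later) document pairs.

    Each conflict is (preferred_doc, other_docs): the document holding the
    most preferable arc or path must be preceded by every other document
    holding a conflicting one. This input format is an assumption.
    """
    pairs = []
    for preferred, others in conflicts:
        pairs.extend((other, preferred) for other in others)
    return pairs


def order_documents(constraints):
    """Return a document list satisfying all constraints, or None if cyclic."""
    successors = defaultdict(set)
    indegree = defaultdict(int)
    docs = set()
    for earlier, later in constraints:
        docs.update((earlier, later))
        if later not in successors[earlier]:
            successors[earlier].add(later)
            indegree[later] += 1

    # Kahn's algorithm: repeatedly emit a document with no unmet predecessor.
    ready = deque(sorted(d for d in docs if indegree[d] == 0))
    order = []
    while ready:
        doc = ready.popleft()
        order.append(doc)
        for nxt in sorted(successors[doc]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Documents left unplaced lie on a cycle: some document would have to
    # precede itself, so no linear ordering exists.
    return order if len(order) == len(docs) else None


if __name__ == "__main__":
    # The user prefers d3's rule over those of d1 and d2, and d2's over d1's.
    consistent = pairwise_constraints([("d3", ["d1", "d2"]), ("d2", ["d1"])])
    print(order_documents(consistent))
    # Contradictory preferences between d1 and d2 admit no ordering.
    cyclic = pairwise_constraints([("d2", ["d1"]), ("d1", ["d2"])])
    print(order_documents(cyclic))
```

In this sketch the most preferable document ends up nearest the tail of the returned list, which matches the paper's convention that a document added at the tail overrules conflicting PKSs. A result of `None` corresponds to the case where the user dictionary can still be built from the user-selected arcs and paths but has no equivalent document list.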