<?xml version="1.0" standalone="yes"?>
<Paper uid="C86-1012">
  <Title>Particle Homonymy and Machine Translation</Title>
  <Section position="3" start_page="0" end_page="59" type="metho">
    <SectionTitle>
2. Particle Homonymy
</SectionTitle>
    <Paragraph position="0"> Let us consider the following pairs of sentences: i a. There is ~n~u a little beer left. b. ~ was ?~ too pleased to leave that p~ace.</Paragraph>
    <Paragraph position="1">  2 a. Nur ihn hatte man vergessen. b. Woz~ babe ich nut gelebt? 3 a. Vous partez ddja? b. Comment vous vous appelez ddja? 4 a. ~pu~oCume ~ ~a~ u saempa. b. ~ u ~le enam, ~mo c~asamo. 5 a. Ann~ is elj~n hozzdnk? b. Hol is tartottunk?  The words underlined in the b. example of each pair of sentences belong to a wordgroup now more or less uniformly referred to as 'Modal Particles' /cf. Arndt 1960/. These words represent, in Arndt's term, a granunatical no-man's-land, although in the past ten years there has been a considerable interest towards modal particles.  Words like the English ~ or the German nur present two problems from the point of view of machine translation. On the one hand, they are ambiguous and their homonymy 1~/st be resolved. On the other hand, when such lexemes are used as modal particles, their &amp;quot;translation&amp;quot; causes serious problems since we can rarely translate the modal into German as nut, or, say, into Hungarian as csak.</Paragraph>
  </Section>
  <Section position="4" start_page="59" end_page="59" type="metho">
    <SectionTitle>
3. Resolution of Homonymy
</SectionTitle>
    <Paragraph position="0"> As far as homonymy is concerned, clearly the task is to set up formal rules for the categorization of a given word as opposed to its alternative morphological and syntactic status.</Paragraph>
    <Paragraph position="1"> The implication of the assignation of such homonymous lexemes to certain classes of words is by no means a matter of &amp;quot;simple&amp;quot; selection restriction at surface level. Each modal particle has preserved much of its etymon's syntactic and semantic properties.</Paragraph>
    <Paragraph position="2"> Given this, it follows that the ambiguity may be resolved by constructing small &amp;quot;subgrammars&amp;quot; for each of these particles, so as not only to set them apart from their homonyms, but also to take into consideration the whole co~nunicative content of the sentence. null Thus, a subgrammar recognizing onl\[ either as a logical operator, with its restrictive meaning, or as a modal particle, with its vague and, in a sense, antonymous meaning -- would have to be capable of manipulating information from different levels. By comparing sentences /la/ and /ib/ it could be concluded that, say, ~ is an operator when it precedes an NP /e.g. Det + Adj + N/ and is a particle when followed by too. But this assumption can readily be proved faulty by considering /6/: /6/ If ~ you had come, you could have saved me a lot of trouble.</Paragraph>
    <Paragraph position="3"> It is commonly held that, in order to parse sentences, one needs strategies for locating verbs and their complements, assign-</Paragraph>
    <Paragraph position="5"> ing words to various categories, depending on context /Lehrberger 1982, 102/. The recognition of particles can be done mainly by starting from semantic representations which should contain information concerning both the propositional content of sentences and their extrapropositional, or subjective modal content. Thus, assigning ~ to particles would imply an algorithm roughly defined as: &amp;quot;If the lexeme ~ is used with a word that has no restrictive component in its meaning, then it is a particle; otherwise it is an operator&amp;quot;.</Paragraph>
    <Paragraph position="6"> Parsing along these lines would mean a very complicated presentation of different parts of speech, including not only NPs, made up of adjectives, nouns, but also adverbs, pronouns and even phrases to account for ~n~ constructions like /6/. In addition, a very sophisticated and precise definition of the restriction/non-restriction opposition would have to be set up.</Paragraph>
    <Paragraph position="7"> Obviously, the difficulty of assigning homonymous lexemes to modal particles, on the one hand, and to operators, intensifiers, adverbs,conjunctions, and the like, on the other, lies in the fact that the former bear a relationship to the overall meaning of the sentence, while the latter add their meaning to the global meaning only via some lower level of semantic structure.</Paragraph>
    <Paragraph position="8"> From the above consideration it follows that it would be a fairly tedious and probably unreasonable task to attempt to resolve this kind of homonymy by the algorithmization of abstract sense-components. Instead, it might be sufficient to construct a subgran~ar to check ~ and other homonyms solely by reason of their being a particle. One way to make the information contained in the subgrammar available to the parser may be to indicate, in the dictionary entry of the homonym, all the cases in which the given word could possibly appear as a particle.</Paragraph>
    <Paragraph position="9"> In English, or French, the resolution of ambiguity would mean setting up as few as 6-10 subgrammars, while in German, Russian or Hungarian there are scores of homonymous particles and, consequently, subgrammars. In addition, the latter languages make quite frequent use of particle combinations which do not, as a rule, derive their meanings from a complement of the two /or more/ particles, but have some different meaning, cf. /7/ Csak hem fdztdl meg? /8/ Yx ~e npocmyCunc~ nu m~? Nevertheless, there seems to be no reason why these combinations could not be included in the subgrammar under one or the other dictionary entry.</Paragraph>
  </Section>
class="xml-element"></Paper>