<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2098">
  <Title>ASPECT - A PROBLEM FOR MT</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
ASPECT - A PROBLEM FOR MT
by BARBARA GAWRONSKA
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Introduction
</SectionTitle>
      <Paragraph position="0"> Russian and Polish, two of the five languages involved in the experimental MT-system SWETRA (Dept. of Linguistics, Lund University; cf. Sigurd &amp; Gawrońska-Werngren 1988), are known as typical aspect languages. The lexical inventory of both Russian and Polish contains aspectually marked verb pairs, i.e. each verb (except a small group of biaspectual verbs) is inherently either perfective or imperfective. The distinction is usually marked by a prefix (Pol.</Paragraph>
      <Paragraph position="1"> czytać/przeczytać, R. čitat'/pročitat' 'to read' imp/perf) or a change in the stem (Pol.</Paragraph>
      <Paragraph position="2"> podpisać/podpisywać, R. podpisat'/podpisyvat' 'to sign' perf/imp; Pol. brać/wziąć, R. brat'/vzjat' 'to take' imp/perf). This means that a translator formulating a Polish/Russian equivalent of an English VP almost always has to choose between the two members of a verb pair. Human translators who are native speakers of Russian or Polish normally perform this task without difficulty. What cues do they use when deciding which aspectual variant fits the given context? Can the principles for aspect choice be formalized and used in an MT-system?
The aspect category as a linguistic problem
Do all languages express the category of aspect in some way? What exactly is expressed by this category? Questions like these have been discussed in an enormous number of works in general linguistics. Nevertheless, little agreement has been reached as to the status and the meaning of the aspect category. Some of the most common controversies in the domain of aspectology may be summarized as follows:
1) Should aspect be treated as a universal category or as a language-specific one?
2) Is aspect a purely verbal category, a sentence operator, or primarily a discourse strategy?
3) Is it possible to ascribe an invariant meaning to a certain aspect value? Or must the meaning of an aspectually marked verb be derived from the semantic features of the verbal stem?
Each of the questions above has been answered in different ways. Several aspectologists focus on the discourse functions of aspect (Hopper &amp; Thompson 1980, Wallace 1982, Paprotté 1988); others concentrate on aspect choice in isolated sentences (e.g. DeLancey 1982). There are arguments for an invariant difference between the perfective and the imperfective aspect (Forsyth 1970) as well as for investigating verbal stems one by one in order to discover the meaning of the aspect category (Apresjan 1980).</Paragraph>
      <Paragraph position="3"> Despite all controversies concerning the status and the main function of aspect, most researchers agree that the perfective aspect is normally chosen when referring to events, processes or states (the general term &quot;event-situations&quot; will be used from now on) which are limited, complete or countable, whereas the imperfective aspect alludes to uncompleted event-situations without clear temporal boundaries. This way of describing the distinction between the perfective and the imperfective aspect is found both in traditional descriptive grammars (the Soviet Academic Grammar 1954) and in recent papers by cognitive grammarians (e.g. Langacker 1982, Paprotté 1988). The latter authors argue especially for a parallelism between mass nouns and imperfective verbs and between countable nouns and perfective verbs. The basic conceptual distinction between spatially limited (countable) referential objects and referents without clear spatial limits (denoted by mass nouns) is assumed to apply also to the temporal limits of event-referents: temporally bounded events become &quot;countable&quot;, i.e. perfective, and get the &quot;figure&quot; (foreground) status in a discourse, while event-situations which lack temporal limits (&quot;mass&quot; referents) are expressed by imperfective verbs and function as discourse background.</Paragraph>
      <Paragraph position="4"> The view of the aspect category (at least in Polish and Russian) presented in this paper is partially related to the interpretation proposed by cognitive grammarians. A similarity between typical NP-referents and &quot;event-referents&quot; is also assumed, but instead of treating the perfective/imperfective distinction as reflecting the conceptual difference between &quot;count&quot; and &quot;mass&quot; referents, I prefer to relate the aspect value to another referential feature, namely, to the notion of uniqueness.
[ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992, p. 652; PROC. OF COLING-92, NANTES, AUG. 23-28, 1992]</Paragraph>
      <Paragraph position="5"> The &quot;uniqueness-based&quot; approach
The PROLOG implementation of some rules for aspect choice in translation from Swedish or English into Polish/Russian is based on the assumption that the choice between the perfective and the imperfective aspect in Russian and Polish reflects the distinction between event-situations which are marked as highly specific, unique, and those which are unmarked as to their uniqueness. By &quot;unique&quot; I roughly mean &quot;not identical with another referent in the current universe of discourse from the sender's point of view&quot;. In the Germanic languages, the referents of noun phrases may be marked as unique by the definite article or other definiteness markers, e.g. possessive and demonstrative pronouns. The uniqueness marking may apply both to countable and uncountable referents: the dog is sick refers to a specific entity belonging to the species dog; the wine was good alludes to a specific appearance of the substance in question (e.g. the wine that was drunk at a specific party). In Russian and Polish, a similar function is fulfilled by the perfective aspect, with the difference that the choice of a perfective verb marks the referent of the whole predication (an event-situation) as highly specific, unique, i.e. not identical with other event-situations named in the discourse.</Paragraph>
      <Paragraph position="6"> The distinction between the uniqueness hypothesis and the mass/count interpretation of aspect proposed by cognitive grammarians may seem very subtle. Nevertheless, it is of importance. The mass/count analogy does not account for some &quot;untypical&quot; cases of aspect use which pose difficulties for adult learners of Russian or Polish, e.g. the use of the imperfective aspect in Russian/Polish equivalents of a sentence like Have you already had breakfast/lunch/dinner? (R. Ty uže zavtrakal/obedal/užinal?, Pol.</Paragraph>
      <Paragraph position="7"> Jadłeś już śniadanie/obiad/kolację?). The event referred to is undoubtedly finished and time-limited, i.e. countable, yet in spite of these features it is expressed imperfectively. The use of the perfective variants of these verbs is more restricted: it is possible, e.g., in situations where the sender stresses the importance of the fact that a very specific food portion has, so to speak, disappeared, or when a sequence of specific events is expressed, as in the example below:
R. My poobedali, a potom pošli v kino
we ate-lunch-perf and later went-perf to cinema
'We had eaten lunch and then we went to the cinema'
Here, the perfective aspect points out that the lunch referred to was a unique one (it was followed by the action of going to the cinema), whereas in questions like:
R. Ty uže obedal?
you already ate-dinner-imp
the sender is not interested in a unique case of eating dinner, but merely in whether the addressee is hungry or not; thus, the imperfective aspect is a natural choice, although the event alluded to is a countable one.</Paragraph>
      <Paragraph position="8"> Finding uniqueness cues
The role of the notion of uniqueness can be further illustrated by a fragment of an English text translated into Russian by a human translator. To make the example clearer, I do not quote the whole Russian text, but only specify the aspect values chosen by the translator.</Paragraph>
      <Paragraph position="9"> Sample text (the initial sentences of the preface to &quot;An Introduction to Descriptive Linguistics&quot; by H.A. Gleason; aspect values from a translation into Russian):
1.1 Language is one of the most important and characteristic forms of human behaviour. (no aspect marking - a verbless predicative)
1.2 It has, accordingly, always had a place in the academic world. (imperf)
1.3 In recent years, however, its position has changed greatly. (perf)
The sample text shows that there is no clear correlation between the English tense and the Russian aspect: the aspect value may vary although the tense value of the source text is constant (the Present Perfect is used in both 1.2 and 1.3).</Paragraph>
      <Paragraph position="10"> Thus, tense cannot be used as a primary cue when generating aspect. But if we look for uniqueness indices in the source text and treat them as aspect indices, the result will be quite adequate. In sentence 1.2 (It has, accordingly, always had a place in the academic world), the adverb always indicates that the predication does not refer to any unique situation: the state expressed by 1.2 may be true at any point in time. Hence, the imperfective aspect is the only possible alternative (Polish and Russian perfective verbs in the past tense normally do not co-occur with adverbs such as always, often etc.). The situation expressed in 1.3 (In recent years, however, its position has changed greatly) contains several elements that make it contrast with the one named in 1.2. The effect of contrast is achieved by the adverb however and by the semantics of the finite verb changed. In addition, the state referred to in 1.3 is placed in a quite definite time period (in recent years). All these factors taken together provide a sufficient motivation for marking the referent of 1.3, in the given context, as an event-situation which is unique in relation to the generally true state mentioned in 1.2. Accordingly, the perfective aspect is used.</Paragraph>
      <Paragraph position="11"> The sample text shows that certain adverbials may, on their own, be sufficient as aspect indices (e.g. always), and that the appropriate aspect value may also be indicated by an interplay between adverbial phrases, semantic features of the main verb, and the context of the current predication (as in 1.3).</Paragraph>
      <Paragraph position="12"> An attempt to formalize some principles for aspect choice
A computer program for aspect choice in translation should take into account at least those types of aspect indices that have been observed in the sample text discussed above. The result will obviously not be a full set of aspect generating rules. Nevertheless, an attempt to design an automatic procedure generating aspect is of practical and theoretical interest: the translation quality may be improved, and an analysis of the advantages and the shortcomings of the procedure may provide a deeper insight into the nature of the aspect phenomenon.</Paragraph>
      <Paragraph position="13"> The program presented here is implemented in LPA MacProlog and functions as an intermediate (transfer) stage in the translation process: it intervenes between the parsing of the Swedish or English text and the generation of its Russian or Polish equivalents (similar to the procedure for definiteness choice outlined in Gawrońska 1990). For different language pairs, slightly different variants of the transfer program are used, but all modules are based on the same main principle.</Paragraph>
      <Paragraph position="14"> The programs used for parsing and generation are written in a modified version of Referent Grammar (Sigurd 1987), called Predicate Driven Referent Grammar (PDRG). The formalism, implemented in DCG, is an eclectic one: it is reminiscent of GPSG (no transformations, use of LP-rules in parsing certain constituents, a GPSG-inspired treatment of relative clauses), LFG (the use of c-representations and f-representations) and HPSG (the head of the phrase, especially the finite verb, plays the central role in the selection of the other phrasal elements). It is precisely this treatment of the finite verb (or a verbless predicative) as the central element of a sentence that the name of the formalism alludes to. A PDRG rule may be written as follows:</Paragraph>
      <Paragraph position="16"> The rule above is slightly simplified: it contains no agreement conditions and only one optional adverbial phrase. In the actual program, the number of adverbials may vary, and the subject-verb agreement is controlled.</Paragraph>
      <Paragraph position="17"> As the result of parsing, three kinds of representations are delivered: 1) a categorial representation (c_rep), which is the most language-specific one. It contains information about the following facts:
a. the surface word order
b. the syntactic category of the complements of the verb
c. the case value of the NPs, if present
d. the form and the case demand of valency-bound prepositions, if any (this kind of information is represented by the variables Mark1 and</Paragraph>
      <Paragraph position="19"> 2) a functional representation (f_rep), including such traditional functional roles as subject, object, predicate and adverbial
3) a semantic representation (s_rep), containing semantic roles like actor, patient, experiencer, stimulus, etc.
The rule above is a very general one: both the functional and the semantic roles (F_role1/2, S_role1/2) and the information about their surface realizations (Cat(egory)1/2) are unspecified; in the parsing/generation process they are instantiated by utilizing the information stored in the lexical entry for the verb (the entity with the functor &quot;rlex&quot;), which may have the following shape:</Paragraph>
      <Paragraph position="21"> The aspect category is represented both in the lexical entry and in the verbal slot of the categorial representation. The Russian/Polish aspect is thus treated as a language-specific category marked on the verb, as distinguished from the more abstract category of uniqueness, which, according to our approach, is a universal conceptual notion, expressed in different ways by different language systems.</Paragraph>
      <Paragraph position="22"> In the translation process, the f-representation and the s-representation are utilized. After parsing an English/Swedish sentence, the program tries to find out the &quot;uniqueness value&quot; of the event expressed by the current predication using three main kinds of rules:
1) rules checking uniqueness indices inside the functional and the semantic representation without looking at the context or using the knowledge representation stored in the data base
2) rules comparing the current predication with the information about the most typical predication containing the current verb (i.e. rules using a knowledge representation). The most typical predication is to be understood as a description of the most typical event-situation which may be expressed by means of the current verb and its complements. In the data base, such descriptions are stored as entities with the functor proto_event.</Paragraph>
      <Paragraph position="23"> 3) rules comparing the current predication with its context and inferring the probability of aspect change.</Paragraph>
      <Paragraph position="24"> The three kinds of rules apply in the order suggested above. If a rule of type 1) results in instantiating the uniqueness value of the event-referent as &quot;uni(que)&quot; or &quot;not uni(que)&quot;, the other rule types do not apply. This means that rules of type 1) have to discover the strongest &quot;not-uniqueness&quot; indices, like indefinite frequency or durativity adverbials, or other &quot;not-uniqueness&quot; indicating markers, like the English progressive tenses, &quot;aspectual&quot; verbs like begin, stop etc., or, in Swedish, constructions with coordinated verbs (such as satt och läste, lit. 'sat and read', i.e. 'was reading'), which are semantically similar to the English progressive tenses.</Paragraph>
      <Paragraph position="25"> This kind of rule may be exemplified by the following one, which may be used for finding habituality markers like indefinite frequency adverbials, adverbials expressing durativity, or the verb brukade ('used to') in the Swedish input:
uniqueness_ind(past,sem_rep(Slist),not_uni) :-
    in_list(Functor(Repr,Feature),Slist),
    uniqueness_relevant(Functor),
    not_uni(Tense,Functor,Feature).</Paragraph>
      <Paragraph position="26"> &quot;Slist&quot; is the semantic representation (formulated as a Prolog list). The predicate &quot;in_list&quot; checks whether an element is a member of the list Slist. The functor of a list member (Functor) may be defined (in the data base) as potentially relevant for the uniqueness value (uniqueness_relevant). For example, functors like &quot;frequency&quot;, &quot;durativity&quot; or &quot;action_kind&quot; are treated as uniqueness-relevant. Thus, if the semantic representation Slist contains an element like &quot;action_kind(m(use,past),habituality)&quot;, i.e. the representation of the verb brukade, or &quot;frequency(often,indef)&quot;, i.e. the representation of the adverb ofta ('often'), the program must check whether the combination of the functor, the feature specified inside the brackets (like &quot;indef&quot; or &quot;habituality&quot;) and the tense value (here: past) results in a specific uniqueness value. As the data base contains the following information: not_uni(past,frequency,indef).</Paragraph>
      <Paragraph position="27"> not_uni(_,_,habituality).</Paragraph>
      <Paragraph position="28"> the program will decide that a sentence in the past tense containing an adverb like ofta or a finite verb like brukade does not refer to a unique event-situation. As a consequence, the imperfective aspect will be preferred when generating the target equivalent.</Paragraph>
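The lookup just described can be paraphrased outside Prolog. The following is a minimal Python sketch, not the program's actual code: the tuple encoding of list elements, the use of None for Prolog's anonymous variable, and the helper names matches and preferred_aspect are assumptions made for illustration. Only the two not_uni facts and the three uniqueness-relevant functors come from the text, and mapping "uni" to the perfective aspect follows the paper's general hypothesis rather than a rule quoted here.

```python
# Hypothetical Python paraphrase of the type 1) rule; names mirror the Prolog text.

# Facts corresponding to not_uni(past,frequency,indef) and not_uni(_,_,habituality).
# None plays the role of Prolog's anonymous variable "_".
NOT_UNI_FACTS = [
    ("past", "frequency", "indef"),
    (None, None, "habituality"),
]

# Functors the text names as uniqueness-relevant.
UNIQUENESS_RELEVANT = {"frequency", "durativity", "action_kind"}


def matches(pattern, value):
    """A None slot matches anything, like "_" in a Prolog fact."""
    return pattern is None or pattern == value


def uniqueness_ind(tense, slist):
    """Return "not_uni" if some element of the semantic representation is a
    known non-uniqueness index for this tense, else None (undecided)."""
    for term in slist:
        if len(term) != 3:
            continue  # assumed encoding: only (functor, repr, feature) triples qualify
        functor, _repr, feature = term
        if functor not in UNIQUENESS_RELEVANT:
            continue
        for fact in NOT_UNI_FACTS:
            if all(matches(p, v) for p, v in zip(fact, (tense, functor, feature))):
                return "not_uni"
    return None


def preferred_aspect(uniqueness):
    """Map a uniqueness value to the aspect preferred in the target language."""
    return {"uni": "perfective", "not_uni": "imperfective"}.get(uniqueness)


# Sw. "pojken slog ofta hunden": past tense plus the adverb ofta ('often').
slist = [("event_nucl", ("hit", "past")), ("frequency", "often", "indef")]
print(preferred_aspect(uniqueness_ind("past", slist)))
```

A return value of None means that no type 1) index was found, so the decision would be left to the later rule types.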
      <Paragraph position="29"> The next step, given the semantic representation and the uniqueness value, is to create a new functional representation, if needed, and then the appropriate c-representation. Sometimes the input and the output have the same f-representation and differ only in some details of their c-representations, as in simple transitive sentences:
Sw. pojken slog ofta hunden
the-boy hit often the-dog
Pol. chłopiec często bił psa
boy-nom often hit-imp dog-acc/gen
R. mal'čik často bil sobaku
boy-nom often hit-imp dog-acc
f_rep([subj(m(boy,sg)),pred(m(hit,past)),advl(often,indef)])
But in cases such as the Swedish construction with brukade, the functional representation has to be changed, as the most natural way of expressing the feature &quot;habituality&quot; in the Russian or Polish equivalent is to use the imperfective aspect and (optionally, if the habituality should be emphasized) an adverb like usually. Such changes are not especially difficult to implement if the semantic representation is used as a kind of interlingua. In the s-representation, the infinitive following the habituality-marking verb brukade is treated as the semantic kernel of the event-situation. The program must therefore find the target equivalent of the semantic kernel, make it the main predicate, provide the target representation with the right aspect value and then, optionally, insert an adverbial as an extra habituality marker. These operations result in translations like:
Sw. Han brukade komma för sent
he used come too late
Pol. Zwykle się spóźniał
usually refl-he-was-late-imp
R. On obyčno opazdyval
he usually was-late-imp
Rules belonging to types 2) and 3) take care of cases lacking such obvious uniqueness indices as in the example above. Type 2) has access to the proto_events, i.e. representations of typical predications containing a certain verb.
A proto_event may have the following structure:
proto_event(become_engaged, [actors([specific,limited_ref(2)]), durativity(limited), frequency(low,def), uniqueness(high)]).</Paragraph>
      <Paragraph position="30"> A type 2) rule applying to a predication containing the predicate meaning 'become engaged' checks whether the &quot;actors&quot; involved are two specific individuals and whether there is no violation of the other conditions specified in the description. If the current predication matches most of the elements specified in the frame &quot;proto_event&quot;, the uniqueness value of the &quot;proto_event&quot; (here: uniqueness(high), which means: unique with a high degree of probability) will be ascribed to the current event-referent. This means that, when translating a Swedish sentence like Per och Lisa förlovade sig ('Per and Lisa became engaged'), the perfective aspect would be chosen, whereas the same Swedish verb used in a sentence like Förr i tiden förlovade folk sig på föräldrarnas order ('in former times, people got engaged by order of their parents') would be rendered by the Russian/Polish imperfective verb.</Paragraph>
      <Paragraph position="31"> The following is an example of a type 2) rule:
uniqueness_ind(past,sem_rep(Slist),not_uni) :-
    in_list(event_nucl(m(EventNucl,_)),Slist),
    proto_event(EventNucl,Condlist),
    in_list(uniqueness(high),Condlist),
    not(cond_matching(Slist,Condlist)).</Paragraph>
      <Paragraph position="32"> The rule states that if the proto_event containing the semantic kernel of the current predication (EventNucl(eus)) is specified as unique with a high degree of probability and if the relevant elements of the semantic representation of the current sentence do not match the conditions stored in the proto_event, then the uniqueness value of the event-situation referred to is &quot;not unique&quot;. Writing specific rules matching semantic representations with proto_events is obviously not a trivial task: there are not many event-situations which are as easily described as the case of being engaged.</Paragraph>
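The matching step can be sketched in Python under stated assumptions. The text says only that the current predication must match "most of the elements" of the frame, so the majority threshold below is a guess, as is the choice to count a feature that is absent from the semantic representation as not violating a condition. The dictionary encoding and the names PROTO_EVENTS, cond_matching and uniqueness_ind_type2 are illustrative and are not taken from the program. Returning "uni" on a successful match follows the prose description, since the Prolog rule quoted above covers only the "not unique" outcome.

```python
# Hypothetical encoding of the proto_event frame for 'become engaged'.
PROTO_EVENTS = {
    "become_engaged": {
        "actors": ("specific", 2),      # two specific individuals
        "durativity": "limited",
        "frequency": ("low", "def"),
        "uniqueness": "high",
    },
}


def cond_matching(sem, proto):
    """Assumed criterion: more than half of the frame's conditions are not
    contradicted. A feature missing from sem does not count as a violation."""
    conds = {k: v for k, v in proto.items() if k != "uniqueness"}
    not_contradicted = sum(
        1 for key, expected in conds.items()
        if key not in sem or sem[key] == expected
    )
    return not_contradicted * 2 > len(conds)


def uniqueness_ind_type2(event_nucl, sem):
    """Return "uni", "not_uni", or None when no highly unique frame applies."""
    proto = PROTO_EVENTS.get(event_nucl)
    if proto is None or proto.get("uniqueness") != "high":
        return None
    return "uni" if cond_matching(sem, proto) else "not_uni"


# Sw. "Per och Lisa förlovade sig": two specific actors, nothing contradicted.
print(uniqueness_ind_type2("become_engaged", {"actors": ("specific", 2)}))

# Sw. "Förr i tiden förlovade folk sig ...": generic actors, indefinite frequency.
generic = {"actors": ("generic", "unlimited"), "frequency": ("unspecified", "indef")}
print(uniqueness_ind_type2("become_engaged", generic))
```

Under these assumptions the first call yields "uni" (perfective preferred) and the second "not_uni" (imperfective preferred), in line with the two Swedish examples discussed above.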
      <Paragraph position="33"> Type 3) rules are the most complicated ones, as the task performed is to compare the current predication both with the proto_event and with the previously stored semantic representations (including their uniqueness values) in order to discover possible motivation for aspect change. For the time being, only a restricted number of cues have been implemented. The program utilizes principles like:
--It is quite probable that parts of a unique event may also be unique, if no counter-indices (e.g. indefinite durativity markers) have been found.</Paragraph>
      <Paragraph position="34"> --A predication which describes the manner of performing an already introduced event should probably be treated as imperfective (it expresses a property of an event-referent, in a way similar to a predicative NP: it does not introduce a new referent, but ascribes a property to an already introduced one).</Paragraph>
      <Paragraph position="35"> --Adverbials marking a kind of opposition (however etc.) and their interplay with other adverbials may be important cues for aspect change.</Paragraph>
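The ordering of the three rule kinds, where the first rule that reaches a verdict blocks the later ones, can be illustrated with a small Python sketch. The three rule functions are toy stand-ins invented for this example: type1 reacts only to the adverb always, type2 is an empty placeholder for the proto_event comparison, and type3 reduces the interplay of contextual cues to a single opposition adverb (however) following a non-unique predication. The paper's actual type 3) rules weigh several cues together, so this is a deliberate simplification and not the implemented procedure.

```python
ASPECT = {"uni": "perfective", "not_uni": "imperfective"}


def first_decided(rules, predication, context):
    """Apply the rule functions in order and return the first verdict
    that is not None; later rules are then not consulted."""
    for rule in rules:
        verdict = rule(predication, context)
        if verdict is not None:
            return verdict
    return None


def type1(pred, context):
    # Strong sentence-internal index: an indefinite frequency adverbial.
    return "not_uni" if "always" in pred["adverbials"] else None


def type2(pred, context):
    # Placeholder for the comparison with a proto_event (always undecided here).
    return None


def type3(pred, context):
    # Toy contextual cue: an opposition adverbial after a non-unique predication.
    if "however" in pred["adverbials"] and context and context[-1] == "not_uni":
        return "uni"
    return None


def choose_aspects(predications):
    """Process predications in text order, storing each verdict as context."""
    context, aspects = [], []
    for pred in predications:
        verdict = first_decided([type1, type2, type3], pred, context)
        context.append(verdict)
        aspects.append(ASPECT.get(verdict))
    return aspects


# Sentences 1.2 and 1.3 of the sample text, reduced to their adverbials.
sample = [
    {"adverbials": ["accordingly", "always"]},
    {"adverbials": ["in recent years", "however"]},
]
print(choose_aspects(sample))  # ['imperfective', 'perfective']
```

With these toy rules the sketch reproduces the aspect values chosen by the human translator for sentences 1.2 and 1.3; a predication for which no rule decides is left with the value None.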
      <Paragraph position="36"> Conclusions
The main problems when implementing a procedure for aspect generation are to formulate concise and coherent descriptions of typical events, to design an appropriate hierarchy of rules comparing the current predication with the proto_events, and to describe conditions for aspect change. This is a field for further research. Another area for future investigation is finding cues for aspect choice in constructions containing infinitives where the infinitive is not preceded by an aspectual verb like the verbs meaning start or finish. Nevertheless, some uniqueness indices can be formalized and implemented in an MT-system (obviously, a system accepting lexical and syntactic restrictions). Our approach is a kind of compromise between different points of view represented in current research on aspect: the overt aspect is treated as language-specific, but the conceptual distinction behind the aspect choice is assumed to be based on the universal notion of uniqueness; furthermore, both sentence-internal and contextual factors are taken into consideration. The compromise seems to be quite useful.</Paragraph>
    </Section>
  </Section>
</Paper>