<?xml version="1.0" standalone="yes"?>
<Paper uid="J85-1002">
  <Title>TAUM-AVIATION: Its Technical Features and Some Experimental Results</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1.2 LANGUAGES TRANSLATED
</SectionTitle>
    <Paragraph position="0"> Copyright 1985 by the Association for Computational Linguistics. Permission to copy without fee all or part of this material is granted provided that the copies are not made for direct commercial advantage and the CL reference and this copyright notice are included on the first page. To copy otherwise, or to republish, requires a fee and/or specific permission.</Paragraph>
    <Paragraph position="1"> TAUM-AVIATION is designed in such a way that a core portion of the system is independent of particular language pairs: linguistic descriptions constitute data for the system. However, from a linguistic perspective, the project was exclusively focused on English-to-French translation.</Paragraph>
    <Paragraph position="2"> In addition, the linguistic descriptions incorporated into the system are addressed not to general language but to the particular sublanguage of maintenance manuals (Lehrberger 1982). The notion of sublanguage is presented in Kittredge and Lehrberger (1982).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.3 PROJECT SIZE
</SectionTitle>
      <Paragraph position="0"> The initial staff of seven researchers in 1976 was rapidly increased to a peak of 20 people during 1979, and then slowly decreased until the project was terminated.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.4 SYSTEM SIZE
</SectionTitle>
      <Paragraph position="0"> TAUM-AVIATION was implemented on a CYBER 173 computer, with the NOS/BE 1.4 operating system, but was designed so as to be practically machine-independent. Most components of the system are based on the following scheme: certain linguistic data (dictionaries, grammars) are compiled into an object code interpreted at run time against the input text. Table 1 gives an idea of the size of the runtime code, together with typical memory requirements for execution. Table 2 gives the size of the programs used to compile the linguistic data.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1.5 SIZE OF DICTIONARIES
</SectionTitle>
    <Paragraph position="0"> The dictionaries list only the base form of the words (roughly speaking, the entry form in a conventional dictionary). In March 1981, the source language (English) dictionary included 4054 entries; these entries represented the core vocabulary of maintenance manuals, plus a portion of the specialized vocabulary of hydraulics. Of these, 3280 had a corresponding entry in the bilingual English-French dictionary.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 APPLICATION ENVIRONMENT
</SectionTitle>
    <Paragraph position="0"> TAUM-AVIATION remains an experimental system. It is designed to take as input a text that is in a photocomposition-ready format; a pre-processing program stores the formatting codes, which will be reinserted in the translated text. No use is made of manual pre-editing.</Paragraph>
    <Paragraph position="1"> The translation process is fully automatic. If desired, it can be interrupted after dictionary lookup to obtain a list of unidentified words, and enter any such words in the dictionary.</Paragraph>
    <Paragraph position="2"> Revision of the machine output is normally necessary: the domain is too complex for results comparable to those of TAUM-METEO. The designers of the system decided not to rely heavily on &quot;fail-soft&quot; strategies such as constraint relaxation or partial parses; these strategies make the quality of the output totally unpredictable.</Paragraph>
    <Paragraph position="3"> Thus, the material passing through the system is translated relatively well (very well by MT standards), and the revisor is less likely to feel overwhelmed by linguistic garbage. The price to be paid is a failure to produce any output for a relatively high proportion of the input sentences (somewhere between 20 and 40 per cent, at the stage of development reached in 1981). For a sample of translations produced by TAUM-AVIATION, see the Appendix.</Paragraph>
    <Paragraph position="4"> The development of TAUM-AVIATION has not been taken far enough for a definitive assessment to be made of the linguistic and computational strategies that it embodied: the total system throughput was approximately 100,000 words.</Paragraph>
    <Paragraph position="5"> Computational Linguistics, Volume 11, Number 1, January-March 1985. Pierre Isabelle and Laurent Bourbeau, TAUM-AVIATION: Its Technical Features and Some Experimental Results.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 GENERAL TRANSLATION APPROACH
</SectionTitle>
    <Paragraph position="0"> The TAUM-AVIATION system is based on a typical second generation design (Isabelle et al. 1978, Bourbeau 1981). The translation is produced indirectly, by means of an analysis/transfer/synthesis scheme. The internal organization of the major components of the system is based on the notion of linguistic level. Finally, the linguistic data are generally separated from the algorithmic specifications.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 TRANSFER MODEL
</SectionTitle>
      <Paragraph position="0"> The overall design of the system is based on the assumption that translation rules should not be applied directly to the input string, but rather to a formal object that represents a structural description of the content of this input. Thus, the source language (SL) text (or successive fragments of it) is mapped onto the representations of an intermediate language (also called normalized structure) prior to the application of any target language-dependent rule.</Paragraph>
      <Paragraph position="1"> No one knows how to construct a universal, language-independent semantic interlingua. The intermediate language used in the TAUM-AVIATION system is largely language dependent: it consists of semantically annotated deep structures for SL and TL sentences. A certain degree of language independence is attained by the use of a common &quot;base component&quot; (a context-free grammar that enumerates the admissible deep structures) for both SL and TL. But the lexical items are left intact, and a transfer module is used in order to map the lexical items of SL onto those of TL.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 LINGUISTIC ORGANIZATION
</SectionTitle>
      <Paragraph position="0"> The arrangement of the system into three major modules (analysis, transfer, synthesis) reflects a theoretical model of translation operations: it is claimed that these operations take place at a &quot;deep&quot; level, between language-dependent meaning representations. Moreover, each one of the three modules is arranged internally along the lines of a linguistic theory: the components of these modules correspond to the standard levels of linguistic description (lexicon, morphology, syntax, semantics). This contrasts with older systems, the structure of which frequently had no direct relationship to any definite theory of language and translation.</Paragraph>
      <Paragraph position="1"> Figure 1 shows the internal structure of the TAUM-AVIATION system.</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 PROCESSING UNITS
</SectionTitle>
      <Paragraph position="0"> It is well known that some translation problems can only be solved through textual, as opposed to sentential, processing. However, we still know too little about discourse analysis techniques to use them effectively in large-scale systems. Thus, the processing unit in TAUM-AVIATION is the sentence.</Paragraph>
      <Paragraph position="1"> Fortunately, anaphoric pronouns are quite rare in technical manuals. A more frequent problem is the use of anaphoric definite noun phrases. Consider for example the text fragment in (1): (1) Remove hydraulic filter bypass valve. This valve is located below accumulator No. 1.</Paragraph>
      <Paragraph position="2"> A word like valve cannot be translated correctly in isolation. Depending on the type of valve, French will use clapet, robinet, soupape, etc. In the second sentence of (1), the word valve is used anaphorically. To translate it correctly, one has to refer to its antecedent: the modifier bypass determines a specific French equivalent.</Paragraph>
      <Paragraph position="3"> TAUM-AVIATION cannot solve problems of this type.</Paragraph>
      <Paragraph position="4"> The designers of the system preferred to concentrate their efforts on the best possible sentential analysis. And in fact, in spite of a relatively advanced sentence analyzer, translation failures due to weaknesses in sentential processing (e.g. scoping problems for conjunctions, nominal compounds, etc.) turned out to be much more frequent than failures due to anaphor problems, as evidenced by the error compilations of Lehrberger (1981).</Paragraph>
    </Section>
  </Section>
  <Section position="8" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4.2 THE ANALYSIS MODULE: AMBIGUITY PROBLEMS
</SectionTitle>
    <Paragraph position="0"> Ambiguity is a language-internal phenomenon and it is the responsibility of the analysis module to resolve it.</Paragraph>
    <Paragraph position="1"> Sometimes, it is possible to ignore certain ambiguities, in the hope that the same ambiguities will carry over in translation. This is particularly true in systems like TAUM-AVIATION that deal with only one pair of closely related languages. The difficult problem of prepositional phrase attachment, for example, is frequently bypassed in this way. Generally speaking, however, analysis is aimed at producing an unambiguous intermediate representation. The analysis module comprises four components: pre-processing, morphology, lexicon and syntactic/semantic analysis (see Figure 1). The pre-processing component segments the input text into successive words and into processing units. In this latter function, it can be seen as a degenerate text grammar. Because this is carried out deterministically, without interaction from the other components, segmentation problems occasionally arise.</Paragraph>
    <Paragraph position="2"> Morphological analysis includes complete rules and exception lists for English inflectional morphology, category assignment rules for numbers and rules for dealing with unknown words. No rules are provided for derivational morphology. The system handles some types of compositional morphology, but this is done in the syntactic component, since compounds frequently exhibit properties that are otherwise thought of as syntactic; for example, internal conjunction is possible (e.g. four- and six-cell batteries).</Paragraph>
    <Paragraph position="3"> Syntactic and semantic analysis are very tightly integrated in the TAUM-AVIATION system. First, both of them are implemented using the same metalanguage, a particular version of Woods's ATNs (see section 5, below). Second, both components interact freely during analysis. It is nevertheless convenient to describe them separately.</Paragraph>
    <Paragraph position="4">  The TAUM-AVIATION system includes a large-scale grammar of English capable of handling most constructions that occur with some frequency in the sublanguage of maintenance manuals (Lehrberger 1982).</Paragraph>
    <Paragraph position="5"> The rules are based on an extensive lexical subcategorization scheme: 12 standard categories are further subclassified using more than 75 features (excluding morpho-syntactic features). This is in addition to the use of lexical &quot;strict subcategorization&quot; frames comparable to those of transformational grammar.</Paragraph>
    <Paragraph position="6"> Since the intermediate representation used for transfer is a type of semantically annotated &quot;deep structure&quot;, and since maintenance manuals make use of a very complex syntax, it was necessary to provide the parser with a rich transformational component. Thus, the inverses of several transformations from standard transformational theory are used: passive, extraposition, raising, etc.</Paragraph>
    <Paragraph position="7"> In dealing with texts as complex as technical manuals, the parser is faced with difficult ambiguity problems.</Paragraph>
    <Paragraph position="8"> Ambiguities are already present in the input to the parser, at the lexical level. These ambiguities may concern the syntactic properties of the lexical element (e.g. light is a noun, a verb, or an adjective); or they may concern primarily its semantic properties: pure homographs like the two nouns lead or polysemous items like the noun line.</Paragraph>
    <Paragraph position="9"> The parser will as a side effect eliminate some lexical ambiguities; for example, if Check valve is to be taken as a sentence, syntax tells us that check must be a verb.</Paragraph>
    <Paragraph position="10"> However, the parser will itself introduce structural ambiguities, owing to the existence of syntactically undetermined choice points in the application of grammar rules. Two examples of structural ambiguity are adjective scope as in (3), and conjunction scope, as in (4).</Paragraph>
    <Paragraph position="11"> [Examples (3a), (3a') and (4a), (4a'): bracketed groupings not preserved in the source.] These examples show that with ADJ NOUN NOUN sequences and NOUN CONJ NOUN NOUN sequences, two different syntactic groupings are possible. But only one of them is semantically acceptable and results in a correct translation.</Paragraph>
    <Paragraph position="12"> Moreover, some lexical ambiguities, instead of being eliminated in the parsing process, will constitute a further source of structural ambiguity, each reading of the relevant lexical item being compatible with a different syntactic structure. In example (5), drain can be taken either as a noun or as a verb, when appropriate adjustments are made to the surrounding syntactic structure. (5) Remove dust cap and drain plug.</Paragraph>
    <Paragraph position="13"> Thus by itself, a syntactic parser produces a highly ambiguous output, and further constraints are needed in a practical MT system.</Paragraph>
    <Paragraph position="14">  Semantic processing in the TAUM-AVIATION system performs two related tasks: a) it filters the syntactic structures, eliminating as many ambiguities as possible; and b) it associates with each node of the tree a set of semantic features which will be used by transfer rules. Most semantic features originate in the dictionary, where lexical items are described in terms of some 35 features that form a tangled hierarchy. Predicative lexical items (verbs, adjectives, certain prepositions) are assigned selectional restrictions on their possible arguments in terms of these semantic features.</Paragraph>
    <Paragraph position="15"> Selectional restrictions constitute the main semantic mechanism used by the system to eliminate ambiguities of two types: a. structural ambiguities introduced by syntactic rules; thus the spurious structure proposed by the parser for (5) is eliminated because the verb drain does not accept as direct object something in the semantic category of plug, b. lexical ambiguity in the semantic properties of certain lexical items; polysemous words like the noun line (which can denote either an abstract geometrical object, or physical objects such as conductors) are frequently disambiguated by selectional restrictions; for example, in Flush the line, the concrete sense is selected.</Paragraph>
    <Paragraph position="16"> In order for selectional restrictions to work properly and for trees to be correctly annotated, it is necessary to apply semantic projection rules which assign sets of features to tree nodes. In TAUM-AVIATION, the semantic rules work in a compositional fashion, raising selectively certain features from daughter nodes to their mothers (Isabelle 1985). Rules such as the following are used: * all of the semantic features of a headnoun are raised onto the dominating NP node; * the intersection of the features of two conjoined NP nodes is raised onto the dominating NP node; and * when the headnoun is a partitive noun (e.g. portion), and the NP has an of NP complement, the features of this complement are raised onto the dominating NP node.</Paragraph>
    <Paragraph position="17"> The system also makes use of standard control rules for subjectless infinitives and gerundives, and of some pronoun/antecedent rules, in order to enforce semantic constraints wherever possible.</Paragraph>
    <Paragraph position="18"> Semantic ambiguity, whether real homography (e.g. the two nouns lead) or polysemy (e.g. the various senses of the noun line), is not handled by creating multiple entries in the source language dictionary. Rather, in its single entry, the word is assigned a number of semantically incompatible features. The semantic rules seek to filter out some of these features, so that no incompatibility remains. This strategy prevents the redundant syntactic search that results from a multiple-entry strategy.</Paragraph>
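    <Paragraph position="20"> The projection rules and the selectional-restriction filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names, the lexicon entries, the restriction table, and the names NP, project_features and filter_object are hypothetical and do not reproduce the actual REZO rules or the system's 35-feature hierarchy.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical single-entry lexicon: a polysemous noun such as "line" carries
# all of its mutually incompatible semantic features in one entry.
NOUN_FEATURES = {
    "line": {"ABSTRACT", "CONDUIT"},
    "plug": {"SOLID_PART"},
    "cap": {"SOLID_PART"},
    "tank": {"CONTAINER"},
    "portion": set(),  # partitive noun: contributes no features of its own
}
PARTITIVE_NOUNS = {"portion"}

# Hypothetical selectional restrictions: features a verb accepts on its object.
VERB_OBJECT_RESTRICTIONS = {
    "flush": {"CONDUIT", "CONTAINER"},
    "drain": {"CONDUIT", "CONTAINER"},
    "remove": {"SOLID_PART"},
}


@dataclass
class NP:
    head: str = ""
    conjuncts: list = field(default_factory=list)  # conjoined NPs, if any
    of_complement: Optional["NP"] = None           # "of NP" complement, if any


def project_features(np):
    """Raise features from daughter nodes onto the dominating NP node."""
    if np.conjuncts:
        # Conjoined NPs: raise the intersection of the conjuncts' features.
        return set.intersection(*[project_features(c) for c in np.conjuncts])
    if np.head in PARTITIVE_NOUNS and np.of_complement is not None:
        # Partitive head with an "of NP" complement: raise the complement's features.
        return project_features(np.of_complement)
    # Default case: raise all the features of the head noun.
    return set(NOUN_FEATURES.get(np.head, set()))


def filter_object(verb, np):
    """Keep the object features the verb accepts; an empty set rejects the reading."""
    return project_features(np).intersection(VERB_OBJECT_RESTRICTIONS[verb])
```

With these toy tables, filter_object("flush", NP(head="line")) keeps only the concrete feature, as in Flush the line, while filter_object("drain", NP(head="plug")) returns an empty set, which corresponds to rejecting the spurious verbal reading of drain in (5).</Paragraph>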
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 THE TRANSFER MODULE
</SectionTitle>
      <Paragraph position="0"> In principle, transfer rules state correspondences between two sets of unambiguous structural descriptions. Their most obvious task is to relate the lexical items of SL to those of TL. Even if the rules are applied to unambiguous lexical elements, the correspondences are by no means one-to-one: the lexical system of each natural language reflects a specific way of breaking down the conceptual universe. For this reason, equivalences have to be stated in terms of structural patterns rather than in terms of words or strings of words.</Paragraph>
      <Paragraph position="1"> To take an example, there is no language-internal evidence that hard is ambiguous in English; however, depending on the context, it is translated into French as difficile, dur, etc. The French equivalents have more restricted collocations. In all those cases, transfer rules are needed to select the contextually appropriate equivalent.</Paragraph>
      <Paragraph position="2"> Moreover, very frequently, these lexical transfer rules cannot simply substitute lexical items, leaving the tree structure unaffected. Since SL and TL lexical items frequently have different contextual requirements (i.e. subcategorization frames), translation rules have to establish correspondences between a source and a target structural pattern, as illustrated by the examples in (6).
 (6) a. check x against y → comparer x à y
     b. supply x with y → fournir y à x
     c. cantilever x → monter x en porte-à-faux
     d. bond x electrostatically → métalliser x
     e. service x → faire l'entretien de x
 It is clear that lexical transfer rules must include powerful transformational mechanisms. This basic fact has not so far received the attention it deserves in the MT community. The TAUM-AVIATION system provides for full transformational power at the level of lexical transfer (Chevalier et al. 1981).</Paragraph>
      <Paragraph position="3"> The transfer component also involves rules for structural transfer, that is, rules that deal with linguistic contrasts not tied to any specific lexical item. Since the same base rules are used for SL and TL, this sub-component is kept to a minimum. Nevertheless, a number of structural differences between SL and TL have to be accounted for by means of contrastive rules. For example, because the intermediate language does not provide for &quot;universal semantic tenses&quot;, the tense systems of SL and TL have to be explicitly contrasted by a set of rules.</Paragraph>
      <Paragraph position="4"> Another task left to structural transfer is to deal with observable contrasts concerning the use of optional movement transformations. In all likelihood, the use of these transformations is governed by discourse phenomena that the system does not attempt to analyze. The strategy used in TAUM-AVIATION is to take advantage of the frequent parallelisms between SL and TL regarding these aspects of surface structure organization. Thus, the intermediate representation retains &quot;traces&quot; from SL surface structure used by the synthesis component to maintain a certain parallelism with SL. However, in some cases we know that the two languages exhibit systematic differences in their use of certain movement transformations. The structural transfer grammar describes these facts. For instance, TAUM-AVIATION includes complex rules for translating English passives with various French constructions, as illustrated in the following examples:</Paragraph>
      <Paragraph position="5"> (7) Quick-disconnect fittings should not be removed. → Ne pas enlever les raccords à démontage rapide.</Paragraph>
      <Paragraph position="6"> (8) Ensure that pump and lines are bled. → S'assurer qu'on a purgé la pompe et les canalisations.</Paragraph>
      <Paragraph position="7"> (9) The flaps are operated by hydraulic system no. 1. → Le circuit hydraulique no. 1 actionne les volets.</Paragraph>
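      <Paragraph position="8"> The lexical transfer correspondences listed in (6) pair a source pattern with a target pattern instead of substituting one word for another. A minimal Python sketch of that idea follows. It assumes a flat predicate-argument representation in place of LEXTRA tree patterns, and the rule table layout and the function name transfer are hypothetical; only three of the five correspondences are encoded.

```python
# Each rule maps (source verb, source argument markers) to
# (target expression, target slots). A slot is (source argument index,
# target marker); a marker of None stands for a direct object.
TRANSFER_RULES = {
    ("check", (None, "against")): ("comparer", ((0, None), (1, "à"))),
    ("supply", (None, "with")): ("fournir", ((1, None), (0, "à"))),
    ("service", (None,)): ("faire l'entretien", ((0, "de"),)),
}


def transfer(verb, args):
    """args is a list of (marker, filler) pairs in source order."""
    markers = tuple(marker for marker, _ in args)
    rule = TRANSFER_RULES.get((verb, markers))
    if rule is None:
        return None  # no matching pattern: this sketch simply gives up
    target_verb, slots = rule
    parts = [target_verb]
    for source_index, target_marker in slots:
        filler = args[source_index][1]
        if target_marker is None:
            parts.append(filler)
        else:
            parts.append(f"{target_marker} {filler}")
    return " ".join(parts)
```

For instance, transfer("supply", [(None, "x"), ("with", "y")]) yields "fournir y à x": the arguments are reordered and the preposition changes, which a word-for-word substitution could not express. Inflection and elision are left to later components and are not modelled here.</Paragraph>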
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.4 THE SYNTHESIS MODULE
</SectionTitle>
      <Paragraph position="0"> Synthesis of the TL text involves three steps: syntactic synthesis, morphological synthesis, and post-processing.</Paragraph>
      <Paragraph position="1"> Syntactic synthesis is carried out on the basis of a large-scale transformational grammar of French. Since the input to the synthesis component is normally a well-formed unambiguous sentential deep structure, synthesis here is much simpler than analysis. This is not to say that synthesis of natural language texts is generally easy.</Paragraph>
      <Paragraph position="2"> Generating a coherent text from an abstract discourse representation is certainly a very difficult problem. But in TAUM-AVIATION, synthesis can only be achieved on a sentential basis. Therefore, no attempt can be made to describe the complex discourse factors that influence sentence generation (e.g. application of &quot;optional&quot; movement transformations). As mentioned in the previous section, the strategy adopted is to try to preserve a certain parallelism with the SL sentences, since both languages have relatively similar means of expressing discourse cohesion.</Paragraph>
      <Paragraph position="3"> Syntactic synthesis produces a string of lexical items annotated with all the information required to inflect them correctly. The morphological synthesis component then determines the final form of each word. This is done on the basis of an exhaustive description of the rules of French inflection (together with their exceptions). Post-processing reformats the TL text, making use, wherever possible, of the formatting codes of the SL text.</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 COMPUTATIONAL TECHNIQUES
</SectionTitle>
    <Paragraph position="0"> From the computational point of view, the TAUM-AVIATION system is more complex than TAUM-METEO, which is entirely written in the Q-SYSTEMS metalanguage (Colmerauer 1971). One of the ideas underlying TAUM-AVIATION is to make use of specialized tools for different tasks in the interests of increased efficiency, though somewhat at the expense of overall simplicity.</Paragraph>
    <Paragraph position="1"> In the implementation, the actual modules closely match the components of the linguistic model presented in Figure 1. They are applied sequentially and communication between components is achieved by means of a chart structure (a type of loop-free graph). The arcs of these charts are labelled with tree structures whose nodes are labelled with complex symbols: a categorial label plus a set of features.</Paragraph>
    <Paragraph position="2"> Most components are based on the following scheme.</Paragraph>
    <Paragraph position="3"> Certain linguistic data are described with a high-level metalanguage; in this metalanguage, the linguist expresses facts about tree structures. These descriptions are compiled into an abstract formal structure interpreted at run time against the material to be translated. Most of these compilers and interpreters are written in PASCAL.</Paragraph>
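    <Paragraph position="4"> The chart structure used for communication between components can be pictured as a loop-free graph whose arcs carry trees with complex-symbol nodes. The following Python sketch is one possible encoding; the class and field names (Node, Arc, Chart, paths) are assumptions made for illustration and say nothing about the original PASCAL data structures.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    category: str                      # categorial label, e.g. "N" or "V"
    features: frozenset = frozenset()  # feature set of the complex symbol
    children: tuple = ()               # daughter nodes of the tree


@dataclass(frozen=True)
class Arc:
    start: int  # vertex where the arc begins
    end: int    # vertex where the arc ends
    tree: Node  # tree structure labelling the arc


@dataclass
class Chart:
    arcs: list = field(default_factory=list)

    def add(self, start, end, tree):
        # Arcs must point forward, which keeps the graph loop-free.
        if not end > start:
            raise ValueError("an arc must end after the vertex where it starts")
        self.arcs.append(Arc(start, end, tree))

    def paths(self, start, goal):
        """Yield every sequence of arcs leading from start to goal."""
        if start == goal:
            yield []
            return
        for arc in self.arcs:
            if arc.start == start:
                for rest in self.paths(arc.end, goal):
                    yield [arc] + rest
```

A lexically ambiguous word is then simply represented by two arcs spanning the same pair of vertices, and each complete path through the chart is one candidate reading of the input.</Paragraph>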
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.1 PRE- AND POST-PROCESSING
</SectionTitle>
      <Paragraph position="0"> These relatively simple components, which map character strings onto sequences of chart structures and vice-versa, are implemented as sets of rules in a metalanguage called SISIF; a set of SISIF rules amounts to a deterministic finite-state automaton. These rules are compiled into list structures, which are interpreted against the input text at run time.</Paragraph>
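      <Paragraph position="1"> Since a set of SISIF rules amounts to a deterministic finite-state automaton, the pre-processing step can be illustrated with a small table-driven automaton in Python. The states, character classes, and actions below are assumptions chosen for a toy word segmenter; they do not reproduce actual SISIF rules, and the output is a plain token list where the real component builds chart structures.

```python
def char_class(ch):
    """Map a character to one of three hypothetical classes."""
    if ch.isalnum() or ch == "-":
        return "word"
    if ch in ".!?":
        return "stop"
    return "space"


# Deterministic table: (state, character class) gives (next state, action).
TRANSITIONS = {
    ("between", "word"): ("in_word", "start"),
    ("between", "stop"): ("between", "emit_char"),
    ("between", "space"): ("between", "skip"),
    ("in_word", "word"): ("in_word", "extend"),
    ("in_word", "stop"): ("between", "close_then_emit_char"),
    ("in_word", "space"): ("between", "close"),
}


def segment(text):
    """Split text into word tokens and punctuation tokens in a single pass."""
    state, current, tokens = "between", "", []
    for ch in text:
        state, action = TRANSITIONS[(state, char_class(ch))]
        if action in ("start", "extend"):
            current += ch
        elif action in ("close", "close_then_emit_char"):
            tokens.append(current)
            current = ""
            if action == "close_then_emit_char":
                tokens.append(ch)
        elif action == "emit_char":
            tokens.append(ch)
        # the "skip" action consumes the character without producing anything
    if current:
        tokens.append(current)
    return tokens
```

Because every decision is made from the current state and character alone, an abbreviation such as "No. 1." is split at its internal period. This is the kind of segmentation problem that a deterministic component working without help from the other components can be expected to run into.</Paragraph>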
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
5.2 INFLECTIONAL MORPHOLOGY
</SectionTitle>
      <Paragraph position="0"> Since it was possible to exhaustively describe the inflectional morphology of both French and English, there was no compelling reason to use a very high-level formalism.</Paragraph>
      <Paragraph position="1"> Consequently, in the interests of efficiency, two PASCAL programs were written for morphological analysis of English and morphological synthesis of French.</Paragraph>
    </Section>
  </Section>
  <Section position="10" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5.3 DICTIONARIES
5.3.1 SOURCE LANGUAGE DICTIONARY
</SectionTitle>
    <Paragraph position="0"> A dictionary system called SYDICAN enables the linguist to write lexical rules that associate a complex of lexical information with a string of base forms, forming a path in an input chart. Two types of rules are provided: a) rules that simply add a new path (labelled with the complex of lexical information) to the chart; and b) rules that, in addition, have the effect of taking precedence over shorter matches in the chart (cf. &quot;longest-match&quot; strategy).</Paragraph>
    <Paragraph position="1"> The rules are compiled into list structures. At run time, they are retrieved from an arbitrarily large lexical data base and applied to the chart. The lexical database system includes some maintenance facilities, such as integrity constraints on its contents, and facilities for retrieving entries through arbitrarily complex requests on their contents.</Paragraph>
    <Paragraph position="2">  We saw in 4.3 that lexical transfer involves rules that perform complex transformations on tree structures. The LEXTRA metalanguage makes it possible to associate with any lexical item an arbitrarily complex set of tree transformations. These transformations describe a pattern (anchored in the relevant lexical item), which is to be matched against the tree structure at run time.</Paragraph>
    <Paragraph position="3"> When a match is found, a series of associated actions specifying structural changes is performed.</Paragraph>
    <Paragraph position="4"> An important idea embodied in LEXTRA is that a transfer component should have an explicit description of the intermediate language. In the TAUM approach this intermediate language is partially defined by a set of context-free rules that describe a common base component for SL and TL. LEXTRA takes as data this context-free grammar and guarantees that any manipulated tree structure corresponds to a permissible derivation in terms of that context-free grammar. This notion is to be related to computational formulations of transformational grammar such as Petrick (1973), where the deep structures produced by the inverse transformations are checked against the rules of the base component. No equivalent check is performed with parsing systems like ATNs.</Paragraph>
    <Paragraph position="5"> LEXTRA rules are compiled into list structures. It was found that some of the constraints on admissible tree structures could be enforced at compile time (Gérin-Lajoie 1980). This mechanism is very useful in the complex task of dictionary development. It helps validate the work of the lexicographer. At run time, the LEXTRA interpreter searches the tree structure for SL lexical items, retrieves the associated lexical rules and applies them to the tree.</Paragraph>
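    <Paragraph position="6"> The guarantee that every manipulated tree corresponds to a permissible derivation of the base component can be illustrated with a small validator. The Python sketch below accepts a tree only if each local subtree matches a context-free rule. The grammar fragment, the category names, and the tuple encoding of trees are hypothetical and far simpler than the actual base component; the function name is_permissible is likewise an assumption.

```python
# Hypothetical base rules: each category maps to its admissible daughter sequences.
BASE_RULES = {
    "S": [("V", "NP"), ("V", "NP", "PP")],
    "NP": [("N",), ("DET", "N")],
    "PP": [("P", "NP")],
}
LEXICAL_CATEGORIES = {"V", "N", "DET", "P"}


def is_permissible(tree):
    """Check a tree encoded as (category, child, ...) or (category, word)."""
    category, children = tree[0], tree[1:]
    if category in LEXICAL_CATEGORIES:
        # A lexical node dominates exactly one word.
        return len(children) == 1 and isinstance(children[0], str)
    if any(isinstance(child, str) for child in children):
        return False  # a phrasal node may not dominate a bare word
    daughters = tuple(child[0] for child in children)
    if daughters not in BASE_RULES.get(category, []):
        return False  # this local subtree is not licensed by any base rule
    return all(is_permissible(child) for child in children)
```

A transfer rule that produced, say, an S node dominating an NP followed by a V would be caught by this check, since no base rule licenses that daughter sequence. In this sketch the check runs on finished trees; enforcing part of it at compile time, as the text reports for LEXTRA, would require analysing the rules themselves.</Paragraph>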
  </Section>
  <Section position="11" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5.4 SYNTAX AND SEMANTICS
5.4.1 ANALYSIS
</SectionTitle>
    <Paragraph position="0"> The English grammar for syntactic analysis is written in REZO (Stewart 1975, 1978), TAUM's version of augmented transition networks (ATNs). The REZO metalanguage differs from Woods's ATNs (Woods 1970) in several respects. Some of the differences are:
 * REZO does not support morphological analysis, which is performed in a separate component;
 * tree nodes are complex symbols that include sets of features on which boolean operations can be performed;
 * REZO includes a number of primitives to perform pattern matching over tree structures;
 * in addition to regular ATN states where all transitions are tried, REZO includes &quot;deterministic&quot; states where only the first transition whose test is met is followed;
 * REZO accords special status to the states to which a recursive call can be made, so that the resulting grammar is a collection of sub-networks.</Paragraph>
    <Paragraph position="1"> The REZO grammar is compiled into a set of instructions for a virtual machine, which is simulated by the runtime interpreter. Parsing is done in the usual top-down, depth-first, left-to-right, serial manner. The interpreter can either work in an all-paths or in a first-path mode. One important difference from Woods's ATN interpreter is that REZO takes as input a chart structure in which lexical ambiguities are encoded and applies the grammar in parallel to all the paths of this chart. The result is also a chart structure: REZO is thus a chart-to-chart transducer.</Paragraph>
    <Paragraph position="2"> In 4.2, it was mentioned that syntactic rules create structural ambiguities, and that semantic processing can eliminate some of these. Serial parsing provides another means of selecting a particular reading. Since the transitions of the REZO networks are followed in a fixed order, the grammar can be made to produce the most likely reading first. In TAUM-AVIATION's analysis grammar, the ordering of the transitions reflects: * general parsing principles such as those discussed in human performance studies (e.g. Kimball 1973); and * sublanguage-specific statistical tendencies.</Paragraph>
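    <Paragraph position="3"> The effect of ordered transitions on serial parsing can be sketched with a small top-down, depth-first, left-to-right parser in which the alternatives of each network are tried in a fixed order. In this Python illustration, generators stand in for backtracking. The toy grammar, the lexicon, and the function names are assumptions; the sketch works on a word list instead of a chart and models neither REZO's deterministic states nor its feature tests.

```python
# Hypothetical lexicon; "drain" is lexically ambiguous, as in example (5).
LEXICON = {
    "remove": ["V"], "dust": ["N"], "cap": ["N"], "and": ["CONJ"],
    "drain": ["N", "V"], "plug": ["N"],
}

# Ordered alternatives: earlier right-hand sides are tried first, so the
# reading the grammar writer considers most likely is produced first.
NETWORKS = {
    "S": [["VP"], ["VP", "CONJ", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["NOM", "CONJ", "NOM"], ["NOM"]],
    "NOM": [["N", "N"], ["N"]],
}


def parse(category, words, position):
    """Yield (tree, next position) pairs for one category."""
    if category not in NETWORKS:  # lexical category: consume one word
        if len(words) > position and category in LEXICON.get(words[position], []):
            yield (category, words[position]), position + 1
        return
    for rhs in NETWORKS[category]:
        yield from expand(category, rhs, words, position, [])


def expand(category, rhs, words, position, done):
    """Match the remaining right-hand side symbols from left to right."""
    if not rhs:
        yield (category, *done), position
        return
    for subtree, next_position in parse(rhs[0], words, position):
        yield from expand(category, rhs[1:], words, next_position, done + [subtree])


def readings(sentence):
    """Yield complete parses in the order fixed by the grammar."""
    words = sentence.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            yield tree
```

Calling next(readings("Remove dust cap and drain plug")) corresponds to a first-path mode and returns the reading in which drain plug is a nominal compound, while list(readings(...)) corresponds to an all-paths mode and also returns the reading with two conjoined verb phrases. Reordering the alternatives in NETWORKS changes which reading comes out first without changing the set of readings.</Paragraph>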
  </Section>
  <Section position="12" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5.4.2 STRUCTURAL TRANSFER AND SYNTHESIS
</SectionTitle>
    <Paragraph position="0"> Structural transfer and syntactic synthesis are implemented in the well-known Q-SYSTEMS, which we will not describe here. This introduces some heterogeneity into TAUM-AVIATION, since: a) unlike the other metalanguages, Q-SYSTEMS do not support trees with complex symbols as node labels; and b) the compiler and the interpreter are written in FORTRAN while PASCAL is used for the other metalanguages.</Paragraph>
    <Paragraph position="1"> In fact, the original design of the system included provisions for a new metalanguage well suited to synthesis; but time constraints precluded its development.</Paragraph>
  </Section>
</Paper>