<?xml version="1.0" standalone="yes"?> <Paper uid="W94-0332"> <Title>Representing Conceptual and Linguistic Knowledge for Multi-Lingual Generation in a Technical Domain</Title> <Section position="3" start_page="0" end_page="245" type="metho"> <SectionTitle> 2 The Document Analysis </SectionTitle> <Paragraph position="0"> We have chosen to analyse a 110-page manual, in English (\[3\]) and Swedish (\[8\]), for the truck gearbox R1000. The manual is aimed at expert servicemen and describes the design, function, and service instructions.</Paragraph> <Paragraph position="1"> The manual communicates several different kinds of domain information. We concentrate here on the following two: * Static information (i.e. what something is). Examples: (1) The R1000 is a gearbox. (2) The gearbox has nine forward gears. (3) The gearbox is mechanically operated. (1) R1000 är en växellåda. (2) Växellådan har nio växlar framåt. (3) Växellådan manövreras mekaniskt. * Processive information (i.e. what something does). Examples: (4) The purpose of the inhibitor valve is to prevent inadvertent shifting of the range gear when a gear in the basic box is in mesh. (5) The inhibitor cylinder prevents inadvertent shifting in the basic box when range shifts are being carried out.</Paragraph> <Paragraph position="2"> (4) Spärrventilen har till uppgift att förhindra växling av rangeväxeln när någon av växlarna i baslådan ligger i ingrepp. (5) Spärrcylindern förhindrar växling i baslådan när växling med rangen sker.</Paragraph> <Paragraph position="3"> The text can be broken down into approximately sentence-sized units, each communicating a piece of information considered true in the domain. We observe a tight correspondence between the kind of information and its textual realization. 
The carefully defined terminology determines not only the words but their combinations as well.</Paragraph> <Paragraph position="4"> The text structure follows from conventions of language use for efficient communication about the domain.</Paragraph> <Section position="1" start_page="245" end_page="245" type="sub_section"> <SectionTitle> 7th International Generation Workshop * Kennebunkport, Maine * June 21-24, 1994 </SectionTitle> <Paragraph position="0"> These findings are in line with the notion of domain communication knowledge (Kittredge \[7\]). Rösner and Stede (\[9\]) draw a similar distinction between the macro and micro structure of texts. The architecture of Genie is built around this division of sentence and text structure; the user incorporates the conventions in the specification, while Genie provides the terminological definitions.</Paragraph> <Paragraph position="1"> The English and Swedish versions of the manual align at sentence level. Genie can cope with semantically non-equivalent sentence pairs, but not with the very rare pairs that differ in content. Nevertheless, the documents correspond closely, in contrast to the difficulties Bateman (\[1\]) reports from a study of medical didactic texts. Grote and Rösner (\[5\]) have studied car manuals for the TECHDOC system, and they observe a similarly close correspondence.</Paragraph> <Paragraph position="2"> We have employed Functional Grammar (FG) (cf. \[6\]) as a principal analysis tool in developing representations for the domain and the language.</Paragraph> </Section> </Section> <Section position="4" start_page="245" end_page="245" type="metho"> <SectionTitle> 3 Domain Representation </SectionTitle> <Paragraph position="0"> The domain representation is based on conceptual structures (Sowa \[11\]) and the transitivity structure of FG. Concept nodes are typed in an inheritance network. 
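As an illustration of such an inheritance network, the following Python sketch (ours, not Genie's code; the intermediate type names are invented) encodes is-a links, a subsumption check, and one aspect-like fact:

```python
# Illustrative sketch of a typed concept hierarchy with is-a links and
# an aspect fact in the style of Sowa's conceptual graphs. The type
# names and dict layout are our own invention, not Genie's code.

TYPE_PARENT = {
    "r1000": "gearbox",   # the R1000 is-a gearbox (sentence (1))
    "gearbox": "device",  # intermediate types assumed for the sketch
    "device": "entity",
}

def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = TYPE_PARENT.get(specific)
    return False

# An aspect: an object with an attributive relation to a value,
# e.g. sentence (2) "The gearbox has nine forward gears".
aspect = {"object": "gearbox", "relation": "attr", "value": "nine-forward-gears"}

print(subsumes("device", "r1000"))  # prints True
```

Typed unification over such a hierarchy would succeed exactly when one type subsumes the other.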
We follow Sowa's definition and notation of conceptual graphs.</Paragraph> <Paragraph position="1"> Next, we sketch how static and processive information are represented as facts, called aspects and transitions, respectively, in the knowledge base.</Paragraph> <Section position="1" start_page="245" end_page="245" type="sub_section"> <SectionTitle> 3.1 Aspects </SectionTitle> <Paragraph position="0"> An aspect contains a simple conceptual graph in which an object has an attributive relation to a value. We define the is-a link as attributive, with the type becoming the value. Sentences (1) and (2) are:</Paragraph> <Paragraph position="2"> Both aspects happen to be close to their linguistic realizations, which is not necessarily always the case.</Paragraph> </Section> <Section position="2" start_page="245" end_page="245" type="sub_section"> <SectionTitle> 3.2 Transitions </SectionTitle> <Paragraph position="0"> A transition is a concept trans with three relations: pre, means, and post. means has an event as its value; pre and post hold circumstances that obtain before and after the event has occurred.</Paragraph> <Paragraph position="1"> An event carries mandatory role relations, e.g. actor and goal, and peripheral role relations, e.g. instr, to other objects. We can differentiate roles into subtypes, e.g. i-instr, which inhibits the event.</Paragraph> <Paragraph position="2"> A circumstance can be (i) a state, characterized as a setting of some variable parameter. An example is in the transition for sentence (4):</Paragraph> <Paragraph position="4"> Sub-events have their own transitions as values for pre and post, which allows us to link events together. gen-dur-pre is a version of pre used to give a meaning to &quot;... being carried out&quot;.</Paragraph> <Paragraph position="5"> Transitions are more powerful than what has been outlined here. 
Much of their internal temporal constituency, complex parameters, lambda-abstractions, and different kinds of constraints have been left out for clarity.</Paragraph> </Section> </Section> <Section position="5" start_page="245" end_page="247" type="metho"> <SectionTitle> 4 Linguistic Representation </SectionTitle> <Paragraph position="0"> This section describes how Genie derives categories for a fact as part of generation. We first describe English categories briefly.</Paragraph> <Section position="1" start_page="245" end_page="246" type="sub_section"> <SectionTitle> 4.1 Categories </SectionTitle> <Paragraph position="0"> Categories are expressed in a language of typed feature structures. We define how categories can be formed, and their different types and content.</Paragraph> <Paragraph position="1"> The construction of categories is inspired by modern Categorial Grammars (CG), such as UCG (cf. \[12\]), but differs in some respects. The set of categories G is defined recursively: (i) basic categories ∈ G; (ii) if A and B ∈ G, then the complex category A|B ∈ G.</Paragraph> <Paragraph position="2"> The differences from CG are (i) the association of categories with facts and concepts, and (ii) that complex categories are non-directed.</Paragraph> <Paragraph position="3"> Categories compose using the reduction rule, which unifies: A|B, B ⇒ A. Categories are expressed as typed feature structures (tfs) (cf. Carpenter \[2\]). a(name) denotes the set of attributes the type name carries, and s(name) its immediate subtypes. cat is the root, with a(cat) = {} and s(cat) = {xcat, bcat}; xcat is the | operator, and bcat covers the basic categories, with a(bcat) = {fb, st} and s(bcat) = {lcat, pcat}. lcat and pcat are the lexical and phrasal categories. The attribute fb holds some feature bundle, rooted at fb and named appropriately, e.g. np-fb, n-fb, agr-fb. st has an FG mood-structure to hold subcategories. 
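The reduction rule over these feature structures can be sketched as unification of nested structures; the fragment below is a minimal illustration of ours, with plain dicts standing in for the tfs:

```python
# Minimal sketch of the non-directed reduction rule A|B, B => A over
# typed feature structures. Plain dicts stand in for the tfs; variable
# sharing (e.g. agreement reentrancy) and the type hierarchy are
# omitted. This is our illustration, not Genie's implementation.

def unify(a, b):
    """Unify two feature structures; return None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, val in b.items():
            if key in out:
                sub = unify(out[key], val)
                if sub is None:
                    return None
                out[key] = sub
            else:
                out[key] = val
        return out
    return a if a == b else None

def reduce_cat(complex_cat, arg):
    """Apply A|B to some B: if arg unifies with the B slot, yield A."""
    result, slot = complex_cat  # a pair (A, B) encodes A|B
    if unify(slot, arg) is None:
        return None
    return result

# s|np applied to a singular np yields the basic category s.
np = {"cat": "np", "agr": {"numb": "sg", "pers": "3rd"}}
s_over_np = ({"cat": "s"}, {"cat": "np", "agr": {"numb": "sg"}})
print(reduce_cat(s_over_np, np))  # prints {'cat': 's'}
```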
A pcat has a certain tfs under st to encode the structure, while an lcat has a pointer into a surface lexicon. s-st is the structure for clauses; its elements are coded as attributes, e.g. subj, fin, compl, etc.</Paragraph> </Section> <Section position="2" start_page="246" end_page="247" type="sub_section"> <SectionTitle> 4.2 Conceptual Grammar </SectionTitle> <Paragraph position="0"> Facts are associated with categories composed of those obtained from the conceptual constituents. The grammar rules state that a particular domain type corresponds to a category with certain combinatorial properties. If these are violated, the rule cannot derive an adequate category for the fact. Concept nodes are associated with a number of categories, as defined by lexical rules.</Paragraph> <Paragraph position="1"> We call this a conceptual grammar, since it is tied to conceptual rather than phrase structures. The rules are language independent, as the linguistic material is effectively hidden within the basic categories. Rules have the following notation: <head> when <body>.</Paragraph> <Paragraph position="2"> <head> carries an association of the general form cs ⇒ cat, where cs is a conceptual structure and cat is the category. <head> holds whenever all constraints in <body> hold¹. Help associations (an arrow with a symbol on top) support ⇒ with extra material. We describe rules for atoms, objects, aspects, and transitions.</Paragraph> <Paragraph position="3"> Atoms have a rather simple and direct association:</Paragraph> <Paragraph position="4"/> <Paragraph position="5"> The type of category depends on how it will be used, but it should be basic. The examples are typical.</Paragraph> <Paragraph position="6"> The object R1000 gives &quot;a gearbox&quot; in: \[r1000\] ⇒ np\[fb: np-fb\[agr: Agr = agr-fb\[numb: sg, pers: 3rd\], spec: indef\], st: np-st\[n: n\[fb: n-fb\[agr: Agr\], st: gearbox\]\]\] There are potentially many alternative associations. 
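The one-to-many association between concept nodes and basic categories can be pictured with a small rule table; the table name and its entries below are invented for illustration and are not Genie's rules:

```python
# Hedged sketch of lexical rules associating concept nodes with basic
# categories; the table name and entries are invented for illustration.

LEXICAL_RULES = {
    # concept type mapped to candidate basic categories (lcat-like dicts)
    "r1000": [
        {"cat": "np",
         "fb": {"agr": {"numb": "sg", "pers": "3rd"}, "spec": "indef"},
         "st": "gearbox"},
    ],
    "mechanical": [
        {"cat": "a", "fb": {"adv": True}, "st": "mechanical"},
    ],
}

def categories_for(concept):
    """All categories a concept node may realize; a lexical-choice
    component (not modelled here) would pick among the alternatives."""
    return LEXICAL_RULES.get(concept, [])

print(len(categories_for("r1000")))  # prints 1
```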
Lexical choice is not addressed in this paper, although we recognize its necessity in generation systems. The category for the relation in an aspect is seen as a function of the categories for the two concepts. The grammar rule for aspects fetches and applies the function. (¹ Like a Prolog rule.) A relation operation, as in the aspect for sentence (3), has the category s|np|a: \[operation\] ⇒ s\[st: s-st\[subj: Subj, fin: v\[fb: v-fb\[pass: +, agr: Agr = agr-fb\]\], pred: v\[st: operation\], compl: Compl\]\]</Paragraph> <Paragraph position="8"> The rule says that one category should fill the compl element as an adverbial, and another should become an np in the subj element. Note the subject-verb agreement.</Paragraph> <Paragraph position="9"> The aspect rule simply reduces the relation category with the categories obtained from the concepts:</Paragraph> <Paragraph position="11"/> <Paragraph position="12"> An aspect is matched to the right-hand side of the head to bind the variables O, R, and V. The rule proves the following category for sentence (3): s\[st: s-st\[fin: v\[fb: v-fb\[pass: +, agr: Agr = agr-fb\]\], pred: v\[st: operation\], compl: a\[fb: a-fb\[adv: +\], st: mechanical\]\]\] Associations for transitions are more complex, but still compositional. The idea is to get a category for the event and reduce it with all roles to obtain a basic category. This is then reduced with the transition type category and with the categories for the pre and post relations and values. 
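The compositional reduction just described can be sketched as follows; the nested-pair encoding and the match-on-category-only reduction are simplifications of ours, not the paper's implementation:

```python
# Sketch of the aspect rule: the relation's category acts as a function
# (here s|np|a for the relation `operation`) that is reduced with the
# categories obtained from the two concepts. The nested-pair encoding
# and the match-on-"cat"-only reduction are our simplifications.

def reduce_cat(complex_cat, arg):
    """A|B applied to a matching B yields A (matching by "cat" only)."""
    result, slot = complex_cat
    return result if slot["cat"] == arg["cat"] else None

# s|np|a encoded as ((s, a), np): first take the np, then the adverbial.
s_over_a = ({"cat": "s", "st": "operation"}, {"cat": "a"})
operation = (s_over_a, {"cat": "np"})

subj = {"cat": "np", "st": "gearbox"}     # from the concept gearbox
compl = {"cat": "a", "st": "mechanical"}  # from the concept mechanical

step1 = reduce_cat(operation, subj)  # s|a remains after the np is consumed
step2 = reduce_cat(step1, compl)     # the basic category s for sentence (3)
print(step2["st"])  # prints operation
```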
The association for trans is defined by the rule:</Paragraph> <Paragraph position="13"/> </Section> <Section position="3" start_page="247" end_page="247" type="sub_section"> <SectionTitle> </SectionTitle> <Paragraph position="0"> The transition is matched to bind variables in the head.</Paragraph> <Paragraph position="1"> A help association retrieves the complex category of one argument for the mandatory event. pre and post are optional and have their own categories, e.g.: \[gen-dur-pre\] ⇒ S|Pre, Pre = progressive-s, S = s\[st: s-st\[pre: Pre\]\] The category constrains the category in the pre to be a progressive-s. The rule for events basically looks like:</Paragraph> <Paragraph position="3"/> <Paragraph position="4"> The event category reduces with the mandatory role values to reveal the innermost result category for the event. It will then reduce with the peripheral roles.</Paragraph> <Paragraph position="5"> An example of an event category carried by</Paragraph> </Section> <Section position="4" start_page="247" end_page="247" type="sub_section"> <SectionTitle> 4.3 Discussion </SectionTitle> <Paragraph position="0"> The conceptual grammar is a semantic-head grammar, where the semantic head is the top node of the graph a rule analyzes. The grammar processor is a plain Prolog resolution. It behaves as the standard semantic-head-driven generator (SHDG) (Shieber et al. \[10\]) does when all nodes are pivots, i.e. in a purely top-down manner. SHDGs in general are quite different from ours in the way knowledge is organized. They follow the structure of categories in grammars that are more suitable for parsing, i.e. allowing content-less words but not word-less contents. Hence, there is an asymmetry between the compositionality of words and that of semantics (Dymetman \[4\]). A content-less word can potentially occur anywhere in the output string, and a generator must take this into account to terminate gracefully. 
Problems of ensuring coherence and completeness degrade efficiency further. Our generator resembles a parser to a large extent, having a conceptual structure instead of a string to work on. As such, it is free from these problems and can potentially benefit directly from many research results in parsing technology.</Paragraph> <Paragraph position="1"> The rules are designed to work on any language, thus lessening the burden when adding more linguistic support. More rules have to be written only when new kinds of facts are added to the knowledge base, to account for their structures. We do not need a reachability relation, as goal-directedness in generation is achieved by making clever choices of categories in the lexical rules.</Paragraph> <Paragraph position="2"> The relations between domain types and categories are similar to the semantic type assignments in classic CGs.</Paragraph> <Paragraph position="3"> Our version is more flexible as a consequence of the type system.</Paragraph> <Paragraph position="4"> Genie is in an experimental state (about 20 aspects and 10 transitions), but it has demonstrated the feasibility of the ideas discussed in this paper. It is less competent in lexical choice and the combinatory grammar. Development is continuing in the Life environment.</Paragraph> </Section> </Section> </Paper>