File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/88/c88-1054_metho.xml
Size: 13,210 bytes
Last Modified: 2025-10-06 14:12:08
<?xml version="1.0" standalone="yes"?> <Paper uid="C88-1054"> <Title>Achieving Bidirectionality</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> Artificial Intelligence Program GE Research and Development Center Schenectady, NY 12301 USA Abstract </SectionTitle> <Paragraph position="0"> The topic of BIDIRECTIONALITY, using common knowledge in language processing for both analysis and generation, is of both practical and theoretical concern. Theoretically, it is important to determine what knowledge structures can be applied to both. Practically, it is important that a competent natural language system be able to generate outputs that are relevant to the inputs it understands, without excessive redundancy. This problem revolves around the ability to relate linguistic structures declaratively to their meaning.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> BIDIRECTIONALITY, or the ability to use a common knowledge base for both language analysis and generation, is a desirable feature of a real language processing system. A natural language &quot;front end&quot; must not only perform syntactic analysis, but must derive a suitable representation of a meaning or intention from a linguistic input. A natural language generator performs the inverse task of producing a linguistic utterance from a meaning or intention.
A bidirectional system performs both tasks using as much shared knowledge as possible.</Paragraph> <Paragraph position="1"> Two practical concerns motivate this work: (1) A system that uses shared knowledge for analysis and generation will produce output in the subset of language that it understands, thus avoiding inconsistencies between the input and output, and (2) Using shared knowledge avoids the inefficiency of having distinct encodings of the same linguistic information.</Paragraph> <Paragraph position="2"> The first concern, having a natural language interface &quot;speak&quot; the same language it understands, is more than a convenience. Responses in a dialogue often use a word or phrase that has been mentioned by another speaker. This cannot be done effectively unless the word or phrase is common to both the input and output language. A computer user will expect the system to understand a phrase or construct that the system has itself used; this aggravates the consequences of inconsistencies between input and output language. Moreover, if an interface is to be transportable across domains, a distinct subset of language will be applicable to each domain. The bidirectional knowledge base allows both the input and output to be constrained simultaneously.</Paragraph> <Paragraph position="3"> The second concern, efficiency of knowledge representation, becomes more compelling as the lexical and semantic capabilities of natural language systems increase. While there is ample motivation for having a common grammar for analysis and generation, the need for a common lexicon is even stronger. Having two lexicons is counterintuitive; what makes practical sense is to have a single lexicon indexed differently for generation than for analysis. Now that many systems have more and more knowledge built into their lexicons, the effects of redundancy become more drastic.
When more information is required of the lexicon, however, the difficulties in developing a shared lexicon are more pronounced.</Paragraph> <Paragraph position="4"> The principal concern in designing a natural language system that performs both analysis and generation, therefore, is a bidirectional lexicon. The main issue to be considered here is what information must be included in this lexicon and how bidirectional lexical knowledge should be structured.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Issues Regarding Bidirectionality </SectionTitle> <Paragraph position="0"> There has been very little research in language generation relative to language understanding and syntactic analysis. A negligible amount of research has addressed the problem of bidirectionality. Some work has touched on shared knowledge of lexical semantics [Jacobs, 1985, Steinacker and Buchberger, 1983] and on grammatical frameworks suitable for bidirectional systems [Kay, 1984]. At the recent TINLAP (Theoretical Issues in Natural Language Processing) conference [Wilks, 1987], position papers brought out a number of points concerning bidirectionality that had not previously appeared in the literature.</Paragraph> <Paragraph position="1"> The positions largely embraced the need for knowledge shared between analysis and generation while laying out the practical reasons why bidirectional systems are not prevalent.</Paragraph> <Paragraph position="2"> A good summary of issues in bidirectionality is found in [Mann, 1987]. Each aspect of the generation process can be related to some part of language analysis that seems to draw from common knowledge. However, the processes themselves, as well as the problems involved in building actual language processing systems, differ to such an extent that scientists do not find the time to attend to the common issues.
Another point is that both fields, especially generation, largely ignore the problem of lexical semantics [Marcus, 1987], a problem whose study might help to bring the tasks closer together.</Paragraph> <Paragraph position="3"> It is a mistake to treat analysis and generation as completely independent tasks. Given that the goal of much of natural language research is to program computers to communicate in the way people do, the ideal natural language program must use natural language as both a &quot;front end&quot; and a &quot;back end&quot;. Knowledge that has historically been used more in generation, pertaining to text structure, coherence, and constraints on lexical choice, influences the analysis task.</Paragraph> <Paragraph position="4"> Knowledge primarily applicable to analysis, such as vocabulary and grammatical coverage, and information applied to ambiguity and vagueness, can be applied to generation as well. The problem of linguistic knowledge base design is thus fundamentally different for a bidirectional system.</Paragraph> </Section> <Section position="4" start_page="0" end_page="267" type="metho"> <SectionTitle> 3 The Bidirectional Lexicon </SectionTitle> <Paragraph position="0"> Several characteristics are essential to a lexicon that can be used effectively in both analysis and generation: 1. Principally, the lexicon and knowledge base of the system must be declarative; all the material must take the form of data structures rather than rules or program code. 2. The semantic component of the lexicon, i.e., the representation of word meanings and word senses, must be sufficient to guide lexical choice in generation and to resolve vague or ambiguous words in analysis.</Paragraph> <Paragraph position="1"> 3. Lexical collocations, phrasal lexemes, and grammatical constructions must be represented.
This compound lexical knowledge is necessary in generation because the selection of a particular word influences the selection of other words in a phrase, even when the phrase is internally grammatical. The knowledge is important in analysis insofar as it can aid in handling multiple word senses.</Paragraph> <Paragraph position="2"> Most systems satisfy the declarative requirement above, although the degree to which knowledge is proceduralized varies greatly from one model to another. The second and third requirements, the richness of lexical semantics and the need for compound knowledge, are more often overlooked. In generation, a lexical entry that lists a word stem and a corresponding set of linguistic and semantic features is not enough; what is needed is a relationship between the lexical item and a knowledge representation structure [Jacobs, 1986] and a means of selecting the lexical item from among the other possible words [Mathiessen, 1981]. A word choice is not made independently of other choices; lexical choices have a direct influence on other lexical choices [Jacobs, 1985].</Paragraph> <Paragraph position="3"> Lexical knowledge used primarily for generation can affect the way language analysis is performed, and vice versa. The following simple examples help to illustrate how complex lexical knowledge required for generation can also affect understanding: [...] in order to produce utterances such as (3a), which is natural for most native speakers. In addition to knowledge about the word sense of &quot;hit&quot;, the system must know what keys are suitable for &quot;hitting&quot;, and that &quot;hit&quot; is used to describe striking a single key. This detailed lexical knowledge should also prevent the use of (2a) in place of (3a), since one cannot use &quot;type&quot; for a key that does not produce a character or text.
Now, given that this knowledge is required for the appropriate generation of the utterances above, it makes sense that it should be used in determining the difference in meaning between (2a) and (3a) (the former means &quot;Hit the sequence of keys r-e-t-u-r-n&quot;). In designing a system strictly for analysis, one would tend to distinguish (2a) from (3a) by assuming &quot;hit&quot; to have a different meaning from &quot;type&quot;, and thus produce two incorrect but relatively subtle effects: First, the meanings of (2b) and (3b) would also be different, and second, (3b) would be as acceptable as (3a).</Paragraph> <Paragraph position="4"> Because a generation system must have enough information in the lexicon to make appropriate lexical choices, it must have lexical knowledge that relates the specific word senses above to the linguistic context in which they are used. A linguistic analyzer can then use this knowledge to make more accurate interpretations of the same words. This is a typical way in which lexical choice and word sense determination are related.</Paragraph> </Section> <Section position="5" start_page="267" end_page="267" type="metho"> <SectionTitle> 4 FLUSH </SectionTitle> <Paragraph position="0"> An example of a lexicon designed with the three characteristics described in the previous section is FLUSH (Flexible Lexicon Using Structured Hierarchical knowledge) [Besemer and Jacobs, 1987]. FLUSH combines a hierarchical phrasal lexicon [Wilensky and Arens, 1980, Jacobs, 1985, Dyer and Zernik, 1986] with declarative relations between language and meaning [Jacobs and Rau, 1985]. For example, figure 1 shows part of the lexical knowledge about the preposition &quot;to&quot;, used in a prepositional phrase modifying either a verb or noun.
The lexical relation to-pmod represents this linguistic category, and constrains how it can be used in a surface structure, based on its membership in the more general mod-rel (modifying relation) category.</Paragraph> <Paragraph position="1"> Figure 2 shows how the to-pmod relation is associated with a generalized transfer event (either a physical transfer or a transfer of possession), with the object of the preposition describing the destination of the transfer. The link marked &quot;REF&quot; in figure 2 represents this sort of association between a linguistic and a conceptual structure. More specific transfers, as well as metaphorical &quot;VIEWs&quot; of transfers, are also explicitly represented in this diagram. Knowledge about senses of &quot;sell&quot;, &quot;tell&quot;, and &quot;send&quot;, as well as constructs using such verbs, is thus represented in a neutral fashion.</Paragraph> <Section position="1" start_page="267" end_page="267" type="sub_section"> <SectionTitle> Conceptual Structures Linguistic Structures </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> Compound lexical knowledge, often involving figurative expressions, is also represented declaratively in FLUSH. Figure 3 shows how such knowledge is encoded: lc-give-hug, the lexical category for &quot;giving a hug&quot; and other variations on the same expression, belongs to a general category, linguistic/conceptual, which accounts for its linguistic flexibility such as its potential use in the passive voice. A &quot;REF&quot; association links lc-give-hug to the hugging concept, indicating declaratively that these expressions describe a hugging action rather [...] These examples, while only touching upon the lexical representation of FLUSH, show some of the characteristics of a bidirectional lexicon. The hierarchy of linguistic structures allows access to these structures for both analysis and generation.
Declarative links between linguistic and conceptual entities allow specific knowledge about linguistic expression to be used in both processes. The current task is to encode enough information in this form so that analysis and generation alike can be robustly performed.</Paragraph> </Section> </Section> </Paper>