<?xml version="1.0" standalone="yes"?> <Paper uid="P84-1003"> <Title>TRANSFORMING ENGLISH INTERFACES TO OTHER NATURAL LANGUAGES: AN EXPERIMENT WITH PORTUGUESE</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. INTRODUCTION </SectionTitle> <Paragraph position="0"> The CHAT-80 program for English (Warren & Pereira, 1981; Pereira, 1983) was transformed and adapted to Portuguese. Logic programming as a mental aid, and Prolog (Coelho, 1983; Clocksin & Mellish, 1981) and Extraposition Grammars (Pereira, 1983) as practical tools, were adopted to implement a natural language interface for Portuguese. The interface reported here, called LUSO, was then coupled to a knowledge base for geography, an extension of the CHAT-80 knowledge base. In a later experiment, the LUSO dictionary was augmented with new vocabulary, and LUSO was coupled to other modules that considerably extended the expertise of SSIPA (Sistema Simulador de um Interlocutor Português Automático (2)). (1) Present address: Centro de Informática, Laboratório Nacional de Engenharia Civil, 101, Av. do Brasil, 1799 Lisboa Codex, Portugal. (2) Simulating System of a Portuguese Automatic Interlocutor.</Paragraph> <Paragraph position="1"> SSIPA is a complex knowledge information processing system with natural language comprehension and synthesis capabilities. It interacts with users in Portuguese by means of the linguistic knowledge that is logically organized and encoded in its interface, LUSO. After the first step of its development, SSIPA was able to answer questions about geography and could agree or disagree with the opinions stated by users about its geographical knowledge. 
After the second step of its development, SSIPA became more powerful and intelligent because it could also perform actions that traditionally were attributes of computer monitors (Lopes & Viccari, 1984). In fact, SSIPA can create and delete files, fill them, change their names, and list and change their contents; it receives, keeps and sends messages; it answers questions not only about geography but also about the knowledge SSIPA represents; it agrees or disagrees with the opinions stated by users about the knowledge context behind dialogues; and it reacts when users try to cheat it. As a rule, however, SSIPA behaves as a helpful, diligent and cooperative interlocutor willing to serve human users, changing from one topic of conversation to another and developing intelligent clarification dialogues (Lopes, 1984). All these features require a very powerful Portuguese language interface, whose main morpho-syntactic features are pointed out in this paper.</Paragraph> </Section> <Section position="4" start_page="0" end_page="8" type="metho"> <SectionTitle> 2. FORMALIZATION OF NATURAL LANGUAGE CONSTRUCTS </SectionTitle> <Paragraph position="0"> Natural languages are complex structured systems that are difficult to formalize. Formalization can be understood as the step-by-step construction of a theory whose ultimate goal is an axiomatic definition of natural language constructs. If this descriptive theory can also function as the structured linguistic knowledge needed to simulate a native speaker using his mother tongue, then the formalization effort has gained a new insight. 
While representing a natural language system, it may represent a native speaker's competence in his mother tongue and, simultaneously, play the role of a native speaker using that competence.</Paragraph> <Paragraph position="1"> This dual unity, incorporating a description of linguistic knowledge and the same linguistic knowledge ready to be active, is central to this work. Unifying in the same unit two apparently conflicting and contradictory aspects of natural languages is possible because logic is used as both a mental and a practical tool. SSIPA encapsulates both views of natural language.</Paragraph> <Paragraph position="2"> Practice demonstrates that, when constructing complex models, it is better to begin with simple versions of the model of the system one intends to simulate. This practical conclusion seems reasonable because knowledge about a system and about its representation keeps growing as the empirical investigation needed to validate the simulating model progresses (Klir, 1975). However, one must be aware that while knowledge about a real system keeps growing, so does the complexity that one can unwillingly introduce into the model. With all this in mind, if we want to formalize linguistic knowledge about natural language, we must be prepared to use powerful formal languages that are suited to describing complex systems and can be used as programming languages. It is assumed here that computers are tools adapted to dealing with complexity, considerably augmenting human capabilities to handle highly complex representational systems.</Paragraph> <Paragraph position="3"> 3. LUSO The LUSO input subsystem is a device that transforms a sequence of words that is morphologically, syntactically and semantically significant into a Logical Form. 
A Logical Form is understood here as a sequence of predicates, envelopes that transport knowledge from users to SSIPA's central processing unit (the EVENT DRIVER) and from this unit back to users. These predicates generalize and extend the potentialities of Pereira's equivalent predicates (Pereira, 1983). They can also be compared with the lexical functions of Bresnan (1981). However, we do not use case classification. In Portuguese, prepositions associated with noun semantic features seem to be enough to identify and differentiate the meanings of verbal, noun, adjectival and even prepositional form functions (Lopes, 1984).</Paragraph> <Paragraph position="4"> LUSO is a natural language interface that concentrates expert linguistic knowledge about the Portuguese language.</Paragraph> <Paragraph position="5"> The LUSO input subsystem works sequentially.</Paragraph> <Paragraph position="6"> In a first step it performs the syntactical analysis of an input sequence of Portuguese words. Depending on the task LUSO has been committed to perform, it attempts to prove that the input sequence is a syntactically correct yes-no question, wh-question, imperative or declarative sentence, or a syntactically correct noun phrase or prepositional phrase; the result is either a lexically filled syntagmatic marker or a failure. When a lexically filled syntagmatic marker is obtained, it is translated into a logical form. 
Finally, this form is planned and simplified according to the methodology described by Pereira (1983) and Warren (1981).</Paragraph> <Paragraph position="7"> The design of the LUSO input subsystem reflects the following hypotheses: * morphological analysis of Portuguese constructs is syntactically driven; * linguistic semantic analysis of Portuguese constructs is lexically (functionally) driven, in a quasi-Bresnanian sense (Bresnan, 1981; Pereira, 1983; Lopes, 1984); * cognitive semantic analysis of Portuguese constructs depends on the syntactical and linguistic semantic analyses previously achieved for those constructs.</Paragraph> <Paragraph position="8"> This suggests that SSIPA is a formal system that already theorizes some aspects of the Portuguese language, while LUSO specifies the form of formal functions whose cognitive content and formal aptitude for transforming the system state are defined at the semantic level of the formal system.</Paragraph> <Paragraph position="9"> To complete the formal role we wanted SSIPA to play, the LUSO output subsystem synthesizes Portuguese noun phrases, prepositional phrases or sentences whenever it receives corresponding requests to output such constructs. To achieve that goal, LUSO transforms any previously lexically filled syntagmatic marker into a sequence of Portuguese words in their final forms, ready to be sent to a user.</Paragraph> </Section> <Section position="5" start_page="8" end_page="8" type="metho"> <SectionTitle> 4. 
MORPHO-SYNTACTICAL ANALYSIS AND SYNTHESIS OF PORTUGUESE LANGUAGE CONSTRUCTS </SectionTitle> <Paragraph position="0"> The morpho-syntactical analysis of Portuguese language constructs is application independent and is based on the concepts developed by Chomsky and his followers in the framework of the Extended Standard Theory of Generative Grammar (Chomsky, 1980, 1981a, 1981b; Rouveret, 1983; and many others). As already mentioned, one of the crucial hypotheses behind LUSO's design is that morphological analysis of Portuguese constructs is syntactically driven. This means that when the syntactical parser is waiting for a specific grammatical category, it takes the next word to be analysed from the input sequence of words and searches the dictionary entries for that category, trying to find the input word. If the input word does not match any dictionary entry for that particular category, all possible endings of the input word are matched, one after another from the longest to the shortest, against the ending entries for that category until a successful match occurs. If no such match succeeds, the input word does not belong to the expected grammatical category. As a consequence, a failure occurs and the Prolog backtracking mechanism is automatically activated.</Paragraph> <Paragraph position="1"> When one of the possible endings of the input word matches an ending entry for the syntactically predicted category, a basic form for the input word is coined. The newly coined basic form is then checked against the subdictionary entries for the expected grammatical category. A process of successes and/or failures proceeds. A syntagmatic marker for each input Portuguese construct is filled with the basic forms of the words and the corresponding syntactic feature information (person, gender and number for noun phrases; tense, mode, aspect, voice and negation for verbs; etc.). 
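As an illustration only, the category-driven lookup with ending matching described above can be sketched in a few lines of Python rather than in the Prolog of the actual system. The word lists, ending tables, feature names and the function `analyse` are hypothetical, since the format of LUSO's dictionary is not given here; returning `None` stands in for the failure that would trigger Prolog backtracking.

```python
# Hypothetical toy dictionary: basic forms listed per grammatical category.
DICTIONARY = {
    "noun": {"rio", "cidade"},
    "verb": {"falar"},
}

# Hypothetical ending entries per category:
# surface ending -> (ending of the basic form, syntactic features).
ENDINGS = {
    "noun": {"s": ("", {"number": "plural"})},
    "verb": {
        "amos": ("ar", {"person": 1, "number": "plural"}),
        "a": ("ar", {"person": 3, "number": "singular"}),
    },
}


def analyse(word, category):
    """Return (basic_form, features) if word fits the expected category.

    Returns None otherwise, which plays the role of a Prolog failure.
    """
    lexicon = DICTIONARY.get(category, set())
    # First, look for the input word itself among the category's entries.
    if word in lexicon:
        return word, {}
    endings = ENDINGS.get(category, {})
    # Otherwise try every ending of the word, from the longest to the shortest.
    for start in range(len(word)):
        ending = word[start:]
        if ending not in endings:
            continue
        base_ending, features = endings[ending]
        # Coin a candidate basic form and check it against the dictionary.
        basic_form = word[:start] + base_ending
        if basic_form in lexicon:
            return basic_form, dict(features)
        # A coined form that is not in the dictionary is one of the
        # "failures" of the process; shorter endings are tried next.
    return None


print(analyse("falamos", "verb"))
print(analyse("rios", "verb"))
```

With this toy data, `analyse("falamos", "verb")` yields the basic form `falar` with first person plural features, whereas `analyse("rios", "verb")` fails, so a parser expecting a verb would have to try another category.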
The basic form for a verb is its infinitive form; for a noun it is its singular form; for a pronoun, article or adjective it is its singular masculine form.</Paragraph> <Paragraph position="2"> The morphological synthesis of Portuguese constructs is also syntactically driven. This means that, starting from a syntagmatic marker lexically filled with the basic forms of Portuguese words, and using the syntactic features explicitly recorded in that marker, the LUSO output subsystem coins the corresponding sequence of Portuguese words in their final output form, ready to be sent to the user with whom the system is interacting. For this purpose, most of the rules that were designed to consult LUSO's dictionary were reordered. Starting from the basic forms of words, their final forms are obtained by a process that is nearly the inverse of the one used for input.</Paragraph> <Paragraph position="3"> Extraposition grammars, the formalism developed by Pereira (1983), were used to implement the analyser and the synthesizer for Portuguese. This formalism proved to be quite adequate for describing the move-alpha rule (Chomsky, 1981b) in complex syntactical environments such as those that frequently occur in Portuguese. In fact, the order of phrase constituents in Portuguese sentences is quite free. LUSO takes into account the same types of problems handled by the CHAT-80 program. Additionally, it analyses syntactical structures involving prepositional phrases and verb-headed sentences in which noun phrase constituents are reordered because of the heading process. Problems related to common nouns followed by the proper nouns they refer to, in the context where they appear, are also handled.</Paragraph> </Section> </Paper>