<?xml version="1.0" standalone="yes"?> <Paper uid="W98-1420"> <Title>A Language-Independent System for Generating Feature Structures from Interlingua Representations</Title> <Section position="3" start_page="188" end_page="189" type="intro"> <SectionTitle> 2 Interlingua and Ontology </SectionTitle> <Paragraph position="0"> The work described in this paper is based on the interlingua approach to MT. In this approach, the meaning conveyed in the source text is represented using a language-independent, artificial language. The language formalism used in this paper was developed for the MicroCosmos project at New Mexico State University and is called text meaning representation (TMR) [Mahesh and Nirenburg, 1996; Beale et al., 1995]. The formalism is based on two main knowledge resources: the speaker's world knowledge about entities, events, and their relationships, which is described in an ontology, and linguistic information about semantic (aspect, modality, etc.) and pragmatic (speech-act, stylistics, etc.) issues. In this section, a brief description of the ontology is given first, and then the interlingua formalism is presented with a demonstrative example.</Paragraph> <Paragraph position="1"> The ontology used in this work is a hierarchical model of the real world [Mahesh, 1996]. It is built upon proposed abstractions, called concepts, of world entities, events, and relations. The concepts in the ontology are not designed to denote word senses in a specific language; instead, they are defined to represent our common-sense knowledge about the world. Each concept is represented as a frame, and the information about its abstraction is described through a set of features with their value domains. For example, the concept HUMAN is defined to denote all human beings in the world, and it corresponds to the words 'man', 'woman', 'child', 'John', etc., in English. 
The frame given below is a simplified description of HUMAN.</Paragraph> <Paragraph position="2"> concept HUMAN</Paragraph> <Paragraph position="4"> teacher~engineer~..</Paragraph> <Paragraph position="5"> The representation of events in the ontology is somewhat different from that of entities, since events are treated as predicates over arguments. An event concept therefore provides extra information about its thematic structure, such that each thematic role can take a set of entity concepts as its values. All concepts in the ontology are connected to others through a set of relations. The main relation, is-a, provides the hierarchical interpretation in the ontology, such that child concepts define additional properties and put some constraints on the definition of their parent concepts. So, a HUMAN is a MAMMAL, which is an ANIMAL, etc. There are also other types of relations that provide additional information, for example, a MONITOR is-part-of a COMPUTER.</Paragraph> <Paragraph position="6"> The language formalism used, TMR, does not contain any information specific to the source language, such as lexemes or syntactic structure. It uses a frame-based notation and is heavily based on the ontology. The concepts from the ontology are used to denote the propositional content of the input sentences. But since concepts are only abstractions, their features should be instantiated to denote real things when used in a TMR. Although concept instances provide the information about the propositional content, the semantic and pragmatic properties of the sentence should also be described in the TMR. To facilitate this, the TMR language provides special frames for representing aspectual properties, temporal relations, speech-acts, stylistic factors, etc. Instead of describing the TMR language in full detail, an example representation is given to demonstrate its formalism. The TMR of the sentence &quot;The man gave a book to the child&quot; is given in Figure 2. 
Note that although English words are used as concept names, they do not denote English word senses; they are just generic abstractions. Each frame in a TMR is indexed to differentiate between frames with the same name. Both of the phrases 'the man' and 'the child' are represented with frames of the same concept, HUMAN, but their instantiated features are entirely different. The given TMR simply denotes the event give(man, child, book) with its aspectual properties (aspect1) and its temporal relation with the time of utterance (temp-rel1). Information about the speech situation is described with the speech-act1 frame. Observe that nothing in the given TMR is specific to the English sentence that it represents.</Paragraph> </Section> </Paper>