<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0815"> <Title>Dependency Based Logical Form Transformations</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Methodology </SectionTitle> <Paragraph position="0"> The system is built using a highly modular design and is intended to be as generic and reusable as possible. The basic data structure is a flat, list-like representation with generic property slots attached to each element. This structure maximises compatibility with the final representation and allows for greater flexibility in the types of information that may be associated with each predicate.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Association for Computational Linguistics </SectionTitle> <Paragraph position="0"> SENSEVAL-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, Barcelona, Spain, July 2004. Figure 1 illustrates the major processing modules available and the work flow.</Paragraph> <Paragraph position="1"> A syntactic parse including functional dependencies is produced on a per-sentence basis. Definitions of the properties associated with each token are presented in Table 1.</Paragraph> <Paragraph position="2"> The resultant parse is transformed into a linear data structure indexed by word position. This is illustrated in Table 2 using the example sentence 'Some students like to study in the mornings'. The original token text is stored, as is the lemmatised form.</Paragraph> <Paragraph position="3"> Head and dependency type are the most important classes of information used by the system. The dependency type and head of a token are often directly, or at least indirectly, translatable into a predicate argument. Examples of the types of dependency functions employed include subject, object, prepositional complement, agent, subject and object complements, indirect object, goal, and coordinating conjunctions.
Determiner and negator functions are also of interest because they are excluded from the final representation.</Paragraph> <Paragraph position="4"> The filter module moderates the presence or absence of tokens using stop lists, pass lists, or a combination of both. Stop lists specify content to be excluded from the token stream, and pass lists specify elements that should remain. Tokens may be filtered from the stream based on any attribute type and value listed in Table 1. This information is provided in the filter set. The principal types of information filtered in this system are determiners, based on morphological tags, and auxiliaries, based on syntactic tag information. For example, 'some' and 'the' are filtered as a consequence of a stop list rule that matches the morpho property value 'DET'.</Paragraph> <Paragraph position="5"> When the token stream has been annotated with the necessary information and has passed through the filter, the tokens that remain are passed through the logical form processor (LFP). The main function of the LFP is to build an inverted index identifying all dependent tokens. Once grammatical dependencies are assigned and the inverted index is built, the logical form representation may be constructed.</Paragraph> <Paragraph position="6"> Each predicate is constructed from the token stream in turn, based on the part-of-speech category of the token. The base form of the token is concatenated with the part-of-speech tag. A mapping table is used to transform the part-of-speech information produced by the parse into the coarser-grained WordNet tags.</Paragraph> <Paragraph position="7"> Entities are the simplest type of predicate to construct as they contain only a single argument, for which the word identifier attribute value is used. The nouns 'student' and 'morning' from the example are transformed into the predicates student:n_(x3) and morning:n_(x9).
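The filtering and entity-construction steps described so far can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the system's actual code: the record field names, the parser tags other than 'DET', the contents of the mapping table, and the helper names apply_stop_list and entity_predicate are all hypothetical. Word identifiers 3 and 9 come from the text; the remaining identifiers are inferred by counting from those.

```python
# Hypothetical token records. Field names and all tags other than DET are
# illustrative assumptions; identifiers 3 and 9 are the ones cited in the text.
TOKENS = [
    {"id": 2, "text": "Some", "lemma": "some", "morpho": "DET"},
    {"id": 3, "text": "students", "lemma": "student", "morpho": "N"},
    {"id": 4, "text": "like", "lemma": "like", "morpho": "V"},
    {"id": 5, "text": "to", "lemma": "to", "morpho": "INFMARK"},
    {"id": 6, "text": "study", "lemma": "study", "morpho": "V"},
    {"id": 7, "text": "in", "lemma": "in", "morpho": "PREP"},
    {"id": 8, "text": "the", "lemma": "the", "morpho": "DET"},
    {"id": 9, "text": "mornings", "lemma": "morning", "morpho": "N"},
]

# Assumed mapping table from parser tags to coarse WordNet-style tags.
POS_MAP = {"N": "n", "V": "v", "A": "a", "ADV": "r"}

# Stop list rule from the text: drop tokens whose morpho property equals DET.
STOP_RULES = [("morpho", "DET")]


def apply_stop_list(tokens, stop_rules):
    """Drop every token that matches any (attribute, value) stop rule."""
    return [
        tok for tok in tokens
        if not any(tok.get(attr) == value for attr, value in stop_rules)
    ]


def entity_predicate(tok):
    """Build a one-argument predicate: lemma, coarse tag, word identifier."""
    tag = POS_MAP[tok["morpho"]]
    return f"{tok['lemma']}:{tag}_(x{tok['id']})"


filtered = apply_stop_list(TOKENS, STOP_RULES)
entities = [entity_predicate(t) for t in filtered if t["morpho"] == "N"]
print(entities)
```

Under these assumptions the sketch removes 'Some' and 'the' from the stream and yields the two entity predicates quoted above. Representing a rule as an (attribute, value) pair reflects the statement that tokens may be filtered on any attribute type and value; a pass list could be handled with the inverse test.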
Pronouns, prepositional complements, and coordinating conjunctions are dealt with individually using their respective dependency function values.</Paragraph> <Paragraph position="8"> Adjectives are constructed using the head dependency value as the argument unless the dependent is marked with a subject. In this case the argument becomes the head of the subject. Adverbs are created primarily using the dependency function alone.</Paragraph> <Paragraph position="9"> Verbal predicates are constructed using SUBJ, OBJ, GOAL, OC, I-OBJ, COMP, and PCOMP dependencies in the specified order. A special case exists for verbs that have object complement dependencies. In these cases attributive nominals are identified and assigned as arguments independently. The main verb 'like' in our example is transformed into the predicate like:v_(e4, x3, e6) as a result of the subject (SUBJ) and object (OBJ) dependencies found in 'student' and 'study' respectively. Because this is the main verb, the LFP inverts the subject and object dependencies, inserts them into the property slot of the head verb token, and assigns their respective word identifier values. The inverted properties augment the token slot for 'like', which has word identifier four in Table 2. The additional elements of the inverted index used to build the predicate are listed in Table 3.</Paragraph> <Paragraph position="10"> Verbal predicates that also serve as grammatical objects warrant special treatment. The token 'study' is an example of this, as it serves as the object of the head verb 'like'. A cache is used to store the sentential head, prepositional complements, subjects, and coordinating conjunctions. The cache is used in this instance to assign the subject and prepositional complement arguments in order to form the predicate study:v_(e6, x3, x9).
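A rough sketch of how an inverted index and a cache could produce the two verbal predicates above is given here. It is illustrative only and rests on several assumptions: the record layout, the 'MAIN' function label, the choice of the 'e' prefix for verbs and 'x' for other tokens, the rule that the cache is consulted only when the verb is itself an object, and every function name are hypothetical rather than taken from the system.

```python
# Argument order as listed in the text.
ARG_ORDER = ["SUBJ", "OBJ", "GOAL", "OC", "I-OBJ", "COMP", "PCOMP"]

# Hypothetical filtered token stream; "func" and "head" values are assumed.
TOKENS = [
    {"id": 3, "lemma": "student", "pos": "n", "func": "SUBJ", "head": 4},
    {"id": 4, "lemma": "like", "pos": "v", "func": "MAIN", "head": 0},
    {"id": 6, "lemma": "study", "pos": "v", "func": "OBJ", "head": 4},
    {"id": 9, "lemma": "morning", "pos": "n", "func": "PCOMP", "head": 7},
]
BY_ID = {tok["id"]: tok for tok in TOKENS}


def build_inverted_index(tokens):
    """Map each head identifier to its dependents, keyed by function."""
    index = {}
    for tok in tokens:
        index.setdefault(tok["head"], {})[tok["func"]] = tok["id"]
    return index


def build_cache(tokens):
    """Remember the most recent subject and prepositional complement."""
    cache = {}
    for tok in tokens:
        if tok["func"] in ("SUBJ", "PCOMP"):
            cache[tok["func"]] = tok["id"]
    return cache


def variable(word_id):
    """Assumed convention: verbs get an 'e' variable, other tokens 'x'."""
    prefix = "e" if BY_ID[word_id]["pos"] == "v" else "x"
    return f"{prefix}{word_id}"


def verbal_predicate(tok, index, cache):
    """Build a verbal predicate from inverted dependencies in ARG_ORDER."""
    deps = dict(index.get(tok["id"], {}))
    if tok["func"] == "OBJ":
        # Assumed rule: a verb acting as an object borrows cached arguments.
        for func in ("SUBJ", "PCOMP"):
            if func in cache:
                deps.setdefault(func, cache[func])
    args = [variable(tok["id"])]
    args += [variable(deps[func]) for func in ARG_ORDER if func in deps]
    joined = ", ".join(args)
    return f"{tok['lemma']}:v_({joined})"


index = build_inverted_index(TOKENS)
cache = build_cache(TOKENS)
like_pred = verbal_predicate(BY_ID[4], index, cache)
study_pred = verbal_predicate(BY_ID[6], index, cache)
print(like_pred)
print(study_pred)
```

With these assumed inputs, 'like' takes its arguments directly from the inverted index, whereas 'study' has no dependents of its own in the index and so receives the cached subject and prepositional complement.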
Note from Table 2 that word identifier three matches the grammatical subject token 'students' and word identifier nine matches the head of the prepositional phrase 'in the mornings'. Once all tokens are processed, the logical form transformation is complete and the final representation is presented in the aforementioned notation.</Paragraph> </Section> </Section> </Paper>