<?xml version="1.0" standalone="yes"?> <Paper uid="P80-1013"> <Title>CAPTURING LINGUISTIC GENERALIZATIONS WITH METARULES IN AN ANNOTATED PHRASE-STRUCTURE GRAMMAR</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> CAPTURING LINGUISTIC GENERALIZATIONS WITH METARULES IN AN ANNOTATED PHRASE-STRUCTURE GRAMMAR Kurt Konolige SRI International* 1. Introduction </SectionTitle> <Paragraph position="0"> Computational models employed by current natural language understanding systems rely on phrase-structure representations of syntax. Whether implemented as augmented transition nets, BNF grammars, annotated phrase-structure grammars, or similar methods, a phrase-structure representation makes the parsing problem computationally tractable \[7\]. However, phrase-structure representations have been open to the criticism that they do not capture linguistic generalizations that are easily expressed in transformational grammars.</Paragraph> <Paragraph position="1"> This paper describes a formalism for specifying syntactic and semantic generalizations across the rules of a phrase-structure grammar (PSG). The formalism consists of two parts: 1. A declarative description of basic syntactic phrase-structures and their associated semantic translation.</Paragraph> <Paragraph position="2"> 2. A set of metarules for deriving additional grammar rules from the basic set.</Paragraph> <Paragraph position="3"> Since metarules operate on grammar rules rather than phrase markers, the transformational effect of metarules can be pre-computed before the grammar is used to analyze input. The computational efficiency of a phrase-structure grammar is thus preserved. Metarule formulations for PSGs have recently received increased attention in the linguistics literature, especially in \[4\], which greatly influenced the formalism presented in this paper. 
Our formalism differs significantly from \[4\] in that the metarules work on a phrase-structure grammar annotated with arbitrary feature sets (Annotated Phrase-structure Grammar, or APSG \[7\]). Grammars for a large subset of English have been written using this formalism \[9\], and its computational viability has been demonstrated \[6\]. Because of the increased structural complexity of APSGs over PSGs without annotations, new techniques for applying metarules to these structures are developed in this paper, and the notion of a match between a metarule and a grammar rule is carefully defined. The formalism has been implemented as a computer program and preliminary tests have been made to establish its validity and effectiveness.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. Metarules </SectionTitle> <Paragraph position="0"> Metarules are used to capture linguistic generalizations that are not readily expressed in the phrase-structure rules.</Paragraph> <Paragraph position="1"> Consider the two sentences: 1. John gave a book to Mary 2. Mary was given a book by John Although their syntactic structure is different, these two sentences have many elements in common. In particular, the predicate/argument structure they describe is the same: the gift of a book by John to Mary. Transformational grammars capture this correspondence by transforming the phrase marker for (1) into the phrase marker for (2).</Paragraph> <Paragraph position="2"> *This research was supported by the Defense Advanced Research Projects Agency under Contract N00039-79-C-0118 with the Naval Electronics Systems Command. The views and conclusions contained in this document are those of the author and should not be interpreted as representative of the official policies, either expressed or implied, of the U.S. Government. The author is grateful to Jane Robinson and Gary Hendrix for comments on an earlier draft of this paper. 
The underlying predicate/argument structure remains the same, but the surface realization changes. However, the recognition of transformational grammars is a very difficult computational problem.* By contrast, metarules operate directly on the rules of a PSG to produce more rules for that grammar. As long as the number of derived rules is finite, the resulting set of rules is still a PSG. Unlike transformational grammars, PSGs have efficient algorithms for parsing \[3\]. In a sense, all of the work of transformations has been pushed off into a pre-processing phase where new grammar rules are derived.</Paragraph> <Paragraph position="3"> We are not greatly concerned with efficiency in pre-processing, because it only has to be done once.</Paragraph> <Paragraph position="4"> There are still computational limitations on PSGs that must be taken into account by any metarule system. Large numbers of phrase-structure rules can seriously degrade the performance of a parser, in terms of its running time**, storage for the rules, and the ambiguity of the resulting parses \[6\]. Moreover, the generation of large numbers of rules seems psychologically implausible. Thus the two criteria we will use to judge the efficacy of metarules will be: can they adequately capture linguistic generalizations, and are they computationally practicable in terms of the number of rules they generate? The formalism of \[4\] is especially vulnerable to criticism on the latter point, since it generates large numbers of new rules.***</Paragraph> </Section> <Section position="3" start_page="0" end_page="44" type="metho"> <SectionTitle> 3. Representation </SectionTitle> <Paragraph position="0"> An annotated phrase-structure grammar (APSG) as developed in \[7\] is the target representation for the metarules. The core component of an APSG is a set of context-free phrase-structure rules. 
As is customary, these rules are input to a context-free parser to analyze a string, producing a phrase-structure tree as output. In addition, the parse tree so produced may have arbitrary feature sets, called annotations, appended to each node. The annotations are an efficient means of incorporating additional information into the parse tree. Typically, features will exist for syntactic processing (e.g., number agreement), grammatical function of constituents (e.g., subject, direct and indirect objects), and semantic interpretation.</Paragraph> <Paragraph position="1"> Associated with each rule of the grammar are procedures for operating on feature sets of the phrase markers the rule constructs. These procedures may constrain the application of the rule by testing features on candidate constituents, or add information to the structure created by the rule, based on the features of its constituents. Rule procedures are written in the programming language LISP, giving the grammar the power to recognize class 0 languages. The use of arbitrary procedures and feature set annotations makes APSGs an *There has been some success in restricting the power of transformational grammars sufficiently to allow a recognizer to be built; see \[8\].</Paragraph> <Paragraph position="2"> **Sheil \[10\] has shown that, for a simple recursive descent parsing algorithm, running time is a linear function of the number of rules. For other parsing schemes, the relationship between the number of rules and parsing time is unclear.</Paragraph> <Paragraph position="3"> ***This is without considering infinite schemas such as the one for conjunction reduction. 
Basically, the problem is that the formalism of \[4\] allows complex features \[2\] to define new categories, generating an exponential number of categories (and hence rules) with respect to the number of features.</Paragraph> <Paragraph position="4"> extremely powerful and compact formalism for representing a language, similar to the earlier ATN formalisms \[1\]. An example of how an APSG can encode a large subset of English is the DIAGRAM grammar \[9\].</Paragraph> <Paragraph position="5"> It is unfortunately the very power of APSGs (and ATNs) that makes it difficult to capture linguistic generalizations within these formalisms. Metarules for transforming one annotated phrase-structure rule into another must not only transform the phrase structure, but also the procedures that operate on feature sets, in an appropriate way. Because the transformation of procedures is notoriously difficult,* one of the tasks of this paper will be to illustrate a declarative notation describing operations on feature sets that is powerful enough to encode the manipulations of features necessary for the grammar, but is still simple enough for metarules to transform.</Paragraph> <Paragraph position="6"> 4. Notation Every rule of the APSG has three parts: 1. A phrase-structure rule; 2. A restriction set (RSET) that restricts the applicability of the rule, and 3. An assignment set (ASET) that assigns values to features.</Paragraph> <Paragraph position="7"> The RSET and ASET manipulate features of the phrase marker analyzed by the rule; they are discussed below in detail. Phrase-structure rules are written as: CAT -> C1 C2 ... Cn where CAT is the dominating category of the phrase, and C1 through Cn are its immediate constituent categories. Terminal strings can be included in the rule by enclosing them in double quote marks.</Paragraph> <Paragraph position="8"> A feature set is associated with each node in the parse tree that is created when a string is analyzed by the grammar. 
Each feature has a name (a string of uppercase alphanumeric characters) and an associated value. The values a feature can take on (the domain of the feature) are, in general, arbitrary. One of the most useful domains is the set {+,-,NIL}, where NIL is the unmarked case; this domain corresponds to the binary features used in \[2\]. More complicated domains can be used; for example, a CASE feature might have as its domain the set of tuples {<1 SG>, <2 SG>, <3 SG>, <1 PL>, <2 PL>, <3 PL>}. Most interesting are those features whose domain is a phrase marker. Since phrase markers are just data structures that the parser creates, they can be assigned as the value of a feature. This technique is used to pass phrase markers to various parts of the tree to reflect the grammatical and semantic structure of the input; examples will be given in later sections.</Paragraph> <Paragraph position="9"> We adopt the following conventions in referring to features and their values: - Features are one-place functions that range over phrase markers constructed by the phrase-structure part of a grammar rule. The function is named by the feature name.</Paragraph> <Paragraph position="10"> - These functions are represented in prefix form, e.g., (CASE NP) refers to the CASE feature of the NP constituent of a phrase marker. 
In cases where there is more than one constituent with the same category name, they will be differentiated by a &quot;#&quot; suffix, for example, VP -> V NP#1 NP#2</Paragraph> <Paragraph position="11"> has two NP constituents. *It is sometimes hard even to understand what it is that a procedure does, since it may involve recursion, side-effects, and other complications.</Paragraph> <Paragraph position="12"> - A phrase marker is assumed to have its immediate constituents as features under their category name, e.g., (N NP) refers to the N constituent of the NP.</Paragraph> <Paragraph position="13"> - Feature functions may be nested, e.g., (CASE (N NP)) refers to the CASE feature of the N constituent of the NP phrase marker. For these nestings, we adopt the simpler notation (CASE N NP), which is assumed to be right-associative.</Paragraph> <Paragraph position="14"> - The value NIL always implies the unmarked case.</Paragraph> <Paragraph position="15"> At times it will be useful to consider features that are not explicitly attached to a phrase marker as being present with value NIL.</Paragraph> <Paragraph position="16"> - A constant term will be written with a preceding single quote mark, e.g., 'SG refers to the constant token SG.</Paragraph> <Section position="1" start_page="0" end_page="44" type="sub_section"> <SectionTitle> 4.1. Restrictions </SectionTitle> <Paragraph position="0"> The RSET of a rule restricts the applicability of the rule by a predication on the features of its constituents. The phrase markers used as constituents must satisfy the predications in the RSET before they will be analyzed by the rule to create a new phrase marker. 
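The feature conventions above (constituents stored as features under their category names, nested feature functions written right-associatively) can be made concrete with a small sketch. This is a hypothetical Python illustration, not the paper's LISP implementation; the class and helper names are invented:

```python
# Hypothetical sketch: a phrase marker as a category plus a feature set.
# Constituents are stored as features under their category names, so the
# notation (CASE N NP) becomes a right-associative chain of lookups.

class PhraseMarker:
    def __init__(self, cat, **features):
        self.cat = cat
        self.features = dict(features)

    def get(self, name):
        # A missing feature reads as None, modeling the unmarked case NIL.
        return self.features.get(name)

def feature(*path):
    """Model (CASE N NP): apply the rightmost name first, so that
    feature('CASE', 'N', 'NP')(s) computes CASE(N(NP(s)))."""
    def access(pm):
        for name in reversed(path):
            if pm is None:
                return None       # NIL propagates through the chain
            pm = pm.get(name)
        return pm
    return access

# (CASE N NP) applied to an S phrase marker containing an NP:
n = PhraseMarker('N', CASE='ACC', NBR='SG')
s = PhraseMarker('S', NP=PhraseMarker('NP', N=n))
case_of_n = feature('CASE', 'N', 'NP')(s)   # the CASE feature of the N of the NP
```

The sketch also reflects the convention that absent features are present with value NIL: any lookup that misses simply returns None.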
The most useful predicate is equality: a feature can take on only one particular value to be acceptable.</Paragraph> <Paragraph position="1"> For example, in the phrase structure rule:</Paragraph> <Paragraph position="3"> number agreement could be enforced by the predication:</Paragraph> <Paragraph position="5"> where NBR is a feature whose domain is {SG,PL}.* This would restrict the NBR feature on NP to agree with that on VP before the S phrase was constructed. The economy of the APSG encoding is seen here: only a single phrase-structure rule is required. Also, the linguistic requirement that subjects and their verbs agree in number is enforced by a single statement, rather than being implicit in separate phrase-structure rules, one for singular subject-verb combinations, another for plurals.</Paragraph> <Paragraph position="6"> Besides equality, there are only three additional predications: inequality (≠), set membership (∈), and set non-membership (∉). The last two are useful in dealing with non-binary domains. As discussed in the next section, tight restrictions on predications are necessary if metarules are to be successful in transforming grammar rules. Whether these four predicates are adequate in descriptive power for the grammar we contemplate remains an open empirical question; we are currently accumulating evidence for their sufficiency by rewriting DIAGRAM using just those predicates.</Paragraph> <Paragraph position="7"> Restriction predications for a rule are collected in the RSET of that rule. All restrictions must hold for the rule to be applicable. As an illustration, consider the subcategorization rule for ditransitive verbs with prepositional objects (e.g., &quot;John gave a book to Mary&quot;):</Paragraph> <Paragraph position="9"> The first restriction selects only verbs that are marked as ditransitive; the TRANS feature comes from the lexical entry of the verb. 
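How an RSET of such predications might be checked before a rule is allowed to apply can be sketched as follows. This is a hypothetical Python illustration under invented names, with constituents reduced to plain dicts of feature values; it is not the paper's implementation:

```python
# Hypothetical sketch: an RSET as a list of (feature-ref, predicate, operand)
# triples, checked against candidate constituents before a rule builds a
# new phrase marker. A feature-ref is a (category, feature) pair; an operand
# is a constant, a set, or another feature-ref.

EQ, NEQ, MEMBER, NOT_MEMBER = '=', '#', 'in', 'not-in'

def check_rset(rset, constituents):
    """True iff every predication holds; a missing feature reads as NIL."""
    def deref(ref):
        cat, feat = ref
        return constituents[cat].get(feat)
    for ref, op, operand in rset:
        left = deref(ref)
        right = deref(operand) if isinstance(operand, tuple) else operand
        if op == EQ and left != right: return False
        if op == NEQ and left == right: return False
        if op == MEMBER and left not in right: return False
        if op == NOT_MEMBER and left in right: return False
    return True

# Number agreement for S -> NP VP, i.e. RSET: (NBR NP) = (NBR VP)
agreement = [(('NP', 'NBR'), EQ, ('VP', 'NBR'))]

# The ditransitive subcategorization rule sketched in the text: the verb must
# be marked ditransitive, and its PREP feature must match the PP's preposition.
ditransitive = [(('V', 'TRANS'), EQ, 'DITRANS'),
                (('V', 'PREP'), EQ, ('PP', 'PREP'))]
```

A rule applies only if `check_rset` returns True for all of its predications, mirroring the requirement that all restrictions in the RSET must hold.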
Ditransitive verbs with prepositional arguments are always subcategorized by the particular preposition used, e.g., &quot;give&quot; always uses &quot;to&quot; for its prepositional argument. *How NP and VP categories could &quot;inherit&quot; the NBR feature from their N and V constituents is discussed in the next section.</Paragraph> <Paragraph position="10"> The second predication restricts the preposition of the PP for a given verb. The PREP feature of the verb comes from its lexical entry, and must match the preposition of the PP phrase.*</Paragraph> </Section> <Section position="2" start_page="44" end_page="44" type="sub_section"> <SectionTitle> 4.2. Assignments </SectionTitle> <Paragraph position="0"> A rule will normally assign features to the dominating node of the phrase marker it constructs, based on the values of the constituents' features. For example, feature inheritance takes place in this way. Assume there is a feature NBR marking the syntactic number of nouns. Then the ASET of a rule for noun phrases might be:</Paragraph> <Paragraph position="2"> ASET: (NBR NP) := (NBR N) This notation is somewhat non-standard; it says that the value of the NBR function on the NP phrase marker is to be the value of the NBR function of the N phrase marker.</Paragraph> <Paragraph position="3"> An interesting application of feature assignment is to describe the grammatical functions of noun phrases within a clause. Recall that the domain of features can be constituents themselves. Adding an ASET describing the grammatical function of its constituents to the ditransitive VP rule yields the following:</Paragraph> <Paragraph position="5"> This ASET assigns the DIROBJ (direct object) feature of VP the value of the constituent NP. Similarly, the value of INDOBJ (indirect object) is the NP constituent of the PP phrase.</Paragraph> <Paragraph position="6"> A rule may also assign feature values to the constituents of the phrase marker it constructs. 
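Both uses of the ASET (copying a value for feature inheritance, and assigning a whole constituent as the value of a grammatical-role feature) can be sketched together. This is a hypothetical Python illustration with invented helper names, with phrase markers again reduced to dicts:

```python
# Hypothetical sketch: an ASET as (target-feature, source) pairs applied
# when a rule builds its dominating node. A (category, feature) source
# copies a value (feature inheritance); a bare category name assigns the
# whole constituent, so a role like DIROBJ can hold a phrase marker itself.

def apply_aset(aset, node, constituents):
    for target, source in aset:
        if isinstance(source, tuple):
            cat, feat = source
            node[target] = constituents[cat].get(feat)   # copy a feature value
        else:
            node[target] = constituents[source]          # assign the constituent
    return node

# NP -> DET N with ASET: (NBR NP) := (NBR N), inheriting number from the noun:
np = apply_aset([('NBR', ('N', 'NBR'))], {}, {'DET': {}, 'N': {'NBR': 'SG'}})

# The ditransitive VP rule's ASET from the text: the NP constituent becomes
# the direct object, and the NP inside the PP becomes the indirect object.
vp = apply_aset([('DIROBJ', 'NP'), ('INDOBJ', ('PP', 'NP'))],
                {}, {'V': {}, 'NP': np, 'PP': {'NP': {'NBR': 'SG'}}})
```

Because the direct-object feature holds the NP phrase marker itself rather than a copy of some value, later rules (such as the S rule passing SUBJ down to the VP) can hand the same structure around the tree.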
Such assignments are context-sensitive, because the values are based on the context in which the constituent occurs.** Again, the most interesting use of this technique is in assigning functional roles to constituents in particular phrases. Consider a rule for main clauses:</Paragraph> <Paragraph position="8"> ASET: (SUBJ VP) := (NP S), The three features SUBJ, DIROBJ, and INDOBJ of the VP phrase marker will have as value the appropriate NP phrase markers, since the DIROBJ and INDOBJ features will be assigned to the VP phrase marker when it is constructed. Thus the grammatical function of the NPs has been identified by assigning features appropriately.</Paragraph> <Paragraph position="9"> Finally, note that the grammatical functions were assigned to the VP phrase marker. By assembling all of the arguments at this level, it is possible to account for bounded deletion phenomena that are lexically controlled. Consider subcategorization for Equi verbs, in which the subject of the main clause has been deleted from the infinitive complement (&quot;John wants to go&quot;): *Note that we are not considering here prepositional phrases that are essentially meta-arguments to the verb, dealing with time, place, and the like. The prepositions used for meta-arguments are much more variable, and usually depend on semantic considerations.</Paragraph> <Paragraph position="10"> **The assignment of features to constituents presents some computational problems, since a context-free parser will no longer be sufficient to analyze strings. 
This was recognized in the original version of APSGs \[7\], and a two-pass parser was constructed that first uses the context-free component of the grammar to produce an initial parse tree, then adds the assignment of features in context.</Paragraph> </Section> </Section> <Section position="4" start_page="44" end_page="44" type="metho"> <SectionTitle> VP -> V INF ASET: (SUBJ INF) := (SUBJ VP) </SectionTitle> <Paragraph position="0"> Here the subject NP of the main clause has been passed down to the VP (by the S rule), which in turn passes it to the infinitive as its subject. Not all linguistic phenomena can be formulated so easily with APSGs; in particular, APSGs have trouble describing unbounded deletion and conjunction reduction. Metarule formulations for the latter phenomena have been proposed in \[5\], and we will not deal with them here.</Paragraph> </Section> <Section position="5" start_page="44" end_page="44" type="metho"> <SectionTitle> 5. Metarules for APSGs </SectionTitle> <Paragraph position="0"> Metarules consist of two parts: a match template with variables whose purpose is to match existing grammar rules; and an instantiation template that produces a new grammar rule by using the match template's variable bindings after a successful match. Initially, a basic set of grammar rules is input; metarules derive new rules, which then can recursively be used as input to the metarules. When (if) the process halts, the new set of rules, together with the basic rules, comprises the grammar.</Paragraph> <Paragraph position="1"> We will use the following notation for metarules:</Paragraph> <Paragraph position="3"> where MF is a matching form, IF is an instantiation form, and CSET is a set of predications. Both the MF and IF have the same form as grammar rules, but in addition, they can contain variables. When an MF is matched against a grammar rule, these variables are bound to different parts of the rule if the match succeeds. 
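The match-and-instantiate cycle just described can be sketched for the phrase-structure part of a rule (a full treatment would also transform the RSET and ASET). This is a hypothetical Python illustration; the variable syntax `?X` and the passive-like metarule are invented for the example:

```python
# Hypothetical sketch: a rule is an (lhs, rhs-tuple) pair. A matching form
# (MF) may contain variables (strings starting with '?'); matching binds
# them, and the instantiation form (IF) is filled in with the bindings.

def match(pattern, rule):
    """Return variable bindings if the MF matches the rule, else None."""
    bindings = {}
    (plhs, prhs), (rlhs, rrhs) = pattern, rule
    if len(prhs) != len(rrhs):
        return None
    for p, r in [(plhs, rlhs)] + list(zip(prhs, rrhs)):
        if p.startswith('?'):
            if bindings.setdefault(p, r) != r:   # same variable, same value
                return None
        elif p != r:
            return None
    return bindings

def instantiate(form, bindings):
    """Fill the IF's variables with the bindings to produce a new rule."""
    lhs, rhs = form
    fill = lambda s: bindings.get(s, s)
    return (fill(lhs), tuple(fill(s) for s in rhs))

# A passive-like derivation metarule:  VP -> V NP ?X  =>  VP -> V-PASSIVE ?X "by" NP
mf = ('VP', ('V', 'NP', '?X'))
if_ = ('VP', ('V-PASSIVE', '?X', '"by"', 'NP'))

bindings = match(mf, ('VP', ('V', 'NP', 'PP')))   # binds ?X to PP
new_rule = instantiate(if_, bindings)
```

Applying the metarule to the ditransitive rule VP -> V NP PP yields a derived rule VP -> V-PASSIVE PP "by" NP; a CSET would further constrain which bindings are admissible, in the same way an RSET constrains a grammar rule.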
The IF is instantiated with these bindings to produce a new rule. To restrict the application of metarules, additional conditions on the variable bindings may be specified (CSET); these have the same form as the RSET of grammar rules, but they can mention the variables matched by the MF.</Paragraph> <Paragraph position="4"> Metarules may be classified into three types: 1. Introductory metarules, where the MF is empty (=> IF). These metarules introduce a class of grammar rules.</Paragraph> <Paragraph position="5"> 2. Deletion metarules, where the IF is empty (MF =>). These delete any derived grammar rules that they match.</Paragraph> <Paragraph position="6"> 3. Derivation metarules, where both MF and IF are present. These derive new grammar rules from old ones.</Paragraph> <Paragraph position="7"> There are linguistic generalizations that can be captured most perspicuously by each of the three forms. We will focus on derivation metarules here, since they are the most complicated.</Paragraph> </Section> </Paper>