<?xml version="1.0" standalone="yes"?> <Paper uid="C92-3127"> <Title>A HYBRID SYSTEM FOR QUANTIFIER SCOPING</Title> <Section position="3" start_page="0" end_page="0" type="evalu"> <SectionTitle> 2. Implementation </SectionTitle> <Paragraph position="0"> A hybrid scoping system has been fully implemented as part of the PRC Adaptive</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Knowledge-Based Text Understanding System </SectionTitle> <Paragraph position="0"> (Loatman et al. 1986). Figure 1 shows the basic organization of the PAKTUS scoping module (PSM). I will describe input/output, the database, and the scoping algorithm in turn.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Input/Output </SectionTitle> <Paragraph position="0"> Given a parse tree, PSM returns a list of the preferred scope orders of the quantified phrases.</Paragraph> <Paragraph position="1"> No degree of preference is computed. A scope order is represented by an ordered list of the phrases, not by a logical form.</Paragraph> <Paragraph position="2"> Though eventually there will be translation to logical form, there is good reason for delaying this until after the scope determination. The problem with systems that translate a parse tree into an &quot;unscoped&quot; logical form as input to the scoping module (e.g. Hobbs and Shieber 1987) is that syntactic influences are not discernible to the module, since logical structure is not syntactic structure. For example, &quot;Every teacher who is at some high school joined the union&quot; and &quot;Every teacher at some high school joined the union&quot; have the same unscoped logical form; for Hobbs and Shieber it is joined-union(&lt;every t and(teacher(t), at(t, &lt;some h high-school(h)&gt;))&gt;). So the different syntactic influences are invisible.</Paragraph> <Paragraph position="3"> Though syntactic input can of course be added (e.g. 
Hurum 1988), doing so amounts to an admission that the translation was premature. It is more efficient to have the input to the module consist just of the parse, postponing the translation to logical form until after the scoping determination. Thus, the translator (not yet implemented) is not part of PSM.</Paragraph> <Paragraph position="4"> ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 861 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 2.2. Database PSM encodes a function f_lex-syn defined for 26 quantifier elements, including 9 quantificational adverbs such as &quot;always&quot;, and 49 syntactic environments. There are three &quot;vertical&quot; environments - embedded PP, reduced and full relatives - and 46 &quot;horizontal&quot; environments, where a horizontal environment is defined by a combination of grammatical roles, voice, and/or various ordering relations.</Paragraph> <Paragraph position="5"> Defining the mapping from a conjunction of quantifier pair and environment to a prescribed scope order for the over 9000 mathematically and syntactically possible conjunctions admittedly is a daunting task. This may be the main reason to prefer an IS approach. But while the required research effort has been lengthy and tedious, it has paid dividends in a body of data (150 pages, described in the appendix of Chien 1992), which subsumes existing consensus on lexical and syntactic scoping influences while going deeper and beyond. However, the corpus is naturally subject to continual correction and extension, and while this upgrading can be accommodated, the process is not modular. It seems to me that this is the tradeoff for the hybrid's greater precision. Database implementation was motivated by the desire to make access to the large volume of data as efficient as possible. There are three levels of data objects. 
The first, top-level, object has slots corresponding to pairings of grammatical roles (subject, direct object, etc.; for their relevance to scope, see Ioup 1975). In each slot are pointers to several second-level objects, called &quot;rule groups&quot;. In these, a &quot;conditions&quot; slot contains procedures which test for syntactic properties such as voice and linear ordering, and another slot contains pointers to third-level objects called &quot;rules&quot;. In these, a conditions slot contains procedures to test for the lexical identity of a quantifier pair, and an &quot;actions&quot; slot contains procedures which effect a scope preference.</Paragraph> <Paragraph position="6"> Thus the latter procedures are invoked only after the collective syntactic and lexical properties of the input are verified. But checking the conditions in stages via the object hierarchy permits large aggregates of data to be eliminated from consideration at each stage. Data objects of all levels total about 325, including a second top-level object for vertical relations, cf. 2.3 below. Database organization is illustrated in Figure 2. If a direct object and adverbial in a clause are quantified, the rule groups in the appropriate slot of RULEGRPS are tested. If in addition the clause is passive and the adverbial immediately precedes the main verb, then RULEGRP25 is activated and its rules tested. If, finally, the direct object is quantified by &quot;some&quot; and the adverbial is a &quot;monotone decreasing&quot; quantifier such as &quot;never&quot;, &quot;seldom&quot;, or &quot;rarely&quot; (Barwise and Cooper 1981) then RULE112 is activated and the procedure &quot;setparams&quot; invoked. The effect of this - in the context of the algorithm explained in the next section - is to register a preference for the object to scope over the adverbial, as e.g. in &quot;He was seldom seen by some agent.&quot; (The alternative scoping is awkward, better expressed with polarity-sensitive &quot;any&quot; replacing &quot;some&quot;; for the treatment of &quot;any&quot;, see Chien 1991.) 
It should not be thought that a hybrid system cannot exploit generalizations in the data. PSM can and must do so, for even with a structured database, search would be relatively slow if there were as many actual data structures as abstract data points (i.e. values of f_lex-syn). But in fact each rule represents a cluster of like points, grouped together by quantifier categories - e.g.</Paragraph> <Paragraph position="7"> &quot;decr&quot; in RULE112, or the category of universal quantifiers - by boolean combination, or by other generalization, thus gaining economies in the database. To illustrate generalization by syntactic information alone, consider the verb objects in &quot;He sent a firm each invoice&quot;: they appear to scope in order regardless of how they are quantified.</Paragraph> <Paragraph position="8"> To capture this phenomenon, the relevant rule registers a preference without checking for the lexical identity of the quantifiers. Note that this strategy subsumes cases which in an IS system would be handled by an overriding specialist, i.e.</Paragraph> <Paragraph position="9"> a specialist f_o such that f(..., f_o(x), ...) = f_o(x). In such cases IS is not problematic, but hybridization is equally straightforward.</Paragraph> <Paragraph position="10"> A generalization can also be based on syntactic information together with partial lexical information, i.e. one quantifier only. It appears e.g. that &quot;sometimes&quot; in preverbal position always scopes over a direct object, as in &quot;She sometimes polishes each trophy&quot;, regardless of how the object is quantified. To implement this, the rule group that looks for this configuration of adverbial and direct object has in its rules slot a rule whose condition for firing is only that the adverbial is &quot;sometimes&quot;. 
Here is a generalization over the data points f_lex-syn(sometimes, x, e), for all NP quantifiers x, where e is this syntactic configuration. Note that the organization of the database precludes an overriding determination based on lexical information alone, since syntax must always be checked first. But I am unaware of any lexical preferences which are exceptionless across syntactic environments.</Paragraph> <Paragraph position="11"> The number of rules is further reduced by the use of a default preference: PSM initially assumes scope order to match linear (&quot;natural&quot;) order. This enables the elimination of rules prescribing natural order, unless the preference is very strong in that it cannot be undone by any conflicting preference in a sentence with more than two quantifiers. This is explained below.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3. Scoping Algorithm </SectionTitle> <Paragraph position="0"> PSM determines the scope order only of quantifiers all of which are horizontally related, or all of which are vertically related (as in Epstein 1988). So, for &quot;Every athlete who took some steroids won a race&quot; the system scopes &quot;every athlete&quot; and &quot;some steroids&quot;, likewise &quot;every athlete&quot; and &quot;a race&quot;; then the scoping of &quot;some steroids&quot; and &quot;a race&quot; is treated as already indirectly determined.</Paragraph> <Paragraph position="1"> The top-level scoping procedure calls the horizontal scoping procedure (H-SCOPE) for the top-level clause of the parsed input. It then substitutes, for each top-level NP in each of the resulting scope orders, an order returned by the vertical scoping procedure (V-SCOPE) for that NP. V-SCOPE simply returns its argument NP unless it has an embedded NP. 
The recognized vertical relations are embedded PP, relative clause, and reduced relative (or any combination).</Paragraph> <Paragraph position="2"> Van Lehn's &quot;embedding hierarchy&quot; (van Lehn 1978) - in which these relations induce inverse scope order, natural order, and ambiguity, respectively - is subsumed by the preferences in the database, which capture the variation of hierarchy preferences as quantifiers vary.</Paragraph> <Paragraph position="3"> For sentences with two quantifiers, H-SCOPE basically just does a lookup. But for more than two, it is non-trivial to determine an overall order from a set of pairwise orders. H-SCOPE first assumes the default natural order and initializes a &quot;record of imposed orders&quot; (RIO). This is a list of quantifier pairs, registering the prescriptions which have been followed to date in a given order; it ensures that they will not be later undone. RIO is initialized with strong natural orders, i.e. naturally ordered pairs which must stay that way. The main body of H-SCOPE is a loop through the applicable rule groups, then a loop through a group's rules. If a rule fires, it sets one quantifier to L(eft), the other to R(ight). How this prescription is realized depends on the overall order under consideration, and on RIO. If e.g. L does not already precede R, R may be postposed to L or L may be preposed to R, nonequivalent options if L and R are not contiguous in the order; an option is not pursued if it undoes a pairwise order in RIO. Resultant new overall orders either replace or supplement the original, the former if the rule prefers the inverse pairwise order to the natural, the latter if the preferences are equal. 
The results are then each operated on by the next applicable rule.</Paragraph> <Paragraph position="4"> For &quot;A person in each house on both streets saw several men who were robbing some banks&quot;, PSM returns [both each a several some] in 0.7 seconds (Macintosh IIx Common Lisp 2.0, scoping time only). &quot;Rarely did a park supervisor serving several districts in two counties assign everyone many trees with no large branches on some limb which might fall on a passerby&quot; gets 4 scopings, all with &quot;rarely&quot; widest and &quot;a passerby&quot; narrowest, in 1.283 seconds.</Paragraph> </Section> </Section> </Paper>