<?xml version="1.0" standalone="yes"?> <Paper uid="P96-1027"> <Title>Shieber, S. (1988). A Uniform Architecture for Parsing</Title> <Section position="4" start_page="200" end_page="201" type="metho"> <SectionTitle> 3 The Algorithm Schema </SectionTitle> <Paragraph position="0"> The entries in (2), with their variables suitably instantiated, become the initial entries of an agenda and we begin to move them to the chart in accordance with the algorithm schema, say in the order given.</Paragraph> <Paragraph position="1"> The variables in the 'Cat' and 'Semantics' columns of (2) provide the essential link between syntax and semantics. The predicates that represent the semantics of a phrase will simply be the union of those representing the constituents.</Paragraph> <Paragraph position="2"> The rules that sanction a phrase (e.g. (3) below) show which variables from the two parts are to be identified.</Paragraph> <Paragraph position="3"> When the entry for John is moved, no interactions are possible because the chart is empty. When ran is moved, the sequence John ran is considered as a possible phrase on the basis of rule (3).</Paragraph> <Paragraph position="4"> (3) s(x) → np(y), vp(x, y).</Paragraph> <Paragraph position="5"> With appropriate replacements for variables, this maps onto the subset (4) of the original semantic specification in (1). (4) r: run(r), past(r), arg1(r, j), name(j, John) Furthermore, it is a complete sentence. However, it does not count as an output to the generation process as a whole because it subsumes some but not all of (1).
It therefore simply becomes a new edge on the agenda.</Paragraph> <Paragraph position="6"> The string ran fast constitutes a verb phrase by virtue of rule (5) giving the semantics (6), and the phrase ran quickly with the same semantics is put on the agenda when the quickly edge is moved to the chart.</Paragraph> <Paragraph position="7"> Assuming that adverbs modify verb phrases and not sentences, there will be no interactions when the John ran edge is moved to the chart.</Paragraph> <Paragraph position="8"> When the edge for ran fast is moved, the possibility arises of creating the phrase ran fast quickly as well as ran fast fast. Both are rejected, however, on the grounds that they would involve using a predicate from the original semantic specification more than once. This would be similar to allowing a given word to be covered by overlapping phrases in free word-order parsing. We proposed eliminating this by means of a bit vector and the same technique applies here. The fruitful interactions that occur here are between ran fast and ran quickly on the one hand, and John on the other. Both give sentences whose semantics subsumes the entire input.</Paragraph> <Paragraph position="9"> Several things are noteworthy about the process just outlined.</Paragraph> <Paragraph position="10"> 1. Nothing turns on the fact that it uses a primitive version of event semantics. A scheme in which the indices were handles referring to subexpressions in any variety of flat semantics could have been treated in the same way. Indeed, more conventional formalisms with richly recursive syntax could be converted to this form on the fly.</Paragraph> <Paragraph position="11"> 2. Because all our rules are binary, we make no use of active edges.</Paragraph> <Paragraph position="12"> 3. While it fits the conception of chart parsing given at the beginning of this paper, our generator does not involve string positions centrally in the chart representation.
In this respect, it differs from the proposal of Shieber (1988), which starts with all word edges leaving and entering a single vertex. But there is essentially no information in such a representation. Neither the chart nor any other special data structure is required to capture the fact that a new phrase may be constructible out of any given pair, and in either order, if they meet certain syntactic and semantic criteria.</Paragraph> <Paragraph position="13"> 4. Interactions must be considered explicitly between new edges and all edges currently in the chart, because no indexing is used to identify the existing edges that could interact with a given new one.</Paragraph> <Paragraph position="14"> 5. The process is exponential in the worst case because, if a sentence contains a word with k modifiers, then a version of it will be generated with each of the 2^k subsets of those modifiers, all but one of them being rejected when it is finally discovered that their semantics does not subsume the entire input. If the relative orders of the modifiers are unconstrained, matters only get worse.</Paragraph> <Paragraph position="15"> Points 4 and 5 are serious flaws in our scheme for which we shall describe remedies. Point 2 will have some importance for us because it will turn out that the indexing scheme we propose will require the use of distinct active and inactive edges, even when the rules are all binary.
We take up the complexity issue first, and then turn to how the efficiency of the generation chart might be enhanced through indexing.</Paragraph> </Section> <Section position="5" start_page="201" end_page="201" type="metho"> <SectionTitle> 4 Internal and External Indices </SectionTitle> <Paragraph position="0"> The exponential factor in the computational complexity of our generation algorithm is apparent in an example like (8).</Paragraph> <Paragraph position="1"> (8) Newspaper reports said the tall young Polish athlete ran fast. The same set of predicates that generates this sentence clearly also generates the same sentence with deletion of all subsets of the words tall, young, and Polish, for a total of 8 strings. Each is generated in its entirety, though finally rejected because it fails to account for all of the semantic material. The words newspaper and fast can also be deleted independently, giving a grand total of 32 strings.</Paragraph> <Paragraph position="2"> We concentrate on the phrase tall young Polish athlete, which we assumed would be combined with the verb phrase ran fast by the rule (3). The distinguished index of the noun phrase, call it p, is identified with the variable y in the rule, but this variable is not associated with the syntactic category, s, on the left-hand side of the rule. The grammar has access to indices only through the variables that annotate grammatical categories in its rules, so that rules that incorporate this sentence into larger phrases can have no further access to the index p. We therefore say that p is internal to the sentence the tall young Polish athlete ran fast.</Paragraph> <Paragraph position="3"> The index p would, of course, also be internal to the sentences the young Polish athlete ran fast, the tall Polish athlete ran fast, etc.
However, in these cases, the semantic material remaining to be expressed contains predicates that refer to this internal index, say 'tall(p)', and 'young(p)'.</Paragraph> <Paragraph position="4"> While the lexicon may have words to express these predicates, the grammar has no way of associating their referents with the above noun phrases because the variables corresponding to those referents are internal. We conclude that, as a matter of principle, no edge should be constructed if the result of doing so would be to make internal an index occurring in part of the input semantics that the new phrase does not subsume. In other words, the semantics of a phrase must contain all predicates from the input specification that refer to any indices internal to it. This strategy does not prevent the generation of an exponential number of variants of phrases containing modifiers. It limits proliferation of the ill effects, however, by allowing only the maximal one to be incorporated in larger phrases. In other words, if the final result has phrases with m and n modifiers respectively, then 2^m versions of the first and 2^n of the second will be created, but only one of each set will be incorporated into larger phrases and no factor of 2^(n+m) will be introduced into the cost of the process.</Paragraph> </Section> </Paper>