<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2106">
  <Title>Modularizing Codescriptive Grammars for Efficient Parsing*</Title>
  <Section position="3" start_page="628" end_page="629" type="metho">
    <SectionTitle>
2 The Architecture
</SectionTitle>
    <Paragraph position="0"> The most important aspect for the distribution of analysis tasks and for defining modes of interaction is that one of the processes must work as a filter on the input word lattices, reducing the search space. The other component then works only with successful analysis results of the previous one. This means that one parser is in control over the other, whereas the latter is not directly exposed to the input. For reasons which will become obvious below, we will call the first of these parsers the SYN-parser, and the second one, controlled by the SYN-parser, the SEM-parser.</Paragraph>
    <Paragraph position="1"> Another consideration to be taken into account is that the analysis should be incremental and time-synchronous. This implies that the SYN-parser should not send its results only when it is completely finished, thus forcing the SEM-parser to wait. Interactivity is another aspect we had to consider. The SEM-parser must be able to report back to the SYN-parser at least when its hypotheses failed. This would not be possible if the SEM-parser had to wait until the SYN-parser is finished. This requirement also constrains the exchange of messages.</Paragraph>
    <Paragraph position="2"> Incrementality and interactivity imply a steady exchange of messages between the parsers. An important consideration then is that the overhead for this communication should not outweigh the gains of distributed processing. This rules out that the parsers communicate by exchanging their analysis results in terms of resulting feature structures, since it would imply that on each communication event the parsers would have to analyze the structures to detect changes, whether a structure is part of other already known structures, etc. It is hard to see how this kind of communication could be interleaved with normal parsing activity in efficient ways.</Paragraph>
    <Paragraph position="3"> In contrast to this, our approach allows us to exploit the fact that the grammars employed by the parsers are derived from the same grammar and are thereby &amp;quot;similar&amp;quot; in structure. This makes it possible to restrict the communication between the parsers to information about which rules were successfully or unsuccessfully applied. Each parser can then reconstruct on its side the state the other parser is in, i.e., how its chart or analysis tree looks.¹ ¹ Another problem in incremental processing is that it is not known in advance when an utterance is finished or a new utterance starts. To deal with this, prosodic information is taken into account; see (Kasper and Krieger, 1996) for more details.</Paragraph>
    <Paragraph position="4"> Both parsers try to maintain or arrive at isomorphic charts. With this approach the parsers never need to exchange analysis results in terms of structures, as each parser should always be able to reconstruct them if necessary. On the other hand, this reconstructibility poses constraints on how the codescriptive grammar can be split up into subgrammars.</Paragraph>
    <Paragraph position="5"> The requirements of incrementality, interactivity and efficient communication show that our approach does not emulate the &amp;quot;description by analysis&amp;quot; methodology in syntax-semantics interfaces on the basis of codescriptive grammars.</Paragraph>
  </Section>
  <Section position="4" start_page="629" end_page="629" type="metho">
    <SectionTitle>
3 The Parsers and the Protocol
</SectionTitle>
    <Paragraph position="0"> The SYN-parser and the SEM-parser are agenda-driven chart parsers. For speech parsing, the nodes represent points in time and the edges represent word hypotheses/paths in the word lattice.</Paragraph>
    <Paragraph position="1"> The parsers communicate by exchanging hypotheses: bottom-up hypotheses from syntax to semantics and top-down hypotheses from semantics to syntax; see (Kasper, Krieger, Spilker, and Weber, 1996) for an in-depth description of the current setup.</Paragraph>
    <Paragraph position="2"> * Bottom-up hypotheses are emitted by the SYN-parser and sent to the SEM-parser. They undergo verification at the semantic level.</Paragraph>
    <Paragraph position="3"> A bottom-up hypothesis describes a passive edge (complete subtree) constructed by the syntax parser and consists of the identifier of the rule instantiation that represents the edge and the completion history of the constructed passive edge. Having passive status is a necessary but not sufficient condition for an edge to be sent as a hypothesis. Whether a hypothesis is sent also depends on other criteria, such as its score.</Paragraph>
    <Paragraph position="4"> * Top-down hypotheses result from activities of the SEM-parser when trying to verify bottom-up hypotheses. To keep the communication effort low, only failures are reported back to the SYN-parser, by simply sending the hypothesis' identifier. This narrows the space of successful hypotheses on the SYN-parser's side (see remarks in Section 4.3.1).</Paragraph>
    <Paragraph position="5"> The central data structure by which synchronization and communication between the parsers is achieved is that of a completion history, containing a record of how a subtree was completed. Basically, it tells us for each edge in the chart which other edges are spanned. The nodes in the chart correspond to points in time and edges to the time intervals spanned. Completion histories are described by the following EBNF: { R&lt;rule-id&gt;&lt;edge-id&gt;&lt;start&gt;&lt;end&gt;{E&lt;edge-id&gt;}* | L&lt;lex-id&gt;&lt;edge-id&gt;&lt;start&gt;&lt;end&gt; }+ where &lt;rule-id&gt;, &lt;lex-id&gt;, &lt;edge-id&gt;, &lt;start&gt;, and &lt;end&gt; are integers. R&lt;rule-id&gt; and L&lt;lex-id&gt; denote rules and lexicon entries, respectively. &lt;edge-id&gt; uniquely identifies a chart edge. Finally, &lt;start&gt; and &lt;end&gt; specify the start/end point of a spanning edge.</Paragraph>
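As a sketch, the completion-history protocol might be rendered as follows in Python. The concrete wire format is an assumption (whitespace-separated tokens are used here for readability; the paper does not specify one), and Completion, encode, and decode are illustrative names, not from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class Completion:
    """One entry of a completion history (an R... or L... item in the EBNF)."""
    kind: str        # "R" (rule application) or "L" (lexicon entry)
    ref_id: int      # <rule-id> or <lex-id>
    edge_id: int     # unique chart-edge identifier
    start: int       # start point (time) of the spanning edge
    end: int         # end point (time) of the spanning edge
    spanned: list = field(default_factory=list)  # {E<edge-id>}* daughters

def encode(entries):
    """Serialize a completion history into a message string."""
    parts = []
    for c in entries:
        parts.append(f"{c.kind}{c.ref_id} {c.edge_id} {c.start} {c.end}")
        parts.extend(f"E{e}" for e in c.spanned)
    return " ".join(parts)

def decode(msg):
    """Parse a message string back into Completion records."""
    toks = msg.split()
    entries, i = [], 0
    while i < len(toks):
        kind, ref_id = toks[i][0], int(toks[i][1:])
        edge_id, start, end = map(int, toks[i + 1:i + 4])
        i += 4
        spanned = []
        while i < len(toks) and toks[i].startswith("E"):
            spanned.append(int(toks[i][1:]))
            i += 1
        entries.append(Completion(kind, ref_id, edge_id, start, end, spanned))
    return entries
```

Because only rule/lexicon identifiers and edge spans cross the wire, each side can rebuild the other's chart without ever shipping feature structures.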
    <Paragraph position="6"> This protocol allows the parsers to efficiently exchange information about the structure of their charts without having to deal with explicit analysis results such as feature structures. Since the SEM-parser does not directly work on linguistic input, there are two possible parsing modes: * Non-autonomous parsing. The parsing process mainly consists of constructing the tree described by the completion history, using the semantic counterparts of the rules which led to a syntactic hypothesis. If this fails, it is reported back to the SYN-parser.</Paragraph>
    <Paragraph position="7"> * Quasi-autonomous parsing. The parser extends the chart on its own through prediction and completion steps. Obviously, this is only possible after some initial information from the SYN-parser, since the SEM-parser is not directly connected to the input word lattice.</Paragraph>
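The non-autonomous mode can be sketched as follows. Entry, sem_rules, and sem_lexicon are hypothetical names, and real unification over typed feature structures is reduced here to plain Python functions that return None on failure:

```python
from collections import namedtuple

# Hypothetical record for one completion-history entry (cf. the EBNF in
# Section 3); kind is "R" or "L", spanned lists the daughter edge ids.
Entry = namedtuple("Entry", "kind ref_id edge_id spanned")

def verify_history(history, sem_rules, sem_lexicon):
    """Replay a completion history with the semantic counterparts of the
    rules; return the ids of edges whose semantic construction failed."""
    chart, failures = {}, []
    for entry in history:
        if entry.kind == "L":                     # lexical edge: look it up
            chart[entry.edge_id] = sem_lexicon[entry.ref_id]
        else:                                     # rule edge: combine daughters
            daughters = [chart[e] for e in entry.spanned]
            result = sem_rules[entry.ref_id](daughters)
            if result is None:                    # semantic unification failed
                failures.append(entry.edge_id)    # becomes a top-down message
            else:
                chart[entry.edge_id] = result
    return failures
```

The returned failure ids are exactly what the top-down channel carries back to the SYN-parser.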
  </Section>
  <Section position="5" start_page="629" end_page="631" type="metho">
    <SectionTitle>
4 Compilation of Subgrammars
</SectionTitle>
    <Paragraph position="0"> In the following, we discuss possible options and problems for the distribution of information in a cospecifying grammar. Our approach raises the question which of the parsers uses what information. This set of information is what we call a subgrammar. These subgrammars are generated from a common source grammar.</Paragraph>
    <Section position="1" start_page="629" end_page="630" type="sub_section">
      <SectionTitle>
4.1 Reducing the Representational
Overhead by Separating Syntax and
Semantics
</SectionTitle>
      <Paragraph position="0"> An obvious choice for splitting up the grammar was to separate the linguistic levels (strata), such as syntax and semantics. This choice was also motivated by the observation that typically the most important constraints on the grammaticality of the input are in the syntactic part, while most of the semantics is purely representational.² A straightforward way to achieve this is by manipulating grammar rules and lexicon entries: for the SYN-parser, we recursively delete the information under the SEM attributes, and similarly clear the SYN attributes to obtain the subgrammar for the SEM-parser. We abbreviate these subgrammars by G-syn and G-sem, and the original grammar by G. This methodology reduces the size of the structures for the SYN-parser to about 30% of the complete structure. ² This must be taken cum grano salis, as it depends on how a specific grammar draws the line between syntax and semantics: selectional constraints, e.g., for verb arguments, are typically part of semantics and are &amp;quot;true&amp;quot; constraints. Also, semantic constraints would have a much larger impact if, for instance, agreement constraints were considered as semantic, too, as (Pollard and Sag, 1994) suggest.</Paragraph>
      <Paragraph position="1"> One disadvantage of this simple approach is that coreferences between syntax and semantics disappear (we call the collection of these common reentrancies the coref skeleton). This might lead to several problems, which we address in Section 4.2. Section 4.3 then discusses possible solutions.</Paragraph>
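The deletion step can be illustrated with feature structures modelled as plain nested dicts; this is a minimal sketch under the simplifying assumption that structures carry no types or reentrancies, and the rule below is an invented example, not from the grammar:

```python
def strip(fs, attr):
    """Recursively delete every occurrence of attribute `attr` from a
    feature structure represented as a nested dict."""
    if not isinstance(fs, dict):
        return fs          # atomic value: nothing to delete
    return {a: strip(v, attr) for a, v in fs.items() if a != attr}

# A toy codescriptive rule with syntactic and semantic information.
rule = {"SYN": {"HEAD": "v", "SUBCAT": []},
        "SEM": {"PRED": "sleep"},
        "DTRS": {"H": {"SYN": {"HEAD": "v"}, "SEM": {"PRED": "sleep"}}}}

g_syn = strip(rule, "SEM")   # subgrammar rule for the SYN-parser
g_sem = strip(rule, "SYN")   # subgrammar rule for the SEM-parser
```

Since dicts cannot share substructure, any reentrancy linking a SYN path to a SEM path would be silently cut by this step, which is precisely the loss of the coref skeleton discussed in the text.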
      <Paragraph position="2"> Another, more sophisticated way to keep the structures small is due to the type expansion mechanism in TDL (Krieger and Schäfer, 1995).</Paragraph>
      <Paragraph position="3"> Instead of destructively modifying the feature structures beforehand, we can employ type expansion to leave SYN or SEM unexpanded. This has the desired effect that we do not lose the coreference constraints and furthermore are free to expand parts of the feature structure afterwards. We will discuss this feature in Section 4.4.</Paragraph>
    </Section>
    <Section position="2" start_page="630" end_page="630" type="sub_section">
      <SectionTitle>
4.2 Problems
</SectionTitle>
      <Paragraph position="0"> Obviously, the major advantage of our method is that unification and copying become faster during processing, due to smaller structures. We can even estimate the speedup in the best case, viz., quasi-linear w.r.t. input structure if only conjunctive structures are used. Clearly, if many disjunctions are involved, the speedup might even be exponential. However, the most important disadvantage of the compilation method is that it no longer guarantees soundness, that is, the subgrammar(s) might accept utterances which are ruled out by the full grammar. This is due to the simple fact that certain constraints have been eliminated in the subgrammars. If at least one such constraint is a filtering constraint, we automatically enlarge the language accepted by this subgrammar w.r.t. the original grammar.</Paragraph>
      <Paragraph position="1"> Clearly, completeness is not affected, since we do not add further constraints to the subgrammars.</Paragraph>
      <Paragraph position="2"> At this point, let us focus on the estimation above, since it is only a best-case forecast. Clearly, the structures become smaller; however, due to the possible decrease of filter constraints, we must expect an increase of hypotheses in the parser. In fact, the experimental results in Section 5 show that our approach has a different impact on the SYN-parser and the SEM-parser (see Figure 2). Our hope here, however, is that the increase of non-determinism inside the parser is compensated by the processing of smaller structures; see (Maxwell III and Kaplan, 1991) for more arguments on this theme.</Paragraph>
      <Paragraph position="3"> In general, even the intersection of the languages accepted by G-syn and G-sem does not yield the language accepted by G; only the weaker relation L(G) ⊆ L(G-syn) ∩ L(G-sem) holds. This behaviour is an outcome of our compilation schema, namely, cutting reentrancy points. Thus, even if an utterance is accepted by G with analysis fs encoded as a feature structure, it might be the case that the unification of the corresponding results for G-syn and G-sem will strictly subsume fs: fs ≺ fs-syn ∧ fs-sem.</Paragraph>
      <Paragraph position="4"> Let us mention further problems. Firstly, termination might change in the case of the subgrammars. Consider a subgrammar which contains empty productions or unary (coercion) rules. Assume that such rules were previously &amp;quot;controlled&amp;quot; by constraints which are no longer present. Obviously, if a parser is not restricted through additional (meta-)constraints, the iterated application of these rules could lead to an infinite computation, i.e., a loop. This was sometimes the case during our experiments. Secondly, recursive rules could introduce infinitely many solutions for a given utterance. Theoretically, this might not pose a problem, since the intersection of two infinite sets of parse trees might be finite. However, in practice this problem might occur.</Paragraph>
    </Section>
    <Section position="3" start_page="630" end_page="631" type="sub_section">
      <SectionTitle>
4.3 Solutions
</SectionTitle>
      <Paragraph position="0"> In this section, we will discuss three solutions to the problems mentioned before.</Paragraph>
      <Paragraph position="1">  Although semantics construction is driven by the speech parser, the use of different subgrammars suggests that the speech parser should also be guided by the SEM-parser. This is achieved by sending back falsified hypotheses. Because hypotheses are uniquely identified in our framework, we need only send the integer that identifies the falsified chart edge. In the SYN-parser, this information might either lead to a true chart revision process or be employed as a filter to narrow the set of emitted bottom-up hypotheses.</Paragraph>
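The filter variant of this idea might look as follows; HypothesisFilter and its method names are invented for illustration:

```python
class HypothesisFilter:
    """Sketch of the SYN-parser side: record falsified edge ids and stop
    emitting any bottom-up hypothesis that builds on a falsified edge."""

    def __init__(self):
        self.falsified = set()

    def receive_top_down(self, edge_id):
        """A top-down message is just the integer id of a falsified edge."""
        self.falsified.add(edge_id)

    def should_emit(self, edge_id, spanned_edge_ids):
        """Emit a passive edge only if neither the edge itself nor any
        edge it spans has been falsified by the SEM-parser."""
        return (edge_id not in self.falsified
                and self.falsified.isdisjoint(spanned_edge_ids))
```

A true chart revision process would additionally retract the falsified edges and everything built on them; the filter above only suppresses further messages.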
      <Paragraph position="2">  In order to guarantee correctness of the analysis, we might unify the results of both parsers with the corresponding coref skeletons at the end of an analysis. We did not pursue this strategy since it introduces an additional processing step during parsing. Instead, as explained above, it is preferable to employ type expansion here, leaving SYN or SEM unexpanded, so that coreferences are preserved. This treatment will be investigated in Section 4.4.</Paragraph>
      <Paragraph position="3">  The most straightforward way to guarantee soundness is simply to employ the full-size grammar in one of the two parsers. This might sound strange, but if one processor basically only verifies hypotheses from the other and does not generate additional hypotheses, the overhead is negligible. We have used this scheme in that the SEM-parser operates on the full-size grammar, whereas the speech parser directly communicates with the word recognizer. This makes sense since the word lattice parser processes an order of magnitude more hypotheses than the SEM-parser; see (Kasper, Krieger, Spilker, and Weber, 1996) for more details. Because the SEM-parser passes its semantic representation to other components, it makes further sense to guarantee total correctness here.</Paragraph>
    </Section>
    <Section position="4" start_page="631" end_page="631" type="sub_section">
      <SectionTitle>
4.4 Improvements
</SectionTitle>
      <Paragraph position="0"> This section investigates several improvements of our compilation approach, solving the problems mentioned before.</Paragraph>
    </Section>
    <Section position="5" start_page="631" end_page="631" type="sub_section">
      <SectionTitle>
4.4.1 Identifying Functional Strata
Manually
</SectionTitle>
      <Paragraph position="0"> Normally, the grammarian &amp;quot;knows&amp;quot; which information needs to be made explicit. Hence, instead of differentiating between the linguistic strata SYN and SEM, we let the linguist identify which constraints filter and which only serve as a means for representation; see also (Shieber, 1985). In contrast to the separation along linguistic levels, this approach adopts a functional view, cutting across linguistic strata. On this view, the syntactic constraints together with, e.g., semantic selection constraints would constitute a subgrammar.</Paragraph>
      <Paragraph position="1">  In case the grammarian is unaware of these constraints, it is at least possible to determine them relative to a training corpus, simply by counting unifications. Features that occur only once on top of the input feature structures do not specialize the information in the resulting structure (more precisely, the values of these features do not). Furthermore, unrestricted features (value T) do not constrain the result. An example of this kind indicates that only the path A needs to be made explicit, since its value is more specific than the corresponding input values: say ≺ s and say ≺ v.</Paragraph>
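A toy version of this corpus-based method, with an invented three-type hierarchy standing in for the grammar's type lattice and invented record shapes for the logged unifications:

```python
# Invented toy hierarchy: "say" is more specific than both "s" and "v".
SUBTYPES = {"say": {"s", "v", "top"}, "s": {"top"}, "v": {"top"}}

def more_specific(a, b):
    """True iff type a is strictly more specific than type b."""
    return b in SUBTYPES.get(a, set())

def filtering_paths(corpus):
    """corpus: iterable of (path, input_values, result_value) records
    logged from unifications over a training corpus. Count, per feature
    path, how often the result was strictly more specific than an input
    value; only such paths act as filters and must be made explicit."""
    counts = {}
    for path, inputs, result in corpus:
        if result != "top" and any(more_specific(result, i) for i in inputs):
            counts[path] = counts.get(path, 0) + 1
    return counts
```

On the example from the text, only path A is counted, since say ≺ s and say ≺ v, while unrestricted values (top) never contribute.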
      <Paragraph position="2">  Partial evaluation, as known from functional/logic programming, is a method of carrying out parts of a computation at compile time that would otherwise be done at run time, hence improving the run time performance of programs; see, e.g., (Jones, Gomard, and Sestoft, 1993). Analogous to partial evaluation of definite clauses, we can partially evaluate annotated grammar rules, since they drive the derivation. Partial evaluation here means substituting type symbols by their expanded definitions.</Paragraph>
      <Paragraph position="3"> Because a grammar contains finitely many rules of the above form and because the daughters (the right-hand side of the rule) are type symbols (and there are only finitely many of them), a great deal of this partial evaluation process can be performed offline. In contrast to a pure CF grammar with finitely many terminals/nonterminals, the evaluation process need not terminate, due to coreference constraints within feature structures. However, meta-constraints such as offline parsability or lazy type expansion (see next section) help us to determine those features which actively participate in unification during partial evaluation. In contrast to the previous method, partial evaluation is corpus-independent.</Paragraph>
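A minimal sketch of offline partial evaluation over rules whose daughters are type symbols; TYPES, expand, and evaluate_rule are illustrative names, and the seen guard corresponds to the termination problem mentioned in the text:

```python
# Invented type definitions: a type symbol maps to a feature structure.
TYPES = {
    "np": {"HEAD": "noun", "SPR": []},
    "vp": {"HEAD": "verb", "COMPS": []},
    "noun": {}, "verb": {},
}

def expand(symbol, types, seen=()):
    """Substitute a type symbol by its expanded definition, recursively.
    `seen` cuts cycles, so the sketch terminates even on recursive types."""
    if not isinstance(symbol, str) or symbol in seen or symbol not in types:
        return symbol                      # atomic value or unknown type
    defn = types[symbol]
    return {attr: expand(val, types, seen + (symbol,))
            for attr, val in defn.items()} or symbol

def evaluate_rule(rule, types):
    """Expand the daughter type symbols of a rule offline, at compile time."""
    return {"mother": rule["mother"],
            "dtrs": [expand(d, types) for d in rule["dtrs"]]}
```

In the real system the expanded daughters would additionally be unified with the rule's own constraints; the cycle guard here stands in for the offline-parsability meta-constraint.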
      <Paragraph position="4">  We have indicated earlier that type expansion can be fruitfully employed to preserve the coref skeleton. Type expansion can also be chosen to expand parts of a feature structure on the fly at run time.</Paragraph>
      <Paragraph position="5"> The general idea is as follows. Guaranteeing that the lexicon entries and the rules are consistent, we leave everything unexpanded unless we are forced to make structure explicit. As was the case for the previous two strategies, this is only necessary if a path is introduced in the resulting structure whose value is more specific than the value(s) in the input structure(s).</Paragraph>
      <Paragraph position="6"> The biggest advantage of this approach is obvious: only those constraints must be touched which are involved in restricting the set of possible solutions. Clearly, such a test should be done every time the chart is extended. The cost of such tests and the on-line type expansions needs further investigation.</Paragraph>
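Lazy expansion might be sketched like this: values stay as bare type symbols, and a symbol is expanded only at the moment a unification has to look inside it. unify and TYPE_DEFS are toy stand-ins with no type hierarchy, not the actual TDL machinery:

```python
# Invented type definitions for the sketch.
TYPE_DEFS = {"sign": {"SYN": "cat", "SEM": "content"}, "cat": {}, "content": {}}

def unify(a, b, types):
    """Unify two structures, expanding type symbols lazily on demand."""
    if isinstance(a, str) and isinstance(b, dict):
        a = dict(types.get(a, {}))        # expand only now that we must
    if isinstance(b, str) and isinstance(a, dict):
        b = dict(types.get(b, {}))
    if isinstance(a, str) and isinstance(b, str):
        return a if a == b else None      # atoms: no hierarchy in this toy
    out = dict(a)
    for attr, val in b.items():
        if attr in out:
            sub = unify(out[attr], val, types)
            if sub is None:
                return None               # unification failure
            out[attr] = sub
        else:
            out[attr] = val               # new path: just add it
    return out
```

Two bare symbols that agree never trigger expansion at all, which is exactly the saving the text aims at; expansion happens only where a constraint could actually restrict the set of solutions.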
    </Section>
  </Section>
</Paper>