<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-1080">
  <Title>Chart Parsing and Constraint Programming</Title>
  <Section position="4" start_page="552" end_page="556" type="metho">
    <SectionTitle>
2 Parsing as Constraint Propagation
</SectionTitle>
    <Paragraph position="0"> The basic observation which turns parsing-as-deduction into constraint propagation is simple: items of a chart parser are just special formulas which are used in an inference process. Since constraints in constraint programming are nothing but atomic formulas and constraint handling rules nothing but inference rules, the connection is immediate.</Paragraph>
    <Paragraph position="1"> In more detail, I will present in this section how to implement the three parsing algorithms given in Tab. 1 in CHR and discuss the advantages and drawbacks of this approach. Since CHR are integrated in SICStus Prolog, I will present constraints and rules in Prolog notation.</Paragraph>
    <Paragraph position="2"> We use the following two types of constraints. The constraints corresponding to the items will be called edge constraints. They have two arguments in the case of the two naive algorithms and five in the case of Earley's algorithm, i.e., edge(X, N) means in the case of the bottom-up algorithm that we have recognized a list of categories X up to position N, and in the case of the top-down algorithm that we are looking for a list of categories X starting at position N, while in the case of Earley's algorithm edge(A,Alpha,Beta,I,J) means that we found a substring from I to J by recognizing the list of categories Alpha, but we are still looking for a list of categories Beta to yield category A. The second constraint, word(Pos,Cat-Word), is used in the scanning steps.</Paragraph>
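The constraint shapes just described can be pictured as plain tuples. This is only an illustrative Python rendering of the paper's CHR constraints, not part of its implementation; all concrete values are made up:

```python
# Illustrative tuple encodings of the constraint types described above.

# Bottom-up item edge(X, N): categories X recognized up to position N.
bu_edge = ("edge", ["np", "vp"], 2)

# Top-down item edge(X, N): categories X still sought starting at position N.
td_edge = ("edge", ["vp"], 1)

# Earley item edge(A, Alpha, Beta, I, J): the substring from I to J was
# recognized as the category list Alpha; Beta is still sought to yield A.
earley_edge = ("edge", "s", ["np"], ["vp"], 0, 1)

# word(Pos, Cat-Word) constraints are consulted only by the scanning steps.
word_constraint = ("word", 0, ("pn", "john"))
```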
    <Paragraph position="3"> It avoids using lexical entries in prediction/completion since in grammar rules we do not use words but their categories.</Paragraph>
    <Paragraph position="4"> For simplicity, a grammar is given as Prolog facts: lexical items as lex(Word, Category) and grammar rules as rule(RHS, LHS), where RHS is a list of categories representing the right hand side and LHS is a single category representing the left hand side of the rule.</Paragraph>
    <Paragraph position="5"> The algorithms are simple to implement by specifying the inference rules as constraint propagation rules, and the axioms and the goal items as constraints. The inference rules are translated into CHR in the following way: the antecedents are transformed into constraints appearing in the head of the propagation rules, the side conditions into the guard, and the consequence is posted in the body. A summary of the resulting CHR programs is presented in Tab. 2.</Paragraph>
    <Paragraph position="6"> We use Earley's algorithm for a closer look at the CHR propagation rules. In the scanning step, we can move the head of the list of categories we are looking for to those we already recognized in case we have an appropriately matching edge and word constraint in our constraint store. The result is posted as a new edge constraint</Paragraph>
    <Paragraph position="7"> parse(InList):- axiom, post_const(InList, 0, Length), report(Length).</Paragraph>
    <Paragraph position="8"> post_const(\[\], Len, Len). post_const(\[Word|Str\], InLen, Len):- findall(Cat, lex(Word,Cat), Cats), post_words(Cats, InLen, Word), NewLen is InLen + 1, post_const(Str, NewLen, Len).</Paragraph>
    <Paragraph position="9"> with the positional index appropriately incremented. The prediction step is more complex. There is only one head in a rule, namely an edge which is still looking for some category to be found. If one can find rules with a matching LHS, we collect all of them in a list and post the appropriate fresh edge constraints for each element of that list with the predicate post_ea_edges/3, which posts edges of the following kind: edge(LHS,\[\],RHS,J,J).</Paragraph>
    <Paragraph position="10"> The collection of all matching rules in a call to setof/3 is necessary since CHR are a committed choice language. One cannot enumerate all solutions via backtracking. If there are no matching rules, i.e., the list of RHSs we found is empty, the call to setof in the guard will fail and therefore avoid vacuous predictions and nontermination of the predictor.</Paragraph>
    <Paragraph position="11"> Lastly, the completion step is a pure propagation rule which translates literally. The two antecedents are in the head and the consequence in the body, with appropriate instantiations of the positional variables and with the movement of the category recognized by the passive edge from the categories to be found to those found.</Paragraph>
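Read together, the scanning, prediction, and completion steps can be simulated by running them to a fixpoint over a set of edges. The following Python sketch mirrors the CHR rules only in spirit; the toy grammar, lexicon, and all function names are my own illustrative assumptions, not the paper's code:

```python
# Toy grammar and lexicon (assumptions for illustration only).
GRAMMAR = {
    "sprime": [["s"]],
    "s": [["np", "vp"]],
    "np": [["pn"]],
    "vp": [["v", "np"]],
}
LEXICON = {"john": "pn", "hit": "v", "mary": "pn"}

def earley(words):
    """Naive fixpoint closure over Earley items edge(A, alpha, beta, i, j)."""
    n = len(words)
    edges = {("sprime", (), ("s",), 0, 0)}          # the axiom
    changed = True
    while changed:
        changed = False
        for (a, alpha, beta, i, j) in list(edges):
            new = set()
            if beta:
                b = beta[0]
                # scan: consume the next word if its category matches
                if j < n and LEXICON.get(words[j]) == b:
                    new.add((a, alpha + (b,), beta[1:], i, j + 1))
                # predict: post a fresh edge for every rule with LHS b
                for rhs in GRAMMAR.get(b, []):
                    new.add((b, (), tuple(rhs), j, j))
            else:
                # complete: advance every edge waiting for category a at i
                for (c, al, be, i2, j2) in list(edges):
                    if be and be[0] == a and j2 == i:
                        new.add((c, al + (a,), be[1:], i2, j))
            if not new <= edges:
                edges |= new
                changed = True
    return edges

def recognized(words):
    # report: is there a passive sprime edge spanning the whole string?
    return ("sprime", ("s",), (), 0, len(words)) in earley(words)
```

The set of edges plays the role of the chart; termination follows because the item space is finite.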
    <Paragraph position="12"> In the table there is one more type of rule, called an absorption rule. It discovers those cases where we posted an edge constraint which is already present in the chart and simply absorbs the newly created one.</Paragraph>
    <Paragraph position="13"> Note that we do not have to specify how to insert edges into either chart or agenda. The chart and the agenda are represented by the constraint store and are therefore built in. Neither do we need a specialized deduction engine as was necessary for the implementation described in Shieber et al. In fact, the utilities needed are extremely simple, see Fig. 2.</Paragraph>
    <Paragraph position="14"> All we have to do for parsing (parse/1) is to post the axiom (see footnote 1) and, on traversal of the input string, to post the word constraints according to the lexicon of the given grammar. Then the constraint resolution process with the inference rules will automatically build a complete chart. The call to report/1 will just determine whether there is an appropriate edge with the correct length in the chart and print that information to the screen.</Paragraph>
    <Paragraph position="15"> Coming back to the issues of chart and agenda: the constraint store functions as chart and agenda at the same time, since as soon as a constraint is added, all rules are tried for applicability. (Footnote 1: axiom/0 just posts the edge(s) defined in Tab. 2.)</Paragraph>
    <Paragraph position="16"> If none apply, the edge will remain dormant until another constraint is added which triggers a rule together with it. (Footnote 2: Another way to &quot;wake&quot; a constraint is to instantiate any of its variables, in which case it will be matched against the rules again. Since all our constraints are ground, this does not play a role here.) So, the parser works incrementally by recursively trying all possible inferences for each constraint added to the store before continuing with the posting of new constraints from the post_const/3 predicate. The way this predicate works is to traverse the string from left to right. It is trivial to alter the predicate to post the constraints from right to left or in any arbitrary order chosen. This can be used to easily test different parsing strategies.</Paragraph>
    <Paragraph position="17"> The testing for applicability of new rules also has a connection with the absorption rules. We absorb the newer edge since we can assume that all possible propagations have already been done with the old, identical edge constraint, so that we can safely throw the newer one away.</Paragraph>
    <Paragraph position="18"> | ?- parse(\[john, hit, the, dog, with, the, stick\]). Input recognized.  word(0, pn-john), word(1, v-hit), word(2, det-the), word(3, n-dog), word(4, p-with), word(5, det-the), word(6, n-stick), edge(sprime, \[\], \[s\], 0, 0), edge(s, \[\], \[np,vp\], 0, 0), edge(np, \[\], \[det,nbar\], 0, 0), edge(np, \[\], \[pn\], 0, 0), edge(np, \[pn\], \[\], 0, 1), edge(s, \[np\], \[vp\], 0, 1), edge(vp, \[\], \[v,np\], 1, 1), edge(vp, \[\], \[v,np,pp\], 1, 1), edge(s, \[vp,np\], \[\], 0, 7), edge(sprime, \[s\], \[\], 0, 7)</Paragraph>
    <Paragraph position="19"> As an example for the resulting chart, part of the output of an Earley-parse for John hit the dog with the stick assuming the grammar from Fig. 1 is presented in Fig. 3. The entire constraint store is printed to the screen after the constraint resolution process stops. The order of the constraints actually reflects the order of the construction of the edges, i.e., the chart constitutes a trace of the parse at the same time. Although the given string was ambiguous, only a single solution is visible in the chart. This is due to the fact that we only did recognition. No explicit parse was built which could have differentiated between the two solutions. It is an easy exercise to either write a predicate to extract all possible parses from the chart or to alter the edges in such a way that an explicit parse tree is built during parsing.</Paragraph>
    <Paragraph position="20"> By using a built-in deduction engine, one gives up control of its efficiency. As it turns out, this CHR-based approach is slower than the specialized engine developed and provided by Shieber et al. by about a factor of 2, e.g., for a six word sentence and a simple grammar the parsing time increased from 0.01 seconds to 0.02 seconds on a LINUX PC (Dual Pentium II with 400MHz) running SICStus Prolog. This factor was preserved under 5 and 500 repetitions of the same parse. However, speed was not the main issue in developing this setup, but rather simplicity and ease of implementation.</Paragraph>
    <Paragraph position="21"> To sum up this section, the advantages of the approach lie in its flexibility and its availability for rapid prototyping of different parsing algorithms. While we used the basic examples from the Shieber et al. article, one can also implement all the different deduction schemes from Sikkel (1997). This also includes advanced algorithms such as left-corner or head-corner parsing, the refined Earley-algorithm proposed by Graham et al. (1980), or (unification-based) ID/LP parsing as defined in Morawietz (1995), or any improved version of any of these. Furthermore, because of the logical semantics of CHR, with their soundness and completeness, all correctness and soundness proofs for the algorithms can be directly applied to this constraint propagation proposal. The main disadvantage of the proposed approach certainly lies in its apparent lack of efficiency. One way to address this problem is discussed in the next section.</Paragraph>
    <Paragraph position="22"> 3 Extensions of the Basic Technique There are two directions the extensions of the presented technique of CHR parsing might take. Firstly, one might consider parsing of more complicated grammars compared to the CF ones which were assumed so far. Following Shieber et al., one can consider unification-based grammars or tree adjoining grammars. Since I think that the previous sections showed that the Shieber et al. approach is transferable in general, the results they present are applicable here as well. 3 Instead, I want to consider parsing of minimalist grammars (Chomsky, 1995) as defined in recent work by Stabler (1997, 1999). 4</Paragraph>
    <Section position="1" start_page="554" end_page="555" type="sub_section">
      <SectionTitle>
3.1 Minimalist Parsing
</SectionTitle>
      <Paragraph position="0"> We cannot cover the theory behind derivational minimalism as presented in Stabler's papers in any detail. Very briefly, lexical items are combined with each other by a binary operation merge which is triggered by the availability of an appropriate pair of clashing features, here noted as cat(C) for Stabler's categories c and comp(C) for =c. Furthermore, there is a unary operation move which, again on the availability of a pair of clashing features (e.g., -case, +case), triggers the extraction of a (possibly trivial) subtree and its merging in at the root node. On completion of these operations the clashing feature pairs are removed. The lexical items are of the form of linked sequences of trees. Accessibility of features is defined via a given order on the nodes in this chain of trees. A parse is acceptable if all features have been checked, apart from one category feature which spans the length of the string. The actual algorithm works naively bottom-up and, since the operations are at most binary, the algorithm is CYK-based.</Paragraph>
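The clash-and-cancel behaviour of merge can be illustrated with a toy Python function. This is only my reading of the feature-checking idea, not Stabler's algorithm; the list encoding and all names are assumptions:

```python
# Toy illustration of feature-driven merge: comp(C) (Stabler's =c) clashes
# with cat(C) (Stabler's c); on merging, the clashing pair is removed.
def merge(selector, selectee):
    f1, f2 = selector[0], selectee[0]
    if f1[0] == "comp" and f2[0] == "cat" and f1[1] == f2[1]:
        # the clashing features are checked and deleted; the rest survives
        return selector[1:] + selectee[1:]
    return None  # no clashing pair: merge is undefined
```

A derivation succeeds when repeated applications of such operations leave only a single category feature.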
      <Paragraph position="1"> 3 Obviously, using unification will introduce additional complexity, but no change of the basic method is required. If the unification can be reduced to Prolog unification, it can stay in the head of the rule(s). If it needs dedicated unification algorithms, they have to be called explicitly in the guard.</Paragraph>
      <Paragraph position="2"> 4 The code for the original implementation underlying the paper was kindly provided by Ed Stabler. Apart from the implementation in CHR, all the rest is his work and his ideas.</Paragraph>
      <Paragraph position="3"> (Fragment of the merge propagation rule from Tab. 3: ... ==&gt; edge(A,B,NewHead,Ch).)</Paragraph>
      <Paragraph position="4"> An initial edge or axiom in this minimalist parsing system cannot simply be assumed to cover only the part of the string where it was found, since it could have been the result of a move. So the elements of the lexicon which will have to be moved (they contain a movement trigger -X) actually have the positional indices instantiated in the last of those features appearing. All other movement triggers and the position where it will be base generated are assumed to be traces and therefore empty. Their positional markers are identical variables, i.e., they span no portion of the string and one does not know their value at the moment of the construction of the axioms. They have to be instantiated during the minimalist parse.</Paragraph>
      <Paragraph position="5"> Consider the set of items as defined by the axioms, see Tab. 3. The general form of the items is such that we have the indices first, then we separate the chain of trees into the first one and the remaining ones for better access. As an example for the actual edges and to illustrate the discussion about the possibly variable string positions in the edges, consider the lexical item it (as in believe it):  lex(it,I,\[(K,K):\[cat(d),-case(I,J)\]\]):- J is I+1.</Paragraph>
      <Paragraph position="6"> Since I = 1 in the example, the following edge results: edge(K, K, \[cat(d),-case(1,2)\], \[\]). We know that it has been moved to cover positions 1 to 2, but we do not know (yet) where it was base generated.</Paragraph>
      <Paragraph position="7"> We cannot go into any further detail on how the actual parser works. Nevertheless, the propagation rule for merging complementizers shown in Tab. 3 demonstrates how easily one can implement parsers for more advanced types of grammars. 5</Paragraph>
    </Section>
    <Section position="2" start_page="555" end_page="556" type="sub_section">
      <SectionTitle>
3.2 Compiling the Grammar Rules into the
Inference Rules
</SectionTitle>
      <Paragraph position="0"> A proposal for improving the approach consists in moving the test for rule applicability from the guards into the heads of the CHR rules. One can translate a given context-free grammar under a given set of inference rules into a CHR program which contains constraint propagation rules for each grammar rule, thereby making the processing more efficient. For simplicity, we discuss only the case of bottom-up parsing.</Paragraph>
      <Paragraph position="1"> For the translation from a CF grammar into a constraint framework we have to distinguish two types of rules: those with an empty RHS from those without. We treat the trivial case of the conversion first. For each rule in the CF grammar with a non-empty RHS we create a constraint propagation rule such that each daughter of the rule introduces an edge constraint in the head of the propagation rule with variable, but appropriately matching, string positions and a fixed label. The new, propagated edge constraint spans the entire range of the positions of the daughters and is labeled with the (nonterminal) symbol of the LHS of the CF rule. In our example, the resulting propagation rule for S looks as follows: edge(I,K,np), edge(K,J,vp) ==&gt; edge(I,J,s). The translation is a little bit more complicated for rules with empty RHSs. Basically, we create a propagation rule for each empty rule, e.g., A -&gt; e, such that the head is an arbitrary edge, i.e., both positions and the label are arbitrary variables, and post new edge constraints with the LHS of the CF rule as label, using the positional variables and spanning no portion of the string, resulting in CHR rules of the following type:</Paragraph>
      <Paragraph position="3"> But obviously rules of this type lead to nontermination, since they would propagate further constraints on their own output. This is avoided by including a guard which ensures that empty edges are only propagated once for every possible string position, by testing whether the triggering edge spans a string of length one. Recall that storing and using already existing edge constraints is avoided with an absorption rule. Since these empty constraints can be reused an arbitrary number of times, we get the desired effect without having to fear nontermination. Although this is not an elegant solution, it seems that other alternatives, such as analyzing and transforming the entire grammar or posting the empty constraints while traversing the input string, are not appealing either, since they give up the one-to-one correspondence between the rules of the CF grammar and the constraint program which is advantageous in debugging.</Paragraph>
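The compilation scheme described in this subsection can be mimicked in Python by turning each CF rule, including the guarded empty rules, into its own propagation function over (I, J, Label) edges. This is a sketch of the idea under assumed encodings, not the paper's CHR compiler, and the exact guard used for empty rules in the original may differ:

```python
# Compile one CF rule LHS -> RHS into a propagation rule: daughter edges chain
# by matching string positions, and the new edge spans the whole range.
def compile_rule(lhs, rhs):
    def propagate(edges):
        spans = {(i, j) for (i, j, c) in edges if c == rhs[0]}
        for cat in rhs[1:]:
            spans = {(i, j2) for (i, j) in spans
                     for (k, j2, c) in edges if c == cat and k == j}
        return {(i, j, lhs) for (i, j) in spans}
    return propagate

# For an empty rule A -> e: an arbitrary edge triggers empty A-edges at its
# endpoints; the length-one guard keeps the rule from firing on its own output.
def compile_empty_rule(lhs):
    def propagate(edges):
        new = set()
        for (i, j, _) in edges:
            if j - i == 1:                 # guard: trigger spans length one
                new.update({(i, i, lhs), (j, j, lhs)})
        return new
    return propagate

# Naive bottom-up driver: post lexical edges, then close under all rules.
# Absorption is implicit in the set representation of the store.
def bottom_up(words, rules, lexicon):
    edges = {(i, i + 1, lexicon[w]) for i, w in enumerate(words)}
    changed = True
    while changed:
        changed = False
        for prop in rules:
            new = prop(edges)
            if not new <= edges:
                edges |= new
                changed = True
    return edges
```

With one compiled function per grammar rule, the one-to-one correspondence between CF rules and propagation rules mentioned above is preserved.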
      <Paragraph position="4"> With this technique, the parsing times achieved were better by a factor of a third compared to the Shieber et al. implementation.</Paragraph>
      <Paragraph position="5"> Although now the process of the compilation obscures the direct connection between parsing-as-deduction and constraint propagation somewhat, the increase in speed makes it a worthwhile exercise.</Paragraph>
    </Section>
  </Section>
</Paper>