<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1227">
  <Title>A Method of Incorporating Bigram Constraints into an LR Table and Its Effectiveness in Natural Language Processing</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Integration of Bigram and CFG Constraints into an LR Table
</SectionTitle>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 The Definition of a Probabilistic
Connection Matrix
</SectionTitle>
      <Paragraph position="0"> A close relation exists between bigrams and connection matrices, in that the bigram probability P(bla ) corresponds to the matrix dement of Connect(a, b).</Paragraph>
      <Paragraph position="1"> A connection matrix incorporating bigram probabilities is called a probabilistic connection matrix, in which Connect(a, b) = 0 still means b cannot follow a, but instead of connection matrix entries having a binary value of 0 or 1, a probability is associated with each element. This is then used to construct a probabilistic LR table.</Paragraph>
      <Paragraph position="2"> lmai and Tanaka 226 A Method of Incorporating Bigram Constraints  The N-gram model is the most commonly used probabilistic language model, and it assumes that a symbol sequence can be described by a higher order Markov process. The simplest N-gram model with N = 2 is called a bigram model, and approximates the probability of a string X = xzx2xa...x,~ as the product of conditional probabilities:</Paragraph>
      <Paragraph position="4"> In the above expression, &amp;quot;#&amp;quot; indicates the sentence beginning marker and &amp;quot;$&amp;quot; indicates the sentence ending marker. The above big-ram model can be represented in a probabilistic connection matrix defined as follows.</Paragraph>
      <Paragraph position="5"> DEFINITION 1 (probabilistic connection matriz) Let G = (V~v, Vr, P, S) be a context-free grammar. For Va, b E VT (the set of terminal symbols), the probabilistic connection matrix named PConnect is defined as follows.</Paragraph>
      <Paragraph position="7"> cannot occur consecutively in the given order.</Paragraph>
      <Paragraph position="8"> PConnect(a, b) ~ 0 means b can follow a with probability P(b\[a).</Paragraph>
      <Paragraph position="9"> 3.2 An algorithm to construct a bigram LR table An algorithm to construct a probabilistic LR table, combining both bigram and CFG constraints, is given in Algorithm I: Algorithm 1 Input: A CFG G = (Vjv, VT, P, S) and a probabilistic connection matrix PConnect. Output: An LR table T with CFG and big-ram constraints. null Method: Step 1 Generate an LR table To from the given CFG G.</Paragraph>
      <Paragraph position="10"> Step 2 Removal of actions:  For each shift action shm with lookahead a in the LR table To, delete actions in the state m with lookalaead b if PConnect(a, b) = O. Step 3 Constraint Propagation (Tanaka et al., 1994): Repeat the following two procedures until no further actions can be removed:  1. Remove actions which have no succeeding action, 2. Remove actions which have no preceding action.</Paragraph>
      <Paragraph position="11">  Step 4 Compact the LR table if possible. Step 5 Incorporation of big-ram constraints into the LR table: For each shift action shm with lookahead a in the LR table To, let</Paragraph>
      <Paragraph position="13"> where {hi : i = 1,-.-,N} is the set of lookaheads for state m. For each action Aj in state rn with lookahead bi, assign a probability p to</Paragraph>
      <Paragraph position="15"> where n is the number of conflict actions in state m with lookahead bi. The denominator is dearly a normalization factor.</Paragraph>
      <Paragraph position="16"> Step 6 For each shift action A with lookahead a in state 0, assign A a probability p = p(al#), where &amp;quot;#&amp;quot; is the sentence beginning marker. Step 7 Assign a probability p = 1/n to each action. A in state m with lookahead symbol a that has not been assigned a probability, where n is the number of conflict actions in state m with lookahead symbol a.</Paragraph>
      <Paragraph position="17"> Step 8 Return the LR table T produced at the completion of Step 7 as the Bi#ram LR table.</Paragraph>
      <Paragraph position="18"> As explained above, the removal of actions at Step  2 corresponds to the operation of incorporating connection constraints into an LR table. We call Step 3 Constraint Propagation, by which the size of the LR table is reduced (Tanaka et al., 1994). As many lmai and Tanaka 227 A Method of Incorporating Bigram Constraints (1) S--*XY (6) A--*al (2) X ~ A (7) A--, ae (3) X-~AB (S) n-~bl (4) Y--,A (9) B--,b2 (5) Y--*blA  actions are removed from the LR table during Steps 2 and 3, it becomes possible to compress the LR table in Step 4. We will demonstrate one example of this process in the following section.</Paragraph>
      <Paragraph position="19"> It should be noted that the above algorithm can be applied to any type of LR table, that is a canonical LR table, an LALR table, or an SLR table.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="14" type="metho">
    <SectionTitle>
4 An Example
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="14" type="sub_section">
      <SectionTitle>
4.1 Generating a Bigram LR Table
</SectionTitle>
      <Paragraph position="0"> In this section, we will provide a simple example of the generation of a bigram LR table by way of applying Algorithm 1 to both a CFG and a probabilistic connection matrix, to create a big'ram LR table. Figure 4 and Figure 5 give a sample CFG Gz and a probabilistic connection matrix M1, respectively.</Paragraph>
      <Paragraph position="1"> Note that grammar G1 in Figure 4 does not explicitly express local connection constraints between terminal symbols. Such local connection constraints are easily expressed by a matrix M1 as shown in Figure 5.</Paragraph>
      <Paragraph position="2"> From the CFG given in Figure 4, we can generate an LR table, Table 1, in Step 1 using the conventional LR table generation algorithm.</Paragraph>
      <Paragraph position="3"> Table 2 is the resultant LR table at the completion of Step 2 and Step 3, produced based on Table 1. Actions numbered (2) and (3) in Table 2 are those which are removed by Step 2 and Step 3, respectively. null In state 1 with a lookahead symbol bl, re6 is carried out after executing action shl in state 0, pushing al onto the stack. Note that al and bl are now consecutive, in this order. However, the probabilistic connection matrix (see Figure 5) does not allow such a sequence of terminal symbols, since PConnect( al , bl ) = O. Therefore, the action re6 in state 1 with lookahead bl is removed from Table 1 in Step 2, and thus marked as (2) in Table 2. For this same reason, the other re6s in state 1 with lookahead symbols al and a$ are also removed from  On the other hand, in the case of re6 in state 1 with lookahead symbol b$, as al can be followed by</Paragraph>
      <Paragraph position="5"> removed. The remaining actions marked as (2} in Table 2 should be self-evident to readers.</Paragraph>
      <Paragraph position="6"> Next, we would like to consider the reason why action sh9 in state 4 with lookahead al is removed from Table 1. In state 9, re6 with lookahead symbol $ has already been removed in Step 2, and there is no succeeding action for shg. Therefore, action sh9 in state 3 is removed in Step 3, and hence marked as(3).</Paragraph>
      <Paragraph position="7"> Let us consider action re3 in state 8 with lookahead al. After this action is carried out, the GLR parser goes to state 4 after pushing X onto the stack. However, sh9 in state 4 with lookahead al has already been removed, and there is no succeeding action for re3. As a result, re3 in state 8 with lookahead symbol al is removed in Step 3. Similarly, re9 in state 7 with lookahead symbol al is also removed in Step 3. In this way, the removal of actions propagates to other removals. This chain of removals is  called Constraint Propagation, and occurs in Step 3. Actions removed in Step 3 are marked as (3) in  Careful readers will notice that there is now no action in state 9 and that it is possible to delete this state in Step 4. Table 3 shows the LR table after Step 4.</Paragraph>
      <Paragraph position="8"> As a final step, we would like to assign big-ram constraints to each action in Table 3. Let us consider the two tess in state 6, reached after executing sh6 in state 4 by pushing a lookahead of bl onto the stack. In state 6, P is calculated at Step 5 as shown below:</Paragraph>
      <Paragraph position="10"> We can assign the following probabilities p to each re8 in state 6 by way of Step 5:</Paragraph>
      <Paragraph position="12"> After assigning a probability to each action in the LR table at Step 5, there remain actions without probabilities. For example, the two conflict actions (re2/sh6) in state 3 with lookahead bl are not assigned a probability. Therefore, each of these actions is assigned the same probability, 0.5, in Step 7. A probability of 1 is assigned to remaining actions, since there is no conflict among them.</Paragraph>
      <Paragraph position="13"> Table 4 shows the final results of applying Algorithm 1 to G, and M,.</Paragraph>
      <Paragraph position="14"> lmai and Tanaka 228 A Method of Incorporating Bigram Constraints</Paragraph>
      <Paragraph position="16"/>
      <Paragraph position="18"/>
    </Section>
    <Section position="2" start_page="14" end_page="14" type="sub_section">
      <SectionTitle>
4.2 Comparison of Language Models
</SectionTitle>
      <Paragraph position="0"> Using the bigram LR table as shown in Table 4, the probability P1 of the string &amp;quot;a2 bl ag' is calculated as:</Paragraph>
      <Paragraph position="2"> where P(Treei) means the probability of the i-th parsing tree generated by the GLR parser and P(S,L,A) means the probability of an action A in state S with Iookahead L.</Paragraph>
      <Paragraph position="3"> On the other hand, using only bigram constraints, the probability P2 of the string &amp;quot;ae b1 a,~' is calculated as:</Paragraph>
      <Paragraph position="5"> The reason why P1 &gt; P2 can be explained as follows. Consider the beginning symbol a2 of a sentence. In the case of the bigram model, a2 can only be followed by either of the two symbols bl and $ (see Figure 5). However, consulting the bigram LR table reveals that in state 0 with lookahead ae, she is carried out, entering state 2. State 2 has only one action re7 with lookahead symbol bl. In other words, in state 2, $ is not predicted as a succeeding symbol of al. The exclusion of an ungrammatical prediction in $ makes P1 larger than P2.</Paragraph>
      <Paragraph position="6"> Perplexity is a measure of the complexity of a language model. The larger the probability of the language model, the smaller the perplexity of the language model. The above result (P1 &gt; P2) indicates  that the bigram LR table model gives smaller perplexity than the bigram model. In the next section, we will demonstrate this fact.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>