<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1510">
<Title>Probabilistic models for disambiguation of an HPSG-based chart generator</Title>
<Section position="3" start_page="93" end_page="95" type="intro">
<SectionTitle> 2 Background </SectionTitle>
<Paragraph position="0"> This section describes the background of this work, including the representation of the input to our generator, the algorithm of chart generation, and probabilistic models for HPSG.</Paragraph>
<Section position="1" start_page="93" end_page="94" type="sub_section">
<SectionTitle> 2.1 Predicate-argument structures </SectionTitle>
<Paragraph position="0"> The grammar we adopted is the Enju grammar, an English HPSG grammar extracted from the Penn Treebank by Miyao et al. (2004). When a sentence is parsed with the Enju grammar, the semantic relations of its words are output as part of the parse tree. The semantic relations are represented by a set of predicate-argument structures (PASs), which in turn becomes the input to our generator. Figure 1 shows an example input to our generator, corresponding to the sentence "He bought the book."; it consists of four predicates. REL expresses the base form of the word corresponding to the predicate. INDEX expresses a semantic variable that identifies each word in the set of relations. ARG1 and ARG2 express relationships between the predicate and its arguments; e.g., the circled part in Figure 1 shows that "he" is the subject of "buy" in this example. The other constraints in the parse tree are omitted from the input to the generator. Since PASs abstract away superficial differences, generation from a set of PASs involves ambiguities in the order of modifiers, as in the example in Section 1, and in the syntactic categories of phrases. For example, the PASs in Figure 1 can also generate the NP "the book he bought." When processing the input PASs, we split a single PAS into a set of relations like (1), which represents the first PAS in Figure 1.</Paragraph>
<Paragraph position="2"> This representation is very similar to HLDS (Hybrid Logic Dependency Semantics), employed by White and Baldridge (2003), which is in turn related to MRS (Minimal Recursion Semantics), employed by Carroll et al. (1999). The most significant difference between our current input representation (not the PAS itself) and the other representations is that each word corresponds to exactly one PAS, whereas words like the infinitival "to" have no semantic relations in HLDS. This means that "The book was bought by him." is not generated from the same PASs as Figure 1, because PASs for "was" and "by" would be required to generate that sentence. We currently adopt this constraint for simplicity of implementation, but it is possible to use inputs in which the PASs for words like "to" are removed. As proposed and implemented in previous studies (Carroll et al., 1999; White and Baldridge, 2003), handling such inputs is feasible with modifications to the chart generation algorithm described in the following section. The algorithms proposed in this paper can be integrated with theirs, although the implementation is left for future research.</Paragraph>
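For concreteness, the sketch below encodes the PASs for "He bought the book." as simple Python records and splits each PAS into a flat set of relations in the spirit of (1). This is a minimal illustration, not the Enju grammar's actual encoding: the variable names (x1, e2, ...), the determiner's ARG1 link, and the relation tuples are all assumptions made for the example.

```python
# Hypothetical encoding of the PAS input (cf. Figure 1). Field names follow
# the REL/INDEX/ARG1/ARG2 attributes described in the text; the concrete
# variable names are invented for illustration.
from dataclasses import dataclass
from typing import Optional, Set, Tuple

@dataclass(frozen=True)
class PAS:
    rel: str                    # REL: base form of the word
    index: str                  # INDEX: semantic variable identifying the word
    arg1: Optional[str] = None  # ARG1: variable of the first argument
    arg2: Optional[str] = None  # ARG2: variable of the second argument

# One PAS per word of "He bought the book." (four predicates).
pas_input = [
    PAS(rel="buy",  index="e2", arg1="x1", arg2="x3"),  # "bought"
    PAS(rel="he",   index="x1"),
    PAS(rel="the",  index="x2", arg1="x3"),             # determiner of "book"
    PAS(rel="book", index="x3"),
]

def split_pas(pas: PAS) -> Set[Tuple[str, ...]]:
    """Split one PAS into a set of atomic relations, as in (1)."""
    relations = {("rel", pas.index, pas.rel)}
    if pas.arg1 is not None:
        relations.add(("arg1", pas.index, pas.arg1))
    if pas.arg2 is not None:
        relations.add(("arg2", pas.index, pas.arg2))
    return relations

# The union of all relations is the semantic input that the chart
# generator must cover exactly once.
semantic_input = set().union(*(split_pas(p) for p in pas_input))
```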
</Section>
<Section position="2" start_page="94" end_page="94" type="sub_section">
<SectionTitle> 2.2 Chart generation </SectionTitle>
<Paragraph position="0"> Chart generation is similar to chart parsing, but what an edge covers is the set of semantic relations associated with it. We developed a CKY-like generator that deals with binarized grammars, including the Enju grammar.</Paragraph>
<Paragraph position="1"> Figure 2 shows a chart for generating "He bought the book." First, lexical edges are assigned to each PAS. Then the following loop is repeated from i = 2 to the cardinality of the input.</Paragraph>
<Paragraph position="2"> • Apply binary rules to existing edges to generate new edges holding i PASs.</Paragraph>
<Paragraph position="3"> • Apply unary rules to the new edges generated in the previous step.</Paragraph>
<Paragraph position="4"> • Store the edges generated in the current loop into the chart. [1: To introduce an edge with no semantic relations, as mentioned in the previous section, we need to combine edges with edges having no relations.]</Paragraph>
<Paragraph position="5"> In Figure 2, boxes in the chart represent cells, which contain edges covering the same PASs, and solid arrows represent rule applications. Each edge is packed into an equivalence class and stored in a cell. Equivalence classes are identified by their signs and the semantic relations they cover. Edges with different strings (e.g., NPs associated with "a big white dog" and "a white big dog") can be packed into the same equivalence class if they have the same feature structure.</Paragraph>
<Paragraph position="6"> In parsing, each edge must be combined with its adjacent edges. Since there is no such adjacency constraint in generation, the number of possible edge combinations easily explodes. We adopted two partial solutions to this problem proposed by Kay (1996).</Paragraph>
<Paragraph position="7"> The first is indexing edges with semantic variables (e.g., the circled index in Figure 2). For example, since the SUBCAT feature of the edge for "bought the book" specifies that it requires an NP with a particular index, we can find the required edges efficiently by checking only the edges indexed with that variable.</Paragraph>
<Paragraph position="8"> The second is prohibiting the proliferation of grammatically correct but unusable sub-phrases. While generating the sentence "Newspaper reports said that the tall young Polish athlete ran fast", sub-phrases with incomplete modifiers, such as "the tall young athlete" or "the young Polish athlete", do not contribute to the final output but slow down generation, because they can be combined with the rest of the input to construct grammatically correct phrases or sentences. Carroll et al. (1999) and White (2004) proposed different algorithms to address the same problem. We adopted Kay's simple solution in the current work, but the logical form chunking proposed by White is also applicable to our system.</Paragraph>
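The sketch below summarizes the loop and Kay's indexing trick. It is schematic: the Edge fields (sign, covered, indices, wanted) and the rule-application callbacks are hypothetical placeholders for the HPSG machinery; only the control structure, the packing of edges into cells keyed by covered PASs, and the semantic-variable index follow the description above.

```python
# Schematic CKY-like chart generation (cf. Section 2.2). Edge fields and the
# rule-application callbacks are hypothetical stand-ins for the HPSG grammar.
from collections import defaultdict
from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class Edge:
    sign: str                     # stand-in for the HPSG sign (feature structure)
    covered: FrozenSet[tuple]     # the input relations this edge covers
    indices: Tuple[str, ...]      # semantic variables this edge is indexed by
    wanted: Tuple[str, ...] = ()  # variables of arguments still being sought

def generate(lexical_edges, apply_binary_rules, apply_unary_rules):
    n = len(lexical_edges)            # cardinality of the input PAS set
    chart = defaultdict(set)          # covered PASs -> packed equivalence classes
    index = defaultdict(set)          # semantic variable -> edges indexed by it

    def store(edge):
        chart[edge.covered].add(edge) # equal signs over the same PASs pack
        for var in edge.indices:
            index[var].add(edge)

    for edge in lexical_edges:        # lexical edges cover one PAS each
        store(edge)

    for size in range(2, n + 1):      # loop from i = 2 to the input cardinality
        new_edges = []
        for edges in list(chart.values()):
            for left in edges:
                for var in left.wanted:
                    # Kay-style indexing: only consider partners carrying a
                    # semantic variable the left edge is actually looking for.
                    # (A real implementation would also try the mirrored order.)
                    for right in index[var]:
                        if left.covered & right.covered:
                            continue  # coverages must be disjoint
                        if len(left.covered | right.covered) != size:
                            continue  # build edges holding exactly `size` PASs
                        new_edges.extend(apply_binary_rules(left, right))
        for edge in list(new_edges):  # unary rules over this iteration's edges
            new_edges.extend(apply_unary_rules(edge))
        for edge in new_edges:        # store the new edges into the chart
            store(edge)

    everything = frozenset().union(*(e.covered for e in lexical_edges))
    return chart[everything]          # edges covering the entire input
```

Packing falls out of the set semantics here: because Edge is hashable and compared by its fields, two derivations with the same sign and the same coverage collapse into a single chart entry, exactly the equivalence-class behavior described above.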
</Section>
<Section position="3" start_page="94" end_page="95" type="sub_section">
<SectionTitle> 2.3 Probabilistic models for generation with HPSG </SectionTitle>
<Paragraph position="0"> Existing studies on probabilistic models for HPSG parsing (Malouf and van Noord, 2004; Miyao and Tsujii, 2005) adopted log-linear models (Berger et al., 1996). Since log-linear models allow us to use multiple overlapping features without assuming independence among the features, they are suitable for HPSG parsing, where feature structures with complicated constraints are involved and dividing such constraints into independent features is difficult. Log-linear models have also been used for HPSG generation by Velldal and Oepen (2005). In their method, the probability of a realization $r$ given a semantic representation $s$ is formulated as
\[ p(r \mid s) = \frac{\exp\left(\sum_i \lambda_i f_i(r)\right)}{\sum_{r' \in Y(s)} \exp\left(\sum_i \lambda_i f_i(r')\right)} \]
where $f_i(r)$ is a feature function observed in $r$, $\lambda_i$ is the weight of $f_i$, and $Y(s)$ represents the set of all possible realizations of $s$. To estimate the $\lambda_i$, pairs of a semantic representation $s$ and a reference realization for $s$ are required. Their method first automatically generates a paraphrase treebank in which triples $\langle r, s, Y(s) \rangle$ are enumerated. Then, a log-linear model is trained on this treebank, i.e., each $\lambda_i$ is estimated so as to maximize the likelihood of the training data. As well as the features used in their previous work on statistical parsing (Toutanova and Manning, 2002), an additional feature that represents sentence probabilities under a 4-gram language model is incorporated. They showed that the combined model outperforms the model without the 4-gram feature.</Paragraph>
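To illustrate the model above, the following sketch computes $p(r \mid s)$ over a small candidate set. The feature functions and weights are toy assumptions invented for the example (including the stand-in for the 4-gram feature); they are not those of Velldal and Oepen (2005).

```python
# A toy log-linear ranker over candidate realizations. Feature functions and
# weights are invented for illustration; `candidates` plays the role of Y(s).
import math
from typing import Callable, Dict, List

def loglinear_probs(candidates: List[str],
                    feature_fns: Dict[str, Callable[[str], float]],
                    weights: Dict[str, float]) -> Dict[str, float]:
    """p(r|s) = exp(sum_i lambda_i f_i(r)) / sum over r' in Y(s) of the same."""
    def score(r: str) -> float:
        return sum(weights[name] * fn(r) for name, fn in feature_fns.items())

    scores = {r: score(r) for r in candidates}
    m = max(scores.values())                 # subtract the max for stability
    exps = {r: math.exp(v - m) for r, v in scores.items()}
    z = sum(exps.values())                   # normalization over Y(s)
    return {r: e / z for r, e in exps.items()}

# Hypothetical features: a capitalization cue and a crude stand-in for the
# 4-gram log-probability feature mentioned in the text.
features = {
    "starts_capitalized": lambda r: 1.0 if r[:1].isupper() else 0.0,
    "fake_4gram_logprob": lambda r: -0.5 * len(r.split()),
}
weights = {"starts_capitalized": 1.2, "fake_4gram_logprob": 0.7}

candidates = ["He bought the book.", "The book he bought.", "the book he bought"]
print(loglinear_probs(candidates, features, weights))
```
</Section>
</Section>
</Paper>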