<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0408">
  <Title>Multiword expressions as dependency subgraphs</Title>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Groups
</SectionTitle>
    <Paragraph position="0"> In this section, we consider MWE paraphrases and propose to model them as tuples of dependency subgraphs called groups. We start with an example. Consider (3), a paraphrase of (2): (2) He dates her. (3) He takes her out. We display the XDG analysis of (3) in (4). Again, the ID tree is on the left, and the PA dag is on the right. This example demonstrates that we cannot simply treat MWEs as contiguous word strings and include those in the lexicon, since the MWE takes out is interrupted by the object her in (3). Instead, we choose to implement the continuity hypothesis of Kay and Fillmore (1999) in terms of DG, and model MWEs as dependency subgraphs. We realize this idea by a new layer of lexical organization called groups. A group is a tuple of dependency subgraphs covering one or more nodes, where each component of the tuple corresponds to a grammar dimension. We call a group component a group dimension. We display the group corresponding to dates in (5), and the one corresponding to takes out in (6). Groups can leave out on the PA dimension nodes that are present on the ID dimension. For example, in (6), the node corresponding to the particle out is present on the ID dimension, but left out on the PA dimension. In this way, groups help us to weaken the 1:1-correspondence between words and nodes.</Paragraph>
    <Paragraph position="1"> We can also use groups to handle more complicated MWE paraphrases. Consider (8) below, a support verb construction paraphrasing (7): (7) He argues with her. (8) He has an argument with her. In (8), has is only a support verb; the semantic head of the construction is the noun argument. We display the ID tree and PA dag of (7) in (9). The ID tree of (8) is in (10), and the PA dag in (11). [6]</Paragraph>
    <Paragraph position="3"> pcomp for the complement of a preposition, pmod for a prepositional modifier, and det for a determiner.</Paragraph>
    <Paragraph position="5"> In (9), the node corresponding to with is deleted in the PA dag. In (11), the support verb construction leads to the deletion of three nodes (corresponding to has, an and with, respectively). These deletions are reflected in the corresponding groups. The group corresponding to argues with is displayed in (12), and the group corresponding to has an argument with in (13) (ID) and (14) (PA).</Paragraph>
    <Paragraph position="6"> Groups can capture difficult constructions such as the support verb construction above in an elegant and transparent way, without having to resort to complicated machinery. The key aspect of groups is their multi-dimensionality: a group is a tuple of dependency subgraphs spanning a shared set of nodes. This sharing enables groups to express interdependencies between the different dimensions. For instance, in (13) and (14), the noun argument, which is the object of the support verb has in the ID dimension, is the semantic head in the PA dimension.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Compilation
</SectionTitle>
    <Paragraph position="0"> In this section, we show how to compile groups into simple lexical entries. The benefit is that we can retain XDG in its entirety, i.e. its formalization and its axiomatization as a constraint satisfaction problem. This means we can also retain the implementation of the XDG solver and continue to use it for parsing and generation.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.1 Node deletion
</SectionTitle>
      <Paragraph position="0"> The 1:1-correspondence between words and nodes is a key assumption of XDG. It requires that on each dimension, each word must correspond to precisely one node in the dependency graph. The groups shown in the previous section clearly violate this assumption, as they allow nodes present on the ID dimension to be omitted on the PA dimension. The first step of the compilation aims to accommodate this deletion of nodes. To this end, we assume for each analysis an additional root node, e.g. corresponding to the full stop at the end of the sentence. Each root in an XDG dependency graph becomes a daughter of this additional root node, via an edge labeled root. The trick for deleting nodes is now the following: each deleted node also becomes a daughter of the additional root node, but via an edge with a special label, del, indicating that this node is regarded as deleted.</Paragraph>
      <Paragraph position="1"> As an example, here is the PA dag for example</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.2 Dependency subgraphs
</SectionTitle>
      <Paragraph position="0"> The second step of the compilation is to compile the dependency subgraphs into lexical entries for individual words. To this end, we make use of the valency principle. For each edge from v to v' labeled l within a group, we require that v has an outgoing edge labeled l, and that v' licenses an incoming edge labeled l. That is, we include l! in the out specification of the lexical entry for v, and l? in the in specification of the lexical entry for v'.</Paragraph>
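      <Paragraph position="1"> As a minimal sketch of this step, the following Python fragment collects the out and in specifications per word from the edges of one group dimension. The encoding of edges as (head, label, dependent) triples and the name compile_valency are assumptions made for illustration; they are not part of XDG or its solver. The example edges are the group-internal ID edges that can be read off the lexical entries for has an argument with.

```python
from collections import defaultdict

def compile_valency(edges):
    """Derive per-word valency specs from the edges of one group dimension.

    edges: iterable of (head, label, dependent) triples (assumed encoding).
    Returns (out_spec, in_spec), each mapping a word to a set of strings.
    """
    out_spec = defaultdict(set)
    in_spec = defaultdict(set)
    for head, label, dependent in edges:
        # The head gets "label!" in its out specification.
        out_spec[head].add(label + "!")
        # The dependent gets "label?" in its in specification.
        in_spec[dependent].add(label + "?")
    return dict(out_spec), dict(in_spec)

# Group-internal ID edges of "has an argument with" (illustrative input).
id_edges = [
    ("has", "obj", "argument"),
    ("argument", "det", "an"),
    ("argument", "pmod", "with"),
]
out_spec, in_spec = compile_valency(id_edges)
```

Edges that leave the group, such as the subject edge of has, would contribute to the specifications in the same way; they are omitted here to keep the sketch short.</Paragraph>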
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.3 Group coherence
</SectionTitle>
      <Paragraph position="0"> The final step of the compilation ensures group coherence, i.e. that the inner nodes of a group dimension (neither the root nor the leaves) stay within this group in the dependency analysis. In other words, group coherence makes sure that the inner nodes of a group dimension cannot become daughters of nodes of a different group. In our support verb example, group coherence ensures, for instance, that the determiner an cannot become the determiner of a noun of another group. We realize this idea through a new XDG principle called the group coherence principle.</Paragraph>
      <Paragraph position="1"> This principle must hold on all dimensions of the grammar.</Paragraph>
      <Paragraph position="2"> Given a set of group identifiers Group, the principle assumes two new features: group : Group, and outgroups : Lab → 2^Group. For each node v, group(v) denotes the group identifier of the group of v. For each edge within a group from v to v' labeled l, i.e. for each edge to an inner node, outgroups(v)(l) denotes the singleton set containing the group of both v and v'. For each edge from v labeled l which goes outside of a group, outgroups(v)(l) = Group, i.e. the edge can end in any other group. We formalize the group coherence principle as follows, where v -l→ v' denotes an edge from v to v' labeled l:</Paragraph>
      <Paragraph position="4"/>
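      <Paragraph position="5"> Read against the definitions above, the principle plausibly amounts to requiring, for every edge v -l→ v', that group(v') is a member of outgroups(v)(l). The following Python fragment is a minimal sketch of such a check under that reading. The dictionary layout, the treatment of labels absent from outgroups as unrestricted, the name is_group_coherent, and the hypothetical other group g3 with its node noun are all assumptions for illustration; in XDG the principle would be stated as a constraint for the solver rather than tested after the fact.

```python
def is_group_coherent(edges, group, outgroups, all_groups):
    """Check every edge (v, label, w): group[w] must be in outgroups[v][label].

    A label missing from outgroups[v] is treated as unrestricted, i.e. the
    edge may end in any group (assumed default for this sketch).
    """
    for v, label, w in edges:
        allowed = outgroups.get(v, {}).get(label, all_groups)
        if group[w] not in allowed:
            return False
    return True

all_groups = {"g1", "g2", "g3"}
# "noun" is a hypothetical node of another group g3 whose det edge is
# group-internal; "argument" and "an" belong to g2 as in the running example.
group = {"argument": "g2", "an": "g2", "noun": "g3"}
outgroups = {"argument": {"det": {"g2"}}, "noun": {"det": {"g3"}}}

# "an" as determiner of "argument": stays inside g2.
ok = is_group_coherent([("argument", "det", "an")], group, outgroups, all_groups)
# "an" as determiner of a noun from g3: crosses a group boundary.
bad = is_group_coherent([("noun", "det", "an")], group, outgroups, all_groups)
```

Under these assumptions the first analysis is accepted and the second is rejected, which mirrors the determiner example given for the support verb construction.</Paragraph>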
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.4 Examples
</SectionTitle>
      <Paragraph position="0"> For illustration, we display the lexical entries of two compiled groups: the group argues with (with identifier g1) and the group has an argument with (with identifier g2), in Figure 2 and Figure 3, respectively. We omit the specification of outgroups for the PA dimension for lack of space, and because it is not relevant for the example: in all groups, there are no edges which stay within a group in the PA dimension.</Paragraph>
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
4.5 Parsing and generation
</SectionTitle>
      <Paragraph position="0"> We can use the same group lexicon for parsing and generation, but we have to slightly adapt the compilation for the generation case.</Paragraph>
      <Paragraph position="1"> For parsing, we can use the XDG parser as before, without any changes.</Paragraph>
      <Paragraph position="2"> For generation, we assume a set Sem of semantic literals, multisets of which must be verbalized. To do this, we assume a function groups : Sem → 2^Group, mapping each semantic literal to a set of groups which verbalize it.</Paragraph>
      <Paragraph position="3"> Before using the XDG solver for generation, we have to make sure, for each semantic literal s to be verbalized, that the groups which can potentially verbalize it have the same number of nodes. For this, we calculate the maximum number n of syntactic nodes for the groups assigned to s, and fill up the groups having fewer syntactic nodes with deleted nodes. Then, for XDG solving, we introduce precisely n nodes for literal s. Using groups for generation thus comes at the expense of having to introduce additional nodes. [7] As an example, suppose we want to verbalize the semantic literal argue. Here groups(argue) = {g1, g2}, i.e. either group g1 or group g2 can verbalize it. The maximum number n of syntactic nodes for the two groups is 4, reached by g2 (has an argument with).</Paragraph>
      <Paragraph position="4"> Hence, we have to fill up the groups having fewer syntactic nodes with deleted nodes. In this case, we have to add two deleted nodes to the group g1 (argues with) to get to four nodes. The resulting lexical entries encoding g1 are displayed in Figure 4. The lexical entries for g2 stay the same as in Figure 3.</Paragraph>
      <Paragraph position="5"> After this step is done, we can use the existing XDG solver to generate from the semantic literals precisely the two possible verbalizations (7) and (8), as intended.</Paragraph>
      <Paragraph position="6"> [7] We should emphasize that n is bounded: for each semantic literal s to be verbalized, we have to introduce only as many nodes as are contained in the largest group which can verbalize s.</Paragraph>
      <Paragraph position="7">
word | literal | group | outgroupsID | inID | outID | inPA | outPA | link
argues | argue' | g1 | {pobj ↦ {g1}} | {root?} | {subj!, pobj!} | {root?} | {arg1, arg2} | {arg1 ↦ {subj}, arg2 ↦ {pcomp}}
with | argue' | g1 | {} | {pobj?} | {pcomp!} | {del?} | {} | {}
Figure 2: Lexical entries encoding the group for argues with

word | literal | group | outgroupsID | inID | outID | inPA | outPA | link
has | argue' | g2 | {obj ↦ {g2}} | {root?} | {subj!, obj!} | {del?} | {} | {}
an | argue' | g2 | {} | {det?} | {} | {del?} | {} | {}
argument | argue' | g2 | {det ↦ {g2}, pmod ↦ {g2}} | {obj?} | {det!, pmod!} | {root?} | {arg1!, arg2!} | {arg1 ↦ {subj}, arg2 ↦ {pcomp}}
with | argue' | g2 | {} | {pmod?} | {pcomp!} | {del?} | {} | {}
      </Paragraph>
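      <Paragraph position="8"> The padding step for generation described in this subsection can be sketched as follows. This is a minimal Python illustration, not the actual compilation code: the function name pad_groups, the representation of a group as a list of its syntactic nodes, and the placeholder name DEL for a deleted node are all assumptions.

```python
def pad_groups(group_nodes, candidates):
    """Pad each candidate group with deleted nodes up to the largest size.

    group_nodes: maps a group identifier to its list of syntactic nodes.
    candidates: group identifiers that can verbalize one semantic literal.
    Returns (n, padded), where n is the maximum node count.
    """
    n = max(len(group_nodes[g]) for g in candidates)
    padded = {}
    for g in candidates:
        nodes = list(group_nodes[g])
        # "DEL" is a placeholder name for a deleted node (illustrative only).
        nodes += ["DEL"] * (n - len(nodes))
        padded[g] = nodes
    return n, padded

# The two groups that can verbalize the literal argue.
group_nodes = {
    "g1": ["argues", "with"],
    "g2": ["has", "an", "argument", "with"],
}
n, padded = pad_groups(group_nodes, ["g1", "g2"])
```

For the running example this yields n = 4, with two deleted nodes added to g1 and g2 left unchanged, matching the description above.</Paragraph>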
    </Section>
  </Section>
</Paper>