<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1146">
  <Title>Optimal Constituent Alignment with Edge Covers for Semantic Projection</Title>
  <Section position="4" start_page="1161" end_page="1162" type="intro">
    <SectionTitle>
2 Cross-lingual Semantic Role Projection
</SectionTitle>
    <Paragraph position="0"> Semantic role projection is illustrated in Figure 1 using English and German as the source-target language pair. We assume a FrameNet-style semantic analysis (Fillmore et al., 2003). In this paradigm, the semantics of predicates and their arguments are described in terms of frames, conceptual structures which model prototypical situations. The English sentence Kim promised to be on time in Figure 1 is an instance of the COMMITMENT frame. In this particular example, the frame introduces two roles, i.e., SPEAKER (Kim) and MESSAGE (to be on time). Other possible, though unrealised, roles are ADDRESSEE, MES-SAGE, and TOPIC. The COMMITMENT frame can be introduced by promise and several other verbs and nouns such as consent or threat.</Paragraph>
    <Paragraph position="1"> We also assume that frame-semantic annotations can be obtained reliably through shallow semantic parsing.1 Following the assignment of semantic roles on the English side, (imperfect) word alignments are used to infer semantic alignments between constituents (e.g., to be on time is aligned with punktlich zu kommen), and the role labels are transferred from one language to the other. Note that role projection can only take place if the source predicate (here promised) is word-alignedtoatargetpredicate(hereversprach) evoking the same frame; if this is not the case (e.g., in metaphors), projected roles will not be generally appropriate.</Paragraph>
    <Paragraph position="2"> We represent the source and target sentences as sets of linguistic units, Us and Ut, respectively. 1See Carreras and Marquez (2005) for an overview of recent approaches to semantic parsing.</Paragraph>
    <Paragraph position="3"> Kim versprach, punktlich zu kommen Kim promised to be on time  glish to German (word alignments as dotted lines) The assignment of semantic roles on the source side is a function roles : R - 2Us from roles to sets of source units. Constituent alignments are obtained in two steps. First, a real-valued function sim : Us xUt - R estimates pairwise similarities between source and target units. To make our model robust to alignment noise, we use only content words to compute the similarity function. Next, a decision procedure uses the similarity function to determine the set of semantically equivalent, i.e., aligned units A[?]UsxUt. Once A is known, semantic projection reduces to transferring the semantic roles from the source units onto their aligned target counterparts: rolet(r)={ut |[?]us [?]roles(r) : (us,ut)[?]A} In Pado and Lapata (2005), we evaluated two main parameters within this framework: (a) the choice of linguistic units and (b) methods for computing semantic alignments. Our results revealed  thatconstituent-basedmodelsoutperformedword-based ones by a wide margin (0.65 Fscore vs. 0.46), thus demonstrating the importance of bracketing in amending errors and omissions in the automatic word alignment. We also compared two simplistic alignment schemes, backward alignment and forward alignment. The first scheme aligns each target constituent to its most similar source constituent, whereas the second (Af) aligns each source constituent to its most  An example constituent alignment obtained from the forward scheme is shown in Figure 2 (left side). The nodes represent constituents in the source and target language and the edges indicate the resulting alignment. Forward alignment generally outperformed backward alignment (0.65 Fscore vs. 0.45). Both procedures have a time complexity quadratic in the maximal number of sentence nodes: O(|Us||Ut|) = O(max(|Us|,|Ut|)2).</Paragraph>
    <Paragraph position="4"> A shortcoming common to both decision procedures is that they are local, i.e., they optimise the alignment for each node independently of all other nodes. Consider again Figure 2. Here, the forward procedure creates alignments for all source nodes, but leaves constituents from the target set unaligned (see target node (1)). Moreover, local alignment methods constitute a rather weak model of semantic equivalence since they allow one target node to correspond to any number of source nodes (see target node (3) in Figure 2, which is aligned to three source nodes). In fact, by allowing any alignment between constituents, the local models can disregard important linguistic information, thus potentially leading to suboptimal results. We investigate this possibility by proposing well-understood global optimisation models which suitably constrain the resulting alignments.</Paragraph>
    <Paragraph position="5"> Besides matching constituents reliably, poor word alignments are a major stumbling block for achieving accurate projections. Previous research addresses this problem in a post-processing step, by reestimating parameter values (Yarowsky and Ngai, 2001), by applying transformation rules (Hwa et al., 2002), by using manually labelled data (Hi and Hwa, 2005), or by relying on linguistic criteria (Pado and Lapata, 2005). In this paper, we present a novel filtering technique based on tree pruning which removes extraneous constituents in a preprocessing stage, thereby disassociating filtering from the alignment computation. In the remainder of this paper, we present the details of our global optimisation and filtering techniques. We only consider constituent-based models, since these obtained the best performance in our previous study (Pado and Lapata, 2005).</Paragraph>
  </Section>
class="xml-element"></Paper>