<?xml version="1.0" standalone="yes"?>
<Paper uid="P04-1032">
  <Title>Minimal Recursion Semantics as Dominance Constraints: Translation, Evaluation, and Analysis</Title>
  <Section position="7" start_page="4" end_page="4" type="evalu">
    <SectionTitle>
6 Evaluation
</SectionTitle>
    <Paragraph position="0"> The two remaining assumptions underlying the translation are the &quot;net-hypothesis&quot; that all linguistically relevant MRS expressions are nets, and the &quot;qeq-hypothesis&quot; that handle constraints can be given a dominance semantics in practice. In this section, we show empirically that both assumptions are met in practice.</Paragraph>
    <Paragraph position="1"> As an interesting side effect, we also compare the runtimes of the constraint solvers we used, and we find that the dominance constraint solver typically outperforms the MRS solver, often by significant margins.</Paragraph>
    <Paragraph position="2"> Grammar and Resources. We use the English Resource Grammar (ERG), a large-scale HPSG grammar, together with the LKB system, a grammar development environment for typed feature grammars (Copestake and Flickinger, 2000). We use the system to parse sentences and output MRS constraints, which we then translate into dominance constraints. As a test corpus, we use the Redwoods Treebank (Oepen et al., 2002), which contains 6612 sentences. We exclude sentences that cannot be parsed, either because of memory limitations or because they contain words or grammatical structures not covered by the ERG, as well as sentences that produce ill-formed MRS expressions (typically violating M1). Our evaluation is thus based on a corpus of 6242 sentences.</Paragraph>
    <Paragraph position="3"> In case of syntactic ambiguity, we only use the first reading output by the LKB system.</Paragraph>
    <Paragraph position="4"> To enumerate the solutions of MRS constraints and their translations, we use the MRS solver built into the LKB system and a solver for weakly normal dominance constraints (Bodirsky et al., 2004), which is implemented in C++ and uses LEDA, a class library for efficient data types and algorithms (Mehlhorn and Näher, 1999).</Paragraph>
    <Section position="1" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
6.1 Relevant Constraints are Nets
</SectionTitle>
      <Paragraph position="0"> We check for each of the 6242 constraints whether it constitutes a net. It turns out that 5200 (83.31%) constitute nets, while 1042 (16.69%) violate one or more net conditions. Non-nets. The evaluation shows that the hypothesis that all relevant constraints are nets seems to be falsified: there are constraints that are not nets. However, a closer analysis suggests that these constraints are incomplete and predict more readings than the sentence actually has. This can also be illustrated with the average number of solutions: for the Redwoods corpus in combination with the ERG, nets have 1836 solutions on average, while non-nets have 14039 solutions, which is a factor of 7.7. The large number of solutions for non-nets is due to the &quot;structural weakness&quot; of non-nets; often, non-nets have only merging configurations.</Paragraph>
      <Paragraph position="1"> Non-nets can be classified into two categories (see Fig. 6). The first class consists of violated &quot;strong&quot; fragments, which have holes without an outgoing dominance edge and without a corresponding root-to-root dominance edge. The second class consists of violated &quot;island&quot; fragments, where several outgoing dominance edges from one hole lead to nodes which are not hypernormally connected. There are two more possibilities for violated &quot;weak&quot; fragments (having more than one weak dominance edge, or having a weak dominance edge without an empty hole), but they occur infrequently (4.4%). If those weak fragments were normalized, they would constitute violated island fragments, so we count them as such.</Paragraph>
      <Paragraph position="2"> Of the non-nets, 124 (11.90%) contain only empty holes, 762 (73.13%) contain only violated island fragments, and 156 (14.97%) contain both. Those constraints that contain only empty holes and no violated island fragments cannot be configured, since in configurations all holes must be filled.</Paragraph>
      <Paragraph position="3"> Fragments with open holes occur frequently, but not in all contexts, in constraints representing, for example, time specifications (e.g., &quot;from nine to twelve&quot; or &quot;a three o'clock flight&quot;) or intensional expressions (e.g., &quot;Is it?&quot; or &quot;I suppose&quot;). Ill-formed island fragments are often triggered by some kind of coordination, such as &quot;a restaurant and/or a sauna&quot; or &quot;a hundred and thirty Marks&quot;, including implicit coordination as in &quot;one hour thirty minutes&quot; or &quot;one thirty&quot;. Constraints with both kinds of violated fragments emerge when one part of the input yields an open hole and another part yields a violated island fragment (for example in constructions like &quot;from nine to eleven thirty&quot; or &quot;the ten o'clock flight Friday or Thursday&quot;, though not always as obviously as in these examples).</Paragraph>
      <Paragraph position="4"> The constraint on the left in Fig. 7 gives a concrete example of violated island fragments. The topmost fragment has outgoing dominance edges to otherwise unconnected subconstraints. Under the merging-free semantics of the MRS dialect used by Niehren and Thater (2003), where every hole has to be filled exactly once, this constraint cannot be configured: there is no hole into which &quot;available&quot; could be plugged. However, standard MRS has merging configurations, in which holes can be filled more than once. For the constraint in Fig. 7 this means that &quot;available&quot; can be merged in almost everywhere, restricted only by the &quot;qeq-semantics&quot;, which forbids, for instance, merging &quot;available&quot; with &quot;sauna.&quot; In fact, the MRS constraint solver derives sixteen configurations for the constraint, two of which are given in Fig. 7, although the sentence has only two scope readings.</Paragraph>
      <Paragraph position="5"> We conjecture that non-nets are semantically &quot;incomplete&quot; in the sense that certain constraints are missing. For instance, an alternative analysis for the above constraint is given in Fig. 8. The constraint adds an additional argument handle to &quot;and&quot; and places a dominance edge from this handle to &quot;available.&quot; This constraint is in fact a net; it has exactly two readings.</Paragraph>
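The reported shares can be re-derived from the raw counts. The following is a minimal sketch; the variable and function names are illustrative and not part of the original evaluation, and it assumes that the three non-net categories are disjoint, which is consistent with their counts summing to 1042.

```python
# Counts as reported in Section 6.1; all names here are illustrative only.
CORPUS_SIZE = 6242
NETS = 5200
NON_NETS = 1042

# Assumption: the three categories are disjoint (their counts add up to NON_NETS).
NON_NET_BREAKDOWN = {
    "empty holes only": 124,
    "violated island fragments only": 762,
    "both": 156,
}


def share(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Nets and non-nets partition the corpus; the categories partition the non-nets.
assert NETS + NON_NETS == CORPUS_SIZE
assert sum(NON_NET_BREAKDOWN.values()) == NON_NETS

net_share = share(NETS, CORPUS_SIZE)
non_net_share = share(NON_NETS, CORPUS_SIZE)
breakdown_shares = {kind: share(n, NON_NETS) for kind, n in NON_NET_BREAKDOWN.items()}

print(net_share, non_net_share, breakdown_shares)
```

Rounding to two decimals mirrors the precision used in the text.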
    </Section>
    <Section position="2" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
6.2 Qeq is Dominance
</SectionTitle>
      <Paragraph position="0"> For all nets, the dominance constraint solver calculates the same number of solutions as the MRS solver does, with 3 exceptions that hint at problems in the syntax-semantics interface. Every configuration that satisfies proper qeq-constraints is also a configuration if handle constraints are interpreted under the weaker notion of dominance. Since the two solvers also report the same number of solutions, the solutions computed by the dominance constraint solver and the MRS solver must be identical for every such constraint. This means that the additional expressivity of proper qeq-constraints is not used in practice, which in turn means that in practice, the translation is sound and correct even for the standard MRS notion of solution, provided the constraint is a net.</Paragraph>
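The counting argument can be stated compactly. In the sketch below, the symbols S_qeq(C) and S_dom(C) are notation introduced here for illustration (the sets of configurations of a constraint C under proper qeq-constraints and under the dominance reading, respectively); they do not appear in the original text.

```latex
\[
S_{\mathrm{qeq}}(C) \subseteq S_{\mathrm{dom}}(C), \qquad
|S_{\mathrm{qeq}}(C)| = |S_{\mathrm{dom}}(C)| \ \text{(both finite)}
\;\Longrightarrow\;
S_{\mathrm{qeq}}(C) = S_{\mathrm{dom}}(C).
\]
```

The inclusion is the general fact about the two semantics; the equality of cardinalities is what the experiment establishes for the nets in the corpus.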
    </Section>
    <Section position="3" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
6.3 Comparison of Runtimes
</SectionTitle>
      <Paragraph position="0"> The availability of a large body of underspecified descriptions both in MRS and in dominance constraint format makes it possible to compare the solvers for the two underspecification formalisms.</Paragraph>
      <Paragraph position="1"> We measured the runtimes on all nets using a Pentium III CPU at 1.3 GHz. The tests were run in a multi-user environment, but as the MRS and dominance measurements were conducted pairwise, conditions were equal for every MRS constraint and corresponding dominance constraint.</Paragraph>
      <Paragraph position="2"> The measurements for all MRS-nets with fewer than thirty dominance edges are plotted in Fig. 9.</Paragraph>
      <Paragraph position="3"> Inputs are grouped according to the constraint size.</Paragraph>
      <Paragraph position="4"> The filled circles indicate average runtimes within each size group for enumerating all solutions using the dominance solver, and the empty circles indicate the same for the LKB solver. The brackets around each point indicate maximum and minimum runtimes in that group. Note that the vertical axis is logarithmic.</Paragraph>
      <Paragraph position="5"> We excluded cases in which one or both of the solvers did not return any results: There were 173 sentences (3.33% of all nets) on which the LKB solver ran out of memory, and 1 sentence (0.02%) that took the dominance solver more than two minutes to solve.</Paragraph>
      <Paragraph position="6"> The graph shows that the dominance constraint solver is generally much faster than the LKB solver: the average runtime is lower by a factor of 50 for constraints of size 10, and this grows to a factor of 500 for constraints of size 25. Our experiments show that the dominance solver outperforms the LKB solver in 98% of the cases. In addition, its runtimes are much more predictable, as the brackets in the graph are also shorter by two or three orders of magnitude, and the standard deviation is much smaller (not shown).</Paragraph>
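The per-size aggregation behind a plot like Fig. 9 (minimum, average, and maximum runtime per constraint size, plus the ratio of average runtimes) might be computed along the following lines. This is a minimal sketch under stated assumptions: the function name, the input layout of (size, dominance seconds, LKB seconds) triples, and the toy timings are all hypothetical and are not the measurements or tooling used in the paper.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_size(measurements):
    """Group (size, dominance_seconds, lkb_seconds) triples by constraint size.

    Returns a dict mapping each size to the (min, mean, max) runtime of each
    solver and the ratio of the mean LKB runtime to the mean dominance runtime.
    """
    groups = defaultdict(lambda: {"dominance": [], "lkb": []})
    for size, dom_t, lkb_t in measurements:
        groups[size]["dominance"].append(dom_t)
        groups[size]["lkb"].append(lkb_t)

    summary = {}
    for size, times in sorted(groups.items()):
        dom, lkb = times["dominance"], times["lkb"]
        summary[size] = {
            "dominance": (min(dom), mean(dom), max(dom)),
            "lkb": (min(lkb), mean(lkb), max(lkb)),
            "speedup": mean(lkb) / mean(dom),
        }
    return summary


# Made-up toy timings for illustration only, NOT data from the paper.
toy = [(10, 0.5, 2.0), (10, 1.5, 6.0), (25, 2.0, 16.0)]
result = summarize_by_size(toy)
print(result)
```

Because the measurements were taken pairwise, each triple keeps the two runtimes for the same constraint together, so both solvers are always summarized over exactly the same inputs.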
    </Section>
  </Section>
</Paper>