<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-2026">
  <Title>Grammatical Role Labeling with Integer Linear Programming</Title>
  <Section position="3" start_page="0" end_page="187" type="metho">
    <SectionTitle>
2 The ILP Specification
</SectionTitle>
    <Paragraph position="0"> Integer Linear Programming (ILP) is the name of a class of constraint-solving algorithms that are restricted to a numerical representation of the problem to be solved. The objective is to optimize (minimize or maximize) the value of a system of linear equations (see the objective function in Fig. 1).</Paragraph>
    <Paragraph position="1"> The general form of an ILP specification is given in Fig. 1 (here: maximization). The goal is to maximize an n-ary function f, which is defined as the weighted sum of binary variables, i.e. the sum of the terms c_i * x_i.</Paragraph>
    <Paragraph position="2"> Assignment decisions (e.g. grammatical role labeling) can be modeled in the following way: the x_i are binary class variables that indicate the (non-)assignment of a constituent c_k to the grammatical function g (e.g. subject) of a verb v_j. To represent this, three indices are needed; x thus is a complex variable name.</Paragraph>
    <Paragraph position="4"> For the sake of readability, we add some mnemotechnical sugar and write g v_j c_k, e.g. S v_j c_k for constituent c_k being (or not being) the subject S of verb v_j (S thus is an instantiation of g). If the value of such a class variable g v_j c_k is set to 1 in the course of the maximization task, the attachment was successful; otherwise (g v_j c_k = 0) it failed. The c_i from Fig. 1 are weights that represent the impact of an assignment (or a constraint); they provide an empirically based numerical justification of the assignment (we do not need the constraint coefficients of the general form). For example, we represent the impact of g v_j c_k = 1 by the weight w_{g,j,k}. These weights are derived from a maximum entropy model trained on a treebank (see section 5).</Paragraph>
    <Paragraph position="5"> The constraint part of Fig. 1 is used to set up numerical restrictions, for example that a constituent can only be the filler of one grammatical role. The decision which of the class variables are to be &amp;quot;on&amp;quot; or &amp;quot;off&amp;quot; is based on the weights and on the constraints that an overall solution must obey. ILP seeks the optimal such solution.</Paragraph>
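The maximization over binary class variables described above can be sketched with a tiny brute-force 0-1 solver. This is a toy stand-in for a real ILP solver, and all names and weights below are invented for illustration:

```python
from itertools import product

def solve_binary_ilp(weights, constraints):
    # Exhaustively maximize sum(w_i * x_i) over binary x_i, subject to
    # constraints of the form (coefficients, bound), each meaning
    # sum(a_i * x_i) <= bound. Only viable for tiny toy instances.
    best_val, best_x = float("-inf"), None
    for x in product((0, 1), repeat=len(weights)):
        if all(sum(a * xi for a, xi in zip(coeffs, x)) <= bound
               for coeffs, bound in constraints):
            val = sum(w * xi for w, xi in zip(weights, x))
            if val > best_val:
                best_val, best_x = val, x
    return best_val, best_x

# Two class variables competing for the same role slot: at most one may be
# "on", so the solver keeps the higher-weighted assignment.
val, x = solve_binary_ilp([0.7, 0.4], [([1, 1], 1)])
```

A production system would hand the same objective and constraints to a dedicated ILP solver; the exhaustive search only illustrates the semantics of the specification.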
  </Section>
  <Section position="4" start_page="187" end_page="188" type="metho">
    <SectionTitle>
3 Formalization
</SectionTitle>
    <Paragraph position="0"> We restrict our formalization to the following set of grammatical functions: subject (S), direct (i.e.</Paragraph>
    <Paragraph position="1"> accusative) object (O), indirect (i.e. dative) object (I), clausal complement (K), prepositional complement (P), attributive (np or pp) attachment (B) and adjunct (A). The set of grammatical relations of a verb (its verb complements) is denoted by G; it comprises S, O, I, K and P.</Paragraph>
    <Paragraph position="2"> The objective function is: max F = V + B + A + N.</Paragraph>
    <Paragraph position="4"> A represents the weighted sum of all adjunct attachments. B is the weighted sum of all attributive np (&amp;quot;the book in her hand ..&amp;quot;) and genitive</Paragraph>
    <Paragraph position="6"> attachments (&amp;quot;die Frau des Professors&amp;quot; [the wife of the professor]). N represents the weighted sum of all unassigned objects. V is the weighted sum of the case frame instantiations of all verbs in the sentence. It is defined as follows: V = sum_i sum_{g in G(v_i)} sum_j w_{g,i,j} * g v_i c_j.</Paragraph>
    <Paragraph position="8"> This sums up over all verbs. For each verb v_i, each grammatical role g (G(v_i) is the set of such roles) is instantiated from the stock of all constituents (c_j from Consts, which includes all np and pp constituents but also the verbs as potential heads of clausal objects). g v_i c_j is a variable that indicates the assignment of a constituent c_j to the grammatical function g of verb v_i.</Paragraph>
    <Paragraph position="9"> w_{g,i,j} is the weight of such an assignment. The (binary) value of each g v_i c_j is determined in the course of the constraint satisfaction process; the weight is taken from the maximum entropy model.</Paragraph>
    <Paragraph position="10"> B is the function for weighted attributive attachments: B = sum_i sum_j w^B_{i,j} * B c_i c_j,</Paragraph>
    <Paragraph position="12"> where w^B_{i,j} is the weight of an assignment of constituent c_j to constituent c_i, and B c_i c_j is a binary variable indicating the classification decision whether c_j actually modifies c_i. In contrast to</Paragraph>
    <Paragraph position="14"> The function for weighted adjunct attachments, A, is: A = sum_i sum_j w^A_{i,j} * A v_i c_j,</Paragraph>
    <Paragraph position="16"> where c_j ranges over Consts_pp, the set of pp constituents of the sentence. w^A_{i,j} is the weight given to a classification of a pp as an adjunct of a clause with v_i as verbal head.</Paragraph>
    <Paragraph position="17"> The function for the weighted assignment to the null class, N, is: N = sum_j w^N_j * N c_j.</Paragraph>
    <Paragraph position="19"> This represents the impact of assigning a constituent neither to a verb (as a complement) nor to another constituent (as an attributive modifier).</Paragraph>
    <Paragraph position="21"> Such a constituent has no head (e.g. a finite verb as part of a sentential coordination), although it might be the head of other constituents c_j.</Paragraph>
    <Paragraph position="22"> Equations 1 to 5 are devoted to the maximization task, i.e. they determine which constituent is attached to which grammatical function and with which impact. Of course, without any further restrictions, every constituent would get assigned to every grammatical role - because there are no co-occurrence restrictions, and exactly this would lead to the maximal sum. In order to assure a valid distribution, restrictions have to be formulated, e.g. that a grammatical role can have at most one filler and that a constituent can be the filler of at most one grammatical role.</Paragraph>
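The point of the last paragraph - that without constraints the weighted sum is maximized by switching every class variable on - can be made concrete with a small sketch. The sentence, weights and variable names here are invented for illustration; in the paper the weights come from a maximum entropy model:

```python
# Hypothetical toy sentence: one verb v0 and two constituents c0 (np), c1 (pp).
w_role = {("S", "v0", "c0"): 0.8, ("S", "v0", "c1"): 0.1,
          ("O", "v0", "c0"): 0.3, ("O", "v0", "c1"): 0.2}
w_adjunct = {("A", "v0", "c1"): 0.6}
w_null = {("N", "c0"): 0.05, ("N", "c1"): 0.05}

# One objective coefficient per binary class variable: V + A + N terms.
objective = {**w_role, **w_adjunct, **w_null}

# With no constraints, setting every binary variable to 1 is optimal, so the
# unconstrained maximum is simply the sum of all weights - an invalid
# "distribution" in which c0 and c1 fill every role at once.
unconstrained_max = sum(objective.values())
```

This is exactly why the constraints of section 4 are needed: they rule out solutions where a constituent fills several roles simultaneously.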
  </Section>
  <Section position="5" start_page="188" end_page="188" type="metho">
    <SectionTitle>
4 Constraints
</SectionTitle>
    <Paragraph position="0"> A constituent c_j must be bound either as an attribute, as an adjunct, as a verb complement or by the null class. This is to say that all class variables containing c_j sum up to exactly 1; c_j then is consumed: sum_{i,g} g v_i c_j + sum_i B c_i c_j + sum_i A v_i c_j + N c_j = 1.</Paragraph>
    <Paragraph position="2"> Here, i is an index over all constituents and verbs, and g is one of the grammatical roles of verb v_i (g in G). No two constituents can be attached to each other symmetrically (being head and modifier of each other at the same time), i.e. B (among others) is defined to be asymmetric: B c_i c_j + B c_j c_i &lt;= 1.</Paragraph>
    <Paragraph position="4"> Finally, we must restrict the number of fillers a grammatical role can have. Here, we have to distinguish between our two settings. In setting one (all case roles of all frames of a verb are collapsed into a single set of case roles), we can't require all grammatical roles to be instantiated (since we have an artificial case frame, not necessarily a proper one). This is expressed as: for every verb v_i and every role g in G(v_i), sum_j g v_i c_j &lt;= 1.</Paragraph>
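A minimal sketch of how these constraints interact, on a hypothetical instance with one verb (roles S and O) and two constituents; all weights are invented:

```python
from itertools import product

# Binary variables, in order: S_v0_c0, S_v0_c1, O_v0_c0, O_v0_c1, N_c0, N_c1
w = [0.8, 0.1, 0.3, 0.2, 0.05, 0.05]

def feasible(x):
    s0, s1, o0, o1, n0, n1 = x
    # each constituent is consumed exactly once (complement or null class)
    if s0 + o0 + n0 != 1 or s1 + o1 + n1 != 1:
        return False
    # each grammatical role has at most one filler
    return s0 + s1 <= 1 and o0 + o1 <= 1

# Brute-force maximization of the weighted sum over feasible assignments.
best = max((x for x in product((0, 1), repeat=6) if feasible(x)),
           key=lambda x: sum(wi * xi for wi, xi in zip(w, x)))
```

Under these constraints the optimum assigns c0 to the subject role and c1 to the object role, rather than piling every constituent onto every role.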
  </Section>
  <Section position="6" start_page="188" end_page="188" type="metho">
    <SectionTitle>
5 The Weighting Scheme
</SectionTitle>
    <Paragraph position="0"> A maximum entropy model was used to fix a probability model that serves as the basis for the ILP weights. The model was trained on the Tiger treebank (Brants et al., 2002) with feature vectors stemming from the following set of features: the part-of-speech tags of the two candidate chunks, the distance between them in phrases, the number of verbs between them, the number of punctuation marks between them, the person, case and number of the candidates, their heads, the direction of the attachment (left or right) and a passive/active voice flag.</Paragraph>
    <Paragraph position="1"> The output of the maxent model is, for each pair of chunks (represented by their feature vectors), a probability vector. Each entry in this probability vector represents the probability (used as a weight) that the two chunks are in a particular grammatical relation (including the &amp;quot;non-grammatical relation&amp;quot;, NGR). For example, the weight for an adjunct assignment, w^A_{m,n}, of two chunks v_m (a verb) and c_n (an np or a pp) is given by the corresponding entry in the probability vector of the maximum entropy model. The vector also provides values for a subject assignment of these two chunks etc.</Paragraph>
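How such a probability vector might look for one chunk pair can be sketched as follows. The labels follow section 3, but the scores are invented; a real maxent model computes them from the feature vector:

```python
import math

# One invented log-linear score per relation label for a chunk pair
# (v_m, c_n); "NGR" is the non-grammatical-relation class, which in
# practice dominates because most candidate pairs are unrelated.
labels = ["S", "O", "I", "K", "P", "A", "B", "NGR"]
scores = [1.2, 0.4, 0.1, 0.0, 0.3, 0.9, 0.2, 2.0]

# Normalizing the exponentiated scores yields the probability vector
# whose entries serve directly as ILP weights.
z = sum(math.exp(s) for s in scores)
probs = {lab: math.exp(s) / z for lab, s in zip(labels, scores)}

w_adjunct = probs["A"]  # weight for the adjunct assignment of this pair
```

The design point is that every candidate relation of a pair gets a weight at once, so the ILP can trade off, say, an adjunct reading of (v_m, c_n) against a subject reading during the global optimization.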
  </Section>
  <Section position="7" start_page="188" end_page="189" type="metho">
    <SectionTitle>
6 Empirical Results
</SectionTitle>
    <Paragraph position="0"> The overall precision of the maximum entropy classifier is 87.46%. Since candidate pairs are generated almost without restrictions, most pairs do not realize a proper grammatical relation. In the training set these examples are labeled with the non-grammatical relation label NGR (which is the basis of ILP's null class N). Since maximum entropy modeling seeks to sharpen the classifier with respect to the most prominent class, NGR gets a strong bias. Things therefore get worse if we focus on the proper grammatical relations: the precision then is low, namely 62.73%, the recall is 85.76%, and the f-measure is 72.46%. ILP improves the precision by almost 20% (in the &amp;quot;all frames in one&amp;quot; setting the precision is 81.31%).</Paragraph>
    <Paragraph position="1"> We trained on 40,000 sentences, which yields about 700,000 vectors (90% training, 10% test, including negative and positive pairings). Our first experiment was devoted to fixing an upper bound for the ILP approach: we selected from the set of subcategorization frames of a verb the correct one (according to the gold standard). The set of licensed grammatical relations then is reduced to the correctly subcategorized GR and the non-governable GR A (adjunct) and B (attribute). The results are given in Fig. 2 (cf. section 3 for GR shortcuts, e.g. S for subject).</Paragraph>
    <Paragraph position="2">  The results for the governable GR (S down to K) are quite good; only the results for prepositional complements (P) are low (the f-measure is 76.4%). Of the 36509 grammatical relations, 37173 were found and 31680 were correct. Overall precision is 85.23%, recall is 86.77% and the f-measure is 85.99%. The most dominant error made here is the coherent but wrong assignment of constituents to grammatical roles (e.g. the subject is taken to be the object). This is not a problem with ILP or the subcategorization frames, but with the statistical model (and the feature vectors): it does not discriminate well among alternatives. Any improvement of the statistical model will push the precision of ILP.</Paragraph>
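The reported scores can be recomputed from the stated counts; the values below agree with the quoted percentages to within rounding:

```python
# Counts as stated in the text: gold-standard relations, system proposals,
# and the proposals that match the gold standard.
gold, found, correct = 36509, 37173, 31680

precision = correct / found                                 # ~0.8522
recall = correct / gold                                     # ~0.8677
f_measure = 2 * precision * recall / (precision + recall)   # ~0.8599
```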
    <Paragraph position="3"> The results of the second setting, i.e. collapsing all grammatical roles of the verb frames into a single role set (cf. Fig. 2), are astonishingly good. The f-measure comes close to the results of (Buchholz, 1999). Overall precision is 79.99%, recall 82.67% and the f-measure is 81.31%. As expected, the values of the governable GR decrease (e.g. recall for prepositional objects drops by 30.1%).</Paragraph>
    <Paragraph position="4"> The third setting will be to let ILP choose among all subcategorization frames of a verb (there are up to 20 frames per verb). First experiments have shown that the results are between the</Paragraph>
  </Section>
</Paper>