<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0817"> <Title>Semantic Role Labelling with Similarity-Based Generalization Using EM-based Clustering</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Features </SectionTitle> <Paragraph position="0"> Constituent features. The first group of features represents properties of instances (i.e. constituents). We used the phrase type and head lemma of each constituent, its preposition, if any (otherwise NONE), its relative position with respect to the target (left, right, overlapping), the phrase type of its mother node, and the simplified path from the target to the constituent: all phrase types encountered on the way, and whether each step was up or down.</Paragraph> <Paragraph position="1"> Two further features stated whether this path had been seen as a frame element in the training data, and whether the constituent was subcategorised for (determined heuristically).</Paragraph> <Paragraph position="2"> Sentence level features. The second type of feature described the context of the current instance: The target word was characterised by its lemma, POS, voice, subcat frame (determined heuristically), and its governing verb; we also compiled a list of all prepositions in the sentence.</Paragraph> <Paragraph position="3"> Semantic features. The third type of features made use of EM-based clustering, stating the most probable label assigned to the constituent by the clustering model as well as a confidence score for this decision.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Classification </SectionTitle> <Paragraph position="0"> We first describe our general procedure, then the two different machine learning systems we used.</Paragraph> <Paragraph position="1"> Classification Procedure. As the semantic role labels of FrameNet are frame-specific, we decided to train one classifier for each frame. 
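The simplified path feature described in Section 3 can be sketched as follows. This is a minimal illustration assuming a toy parse tree with parent pointers; the Node class and the ^/! step markers are our own notation, not the paper's actual data structures.

```python
# Illustrative sketch of the simplified-path feature: all phrase types on the
# way from the target to the constituent, with the direction of each step
# (^ = up, ! = down). Hypothetical tree representation, for illustration only.

class Node:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent

def ancestors(node):
    """Chain [node, node.parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain

def simplified_path(target, constituent):
    """Up from the target to the lowest common ancestor, then down
    to the constituent, e.g. 'VB^VP^S!NP'."""
    up, down = ancestors(target), ancestors(constituent)
    common = next(n for n in up if n in down)  # lowest common ancestor
    up_part = [n.label for n in up[:up.index(common) + 1]]
    down_part = [n.label for n in reversed(down[:down.index(common)])]
    path = "^".join(up_part)
    if down_part:
        path += "!" + "!".join(down_part)
    return path
```

For a tree S(VP(VB), NP) with the VB as target, the path to the NP comes out as VB^VP^S!NP, matching the "phrase types encountered on the way, up or down" description above.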
To cope with the large number of constituents bearing no role label, we divided the procedure into two steps, distinguishing argument identification and argument labelling. First, argument identification decides for all constituents whether they are role-bearers or not.</Paragraph> <Paragraph position="2"> Then, argument labelling assigns semantic roles to those constituents classified as role-bearing. In our example (Fig. 1), the first step of classification ideally would single out the two NPs as possible role fillers, while the second step would assign the COGNIZER and CONTENT roles.</Paragraph> <Paragraph position="3"> Maximum Entropy Learning. Our first classifier was a log-linear model, where the probability of a class $c$ given a feature vector $\vec{x}$ is defined as</Paragraph> <Paragraph position="5"> $p(c \mid \vec{x}) = \frac{1}{Z} \exp\big(\sum_i \lambda_i f_i(\vec{x}, c)\big)$, where $Z$ is a normalisation constant. The model is trained by optimising the weights $\lambda_i$ subject to the maximum entropy constraint, which ensures that the least committal optimal model is learnt. Maximum Entropy (Maxent) models have been successfully applied to semantic role labelling (Fleischman et al., 2003). We used the estimate software for estimation, which implements the LMVM algorithm (Malouf, 2002) and was kindly provided by Rob Malouf.</Paragraph> <Paragraph position="6"> Memory-based Learning. Our second learner implements an instance of a memory-based learning (MBL) algorithm, namely the $k$-nearest neighbour algorithm. This algorithm classifies test instances by assigning them the label of the most similar examples from the training set. Its parameters are the number of training examples to be considered, the similarity metric, and the feature weighting scheme. We used the implementation provided by TiMBL (Daelemans et al., 2003) with the default parameters, i.e.
$k = 1$ and the weighted overlap similarity metric with gain ratio feature weighting.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Similarity-based Generalisation over </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Training Instances </SectionTitle> <Paragraph position="0"> FrameNet role labels are frame-specific. This makes it necessary to either train individual classifiers with little training data per frame, or train a large classifier with many sparse classes. So one important question is whether we can generalise, i.e.</Paragraph> <Paragraph position="1"> exploit similarities between frame elements, to gain more training data.</Paragraph> <Paragraph position="2"> We experimented with different generalisation methods, all following the same basic idea: if frame element A1 of frame A and frame element B1 of frame B are similar, we re-use A1 training data as B1 instances. In this process, we mask out those features of the A1 instances which might harm learning, such as targets or sentence level features, or semantic features in the case of syntactic similarities (and vice versa). We explored three types of role similarities, two based on symbolic information from the FrameNet database, and one statistical.</Paragraph> <Paragraph position="3"> Frame Hierarchy. FrameNet specifies frame-to-frame relations, among them three that order frames hierarchically: Inheritance, the Uses relation of partial inheritance, and the Subframe relation linking larger situation frames to their individual stages.
All three indicate semantic similarity between (at least some) frame elements; in some cases corresponding frame elements are also syntactically similar, e.g.</Paragraph> <Paragraph position="4"> the Victim role of Cause_harm and the Evaluee role of Corporal_punishment are both typically realised as direct objects.</Paragraph> <Paragraph position="5"> Peripheral frame elements. FrameNet distinguishes core, extrathematic, and peripheral frame elements. Peripheral frame elements are frame-independent adjuncts; however, the same frame element may be peripheral to one frame and core to another. So we took a peripheral frame element as similar to the same peripheral frame element in other frames: given an instance of a peripheral frame element, we used it as a training instance for all frames for which it was marked as peripheral in the FrameNet database.</Paragraph> <Paragraph position="6"> EM-based clustering. The clustering model measures the &quot;goodness of fit&quot; between a target word and a potential role filler. We now say that two frame elements are similar if they are appropriate for some common cluster. For the head lemma clustering model, we define the appropriateness $app_c(tr)$ of a target:role pair $tr$ for a cluster $c$ as follows:</Paragraph> <Paragraph position="8"> $app_c(tr) = \sum_h f_{tr}(h) \, P(c \mid h)$, where $app_c(tr)$ is the total frequency of all head lemmas $h$ that have been seen with $tr$, weighted by the class-membership probability of $h$ in $c$. This appropriateness measure $app_c(tr)$ is built on top of the class-based frequencies $f_{tr}(h) \, P(c \mid h)$ rather than on the frequencies $f_{tr}(h)$ or the class-membership probabilities $P(c \mid h)$ in isolation: for some tasks, the combination of lexical and semantic information has been shown to outperform each of the single information sources (Prescher et al., 2000).
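The appropriateness measure and the resulting similarity test can be sketched as follows. This is a toy illustration only: the frequency table and cluster-membership probabilities below are hypothetical stand-ins for the trained EM clustering model, and the role names are invented.

```python
# Sketch of the appropriateness measure app_c(tr) = sum_h f_tr(h) * P(c|h)
# and the threshold-based similarity test between two frame elements.
# head_freqs and p_cluster_given_head are illustrative stand-ins for the
# statistics a real EM clustering model would supply.

def appropriateness(cluster, role, head_freqs, p_cluster_given_head):
    """Total frequency of head lemmas seen with `role`, weighted by each
    lemma's membership probability in `cluster`."""
    return sum(freq * p_cluster_given_head.get((cluster, head), 0.0)
               for head, freq in head_freqs.get(role, {}).items())

def similar(role1, role2, clusters, theta, head_freqs, p_cluster_given_head):
    """Two frame elements count as similar if some cluster is appropriate
    (appropriateness above the threshold theta) for both of them."""
    return any(
        appropriateness(c, role1, head_freqs, p_cluster_given_head) >= theta
        and appropriateness(c, role2, head_freqs, p_cluster_given_head) >= theta
        for c in clusters
    )
```

The threshold theta then directly controls how aggressively training data is shared across frames: a low theta merges more frame elements and yields more (but noisier) extra training instances.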
Our similarity notion is now formalised as follows: with a threshold $\theta$ as a parameter, two frame elements $tr_1$ and $tr_2$ count as similar if for some cluster $c$, both $app_c(tr_1) \geq \theta$ and $app_c(tr_2) \geq \theta$.</Paragraph> <Paragraph position="10"> In the syntactic clustering model, a role filler was described as a combination of the path from instance to target, the instance's preposition, and the target voice. The appropriateness of a target:role pair is defined as for the above model. For time reasons, only verbal targets were considered.</Paragraph> <Paragraph position="11"> Figure 2 shows excerpts of two &quot;syntactic&quot; clusters in the form of target:frame.role members.</Paragraph> <Paragraph position="12"> Group 6 is a very homogeneous group, consisting of roles that are usually realised as subjects. Group 11 contains roles realised as prepositional phrases, but with very diverse prepositions, including in, at, along, and from.</Paragraph> </Section> </Section></Paper>