<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1001">
  <Title>Data Homogeneity and Semantic Role Tagging in Chinese</Title>
  <Section position="3" start_page="1" end_page="2" type="intro">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> The definition of semantic roles falls on a continuum from abstract ones to very specific ones.</Paragraph>
    <Paragraph position="1"> Gildea and Jurafsky (2002), for instance, used a set of roles defined according to the FrameNet model (Baker et al., 1998), thus corresponding to the frame elements in individual frames under a particular domain to which a given verb belongs.</Paragraph>
    <Paragraph position="2"> Lexical entries (in fact not limited to verbs, in the case of FrameNet) falling under the same frame will share the same set of roles. Gildea and Palmer (2002) defined roles with respect to individual predicates in the PropBank, without explicit naming. To date PropBank and FrameNet are the two main resources in English for training semantic role labelling systems, as in the CoNLL-2004 shared task (Carreras and Marquez, 2004) and SENSEVAL-3 (Litkowski, 2004).</Paragraph>
    <Paragraph position="3"> The theoretical treatment of semantic roles is also varied in Chinese. In practice, for example, the semantic roles in the Sinica Treebank mark not only verbal arguments but also modifier-head relations (You and Chen, 2004). In our present study, we go for a set of more abstract semantic roles similar to the thematic roles for English used in VerbNet (Kipper et al., 2002). These roles are generalisable to most Chinese verbs and are not dependent on particular predicates. They will be further introduced in Section 3.</Paragraph>
    <Paragraph position="4"> Approaches in automatic semantic role labelling are mostly statistical, typically making use of a number of features extracted from parsed training sentences. In Gildea and Jurafsky (2002), the features studied include phrase type (pt), governing category (gov), parse tree path (path), position of constituent with respect to the target predicate (position), voice (voice), and headword (h). The labelling of a constituent then depends on its likelihood to fill each possible role r given the features and the target predicate t, as in the following, for example: ),,,,,|( tvoicepositiongovpthrP Subsequent studies exploited a variety of implementation of the learning component. Transformation-based approaches were also used (e.g.</Paragraph>
    <Paragraph position="5"> see Carreras and Marquez (2004) for an overview of systems participating in the CoNLL shared task).</Paragraph>
    <Paragraph position="6"> Swier and Stevenson (2004) innovated with an unsupervised approach to the problem, using a bootstrapping algorithm, and achieved 87% accuracy.</Paragraph>
    <Paragraph position="7"> While the estimation of the probabilities could be relatively straightforward, the trick often lies in locating the candidate constituents to be labelled.</Paragraph>
    <Paragraph position="8"> A parser of some kind is needed. Gildea and Palmer (2002) compared the effects of full parsing and shallow chunking; and found that when constituent boundaries are known, both automatic parses and gold standard parses resulted in about 80% accuracy for subsequent automatic role tagging, but when boundaries are unknown, results with automatic parses dropped to 57% precision and 50% recall. With chunking only, performance further degraded to below 30%. Problems mostly arise from arguments which correspond to more than one chunk, and the misplacement of core arguments. Sun and Jurafsky (2004) also reported a drop in F-score with automatic syntactic parses compared to perfect parses for role labelling in Chinese, despite the comparatively good results of their parser (i.e. the Collins parser ported to Chinese). The necessity of parse information is also reflected from recent evaluation exercises. For instance, most systems in SENSEVAL-3 used a parser to obtain full syntactic parses for the sentences, whereas systems participating in the CoNLL task were restricted to use only shallow  syntactic information. Results reported in the former tend to be higher. Although the dataset may be a factor affecting the labelling performance, it nevertheless reinforces the usefulness of full syntactic information.</Paragraph>
    <Paragraph position="9"> According to Carreras and Marquez (2004), for English, the state-of-the-art results reach an F  measure of slightly over 83 using gold standard parse trees and about 77 with real parsing results.</Paragraph>
    <Paragraph position="10"> Those based on shallow syntactic information is about 60.</Paragraph>
    <Paragraph position="11"> In this work, we study the problem in Chinese, treating it as a headword identification and labelling task in the absence of parse information, and examine how the nature of the dataset could affect the role tagging performance.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML