<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1501">
  <Title>Dependency and relational structure in treebank annotation</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Different treebanks use different annotation schemes that make explicit two distinct but interrelated aspects of sentence structure: the function of the syntactic units and their organization according to a part-whole paradigm. The first aspect corresponds to a form of Relational Structure (RS), the second to constituent or Phrase Structure (PS). The major difference between the two structures is that the RS allows several types of relations to link the syntactic units, whilst the PS involves a single relation, &quot;part-of&quot;. The RS can be seen as a generalization of dependency syntax, in which the syntactic units are instantiated to individual words in the dependency tree (Mel'čuk, 1988). As described in many theoretical linguistic frameworks, the RS provides a useful interface between syntax and a semantic or conceptual representation of predicate-argument structure. For example, Lexical Functional Grammar (LFG) (Bresnan, 1982) places relations at the interface between lexicon and syntax, while Relational Grammar (RG) (Perlmutter, 1983) describes sentence structure exclusively in terms of relations and syntactic units that are not structured beyond the string level.</Paragraph>
    <Paragraph position="1"> This paper investigates how the notion of RS has been applied in the annotation of treebanks, in terms of syntactic units and types of relations, and presents a system for defining the RS that encompasses several uses in treebank schemata and can be viewed as a common underlying representation. The system, called Augmented Relational Structure (ARS), allows for an explicit representation of the three major components of linguistic structure: morpho-syntactic, functional-syntactic and semantic. The paper then shows how a dependency-based annotation can be derived from ARS and describes, with a few quantitative results, the ARS-based annotation of the Turin University Treebank (TUT), a dependency treebank for Italian and the first treebank available for that language.</Paragraph>
    <Paragraph position="2"> The paper is organized as follows. The next section investigates both the annotation of RS in treebanks and the major motivations for the use of RS, which arise from language-specific issues and from the implementation of NLP tasks; we then present the ARS system; finally, we show the dependency annotation of the TUT corpus.</Paragraph>
  </Section>
</Paper>