<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-2157">
  <Title>A Description Language for Syntactically Annotated Corpora</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Syntactically annotated corpora like the Penn Treebank (Marcus et al., 1993), the NeGra corpus (Skut et al., 1998) or the statistically disambiguated parses in (Bell et al., 1999) provide a wealth of information, which can only be exploited with an adequate query language. For example, one might want to retrieve verbs with their sentential complements, or specific fronting or extraposition phenomena. So far, queries to a treebank have been formulated in scripting languages like tgrep, Perl or others. Recently, some powerful query languages have been developed: an example of a high-level, constraint-based language is described in (Duchier and Niehren, 1999). (Bird et al., 2000) propose a query language for the general concept of annotation graphs. A graphical query notation for trees is under development in the ICE project (UCL, 2000).</Paragraph>
    <Paragraph position="1"> In the current paper, we present a proposal for a graph description language which is meant to fulfill two conflicting requirements: On the one hand, the language should be close to traditional linguistic description languages, i.e. to grammar formalisms, as a basis for modular, understandable code, even for complex corpus queries. On the other hand, the language should not preclude efficient query evaluation. Our answer is to profit from the research on typed, feature-based/constraint-based grammar formalisms (e.g. (Carpenter, 1992), (Copestake, 1999), (Dörre and Dorna, 1993), (Dörre et al., 1996), (Emele and Zajac, 1990), (Höhfeld and Smolka, 1988)), and to pick those ingredients which are known to be computationally 'tractable' in some sense.</Paragraph>
  </Section>
</Paper>