<?xml version="1.0" standalone="yes"?> <Paper uid="I05-6001"> <Title>The TIGER 700 RMRS Bank: RMRS Construction from Dependencies</Title> <Section position="3" start_page="0" end_page="1" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Treebanks are under development for many languages. They are successfully exploited for the induction of treebank grammars, for training stochastic parsers, and for evaluating and benchmarking competitive parsing and grammar models. While parser evaluation against treebanks is most natural for treebank-derived grammars, it is extremely difficult for hand-crafted grammars that represent higher-level functional or semantic information, such as LFG or HPSG grammars.</Paragraph> <Paragraph position="1"> In a recent joint initiative, the TIGER project provides dependency-based treebank representations for German, on the basis of the TIGER treebank (Brants et al., 2002).</Paragraph> <Paragraph position="2"> Forst (2003) applied treebank conversion methods to the TIGER treebank to derive an f-structure bank for stochastic training and evaluation of a German LFG parser. A more theory-neutral dependency representation is currently derived from this TIGER-LFG treebank for cross-framework parser evaluation (Forst et al., 2004). However, while Penn-treebank style grammars and LFG analyses are relatively close to dependency representations, the output of HPSG parsing is difficult to match against such structures. HPSG analyses do not come with an explicit representation of functional structure, but directly encode semantic structures, in terms of (Robust) Minimal Recursion Semantics (henceforth (R)MRS; see footnote 1). This leaves a gap to be bridged in terms of the encoding of arguments vs. 
adjuncts, the representation of special constructions like relative clauses, and not least, the representation of quantifiers and their (underspecified) scoping relations.</Paragraph> <Paragraph position="3"> In order to bridge this gap, we construct an RMRS &quot;treebank&quot; from a subset of the TIGER Dependency Bank (Forst et al., 2004), which can serve as a gold standard for evaluating HPSG parsing and for training stochastic HPSG grammar models. In contrast to treebanks constructed from analyses of hand-crafted grammars, our treebank conversion approach yields a standard for comparative parser evaluation where the upper bound for coverage is defined by the corpus (here, German newspaper text), not by the grammar.</Paragraph> <Paragraph position="4"> Footnote 1: RMRS (Copestake, 2003) is a formalism for partial semantic representation that is derived from MRS (Copestake et al., 2005). It is designed for the integration of semantic representations produced by NLP components of different degrees of partiality and depth, ranging from chunk parsers and PCFGs to deep HPSG grammars with (R)MRS output.</Paragraph> <Paragraph position="5"> Our method for treebank conversion effectively performs principle-based (R)MRS semantics construction from LFG-based dependency representations, which can be extended to a general parsing architecture for (R)MRS construction from LFG f-structures.</Paragraph> <Paragraph position="6"> The remainder of this paper is organised as follows. Section 2 introduces the input dependency representations provided by the TIGER Dependency Bank, and describes the main features of the term rewriting machinery we use for treebank conversion, i.e., RMRS semantics construction from dependency structures. Section 3 presents the core of the semantics construction process. We show how to adapt the construction principles of the semantic algebra of Copestake et al. 
(2001) to RMRS construction from dependencies in a rewrite scenario, and discuss the treatment of some special phenomena, such as verbal complementation, coordination and modification.</Paragraph> <Paragraph position="7"> Section 4 reports on the treebank construction methodology, with first results of quality control. Section 5 concludes.</Paragraph> </Section> </Paper>