<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2409">
  <Title>Modeling Monolingual and Bilingual Collocation Dictionaries in Description Logics</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> We discuss how to model linguistic knowledge about collocations for monolingual and bilingual electronic dictionaries that serve multiple uses, both in NLP and for human users.</Paragraph>
    <Paragraph position="1"> Our notion of collocation is a lexicographic one, inspired by (Bartsch, 2004); we start from her working definition: &quot;collocations are lexically and/or pragmatically constrained recurrent cooccurrences of at least two lexical items which are in a direct relation with each other.&quot; Because such lexical and pragmatic constraints are language-specific, they lead to translation problems. With Hausmann (2004), we assume that collocations have a base and a collocate, where the base is autosemantic and thus translatable without reference to the collocation, whereas the collocate is synsemantic, i.e. its reading is selected within a given collocation. Examples of collocations according to this definition include adjective+noun combinations (heavy smoker, strong tea, etc.), verb+subject groups (question arises, question comes up) and verb+complement groups (give+talk, take+walk). The definition excludes, however, named entities (Rio de Janeiro) and frequent compositional groups (e.g. the police said...). Our data have been semi-automatically extracted from 200 million words of German newspaper text of the 1990s (cf. Ritz (2005)).</Paragraph>
    <Paragraph position="2"> We claim that a detailed monolingual description of the linguistic properties of collocations provides a solid basis for bilingual collocation dictionaries. The types of linguistic information needed for NLP and those required for human use, e.g. in text production or translation into a foreign language, overlap to a large extent. Thus, it is reasonable to define comprehensive monolingual data models and to relate these with a view to translation. In section 2, we briefly list the most important phenomena to be captured (see also Heid and Gouws (2006)); section 3 introduces OWL DL, motivates its choice as a representation format and describes our monolingual modeling. In section 4, we discuss and illustrate the bilingual dictionary architecture.</Paragraph>
  </Section>
</Paper>