File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/06/w06-2408_intro.xml
<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2408">
<Title>Heiki-Jaan.Kaalep@ut.ee</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> This paper deals with verbal multiword expressions (VMWEs) in real texts of a highly inflectional language, Estonian. The main emphasis is on the morphological and syntactic variability of such constructions, with some implications and recommendations for their automatic treatment. Once we have a lexicon of VMWEs large enough to be used in real-life applications (to help with morphological disambiguation, syntactic analysis, machine translation, etc.), we need to devise algorithms that actually use it. This in turn requires knowledge about the behavior of VMWEs in real texts.</Paragraph>
<Paragraph position="1"> Estonian belongs to the Finnic group of the Finno-Ugric language family.</Paragraph>
<Paragraph position="2"> Typologically, Estonian is an agglutinating language, but it is more fusional and analytic than the languages belonging to the northern branch of the Finnic languages. The word order is relatively free. A detailed description of the grammatical system of Estonian can be found in (Erelt 2003).</Paragraph>
<Paragraph position="3"> In this paper we focus on a special type of Estonian multi-word expression, namely those that can function as a predicate in a clause. The paper is organized as follows. In Section 2 we give a brief overview of VMWEs in Estonian. Section 3 describes the database of VMWEs and the corpus in which the VMWEs have been manually annotated.</Paragraph>
<Paragraph position="4"> Here we also present statistics on the VMWEs in the corpus. In Section 4 we discuss the variability of these expressions as registered in our corpus and the consequences of this variability for the automatic treatment of VMWEs.
Finally, we draw our conclusions in Section 5.</Paragraph>
</Section>
</Paper>