<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0605"> <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics Frontiers in Linguistic Annotation for Lower-Density Languages</Title> <Section position="4" start_page="29" end_page="29" type="intro"> <SectionTitle> 2 Lower-Density Languages </SectionTitle> <Paragraph position="0"> It should be noted from the outset that in this paper we interpret 'density' to refer to the amount of computational resources available for a language, rather than the number of speakers it might have.</Paragraph> <Paragraph position="1"> The fundamental problem for annotation of lower-density languages is that they are lower-density. While on the surface this is a tautology, it is in fact the problem. For a few languages of the world (such as English, Chinese, Modern Standard Arabic, and a few Western European languages), resources are abundant; these are the high-density languages. For a few more languages (other European languages, for the most part), resources are, if not exactly abundant, at least existent and growing; these may be considered medium-density languages. Together, high-density and medium-density languages account for perhaps 20 or 30 languages, although of course the boundaries are arbitrary. For all other languages, resources are scarce, and hence they fall into our specific area of interest.</Paragraph> </Section> </Paper>