<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0708">
  <Title>POS Tagging of Dialectal Arabic: A Minimally Supervised Approach</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Natural language processing technology for the dialects of Arabic is still in its infancy, due to the difficulty of obtaining large amounts of text data for spoken Arabic. In this paper we describe the development of a part-of-speech (POS) tagger for Egyptian Colloquial Arabic. We adopt a minimally supervised approach that requires only raw text data from several varieties of Arabic and a morphological analyzer for Modern Standard Arabic. No dialect-specific tools are used. We present several statistical modeling and cross-dialectal data sharing techniques to enhance the performance of the baseline tagger. We compare the results to those obtained by a supervised tagger trained on hand-annotated data and by a state-of-the-art Modern Standard Arabic tagger applied to Egyptian Arabic.</Paragraph>
  </Section>
</Paper>