<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-1004">
<Title>Finite State Automata and Arabic Writing</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> INTRODUCTION </SectionTitle>
<Paragraph position="0"> Arabic writing has specific features that impose computational overhead on any Arabized software. The first one, well known for many years now, is that Arabic printing tries to imitate handwriting. Because of this, consonants and long vowels take either four shapes or only two, depending on whether they can be joined to the following letter and on where they appear in the word.</Paragraph>
<Paragraph position="1"> These shapes can be very different, for example the letter ه (h): isolated ه, final ـه, medial ـهـ, initial هـ; or they can show only small variations, for example the letter س (s): isolated س, final ـس, medial ـسـ, initial سـ. Letters which cannot be joined to the next one have only two shapes, for example the letters د (d): isolated د, final ـد; and و (w and û): isolated و, final ـو. During the seventies and the beginning of the eighties, heated controversies arose among the Arab linguists and computer scientists concerned with these questions. Finally, in 1983, ASMO (the Arab Standardization and Metrology Organization, which unfortunately no longer exists), influenced by Prof. Lakhdar-Ghazal of IERA (Rabat, Morocco), chose to give a unique code to all shapes of one particular letter. This is certainly a good choice from a linguistic point of view, but even so, compromises had to be made to take into account writing habits that conflicted with it. The letter hamza is the most noticeable example of such a compromise, for reasons we shall explain later. [Footnotes: 1. CERTAL: Centre d'Études et de Recherche en Traitement Automatique des Langues; INALCO: Institut National des Langues et Civilisations Orientales. 2. The Arabic parts of this paper have been typeset]</Paragraph>
</Section>
</Paper>