File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/94/c94-1026_intro.xml
Size: 2,145 bytes
Last Modified: 2025-10-06 14:05:37
<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1026"> <Title>A Part-of-Speech-Based Alignment Algorithm</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. Introduction </SectionTitle> <Paragraph position="0"> Real texts capture the living phenomena, usages, and tendencies of language in a particular space and time.</Paragraph> <Paragraph position="1"> This motivates research on corpora. Recently, many researchers have further claimed that &quot;two languages are more informative than one&quot; (Dagan, 1991). They show that two languages can disambiguate each other (Gale et al., 1992); that a bilingual corpus can yield a bilingual dictionary (Brown et al., 1988) and a terminology correspondence bank (Eijk, 1993); and that a refined bilingual corpus can supply the examples for machine translation systems (Sumita et al., 1990). For all such research, the most important task is to align the bilingual texts.</Paragraph> <Paragraph position="2"> Many length-based alignment algorithms have been proposed (Brown et al., 1991; Gale and Church, 1991a). Their accuracy is good. However, the languages they process belong to the Occidental family.</Paragraph> <Paragraph position="3"> When these algorithms are applied to running texts from different language families, will the performance stay at the same level? Other, translation-based alignment methods (Kay, 1991; Chen, 1993) reveal the difficulty of determining word correspondences and are very complex.</Paragraph> <Paragraph position="4"> In this paper, we will introduce a part-of-speech (POS)-based alignment algorithm. Section 2 will touch on the level of alignment and define the sentence terminators. In Section 3, we will propose the criterion of critical POSes and investigate the distribution of these POSes in Chinese-English texts. Section 4 will describe a fair and rigorous method for evaluating performance. 
Then, we will apply the simulated annealing technique to conduct experiments and show the experimental results in Section 5. Section 6 will give a brief conclusion.</Paragraph> </Section> </Paper>