<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0301"> <Title>al: A word alignment system with limited language</Title> <Section position="4" start_page="1" end_page="3" type="metho"> <SectionTitle> 4 Participating Systems </SectionTitle> <Paragraph position="0"> Seven teams from around the world participated in the word alignment shared task. Table 1 lists the names of the participating systems, the corresponding institutions, and references to papers in this volume that provide detailed descriptions of the systems and additional analysis of their results.</Paragraph> <Paragraph position="1"> All seven teams participated in the Romanian-English subtask, and five teams participated in the English-French subtask.</Paragraph> <Paragraph position="2"> There were no restrictions placed on the number of submissions each team could make. This resulted in a total of 27 submissions from the seven teams, where 14 sets of results were submitted for the English-French subtask, and 13 for the Romanian-English subtask. Of the 27 total submissions, there were 17 in the Limited resources subtask, and 10 in the Unlimited resources subtask. Tables 2 and 3 show all of the submissions for each team in the two subtasks, and provide a brief description of their approaches.</Paragraph> <Paragraph position="3"> While each participating system was unique, there were a few unifying themes.</Paragraph> <Paragraph position="4"> Four teams had approaches that relied (to varying degrees) on an IBM model of statistical machine translation (Brown et al., 1993). UMD was a straightforward implementation of IBM Model 2, BiBr employed a boosting procedure in deriving an IBM Model 1 lexicon, Ralign used IBM Model 2 as a foundation for their recursive splitting procedure, and XRCE used IBM Model 4 as a base for alignment with lemmatized text and bilingual lexicons.</Paragraph> <Paragraph position="5"> Two teams made use of syntactic structure in the text to be aligned. 
ProAlign satisfies constraints derived from a dependency tree parse of the English sentence being aligned. BiBr also employs syntactic constraints that must be satisfied. However, these come from parallel text that has been shallowly parsed via a method known as bilingual bracketing.</Paragraph> <Paragraph position="6"> The two teams that did not participate in English-French were Fourday and RACAI.</Paragraph> <Paragraph position="7"> Three teams approached the shared task with baseline or prototype systems. Fourday combines several intuitive baselines via a nearest neighbor classifier, RACAI carries out a greedy alignment based on an automatically extracted dictionary of translations, and UMD's implementation of IBM Model 2 provides an experimental platform for their future work incorporating prior knowledge about cognates. All three of these systems were developed within a short period of time before and during the shared task.</Paragraph> </Section></Paper>