File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/96/c96-2098_concl.xml
Size: 2,138 bytes
Last Modified: 2025-10-06 13:57:33
<?xml version="1.0" standalone="yes"?> <Paper uid="C96-2098"> <Title>Extraction of Lexical Translations from Non-Aligned Corpora</Title> <Section position="8" start_page="584" end_page="584" type="concl"> <SectionTitle> 6 Conclusions </SectionTitle> <Paragraph position="0"> Lexical translations were extracted from non-aligned corpora. The assumption that &quot;translations of two co-occurring words in a source language also co-occur in the target language&quot; was introduced and represented in a stochastic matrix formulation. The translation matrix maps co-occurrence information from the source language into the target language. Once the ambiguity of the translational relation is resolved, this translated co-occurrence information should resemble the co-occurrence information observed in the target language. This condition was used to obtain the best translation matrix.</Paragraph> <Paragraph position="1"> The proposed framework, aimed at ambiguity resolution, serves both to obtain lexical translations globally from non-aligned corpora and to choose a translation according to the local context. The algorithms for obtaining the best translation matrix are based on the Steepest Descent Method, an algorithm well known in the field of non-linear programming.</Paragraph> <Paragraph position="2"> Two experiments were performed to examine the power of local ambiguity resolution and dictionary refinement. The former showed a precision of 82.1% with an applicability of 75.5%. In the latter, irrelevant translations were intentionally added to the dictionary to examine whether the relevant ones would be chosen. It was found that 84.7% of the dropped words were indeed irrelevant ones.</Paragraph> <Paragraph position="3"> An important future task is to decrease the computational complexity. In principle, the method is applicable to a matrix calculation the size of an entire dictionary, but this is unrealistic at this stage. We must also increase the rate of ambiguity resolution. 
The corpus is regarded as non-structured data in this paper; the ambiguity might be resolved more effectively by introducing a phrasal structure.</Paragraph> </Section> </Paper>