<?xml version="1.0" standalone="yes"?> <Paper uid="W00-1312"> <Title>Cross-lingual Information Retrieval using Hidden Markov Models</Title> <Section position="11" start_page="100" end_page="101" type="relat"> <SectionTitle> 11 Related Work </SectionTitle> <Paragraph position="0"> Other studies that view IR as a query generation process include Maron and Kuhns, 1960; Hiemstra and Kraaij, 1999; Ponte and Croft, 1998; Miller et al., 1999. Our work has focused on cross-lingual retrieval.</Paragraph> <Paragraph position="1"> Many approaches to cross-lingual IR have been published. One common approach is to use machine translation (MT) to translate the queries into the language of the documents, or the documents into the language of the queries (Gey et al., 1999; Oard, 1998). For most languages, there are no MT systems at all. Our focus is on languages where no MT exists, but a bilingual dictionary may exist or may be derived.</Paragraph> <Paragraph position="2"> Another common approach is term translation, e.g., via a bilingual lexicon (Davis and Ogden, 1997; Ballesteros and Croft, 1997; Hull and Grefenstette, 1996). While word sense disambiguation has been a central topic in previous studies of cross-lingual IR, our study suggests that using multiple weighted translations and compensating for the incompleteness of the lexicon may be more valuable. Other studies on the value of disambiguation for cross-lingual IR include Hiemstra and de Jong, 1999; Hull, 1997.</Paragraph> <Paragraph position="3"> Sanderson (1994) studied the issue of disambiguation for mono-lingual IR.</Paragraph> <Paragraph position="4"> The third approach to cross-lingual retrieval is to map queries and documents to some intermediate representation, e.g., latent semantic indexing (LSI) (Littman et al., 1998) or the generalized vector space model (GVSM) (Carbonell et al., 1997). 
We believe our approach is computationally less costly than LSI and GVSM and assumes fewer resources (e.g., WordNet in Diekema et al., 1999).</Paragraph> </Section> </Paper>