File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/06/w06-1205_evalu.xml
Size: 1,881 bytes
Last Modified: 2025-10-06 13:59:52
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1205"> <Title>Detecting Complex Predicates in Hindi using POS Projection across Parallel Corpora</Title> <Section position="7" start_page="33" end_page="33" type="evalu"> <SectionTitle> 5 Conclusion </SectionTitle> <Paragraph position="0"> In this work we have presented a preliminary approach to building a corpus-based lexicon of complex predicates (CPs) in Hindi by projecting POS tags across parallel English-Hindi corpora. Since the approach involves minimal linguistic analysis, it is easily extendable to other languages that exhibit similar CP constructs, provided a POS lexicon is available.</Paragraph> <Paragraph position="1"> Clearly, a number of problems will remain with any such approach. A limitation of the parallel POS tagging is that certain kinds of mappings may never be found (as with parallel CPs in the source and target languages). On the other hand, we feel that some of our accuracies would improve considerably given a larger parallel corpus and a more refined use of a Hindi lexicon.</Paragraph> <Paragraph position="2"> In addition to handling the discontinuous CPs hinted at above, another aspect that we would like to consider next is tuning some of the parameters of the parallel tagging algorithm, specifically the distortion and fertility probabilities in situations (e.g. English verbs) that are likely to manifest CPs in Hindi.</Paragraph> <Paragraph position="3"> We feel that beyond the usefulness of this initial approach, the database of CPs constructed in this work may in itself be an important linguistic resource for Hindi.</Paragraph> <Paragraph position="4"> Furthermore, the approach can possibly be used to detect multiword expressions (MWEs) that map to a single lexical structure in another language, e.g.</Paragraph> <Paragraph position="5"> phrasal verbs in English.</Paragraph> </Section> </Paper>