<?xml version="1.0" standalone="yes"?> <Paper uid="I05-2022"> <Title>HMM Based Chunker for Hindi</Title> <Section position="10" start_page="130" end_page="130" type="concl"> <SectionTitle> 6 Conclusions </SectionTitle> <Paragraph position="0"> In this paper, we have studied HMM based chunking for Hindi. We tried out several schemes for chunk labels and input tokens. We found that for a certain type of word (function words), word information along with POS information gave better precision. A similar differentiation was made for punctuation marks. We tried several methods to classify the chunks and found that a simple rule-based approach gave the best results. The final precision we obtained was 92.63% for the chunk boundary identification task and 91.70% for the composite task of chunk labelling, with a recall of 100%.</Paragraph> <Paragraph position="1"> This paper raises the issue that if there are two tag sets, T1 and a more finely differentiated set T2, then T2 might give better accuracy than T1, provided the errors are measured using the same metric (say, using the T1 set). This, we believe, is likely to happen when T2 is more finely and appropriately differentiated. The most striking example was where T1 consisted of chunk boundaries and T2 consisted of boundaries and labels.</Paragraph> <Paragraph position="2"> Training with T2 outperformed T1 for the boundary task, even though it did not perform very well in the labelling task.</Paragraph> </Section> </Paper>