<?xml version="1.0" standalone="yes"?> <Paper uid="P04-2011"> <Title>Beyond N in N-gram Tagging</Title> <Section position="7" start_page="0" end_page="0" type="concl"> <SectionTitle> 6 Conclusion and future work </SectionTitle> <Paragraph position="0"> This work has presented how the HMM for POS tagging was extended with global contextual information without increasing the number of parameters beyond practical limits. Two tagging experiments, using a model extended with a binary feature concerning the occurrence of finite verb forms, resulted in improved accuracies compared to using the standard model. The annotation of the training data with context labels was acquired automatically through the use of a wide-coverage parser.</Paragraph> <Paragraph position="1"> The tagger described here is used as a POS tag filter in wide-coverage parsing of Dutch (Prins and van Noord, 2004), increasing parsing efficiency as fewer POS tags have to be considered. In addition to reducing lexical ambiguity, it would be interesting to see if structural ambiguity can be reduced. In the approach under consideration, the tagger supplies the parser with an initial syntactic structure in the form of a partial bracketing of the input, based on the recognition of larger syntactic units or 'chunks'. Typically chunk tags will be assigned on the basis of words and their POS tags. An alternative approach is to use an extended model that assigns chunk tags and POS tags simultaneously, as was done for finite verb occurrence and POS tags in the current work. In this way, relations between POS tags and chunk tags can be modeled in both directions.</Paragraph> <Paragraph position="2"> Another possible application is tagging of German. German features different cases, which can lead to problems for statistical taggers. 
This is illustrated by Hinrichs and Trushkina (2003), who point out that the TnT tagger wrongly assigns nominative case instead of accusative in a given sentence, resulting in the unlikely combination of two nominatives. The preference for just one assignment of the nominative case might be learned by including case information in the model.</Paragraph> <Paragraph position="3"> Acknowledgements. This research was carried out as part of the PIONIER Project Algorithms for Linguistic Processing, funded by NWO (Dutch Organization for Scientific Research) and the University of Groningen. I would like to thank Hans van Halteren for supplying the Eindhoven corpus data set as used in (van Halteren et al., 2001).</Paragraph> </Section> </Paper>