<?xml version="1.0" standalone="yes"?> <Paper uid="J95-2004"> <Title>Deterministic Part-of-Speech Tagging with Finite-State Transducers</Title> <!-- Computational Linguistics Volume 21, Number 2 --> <Section position="11" start_page="249" end_page="251" type="concl"> <SectionTitle> 12. Conclusion </SectionTitle> <Paragraph position="0"> The techniques described in this paper are more general than the problem of part-of-speech tagging and are applicable to the class of problems dealing with local transformation rules.</Paragraph> <Paragraph position="1"> We showed that any transformation-based program can be converted into a deterministic finite-state transducer. This yields optimal-time implementations of transformation-based programs.</Paragraph> <Paragraph position="2"> As a case study, we applied these techniques to the problem of part-of-speech tagging and presented a finite-state tagger that requires n steps to tag a sentence of length n, independently of the number of rules and the length of the context they require. We achieved this result by representing the rules acquired for Brill's tagger as nondeterministic finite-state transducers. We composed these nondeterministic transducers into a single transducer and turned the resulting transducer into a deterministic transducer. The resulting deterministic transducer yields a part-of-speech tagger that operates in optimal time in the sense that the time to assign tags to a sentence corresponds to the time required to follow a single path in this deterministic finite-state machine. The finite-state tagger is faster than both Brill's tagger and stochastic taggers. Moreover, it inherits the compactness of the rule-based system relative to stochastic taggers. We also proved the correctness and the generality of the methods. 
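As a minimal sketch of what following a single path means in practice, the Python fragment below hand-codes a small deterministic transducer for one contextual rule. The rule (retag "vbd" as "vbn" when the next tag is "by"), the state numbering, and the names OTHER, TRANSITIONS, FINAL_OUTPUT, and retag are assumptions chosen for illustration; the table was written by hand and is not the output of the composition and determinization procedures described in this paper.

```python
from typing import Dict, List, Tuple

# Hypothetical rule, for illustration only: retag "vbd" as "vbn"
# when the next tag is "by".
OTHER = "?"  # placeholder for any tag without an explicit transition

# (state, input tag) -> (next state, tags emitted on this step)
# State 0: nothing pending. State 1: a "vbd" was read but not yet emitted.
TRANSITIONS: Dict[Tuple[int, str], Tuple[int, List[str]]] = {
    (0, "vbd"): (1, []),               # delay output until the next tag is seen
    (0, OTHER): (0, [OTHER]),          # copy the tag unchanged
    (1, "by"): (0, ["vbn", "by"]),     # context matched: apply the rule
    (1, "vbd"): (1, ["vbd"]),          # flush the old vbd, keep the new one pending
    (1, OTHER): (0, ["vbd", OTHER]),   # context failed: flush vbd unchanged
}

# Tags emitted when the input ends in a given state.
FINAL_OUTPUT: Dict[int, List[str]] = {0: [], 1: ["vbd"]}


def retag(tags: List[str]) -> List[str]:
    """Apply the rule in one left-to-right pass, one transition per input tag."""
    state = 0
    output: List[str] = []
    for tag in tags:
        # Fall back to the wildcard transition when no explicit one exists.
        key = (state, tag) if (state, tag) in TRANSITIONS else (state, OTHER)
        state, emitted = TRANSITIONS[key]
        # The placeholder in an output stands for the tag just read.
        output.extend(tag if sym == OTHER else sym for sym in emitted)
    output.extend(FINAL_OUTPUT[state])
    return output
```

For example, retag(["np", "vbd", "by"]) returns ["np", "vbn", "by"]. The lookahead that the rule needs is absorbed into the state and the delayed output, so the loop still takes exactly one transition per input tag; this is the property that keeps tagging time independent of the length of the context.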
We believe that this finite-state tagger will also prove useful when combined with other language components, since it can be naturally extended by composing it with finite-state transducers that could encode other aspects of natural language syntax.</Paragraph> </Section> </Paper>