<?xml version="1.0" standalone="yes"?> <Paper uid="C02-2014"> <Title>Recognition Assistance Treating Errors in Texts Acquired from Various Recognition Processes Gabor PROSZEKY MorphoLogic</Title> <Section position="6" start_page="2" end_page="2" type="concl"> <SectionTitle> 7 Implementation </SectionTitle> <Paragraph position="0"> The first version of the MorphoLogic Recognition Assistant framework has been implemented along with a demonstration interface. This application takes symbolic codes of different recognized symbols (phonemes, OCR-read characters, etc.) and provides orthographical output. It has been programmed in C++ using MS Visual Studio 6.0 and runs on 32-bit Windows systems. As service modules, the framework incorporates the Humor (morphological analyser), Helyesebb (grammatical validator), and HumorESK (full parser) technologies. With a standard programming interface, it is ready to be integrated with existing recognition systems.</Paragraph> <Paragraph position="1"> Conclusion: This paper has introduced a framework for treating common error classes occurring in the output of various recognition sources. We have shown that different types of recognition sources share the same error types, namely (1) poor or nonexistent segmentation, (2) underspecified symbols, and (3) incorrectly recognized symbols.</Paragraph> <Paragraph position="2"> Our proposed solution is a post-processing phase performed on the output of the recognition source, in which morpho-lexical and syntactic models validate (either accept or reject) different orthographical candidates derived from a single recognized symbol sequence.</Paragraph> <Paragraph position="3"> The system is language-independent and completely data-driven: by replacing the databases, the MorphoLogic Recognition Assistant is immediately ready to work with a different language. For the Humor system, descriptions exist for several languages (Hungarian, English, German, Spanish, Czech, Polish and Romanian). 
Syntax descriptions are under development for Hungarian and English (prototypes exist).</Paragraph> <Paragraph position="4"> The proposed framework seems promising for continuous recognition systems. Its main advantage is the ease of application of any linguistic module, thanks to the separate symbol mapping process and the open architecture. However, we must emphasize again that the MorphoLogic Recognition Assistant supports existing recognition systems rather than replacing them.</Paragraph> </Section></Paper>