<?xml version="1.0" standalone="yes"?> <Paper uid="C00-2165"> <Title>KCAT : A Korean Corpus Annotating Tool Minimizing Human Intervention</Title> <Section position="6" start_page="1099" end_page="1099" type="concl"> <SectionTitle> 3 Our test corpus includes 10,015 words </SectionTitle> <Paragraph position="0"> accurately and consistently POS annotated corpus with minimal human labor. To achieve the goal, we have proposed a POS tagging tool named KCAT which can use human linguistic knowledge in the form of lexical rules. Once a lexical rule is acquired, the human expert doesn't need to spend time tagging the same word in the same context. By using the lexical rules, we could obtain very accurate and consistent results as well as reduce the amount of human labor.</Paragraph> <Paragraph position="1"> It is obvious that the more lexical rules the tool acquires, the higher the accuracy and consistency it achieves. However, acquiring many lexical rules still requires a lot of human labor and cost. Moreover, as the number of lexical rules increases, the speed of rule application decreases. To overcome these barriers, we are trying to find a way to generalize rules and a more efficient rule encoding scheme, such as finite-state automata (Roche, 1995).</Paragraph> <Paragraph position="2"> Furthermore, we will use the difference between the probabilities of the best and second-best tags to separate reliable automatic tagging results from unreliable ones (Brants, 1999).</Paragraph> </Section> </Paper>