<?xml version="1.0" standalone="yes"?> <Paper uid="E95-1027"> <Title>Towards a Workbench for Acquisition of Domain Knowledge from Natural Language</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> One of the standard methods for extracting domain knowledge (or a domain schema, in another terminology) from texts is known as Distributional Analysis (Hirshman 1986). It is based on identifying the sublanguage-specific co-occurrence properties of words in the syntactic relations in which they occur in the texts. These co-occurrence properties indicate important semantic characteristics of the domain: classes of objects and their hierarchical inclusion, properties of these classes, relations among them, lexico-semantic patterns for referring to certain conceptual propositions, etc. In the form in which it is extracted, this knowledge about the domain is not quite suitable for inclusion in a knowledge base and requires post-processing by a linguistically trained knowledge engineer. This is known as a conceptual analysis of the acquired linguistic data. In general, all this is a time-consuming process and often requires the help of a domain expert. However, it seems possible to automate some tasks and to facilitate human intervention in many parts by combining NLP and statistical techniques for data extraction, type-oriented patterns for conceptual characterization of this data, and an intuitive user interface.</Paragraph> <Paragraph position="1"> All these resources are to be put together into a Knowledge Acquisition Workbench (KAWB), which is under development at LTG of the University of Edinburgh. 
The workbench supports an incremental process of corpus analysis, starting from a rough automatic extraction and organization of lexico-semantic regularities and ending with a computer-supported analysis of the extracted data and a refinement of the obtained hypotheses.</Paragraph> </Section> </Paper>