<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-0741">
  <Title>Learning from Parsed Sentences with INTHELEX</Title>
  <Section position="3" start_page="0" end_page="194" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Language learning has attracted growing attention in recent years. Statistical approaches, extensively used so far (see Saitta and Neri, 1997, for an overview of the research in this area), have severe limitations, whereas the flexibility and expressivity of logical representations make them highly suitable for natural language analysis (Cussens, 1999). Indeed, logical approaches may have a significant impact at the level of semantic interpretation, where a logical representation of the meaning of a sentence is important and useful (Mooney, 1999).</Paragraph>
    <Paragraph position="1"> Logical approaches have already been employed in Text Categorization and/or Information Extraction. Yet they use an expressive representation language such as first-order logic only to define simple properties of textual sources, regarded, for instance, as bags of words (Junker et al., 1999) or as semi-structured texts (Freitag, 2000). These properties are often only loosely related to the grammar of the underlying language, and often rely on extra-grammatical features (Cohen, 1996). We intend to use a logical representation to exploit the grammatical structure of texts, as it could be detected by a suitable parser. Indeed, a more knowledge-intensive technique is likely to perform better when applied to the tasks mentioned above.</Paragraph>
    <Paragraph position="2"> When no background knowledge about the language structure is assumed to be available, one of the fundamental problems with the adoption of logic learning techniques is that a structured representation of sentences is required on which the learning algorithm can be run.</Paragraph>
    <Paragraph position="3"> Thus, the need arises for parsers that are able to discover such a structure starting from raw, unstructured text. Research in this field has produced a variety of tools and techniques for English, which cannot be applied to other languages, such as Italian, because of their different, and sometimes much more complex, grammatical structure. Such considerations led us to develop a prototype Italian language parser that could serve as a pre-processor of texts, in order to obtain the structured representation of sentences that the symbolic learner needs in order to work. It is important to note that the focus of this paper is not the parser, which does not adopt sophisticated NLP techniques. The aim here is to investigate the feasibility of learning semantic definitions for some kinds of sentences.</Paragraph>
    <Paragraph position="4"> If anything, the availability of a professional parser would further enhance the performance of the whole process.</Paragraph>
    <Paragraph position="5"> Further problems in applying relational learning to language are due to the intrinsic computational complexity of these methods, a drawback of the expressive power gained through relations. Moreover, another weakness of our approach could be its dependence on the quality of the data coming from the preprocessing step: noise from wrongly parsed sentences may be present and may negatively influence the model to be induced. Section 2 briefly presents the parser's performance, in order to establish the degree of reliability of the data on which the learning step is performed. Section 3 shows the results of applying the first-order learning system INTHELEX (Esposito et al., 2000) to the inference of some simple events related to foreign commerce. Lastly, Section 4 draws some preliminary conclusions on this research and outlines issues to be addressed in future work.</Paragraph>
  </Section>
</Paper>