<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-2016">
  <Title>The GOD model</Title>
  <Section position="4" start_page="148" end_page="149" type="evalu">
    <SectionTitle>
3 Evaluation
</SectionTitle>
    <Paragraph position="0"> Performing a rigorous evaluation of an ontology learning process is not an easy task (Buitelaar et al., 2005) and it is outside the goals of this paper.</Paragraph>
    <Paragraph position="1"> Due to time constraints, we did not performed a quantitative and objective evaluation of our system. In Subsection 3.1 we describe the data and the NLP tools adopted by the system. In Subsection 3.2 we comment some example of the system output, providing a qualitative analysis of the resultsafterhavingproposedsomeevaluationguide- null lines. Finally, in Subsection 3.3 we discuss issues related to the recall of the system.</Paragraph>
    <Section position="1" start_page="148" end_page="148" type="sub_section">
      <SectionTitle>
3.1 Experimental Settings
</SectionTitle>
      <Paragraph position="0"> To expect high coverage, the system would be trained on WEB scale corpora. On the other hand, the analysis of very large corpora needs efficient preprocessing tools and optimized memory allocation strategies. For the experiments reported in this paper we adopted the British National Corpus (BNC-Consortium, 2000), and we parsed each sentence by exploiting a shallow parser on the output of which we detected SVO patterns by means of regular expressions1.</Paragraph>
    </Section>
    <Section position="2" start_page="148" end_page="149" type="sub_section">
      <SectionTitle>
3.2 Accuracy
</SectionTitle>
      <Paragraph position="0"> Once a query has been formulated, and a set of relations has been extracted, it is not clear how to evaluate the quality of the results. The first four columnsoftheexamplebelowshowtheevaluation we did for the query Karl Marx.</Paragraph>
      <Paragraph position="1">  memory-based shallow parser developed at CNTS Antwerp and ILK Tilburg (Daelemans et al., 1999) together with a set of scripts to extract SVO patterns (Reinberger et al., 2004) kindly put at our disposal by the authors.</Paragraph>
      <Paragraph position="2">  Several aspects are addressed: truthfulness (i.e. True vs. False in the first column), relevance for the query (i.e. Relevant vs. Not-relevant in the second column), information content (i.e. Informative vs. Uninformative, third column) and meaningfulness (i.e. Meaningful vs. Error, fourth column). For most of the test queries, the majority of the retrieved predicates were true, relevant, informative and meaningful, confirming the quality of the acquired DM and the validity of the relation extraction technique2.</Paragraph>
      <Paragraph position="3"> From the BNC, GOD was able to extract good quality information for many different queries in very different domains, as for example music, unix, painting and many others.</Paragraph>
    </Section>
    <Section position="3" start_page="149" end_page="149" type="sub_section">
      <SectionTitle>
3.3 Recall
</SectionTitle>
      <Paragraph position="0"> An interesting aspect of the behavior of the system is that if the domain of the query is not well represented in the corpus, the domain discovery step retrieves few domain specific terms. As a consequece, just few relations (and sometimes no relations) have been retrieved for most of our test queries. An analysis of such cases showed that the low recall was mainly due to the low coverage of the BNC corpus. We believe that this problem can be avoided by training the system on larger scale corpora (e.g. from the Web).</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>