<?xml version="1.0" standalone="yes"?>
<Paper uid="A92-1043">
  <Title>Learning a Scanning Understanding for &quot;Real-world&quot; Library Categorization</Title>
  <Section position="4" start_page="251" end_page="251" type="evalu">
    <SectionTitle>
4 Results and Conclusions
</SectionTitle>
    <Paragraph position="0"> For the first approach the 76 titles were divided into 61 titles for training and 15 titles for testing. For the 2493 titles in the second and third approach we used 1249 titles for training and 1244 for testing. Using these training and test sets we examined different network architectures and training parameters. For the first approach a configuration with 6 hidden units and a learning rate of 0.001 showed the smallest number of errors on the training and test set. For the second and third approach 3 hidden units and the learning rate 0.0001 performed best.</Paragraph>
    <Paragraph position="1"> Below we show example titles, the titles after preprocessing, and their sequential class assignment. The first two titles illustrate that two titles with the same final headnoun (&amp;quot;design&amp;quot;) are assigned to different classes due to their different learned preceding context. The third title illustrates the second approach of classifying an unrestricted complete phrase. The network first assigns the CS class for the initial phrase &amp;quot;On the operating experience of the...&amp;quot; since such initial representations have occurred in the CS class. However, when more specific knowledge is available (&amp;quot;doppler sodar system...&amp;quot;) the assigned class is changed to the MG class. In the fourth example the same title is shown for the third approach which eliminates insignificant domain-independent words. In general, the second and third approach have the potential to deal with unanticipated grammatical and even ungrammatical titles since they do not rely on a predefined grammar.</Paragraph>
    <Paragraph position="2">  1. Title: Design of relational database schemes by deleting attributes in the canonical decomposition; Approachl: Compound noun: Decomposition (CS) attribute (CS) scheme (CS) design (CS) 2. Title: Design of bulkheads for controlling water in underground mines; Approach1: Compound noun: Mine (MG) water (MG) bulkhead (MG) design (MG) 3. Title: On the operating experience of the doppler sodar system at the Forschungszentrum Juelich; Approach2: Unrestricted complete title: On (CS) the (CS) operating (CS) experience (CS) of (CS) the (CS) doppler (MG) sodar (MG) system (MG) at (MG) the (MG) Forschungszentrum (MG) Juelich (MG) 4. Title: On the operating experience Of the doppler  sodar system at the Forschungszentrum Juelich; Approach3: Unrestricted reduced title: operating (CS) experience (CS) doppler (MG) sodar (MG) system (MG) Forschungszentrum (MG) Juelich (MG) The overall performance of the three approaches as recorded in the best found configuration is summarized in table 1. The first approach performed worst for classifying new titles from the test set although the titles in the training set were learned completely. The second approach performed better on the test set for a much bigger training and test set of unrestricted phrases. The third approach demonstrated that the elimination of insignificant words from unrestricted phrases can improve performance for the big set of titles.</Paragraph>
    <Paragraph position="3">  In conclusion, we described and evaluated three different approaches for semantic classification which use hybrid symbolic/connectionist and connectionist representations. Our results show that recurrent plausibility networks and automatically acquired feature representations can provide an efficient basis for learning and generalizing a scanning understanding of real-world library classifications.</Paragraph>
  </Section>
class="xml-element"></Paper>