File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/95/w95-0111_abstr.xml

Size: 1,462 bytes

Last Modified: 2025-10-06 13:48:29

<?xml version="1.0" standalone="yes"?>
<Paper uid="W95-0111">
  <Title>Automatic Suggestion of Significant Terms for a Predefined Topic</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> This paper presents a preliminary experiment in automatically suggesting significant terms for a predefined topic. The general method is to compare a topically focused sample created around the predefined topic with a larger and more general base sample. A set of statistical measures is used to identify significant word units in both samples. Identification of single-word terms is based on the notion of word intervals. Two-word terms are identified through the computation of mutual information, and an extension of mutual information assists in capturing multi-word terms. Once significant terms of all three types are identified, a comparison algorithm is applied to differentiate terms across the two data samples. If significant changes in the values of certain statistical variables are detected, the associated terms are selected as being topic-oriented and included in a suggested list. To check the quality of the suggested terms, we compare them against terms manually determined by the domain expert. Though overlaps vary, we find that the automatic suggestion provides more terms that are useful for describing the predefined topic.</Paragraph>
  </Section>
</Paper>