File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/94/a94-1027_concl.xml

Size: 1,666 bytes

Last Modified: 2025-10-06 13:57:09

<?xml version="1.0" standalone="yes"?>
<Paper uid="A94-1027">
  <Title>A Probabilistic Model for Text Categorization: Based on a Single Random Variable with Multiple Values</Title>
  <Section position="8" start_page="166" end_page="166" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> We have proposed a new probabilistic model for text categorization. Compared to previous models, our model has the following advantages: 1) it considers within-document term frequencies, 2) it considers term weighting for target documents, and 3) it is less affected by insufficient training cases. We have also provided empirical results verifying our model's superiority over the others in the task of categorizing news articles from the &quot;Wall Street Journal.&quot; There are several directions along which this work could be extended.</Paragraph>
    <Paragraph position="1"> * We need to compare our probabilistic model with non-probabilistic models such as decision-tree and rule-based models, one of which has recently been reported to be promising (Apté et al., 1994).</Paragraph>
    <Paragraph position="2"> * While we used a simple document representation in which a document is defined as a set of nouns, several improvements could be considered, such as using phrasal information (Lewis, 1992), clustering terms (Sparck Jones, 1973), and reducing the number of features by using a local dictionary (Apté et al., 1994).</Paragraph>
    <Paragraph position="3"> * We are incorporating our probabilistic model into cluster-based text categorization that offers an efficient and effective search strategy.</Paragraph>
  </Section>
</Paper>