<?xml version="1.0" standalone="yes"?>
<Paper uid="A94-1027">
  <Title>A Probabilistic Model for Text Categorization: Based on a Single Random Variable with Multiple Values</Title>
  <Section position="4" start_page="0" end_page="0" type="intro">
    <SectionTitle>
TIONS (TNM)&quot; and &quot;COMPUTERS AND INFORMATION TECHNOLOGY (CPK).&quot;
</SectionTitle>
    <Paragraph position="0"> While there may be certain rules or standards for categorization, it is very difficult for human experts to assign categories consistently and efficiently to large numbers of daily incoming documents. The purpose of this paper is to propose a new probabilistic model for automatic text categorization.</Paragraph>
    <Paragraph position="1"> While many text categorization models have been proposed, this paper concentrates on probabilistic models (Robertson and Sparck Jones, 1976; Kwok, 1990; Fuhr, 1989; Lewis, 1992; Croft, 1981; Wong and Yao, 1989; Yu et al., 1989) because they have a solid formal grounding in probability theory. Section 2 briefly reviews these models and lists their individual problems. In Section 3, we propose a new probabilistic model based on a Single random Variable with Multiple Values (SVMV).</Paragraph>
    <Paragraph position="2"> Our model is very simple, yet it solves some of the problems of the previous models. In Section 4, we verify our model's superiority over the others through experiments in which we categorize &quot;Wall Street Journal&quot; articles.</Paragraph>
  </Section>
</Paper>