<?xml version="1.0" standalone="yes"?>
<Paper uid="P02-1059">
<Title>Supervised Ranking in Open-Domain Text Summarization</Title>
<Section position="3" start_page="0" end_page="0" type="intro">
<SectionTitle> 2 Supervised Ranking with Probabilistic Decision Tree </SectionTitle>
<Paragraph position="0"> One technical problem associated with the use of a decision tree as a summarizer is that it is not able to rank sentences, which it must be able to do to allow for the generation of a variable-length summary. In response to this problem, we explore the use of a probabilistic decision tree as a ranking model. First, let us review some general features of the probabilistic decision tree (ProbDT, henceforth) (Yamanishi, 1997; Rissanen, 1997).</Paragraph>
<Paragraph position="1"> ProbDT works like a usual decision tree except that rather than assigning each instance to a single class, it distributes each instance among classes. For each instance xi, the strength of its membership in each class ck is given by P(ck | xi).</Paragraph>
<Paragraph position="2"> Consider the binary decision tree in Fig. 1. Let X1 and X2 represent non-terminal nodes, and Y1 and Y2 leaf nodes. '1' and '0' on the arcs denote values of some attribute at X1 and X2. θ_iy and θ_in represent the probability that a given instance assigned to node i is labeled as yes and no, respectively.</Paragraph>
<Paragraph position="3"> Abusing the terms slightly, let us assume that X1 and X2 also represent the splitting attributes at the respective nodes. Then the probability that a given instance with X1 = 1 and X2 = 0 is labeled as yes (no) is θ_2y (θ_2n). Note that Σ_c θ_jc = 1 for a given node j.</Paragraph>
<Paragraph position="4"> Ranking sentences with ProbDT then simply involves finding the probability that each sentence is assigned to a particular class designating sentences worthy of inclusion in a summary (call it the 'Select' class) and ranking them accordingly. (Hereafter and throughout the rest of the paper, we say that a sentence is wis if it is worthy of inclusion in a summary: thus a wis sentence is a sentence worthy of inclusion in a summary.) The probability that a sentence u is labeled as wis is expressed as in Table 1, where ~u is a vector representation of u, consisting of a set of values for the features of u; φ is a smoothing function, e.g., Laplace's law; t(~u) is the leaf node assigned to ~u; and DT represents the decision tree used to classify ~u.</Paragraph>
</Section>
</Paper>
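
A minimal sketch of the idea described above, not taken from the paper: a toy probabilistic decision tree whose leaves hold class counts, with P('Select' | leaf) estimated via Laplace smoothing and used to rank sentences. The tree shape, feature names (X1, X2), counts, and sentences are all hypothetical placeholders.

    # Illustrative sketch (assumed details, not the authors' implementation):
    # ranking sentences with a tiny probabilistic decision tree. Leaf nodes
    # store class counts; P('Select' | leaf) is Laplace-smoothed and used to
    # rank sentences for a variable-length summary.

    from dataclasses import dataclass, field
    from typing import Dict, Optional


    @dataclass
    class Node:
        # Internal node: tests a binary attribute; leaf node: attr is None.
        attr: Optional[str] = None
        children: Dict[int, "Node"] = field(default_factory=dict)
        # Class counts observed at a leaf during training (hypothetical numbers).
        counts: Dict[str, int] = field(default_factory=dict)


    def leaf_for(node: Node, features: Dict[str, int]) -> Node:
        """Route a feature vector ~u down the tree to its leaf t(~u)."""
        while node.attr is not None:
            node = node.children[features[node.attr]]
        return node


    def p_class(leaf: Node, cls: str, classes=("Select", "Other")) -> float:
        """Laplace-smoothed estimate of P(cls | leaf): (n_cls + 1) / (n + |C|)."""
        total = sum(leaf.counts.get(c, 0) for c in classes)
        return (leaf.counts.get(cls, 0) + 1) / (total + len(classes))


    # A toy two-level tree mirroring the X1/X2 example; leaves carry
    # hypothetical training counts (playing the role of the theta parameters).
    tree = Node(
        attr="X1",
        children={
            1: Node(attr="X2", children={
                1: Node(counts={"Select": 8, "Other": 2}),
                0: Node(counts={"Select": 3, "Other": 7}),
            }),
            0: Node(counts={"Select": 1, "Other": 9}),
        },
    )

    # Hypothetical sentences, each already reduced to a binary feature vector.
    sentences = {
        "Sentence A": {"X1": 1, "X2": 1},
        "Sentence B": {"X1": 1, "X2": 0},
        "Sentence C": {"X1": 0, "X2": 0},
    }

    # Rank by P('Select' | t(~u)); the top k sentences form a k-sentence summary.
    ranked = sorted(sentences,
                    key=lambda s: p_class(leaf_for(tree, sentences[s]), "Select"),
                    reverse=True)
    for s in ranked:
        print(s, round(p_class(leaf_for(tree, sentences[s]), "Select"), 3))

The smoothing step stands in for the function φ mentioned in the text; any other smoothing scheme could be substituted without changing the ranking procedure itself.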