<?xml version="1.0" standalone="yes"?> <Paper uid="E85-1036"> <Title>A RULE-BASED APPROACH TO EVALUATING IMPORTANCE IN DESCRIPTIVE TEXTS</Title> <Section position="4" start_page="244" end_page="245" type="metho"> <SectionTitle> 3. A COMPUTATIONAL APPROACH </SectionTitle> <Paragraph position="0"> Most of the ideas outlined in the previous section have been implemented in the design of a subsystem of SUSY, called the importance evaluator, which takes as input the ELR representation of a natural language text and the representation of a reader's goal, and produces as output the corresponding HPN. The evaluator is implemented as a rule-based system (Davis and King, 1976) with a forward chaining control regime. The knowledge available to the evaluator comprises two parts: a rule base and an encyclopedia.</Paragraph> <Paragraph position="1"> The rule base embodies the expert knowledge necessary for importance evaluation. It consists of production rules, called importance rules, which have the usual IF-THEN form. Rules can be classified according to their competence, i.e. according to the different types of knowledge they use for evaluating importance. 
From this point of view, three classes of rules are considered: structural rules, which express the fact that some parts of the text can be judged important just by looking at their structure and organization, disregarding their meaning; semantic rules, which evaluate importance by taking into account specific structural features of the text that convey a definite meaning; and encyclopedic rules, which evaluate importance by comparing the meaning of the text with the domain-specific knowledge contained in the encyclopedia.</Paragraph> <Paragraph position="2"> The IF-part of a rule contains conditions that are evaluated with respect to the current HPN (initially the ELR), and the THEN-part specifies either an importance evaluation or an action to be performed to further the analysis (e.g., a strategic choice, a criterion for resolving conflicting evaluations, etc.).</Paragraph> <Paragraph position="3"> The evaluation of importance contained in the THEN-part of a rule usually takes the form of an ordering relation (e.g., less, equal, etc.) among the importance values of concepts or propositions of the ELR, or it specifies ranges of importance values (e.g., high, low, etc.). Thus, rules only assert the relative importance of different parts of the text: a constraint propagation algorithm eventually transforms these relative evaluations into absolute importance values according to a given scale.</Paragraph> <Paragraph position="4"> The encyclopedia is the second knowledge source employed by the evaluator, and it contains domain-specific knowledge. Encyclopedic knowledge is represented through a net of frames. 
Frames contain, in addition to a header, two kinds of slots: knowledge slots, which contain domain-specific knowledge represented in a form homogeneous with the propositional language of the ELR; and reference slots, which contain pointers to other frames that deal with related topics in the subject domain.</Paragraph> <Paragraph position="5"> This organization allows easy implementation of a property inheritance mechanism.</Paragraph> <Paragraph position="6"> We now illustrate the notion of goal, which is of crucial importance for understanding the overall mode of operation of the evaluator. The goal is a chunk of variable knowledge, assigned by the user taking into account the pragmatic aspects of the understanding activity, that defines the motivations and objectives behind the reading process. The role of the goal is twofold: exerting control on the activation of the importance rules that operate on the ELR; and selective focusing, i.e. enabling the evaluator to choose from the encyclopedia the pieces of knowledge that are expected to be relevant to the current importance evaluation.</Paragraph> <Paragraph position="7"> The use of the goal in selective focusing comprises two activities: validating a match between the current ELR and the knowledge contained in a frame header or knowledge slot (direct frame activation), and activating a new frame pointed at in a reference slot of a currently active frame (indirect frame activation).</Paragraph> <Paragraph position="8"> Therefore, the encyclopedia does not contain any a priori judgment about importance. Full responsibility for this activity is left to the evaluator, which can interpret the content of the encyclopedia frames according to the current goal and can use the extracted knowledge to support the rule-based evaluation process.</Paragraph> </Section> <Section position="5" start_page="245" end_page="245" type="metho"> <SectionTitle> 4. 
SAMPLE OPERATION OF THE EVALUATOR </SectionTitle> <Paragraph position="0"> The current prototype version of the evaluator operates on simple texts taken from the scientific and technical computer science literature on operating systems. It includes about 40 importance rules and a small encyclopedia of about 30 frames. The goal has been assigned a very simple structure: it is a logical combination of key-terms, chosen from a predefined set, that represent possible points of view a reader can take in analyzing a text.</Paragraph> <Paragraph position="1"> In this section we illustrate some of the most basic mechanisms of importance evaluation through a few examples.</Paragraph> <Paragraph position="2"> Let us consider the following sample text: &quot;U-DOS is an operating system developed by Softproducts Ltd. in 1982. It has a modular organization and is suitable for real-time applications. U-DOS includes powerful tools for interactive processing and supports a sophisticated window management that makes it user friendly, i.e. easily usable by novices or untrained end-users. Easy operation is, in fact, the main reason for its widespread diffusion in the data processing market, especially among CAD/CAM users who appreciate its graphic utilities.&quot; The ELR of this text is the following (for a description of the formalism refer to: and Tasso, 1984): The set of key-terms that can be used to specify the goal includes, among others: KNOW, BUY, and USE. We assume hereinafter the goal KNOW, i.e., we are particularly interested in knowing the main technical features of the U-DOS operating system. 
With such a goal, some pieces of the encyclopedia turn out to be relevant to the evaluation of our sample text, while others are discarded, as will be illustrated below.</Paragraph> <Paragraph position="3"> In order to analyze the text, the evaluator generates from the ELR, as a preliminary step, a new structure, called the cohesion graph, that explicitly shows all the references among propositions of the ELR. The cohesion graph is a bipartite graph whose nodes are concepts and propositions, connected by three kinds of arcs: directed arcs connecting pairs of propositions (say from P to Q), which represent the embedding of one proposition into another (Q in P); simple arcs, connecting a concept and a proposition, which indicate that the concept appears as an argument in the proposition; and double directed arcs, connecting two concepts via a propositional node (say from A to B via P), which show that a concept enters as the argument of a proposition stating an ISA relation (P states that A ISA B).</Paragraph> <Paragraph position="4"> A portion of the cohesion graph of our sample text is shown in Figure 1.</Paragraph> <Paragraph position="5"> Structural rules can exploit the information provided by the cohesion graph in order to selectively capture the importance of the different parts of the text. An example of a structural rule is: Rule S4: Highly Referenced Concept IF in the cohesion graph there is a concept C which is at least K-referenced THEN assign C an importance value w(C) = high. This rule guesses that a concept which is highly referenced in a text is probably important. In our example (where the parameter K is set equal to 3), the concept U-DOS is considered important as it is highly referenced.</Paragraph> <Paragraph position="6"> Importance can be evaluated by chaining several rules. 
As an example, after rule S4 has been applied, the following rule can fire: Rule M7: ISA Proposition IF a proposition P represents an ISA relation</Paragraph> </Section> <Section position="6" start_page="245" end_page="247" type="metho"> <SectionTitle> AND </SectionTitle> <Paragraph position="0"> the argument of P is a concept C with importance value w(C) THEN assign P an importance value w(P) = w(C). The rationale of this rule is that, if a concept is important, any proposition that states an ISA relation about that concept is important too. This allows, for example, proposition 10 (which states that U-DOS is an operating system) to be considered important.</Paragraph> <Paragraph position="1"> Rule M7 allows, moreover, the application of the following rule: Rule E6: ISA Frame Activation IF a proposition P represents an ISA relation</Paragraph> </Section> <Section position="7" start_page="247" end_page="247" type="metho"> <SectionTitle> AND </SectionTitle> <Paragraph position="0"> P has importance value w(P) > low</Paragraph> </Section> <Section position="8" start_page="247" end_page="248" type="metho"> <SectionTitle> AND </SectionTitle> <Paragraph position="0"> the predicate of P is the header of a frame F in the encyclopedia THEN activate F.</Paragraph> <Paragraph position="1"> In our example, the fact that proposition 10 is important (w(P) = high) and that it represents an ISA relation allows the OPERATING-SYSTEM frame to be activated (see Figure 2, where a portion of the encyclopedia relevant to the current example is shown). Note that rule E6 does not directly state whether a proposition or a concept has to be considered important or not; rather, it specifies which frames are to be considered relevant in the current context.</Paragraph> <Paragraph position="2"> Most evaluations are goal dependent and rely on a goal interpreter, which is able to evaluate a specific piece of the ELR or a frame slot of the encyclopedia in order to determine its relevance to the current goal. 
The goal interpreter thus performs a complex matching, which allows the implementation of selective focusing. Consider, for example, the following rule: Rule E19: Goal-Dependent Frame Activation IF the current goal matches a reference slot R of an active frame THEN activate the frame whose header is pointed at by R.</Paragraph> <Paragraph position="3"> Successive applications of this rule allow the activation, starting from the OPERATING-SYSTEM frame, of the SOFTWARE-SYSTEM frame and, then, of the COMPLEX-SYSTEM frame (see Figure 2). At this point the following rule applies:</Paragraph> <Section position="1" start_page="248" end_page="248" type="sub_section"> <SectionTitle> Rule E25: Goal Dependent Matching </SectionTitle> <Paragraph position="0"> IF a proposition P matches a pattern contained in a knowledge slot K of an active frame</Paragraph> </Section> </Section> <Section position="9" start_page="248" end_page="248" type="metho"> <SectionTitle> AND </SectionTitle> <Paragraph position="0"> the current goal matches K THEN assign P an importance value w(P) = high. 
In our example, since (i) the COMPLEX-SYSTEM frame is active and proposition 70 of the ELR matches the pattern MODULAR (ORGANIZATION) of the &quot;structure&quot; slot of the frame, and (ii) the goal interpreter evaluates that the knowledge slot &quot;structure&quot; is relevant to the goal KNOW, proposition 70 is considered important.</Paragraph> <Paragraph position="1"> As a last example, we illustrate a rule that exploits knowledge concerning the macro-structure of the text: Rule M9: Macro Clarification IF there exists a macro-proposition</Paragraph> </Section> <Section position="10" start_page="248" end_page="248" type="metho"> <SectionTitle> CLARIFICATION (P, Q) </SectionTitle> <Paragraph position="0"> THEN assign P and Q importance values such that w(P) &lt; w(Q).</Paragraph> <Paragraph position="1"> Rule M9 implements the idea that a proposition which is used to clarify another proposition (i.e., it paraphrases its content or explains the meaning of some of its terms) has to be considered less important than the proposition it clarifies. This rule can be applied, for example, in rating propositions 210 and 220, the latter being rated less important than the former.</Paragraph> </Section> </Paper>