File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/03/n03-3003_metho.xml
Size: 18,400 bytes
Last Modified: 2025-10-06 14:08:14
<?xml version="1.0" standalone="yes"?> <Paper uid="N03-3003"> <Title>Language choice models for microplanning and readability</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Constructing the microplanner </SectionTitle> <Paragraph position="0"> This section describes the stages in the construction of the microplanner. Each stage is based on empirical evidence. Firstly, we acquired knowledge about how human writers linguistically realise specific discourse relations by carrying out a corpus analysis (see Williams and Reiter 2003). Secondly, we selected the best method for representing this knowledge and built choice models from the corpus analysis data. Then, because the corpus was written for good readers, we had to adapt the models for poor readers. For this, we used results from psycholinguistic studies, including results from our own preliminary experiments (see Williams et al. 2003).</Paragraph> <Paragraph position="1"> Finally, these individual parts were combined to produce the finished microplanner.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Reconfiguring our corpus analysis results </SectionTitle> <Paragraph position="0"> We analysed seven discourse relations (Williams and Reiter 2003), including concession, condition, elaboration-additional, evaluation, example, reason and restatement, using the RST Discourse Treebank corpus (Carlson et al. 2002). We analysed one hundred instances of each relation noting the following six features: null L1: length of the first text span (in words).</Paragraph> <Paragraph position="1"> L2: length of the second text span (in words).</Paragraph> <Paragraph position="2"> O: ordering of the text spans.</Paragraph> <Paragraph position="3"> Ps: position(s) of discourse cue phrase(s).</Paragraph> <Paragraph position="4"> P: between-text-span punctuation.</Paragraph> <Paragraph position="5"> C: discourse cue phrase(s).</Paragraph> <Paragraph position="6"> An example to demonstrate these features is the concession relation in the last example given above: &quot;If you practise, you can learn to fill in forms. You made four mistakes, though.&quot; Here, L1 is ten words (this includes the whole of the condition daughter), L2 is five words, O is concession-statement, Ps is after the statement, P is a full stop and C is &quot;though&quot;.</Paragraph> <Paragraph position="7"> These features were chosen on the basis of previous work (Moser and Moore 1996) and because they influence sentence length and lexical choice which are known to be important factors in readability. The analysis revealed some of the ways in which human authors select these features when writing for good readers.</Paragraph> <Paragraph position="8"> These provided a partial specification for modelling discourse-level choices that should be available in an</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> RELATION </SectionTitle> <Paragraph position="0"> id: R1 type: concession concession: statement:</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> RELATION </SectionTitle> <Paragraph position="0"> id: R2 type: condition condition: consequent: TEXT SPAN id: S1 text: &quot;you made four mistakes&quot; TEXT SPAN id: S3 text: &quot;you practise &quot; TEXT SPAN id: S2 text: &quot;you can learn to fill in forms&quot;</Paragraph> <Paragraph position="2"> NLG system. 
<Paragraph position="10"> The results from our corpus analysis (Williams and Reiter 2003) were simplified. The numbers of values for some features were cut down by re-classifying them as members of a smaller number of categories. Length became either &quot;long&quot; or &quot;short&quot;. The data for each relation was split into two, so that roughly half the L1 instances fell into the &quot;short&quot; category (e.g. for concession, short = 1-15 words, long = >15 words).</Paragraph>
<Paragraph position="11"> Between-text-span punctuation was divided into just three categories: none, non-sentence-breaking, and sentence-breaking. The restatement relation was an exception because it had such a large proportion of open parentheses (62%) that an extra category was created. In restatement, it seems that punctuation is often used instead of a cue phrase to signal the relation. The cue phrase feature was left with a larger number of values to provide GIRL with the maximum number of choices for lexical selection.</Paragraph>
<Paragraph position="12"> The data was reconfigured as sets of 6-tuples. Each represents a set of values for one instance of a relation, i.e. <L1,L2,O,Ps,P,C>. For instance, the concession relation described above would be represented as <short, short, concession-statement, after_statement, full stop, &quot;though&quot;>. We thus created seven hundred 6-tuples in total, one hundred per relation. For each relation, these were sorted, duplicates were counted and superfluous duplicates removed. Of the resulting unique 6-tuples, some were rejected and are not used in the current language choice models. For example, in the concession choice model forty-six unique 6-tuples cover 100% of the corpus data; sixteen of these were rejected, resulting in a coverage of 75%. For condition, forty-seven unique 6-tuples cover 100%, but only twenty-six were included and these cover 72%.</Paragraph>
<Paragraph position="13"> Some tuples were rejected because GIRL's present shallow approaches to syntactic and semantic processing cannot generate them. It cannot currently generate embedded discourse text spans, nor can it generate discourse cue phrases in mid-text-span positions. Both of these would require the implementation of deeper syntactic processing. Certain 6-tuples contain discourse cue phrases that would not make sense when generated unless we implement deeper semantic processing. Figure 3 shows some examples of these; cue phrases marked with asterisks have been rejected from the current language models because they require deeper processing.</Paragraph>
<Paragraph position="14"> Our current method for reconfiguring the data is manual, using existing spreadsheet, database and statistics packages. We are investigating how it could be automated, given that some decisions, such as which 6-tuples to reject, require human judgement.</Paragraph>
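As a minimal sketch of this reconfiguration step (illustrative names and code, not GIRL's implementation; only the 15-word concession split and the worked example come from the analysis), lengths are bucketed into short and long, the categorical 6-tuple is formed, and duplicates are counted so that the coverage of the retained tuples can be computed:

    import java.util.*;

    public class Reconfigure {
        enum Length { SHORT, LONG }
        enum Punct { NONE, NON_SENTENCE_BREAKING, SENTENCE_BREAKING }

        // Per-relation split point chosen so that roughly half the L1 instances are "short";
        // the text gives 15 words for concession, other relations differ.
        static Length bucket(int words, int threshold) {
            return words <= threshold ? Length.SHORT : Length.LONG;
        }

        // A simplified 6-tuple <L1, L2, O, Ps, P, C>.
        record Tuple(Length l1, Length l2, String ordering,
                     String cuePosition, Punct punctuation, String cuePhrase) {}

        public static void main(String[] args) {
            // The concession example becomes <short, short, concession-statement,
            // after_statement, sentence-breaking (full stop), "though">.
            Tuple example = new Tuple(bucket(10, 15), bucket(5, 15),
                    "concession-statement", "after_statement",
                    Punct.SENTENCE_BREAKING, "though");

            // Count duplicates over the one hundred instances of a relation, then compute
            // the share of the data covered by the unique tuples that are retained.
            List<Tuple> instances = List.of(example /* ... plus the other 99 instances ... */);
            Map<Tuple, Integer> counts = new HashMap<>();
            for (Tuple t : instances) counts.merge(t, 1, Integer::sum);

            Set<Tuple> retained = Set.of(example);   // the tuples GIRL can actually generate
            int covered = retained.stream().mapToInt(t -> counts.getOrDefault(t, 0)).sum();
            System.out.printf("coverage = %.0f%%%n", 100.0 * covered / instances.size());
        }
    }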
</Section>
<Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Building CSP graphs for good readers </SectionTitle>
<Paragraph position="0"> Having reconfigured the results of our corpus analysis, we searched for the best way to model the choices they represent. We explored both discriminant analysis and machine learning of decision trees to identify which feature(s) would most clearly divide the data into groups. For most discourse relations, the positions of discourse cue phrases were the most discriminating features.</Paragraph>
<Paragraph position="1"> The most crucial characteristic of the choice models we were attempting to build was that they should reflect the interdependencies of the features found in the corpus analysis. For instance, in most relations the selection of between-span punctuation is dependent on the length of the first text span. For some relations (not all), this means that as the first text span gets longer, the between-span punctuation tends to change from no punctuation, to comma, to full stop. Similarly, the selection of punctuation depends on the order of text spans, particularly with the condition relation. If the order is condition-consequent, there tends to be a comma between text spans; if the order is consequent-condition, there is often no punctuation. And so on with interdependencies between all the other features.</Paragraph>
<Paragraph position="2"> The best representation we have found to date that fits this requirement is constraint satisfaction problem (CSP) graphs. Power (2000) demonstrated that CSPs could be used to map rhetorical structures (i.e. discourse relation trees) to text structures (paragraphs, sentences, etc.). Our task is similar to Power's, but we emphasise different processes, such as cue phrase choice; our choice models are based on empirical evidence; and we have the additional criterion that the representations should be adaptable for different reading abilities. CSP graphs turned out to be ideal for this purpose, since we exploit CSP's notion of 'tightening' the constraints in our solution for adapting the models for poor readers (see section 2.3).</Paragraph>
<Paragraph position="3"> We used the Java Constraint Library (JCL 2.1) from the Artificial Intelligence Laboratory at the Swiss Federal Institute of Technology in Lausanne (Torrens 2002), which we found to be portable, relatively bug-free and easy to plug straight into our system, which is written entirely in Java.</Paragraph>
<Paragraph position="4"> We built computer models representing the six key features of discourse relations and their interdependent values. One CSP graph was built for each of the seven discourse relations. The structure of the graphs is exactly the same for each relation, with six nodes and fifteen connections linking every node to all the others. This structure is illustrated in figure 4.</Paragraph>
<Paragraph position="5"> The nodes in the graph in figure 4 are CSP domain variables. Each represents one of the six features. The number of values for each node varies from relation to relation. Constraints between the variables were represented as &quot;good lists&quot;. Both values and constraints were coded directly from the 6-tuple data. Good lists contain pairs of values that are &quot;legal&quot; for two variables. For instance, a connection between L1 and P might contain the pair <short, non-sentence-breaking> in its good list, meaning: if the length of the first text span in the relation is short, put non-sentence-breaking punctuation, such as a comma, between the text spans. The number of pairs in the &quot;good lists&quot; attached to each of the fifteen connections varies for each relation.</Paragraph>
<Paragraph position="6"> We used pairs of &quot;legal&quot; values in the CSP good lists because the corpus analysis is too small to predict the probabilities of triples. We are currently working on expanding the size of our corpus analysis. We wanted the CSP graphs to generate solutions that gave as good a coverage as possible of the 6-tuples included in the models, but we did not want to overgenerate instances that did not occur in the analysis. This required delicate balancing of the two factors.</Paragraph>
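A minimal sketch of such a model is given below, in plain Java rather than the JCL library we actually use: six variables, pairwise good lists of legal value pairs, and a simple backtracking search that keeps every assignment in which all constrained pairs are legal. The domains and good-list pairs shown are illustrative (apart from the <short, non-sentence-breaking> example above); in the real models all fifteen connections carry good lists coded from the 6-tuple data.

    import java.util.*;

    public class GoodListCsp {
        // The six CSP domain variables: L1, L2, O, Ps, P, C (domains shrunk for illustration).
        static final String[] VARS = { "L1", "L2", "O", "Ps", "P", "C" };
        static final Map<String, List<String>> DOMAINS = Map.of(
                "L1", List.of("short", "long"),
                "L2", List.of("short", "long"),
                "O",  List.of("concession-statement", "statement-concession"),
                "Ps", List.of("after_statement", "before_concession"),
                "P",  List.of("none", "non-sentence-breaking", "sentence-breaking"),
                "C",  List.of("though", "but"));

        // "Good lists": pairs of values that are legal for a pair of connected variables.
        // Only two of the fifteen connections are shown here.
        static final Map<String, Set<String>> GOOD = Map.of(
                "L1|P", Set.of("short|none", "short|non-sentence-breaking", "long|sentence-breaking"),
                "O|C",  Set.of("concession-statement|though", "statement-concession|but"));

        static boolean consistent(Map<String, String> partial) {
            for (Map.Entry<String, Set<String>> e : GOOD.entrySet()) {
                String[] pair = e.getKey().split("\\|");
                String a = partial.get(pair[0]);
                String b = partial.get(pair[1]);
                if (a != null && b != null && !e.getValue().contains(a + "|" + b)) return false;
            }
            return true;
        }

        // Simple backtracking search over the six variables.
        static void solve(int i, Map<String, String> partial, List<Map<String, String>> solutions) {
            if (i == VARS.length) { solutions.add(new LinkedHashMap<>(partial)); return; }
            for (String value : DOMAINS.get(VARS[i])) {
                partial.put(VARS[i], value);
                if (consistent(partial)) solve(i + 1, partial, solutions);
                partial.remove(VARS[i]);
            }
        }

        public static void main(String[] args) {
            List<Map<String, String>> solutions = new ArrayList<>();
            solve(0, new LinkedHashMap<>(), solutions);
            solutions.forEach(System.out::println);   // every assignment that satisfies all good lists
        }
    }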
</Section>
<Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Adapting the models for poor readers based on psycholinguistic evidence </SectionTitle>
<Paragraph position="0"> The language choice models were adapted for poor readers by tightening the constraints. We studied the psycholinguistic and educational literature to determine how they should be tightened. We also carried out preliminary experiments of our own (Williams et al. 2003), which indicated that certain discourse-level features affect readability for poor readers more than for good readers. Selecting more common discourse cue phrases and placing punctuation between discourse segments were both particularly helpful for poor readers.</Paragraph>
<Paragraph position="1"> Existing psycholinguistic research on reading has little to say about adults with poor literacy. It has tended to focus on proficient adult readers (university students), rather than on the problems of adult learner readers. Where it has investigated the development of reading skills, it has tended to focus on children, rather than adults. Educationalists maintain that the reading skill profiles of adults with poor literacy are different from those of children: 'normal' children tend to develop reading skills evenly, whereas adults who are functionally illiterate tend to have developed them unevenly (Strucker 1997). Yet another problem is that this research tends to focus on single words, single sentences, or pairs of sentences presented to a reader out of context, rather than in multiple-sentence documents.</Paragraph>
<Paragraph position="2"> There are some exceptions, however. Devlin and Tait (1998) found that the readability of newspaper texts was increased for seven out of ten aphasic readers when they replaced infrequent words with more frequent synonyms. Leijten and Van Waes (2001) reported that elderly readers' comprehension and recall improved when they were presented with causal discourse structures containing explicit discourse cue phrases and explicit headings. Degand et al. (1999) observed that removal of even a few cue phrases affects comprehension and recall of the entire content. The last two studies were with adult readers from the general public with (presumably) varying levels of reading ability.</Paragraph>
<Paragraph position="3"> To sum up, use of cue phrases, selection of common cue phrases and use of between-span punctuation all seem to help bad readers. We therefore chose to tighten the constraints to favour solutions with these features.</Paragraph>
<Paragraph position="4"> Frequencies for cue phrases were obtained from a part-of-speech (POS) search (Aston and Burnard 1998) in the 100-million-word British National Corpus (BNC). Phrases like 'for example' are annotated with a single part-of-speech in the BNC. Some results are shown in Table 1. Cue phrases do not all have the same POS, and they are not, of course, exact synonyms, so it is not always possible to substitute one for another even if both are from the same relation. 'Such as' cannot always be substituted for 'for instance', but 'for example' is a close synonym and substitution is possible.</Paragraph>
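As a minimal sketch of what tightening can mean in practice (illustrative pairs and placeholder frequency figures, not our corpus data or real BNC counts), the fragment below prunes one relation's legal punctuation/cue-phrase pairs so that only solutions with explicit between-span punctuation and the more frequent of two near-synonymous cue phrases survive:

    import java.util.*;

    public class TightenForPoorReaders {
        public static void main(String[] args) {
            // A fragment of one relation's model: legal <P, C> (punctuation, cue phrase) pairs.
            // The pairs and the frequency figures below are placeholders only.
            Set<String> goodPC = new LinkedHashSet<>(List.of(
                    "none|for instance",
                    "non-sentence-breaking|for instance",
                    "non-sentence-breaking|for example",
                    "sentence-breaking|for example"));
            Map<String, Integer> frequency = Map.of("for instance", 1, "for example", 3);

            // Tightening 1: favour explicit between-span punctuation for poor readers.
            goodPC.removeIf(pair -> pair.startsWith("none|"));

            // Tightening 2: favour the more frequent of two near-synonymous cue phrases.
            int best = goodPC.stream()
                    .mapToInt(pair -> frequency.getOrDefault(pair.split("\\|")[1], 0))
                    .max().orElse(0);
            goodPC.removeIf(pair -> frequency.getOrDefault(pair.split("\\|")[1], 0) < best);

            // Only pairs with explicit punctuation and the commoner cue phrase remain.
            System.out.println(goodPC);
        }
    }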
<Paragraph position="5"> We tightened constraints, where possible, to favour words that occur in the Dolch lists used by adult literacy tutors. These list the most commonly occurring function words that beginner readers are taught to sight read.</Paragraph>
<Paragraph position="6"> Another danger with substituting common phrases for less common ones is that the most common phrases are also the most ambiguous. The cue phrases 'but' and 'and' both occurred in four relations (concession, elaboration-additional, evaluation and reason) out of seven in the corpus analysis, and these are relations with very different meanings. These problems require further investigation.</Paragraph>
<Paragraph position="7"> [Table 1: cue phrase, BNC frequency, and Dolch list membership; the table data is not recoverable from this extraction.]</Paragraph>
</Section>
<Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.4 Putting it all together - the microplanner </SectionTitle>
<Paragraph position="0"> Figure 4 shows the main components of the microplanner. The inputs are a model of the user's reading ability (marked 'user model') and a document plan containing discourse relation trees (marked 'DocPlan'). Both are built by system modules occurring earlier than the microplanner in the processing sequence. The document plan in figure 4 is the same as shown above in figure 2.</Paragraph>
<Paragraph position="1"> Working bottom-up, a CSP graph for the current relation is retrieved from the CSP graph knowledge base and the constraints are tightened or relaxed according to the user model. The CSP Solver (Torrens 2002) then uses simple backtracking search to find all solutions for the relation. The solutions found by the CSP Solver are passed through a filter, which currently picks the most frequently occurring solution for good readers and the one with the shortest overall sentences for poor readers. The output is a schema that the next module of GIRL uses to construct messages.</Paragraph>
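A minimal sketch of this filtering step is given below; the Solution record and its corpusCount and totalSentenceLength fields are assumed bookkeeping rather than GIRL's actual data structures, and only the selection policy itself comes from the description above:

    import java.util.*;

    public class SolutionFilter {
        // One CSP solution plus assumed bookkeeping: how often the corresponding 6-tuple
        // occurred in the corpus, and the total sentence length of the text it would produce.
        record Solution(Map<String, String> values, int corpusCount, int totalSentenceLength) {}

        // Most frequently occurring solution for good readers,
        // shortest overall sentences for poor readers.
        static Solution choose(List<Solution> solutions, boolean poorReader) {
            Comparator<Solution> byFrequency = Comparator.comparingInt(Solution::corpusCount).reversed();
            Comparator<Solution> byLength = Comparator.comparingInt(Solution::totalSentenceLength);
            return solutions.stream().min(poorReader ? byLength : byFrequency).orElseThrow();
        }

        public static void main(String[] args) {
            List<Solution> solutions = List.of(
                    new Solution(Map.of("P", "full stop", "C", "though"), 12, 15),
                    new Solution(Map.of("P", "comma", "C", "but"), 30, 18));
            System.out.println(choose(solutions, false).values());  // good reader: most frequent
            System.out.println(choose(solutions, true).values());   // poor reader: shortest
        }
    }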
<Paragraph position="2"> The microplanner does not always output the most coherent solution. For instance, the output shown in figure 5 would result in a final output of &quot;You made four mistakes. But if you practise, you can learn to fill in forms&quot;. Adjacent discourse cue phrases do not improve coherence. The microplanner is still under development, however, and future improvements, possibly including backtracking, will address readability and coherence considerations such as focus and reference.</Paragraph>
</Section> </Section>
<Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Discussion </SectionTitle>
<Paragraph position="0"> Additional functionality would need to be added to the 'filter' module to choose solutions that optimise discourse coherence. Additional nodes might be required in the constraint graphs. The simple string content of discourse relations would have to be replaced by semantic representations. If it were, the simple pipeline architecture would no longer be appropriate, since it currently depends on knowing the final length of the strings.</Paragraph>
<Paragraph position="1"> On the other hand, when generating text for bad readers, we might have to sacrifice some of these enhancements, since they might impact on readability. Ellipsis, for instance, may not be good for bad readers. Ellipsis is one way that conciseness can be achieved during aggregation.</Paragraph>
<Paragraph position="2"> Current opinion in the NLG community is that aggregation for conciseness is 'a good thing'. Reape and Mellish (1999) even suggest that an NLG system should 'aggregate whenever possible'. But concise text may be less comprehensible for poor readers. The sentences in A, below, could be aggregated as in B.</Paragraph>
<Paragraph position="3"> A. Spelling is hard. But spelling is important.</Paragraph>
<Paragraph position="4"> B. Spelling is hard but important.</Paragraph>
<Paragraph position="5"> However, in B the single sentence is longer, and the cognitive load for poor readers in working out the ellipsis could be higher. A little repetition and redundancy might actually turn out to be beneficial!</Paragraph>
</Section> </Paper>