<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1043">
  <Title>Evaluating and comparing three text-production techniques</Title>
  <Section position="3" start_page="0" end_page="249" type="metho">
    <SectionTitle>
2 Three techniques for producing multisentential text
</SectionTitle>
    <Paragraph position="0"> This section describes the three text-production techniques under assessment.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Fill-in-the-blank semi-automatic technique
</SectionTitle>
      <Paragraph position="0"> Since 1975, the mail department of La Redoute (a European mail-order company) has been using a semi-automatic reply system, referred to below as &quot;SA&quot;, consisting of a number of predefined and fill-in-the-blank sentences or paragraphs which are identified by codes that the writers memorise. Writing a letter therefore involves typing the code that corresponds to the desired paragraph and inserting the relevant elements. The sentences or paragraphs thus produced are therefore concatenations of predefined and inserted texts.</Paragraph>
      <Paragraph position="1"> 1. A relatively high number of predefined sentences and paragraphs has to be provided to cover the writers' needs, but:</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="249" type="sub_section">
      <SectionTitle>
2.2 Automatic Hybrid Generation (Linguistic + Template approach)
</SectionTitle>
      <Paragraph position="0"> La Redoute and GSI-Erli have developed a real-situation pilot system (for details on this project, see (Coch, David, and Magnoler, 1995)) which builds up a text (i.e. a letter) from data entered by the human operator who processes the request; a customer database; and knowledge bases. It uses GSI-Erli's AlethGen text generation toolbox (see (Coch, 1996)).</Paragraph>
      <Paragraph position="1"> The overall system is composed of two main modules: the Decision module and the Generation module.</Paragraph>
      <Paragraph position="2"> The Decision module has the following functions:  * it allows the writer (who reads the request letter) to identify the author and subject of the request letter; * it asks the writer for relevant information; * it suggests a decision (for example, order cancellation, renewal, etc.), after consulting the customer database and the domain knowledge; * it asks the writer to validate the decision (or make a different choice); * it communicates the relevant information to the Generation module.</Paragraph>
      <Paragraph position="3"> The Generation module automatically produces the reply letter in a standard format (SGML). This module consists of several submodules (for more details, see (Coch, David and Magnoler 1995) and (Coch and David, 1994)): the direct generator; the text deep-structure planner (or conceptual planner); the text surface-structure planner (or rhetorical planner); and linguistic realisation, inspired by the Meaning-Text Theory.</Paragraph>
      <Paragraph position="4"> The direct generator has two functions:  1. planning the text in direct mode (top-down), and 2. generating more or less fixed expressions or non-linguistic texts (i.e. tables, addresses, lists, etc.). The direct generator could be used without the other submodules to generate texts in an automatic but non-linguistic way (manipulation of character strings). Reiter (Reiter, 1995) calls this technique &quot;the template approach&quot;.</Paragraph>
      <Paragraph position="5"> The output of the conceptual planner is the text's deep structure, in which the events to be carried out are not yet in a definitive order. The conceptual planner uses logical, causality, and time rules (see (Coch and David, 1994)).</Paragraph>
      <Paragraph position="6"> The rhetorical module chooses concrete operators, modalities and surface order, according to rhetorical rules. The choices made depend on certain attributes, e.g. whether the addressee is aware of an event, whether an event is in the addressee's favour, and so on. Lastly, the linguistic generation submodule realises each event from the text surface structure. It uses anaphora (see (Coch, David and Wonsever, 1994)), semantic, deep-syntactic, surface-syntactic, and morphological rules. This submodule is inspired mainly by the Meaning-Text Theory (as developed for example in (Mel'čuk, 1988) and (Mel'čuk and Polguère 1988)).</Paragraph>
      <Paragraph position="7"> In accordance with Reiter (Reiter, 1995), La Redoute and GSI-Erli's system can be defined as &quot;hybrid&quot;, because it uses both linguistic and template techniques.</Paragraph>
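      <Paragraph> As a rough illustration of the template approach mentioned above (manipulation of character strings), the following Python sketch fills the blanks of a predefined paragraph. The paragraph code P12, the wording of the paragraph, and the field names are all hypothetical; none of them is taken from La Redoute's system or from AlethGen.
<![CDATA[
```python
from string import Template

# Hypothetical predefined paragraphs, indexed by a short code.
# Neither the code nor the wording comes from the system described here.
PARAGRAPHS = {
    "P12": Template(
        "Dear $title $name, your order $order_id has been cancelled as requested."
    ),
}


def fill_paragraph(code: str, **fields: str) -> str:
    """Return the predefined paragraph for `code` with its blanks filled in."""
    # substitute() raises KeyError if one of the blanks is left unfilled.
    return PARAGRAPHS[code].substitute(**fields)


letter = fill_paragraph("P12", title="Ms", name="Martin", order_id="A-1043")
print(letter)
```
]]>
      A purely string-based generator of this kind needs no linguistic knowledge, which is why it can only produce the fixed wording stored in its templates.</Paragraph>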
    </Section>
    <Section position="3" start_page="249" end_page="249" type="sub_section">
      <SectionTitle>
2.3 Human writing
</SectionTitle>
      <Paragraph position="0"> The third technique used was human writing in &quot;ideal&quot; conditions: one of La Redoute's best writers wrote the letters with no time constraints.</Paragraph>
    </Section>
    <Section position="4" start_page="249" end_page="249" type="sub_section">
      <SectionTitle>
2.4 Functional differences
</SectionTitle>
      <Paragraph position="0"> It is to be noted that the three techniques described differ from an external functional point of view: * in the semi-automatic approach, the writer composes the letter themselves, even if assisted by a set of predefined-paragraph codes; * in the automatic hybrid approach, the operator enters data on the addressee and letter, but does not have to compose the reply letter; * in the third case, the writer has to write the letter. Reiter (Reiter, 1995) studied the difference between the linguistic generation and template approaches.</Paragraph>
      <Paragraph position="1"> The two techniques do not differ from an external functional point of view.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="249" end_page="250" type="metho">
    <SectionTitle>
3. Methodology
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="249" end_page="250" type="sub_section">
      <SectionTitle>
3.1 Evaluation Tests
</SectionTitle>
      <Paragraph position="0"> Black-box methodology was used for the assessment, which was carried out by an independent jury of 14 people, who were representative of end users, in a blind-test context. The jury was not informed of the automatic generation project.</Paragraph>
      <Paragraph position="1"> Each member of the jury examined the quality of a set of 60 letters (20 produced by the SA system, 20 by the automatic hybrid system, and 20 human-written, for identical cases). No member of the jury knew which technique had been used for producing each of the letters.</Paragraph>
      <Paragraph position="2"> Each member of the jury wrote a report on each letter, with assessment values according to quality criteria. Examples of these criteria are:  used.</Paragraph>
      <Paragraph position="3"> The first three criteria were considered as eliminatory, and were marked 0 or 1. The other criteria were marked out of 20.</Paragraph>
      <Paragraph position="4"> There were also other criteria, but they were too application-oriented and confidential.</Paragraph>
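      <Paragraph> The marking scheme described in this subsection can be sketched in Python as follows. The criterion names and the marks are invented for illustration, since the full list of criteria is not reproduced here. The sketch also assumes that a letter fails as soon as one eliminatory criterion is marked 0, and that the remaining criteria are simply averaged out of 20.
<![CDATA[
```python
from statistics import mean

# Hypothetical names for the three eliminatory criteria (marked 0 or 1).
# Every other criterion in a report is assumed to be marked out of 20.
ELIMINATORY = ("comprehensible", "grammatical", "correct_content")


def assess(report: dict) -> tuple:
    """Return (passes_eliminatory, mean_of_graded_marks) for one jury report."""
    passes = all(report[name] == 1 for name in ELIMINATORY)
    graded = [mark for name, mark in report.items() if name not in ELIMINATORY]
    return passes, mean(graded)


# One invented report by one jury member on one letter.
report = {
    "comprehensible": 1,
    "grammatical": 1,
    "correct_content": 1,
    "rhythm_and_flow": 14,
    "precision_of_terminology": 16,
    "absence_of_repetitions": 15,
}
passes, average = assess(report)
print(passes, average)
```
]]>
      </Paragraph>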
    </Section>
    <Section position="2" start_page="250" end_page="250" type="sub_section">
      <SectionTitle>
3.2 Representativity of the results
</SectionTitle>
      <Paragraph position="0"> Given that the tests used only 20 letters of each type, one might question their representativity.</Paragraph>
      <Paragraph position="1"> In fact, representativity is ensured by the projection of the results of the previous phase (system tests) which used the same quality criteria, involved a reduced jury (2 to 6 members), and was based on 200 test cases (200 letters of each type).</Paragraph>
      <Paragraph position="2"> The test cycle was performed six times:</Paragraph>
    </Section>
    <Section position="3" start_page="250" end_page="250" type="sub_section">
      <SectionTitle>
Diagnosis
</SectionTitle>
      <Paragraph position="0"> After the sixth cycle, the average quality scores showed that the results would be sufficiently representative.</Paragraph>
      <Paragraph position="1"> For example, for the following criteria: * rhythm and flow 1.21 precision of terminology 0 absence of repetitions</Paragraph>
      <Paragraph position="3"> We can thus conclude that, for the automatic letters, the results are representative. The semi-automatic letters were produced by human &quot;writers&quot; in a real situation. There is no proof of this, but several people who know the semi-automatic system were of the opinion that the semi-automatic letters used in the test were better than the average semi-automatic letter.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="250" end_page="251" type="metho">
    <SectionTitle>
4. Assessment results
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="250" end_page="250" type="sub_section">
      <SectionTitle>
4.1 Eliminatory criteria and overall average
</SectionTitle>
      <Paragraph position="0"> All the automatic and human letters met the eliminatory criteria standards. However, this was not the case for the semi-automatic system, in particular due to problems of comprehension, but also due to grammatical mistakes in the fill-in-the-blank system.</Paragraph>
      <Paragraph position="1"> The overall averages of the entire jury, for all the quality criteria (including application-oriented criteria), and for all the letters were as follows: * semi-automatic system: 11 out of 20 * automatic hybrid system: 14.5 out of 20 * human-written letters: 15.5 out of 20.</Paragraph>
      <Paragraph position="2"> It can be seen that the quality of the letters generated by the pilot system using AlethGen was far superior to that of the semi-automatic system using predefined paragraphs.</Paragraph>
      <Paragraph position="3"> These tests show that the &quot;ideal&quot; human-written letters are, obviously, the best. However, the differences between the human-written letters and those produced by the automatic hybrid system are relatively slight.</Paragraph>
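      <Paragraph> The gaps behind these two observations follow directly from the overall averages listed in this subsection. The Python lines below are only a worked arithmetic sketch: the dictionary keys are illustrative labels, and the semi-automatic average is read as 11 out of 20.
<![CDATA[
```python
from itertools import combinations

# Overall jury averages (out of 20) as reported in this subsection.
AVERAGES = {"semi-automatic": 11.0, "automatic hybrid": 14.5, "human": 15.5}

# Absolute difference between the averages of every pair of techniques.
gaps = {
    (a, b): abs(AVERAGES[a] - AVERAGES[b])
    for a, b in combinations(AVERAGES, 2)
}

for (a, b), gap in gaps.items():
    print(f"{a} vs. {b}: {gap} out of 20")
```
]]>
      On these figures the automatic hybrid system scores 3.5 points above the semi-automatic system and 1.0 point below the human-written letters.</Paragraph>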
    </Section>
    <Section position="2" start_page="250" end_page="251" type="sub_section">
      <SectionTitle>
4.2 Detailed results
</SectionTitle>
      <Paragraph position="0"> Below are the averages for the whole jury and all the letters, as regards the non-eliminatory criteria:  The difference between the ideal human letters and those obtained with the automatic hybrid system is considerable: 2.8 out of 20.</Paragraph>
      <Paragraph position="1"> vs. automatic letters: vs. SA letters: vs. SA letters:  * ideal human letters vs. SA letters: 2.8 The results obtained by the ideal human letters and those generated automatically are close. However, the difference between automatic and semi-automatic letters is considerable: 2 out of 20.</Paragraph>
      <Paragraph position="2">  Here, all the differences are considerable. The human letters are obviously the best, but the difference between the automatic and semi-automatic letters is  Here, all differences are relatively great. That between the automatic and semi-automatic letters is considerable: 2.4 out of 20.</Paragraph>
    </Section>
  </Section>
</Paper>