<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-3158">
  <Title>Generation of Extended Bilingual Statistical Reports</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Text Planning for Statistical
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Reports
</SectionTitle>
      <Paragraph position="0"> Our approach to planning statistical reports is similar to that used in the Gossip project [2]. A &quot;conceptual frame&quot; tree schema is instantiated with input data to provide an initial characterization of the intended content of the reports. Input data for the employment domain is in the form of relational tables which provide numerical values for employment, unemployment, participation rate, etc., broken down by age, sex, region and industry, for the current reporting period (e.g., month), as well as for previous comparison periods (e.g., preceding month, one year ago, etc.). The instantiated tree gives a preliminary hierarchical structure for the future text, and provides a framework for further processing on the content to determine the details of text structure. (Actes de COLING-92, Nantes, 23-28 août 1992; Proc. of COLING-92, Nantes, Aug. 23-28, 1992.)</Paragraph>
      <Paragraph position="1"> For example, comparisons of employment changes in various labour force groups will lead to ordering messages (future clauses) so as to highlight the most significant changes. The tree structure is traversed and modified as a part of this process. The conceptual text tree also carries annotations of theme and rheme specifications which will constrain the set of possible texts which can be derived from it.</Paragraph>
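      <Paragraph> As a rough illustration of the ordering step, the following Python sketch builds one message per labour force group from two table columns and sorts the messages so that the largest changes come first. It is not the LFS planner, which is written in Prolog: the names Message, build_messages and order_by_significance, the table layout, the input values and the use of absolute change as the measure of significance are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """One future clause: a change in an indicator for a labour force group."""
    indicator: str  # e.g. "employment"
    group: str      # e.g. "women 25+"
    change: int     # current period minus comparison period


def build_messages(indicator, current, previous):
    """Turn two table columns (group -> value) into change messages."""
    return [Message(indicator, g, current[g] - previous[g]) for g in current]


def order_by_significance(messages):
    """Assumed criterion: larger absolute changes are reported first."""
    return sorted(messages, key=lambda m: abs(m.change), reverse=True)


# Toy tables with hypothetical values (numbers of persons).
previous = {"youths": 2_000_000, "men 25+": 5_000_000, "women 25+": 4_000_000}
current = {"youths": 2_000_500, "men 25+": 4_988_000, "women 25+": 4_044_000}

ordered = order_by_significance(build_messages("employment", current, previous))
print([m.group for m in ordered])
```

 In this toy run the groups come out in the order women, men, youths, so a planner working this way would mention the change among women first.</Paragraph>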
      <Paragraph position="2"> An important part of text planning is the identification of messages which can be grouped together into structures which will give rise to single sentences.</Paragraph>
      <Paragraph position="3"> This includes conjoining two messages with identical theme to give marked structures that will produce linguistic conjunction and subject pronominalization later as in: (1) Employment increased by 20,000 among women while it decreased slightly among men.</Paragraph>
      <Paragraph position="4"> Conceptual conjunction includes checking and marking similarities in thematic elements that may later lead to ellipsis, as in (2), with the possible introduction of lexical functions, as in (3) (see footnote 2).</Paragraph>
      <Paragraph position="5">  (2) Employment increased by 5000 in Manitoba, by 10,800 in Alberta and by 15,000 in Ontario.</Paragraph>
      <Paragraph position="6"> (3) For the week ended November 18, 1989, the seasonally adjusted level of employment was estimated at 12,518,000, up 32,000 from October.</Paragraph>
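      <Paragraph> The grouping behind the elliptical sentence (2) can be sketched in Python as follows. The tuple layout and the function names conjoin and render are hypothetical, and the rendering is a plain string template rather than the Meaning-Text realization that LFS performs. The sketch only shows how adjacent messages with the same theme and predicate might be merged so that the shared material is expressed once.

```python
from itertools import groupby


def conjoin(messages):
    """Merge adjacent messages that share theme and predicate.

    Each message is a (theme, predicate, amount, place) tuple. The shared
    theme and predicate are kept once; only the varying parts are listed.
    """
    groups = []
    for (theme, pred), run in groupby(messages, key=lambda m: (m[0], m[1])):
        parts = [(amount, place) for _, _, amount, place in run]
        groups.append((theme, pred, parts))
    return groups


def render(theme, pred, parts):
    """Naive English rendering with ellipsis of the repeated theme and verb."""
    chunks = [f"by {amount:,} in {place}" for amount, place in parts]
    if len(chunks) > 1:
        body = ", ".join(chunks[:-1]) + " and " + chunks[-1]
    else:
        body = chunks[0]
    return f"{theme} {pred} {body}."


messages = [
    ("Employment", "increased", 5000, "Manitoba"),
    ("Employment", "increased", 10800, "Alberta"),
    ("Employment", "increased", 15000, "Ontario"),
]
sentences = [render(*group) for group in conjoin(messages)]
print(sentences[0])
```

 The three input messages collapse into a single group, and the one resulting sentence names the theme and the verb only once.</Paragraph>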
      <Paragraph position="7"> It has been noticed [5] that certain types of report texts have complex internal dependencies that put special demands on the planning mechanism used.</Paragraph>
      <Paragraph position="8"> In particular, top-down expansion of rhetorical operators is inadequate for generating statistical reports in our domain. Our planning approach, by making use of the power of arbitrary tests and operations on tree schemata, allows us to adequately represent the cross-serial dependencies found among the pieces of content of these reports. However, a more general, but appropriately constrained language for report planning seems to be a desirable goal for future research.</Paragraph>
      <Paragraph position="9"> 2 The lexeme up is the value of the lexical function Adv1 applied to the verb increase.</Paragraph>
    </Section>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Interlingual Representation
</SectionTitle>
    <Paragraph position="0"> Published bilingual reports in our domain occasionally exhibit deep differences between corresponding  Not only are the surface syntactic structures incomparable in this case, but they cannot be easily related on the level of linguistic semantics, because their semantic predicates are dissimilar. We have therefore chosen to use a conceptual interlingua (the output of the text planning process) in order to derive separate semantic net representations of the sentences in each language. Hence the sentences (4a) and (4b) are derived from non-isomorphic Meaning-Text semantic networks, which allow us to fully represent the two languages' different &quot;viewpoints&quot; on the same conceptual material.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Realizer Design
</SectionTitle>
    <Paragraph position="0"> Grammatical realization in the LFS system is the process by which the semantic nets produced by the planner for the incipient sentences are converted into surface sentences of each language. Our realizer for English is based largely on the general Meaning-Text sentence realizer used in Gossip, with some additions to cover structures found in statistical texts. A comparable realizer for French has been built for LFS. As in the case of Gossip, we use four main linguistic levels of representation between conceptual structures and texts: semantic nets (SemR), deep syntactic dependency trees (DSyntR), surface syntactic dependency trees (SSyntR) and morphological strings (MorphR). For each language, the first linguistic operation requires searching the semantic net for a given sentence to determine the communicatively dominant node. This search is constrained by the theme/rheme specifications which the SemR inherits from the conceptual structure (see [9, 3]).</Paragraph>
    <Paragraph position="1"> The second operation consists of &quot;replacing&quot; single or complex (configurations of) meaning-bearing nodes in the semantic network by actual lexemes of the language, and replacing semantic features on those nodes by grammatical features which will be attached to the nodes of the future deep-syntactic tree. These operations lead to a reduced semantic graph (RSemR), which is intermediate between SemR and DSyntR. In fact, the SemR is not modified, but rather it is used as a blueprint for building the RSemR, just as each subsequent representation is built by mapping rules from its ancestor representation. The production of the DSyntR tree out of the RSemR, called &quot;arborization&quot;, entails the mapping of predicate-argument relations to deep syntactic relations using information about potential dominant nodes of the RSemR and grammatical features.</Paragraph>
    <Paragraph position="2"> The SSyntR is built by mapping deep-syntactic nodes and relations into their surface-syntactic counterparts. Single DSyntR nodes corresponding to phrasemes (i.e., locutions) give rise to syntactic sub-trees in SSyntR, and some grammatical lexemes are introduced, including auxiliary verbs, articles and syntactically motivated prepositions.</Paragraph>
    <Paragraph position="3"> The next mapping, to MorphR structure, determines word order and all syntactically motivated morphological features. A final operation produces actual text by computing the final (graphical) wordforms based on the morphological features attached to lexemes in MorphR.</Paragraph>
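    <Paragraph> The chain of representations described in this section can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the dictionary and tuple encodings, the function names, the relation label subjectival and the one-entry inflection table are toy stand-ins, not the data structures or rules of the LFS realizer, and the initial search for the communicatively dominant node is left out. The point is only that each level is built from its predecessor by a mapping, while earlier levels are kept unchanged.

```python
LEVELS = ["SemR", "RSemR", "DSyntR", "SSyntR", "MorphR", "Text"]


def lexicalize(sem):
    # SemR to RSemR: meanings become lexemes and semantic features become
    # grammatical features. A new structure is built; sem is not changed.
    return {"head": sem["pred"].upper(), "arg": sem["arg"].upper(),
            "gram": {"tense": sem["time"]}}


def arborize(rsem):
    # RSemR to DSyntR: the predicate-argument relation becomes deep relation I.
    return {"node": rsem["head"], "gram": rsem["gram"],
            "deps": [("I", rsem["arg"])]}


def to_surface(dsynt):
    # DSyntR to SSyntR: deep relations are mapped to surface relations.
    names = {"I": "subjectival"}
    return {"node": dsynt["node"], "gram": dsynt["gram"],
            "deps": [(names[rel], dep) for rel, dep in dsynt["deps"]]}


def linearize(ssynt):
    # SSyntR to MorphR: fix word order (subject before verb in this toy case).
    subject = [dep for rel, dep in ssynt["deps"] if rel == "subjectival"]
    return [(lex, {}) for lex in subject] + [(ssynt["node"], ssynt["gram"])]


def inflect(morph):
    # MorphR to text: compute wordforms from morphological features.
    forms = {("INCREASE", "past"): "increased"}
    words = [forms.get((lex, feats.get("tense")), lex.lower())
             for lex, feats in morph]
    return " ".join(words).capitalize() + "."


MAPPINGS = [lexicalize, arborize, to_surface, linearize, inflect]


def realize(sem):
    """Return every representation, each built from its predecessor."""
    trace = {"SemR": sem}
    current = sem
    for level, mapping in zip(LEVELS[1:], MAPPINGS):
        current = mapping(current)
        trace[level] = current
    return trace


trace = realize({"pred": "increase", "arg": "employment", "time": "past"})
print(trace["Text"])
```

 Keeping the whole trace rather than overwriting a single structure mirrors the remark above that the SemR serves as a blueprint and is not itself modified.</Paragraph>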
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
5 Lexical Functions
</SectionTitle>
    <Paragraph position="0"> The sublanguage of statistical summary reports shows a certain amount of variation in the syntactic structure and lexical choices used to express a given content. We have used lexical functions to implement this paraphrastic variation in a systematic way within our Meaning-Text models, much as was done in Gossip [3]. Briefly stated, lexical functions (LFs) can be considered abstract meanings which have different lexical values depending on their argument lexemes. LFs provide a way of delaying some idiosyncratic lexical realizations until after major syntactic choices have been made. They also allow us to formulate very general paraphrase rules.</Paragraph>
    <Paragraph position="1"> Our statistical reports, with their emphasis on numerical changes and comparisons of change, provide an excellent opportunity to use lexical functions, such as Magn (&quot;intensifying&quot; word), S0 (action nominal) and Oper1 (agent-oriented support verb). For example, sentence (6) can be calculated to be a paraphrase of (5): (5) Employment decreased sharply in October.</Paragraph>
    <Paragraph position="2"> (6) Employment showed a sharp decrease in October.</Paragraph>
    <Paragraph position="3"> A general paraphrase rule states that a verbal lexeme (here, decrease) can be paraphrased by a syntactic construction where the new verb (i.e., show) is the value of Oper1 operating on the nominalization (i.e., S0) of the old verb. This computation, carried out by successively looking up LF values in argument word lexical entries, derives the new verb show by functional composition. In the derived paraphrase sentence (6) this new verb takes as its syntactic object the nominalization of the old verb. In a separate operation which is factored out of the paraphrase operation, the lexical value of the intensifier sharp is computed via the lexical function Magn operating on the lexeme decrease. It is simpler to delay its evaluation until after the change in grammatical category of the head word. The paraphrase rule which relates the two verbal constructions of (5) and (6) can be stated using only lexical functions, lexical class (part-of-speech) symbols and grammatical relations, without reference to specific lexical items.</Paragraph>
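    <Paragraph> The composition Oper1(S0(V)) and the delayed evaluation of Magn can be made concrete with a toy Python lexicon. The entry names decrease_V and decrease_N, the dictionary encoding of a clause and the function names are hypothetical and cover only what is needed for sentences (5) and (6); the actual LFS lexical entries and rule formalism are not shown in this paper.

```python
# Toy lexicon: each entry lists the values of lexical functions for a lexeme.
LEXICON = {
    "decrease_V": {"S0": "decrease_N", "Magn": "sharply"},
    "decrease_N": {"Oper1": "show", "Magn": "sharp"},
}


def lf(name, lexeme):
    """Look up the value of lexical function name for lexeme."""
    return LEXICON[lexeme][name]


def support_verb_paraphrase(clause):
    """Rewrite V as Oper1(S0(V)) governing S0(V).

    The rule mentions only lexical functions, never a specific lexeme.
    """
    noun = lf("S0", clause["head"])   # nominalization of the old verb
    verb = lf("Oper1", noun)          # support verb of that noun
    return {"head": verb, "object": noun,
            "subject": clause["subject"], "magn": clause["magn"]}


def intensifier(clause):
    """Evaluate Magn late, on the lexeme that finally carries the meaning."""
    if not clause["magn"]:
        return None
    target = clause.get("object", clause["head"])
    return lf("Magn", target)


original = {"head": "decrease_V", "subject": "employment", "magn": True}
paraphrase = support_verb_paraphrase(original)
print(original["head"], intensifier(original))
print(paraphrase["head"], intensifier(paraphrase), paraphrase["object"])
```

 Because Magn is looked up only after the head has changed category, the same request for intensification yields the adverb sharply for the verbal construction and the adjective sharp for the nominal one.</Paragraph>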
    <Paragraph position="4"> In addition to the above &quot;well-known&quot; lexical functions, our domain also makes use of Syn (synonym), AntiMagn (diminutive modifier), Locin (locative preposition), Adv1 (locative adverb) and several more &quot;exotic&quot; ones. Most LFs used in our system are introduced during the mapping from RSemR to DSyntR. Exceptions include Syn, which is used during reduction of SemR to RSemR. When there are two semantic nodes with identical lexemic meanings, the realizer uses Syn to lexicalize one differently from the other by finding a synonym.</Paragraph>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
6 Implementation and Future
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Directions
</SectionTitle>
      <Paragraph position="0"> The LFS system is implemented in Quintus Prolog on Sun 4 workstations. Adaptations to several specific varieties of employment reports have been carried out, including the multi-paragraph general summary reports for English and French, given below in Sections 6.1 and 6.2 respectively. The approach outlined here is now being extended to produce other varieties of statistical reports, dealing with different kinds of data (e.g., retail trade summaries). The user interface, which currently allows various choices from among a set of options, is being made more flexible and dynamic by tying the choices more directly to the tree schemata that guide the planning process.</Paragraph>
      <Paragraph position="1"> Until now, the LFS paraphrasing capability has been implemented only for cases where variation is needed to avoid repetition within a given sentence. The next step, now in preparation, is to enforce variation over longer stretches of text such as whole paragraphs.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
6.1 Sample English output 6.2 Corresponding French output
</SectionTitle>
      <Paragraph position="0"/>
    </Section>
  </Section>
  <Section position="7" start_page="0" end_page="0" type="metho">
    <SectionTitle>
COMMENTARY
</SectionTitle>
    <Paragraph position="0"> Overview Estimates for November 1989 from Statistics Canada's Labour Force Survey show that the seasonally adjusted level of employment rose by 32000 and that the level of unemployment increased by 30000. The unemployment rate increased by 0.2 to 7.6.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Employment
</SectionTitle>
      <Paragraph position="0"> For the week ended November 3, 1989, the seasonally adjusted level of employment was estimated at 12568000, up 32000 from October. The increase was concentrated among women aged 25 and over. The employment / population ratio remained virtually unchanged ( 62.1 ).</Paragraph>
      <Paragraph position="1"> Employment among women aged 25 and over rose by 44000 and their employment / population ratio increased by 0.5 to 52.3. Employment among men aged 25 and over fell by 12000 and their employment / population ratio decreased by 0.3 to 72.5.</Paragraph>
      <Paragraph position="2"> Part-time employment increased by 25000. The increase was evenly distributed between men and women.</Paragraph>
      <Paragraph position="3"> Full-time employment remained virtually unchanged. An increase among women was offset by a decrease among men. The level of employment fell by 10000 in agriculture, by 12000 in transportation, communication and other utilities and by 12000 in primary industries other than agriculture. The level of employment rose by 68000 in services and by 20000 in trade. The level of employment remained virtually unchanged in the other sectors. The level of employment rose by 11000 in Quebec, by 8000 in Alberta, by 6000 in British Columbia and by 0000 in Ontario. The level of employment remained virtually unchanged in the other sectors.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Unemployment and Participation Rate
</SectionTitle>
      <Paragraph position="0"> The seasonally adjusted level of unemployment was estimated at 1032000 for November 1989, up 30000 from October. The unemployment rate rose by 0.2 to 7.6 and the participation rate increased by 0.3 to 67.2.</Paragraph>
      <Paragraph position="1"> The increase in unemployment was concentrated among men aged 25 and over.</Paragraph>
      <Paragraph position="2"> Unemployment among men aged 25 and over increased by 24000 while unemployment remained virtually unchanged among women aged 25 and over.</Paragraph>
      <Paragraph position="3"> The unemployment rate among men aged 15 to 24 increased by 0.7 to 12.9.</Paragraph>
      <Paragraph position="4"> The participation rate among men aged 15 to 24 increased by 0.5 to 73.4 and the participation rate remained virtually unchanged among women aged 15 to 24.</Paragraph>
      <Paragraph position="5"> The seasonally adjusted level of unemployment remained virtually unchanged in most provinces. The level of unemployment increased only in Ontario ( + 24000 ).</Paragraph>
    </Section>
  </Section>
</Paper>