<?xml version="1.0" standalone="yes"?>
<Paper uid="P84-1053">
  <Title>DEALING WITH CONJUNCTIONS IN A MACHINE TRANSLATION ENVIRONMENT</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
DEALING WITH CONJUNCTIONS
IN A MACHINE TRANSLATION ENVIRONMENT
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> The paper presents an algorithm, written in PROLOG, for processing English sentences which contain either Gapping, Right Node Raising (RNR) or Reduced Conjunction (RC). The DCG (Definite Clause Grammar) formalism (Pereira &amp; Warren 80) is adopted. The algorithm is highly efficient and capable of processing a full range of coordinate constructions containing any number of coordinate conjunctions ('and', 'or', and 'but'). The algorithm is part of an English-Chinese machine translation system which is in the course of construction.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
0 INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> Theoretical linguists have made a considerable investigation into coordinate constructions (Ross 67a, Hankamer 73, Schachter 77, Sag 77, Gazdar 81 and Sobin 82, to name a few), giving descriptions of the phenomena from various perspectives. Some of the descriptions are stimulating or convincing. Computational linguists, on the other hand, have achieved less than their theoretical counterparts.</Paragraph>
    <Paragraph position="1"> (Woods 73)'s SYSCONJ, to my knowledge, is the first and the most often referenced facility designed specifically for coordinate construction processing. It can get the correct analysis for RC sentences like (1) John drove his car through and completely demolished a plate glass window, but only after trying and failing an indefinite number of times, due to its highly non-deterministic nature.</Paragraph>
    <Paragraph position="2"> (Church 79) claims "some impressive initial progress" processing conjunctions with his NL parser YAP. Using a Marcus-type attention shift mechanism, YAP can parse many conjunction constructions including some cases of Gapping. It doesn't offer a complete solution to conjunction processing though: the Gapping sentences YAP deals with are only those with two NP remnants in a Gapped conjunct.</Paragraph>
    <Paragraph position="3"> * Mailing address: Cognitive Studies Centre,  (McCord 80) proposes a "more straightforward and more controllable" way of parsing sentences like (1) within a Slot Grammar framework. He treats "drove his car through and completely demolished" as a conjoined VP, which doesn't seem quite valid.</Paragraph>
    <Paragraph position="4"> (Boguraev 83) suggests that when "and" is encountered, a new ATN arc be dynamically constructed which seeks to recognise a right hand constituent categorially similar to the left hand one just completed or being currently processed. The problem is that the left-hand conjunct may not be the current or most recent constituent but the constituent of which that former one is a part. (Berwick 83) parses successfully Gapped sentences like (2) Max gave Sally a nickel yesterday, and a dime today using an extended Marcus-type deterministic parser. It is not clear, though, how his parser would treat RC sentences like (1) where the first conjunct is not a complete clause.</Paragraph>
    <Paragraph position="5"> The present work attacks the coordinate construction problem along the lines of DCG. Its coverage is wider than that of the existing systems: Gapping, RNR and RC, as well as ordinary cases of coordinate sentences, are all taken into consideration. The work is a major development of (Huang 83)'s CASSEX package, which in turn was based on (Boguraev 79)'s work, a system for resolving linguistic ambiguities which combined ATN grammars (Woods 73) and Preference Semantics (Wilks 75).</Paragraph>
    <Paragraph position="6"> In the first section of the paper, problems raised for Natural Language Processing by Gapping, RNR and RC are investigated. Section 2 gives a grouping of sentences containing coordinate conjunctions. Finally, the algorithm is described in Section 3.</Paragraph>
    <Paragraph position="7"> I GAPPING, RIGHT NODE RAISING AND</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="244" type="metho">
    <SectionTitle>
REDUCED CONJUNCTION
1.1 Gapping
</SectionTitle>
    <Paragraph position="0"> Gapping is the case where the verb or the verb together with some other elements in the non-leftmost conjuncts is deleted from a sentence:  (3) Bob saw Bill and Sue [saw] Mary.  (4) Max wants to try to begin to write a novel, and Alex [wants to try to begin to write] a play.</Paragraph>
    <Paragraph position="1"> Linguists have described rules for generating Gapping, though none of them has made any effort to formulate a rule for detecting Gapping. (Ross 67b) was the first to suggest a rule for Gapping. The formalisation of the rule is due to</Paragraph>
    <Paragraph position="3"> where A and B are nonidentical major constituents*.</Paragraph>
    <Paragraph position="4"> (Sag 76) pointed out that there were cases where the left peripheral in the right conjunct might be a non-NP, as in (5) At our house, we play poker, and at Betsy's house, bridge.</Paragraph>
    <Paragraph position="5"> It should be noted that the two NPs in the Gapping rule must not be the same, otherwise (7) would be derived from (6):  (6) Bob saw Bill and Bob saw Mary.</Paragraph>
    <Paragraph position="6"> (7) Bob saw Bill and Bob Mary.</Paragraph>
    <Paragraph position="7"> whereas people actually say (8) Bob saw Bill and Mary.</Paragraph>
    <Paragraph position="8">  When processing (8), we treat it as a simplex containing a compound object ("Bill and Mary") functioning as a unit ("unit interpretation"), although as a rule we treat a sentence containing a conjunction as derived from a "complex", a sentence consisting of more than one clause, in this case "Bob saw Bill and Bob saw Mary" ("sentence coordination interpretation"). The reasons for analysing (8) as a simplex are, first, that for the purpose of translation the unit interpretation is adequate (the ambiguity, if any, will be "transferred" to the target language); and secondly, that it is easier to process.</Paragraph>
    <Paragraph position="9"> Another fact worth noticing is that in the above Gapping rule, B in the second conjunct could be anything, but not empty. E.g., the (a)s in the following sentences are Gapping examples, but the  (b)s are not: (9) (a) Max spoke fluently, and Albert haltingly.</Paragraph>
    <Paragraph position="10"> *(b) Max spoke fluently, and Albert.</Paragraph>
    <Paragraph position="11"> (10) (a) Max wrote a novel, and Alex a play.</Paragraph>
    <Paragraph position="12"> *(b) Max wrote a novel, and Alex.</Paragraph>
    <Paragraph position="13"> (11) (a) Bob saw Bill, and Sue Mary.</Paragraph>
    <Paragraph position="14"> (b) Bob saw Bill, and Sue.</Paragraph>
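    <Paragraph> The two constraints just discussed (the two subject NPs must differ, and the remnant B must not be empty) can be illustrated with a small Python sketch. It is an illustration only and is not part of the PROLOG program described later; the function name gapped_conjunct_ok and the use of plain strings for constituents are assumptions made for the example.

```python
from typing import Optional


def gapped_conjunct_ok(left_subject: str, right_subject: str,
                       remnant_b: Optional[str]) -> bool:
    """Check two constraints on a Gapped right conjunct of the form 'NP2 B'."""
    # Constraint 1: the two subject NPs must not be the same; otherwise
    # (7) "Bob saw Bill and Bob Mary" would be derived from (6).
    if left_subject == right_subject:
        return False
    # Constraint 2: the remnant B may be anything, but not empty.
    return bool(remnant_b)


print(gapped_conjunct_ok("Max", "Albert", "haltingly"))  # pattern of (9a)
print(gapped_conjunct_ok("Max", "Albert", None))         # pattern of (9b)
print(gapped_conjunct_ok("Bob", "Bob", "Mary"))          # pattern of (7)
```

    The check is deliberately partial: it says nothing about whether A and B are major constituents, which a real analyser would have to establish from the parse.</Paragraph>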
    <Paragraph position="15"> * According to the dependency grammar we adopt, we define a major constituent of a given sentence S as a constituent immediately dominated by the main verb of S.</Paragraph>
    <Paragraph position="16"> Before trying to draw a rule for detecting Gapping, we will observe the difference between (12) and (13) on the one hand, and (14) on the other: (12) Bob met Sue and Mary in London.</Paragraph>
    <Paragraph position="17"> (13) I knew the man with the telescope and the woman with the umbrella.</Paragraph>
    <Paragraph position="18"> (14) Bob met Sue in Paris and Mary in London.  As we stated above, (12) is not a case of Gapping; instead, we take "Sue and Mary" as a coordinate NP. Nor is (13) a case of Gapping. (14), however, cannot be treated as phrasal coordination because the PP in the left conjunct ("in Paris") is directly dominated by the main verb so that "Mary" is prevented from being conjoined to "Sue". Now, the Gapping Detecting Rule: The structure "NP1 V A X and NP2 B" where the left conjunct is a complete clause, A and B are major constituents, and X is either NIL or a constituent not dominated by A, is a case of</Paragraph>
    <Paragraph position="20"> 1.2 Right Node Raising (RNR) RNR is the case where the object in the non-rightmost conjunct is missing.</Paragraph>
    <Paragraph position="21"> (15) John struck and kicked the boy.</Paragraph>
    <Paragraph position="22"> (16) Bob looked at and Bill took the jar. RNR raises less serious problems than Gapping does. All we need to do is to parse the right conjunct first, then copy the object over to the left conjunct so that a representation for the left clause can be constructed. Then we combine the two to get a representation for the sentence. Sentences like the following may raise difficulty for parsing: (17) I ate and you drank everything they brought. (cf. Church 79) (17) can be analysed either as a complex of two full clauses, or as RNR, according to whether we treat "ate" as intransitive or transitive.</Paragraph>
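    <Paragraph> The RNR strategy (parse the right conjunct first, then copy its object over to the left conjunct) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the PROLOG implementation: clauses are assumed to be already segmented into dictionaries with the hypothetical keys "subject", "verb" and "object", and the flag left_verb_may_be_intransitive stands in for a lexicon lookup.

```python
from typing import Dict, List, Optional

Clause = Dict[str, Optional[str]]


def rnr_readings(left: Clause, right: Clause,
                 left_verb_may_be_intransitive: bool = False) -> List[List[Clause]]:
    """Return the candidate analyses of 'left and right' as lists of clauses."""
    if left["object"] is not None:
        # Nothing is missing on the left: two ordinary full clauses.
        return [[dict(left), dict(right)]]
    readings = []
    if right["object"] is not None:
        # RNR reading: copy the object of the right conjunct to the left one.
        readings.append([dict(left, object=right["object"]), dict(right)])
    if left_verb_may_be_intransitive:
        # Two-full-clauses reading: the left verb simply takes no object.
        readings.append([dict(left), dict(right)])
    return readings


# Sentence (16): only the RNR reading is available.
s16 = rnr_readings({"subject": "Bob", "verb": "looked at", "object": None},
                   {"subject": "Bill", "verb": "took", "object": "the jar"})
# Sentence (17): "ate" may be transitive or intransitive, so two readings.
s17 = rnr_readings({"subject": "I", "verb": "ate", "object": None},
                   {"subject": "you", "verb": "drank",
                    "object": "everything they brought"},
                   left_verb_may_be_intransitive=True)
print(len(s16), len(s17))
```

    Returning every reading, rather than choosing one, mirrors the observation that (17) is genuinely ambiguous; how a real system would rank the readings is left open here.</Paragraph>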
    <Section position="1" start_page="243" end_page="244" type="sub_section">
      <SectionTitle>
1.3 Reduced Conjunction
</SectionTitle>
      <Paragraph position="0"> Reduced Conjunction is the case where the conjoined surface strings are not well-formed constituents as in (18) John drove his car through and completely demolished a plate glass window.</Paragraph>
      <Paragraph position="1"> where the conjoined surface strings "drove his car through" and "completely demolished" are not well-formed constituents. The problem will not be as serious as it might have seemed, given our understanding of Gapping and RNR. After we process the left conjunct, we know that an object is still needed (assuming that "through" is a preposition). Then we parse the right conjunct, copying over the subject from the left; finally, we copy the object from the right conjunct to the left to complete the left clause.</Paragraph>
      <Paragraph position="2"> * 3-valency verbs are those which can appear in the structure "NP V NP NP", such as "give", "name", "select", "call", etc.</Paragraph>
      <Paragraph position="3"> ** Here "/=" means "is not".</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="244" end_page="244" type="metho">
    <SectionTitle>
II GROUPING OF SENTENCES CONTAINING CONJUNCTIONS
</SectionTitle>
    <Paragraph position="0"> We can sort sentences containing conjunctions into three major groups on the basis of the nature of the left-most conjunct: Group A contains sentences whose left-most conjuncts are recognized by the analyser as complete clauses; Group B, the left-most conjuncts are not complete clauses, but contain verbs; and Group C, all the other cases.</Paragraph>
    <Paragraph position="1"> The following is a detailed grouping with example sentences:  A1. (Gapping) Clause-internal ellipsis: (19) I played football and John tennis.</Paragraph>
    <Paragraph position="2"> (20) Bob met Sue in Paris and John Mary in London.</Paragraph>
    <Paragraph position="3"> (21) Max spoke fluently and Albert haltingly.</Paragraph>
    <Paragraph position="4"> A2. (Gapping) Left-peripheral ellipsis with two NP remnants: (22) Max gave a nickel to Sally and a dime to Harvey.</Paragraph>
    <Paragraph position="5"> (23) Max gave Sally a nickel and Harvey a dime.</Paragraph>
    <Paragraph position="6"> (24) Jack calls Joe Mike and Sam Harry. A3. (Gapping) Left-peripheral ellipsis with one NP remnant and some non-NP remnant(s): (25) Bob met Sue in Paris and Mary in London.</Paragraph>
    <Paragraph position="7"> (26) John played football yesterday and tennis today.</Paragraph>
    <Paragraph position="8"> A4. (Gapping) Right-peripheral ellipsis concomitant with clause-internal ellipsis: (27) Jack begged Elsie to get married and Wilfred Phoebe.</Paragraph>
    <Paragraph position="9"> (28) John persuaded Dr. Thomas to examine Mary, and Bill Dr. Jones.</Paragraph>
    <Paragraph position="10"> (29) Betsy talked to Bill on Sunday, and Alan to Sandy.</Paragraph>
    <Paragraph position="11"> A5. The right conjunct is a complete clause: (30) I played football and John watched the television.</Paragraph>
    <Paragraph position="12"> A6. The right conjunct is a verb phrase to be treated as a clause with the subject deleted: (31) The man kicked the child and threw the ball.</Paragraph>
    <Paragraph position="13"> A7. Sentences where the "unit interpretation" should be taken: (32) Bob met Sue and Mary in London. (33) I knew the girl bitten by the dog and the cat.</Paragraph>
    <Paragraph position="14"> B1. Right Node Raising: (34) The man kicked and threw the ball. (35) The man kicked and the woman threw the ball.</Paragraph>
    <Paragraph position="15"> B2. Reduced Conjunction: (36) John drove his car through and completely demolished a plate glass window.</Paragraph>
    <Paragraph position="16"> C. Unit interpretations: (37) The man with the telescope and the woman with the umbrella kicked the ball.</Paragraph>
    <Paragraph position="17"> (38) Slowly and stealthily, he crept towards his victim.</Paragraph>
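    <Paragraph> The top-level three-way grouping depends only on what the analyser has recognised in the left-most conjunct, which the following Python sketch makes explicit. It is an illustration only: the function name conjunction_group and its two boolean inputs are assumptions, and deciding those inputs is the analyser's job and is not modelled here. The subgroup labels are those of the list above.

```python
# Subgroup labels as listed in this section, keyed by major group.
SUBGROUPS = {
    "A": ["A1", "A2", "A3", "A4", "A5", "A6", "A7"],
    "B": ["B1", "B2"],
    "C": ["C"],
}


def conjunction_group(leftmost_is_complete_clause: bool,
                      leftmost_contains_verb: bool) -> str:
    """Return the major group (A, B or C) from properties of the left-most conjunct."""
    if leftmost_is_complete_clause:
        return "A"   # e.g. (19) "I played football and John tennis."
    if leftmost_contains_verb:
        return "B"   # e.g. (36), left conjunct "John drove his car through"
    return "C"       # e.g. (38) "Slowly and stealthily, ..."


for flags in [(True, True), (False, True), (False, False)]:
    group = conjunction_group(*flags)
    print(group, SUBGROUPS[group])
```

    Choosing among the subgroups within Group A additionally requires inspecting the right conjunct, which this sketch does not attempt.</Paragraph>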
  </Section>
  <Section position="6" start_page="244" end_page="245" type="metho">
    <SectionTitle>
III THE ALGORITHM
</SectionTitle>
    <Paragraph position="0"> The following algorithm, implemented in PROLOG Version 3.3 (shown here in much abridged form), produces correct syntactico-semantic representations for all the sentences given in Section 2. We show here some of the essential clauses* of the algorithm: "sentence", "rest_sentence1" and "sentence_conjunction". The top-most clause "sentence" parses sentences consisting of one or more conjuncts. In the body of "sentence", we have as sub-goals the disjunction of "noun_phrase" and "noun_phrase1", for getting the sentence subject; the disjunction of "[W], is_verb" and "verb1", plus "rest_verb", for treating the verb of the sentence; the disjunction of "rest_sentence" and "rest_sentence1" for handling the object, prepositional phrases, etc.; and finally "sentence_conjunction", for handling coordinate conjunctions. The Gapping, RNR and RC sentences in Section II contain deletions from either left or right conjuncts or both. Deleted subjects in right conjuncts are handled by "noun_phrase1" in our program; deleted verbs in right conjuncts by "verb1". The most difficult deletions to handle (for previous systems) are those from the left conjuncts, i.e. the deleted objects of RNR (Group B1) and the deleted preposition objects of RC (Group B2), because when the left conjuncts are being parsed, the deleted parts are not available. This is dealt with neatly in PROLOG DCG by using logical variables which stand for the deleted parts, are "holes" in the structures built, and get filled later by unification as the parsing proceeds.</Paragraph>
    <Paragraph position="1"> sentence(Stn, P_Subj, P_Subj_Head_Noun, P_Verb, P_VType, P_Contentverb, P_Tense,
          P_Obj, P_Obj_Head_Noun) --&gt;
      % P means "possible": P arguments only
      % have values if "sentence" is called by
      % "sentence_conjunction" to parse a second
      % (right) conjunct. Those values will be
      % carried over from the left conjunct.</Paragraph>
    <Paragraph position="2"> (noun_phrase(Subj, HeadNoun);
       noun_phrase1(P_Subj, P_Subj_Head_Noun, Subj, HeadNoun)),
      % "noun_phrase1" copies over the subject
      % from the left conjunct.</Paragraph>
    <Paragraph position="3"> adverbial_phrase(Adv),
      ([W],  % W is the next lexical item.</Paragraph>
    <Paragraph position="4"> is_verb(W, Verb, Tense);  % Is W a verb?
       verb1(P_Verb, Verb, P_Contentverb, Contentverb, P_Tense, Tense, P_VType, VType)),
      % "verb1" copies over the verb from the
      % left conjunct.</Paragraph>
    <Paragraph position="5"> * A "clause" in our DCG comprises a head (a single goal) and a body (a sequence of zero or more goals).</Paragraph>
    <Paragraph position="6">  sentence_conjunction(S, S, _, _, _, _, _, _, _, _) --&gt; []. % Boundary condition.</Paragraph>
    <Paragraph position="7">  For sentence (36) ("John drove his car through and completely demolished a plate glass window"), for instance, when parsing the left conjunct, "rest_sentence1" will be called eventually. The following verb structure will be built:
      v(drove1, agent(np(pronoun(John))),
        object(np(det(his), pre_mod([]), n(car1), post_mods([]))),
        post_verbmods(prep_mods(prep(through), prep_obj(Prep_Obj)))),
    where the logical variable Prep_Obj will be unified later with the argument standing for the object in the right conjunct (i.e., "a plate glass window"). When "sentence" is called via the sub-goal "sentence_conjunction" to process the right conjunct, the deleted subject "John" will be copied over via "noun_phrase1". Finally a structure is built which is a combination of two complete clauses. During the processing little effort is wasted. The backward deleted constituents ("a plate glass window" here) are recovered by using logical variables; the forward deleted ones ("John" here) by passing over values (via unification) from the conjunct already processed. Moreover, the "try-and-fail" procedure is carried out in a controlled and intelligent way. Thus a high efficiency lacking in many other systems is achieved (space prevents us from providing a detailed discussion of this issue here).</Paragraph>
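    <Paragraph> For readers unfamiliar with logical variables, the "hole" mechanism used for sentence (36) can be imitated in Python with a single-assignment placeholder. The sketch below is an analogy only and not the PROLOG program: the class Var, the function resolve and the simplified tuple structures are all assumptions made for illustration, and real unification is considerably more general than the one-way binding shown.

```python
class Var:
    """A single-assignment placeholder standing in for a Prolog logical variable."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.bound = False

    def bind(self, value):
        # A variable may be bound once; rebinding to a different value fails.
        if self.bound and self.value != value:
            raise ValueError(f"{self.name} is already bound")
        self.value = value
        self.bound = True


def resolve(term):
    """Replace bound variables in a nested tuple structure by their values."""
    if isinstance(term, Var):
        return resolve(term.value) if term.bound else term.name
    if isinstance(term, tuple):
        return tuple(resolve(part) for part in term)
    return term


# Left conjunct: the preposition object is not yet available, so leave a hole.
prep_obj = Var("Prep_Obj")
left = ("v", "drove", ("agent", "John"), ("object", "his car"),
        ("prep", "through", ("prep_obj", prep_obj)))

# Right conjunct: the forward-deleted subject is copied over from the left.
right = ("v", "demolished", ("agent", left[2][1]),
         ("object", "a plate glass window"))

# The backward-deleted object fills the hole left in the first structure.
prep_obj.bind(right[3][1])
print(resolve(left))
print(resolve(right))
```

    Because the structure for the left conjunct holds a reference to the placeholder rather than a copy, binding it once completes the left clause without rebuilding or re-parsing it, which is the property the text attributes to logical variables.</Paragraph>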
  </Section>
  <Section position="7" start_page="245" end_page="245" type="metho">
    <SectionTitle>
ACKNOWLEDGEMENTS
</SectionTitle>
    <Paragraph position="0"> I would like to thank Y. Wilks, D. Arnold, D. Fass and C. Grover for their comments and instructive discussions. Any errors are mine.</Paragraph>
  </Section>
</Paper>