<?xml version="1.0" standalone="yes"?>
<Paper uid="E83-1013">
  <Title>DEALING WITH CONJUNCTIONS IN A MACHINE TRANSLATION ENVIRONMENT</Title>
  <Section position="4" start_page="0" end_page="81" type="metho">
    <SectionTitle>
* Mailing address:
</SectionTitle>
    <Paragraph position="0"> parsing and have been widely applied in various NL processing systems. However, standard ATN grammars are rather weak in dealing with conjunctions. In (Woods 73), a special facility, SYSCONJ, for processing conjunctions was designed and implemented in the LUNAR speech question-answering system. It is capable of analysing reduced conjunctions impressively (e.g., &quot;John drove his car through and completely demolished a plate glass window&quot;), but it has two drawbacks: first, for general types of conjunction constructions it is too costly and too inefficient; secondly, the method itself is highly non-deterministic and easily results in combinatorial explosion. In (Blackwell 81), a WRD AND arc was proposed. The arc would take the interpreter from the final to the initial state of a computation, then analyse the second argument of a coordinated construction on a second pass through the ATN network. With this method she can deal with some rather complicated conjunction constructions, but in fact a WRD AND arc could have been added to nearly every state of the network, thus making the grammar extremely bulky. Furthermore, her system lacks the power to resolve the ambiguities contained in structures like (1).</Paragraph>
    <Paragraph position="1"> In the machine translation system designed by (Nagao et al 82), when dealing with conjunctions, only the nearest two items of the same parts of speech were processed, while the following types of coordinated conjunctions were not analysed correctly: (noun + prep + noun) + and + (noun + prep + noun); (adj + noun) + and + noun.</Paragraph>
    <Paragraph position="2"> (Boguraev in press) suggested that a demon should be created which would be woken up when &quot;and&quot; is encountered. The demon will suspend the normal processing, inspect the current context (the local registers which hold constituents recognised at this level) and recent history, and use the information thus gained to construct dynamically a new ATN arc which seeks to recognise a constituent categorially similar to the one just completed or being currently processed. Obviously the demon is based on expectations, but what follows the &quot;and&quot; is extremely uncertain, so it would be very difficult for the demon to reach a high efficiency. A &quot;data-driven&quot; alternative which may reduce the non-determinism is to decide the scope of the left conjunct retrospectively by first recognising the type of the right conjunct, rather than to predict the latter from the category of the constituent to the left of the coordinator which is &quot;just completed or being currently processed&quot; -- an obscure or even misleading specification.</Paragraph>
    <Paragraph position="3"> the ball.</Paragraph>
    <Paragraph position="4"> Ex13. The man kicked the child and threw the ball.</Paragraph>
    <Paragraph position="5"> Ex14. The man kicked and threw the ball.</Paragraph>
    <Paragraph position="6"> Ex15. The man kicked and the woman threw the ball.</Paragraph>
  </Section>
  <Section position="5" start_page="81" end_page="81" type="metho">
    <SectionTitle>
I CASSEX PACKAGE
</SectionTitle>
    <Paragraph position="0"> CASSEX (Chinese Academy of Social Sciences; University of Essex) is an ATN parser based on part of the programs developed by Boguraev (1979), which were designed for the automatic resolution of linguistic ambiguities. Conjunctions, one major source of linguistic ambiguity, were however not taken into consideration there because, as the author himself put it, &quot;they were felt to be too large a problem to be tackled along with all the others&quot; (Boguraev 79, 1.6).</Paragraph>
    <Paragraph position="1"> A new set of grammars has been written, and many modifications have been made to the grammar interpreter, so that conjunctions can be dealt with within the ATN framework.</Paragraph>
  </Section>
  <Section position="6" start_page="81" end_page="81" type="metho">
    <SectionTitle>
II PARSING MATERIALS
</SectionTitle>
    <Paragraph position="0"> The following are the example sentences correctly parsed by the package: Ex1. The man with the telescope and the umbrella kicked the ball.</Paragraph>
    <Paragraph position="1"> Ex2. The man with the telescope and the umbrella with a handle kicked the ball. Ex3. The man with the telescope and the woman kicked the ball.</Paragraph>
    <Paragraph position="2"> Ex4. The man with the telescope and the woman with the umbrella kicked the ball. Ex5. The man with the child and the woman kicked the ball.</Paragraph>
    <Paragraph position="3"> Ex6. The man with the child and the woman with the umbrella kicked the ball.</Paragraph>
    <Paragraph position="4"> Ex7. The man with the child and the woman is kicking the ball.</Paragraph>
    <Paragraph position="5"> Ex8. The man with the child and the woman are kicking the ball.</Paragraph>
    <Paragraph position="6"> Ex9. The man with the child and the umbrella fell.</Paragraph>
    <Paragraph position="7"> Ex10. The man kicked the ball and the child threw the ball.</Paragraph>
    <Paragraph position="8"> Ex11. The man kicked the ball and the child. Ex12. The man kicked the child and the woman</Paragraph>
  </Section>
  <Section position="7" start_page="81" end_page="81" type="metho">
    <SectionTitle>
III ELEMENTARY NP AND EXPANDED NP
</SectionTitle>
    <Paragraph position="0"> The term 'elementary NP' is used to indicate a noun phrase which can be embedded in other noun phrases but has no other noun phrases embedded in it. A noun phrase which contains other, embedded, NPs is called an 'expanded NP'. Thus, when analysing the sentence fragment &quot;the man with the telescope and the woman with the umbrella&quot;, we will have four elementary NPs (&quot;the man&quot;, &quot;the telescope&quot;, &quot;the woman&quot; and &quot;the umbrella&quot;) and two expanded NPs (&quot;the man with the telescope&quot; and &quot;the woman with the umbrella&quot;). We may well have a third kind of NP, the coordinated NP with a conjunction in it, but it is the result of, rather than the material for, conjunction processing, and therefore will not receive particular attention. In the text that follows we will use 'EL-NP' and 'EXP-NP' to represent the two types of noun phrases, respectively.</Paragraph>
    <Paragraph position="1"> LEFT-PART will stand for the whole fragment to the left of the coordinator, and RIGHT-PART for the fragment to the right of it. LEFT-WORD and RIGHT-WORD will indicate the words that immediately precede and follow, respectively, the coordinator.</Paragraph>
    <Paragraph position="2"> The conjunct to the right of the coordinator will be called RIGHT-PHRASE.</Paragraph>
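The terminology above can be illustrated with a minimal Python sketch. It is not part of CASSEX: the class `NP`, its field names and the helper `split_at_coordinator` are hypothetical names introduced here only to make the definitions of EL-NP, EXP-NP, LEFT-PART, RIGHT-PART, LEFT-WORD and RIGHT-WORD concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class NP:
    """A noun phrase; pp_modifiers holds (preposition, embedded NP) pairs."""
    head: str
    pp_modifiers: List[Tuple[str, "NP"]] = field(default_factory=list)

    @property
    def kind(self) -> str:
        # An NP with embedded NPs is an EXP-NP, otherwise an EL-NP.
        return "EXP-NP" if self.pp_modifiers else "EL-NP"


def split_at_coordinator(words: List[str], coordinator: str = "and"):
    """Return LEFT-PART, RIGHT-PART, LEFT-WORD and RIGHT-WORD
    around the first occurrence of the coordinator."""
    i = words.index(coordinator)
    left_part, right_part = words[:i], words[i + 1:]
    left_word: Optional[str] = left_part[-1] if left_part else None
    right_word: Optional[str] = right_part[0] if right_part else None
    return left_part, right_part, left_word, right_word
```

For the fragment "the man with the telescope and the woman with the umbrella", LEFT-WORD is "telescope" and RIGHT-WORD is "the".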
  </Section>
  <Section position="8" start_page="81" end_page="82" type="metho">
    <SectionTitle>
IV CSDC RULES
</SectionTitle>
    <Paragraph position="0"> Constraints for determining the grammaticality of constructions involving coordinating conjunctions have been suggested by linguists, among which are (Ross 67)'s CSC (Coordinate Structure Constraint), (Schachter 77)'s CCC (Coordinate Constituent Constraint), (Williams 78)'s Across-the-Board (ATB) Convention, and (Gazdar 81)'s nontransformational treatment of coordinate structures using the concept of 'derived categories'. These constraints are useful in the investigation of coordination phenomena, but in order to process coordinate structures automatically, some constraint defined from the procedural point of view is still required.</Paragraph>
    <Paragraph position="1"> The following ordered rules, named CSDC (Conjuncts Scope Determination Constraints), are suggested and embodied in the CASSEX package so as to meet the need for automatically deciding the scope of the conjuncts: 1. Syntactical constraint.</Paragraph>
    <Paragraph position="2"> The syntactical constraint has two parts: 1.1 The conjuncts should be of the same syntactical category;</Paragraph>
    <Section position="1" start_page="82" end_page="82" type="sub_section">
      <SectionTitle>
1.2 The coordinated constituent should be in
</SectionTitle>
      <Paragraph position="0"> conformity syntactically with the other constituents of the sentence, e.g. if the coordinated constituent is the subject, it should agree with the finite verb in terms of person and number.</Paragraph>
      <Paragraph position="1"> According to this constraint, Ex8 should be analysed as follows (the representation is a tree diagram with 'CLAUSE' as the root and centred around the verb, with various case nodes indicating the dependency relationships between the verb and [...] 2. NPs whose head noun semantic primitives are the same should be preferred when deciding the scope of the two conjuncts coordinated by &quot;and&quot;. However, if no such NPs can be found, NPs with different head noun semantic primitives are coordinated anyhow.</Paragraph>
      <Paragraph position="2"> Cf (Wilks 75).</Paragraph>
      <Paragraph position="3"> According to rule 2, Ex1 should be roughly represented as 'The man with (AND (telescope) (umbrella))'; Ex2, 'The man with (AND (telescope) (umbrella with a handle))'; Ex3, '(AND (man with telescope) (woman))' and Ex4, '(AND (man with telescope) (woman with umbrella))'. 3. Symmetry constraint.</Paragraph>
      <Paragraph position="4"> When rules 1 and 2 are not enough for deciding the scope of the conjuncts, as for Ex5 and Ex6, this rule of preferring conjuncts with symmetrical pre-modifiers and/or post-modifiers will be in effect: Ex5. ... with (AND (child) (woman)) ...</Paragraph>
      <Paragraph position="5"> Ex6. (AND (the man with ...) (the woman with ...)) ...</Paragraph>
      <Paragraph position="6"> 4. Closeness constraint.</Paragraph>
      <Paragraph position="7"> If none of the three rules above can decide the scope, the NP to the left of &quot;and&quot; which is closest to the coordinator should be coordinated with the NP immediately following the coordinator: Ex9. The man with (AND (child) (umbrella)) fell.</Paragraph>
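The ordered application of the four rules can be sketched in Python as follows. This is an illustrative reading of the rules, not the CASSEX code: the class `Candidate`, the function `choose_left_conjunct`, the primitive labels "MAN" and "THING", and the decision to apply the symmetry rule only among semantically matching candidates are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    head: str                # head noun of the NP
    primitive: str           # hypothetical semantic primitive of the head
    has_post_modifier: bool  # True if the NP carries a PP post-modifier
    is_subject: bool         # True for the whole subject NP, False inside a PP
    distance: int            # 1 = NP closest to the coordinator


def choose_left_conjunct(cands: List[Candidate], right: Candidate,
                         verb_is_plural: Optional[bool] = None) -> Candidate:
    pool = list(cands)
    # Rule 1.2: a coordinated subject is plural, so it must agree with the
    # finite verb whenever the verb form shows number.
    if verb_is_plural is not None:
        agreeing = [c for c in pool if c.is_subject == verb_is_plural]
        pool = agreeing or pool
    # Rule 2: prefer NPs whose head has the same semantic primitive.
    same = [c for c in pool if c.primitive == right.primitive]
    if same:
        # Rule 3: among those, prefer symmetrical post-modification.
        symmetric = [c for c in same
                     if c.has_post_modifier == right.has_post_modifier]
        pool = symmetric or same
    # Rule 4: otherwise (or to break ties) take the closest NP.
    return min(pool, key=lambda c: c.distance)
```

Under these assumptions the sketch selects "telescope" for Ex1, "man" for Ex3, "child" for Ex5, "man" for Ex6 and, by closeness, "child" for Ex9.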
    </Section>
  </Section>
  <Section position="9" start_page="82" end_page="84" type="metho">
    <SectionTitle>
V THE IMPLEMENTATION
</SectionTitle>
    <Paragraph position="0"> The seemingly straightforward way of dealing with conjunctions using ATN grammars would be to add extra WRD AND arcs to the existing states, as (Blackwell 81) proposed. The problem with this method is that, as (Boguraev in press) pointed out, &quot;generally speaking, one will need WRD AND arcs to take the ATN interpreter from just about every state in the network back to almost each preceding state on the same level, thus introducing large overheads in terms of additional arcs and complicated tests.&quot; Instead of adding extra WRD AND arcs to the existing states in a standard ATN grammar, I set up a whole set of states to describe coordination phenomena. The first few states in the set are as follows: [...] The CONJ/ states can be seen as a subgrammar which is separated from the main (conventional) ATN grammar, and is connected with the main grammar via the interpreter.</Paragraph>
    <Paragraph position="1"> The parser works in the following way.</Paragraph>
    <Paragraph position="1"> Before a conjunction is encountered, the parser works normally except that two extra stacks are set up: **NP-STACK and **PREP-STACK. Each NP, either EL-NP or EXP-NP, is pushed onto **NP-STACK, together with a label indicating whether the NP in question is a subject (SUBJ), an object (OBJ) or a prepositional object (NP-IN-NMODS).</Paragraph>
    <Paragraph position="2"> The interpreter is responsible for looking ahead one word to see whether the next word is a conjunction. This happens when the interpreter is processing &quot;word-consuming&quot; arcs, i.e. CAT, WRD, MEM and TST arcs. Hence there is no need to write WRD AND arcs explicitly into the grammar at all. By the time a conjunction is met and the interpreter is ready to enter the CONJ/ state, either a clause (Ex10-13) or a noun phrase in subject position (Ex1-9) will have been POPed, or a verb (Ex14-15) will have been found. For the first case, a flag LEFT-PART-IS-CLAUSE will be set to true, and the interpreter will try to parse RIGHT-PART as a clause. If it succeeds, the representation of a sentence consisting of two coordinated clauses will be output. If it fails, a flag RIGHT-PART-IS-NOT-CLAUSE is set, and the sentence will be reparsed. This time the left part will not be treated as a clause, and a coordinated NP object will be looked for instead. Ex10 and Ex11 are examples of coordinated clauses and a coordinated NP object, respectively. One case is treated specially: when LEFT-PART-IS-CLAUSE is true and RIGHT-WORD is a verb (Ex13), the subject will be copied from the left clause so that a right clause can be built.</Paragraph>
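The look-ahead and the handling of the first case can be sketched with a toy Python stand-in for the ATN. Everything here is an assumption made for illustration: the lexicon `VERBS`, the functions `conjunction_ahead`, `parse_toy_clause`, `parse_toy_np` and `coordinate_after_clause`, and the simplification that the "reparse" step reuses the already parsed left clause instead of running the whole sentence again.

```python
from typing import Dict, List, Optional, Tuple

VERBS = {"kicked", "threw"}  # hypothetical toy lexicon


def conjunction_ahead(words: List[str], pos: int) -> bool:
    """Look ahead one word past position pos for the coordinator."""
    return words[pos + 1:pos + 2] == ["and"]


def parse_toy_clause(words: List[str]) -> Optional[Tuple[str, str, str]]:
    """Toy stand-in for the ATN: accept 'the N V the N' only."""
    if (len(words) == 5 and words[0] == "the"
            and words[2] in VERBS and words[3] == "the"):
        return (words[1], words[2], words[4])
    return None


def parse_toy_np(words: List[str]) -> Optional[str]:
    """Toy NP: 'the N'."""
    if len(words) == 2 and words[0] == "the":
        return words[1]
    return None


def coordinate_after_clause(left_words: List[str], right_words: List[str]):
    flags: Dict[str, bool] = {"LEFT-PART-IS-CLAUSE": False,
                              "RIGHT-PART-IS-NOT-CLAUSE": False}
    left = parse_toy_clause(left_words)
    if left is None:
        return None, flags
    flags["LEFT-PART-IS-CLAUSE"] = True
    subj, verb, obj = left
    # Special case (Ex13): RIGHT-WORD is a verb, so copy the left subject.
    if right_words and right_words[0] in VERBS:
        right_words = ["the", subj] + right_words
    right = parse_toy_clause(right_words)
    if right is not None:
        return ("AND", left, right), flags  # two coordinated clauses
    # Fallback: treat RIGHT-PART as an NP coordinated with the left object.
    flags["RIGHT-PART-IS-NOT-CLAUSE"] = True
    np = parse_toy_np(right_words)
    if np is None:
        return None, flags
    return (subj, verb, ("AND", obj, np)), flags
```

With this toy grammar, Ex10 yields two coordinated clauses, Ex11 yields a single clause with a coordinated object, and Ex13 yields two clauses sharing the subject "man".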
    <Paragraph position="3"> For the second case, a coordinated NP subject will be looked for. E.g., for Ex4, by the time &quot;and&quot; is met, an NP &quot;the man with the telescope&quot; will have been POPed, and the state of the **NP-STACK will be like this:</Paragraph>
    <Paragraph position="5"> After the execution of the arc ((PUSH NP) (NP-START)), RIGHT-PHRASE has been found. If it has a PP modifier, a register NMODS-CONJ will be set to the value of the modifier. Now the NPs in the **NP-STACK will be POPed one by one to be compared with the right phrase semantically. The NP whose formula head (the head of the NOUN in it) is the same as that of the right conjunct will be taken as the proper left conjunct. If the NP matched is a subject or object, then a coordinated NP subject or object will be output; if it is an EL-NP in a PP modifier, then a function REBUILD-SUBJ or REBUILD-OBJ, depending on whether the modified EXP-NP is the subject or the object, will be called to rebuild the EXP-NP, whose PP modifier should consist of a preposition and two coordinated NPs.</Paragraph>
    <Paragraph position="6"> Here one problem arises: for Ex5, the first NP to be compared with the right phrase (&quot;the woman&quot;) would be &quot;the man with the child&quot;, whose head noun &quot;man&quot; would be matched to &quot;woman&quot;; but, according to our Symmetry Constraint, it is &quot;child&quot; that should be matched. In order to implement this rule, whenever NMODS-CONJ is empty (meaning that the right NP has no post-modifier), the **NP-STACK is reversed so that the first NP to be tried is the one nearest to the coordinator (in this case &quot;the child&quot;).</Paragraph>
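The matching against **NP-STACK, including the reversal when NMODS-CONJ is empty, can be sketched in Python as below. The stack contents shown in the usage note, the entry layout (text, head primitive, label), the primitive names and the final closeness fallback are assumptions of this sketch; the original stack display is not available here.

```python
from typing import List, Optional, Tuple

# One stack entry: (NP text, hypothetical primitive of its head noun, label).
Entry = Tuple[str, str, str]


def find_left_conjunct(np_stack: List[Entry], right_primitive: str,
                       nmods_conj: Optional[str]) -> Tuple[Entry, str]:
    """np_stack is in push order: embedded NPs first, enclosing EXP-NP last."""
    if nmods_conj is None:
        # Right NP has no post-modifier: reverse the stack, so the NP
        # nearest to the coordinator (pushed first here) is tried first.
        order = list(np_stack)
    else:
        # Normal popping: the most recently pushed NP is tried first.
        order = list(reversed(np_stack))
    for entry in order:
        if entry[1] == right_primitive:
            break
    else:
        # Assumed fallback (closeness): no semantic match, take nearest NP.
        entry = np_stack[0]
    action = "COORDINATE" if entry[2] in ("SUBJ", "OBJ") else "REBUILD"
    return entry, action
```

For Ex5, with the assumed stack [("the child", "MAN", "NP-IN-NMODS"), ("the man with the child", "MAN", "SUBJ")] and an empty NMODS-CONJ, "the child" is tried first and the subject is rebuilt; for Ex6, where NMODS-CONJ holds "with the umbrella", the whole subject NP is tried first and coordinated.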
    <Paragraph position="7"> For the third case (LEFT-WORD is a transitive verb and the object slot is empty, Exs 14 and 15), the right clause will be built first, with or without copying the subject from LEFT-PART depending on whether a subject can be found in RIGHT-PART. Then the left clause will be completed by copying the object from the right clause, and finally a clausal coordination representation will be returned.</Paragraph>
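The copying of subject and object in the third case can be sketched in a few lines of Python. The function name `coordinate_after_verb` and the triple representation (subject, verb, object) are assumptions for illustration only; the right clause is taken as already parsed.

```python
from typing import Optional, Tuple

Clause = Tuple[Optional[str], str, str]  # (subject or None, verb, object)


def coordinate_after_verb(left_subject: str, left_verb: str,
                          right_clause: Clause):
    r_subj, r_verb, r_obj = right_clause
    # Build the right clause first, copying the subject from LEFT-PART
    # only when RIGHT-PART has none of its own (Ex14).
    if r_subj is None:
        r_subj = left_subject
    # Complete the left clause by copying the object from the right clause.
    left = (left_subject, left_verb, r_obj)
    return ("AND", left, (r_subj, r_verb, r_obj))
```

For Ex14 both clauses share the subject "man"; for Ex15 the right clause keeps its own subject "woman", and in both cases the left clause receives the object "ball".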
    <Paragraph position="8"> In the course of parsing, whenever a finite verb is met, the NPs at the same level as the verb which have been PUSHed onto the **NP-STACK should be deleted from it, so that when constructing a possible coordinated NP object, the NPs in the subject position will not confuse the matching. Ex11 is thus correctly analysed.</Paragraph>
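The pruning step can be sketched as follows. The dictionary layout of a stack entry, the integer `level` used to model "the same level as the verb", and the function names `push_np` and `on_finite_verb` are hypothetical choices for this sketch.

```python
from typing import Dict, List, Union

StackEntry = Dict[str, Union[str, int]]


def push_np(np_stack: List[StackEntry], text: str, label: str,
            level: int) -> None:
    """Record an NP together with its label and nesting level."""
    np_stack.append({"text": text, "label": label, "level": level})


def on_finite_verb(np_stack: List[StackEntry], level: int) -> None:
    """Delete the NPs already pushed at the level of the finite verb."""
    np_stack[:] = [e for e in np_stack if e["level"] != level]
```

In Ex11, "the man" is removed when "kicked" is met, so only "the ball" is left on the stack to be compared with "the child" once the coordinator is reached.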
  </Section>
</Paper>