<?xml version="1.0" standalone="yes"?> <Paper uid="E87-1050"> <Title>DEALING WITH THE NOTION &quot;OBLIGATORY&quot; IN SYNTACTIC ANALYSIS</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> DEALING WITH THE NOTION &quot;OBLIGATORY&quot; IN SYNTACTIC ANALYSIS </SectionTitle> <Paragraph position="0"/> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> ABSTRACT </SectionTitle> <Paragraph position="0"> This paper discusses the use of the notion &quot;obligatory complement&quot; in syntactic analysis. Many theories which serve as bases for syntactic analysis procedures provide devices for expressing the difference between obligatory and optional complements on the rule level, i.e. via the lexicon the wordforms are connected with the rules in which the fitting properties are expressed. I'll show that such an approach leads to some problems if we want to handle real texts in syntactic analysis.</Paragraph> <Paragraph position="1"> In the first part I'll outline the theoretical framework we work with. Then I'll discuss for which purposes the notion obligatory has some advantages, and in the last part I'll briefly show how we intend to use this notion - in lexical entries (with respect to morphological analysis) and - in the syntactic analysis process.</Paragraph> </Section> <Section position="3" start_page="0" end_page="315" type="metho"> <SectionTitle> SOME THEORETICAL PREREQUISITES </SectionTitle> <Paragraph position="0"> The basis of our work is a special version of a dependency grammar (Kunze 1975). In this theory the syntactic structure of a sentence is represented as a tree, where the nodes correspond to the wordforms of the sentence and the edges express the dependencies between the wordforms.
The edges are marked by subordination relations (SR's) which describe the relation between the subtree &quot;under&quot; the edge and the remaining tree context.</Paragraph> <Paragraph position="1"> Besides the syntactic dependencies, other connections between the wordforms of the sentence remain, which express certain congruences and restrictions. Here we have congruences - so-called paradigmatic connections - such as (the listed categories concern the German variant): from a noun to an attribute (gender, number, case), from a preposition to the noun (case), from the subject to the finite verb (number, person); and restrictions - so-called selective connections - such as: from the verb to the (deep) subject, from the verb to the direct object, etc.</Paragraph> <Paragraph position="2"> The selective connections also apply to all transformational variants of the concerned phenomenon (let us take the (John reads a book.) (2) Ich sehe John ein Buch lesen.</Paragraph> <Paragraph position="3"> (I see John reading a book.) (3) Das Buch wird von John gelesen.</Paragraph> <Paragraph position="4"> (The book is read by John.) (4) Das von John gelesene Buch ...</Paragraph> <Paragraph position="5"> (The book read by John ...) (5) Das Lesen des Buches durch John ...</Paragraph> <Paragraph position="6"> (The reading of the book by John ...) (6) Der ein Buch lesende John ...</Paragraph> <Paragraph position="7"> (John reading a book ...) (7) John, der ein Buch liest ...</Paragraph> <Paragraph position="8"> (John, who reads a book ...) It is easy to see that the tree property would be destroyed if these connections were included as edges in the tree. To save the tree property Kunze introduced the mechanism of paths of action for the paradigmatic and selective connections. These paths run along the edges, i.e. they can also be expressed by the subordination relations.
This is one essential reason for differentiating the SR's very strongly.</Paragraph> <Paragraph position="9"> For instance, it is necessary to differentiate between - the &quot;normal&quot; direct object and the direct object with subject role: John reads a book.</Paragraph> <Paragraph position="10"> I see John reading a book.</Paragraph> <Paragraph position="11"> - an adjective as attribute and a participle as attribute: The ~E book ...</Paragraph> <Paragraph position="12"> The reading John ...</Paragraph> <Paragraph position="13"> - the subject in an active clause and the subject in a passive clause: John reads a book.</Paragraph> <Paragraph position="14"> A book is read by John.</Paragraph> <Paragraph position="15"> Besides the subordination relations, another central concept in Kunze's theory is that of the bundles (see also Reimann, 1982). A bundle is a substructure of a dependency tree which contains exactly one top node and all nodes directly subordinated to it, together with the edges between them (and their markings - the subordination relations). The original idea was to use the bundles as syntactic rules. For this purpose, the bundle is regarded as a system of conditions which have to be fulfilled by a set of nodes in order to construct the structure which the bundle prescribes.</Paragraph> <Paragraph position="16"> But there is another way to use bundles: They can serve as descriptions of the dominance behaviour of wordforms (i.e. the surface form of valency). In this way, the approach is similar to other theories: In the lexical entries of the wordforms there is a pointer to the rules which can be applied with the concerned wordform as top node. Our approach goes further in the direction of dominance behaviour descriptions. Especially for nouns and verbs, the dominance behaviour is very complex, i.e.
many different things can be subordinated to nouns and verbs: many of them are optional, some of them stand in certain relations to others, etc. Thus we concentrate all these bundles by defining another form of a bundle, which consists, in general, of many simple bundles.</Paragraph> <Paragraph position="17"> As we can see, only the subject is obligatory (in the active sentence), but the indirect object as well as the directional circumstance are used only if the direct object belongs to the sentence. These facts can be expressed by a logical formula like this: (SUBJ^OB ∨ ((IOBJ ∨ DIR) → DOBJ)) That means we represent the dominance behaviour of wordforms by logical formulas (in subordination relations) - we call these formulas bundles. It is quite clear that it is not so easy to use these bundles as rules for syntactic analysis, but they seem to be quite appropriate for describing the dominance behaviour of wordforms. I won't deal here with free modifications (real adjuncts and other peripheral elements), although according to the theory they also belong to the bundles. To handle them a special mechanism is included in the analysis procedure.</Paragraph> <Paragraph position="18"> THE PHENOMENON OF OBLIGATORY COMPLEMENTS In valency theory obligatory complements are normally regarded as special parts of the concept of the verb. On this level the notion &quot;obligatory&quot; has often been investigated. It is connected with the classification &quot;complement-adjunct&quot;, but there are also optional complements and obligatory adjuncts.</Paragraph> <Paragraph position="19"> For automatic processing this classification is not sufficient: H. Somers (1986) showed that a more flexible classification leads to better results, especially with respect to machine translation.
Somers also referred to the problem that obligatory complements can be &quot;hidden&quot; in the text: - Ellipses and other phenomena lead to omissions which are hard to handle.</Paragraph> <Paragraph position="20"> - In modified syntactic constructions (passive, nominalisations) complements can be omitted regularly.</Paragraph> <Paragraph position="21"> - In other constructions the complements stand in quite different relations to the form derived from a verb (the phenomenon of control, attributive participles etc.). In these cases the complements have to be found by special tools.</Paragraph> <Paragraph position="22"> Concerning the examples in the first part, regular omissions are possible in (3), (4), (5) and (6), while the sentences (2), (6) and (7) belong to the third category. They all have to be handled in syntactic analysis, but the question arises: What is the advantage of using the notion obligatory under the named circumstances? Obligatory in syntactic analysis Normally we suppose that the sentences to be analysed are correct. But if we construct a set of bundles (with obligatory edges), we are defining a set of sentences which will never be complete. If there are no obligatory edges, the described set covers the set of correct sentences better. Only very simple requirements have to be observed, such as the necessity of the surface subject. In this way a parsing system can work quite well. In the Saarbrücken MT systems a dictionary is used where all complements are entered in a cumulative way without the classification obligatory-optional or other relations (Luckhardt, 1985).</Paragraph> <Paragraph position="23"> But I think that the possible combinations of complements of verbs (and of derived forms), and thus also the notion obligatory, can be very useful for resolving ambiguities and for distinguishing different meanings of a verb.
Incidentally, such mechanisms are also used in Saarbrücken, but only in the so-called semantic analysis following the syntactic analysis.</Paragraph> <Paragraph position="24"> To show the advantages I'll take the following verbs as examples: a) rechnen (1) Er rechnet (die Aufgaben).</Paragraph> <Paragraph position="25"> (He calculates (the exercises).) (2) Er rechnet ihn zu seinen Freunden. (He reckons him among his friends.) (3) Er rechnet mit ihm.</Paragraph> <Paragraph position="26"> (He takes him into account.) In the first case the direct object is optional, but the prepositional objects in both other cases as well as the direct object in the second case are obligatory. If they were not, the first sentence would have all three meanings! Only the subject is not important for the distinction of the meanings, and it is not as obligatory as the other complements, because it can be omitted by passive transformation.</Paragraph> <Paragraph position="27"> b) bestehen (1) Es besteht Hoffnung.</Paragraph> <Paragraph position="28"> (There is hope.) (2) Er besteht die Prüfung.</Paragraph> <Paragraph position="29"> (He passes the examination.) (3) Die Fabrik besteht seit 3 Jahren. (The factory has existed for ...) (4) Er besteht auf seiner Meinung.</Paragraph> <Paragraph position="30"> (He insists on his opinion.) (5) Die Wand besteht aus Steinen.</Paragraph> <Paragraph position="31"> (The wall consists of stones.)</Paragraph> <Paragraph position="32"> (6) Das Wesen der Sache besteht darin, ... (The nature ... consists in ...) Here in (1) and (3) the subject is obligatory, but in (2) only the direct object. In the other cases the prepositional objects are obligatory, thus the distinction of the different meanings is possible without ambiguities.</Paragraph> <Paragraph position="33"> c) erwarten (1) Er erwartet Gäste.</Paragraph> <Paragraph position="34"> (He is waiting for guests.)
(2) Die Kinder erwarten (von den Eltern) ein Geschenk.</Paragraph> <Paragraph position="35"> (The children expect a gift (from their parents).) Because a passive sentence can be formed from (1), the subject is not obligatory in this case. But in (2) it is obligatory. Unfortunately the distinctive complement with von is not obligatory, thus distinguishing these two meanings also requires taking into consideration the selective properties of the direct object.</Paragraph> <Paragraph position="36"> The conclusion of this part is that the classification into obligatory and optional complements is only important in a final stage of syntactic analysis, to support the distinction of different meanings of wordforms (especially verbal forms or forms derived from verbs). But this distinction is very useful mainly with respect to machine translation, as we can see when translating the different meanings of the examples.</Paragraph> </Section> <Section position="4" start_page="315" end_page="317" type="metho"> <SectionTitle> PRACTICAL CONCLUSIONS </SectionTitle> <Paragraph position="0"> As we have seen in the first part, the bundles (i.e. the logical formulas) have their place in the lexicon as descriptions of the dominance behaviour of the wordforms. There is no problem if a wordform lexicon (with full forms) is used. But in an extensive syntactic analysis system a morphological analysis has to be included.</Paragraph> <Paragraph position="1"> Obligatory in the lexicon For a morphological analysis (not only an inflexion analysis) we need a lexicon of bases and a lexicon of affixes. In the lexicon of bases there must be a general description of the grammatical properties, and with the affixes rules have to be stated for calculating the properties of the derived wordforms.</Paragraph> <Paragraph position="2"> What does this mean for the description of the dominance behaviour?
Calculating with the logical formulas does not seem very convenient.</Paragraph> <Paragraph position="3"> Therefore the dominance component is divided into two parts: The first one is a cumulative list of the subordination relations and the second one contains the bundles.</Paragraph> <Paragraph position="4"> For the first part a splitting of the subordination relations is advantageous. The subordination relations are very complex things consisting of different kinds of information: - usual ideas about syntactic parts of sentences like subject, attribute ..., - paths of action for selective connections, - paths of action for paradigmatic connections, - wordclass conditions etc.</Paragraph> <Paragraph position="5"> The first two express the well-known syntactic functions (SF's), the others their appearances - so-called morpho-syntactic relations (MSR's) - which are only necessary to recognize the syntactic functions. Once a syntactic function is recognized, the morpho-syntactic relation that was used can be forgotten.</Paragraph> <Paragraph position="6"> Thus this part of the dominance component is a list of syntactic functions which have pointers to the MSR's expressing this syntactic function in the case of the concerned wordform (SF-MSR-list). The rules for the derivations concern only this list, i.e. only the MSR's under the SF's can be changed. For instance: rechnen (SF's: MSR's) SUBJ: N-1 (noun in nominative case); DOBJ: N-4 (noun in accusative case) to the following: SUBJ: N-2 (noun in genitive case) or P-PRACT (prepositional actor); DOBJ: N-2 (noun in genitive case) or P-VON (preposition von); ZU: see above; MIT: see above. Thus the bundles are not affected by the rules connected with the derivations. But the problem remains how to handle the property &quot;obligatory&quot; here. We have two possibilities: - Only those complements which are obligatory in all derived forms are marked by the sign OB. In this case, the subject is not obligatory for many verbs, especially for all transitive verbs.
Choosing this possibility, the &quot;surface obligatoriness&quot; (e.g. of a surface subject) has to be generated during the process (depending on derivation).</Paragraph> <Paragraph position="7"> - All semantically obligatory complements are marked by OB. Then changes have to be performed during the analysis process, too.</Paragraph> <Paragraph position="8"> We intend to follow the first way. At this point the question arises how to deal with the omissions of the third category, where the complements are not really omitted, but have to be looked for at other places within the sentence. That means that these complements are not connected with the verbal node by a direct edge (downward), but - in our theory - they are connected by a path of action for the corresponding selective connection. In this way it is possible to let these complements be obligatory and to note in the SF-MSR-list that instead of an MSR a path of action leads to the concerned complement.</Paragraph> <Paragraph position="9"> Thus the SF-MSR-list for the infinitive rechnen will have the following form: SUBJ: via SUBJ-path of action; DOBJ: N-4; ZU: see above; MIT: see above. As a result of the discussion we have the following formulas for the different meanings of rechnen: (1) (SUBJ ∨ DOBJ) (2) (SUBJ ∨ (ZU ∧ DOBJ)^OB) (3) (SUBJ ∨ MIT^OB) Obligatory in the analysis process Finally I'll give a short survey of our syntactic analysis system to show that the bundles - and with them also the notion obligatory - are used only in the very final stage.</Paragraph> <Paragraph position="10"> The first step of the procedure is a sequential preanalysis (performed by an ATN) which has the task of finding the segments of the sentence and the verbal groups of each clause.</Paragraph> <Paragraph position="11"> The second step is a local analysis where only two nodes and the relations between them are regarded.
Here the SF-MSR-lists are used to recognize the possible syntactic functions.</Paragraph> <Paragraph position="12"> In the third step, wrong readings from the first two steps are filtered out using the bundles, i.e. the logical formulas, together with the selective conditions (transported by the paths of action). A side effect of this so-called global bundle analysis is the selection of the actual verbal meaning. Only here is the notion &quot;obligatory&quot; used. To conclude this paper I'll emphasize once more the problems which have to be taken into consideration if the notion &quot;obligatory&quot; is used for syntactic analysis: - The advantage of using such a concept is the possibility to resolve ambiguities and to select actual meanings of wordforms (especially verbal forms). This is the reason why it should be used only in a final stage of analysis.</Paragraph> <Paragraph position="13"> - The different possibilities to omit obligatory complements have to be treated in an adequate way. Here special procedures during morphological analysis and the mechanism of selective connections (paths of action) can help to handle the regular cases. For other omissions (in ellipses etc.) default solutions are proposed.</Paragraph> </Section> </Paper>