<?xml version="1.0" standalone="yes"?>
<Paper uid="C80-1051">
  <Title>PARSING FREE WORD ORDER LANGUAGES IN PROLOG Janusz Stanisław Bień</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
(2) BARDZO PRZYJEMNIE BYŁO PRACOWAĆ
(3) BYŁO BARDZO PRZYJEMNIE PRACOWAĆ
</SectionTitle>
    <Paragraph position="0"> To make the grammar accept these sentences we should, for example, add two</Paragraph>
    <Paragraph position="2"> One-third of the possible permutations of the words BYŁO, BARDZO, PRACOWAĆ, PRZYJEMNIE constitute admissible Polish sentences (although sometimes stylistically marked). The complete grammar should then have 21 rules, including dictionary rules. Such a solution is obviously clumsy and unsatisfactory.</Paragraph>
    <Paragraph position="3"> Our first proposal consists in allowing two kinds of terminal symbols: anchored terminals, retrieved in the current position of a given sentence (available in metamorphosis grammars [2] and prefixed by in our example) and floating terminals, retrieved anywhere in the unprocessed part of a sentence (we shall prefix them by ~ ).</Paragraph>
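    <Paragraph> The difference between the two kinds of terminals can be sketched outside Prolog. The Python fragment below is only an illustration under stated assumptions, not the mechanism built into the Prolog interpreter: the names take_anchored, take_floating, accepts and count_accepted are hypothetical, the current position is modelled as the first word not yet consumed, and a floating terminal takes the first matching word without backtracking.</Paragraph>
    <Paragraph><![CDATA[
```python
from itertools import permutations


def take_anchored(word, remaining):
    """Consume `word` only if it is the next unprocessed word."""
    if remaining and remaining[0] == word:
        return remaining[1:]
    return None


def take_floating(word, remaining):
    """Consume the first occurrence of `word` anywhere in the unprocessed part."""
    if word in remaining:
        i = remaining.index(word)
        return remaining[:i] + remaining[i + 1:]
    return None


def accepts(pattern, sentence):
    """Check a sentence against a list of (kind, word) terminals.

    kind is "anchored" or "floating" (labels chosen for this sketch).
    Every word of the sentence must be consumed.
    """
    remaining = list(sentence)
    for kind, word in pattern:
        take = take_anchored if kind == "anchored" else take_floating
        remaining = take(word, remaining)
        if remaining is None:
            return False
    return not remaining


WORDS = ["BYŁO", "BARDZO", "PRZYJEMNIE", "PRACOWAĆ"]
# Every terminal floating: word order is not constrained at all.
ALL_FLOATING = [("floating", w) for w in WORDS]
# Hypothetical mixed pattern: BYŁO anchored at the front, the rest floating.
MIXED = [("anchored", "BYŁO")] + [("floating", w) for w in WORDS[1:]]


def count_accepted(pattern):
    """Count the word orders of WORDS that the pattern accepts."""
    return sum(accepts(pattern, p) for p in permutations(WORDS))
```
]]></Paragraph>
    <Paragraph> In this sketch count_accepted(ALL_FLOATING) covers all 24 orders of the four words, while count_accepted(MIXED) covers only the 6 orders that begin with BYŁO; anchoring is what narrows the accepted set.</Paragraph>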
    <Paragraph position="4"> The easiest and most concise way of expressing a grammar for the sentences mentioned above consists in replacing every anchored terminal by a floating terminal. This is, however, not satisfactory, because such a grammar also accepts deviant (syntactically or stylistically) sequences, e.g.</Paragraph>
    <Paragraph position="5"> (4) BYŁO BARDZO PRACOWAĆ PRZYJEMNIE</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="348" type="metho">
    <SectionTitle>
(5) PRZYJEMNIE PRACOWAĆ BARDZO BYŁO
</SectionTitle>
    <Paragraph position="0"> By using both the anchored terminals and the floating terminals we can define the following grammar:</Paragraph>
    <Paragraph position="2"> The grammar accepts only half of the incorrect sequences, but (a usual trade-off) it rejects some correct Polish sentences.</Paragraph>
    <Paragraph position="3"> It seems that only a grammar with numerous specific rules can satisfy the strong requirement of accepting those and only those sequences which are considered correct.</Paragraph>
    <Paragraph position="4"> The formalism is, however, quite appropriate to describe e.g. the syntax of some noun phrases in Polish or syntactically unbound modifiers.</Paragraph>
    <Paragraph position="5"> Introducing the floating terminals into the Marseille-originated Prolog interpreter requires only minor alterations of the bootstrap. The facility has already been made standard in the Prolog version for ODRA 1305 (ICL 1900 compatible), which is distributed in Poland.</Paragraph>
    <Paragraph position="6"> To illustrate the deficiencies of the proposed mechanism in parsing certain kinds of free word-order constructions, we shall consider the following Polish sentences:  (6) TRZEBA BY CZEGOŚ WIĘCEJ "is needed" "something" "more" [present, [condi- [genitive]</Paragraph>
    <Paragraph position="8"> "Something more would be needed." The sentences (6) and (7) consist of the impersonal conditional verb-like phrase TRZEBA BY and the noun phrase CZEGOŚ WIĘCEJ. The words CZEGOŚ and WIĘCEJ may occupy any position, but the order of TRZEBA and BY is restricted. If BY precedes TRZEBA, then BY must not be the first word of a sentence; otherwise, BY must be adjacent to TRZEBA.</Paragraph>
    <Paragraph position="9"> Therefore, in order to make a concise grammar accepting all correct Polish sentences built of the words TRZEBA, BY, WIĘCEJ, CZEGOŚ, we must introduce more selective information concerning the order of words. We supply selected terminals and nonterminals with control items restricting their scopes of floating. The lack of such an item means that the restrictions are inherited from the left-hand nonterminal (in particular, no restrictions). For example, such restrictions could be: a terminal should be the last (the first), a terminal must follow (immediately follow) the recently retrieved terminal.</Paragraph>
    <Paragraph position="10"> Coming back to our example we should specify: either BY follows a verb immediately, or BY must not be the first and must precede a verb.</Paragraph>
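    <Paragraph> The restriction on BY stated above can be written down as a simple positional check. The following Python fragment is a hypothetical sketch, not the control-item mechanism itself: the function name by_position_ok and its parameters are invented for illustration, and the check assumes that the verb and the particle each occur exactly once in the sentence.</Paragraph>
    <Paragraph><![CDATA[
```python
from itertools import permutations


def by_position_ok(sentence, verb="TRZEBA", particle="BY"):
    """Check the placement of the particle relative to the verb.

    Accepted: the particle immediately follows the verb, or it precedes
    the verb without being the first word of the sentence.
    """
    v = sentence.index(verb)
    b = sentence.index(particle)
    if b == v + 1:
        return True
    return 0 < b < v


WORDS = ["TRZEBA", "BY", "CZEGOŚ", "WIĘCEJ"]
# Word orders of the four words that pass this particular check.
ACCEPTED = [p for p in permutations(WORDS) if by_position_ok(p)]
```
]]></Paragraph>
    <Paragraph> The check covers only the relative order of TRZEBA and BY and says nothing about the noun phrase. Under this reading 12 of the 24 orders of the four words pass, sentence (6) among them.</Paragraph>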
    <Paragraph position="11"> We can now write the grammar accepting the sentences (6) and (7). The grammar is as follows (variable parameters prefixed by asterisks, control items separated by</Paragraph>
    <Paragraph position="13"> In order to make the example clear we use only the categories relevant for the sentences under discussion. We omit, for instance, the number and gender of a noun phrase; the parameter *SYNTREQ expresses a single syntactic requirement (in general a verb can have more than one requirement; for details, see Szpakowicz [5]). The rule for NP is also very simplified. From the point of view of the description of Polish syntax the grammar presented above is, in fact, unsophisticated and fragmentary. It is sufficient, however, to illustrate some linguistic phenomena mentioned earlier.</Paragraph>
    <Paragraph position="14"> An experimental version of the ODRA-Prolog accepts metamorphosis grammar rules with control items (syntactically just Prolog terms). The inventory of the word order restrictions has yet to be established by research on word order in Polish.</Paragraph>
    <Paragraph position="15"> Thus, for the time being, the interpretation of the control items is implemented in an ad hoc manner.</Paragraph>
    <Paragraph position="16"> A formal description of the syntax of a natural language of the free word-order type, such as Polish and other Slavonic languages, requires, however, that some additional technical and linguistic problems be solved.</Paragraph>
    <Paragraph position="17"> We now want to present those problems which we find to be the most important. In some cases the occurrence of a word-form depends on particular properties of the word which immediately precedes it (usually it is the phonetic shape of the preceding word which influences the choice of the proper word-form). For example, the agglutinative present tense form of the verb BYĆ in second person, singular, masculine can be realized either by Ś or by EŚ. The forms Ś, EŚ are written jointly with the preceding syntactic item, but on the level of syntactic description they are clearly distinguishable. Let us illustrate this problem by the following sentences: (8) NAROBIŁ + EŚ ŁADNEGO "to cause" "cute" here: "big" [sg, masc] [2p, sg, [sg, masc, [sg, masc,  The very simple grammar presented below accepts these two sentences, but it also accepts some incorrect sequences because the rules do not express the dependency phenomena mentioned above.</Paragraph>
    <Paragraph position="19"> eLADNEGO @KLOPOTU, AFTER.</Paragraph>
    <Paragraph position="20"> (VPT: the abbreviated present tense form of the verb BYĆ; VOW and CON mean "used after a vowel" and "used after a consonant").</Paragraph>
    <Paragraph position="21"> So far we do not see a simple and satisfactory way of relating the parameter *X of %VPT to the other words and phrases. Provisionally, the agreement of the agglutinative forms of the verb BYĆ with the corresponding words may be resolved during dictionary lookup in the pre-parsing phase.</Paragraph>
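    <Paragraph> A pre-parsing check of this kind might look roughly as follows. This Python fragment is a hypothetical sketch, not the lookup procedure actually used: the names required_variant and agglutination_ok are invented for illustration, and the phonetic shape of the preceding word is approximated by its last letter, an assumption that ignores cases where spelling and pronunciation diverge.</Paragraph>
    <Paragraph><![CDATA[
```python
# Vowel letters of the Polish alphabet (orthographic approximation).
POLISH_VOWELS = set("AĄEĘIOÓUY")

# The two variants of the marker: Ś after a vowel, EŚ after a consonant.
SUFFIX_BY_VARIANT = {"VOW": "Ś", "CON": "EŚ"}


def required_variant(host):
    """Return "VOW" if the host word ends in a vowel letter, else "CON"."""
    return "VOW" if host[-1].upper() in POLISH_VOWELS else "CON"


def agglutination_ok(host, suffix):
    """Check that the suffix is the variant licensed by the host word."""
    return SUFFIX_BY_VARIANT[required_variant(host)] == suffix
```
]]></Paragraph>
    <Paragraph> With these definitions agglutination_ok("NAROBIŁ", "EŚ") and agglutination_ok("ŁADNEGO", "Ś") hold, while NAROBIŁ followed by Ś is rejected before the parser is ever invoked.</Paragraph>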
    <Paragraph position="22"> The other purely linguistic problems are related to the influence of free word-order on accommodating the verb phrase to the gender of a compound noun phrase.</Paragraph>
    <Paragraph position="23"> For example, the verb phrases in the aposition agree in gender with the last constituent of the noun phrase, as in:  It is only recently that this difficult problem has become the subject of partial research. A formal syntax description of written sentences in Polish with neutral word-order is available [5, 6]. It accepts practically all nonelliptical declarative and negative sentences, as well as the majority of interrogative sentences. Nevertheless, we can propose only a provisional solution of this problem.</Paragraph>
    <Paragraph position="24"> Another complicated question consists in the discontinuity of the phrases which constitute the sentence, as for example interpenetration of the verb phrase and the  Therefore the control information should allow the search for missing constituents of the phrases even far from the main component. On the other hand, it should protect against "borrowing" an inappropriate constituent from a quite different phrase, e.g. from the subordinate clause. It is now clearly visible that parsing free word-order languages is really different from the syntactic analysis of, say, English. Although the presented modifications of metamorphosis grammars do not solve all the problems discussed above, they provide a useful instrument for further experimental studies.</Paragraph>
    <Paragraph position="25"> Finally, we want to emphasize that we are aware of the semantic and pragmatic functions of free word-order, which are studied e.g. by Sgall [4] and Szwedek [7]. But we believe that, from the methodological point of view, it is justified to prescind from them in the syntax description.</Paragraph>
    <Paragraph position="26"> A reader interested in some notions of the impact of word-order on the semantico-pragmatic level may wish to consult Bień [1].</Paragraph>
  </Section>
</Paper>