<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-2127">
  <Title>SENSITIVE PARSING: ERROR ANALYSIS AND EXPLANATION IN AN INTELLIGENT LANGUAGE TUTORING SYSTEM</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1. Agreement errors
</SectionTitle>
    <Paragraph position="0"> In German, articles, adjectives and nouns within a noun group have to agree in gender, number and case, and verbs have to agree in person and number with the subject noun phrase of the sentence. The object complements of verbs take certain cases and so do prepositions. Agreement errors are errors on the syntactic level, but they do not concern the order of the words in a sentence, hence they can be corrected by changing the case, the number, the person or the gender of the noun phrases or parts of them.</Paragraph>
  </Section>
  <Section position="5" start_page="0" end_page="608" type="metho">
    <SectionTitle>
2. Syntactic errors
</SectionTitle>
    <Paragraph position="0"> We consider two types of syntactic errors. The first involves words which have been omitted or added, e.g. when an article or a preposition is missing or superfluous, and the second involves the permutation of words or syntactic groups.</Paragraph>
    <Paragraph position="1"> The latter error is very frequent in German because the possible positions of the verb in a sentence differ from those in many other languages: for example, the verb can go to the very end of a sentence or to the very beginning. Some syntactic errors have to be partly analysed at the morphological level because they involve both word construction and word order: for example, German has verbs with a prefix which in certain cases has to be detached and placed at the very end of the sentence.</Paragraph>
    <Paragraph position="2"> (1) Er kommt zurück (He comes back).</Paragraph>
    <Paragraph position="3"> This is the correct formulation of the sentence. Students of German typically tend not to detach the prefix zurück and construct the ill-formed sentence (2) *Er zurückkommt (He backcomes).</Paragraph>
    <Paragraph position="4"> The word zurückkommt does not exist in German (although the infinitive is zurückkommen) and this error has to be recognized at the word level, because zurückkommt is an ill-formed word, although the underlying error is a syntactic error.</Paragraph>
  </Section>
  <Section position="6" start_page="608" end_page="608" type="metho">
    <SectionTitle>
3. Semantic Errors
</SectionTitle>
    <Paragraph position="0"> We have so far been working on one type of semantic error, namely errors in the semantic verb cases, which arise from a misunderstanding of the meaning of verbs and nouns and of their semantic relationships, thus posing lexical problems. For example, when a student forms the sentence (3) *Das Heft arbeitet (The notebook works).</Paragraph>
    <Paragraph position="1"> he has not understood that arbeiten requires an animate subject and that Heft is not animate, i.e. he has a lexical problem.</Paragraph>
    <Paragraph position="2"> In what follows, we will first introduce, in an informal way, the theoretical background for error definition and then discuss the treatment of syntactic and semantic errors within the language teaching system.</Paragraph>
    <Paragraph position="3"> Our system is implemented in PROLOG II and has been tested with various dictionaries and by different users (adult language students, pupils).</Paragraph>
  </Section>
  <Section position="7" start_page="608" end_page="608" type="metho">
    <SectionTitle>
2. THEORETICAL BACKGROUND
</SectionTitle>
    <Paragraph position="0"> In this chapter, the concepts of feature grammar and unification are introduced informally. We provide a slightly modified definition of unification, in which the result of the unification consists of the unified elements and the set of the pairs of elements for which the unification did not succeed. This set is necessary for interpreting and explaining errors.</Paragraph>
    <Paragraph position="1"> Complex features have been used by most schools of linguistics [KAPLAN R.M., BRESNAN J. 1983; KARTUNEN 1984]. The whole process of syntactic analysis is governed by features and their values. Not only the lexical elements but also the syntactic categories are classified by features.</Paragraph>
    <Paragraph position="2"> For example, the category sentence is subclassified by the feature satzstellung, whose values indicate whether the sentence has normal word order or has the verb at the very end or at the very beginning (this corresponds in German to different types of embedded phrases), and by the feature embedded, with values + and -, indicating whether a sentence or a noun phrase contains embedded phrases. We have constructed a grammar using 25 syntactic and 40 semantic features. To our knowledge, feature grammars have until now never been applied to the problem of analysing ill-formed sentences, nor used within the context of language teaching.</Paragraph>
    <Paragraph position="3"> A feature grammar is an extension of a Chomsky grammar. The alphabet consists of structured symbols, which are sets of pairs (feature, value). In the rest of this paper, a structured symbol a will be written as a tuple a = [f1(v1), ..., fn(vn)], where the fi denote feature names and the vi their values. No feature can occur twice within a structured symbol. The set of features occurring in a, {f1, ..., fn}, is called the domain of a and is written d(a). The value of f ∈ d(a) in a is denoted by a(f). Hence a(f) = v iff f(v) ∈ a. In many cases it is useful to introduce a more concise notation for sets of structured symbols. We need to denote such sets because many words are ambiguous and have to be described by a set of structured symbols rather than by a single one. Most current theories also allow features that have complex values. By using disjunction and negation of values, many structured symbols can be written much more economically. For example, the German noun Kind can have three cases (nominative, dative and accusative), but in this formalism it is denoted by just one symbol [gender(neutr), case(neg(genitive)), number(singular)]. Let us call sets of structured symbols complex symbols.</Paragraph>
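    <Paragraph> The notation for structured symbols can be illustrated with a small sketch. It is written in Python rather than in the PROLOG II of the system, and every name in it (CASES, neg, domain, kind) is a hypothetical choice made for illustration only; the assumption is that a structured symbol can be modelled as a mapping from feature names to sets of admissible values, so that no feature can occur twice.

```python
# Illustrative sketch only: a structured symbol as a dict feature -> set of values.
# All names here are hypothetical and do not come from the original system.
CASES = frozenset({"nominative", "genitive", "dative", "accusative"})


def neg(value, universe):
    """Expand neg(value) into the set of all other values of the feature."""
    return frozenset(universe) - {value}


def domain(symbol):
    """d(a): the set of feature names occurring in the structured symbol a."""
    return set(symbol)


# The noun Kind: [gender(neutr), case(neg(genitive)), number(singular)]
kind = {
    "gender": frozenset({"neutr"}),
    "case": neg("genitive", CASES),
    "number": frozenset({"singular"}),
}

print(sorted(domain(kind)))  # the feature names of the symbol
print(sorted(kind["case"]))  # a(f) for f = case
```

    Using a mapping keyed by feature name is one possible way to enforce the uniqueness of features; the negated value is expanded eagerly here, whereas a real implementation might keep it symbolic.</Paragraph>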
    <Paragraph position="4"> Structured symbols are used in a formal grammar for natural language sentences in the following way: there is one feature, cat (category), that plays a special part and whose values are the categories usually needed in a natural language grammar: sent (sentence), np (noun phrase), vp (verb phrase), etc. Further features characterize properties according to which categories are subclassified; e.g. vcat (verb category) is a feature whose values are intrans (intransitive), trans (transitive) and prep (prepositional complement); the feature place subclassifies verbs and its possible values are the numbers 1, 2, 3, standing for the number of complements of a verb. tense is a feature with the values pluperfect, imperfect, perf, pres and fut, specifying the tense of a verb.</Paragraph>
    <Paragraph position="5"> These three features all subclassify verbs. Other, more frequently cited features are case, number and gender, which characterize articles, nouns and adjectives, but also noun phrases and noun groups. Semantic properties of categories are equally characterized by features, and formally these "semantic" features are not distinguished from "syntactic" features; e.g. animate is a semantic feature whose values are + and - and which belongs to nouns; durative, static and action are further semantic features. All possible types of errors have been defined by means of features.</Paragraph>
    <Paragraph position="6"> The definition of unification is slightly different from the usual definition (see [KARTUNEN 1984]), because in our application we need to find out not only whether two symbols can be unified but also, if they cannot, for what reasons. Hence we need all the pairs of elements which cannot be unified.</Paragraph>
    <Paragraph position="7"> Let a = [f1(v1), ..., fn(vn)] and b = [g1(w1), ..., gm(wm)], where the vi and wj are sets of values. Then we define a predicate unify(a, b, r, e), where r is the result of the unification and e is the set consisting of all the pairs of value sets for which a and b could not be unified, together with all the symbols contained in the symmetric difference between a and b.</Paragraph>
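    <Paragraph> The following Python sketch shows one way a predicate of this shape could behave. It is not the authors' PROLOG II definition: it assumes that features shared by both symbols are unified by intersecting their value sets, that an empty intersection is recorded as a clash, that features occurring in only one symbol are carried over and also recorded, and that the result r is absent as soon as one clash occurs. The function name and the tags "clash" and "unshared" are hypothetical.

```python
# Illustrative sketch only, under the assumptions stated in the lead-in.
def unify(a, b):
    """Return (r, e) for two structured symbols (dicts feature -> set of values).

    r is the unified symbol, or None if some shared feature has no common value.
    e records every clash and every feature found in only one of the symbols.
    """
    r, e = {}, []
    for f in sorted(set(a).union(b)):
        if f in a and f in b:
            common = a[f].intersection(b[f])
            if common:
                r[f] = common
            else:
                e.append(("clash", f, a[f], b[f]))
        else:
            # The feature lies in the symmetric difference of the two domains.
            value = a[f] if f in a else b[f]
            r[f] = value
            e.append(("unshared", f, value))
    if any(entry[0] == "clash" for entry in e):
        r = None
    return r, e


der_masc = {"gender": {"masc"}, "case": {"nominative"}, "number": {"singular"}}
kind = {
    "gender": {"neutr"},
    "case": {"nominative", "dative", "accusative"},
    "number": {"singular"},
}
result, errors = unify(der_masc, kind)
print(result, errors)
```

    Because e is built in the same pass as r, the reasons for a failure are available without a second traversal of the feature lists.</Paragraph>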
    <Paragraph position="9"> The unification is defined on sets of complex symbols. Let a = {a1, a2, ..., an} and b = {b1, b2, ..., bm}, where all ai and bj are of the form [f1(v1), ..., fn(vn)]. Then the predicate unification of a and b (with the results r and e) is defined as the union of the results of all pairs for which unify(ai, bj, r', e') holds: unification(a, b, r, e) iff r = ∪{r' : unify(ai, bj, r', e') and ai ∈ a and bj ∈ b} and e = ∪{e' : unify(ai, bj, r', e') and ai ∈ a and bj ∈ b}. The unification is obviously successful when r ≠ ∅.</Paragraph>
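    <Paragraph> The set-level definition can be sketched in Python as follows. The sketch is an illustration, not the original implementation: a complex symbol is modelled as a list of dicts, a compact pairwise unify restricted to shared features is included so that the block is self-contained, and the error sets are kept grouped per pair instead of being merged into one set. The readings given for der and Götter are simplified, hypothetical lexicon entries.

```python
# Illustrative sketch only; lexicon entries and names are assumptions.
def unify(a, b):
    """Pairwise step: unified symbol (or None) and the list of clashing features."""
    r, e = {}, []
    for f in sorted(set(a).intersection(b)):
        common = a[f].intersection(b[f])
        if common:
            r[f] = common
        else:
            e.append((f, a[f], b[f]))
    return (None if e else r), e


def unification(a, b):
    """Collect the results r and the error sets e over all pairs (ai, bj)."""
    results, errors = [], []
    for ai in a:
        for bj in b:
            r, e = unify(ai, bj)
            if r is not None:
                results.append(r)
            if e:
                errors.append(e)
    return results, errors


ALL_GENDERS = {"masc", "fem", "neutr"}
der = [
    {"gender": {"masc"}, "case": {"nominative"}, "number": {"singular"}},
    {"gender": {"fem"}, "case": {"genitive", "dative"}, "number": {"singular"}},
    {"gender": ALL_GENDERS, "case": {"genitive"}, "number": {"plural"}},
]
goetter = [
    {
        "gender": {"masc"},
        "case": {"nominative", "genitive", "accusative"},
        "number": {"plural"},
    }
]
results, errors = unification(der, goetter)
print(results)
print(errors)
```

    In this example the unification succeeds with a genitive plural reading, yet the error sets of the failed pairs are still returned, which is what makes a later review of the noun phrase possible.</Paragraph>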
    <Paragraph position="10"> Example 1: The definite article der is described by the complex symbol</Paragraph>
    <Paragraph position="12"> Feature grammars are defined as formal grammars manipulating strings of complex symbols and the derivability concept is modified according to the structures of the complex symbols. To each production rule belongs an operation on the feature sets of the symbols involved.</Paragraph>
  </Section>
  <Section position="8" start_page="608" end_page="612" type="metho">
    <SectionTitle>
3. THE TREATMENT OF ERRORS
</SectionTitle>
    <Paragraph position="0"> The whole syntactic analysis uses the metamorphosis grammar formalism [COLMERAUER 1978], enriched with unification predicates for syntactic and semantic agreement.</Paragraph>
    <Paragraph position="1"> 3.1. Agreement errors.</Paragraph>
    <Paragraph position="2"> The analysis of agreement errors in German is complex because, morphologically, the words are highly ambiguous. There are 24 different definite articles (4 cases, 3 genders and 2 numbers) but only 6 different word forms for all of them, each of which can have between 2 and 8 interpretations (or meanings). In the same way, every noun has at most four different forms, which can have 8 different morphosyntactic meanings. Adjectives are even more ambiguous, because there are (at least) 4 different declensions according to their context within a sentence: preceded by a definite or an indefinite article, by no article or by a negation. Our grammar contains these four declensions, i.e. 4*3*2*4 adjective meanings and only 5 forms for them (ending in "e", "en", "em", "er", "es"). But the case and number of a noun phrase within a sentence depend on the verb, since a verb takes a certain case and determines the number. Hence an error in the number of a noun phrase could also be an error in the number of the verb. Moreover, when two elements of a phrase do not agree, there are frequently several possible ways of analysing and explaining the disagreement. For this reason, the definition of unification has been slightly modified so as to produce all the pairs of features which do not agree as to their values. Consider example 2. The noun phrase der Kind cannot be unified and we want to explain to a student why. In the above example, three different analyses have been found. Which explanation is the right one depends on the context within the sentence. We have found that case filtering gives plausible explanations. In German, verb complements have cases.</Paragraph>
    <Paragraph position="3"> Hence, for any noun phrase in a sentence, there is an expectation as to the case. Consider the following sentences: (4) *Der Kind spielt (The child plays).</Paragraph>
    <Paragraph position="4"> (5) *Er gibt der Kind Milch (He gives milk to the child). (6) *Sie kennt der Kind (She knows the child).</Paragraph>
    <Paragraph position="5"> In (4), Der Kind is the subject of the sentence and the expected case is the nominative. Case filtering gives the third error analysis: disagreement in the gender, since der is masculine whereas Kind is neuter. In (5), der Kind is the indirect object of the sentence and the expected case is the dative. Case filtering gives us the second error analysis: disagreement in the gender, since der is feminine whereas Kind is neuter. In (6), der Kind is the direct object of the sentence and the expected case is the accusative. By case filtering, we find that der cannot be accusative.</Paragraph>
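    <Paragraph> Case filtering as applied to sentences (4) to (6) can be sketched as follows. This is a Python illustration under stated assumptions, not the system's PROLOG II code: the function case_filter, the simplified readings of der and Kind, and the wording of the messages are all hypothetical. The sketch keeps only those readings of the article that are compatible with the expected case and then reports the features on which article and noun disagree.

```python
# Illustrative sketch only; readings and message texts are assumptions.
def case_filter(article, noun, expected_case):
    """Explain a disagreement using only the readings that fit the expected case."""
    readings = [a for a in article if expected_case in a["case"]]
    if not readings:
        return ["article cannot be " + expected_case]
    explanations = []
    for a in readings:
        for n in noun:
            for f in ("gender", "number"):
                if not a[f].intersection(n[f]):
                    explanations.append(
                        "disagreement in " + f
                        + ": article is " + "/".join(sorted(a[f]))
                        + ", noun is " + "/".join(sorted(n[f]))
                    )
    return explanations


der = [
    {"gender": {"masc"}, "case": {"nominative"}, "number": {"singular"}},
    {"gender": {"fem"}, "case": {"genitive", "dative"}, "number": {"singular"}},
    {"gender": {"masc", "fem", "neutr"}, "case": {"genitive"}, "number": {"plural"}},
]
kind = [
    {
        "gender": {"neutr"},
        "case": {"nominative", "dative", "accusative"},
        "number": {"singular"},
    }
]

for case in ("nominative", "dative", "accusative"):
    print(case, case_filter(der, kind, case))
```

    The expected case would come from the verb's complement frame; here it is simply passed in as an argument.</Paragraph>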
    <Paragraph position="6"> The most plausible strategy for analysing agreement errors consists of placing an error as high as possible within a syntax tree. But this strategy has to be abandoned in the following situation. Take the sentence (7) *Der Götter zürnen (The gods are angry).</Paragraph>
    <Paragraph position="7"> Götter has the representation case(genitive), number(&lt;singular,plural&gt;)], [Art-cat(def), Gender(masc), case(nominative), number(&lt;singular,plural&gt;)]}</Paragraph>
    <Paragraph position="8"> Der Götter is the subject of the sentence and the expected case is the nominative. der Götter is genitive plural and this is the error signalled (disagreement on the case). But this analysis is not at all plausible. It is very unlikely that a student should try to construct a genitive plural, which is a "difficult" case, when the nominative, which is the "easiest" case, is required. People make errors in order to make their lives easier! Hence, the strategy of analysing errors as high as possible is not applicable when a subject noun phrase, which should be in the nominative case, could be analysed as having another case whereas parts of it are in the nominative. Now, we have seen that in the definition of unification the set of non-unifiable elements is produced even when the unification is successful. Apart from computational considerations (the algorithm runs through the lists only once), this set is very useful when a noun phrase already analysed, such as the one in our example, has to be reviewed in order to find a possible disagreement between its parts. Case filtering of the disagreeing interpretations gives us the correct error analysis: disagreement in the number, since der is singular and Götter is plural.</Paragraph>
    <Paragraph position="9"> During our numerous trials of the system, these explanations of agreement errors have always turned out to be plausible.</Paragraph>
    <Paragraph position="10"> 3.2. Syntactic errors.</Paragraph>
    <Paragraph position="11"> We distinguish between low level and high level syntactic errors. Low level syntactic errors involve the omission or addition of function words such as articles or prepositions, and the permutation of words on the lexical level. High level syntactic errors involve the permutation of groups of words. High level errors are mostly due to the non-application of obligatory transformational rules or to the application of the wrong rules, usually derived from the native language of the student. In [SCHUSTER 1986] this relationship between errors made by second language students and the grammar of their first language is systematically used for error handling. We will show by giving two examples how such types of errors can be clearly represented in PROLOG. All types of syntactic errors are treated by error rules.</Paragraph>
    <Paragraph position="13"> I. In German, adjectives precede the noun, whereas in French they frequently follow it. This is described by the following rules (formulated as PROLOG clauses):</Paragraph>
    <Paragraph position="15"> art("das".Y,Y).</Paragraph>
    <Paragraph position="16"> noun("Auto".Y,Y).</Paragraph>
    <Paragraph position="17"> adj("blaue".Y,Y).</Paragraph>
    <Paragraph position="18"> correct.</Paragraph>
    <Paragraph position="19"> error(noun,ag) :- error-message. For the sake of clarity, we have simplified these rules by suppressing all terms relating to the morphological and semantic analysis and properties of the categories. The noun phrase das blaue Auto would be analysed correctly as np("das"."blaue"."Auto".nil,nil,correct) whereas the incorrect noun phrase das Auto blaue is analysed as np("das"."Auto"."blaue".nil,nil,error(noun,ag)). The np-rule treats the error predicate F, which is a PROLOG term, by calling it.</Paragraph>
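    <Paragraph> The idea of an error rule that sits next to the correct rule can be illustrated with a short Python sketch. It does not reproduce the PROLOG clauses of the system; the function parse_np, the three-word lexicon and the tuple used as diagnosis are hypothetical stand-ins. The correct word order yields the diagnosis "correct", while the anticipated wrong order (noun before adjective) is accepted by a second rule that returns an error term.

```python
# Illustrative sketch only; names and data are assumptions, not the original rules.
LEXICON = {"das": "art", "Auto": "noun", "blaue": "adj"}


def parse_np(words):
    """Return (rest, diagnosis) for a noun phrase at the start of words, or None."""
    cats = [LEXICON.get(w) for w in words[:3]]
    if cats == ["art", "adj", "noun"]:
        return words[3:], "correct"
    if cats == ["art", "noun", "adj"]:
        # Error rule: the anticipated order with the adjective after the noun.
        return words[3:], ("error", "noun", "ag")
    return None


print(parse_np(["das", "blaue", "Auto"]))
print(parse_np(["das", "Auto", "blaue"]))
```

    As in the text, only errors that have been anticipated by a dedicated rule are recognized; any other order is simply rejected.</Paragraph>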
    <Paragraph position="20"> II. In German, verb groups in the perfect tense are frequently split up. The auxiliary takes the place of the verb, and the participle goes to the end of the sentence, as for example in (8) Ich habe dem Baby Milch gegeben (I have to the baby milk given).</Paragraph>
    <Paragraph position="21"> French (and equally English) students of German might say (9) *Ich habe gegeben dem Baby Milch.</Paragraph>
    <Paragraph position="22"> This transformation rule, as well as its erroneous omission, is represented in PROLOG as follows: vp(X,XE,correct) :- verb(X,X1,t,XH), compls(X1,X0), eq(X0,XH.XE). vp(X,X0,error(verb,part-perf)) :- verb(X,X1,perf,XH), freeze(X2,compls(X2,X0)), eq(X1,XH.X2).</Paragraph>
    <Paragraph position="23"> verb("fährt".Y,Y,pres,0).</Paragraph>
    <Paragraph position="24"> verb("ist".Y,Y,perf,"gefahren"). Again, this description has been simplified in order to make clear how these transformation rules function in PROLOG. freeze is a predefined predicate of PROLOG II [PrologIA]. freeze(X,P) delays the evaluation of P until X takes a value. compls analyses the verb complements of the sentence. The order of the sentence parts is produced by the equations between them (predicate eq).</Paragraph>
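    <Paragraph> The treatment of the split perfect can be sketched in Python as follows. The sketch is only an approximation of the two vp clauses: it works on a plain word list, does not model freeze or the difference-list equations, and the function parse_vp as well as the lexicon entry for habe are hypothetical additions. The correct rule expects the participle at the end of the verb phrase; the error rule accepts the participle directly after the auxiliary.

```python
# Illustrative sketch only; the entry for "habe" is an assumed addition.
VERBS = {
    "fährt": ("pres", None),
    "ist": ("perf", "gefahren"),
    "habe": ("perf", "gegeben"),
}


def parse_vp(words):
    """Return (complements, diagnosis), or None if no rule applies."""
    if not words or words[0] not in VERBS:
        return None
    tense, participle = VERBS[words[0]]
    rest = words[1:]
    if participle is None:
        return rest, "correct"
    if rest and rest[-1] == participle:
        # Correct rule: the participle closes the verb phrase.
        return rest[:-1], "correct"
    if rest and rest[0] == participle:
        # Error rule: the participle was left next to the auxiliary.
        return rest[1:], ("error", "verb", "part-perf")
    return None


print(parse_vp("habe dem Baby Milch gegeben".split()))
print(parse_vp("habe gegeben dem Baby Milch".split()))
```

    Both rules return the same complements, so the rest of the analysis can proceed in the same way whether or not the error was made.</Paragraph>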
    <Paragraph position="25"> 3.3. Semantic errors.</Paragraph>
    <Paragraph position="26"> The only type of semantic error on which we have been working so far concerns the violation of semantic restrictions on verbs and their complements. Consider the sample sentence (3) in chapter 1. The semantic relationships between verbs and their complements as well as their semantic features are all described by semantic predicates.</Paragraph>
    <Paragraph position="27"> subj-sem(arbeiten (work), human).</Paragraph>
    <Paragraph position="28"> subj-sem(v,n) :- sup(n',n), subj-sem(v,n').</Paragraph>
    <Paragraph position="29"> obj-sem(schreiben (write), text).</Paragraph>
    <Paragraph position="30"> sup(human, individual).</Paragraph>
    <Paragraph position="31"> sup(human, group).</Paragraph>
    <Paragraph position="32"> sup(human, humanized).</Paragraph>
    <Paragraph position="33"> sup(text,Heft (notebook)).</Paragraph>
    <Paragraph position="34"> In the grammar rules, the semantic predicates are called as follows: sg(...&lt;v,n&gt;...) :- np(...n...), vp(...v...), default(subj-sem(v,n), sem-error(v,n)).</Paragraph>
    <Paragraph position="35"> This is a rule analysing a sentence (sg). default(p,q) is a predefined predicate of PROLOG II which first evaluates all possibilities for p; only when none of these succeeds is q evaluated. sem-error produces an explanation of the type: arbeiten requires a human subject, Heft is not human but a written object.</Paragraph>
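    <Paragraph> The interplay of the semantic predicates and default can be sketched in Python. This is an illustration under assumptions, not the system's code: the facts are copied from the clauses above, sup(x, y) is read as "x is a superclass of y", and the functions subj_sem and check_subject as well as the returned strings are hypothetical. The error branch is taken only when the subject restriction cannot be satisfied in any way.

```python
# Illustrative sketch only; function names and return values are assumptions.
SUP = {
    ("human", "individual"),
    ("human", "group"),
    ("human", "humanized"),
    ("text", "Heft"),
}
SUBJ_SEM = {("arbeiten", "human")}


def subj_sem(verb, noun_class):
    """True if noun_class, or one of its superclasses, is an admissible subject."""
    if (verb, noun_class) in SUBJ_SEM:
        return True
    return any(
        subj_sem(verb, upper) for (upper, lower) in SUP if lower == noun_class
    )


def check_subject(verb, noun_class):
    """Mimic default(p, q): the error branch runs only when p has no solution."""
    if subj_sem(verb, noun_class):
        return "ok"
    return "sem-error(" + verb + ", " + noun_class + ")"


print(check_subject("arbeiten", "individual"))
print(check_subject("arbeiten", "Heft"))
```

    The recursion terminates because the class hierarchy used here has no cycles; a larger lexicon would need the same property or an explicit visited set.</Paragraph>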
    <Paragraph position="36"> 4. CONCLUSION The results of our experiments can be summarized as follows: 1. Agreement errors can perfectly well be handled in a very general way.</Paragraph>
    <Paragraph position="37"> 2. High and low level syntactic errors as well as lexical (semantic) errors can be satisfactorily dealt with, but high level syntactic errors have to be anticipated, so that their treatment is not very general. Consequently, totally disordered sentences cannot be analysed (but should they be?).</Paragraph>
    <Paragraph position="38"> 3. Ambiguously interacting errors present a serious problem. Consider the following example.</Paragraph>
    <Paragraph position="39"> (10) *Er schreibt dem Heft (He writes to the notebook). The error could be analysed as a semantic error (schreiben requires a human dative object) or as a low level syntactic error (schreiben requires the preposition an). Obviously, there is no means of deciding which error the student has committed if there is no contextual information, which is generally the case in a language teaching environment.</Paragraph>
  </Section>
</Paper>