<?xml version="1.0" standalone="yes"?>
<Paper uid="E93-1027">
  <Title>Linguistic Knowledge Acquisition from Parsing Failures</Title>
  <Section position="4" start_page="222" end_page="223" type="metho">
    <SectionTitle>
3 Formalism and the Parser
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="222" end_page="223" type="sub_section">
      <SectionTitle>
3.1 Linguistic Knowledge to be Acquired
</SectionTitle>
      <Paragraph position="0"> The formalism and linguistic theories which one chooses as the bases for grammatical learning largely determine the types of linguistic knowledge to be acquired as well as their representational forms.</Paragraph>
      <Paragraph position="1"> If one chooses a general form of CFG without committment to any specific linguistic theory, the knowledge to be learned is just a set of general rewriting rules. On the other hand, if one chooses more specific linguistic frameworks, they impose further restrictions on possible forms of knowledge to be learned, and introduce more diverse forms of representing knowledge. For example, if one chooses a lexicon-oriented framework, it may assume the existence of subcategorization frames as lexical properties, and impose restrictions on the form of rewriting rules such as &amp;quot;the LHS of each rewriting rule should</Paragraph>
      <Paragraph position="3"> While minimal commitment to specific linguistic theories is possible for research on general algorithms of robust parsing (as in \[Mellish, 1989\]), it does not seem feasible for our paradigm, as our aim (learning linguistic knowledge) is directly related to the problems of what type of knowledge is to be learned and how it is properly represented. To learn such recta-principles from corpora, starting from a weak assumption formalism like CFG, requires induction and an impractically huge search space.</Paragraph>
      <Paragraph position="4"> Instead, our aim is far less ambitious than automatic grammar learning from corpora. Our goal is to make existing grammar and lexical resources more comprehensive or to adapt them to new application domains. That is, from the very beginning, a system has a set of linguistic knowledge represented in specific forms by assuming that meta-principles proposed by current linguistic theories are valid. We use established linguistic concepts such as 'Number-Property', subcategorization frames of predicates, syntactic categories, etc. Most of the inductive processes required in grammar learning will have been performed in advance (by linguists), though hypothesizing lacking knowledge may require induction even in our framework.</Paragraph>
    </Section>
    <Section position="2" start_page="223" end_page="223" type="sub_section">
      <SectionTitle>
3.2 Grammar Formalism
</SectionTitle>
      <Paragraph position="0"> Figure 1 and Figure 2 show the general forms of the rules in our grammar and specific examples respectively. For experiments, we use a grammar which consists of 190 rewriting rules, giving us reasonable coverage of English.</Paragraph>
      <Paragraph position="1"> As can be seen, the formalism used is a conventional kind of unification grammar where context free rules are augmented by feature conditions. In Figure 1, each syntactic category Cati in a rewriting rule has a feature structure Fi, which is unified either wholly or partially to another by using the same variable or by applying the unification function f(F, F1, F2,..., F,~) (See examples in Figure 2).</Paragraph>
      <Paragraph position="2"> Although we do not commit ourselves to any specific linguistic theory, it can be seen from the example rules that we use basic concepts in modern linguistic theories such as Head, Subcat, a set of grammatical functions (Subject, Object, etc.), etc.</Paragraph>
      <Paragraph position="4"/>
    </Section>
    <Section position="3" start_page="223" end_page="223" type="sub_section">
      <SectionTitle>
3.3 Parsing Results
</SectionTitle>
      <Paragraph position="0"> The parser we use is a left corner, bottom-up parser with top-down filtering. When it fails to parse, it reparses the same sentence without top-down filtering and outputs the following intermediate tuples.</Paragraph>
      <Paragraph position="1"> Successful Category: succes sful~oal (Cat, Words, WordsRest) This tuple means that a word sequence between 'Words' and 'WordsRest' was successfully analysed as an expected category 'Cat'.</Paragraph>
      <Paragraph position="2"> ex.) successful_goal(np, \[the,boy, has,a,book\], \[has,a,book\]) Failed Category: failed_goal(Cat .Words) This tuple means that an expected category 'Cat' could not be analysed from a word list 'Words'.</Paragraph>
      <Paragraph position="3"> ex.) failed.goal(np,\[has,a,book\]) These tuples are similar to active and inactive edges of a chart parser but the 'Failed Category' above directly expresses the local ungrammaticality while an active edge expresses an incomplete expectation of a category within a grammar rule.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="223" end_page="226" type="metho">
    <SectionTitle>
4 Generation of Hypotheses
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="223" end_page="225" type="sub_section">
      <SectionTitle>
4.1 Hypothesizing Grammar Rules from
Parsing Failures
</SectionTitle>
      <Paragraph position="0"> When the parser fails to analyse a sentence, the grammar rule hypothesizing program (shortly GRHP) investigates the parsing results and hypothesizes all the possible modifications of the existing grammar that produce a complete parsing result.</Paragraph>
      <Paragraph position="1"> GRHP starts from the top category's' and proceeds by breaking down each failed category in accordance with the existing grammar.</Paragraph>
      <Paragraph position="2">  The hypothesizing procedure (hypo_proc) works for each category CatA as follows (See also Figure 3):</Paragraph>
      <Paragraph position="4"> if (CatA is a non-lexical category) then  HYPO(rule: CatA =~ CatC1 +... + CatCz) ...... (5) else if (CatA is a failed category) then HYPO(lexical_entry: CatA =~ \[Word\]) ...... (6) endif end (1) If CatA is a failed category, the procedure breaks CatA down into its daughter categories according to the rule 'CatA :C/, CatBil + ... + CatBin' in the existing grammar. The procedure iterates this breakdown for each rule composing CatA.</Paragraph>
      <Paragraph position="5"> (2) The procedure calls itself recursively for each daughter category CatBii.</Paragraph>
      <Paragraph position="6"> (3) The procedure also checks whether CatBij is a failed category. If it is a failed category, the procedure hypothesizes a new left recursive rule for the preceding category CatBij_l and generates a rule 'CatBij_l =:~ CatBii-1 + CatR1 + * .. +CatRo' by searching adjacent successful categories next to CatBij-1 unless this rule is included in the existing grammar.</Paragraph>
      <Paragraph position="7"> (4) If all the daughter categories are successful categories, the procedure hypothesizes the feature disagreement between them. For example, if the existing grammar contains a rule's ::C/, np+ vp' and both 'np' and 'vp' are successfully parsed but still 's' is a failed category, the procedure hypothesizes the feature disagreement between 'np' and 'vp'.</Paragraph>
      <Paragraph position="8"> (5) When the procedure finishes applying all the  known rules of CatA, it hypothesize a new rule of CatA unless CatA is a lexical category. The procedure searches adjacent successful categories starting from the word position where CatA is expected and generates a rule  'CatA :=~ CatC1 + ... + CatCl' unless the rule is included in the existing grammar. This step is directly executed if CatA is not a failed category or there are no known rules which compose CatA.</Paragraph>
      <Paragraph position="9"> (6) If CatA is a failed lexical category, the proce null dure hypothesizes a new lexical entry 'CatA ==~ \[Word\]' at the word position where CatA is expected. By this hypothesis, an unknown word as well as a known word is assigned into an expected category.</Paragraph>
      <Paragraph position="10"> Actually, this process is implemented on Prolog and each hypothesis is generated alternatively. When GRHP generates a hypothesis, it passes the hypothesis to the parser to analyse the remaining part of the sentence. As the result, GI~HP outputs only the hypotheses that lead to complete structures of the sentences.</Paragraph>
      <Paragraph position="11"> On this search algorithm, we imposed a strict condition that a sentence does not have more than one cause of its parsing failure and the combination of hypotheses is not allowed to account for one ungrammaticality. Therefore, GRHP generates each hypothesis independently and all the hypotheses generated from a sentence are alternatives.</Paragraph>
    </Section>
    <Section position="2" start_page="225" end_page="226" type="sub_section">
      <SectionTitle>
4.2 Elimination of Redundant Hypotheses
</SectionTitle>
      <Paragraph position="0"> GRHP in Section 4.1 generates a lot of alternative hypotheses, many of which are nonsensical from the linguistic viewpoint. GRHP as it is stated there does not include any criteria for judging the appropriateness of hypotheses as linguistic rules. In the extreme, it can hypothesize a rule which directly derives the input string of words from the start symbol 's'. Although such a rule allows the grammar to accept the input as a sentence, the rule obviously lacks the generality which we expect a linguistic rule to have. More seriously, it ignores all the generalizations which the existing grammar embodies.</Paragraph>
      <Paragraph position="1"> One can conceive of an automatic procedure of grammar learning which starts from a set of such rules and gradually discovers grammatical concepts, such as NP, VP, etc., based on the replaceability among sub-strings. However, as we discussed in Section 3, such a procedure has to solve the difficulties caused by a huge search space which an induction process generally has, and we are convinced that it is impossible to induce from scratch the rules involved in complex systems such as human languages.</Paragraph>
      <Paragraph position="2"> Instead, our framework assumes that most of the induction processes required in grammar learning have been done by linguists and embodied in the form of the existing grammar. The system has only to discover defects or incompleteness of the existing grammar or to discover the differences between the sublanguage in a new domain and the sublanguage which the existing grammar has been prepared for. In other words, the hypotheses GRHP generates should use the generalizations embodied in the existing grammar as much as possible, and the hypotheses which ignore them should be rejected as nonsensical or redundant ones.</Paragraph>
      <Paragraph position="3"> GRHP hypothesizes a set of new rules which collect sequences of successful categories starting at the same word position into the same failed category.</Paragraph>
      <Paragraph position="4"> If a substring of the input which is collected into the failed category contains a sequence of &amp;quot;a good student&amp;quot;, for example, and if the existing grammar contains rules like 'nhead :=~ adj + nhead', 'np =~ det + nhead', etc., GRHP will generate hypotheses whose RHSs contain the sequence, such as 'det + adj + nhead', 'det + nhead', etc., as well as the ones whose RHSs contain 'np' for the same part of the input.</Paragraph>
      <Paragraph position="5"> However, because the hypothesized rules containing smaller constituents, such as 'det', 'nhead', etc. instead of 'np', ignore the generalization captured by 'np' in the existing grammar, they should be disregarded as redundant, while only the ones which contain 'np' in their RHSs are kept as viable hypotheses. Much simpler criteria could also be used to prevent nonsensical hypotheses from being generated.</Paragraph>
      <Paragraph position="6"> For example, a rule whose RHS consists of a large number of constituents would not be viable, if we assume that the existing grammar has already been equipped with a reasonable set of syntactic categories (non-terminals) which allow sentences to be assigned reasonably structured descriptions.</Paragraph>
      <Paragraph position="7"> The following is a list of the criteria which Gl~HP can use to disregard nonsensical hypotheses.</Paragraph>
      <Paragraph position="8"> \[1\] Priority to the hypotheses of feature disagreement: Assuming that the existing grammar is quite comprehensive, we can give priority to the hypotheses of feature disagreement, which do not create new rules. In the current implementation, if GI:tHP finds a feature disagreement hypothesis to restore a failed category, it stops the recursion and generates no more hypotheses. null \[2\] Number of daughter nodes: A rule which collects an excessive number of constituents into one large constituent at once is not viable. We currently restrict the number of daughter nodes to 4.</Paragraph>
      <Paragraph position="9"> \[3\] Priority to the hypotheses using generalizations embodied by the existing grammar: As discussed in the above, priority is given to the hypotheses which contain 'np' as daughters over those which contain 'det + nhead', 'det + adj + nhead', etc. In general, hypotheses containing sequences of constituents which can be collected into larger constituents by existing rules are disregarded as redundant (See Figure 4).</Paragraph>
      <Paragraph position="10"> \[4\] Distinction of lexical categories from other cateogries: While the general form of CFG  does not distinguish lexical categories from other non-terminals, our grammar does. Therefore, we prohibit GRHP to hypothesize a new rule whose mother category is one of the lexical categories. The lexical categories are allowed only to appear in new lexical rules.</Paragraph>
      <Paragraph position="11"> \[5\] Distinction of closed and open lexical categories: We assume that the existing grammar has a complete list of function words. This means that LHSs of rules for new lexical entries are restricted to the open lexical categories, such as noun, verb, adjective, and adverb.</Paragraph>
      <Paragraph position="12"> \[6\] Use of subcategorization frames: As in our grammar formalism a subcategorization frame is embedded in the feature structure of a head category, the correspondence between the head category and its subcategories does not appear explicitly in rules. Therefore, a subcategorization frame checking mechanism should be incorporated into the search algorithm and executed before hypothesizing any rule or any lexical entry in order to filter out redundant hypotheses.</Paragraph>
      <Paragraph position="13"> \[7\] Prohibition of unary rules: While the general form of CFG allows unary rules and they are sometimes used as category conversion rules in actual descriptions of a grammar, they differ from the constituent rules which specify mother-daughter relationships. For example, a rule 'np =C/, infinitive' means that an infinitival clause behaves as a noun phrase in larger constituents without changing its structure. Unrestricted introduction of such unary rules, however, increases drastically not only parsing ambiguities but also possible hypotheses generated by GRHP. Except for lexical rules which are unary in nature, we can prohibit unary hypotheses by assuming that the existing grammar exhausts all possible category conversion rules among the categories it uses (See Section 5).</Paragraph>
      <Paragraph position="14"> \[8\] Distinction of closed and open categories: We can extend the distinction of open and closed lexical categories in \[5\] to the other categories.</Paragraph>
      <Paragraph position="15"> Depending on the completeness of the existing grammar, we can specify a set of categories as closed categories and prohibit GRHP to generate new rules whose RHSs belong to the set.</Paragraph>
      <Paragraph position="16"> \[9\] Restricted patterns of new rules: This restriction could be realized by introducing meta-rules which specify the form of a new rule and the relations between adjacent categories. For example, according to the X-bar theory, we can confine a category appearing at the complement position to be a maximal projection.</Paragraph>
      <Paragraph position="17"> \[10\] Restriction on Lexical Rules: As we discussed in \[7\], unary rules are one of the major causes of explosion of the search space. Unary lexical rules can also be restricted by introducing a pr/or knowledge of possible lexical category conversions. For example, while the conversion between a noun and a verb is very frequent in English, the conversion of an adverb with the suffix -ly to a verb is extremely rare. This means that, though verb is an open lexical category, we can prohibit a lexical rule which forces a word registered in the dictionary as an adverb to be interpreted as a verb.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="226" end_page="229" type="metho">
    <SectionTitle>
5 Preliminary Experiment
</SectionTitle>
    <Paragraph position="0"> To see what sort of hypotheses are actually generated, and how many of them are reasonable (in other words, how many of them are nonsensical), we have conducted a preliminary experiment with the following six sentences.</Paragraph>
    <Paragraph position="1">  (1) The girl in the garden has a bouquet.</Paragraph>
    <Paragraph position="2"> (2) Buy a new car.</Paragraph>
    <Paragraph position="3"> (3) Dogs do dream.</Paragraph>
    <Paragraph position="4"> (4) The box is so heavy that I could not move it. (5) The student has a BMW.</Paragraph>
    <Paragraph position="5"> (6) The boy caught several fish.</Paragraph>
    <Paragraph position="6">  We deliberately introduce defects into the existing grammar which are relevant to the analysis of these sentences. That is, the following rules are removed from the existing grammar for the sake of the experiment. null  The criteria \[1\]-\[5\] of redundant hypotheses are included in the basic algorithm of GRHP so that the following lists of hypotheses for these examples do  not contain those which are rejected by these criteria. The hypotheses marked with '--*' are the plausible hypotheses. The hypotheses marked by x and (r) are the hypotheses removed by adding \[6\] and \[7\] as further criteria of redundant hypotheses, respectively. We do not use the criteria of \[8\]-\[10\] in this experiment, partly because these are highly dependent on the completeness of the existing grammar and, though very effective for reducing the number of hypotheses, can be arbitrary.</Paragraph>
    <Paragraph position="7">  (1) &amp;quot;The girl in the garden has a bouquet.&amp;quot; (r) Rule: colonp =&gt; pp -* Rule: np =&gt; np,pp  imperative sentences. This rule looks plausible but the fact that the criteria \[7\] of redundant hypotheses suppresses this rule indicates that a rule for imperative sentences should not be treated as a normal unary (category conversion) rule but rather a whole-sentencial constituent rule.</Paragraph>
    <Paragraph position="8">  (3) &amp;quot;Dogs do dream.&amp;quot; X Rule: ajp =&gt; nhead x Rule: ajp =&gt; vp (r) Rule: colonp =&gt; auxdo @ Rule: colonp =&gt; vp X Rule: infinitive =&gt; nhead x Rule: infinitive =&gt; vp Rule: np =&gt; np,auxdo Rule: np =&gt; np,vp (r) Rule: np =&gt; relc (r) Rule: np =&gt; s (r) Rule: np =&gt; vp  (r) Rule: sub_clause =&gt; nhead (r) Rule: sub_clause =&gt; vp x Rule: that_clause =&gt; nhead x Rule: that_clause =&gt; vp Rule: vp =&gt; auxdo,nhead -*Rule: vp =&gt; auxdo,vp (r) Rule: vp =&gt; auxdo (4)  Although this sentence is short, quite a few hypotheses are generated. This is partly because both &amp;quot;do&amp;quot; and &amp;quot;dream&amp;quot; are ambiguous in their parts of speech. Some of the generated hypotheses are based on the interpretation of &amp;quot;dream&amp;quot; as a noun. However, even in the cases in which the main verb is not ambiguous, GRHP always hypothesizes 'vp =~ vp + vp' as well as the correct DO-emphasis rule, as &amp;quot;do&amp;quot; has two parts of speech. As we discuss in the following section, it is impossible to choose one of these hypotheses on the basis of single parsing failures. We need corpus-based techniques to rate the plausibility of these two hypotheses.</Paragraph>
    <Paragraph position="9">  &amp;quot;The box is so heavy that I could not move it.&amp;quot; X Rule: x Rule: x Rule: x Rule: x Rule: x Rule: x Rule: x Rule: x Rule: x Rule:  (r) Rule: x Rule: x Rule: x Rule: x Rule: x Rule: x Rule:  x Rule: vp =&gt; adv,ajp,that.~lause x Rule: vp =&gt; adv,ajp x Rule: vp =&gt; ajp,relc,np x Rule: vp =&gt; ajp,relc x Rule: vp =&gt; ajp,that_clause x Rule: vp =&gt; ajp x Rule: vp =&gt; relc,np x Rule: vp =&gt; relc x Rule: vp =&gt; that_clause x Rule: vp =&gt; vp,relc,np x Rule: vp =&gt; vp,relc X Rule: vppsv =&gt; adv,ajp,relc,np x Rule: vppsv =&gt; adv,ajp,relc x Rule: vppsv =&gt; adv,ajp,that_clause x Rule: vppsv =&gt; adv,ajp x Rule: vppsv =&gt; ajp,relc,np x Rule: vppsv =&gt; ajp,relc x Rule: vppsv =&gt; ajp,that_clause x Rule: vppsv =&gt; ajp x Rule: vppsv =&gt; relc,np x Rule: vppsv =&gt; relc x Rule: vppsv =&gt; that_clause  In this example, 'vp ~ vp + that_clause' (or 's ~ s + that_clause') could be the appropriate hypothesis. However, simple addition of such a rule to the existing grammar results in overgeneralization. The rule should have a condition on the existence of &amp;quot;so&amp;quot; in 'vp' (or 's') while a similar effect can also be attained by adding a new lexical entry for &amp;quot;heavy&amp;quot; which has a sub-categorization frame containing a 'that clause'. That is, the system has to decide which hypothesis is more plausible, either &amp;quot;heavy&amp;quot; can subcategorize a 'that clause' or &amp;quot;so&amp;quot; is crucial in making 'vp' to be related with a 'that clause'. This decision may not be possible, if this sentence is the only one sentence in a corpus which contains this construction. Like Example 3, we need corpus-based techniques to choose the right one.</Paragraph>
    <Paragraph position="10">  GRHP generates the correct hypothesis of the feature disagreement between the plural determiner &amp;quot;several&amp;quot; and the noun &amp;quot;fish&amp;quot; as one of possible hypotheses.</Paragraph>
    <Paragraph position="11"> Table 2 summarizes the number of hypotheses generated for each sample sentence. As can be seen, while appropriate hypotheses are generated, quite a few other hypotheses are also generated, especially in the case of the third and the fourth sentences. However, as shown in Table 3, the criteria \[6\] and \[7\] of redundant hypotheses can eliminate significant portions of nonsensical hypotheses (Table 3 shows the effects of these criteria on the number of hypothesized new rules). In Example (4), for example, 31 out of 58 initially hypothesized rules are eliminated by \[6\] and \[7\], while 16 out of 28 rules are eliminated in Example (3). Furthermore, we expect that introduction of other criteria for redundant elimination based on \[8\]-\[10\] will reduce the number of hypotheses significantly and make the succeeding stage of the corpus-based statistical analysis feasible. The experiment on another set of sample sentences from the UNIX on-line manual confirms our expectation (See Table 4). The number of hypotheses generated in this experiment is very much similar to that of the experiment on artificial samples (note that Table 4 shows the number of hypotheses generated before elimination by the criteria \[6\] and'J7\]).</Paragraph>
  </Section>
  <Section position="7" start_page="229" end_page="229" type="metho">
    <SectionTitle>
6 Corpus-based Techniques and
Linguistic Knowledge Acquisition
</SectionTitle>
    <Paragraph position="0"> We discussed that using an existing grammar should enable us to avoid a huge search space which grammatical learning would otherwise have. Instead of inducing grammatical concepts from scratch, our framework uses the categories prepared in an existing grammar for formulating new structural rules.</Paragraph>
    <Paragraph position="1"> However, linguistic knowledge acquisition is inherently an inductive process. We cannot expect GttHP alone to choose correct hypotheses without observing analysis results of other sentences in a corpus.</Paragraph>
    <Paragraph position="2"> Although we have not yet implemented the corpus-based component, the result of the preliminary experiment indicates what sorts of functions this component should have.</Paragraph>
    <Paragraph position="3"> \[1\] In Example (6), we have a feature disagreement hypothesis for &amp;quot;several fish&amp;quot; and two lexical hypotheses for &amp;quot;several&amp;quot;. Further analysis of the feature disagreement hypothesis will lead to two competing hypotheses, one of which requires a revised lexical de.</Paragraph>
    <Paragraph position="4"> scription of &amp;quot;several&amp;quot; and the other of which suggests that of '~ish&amp;quot;. The other two lexical hypotheses also suggest different revisions in the description of &amp;quot;several&amp;quot;. However, the analysis of this sentence alone may not enable us to decide which of these four hypotheses is the right one.</Paragraph>
    <Paragraph position="5"> We reported in \[Tsujii et al., 1992\] that a simple statistical measure like the Failure Rate o/ a Word (ratio of the number of sentences containing a word that cannot be parsed to the total number of sentences containing the same word) is useful for discovering words whose lexical descriptions contain de.</Paragraph>
    <Paragraph position="6"> fects. This kind of simple measures would also be effective in a situation like Example (6). That is, we can expect that, while the frequency of the word &amp;quot;several&amp;quot; would be high, the frequency of the hypotheses suggesting the revisions of the lexical de.</Paragraph>
    <Paragraph position="7"> scriptions of this word would be relatively low.</Paragraph>
    <Paragraph position="8"> \[2\] As we noted in the comment on Example (3), whenever DO-emphasis construction appears, the same pair of the hypotheses, 'vp ::~ vp + vp' and 'vp =~ auzdo + vp', will be generated. Unless other types of failures lead to one of these hypotheses, they would be judged to have exactly the same remedial powers, i.e. the same set of failures are restored by them. In such a situation, we may be able to choose the right one by comparing the specificities of competing hypotheses. In this example, the former hypothesis which uses 'vp' instead of'auzdo' can be judged as having excessive generative powers and therefore inappropriate because the other competing hypothesis with far restricted generative powers can restore the same set of parsing failures.</Paragraph>
    <Paragraph position="9"> In order for such comparison to be meaningful, the system first have to judge, by corpus-based techniques, whether competing hypotheses have the same remedial powers or not. If the more general ones appear frequently as remedial rules for parsing failures which cannot be restored by the specific ones, the general ones would be the right ones.</Paragraph>
    <Paragraph position="10"> \[3\] Example (4) shows a situation opposite to Example (3). We have two (or three) viable competing hypotheses in this example. One is the specific hypothesis with very restricted generative powers which suggests to revise the lexical description of &amp;quot;heavy&amp;quot;. The other is a more general hypothesis which allows 'vp' (or 's') to be followed by 'that_clause'. Although either of these two can restore the parsing failure of this sentence, the specific one cannot restore parsing failures in other sentences in which SO-THAT constructions appear with different adjectives. That is, unlike Example (3), these two hypotheses have different remedial powers and, because of this, the general one should be chosen as the right one.</Paragraph>
    <Paragraph position="11"> Furthermore, though simple addition of this general rule results in serious over-generalization, to curb this over-generalization needs complex revisions of related grammar rules in order for a feature indicating the existence of &amp;quot;so&amp;quot; to be percolated to the node of 'vp' (or 's'). Such invention of a new feature and re-organization of related rules seem beyond the current framework and we expect human linguists to examine suggested hyoptheses.</Paragraph>
  </Section>
class="xml-element"></Paper>
Download Original XML