<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1071">
  <Title>A NATIONAL RESOURCE GRAMMAR</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2. WHAT THE NATIONAL RESOURCE GRAMMAR WOULD BE
</SectionTitle>
    <Paragraph position="0"> The National Resource Grammar should include everything we know how to do well. In particular, it should include the following features:</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="352" type="metho">
    <SectionTitle>
* Complete English inflectional morphology.
</SectionTitle>
    <Paragraph position="0"> * A very broad grammatical coverage, including all the subcategorization patterns, sentential complements, complex adverbials, relative clauses, complex determiners and quantifiers, conjunction and comparative constructions, and the most common sentence fragments.</Paragraph>
    <Paragraph position="1"> * Mechanisms for defining and applying selectional constraints, although the actual ontology would not be provided, since that is too domain-dependent. * A &quot;quasi-logical form&quot; defined for every construction in the grammar. The quasi-logical form would encode all operator-operand relationships, but not attempt to decide among the various quantifier scope readings. It would be easily convertible into other semantic representations.</Paragraph>
    <Paragraph position="2"> * The most commonly used parse preference heuristics. * An optional routine for pronoun reference resolution according to syntactic or centering criteria.</Paragraph>
    <Paragraph position="3"> * An optional routine for quantifier-scope generation, either generating all quantifier scopings from the quasi-logical form, or using various common heuristics for ranking the alternate scopings.</Paragraph>
    <Paragraph position="4"> * A lexicon of several thousand words, including examples of all lexical categories and subcategories defined by the grammar.</Paragraph>
    <Paragraph position="5"> The grammar should be  * As modular as possible, for easy modification. * As reflective as possible of current linguistic theory. * As neutral as possible on controversial issues. * Compatible with the classification scheme used in the Penn Tree Bank.</Paragraph>
    <Paragraph position="6"> (The third and fourth of these items exert pressure in different directions, of course, and where the conflict is unresolvable, the fourth should take priority.) The system should include * An efficient parser, programmed in C for portability. * Convenient grammar development tools, for users to extend the grammar as required in specialized domains.</Paragraph>
    <Paragraph position="7"> * Complete documentation on the grammar and on the algorithms used.</Paragraph>
    <Paragraph position="8"> During the development of the National Resource Grammar, it should be continually tested on a large set of key examples. Periodically, it should be tested on sentences taken at random from the Penn Tree Bank. Computational linguists and potential users should be consulted regularly to make certain that the system produces analyses that are maximally useful to others.</Paragraph>
  </Section>
  <Section position="5" start_page="352" end_page="352" type="metho">
    <SectionTitle>
3. USES
</SectionTitle>
    <Paragraph position="0"> Among the uses of the National Resource Grammar would be the following: * To provide a convenient syntactic analysis component for researchers wishing to investigate other problems, such as semantics, pragmatics, or discourse. * To provide a quick and effective syntactic analysis component for government agencies, members of the LDC, and others implementing natural language processing applications.</Paragraph>
    <Paragraph position="1"> * To serve as a basis for experimentation with stochastic models of syntactic analysis.</Paragraph>
    <Paragraph position="2"> * To serve as an aid in the annotation of sentences in the Penn Tree Bank and other corpora.</Paragraph>
    <Paragraph position="3"> We believe, on the other hand, that a National Resource Grammar should not in any way be required or imposed on research projects. It should be just what it says: a resource. We believe it should promote rather than retard research on grammar and grammar formalisms.</Paragraph>
  </Section>
  <Section position="6" start_page="352" end_page="353" type="metho">
    <SectionTitle>
4. ORGANIZATION OF THE PROJECT
</SectionTitle>
    <Paragraph position="0"> By basing the effort on an existing, very broad-coverage grammar, the development of very nearly the entire National Resource Grammar and its supporting system could be completed in one year. Our guess is that roughly 90% of the phrase structure rules and 70% of the constraints on the rules could be completed in the first year. During the second year, the grammar could be put into the hands of a variety of users, who would be consulted frequently, ensuring that the final product was responsive to their needs.</Paragraph>
    <Paragraph position="1"> More specifically, we feel the first year's task could be broken down into six different areas, each representing roughly two months' effort for the implementation of an initial solution. Further development of all aspects of the grammar, especially in response to comments from potential users and an advisory committee of linguists and computational linguists, would continue throughout the two years. Completing the initial implementation in the first year would give the developers sufficient time to respond to this feedback.</Paragraph>
    <Paragraph position="2"> The six areas are as follows:  1. A core, skeletal grammar, which would allow the developers to trace out the broad outlines of the grammar and give them a tool for testing further developments.</Paragraph>
    <Paragraph position="3"> 2. The structure of the noun phrase and adjective phrase to the left of the head, including complex determiner and quantifier structures, and adjective specifiers.</Paragraph>
    <Paragraph position="4"> 3. The auxiliary complex, noun complements and predicate complements, including cleft and pseudo-cleft constructions.</Paragraph>
    <Paragraph position="5"> 4. The structure of the verb phrase, subcategorization and sentential complements for verbs and adjectives. 5. Relative clauses and other &quot;wh&quot; constructions. 6. Adverbials and other sentence adjuncts.</Paragraph>
    <Paragraph position="6"> * To serve as a challenge to linguists and computational linguists to handle the various phenomena in better ways.</Paragraph>
    <Paragraph position="7"> Conjunction and comparative constructions would be handled not as a separate item, but throughout the effort. It would be a bad idea, for example, to develop a treatment of nonconjoined relative clauses in Month 3 and a treatment of conjoined relative clauses in Month 10, because the latter may force a complete rethinking of how the former was done. Similarly, semantic interpretation, the lexicon, mechanisms for selectional constraints, and parse preference heuristics would be implemented and documented in tandem with grammar development. Each of these phenomena is of course a huge problem, and worthy of years of investigation. However, since at least one treatment of each of the phenomena has already been implemented, and encoding the current best existing treatment is what is required, we are confident such a schedule could be met. Even so, the developers would have to be very sensitive to black holes, since syntax abounds with them, and more grammar development projects have been derailed by them than have avoided them.</Paragraph>
    <Paragraph position="8"> Of course, an effort of this scope could not be done by committee, but it would be extremely useful to have an advisory committee consisting of linguists and computational linguists of a wide variety of theoretical orientations. The advisory committee would be solicited, before each two-month period, for key examples and key treatments of the phenomena. As the initial implementation in each area of the grammar is completed, the results, that is, the rules together with complete documentation, would be circulated to the advisory committee for a critique. Where this critique yielded clearly superior solutions to the problems, those solutions would be incorporated into the implementation.</Paragraph>
  </Section>
</Paper>