<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-4188">
  <Title>CONVERTING LARGE ON-LINE VALENCY DICTIONARIES FOR NLP APPLICATIONS: FROM PROTON DESCRIPTIONS TO METAL FRAMES</Title>
  <Section position="6" start_page="0" end_page="0" type="concl">
    <SectionTitle>
5. Discussion of results
</SectionTitle>
    <!-- ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 · 1185 · PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 -->
    <Paragraph position="0"> Using the conversion software, the complete Proton database (at that time, i.e. March 1990, consisting of 85130 valency structures for French and 6000 for Dutch) was processed into a database with Metal valency patterns that could form the basis of manual coding. In the first place, checks were run to compare the results of the conversion with the frames already coded in the dictionary. This already led to an improvement of the existing database. In the second place, additional verb coding is now being done on the basis of the conversion output, and not from scratch (i.e. from paper dictionaries).</Paragraph>
    <Paragraph position="1"> The total effort spent on developing the software (including the preliminary study phase and the construction of the mapping table) was about four man-months. When we compared the time needed to code Metal valency frames starting from scratch (the way the first 1000 verbs were added to the system) with the time needed to code frames starting from the output of the conversion, we found that on the whole, and subtracting the conversion development effort, coding productivity increased by a factor of 2. In other words, the practical goal of fast extension of the verb dictionaries was certainly reached.</Paragraph>
    <Paragraph position="2"> As to the more general questions of requirements for convertibility of lexical resources or standardization of lexical information, a few remarks are in order. First, in our case the input lexical resource was in a fairly easily convertible format, viz. Prolog clauses. Even so, since it was the first time the Proton databases were used outside of the project, several ambiguities and inconsistencies were found that needed correction before the conversion could take place. A basic requirement for convertibility then seems to be a rigid description of the syntax and semantics of the database entries; before the resource is made available to the outside world, it should be checked thoroughly against its own specifications (parsers can be generated automatically on the basis of a BNF-like syntax). More ambitiously, if the formats of valency information in different applications were known, the resource could be made available along with converters or converter specifications. As to the long-term goal of standardization, we are planning to use the experiences gathered from the conversion (along with knowledge about other formalisms, like that used in EUROTRA or in the databases of the Nijmegen Centre for Lexical Information CELEX) to study the requirements for a theory-neutral and application-neutral standard for valency representation. Since valency is not restricted to verbs, but also concerns adjectives and nouns, the standard could even try to be category-neutral as well.</Paragraph>
    <Paragraph position="3"> Although the Proton-Metal conversion proved a successful experiment in computational lexicography, many linguistic and computational issues concerning valency and its processing have not been touched upon here and certainly need further research. To name but a few: nominal and adjectival valency, a foolproof methodology for making and/or merging reading distinctions, valency and idiomatic expressions, the interactions of the different types of valency information in an NLP application, and the link with more semantically oriented approaches to valency. On the basis of the availability of large amounts of valency data, and the experience with different formalisms, we hope to be able to tackle some of these issues in the future.</Paragraph>
  </Section>
</Paper>