<?xml version="1.0" standalone="yes"?>
<Paper uid="H94-1121">
  <Title>PANGLOSS: KNOWLEDGE-BASED MACHINE TRANSLATION</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
PANGLOSS: KNOWLEDGE-BASED MACHINE
TRANSLATION
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
PROJECT GOALS
</SectionTitle>
    <Paragraph position="0"> The goals of the PANGLOSS project are to investigate and develop a new-generation knowledge-based interlingual machine translation system, combining symbolic and statistical techniques. The system is to translate newspaper texts in arbitrary domains (though a specific financial domain is given preference) to as high quality as possible using as little human intervention as possible.</Paragraph>
    <Paragraph position="1"> The project involves three sites (USC/ISI, New Mexico State University, and Carnegie Mellon University).</Paragraph>
    <Paragraph position="2"> NMSU is responsible for Spanish parsing and lexicon acquisition; CMU for glossary and example-based MT, interlingua specification, workstation development, and system integration; and ISI for Japanese parsing and analysis, Spanish analysis, English generation, Japanese and English lexicon acquisition, and semantic term lexicon (Ontology) acquisition.</Paragraph>
    <Paragraph position="3"> Within PANGLOSS, ISI's particular focus is to strive toward large-scale system coverage by investigating the feasibility and utility of combining statistical and human techniques for acquiring grammars, lexicons, and semantic knowledge. To this end, we have acquired several large resources, especially of Japanese lexical information, and are developing methods to integrate this knowledge with the ongoing development of Japanese parsing and semantic analysis and of Ontology term acquisition and taxonomization.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
RECENT RESULTS
</SectionTitle>
    <Paragraph position="0"> The most recent ARPA evaluations of several MT systems, including PANGLOSS, are not yet available. However, preliminary measurements indicate that translators performed around 40% more quickly using the system than translating manually (for Spanish to English; the Japanese effort is only 6 months old at this time).</Paragraph>
    <Paragraph position="1"> In recent work, we have: * continued the construction of the PANGLOSS Ontology, the taxonomy of terms used in the semantic interlingua representation (the Ontology now contains approximately 50,000 items); * acquired and deployed the lexical analyzer JUMAN and the parser SAX, with their accompanying [...] rules that govern the inclusion of the articles &quot;the&quot; and &quot;a&quot; into English text without articles (which is how it would come from Japanese).</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="478" type="metho">
    <SectionTitle>
PLANS FOR THE COMING YEAR
</SectionTitle>
    <Paragraph position="0"> Our major efforts for the next year fall in four areas: 1. Japanese parsing, analysis, and lexis: the continued extension and testing of the current systems and lexicons; 2. Spanish semantic analysis: the development of the current mapper from the NMSU parser output to interlingua form into a more powerful and robust semantic mapper; 3. Ontology enrichment: the extraction of concept features and interrelationships from online resources and text, and their inclusion into the Ontology; 4. Sentence planning and English generation: the enhancement of the current interlingua-to-Penman mapper into a true Sentence Planner and the continued extension of the Penman generator.</Paragraph>
  </Section>
</Paper>