<?xml version="1.0" standalone="yes"?>
<Paper uid="M93-1028">
  <Title>Report from the Text Analysis Techniques Topic Session</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
DISAMBIGUATING TEMPORAL EXPRESSIONS
</SectionTitle>
    <Paragraph position="0"> Lois Childs discussed GE's efforts to extract temporal expressions from text through the identification of relevant patterns. The Shogun system used 37 patterns for English and 7 for Japanese. Patterns were context dependent and referenced a dateline in order to handle relative time. The patterns were able to perform temporal calculations, and the system computed a temporal structure from reference points on the dateline. The system was able to handle temporal references spread throughout a message. This approach gave the Shogun system good coverage of time fills; extensions to this approach will provide improved handling of ambiguous dates.</Paragraph>
  </Section>
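The dateline-relative resolution described above can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not Shogun's actual pattern inventory: the pattern list, the `resolve` function, and the offset scheme are all assumptions introduced for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical patterns, not Shogun's: each maps a relative temporal
# expression to an offset (in days) from the message dateline.
PATTERNS = [
    (re.compile(r"\btoday\b", re.IGNORECASE), 0),
    (re.compile(r"\byesterday\b", re.IGNORECASE), -1),
    (re.compile(r"\btomorrow\b", re.IGNORECASE), 1),
    (re.compile(r"\b(\d+) days ago\b", re.IGNORECASE), None),  # offset read from group 1
]

def resolve(text, dateline):
    """Resolve relative temporal expressions against the message dateline."""
    results = []
    for pattern, offset in PATTERNS:
        for match in pattern.finditer(text):
            days = -int(match.group(1)) if offset is None else offset
            results.append((match.group(0), dateline + timedelta(days=days)))
    return results

print(resolve("He arrived yesterday; talks resume tomorrow.", date(1993, 8, 25)))
```

Because every pattern is computed against the same reference point, expressions scattered throughout a message resolve to a consistent set of absolute dates, which is the property the summary attributes to Shogun's temporal structure.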
  <Section position="3" start_page="0" end_page="341" type="metho">
    <SectionTitle>
AUTOMATICALLY TRAINABLE ERROR CORRECTION OF
JAPANESE SEGMENTATION AND PART-OF-SPEECH ANALYSIS
</SectionTitle>
    <Paragraph position="0"> Sean Boisen presented BBN's work on a learning algorithm used to improve the performance of the Juman Japanese word-segmentation system provided by Kyoto University. BBN used AMED (Automatic Morphological Error Detection) as a segment correction model between Juman and BBN's POST tagger. Using hand-produced segmentation and tagging for training, the system was able to acquire transformations from tags and learn rules for segment correction in order to reclassify words, put words together, and take words apart. The system produced a chart of possible corrections. The supervised training used Treebank software. The AMED/POST combination improved segmentation and tagging performance with little data; this approach will be extended to parsing in the future.</Paragraph>
  </Section>
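The learn-then-correct loop above can be sketched in miniature. This is only an illustration of supervised segment correction in the spirit of AMED, under assumed data structures; the real system's rule language, chart of corrections, and tag-derived transformations are not reproduced here, and only the "put words together" case is shown.

```python
from collections import Counter

def learn_merge_rules(pairs, min_count=1):
    """From (system segmentation, hand-corrected segmentation) pairs, record
    adjacent system segments that the annotator joined into one word."""
    counts = Counter()
    for system, gold in pairs:
        gold_words = set(gold)
        for a, b in zip(system, system[1:]):
            if a + b in gold_words:
                counts[(a, b)] += 1
    return {pair for pair, n in counts.items() if n >= min_count}

def apply_rules(segments, rules):
    """Apply learned merge rules left to right over a new segmentation."""
    out, i = [], 0
    while i < len(segments):
        if i + 1 < len(segments) and (segments[i], segments[i + 1]) in rules:
            out.append(segments[i] + segments[i + 1])
            i += 2
        else:
            out.append(segments[i])
            i += 1
    return out

# Toy training data: Juman-style over-segmentation vs. hand-corrected gold.
rules = learn_merge_rules([(["東", "京"], ["東京"]), (["大", "阪"], ["大阪"])])
print(apply_rules(["東", "京", "へ"], rules))
```

Even this toy version shows why little training data can help: each hand-corrected example directly yields a reusable correction rule rather than a statistic that needs many observations.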
  <Section position="4" start_page="341" end_page="341" type="metho">
    <SectionTitle>
SOME ASPECTS OF PRINCIPLE-BASED PARSING IN THE
MUC-5 TASK
</SectionTitle>
    <Paragraph position="0"> Robert Belvin described the use of principle-based parsing in LSI's parser, in particular the use of principles of grammatical theory and parsing principles based on empirical knowledge of language. LSI's parser incorporates a number of features of the Government-Binding theory of syntax, including projection and thematic principles, in an essentially head-driven parser with bottom-up and expectation-based characteristics. The parser is designed to be language independent, to produce syntactic structures which facilitate semantic processing, and to be robust enough to produce partial parses which are usable in later semantic processing. Robert discussed the handling of empty categories with respect to passive constructions and embedded infinitivals. Robert concluded with a discussion of the insertion of special structures as a means of providing a &amp;quot;quick fix&amp;quot; for constructions which are not completely handled by the principles which have been implemented.</Paragraph>
  </Section>
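The head-driven, bottom-up character described above can be suggested with a toy projection step. This is emphatically not LSI's parser: the category names, the projection table, and the fallback `FRAG` category for partial parses are all assumptions, meant only to show how a head projects a phrase and how robustness can mean emitting fragments rather than failing.

```python
# Toy X-bar-style projection table (assumed, not LSI's): a lexical head of
# category X projects a maximal phrase XP.
PROJECTIONS = {"N": "NP", "V": "VP", "P": "PP"}

def project(tagged):
    """Bottom-up pass: non-head words are held as expectations and attached
    when the next lexical head projects its phrase."""
    phrases, pending = [], []
    for word, tag in tagged:
        if tag in PROJECTIONS:
            phrases.append((PROJECTIONS[tag], pending + [word]))
            pending = []
        else:
            pending.append(word)
    if pending:
        # Robustness: leftover material becomes a usable partial parse
        # rather than a parse failure.
        phrases.append(("FRAG", pending))
    return phrases

print(project([("the", "D"), ("parser", "N"), ("ran", "V")]))
```

The point of the sketch is the division of labor: the projection principle supplies the phrase structure, while the expectation mechanism decides attachment, mirroring the "grammatical principles plus parsing principles" split in the talk.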
  <Section position="5" start_page="341" end_page="342" type="metho">
    <SectionTitle>
DEALING WITH AMBIGUITY
</SectionTitle>
    <Paragraph position="0"> Jim Cowie discussed experiences with NMSU's reference resolution module in their Diderot system. The system attempted to disambiguate text into a list of sense tokens. Disambiguation was performed in parsing and semantic tagging stages. Tagging was done using word-lists with semantic and type tags. Parsing used tags in conjunction with co-specification patterns. Jim discussed various problems which occurred, including a lack of sense tokens for Japanese, multiple-tagging problems, the need for a lexical database featuring compound terms, the need for domain-specific markers, the need for combinatorial rules, and the need for negative blocking information. Future experiments will focus on the use of machine learning techniques for acquiring semantic tagging information and deriving semantic patterns.</Paragraph>
  </Section>
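The word-list tagging stage and its multiple-tagging problem can be sketched as follows. The word-lists and sense-token names here are invented for illustration and are not Diderot's actual lexicon; the sketch only shows why a word carrying several sense tokens leaves residual ambiguity for later stages.

```python
# Assumed toy word-lists mapping surface words to candidate sense tokens.
WORD_LISTS = {
    "plant": ["facility", "flora"],  # multiple tags: unresolved at this stage
    "factory": ["facility"],
    "rose": ["flora"],
}

def tag(words):
    """Semantic tagging stage: attach all candidate sense tokens to each word."""
    return [(w, WORD_LISTS.get(w, ["unknown"])) for w in words]

def ambiguous(tagged):
    """Words still carrying more than one sense token after tagging."""
    return [w for w, senses in tagged if len(senses) > 1]

tagged = tag(["the", "plant", "closed"])
print(ambiguous(tagged))
```

In this framing, the problems listed in the summary are gaps in the word-lists themselves: missing sense tokens (here surfaced as "unknown"), missing compound entries, and no negative information to block a wrong tag, which is what the later parsing and co-specification stage would need.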
</Paper>