<?xml version="1.0" standalone="yes"?> <Paper uid="X93-1019"> <Title>Name:\[Tie Up between JGC CORP, C. ITOH AND CO, NISSHO IWAI CORP, and F,e Object Slot Name: I ENTITY Ch~ Confidence\] \[Change Offsets\] \[ New Object \] ~ \[A Company In JAPAN \[\] r \[~\]~\]:g:E~Zjj\]j \[\] = \[c:aTO \[\] \[\] \[NISSHO IWAI CORP</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. KEY SYSTEM FEATURES </SectionTitle> <Paragraph position="0"> Three key design features distinguish PLUM from other approaches: statistical language modeling, learning algorithms and partial understanding. The first key feature is the use of statistical modeling to guide processing. For the version of PLUM used in MUC-5, part of speech information was determined by using well-known Markov modeling techniques embodied in BBN's part-of-speech tagger POST \[5\]. We also used a correction model, AMED \[3\], for improving Japanese segmentation and part-of-speech tags assigned by JUMAN. For the microelectronics domain, we used a probabilistic model to help identify the role of a company in a capability (whether it is a developer, user, etc.). Statistical modeling in PLUM contributes to portability, robustness, and trainability.</Paragraph> <Paragraph position="1"> algorithms. We feel the key to portability of a data extraction system is automating the acquisition of the knowledge bases that need to change for a particular language or application. For the MUC-5 applications we used learning algorithms to train POST, AMED, and the template-filler model mentioned above. We also used a statistical learning algorithm to learn case frames for verbs from examples (the algorithm and empirical results are in \[4\]).</Paragraph> <Paragraph position="2"> A third key feature is partial understanding, by which we mean that all components of PLUM are designed to operate on partially interpretable input, taking advantage of information when available, and not failing when information is unavailable. Neither a complete grammatical analysis nor complete semantic interpretation is required. The system finds the parts of the text it can understand and pieces together a model of the whole from those parts and their context.</Paragraph> </Section> class="xml-element"></Paper>