<?xml version="1.0" standalone="yes"?> <Paper uid="M95-1012"> <Title>MITRE: DESCRIPTION OF THE ALEMBIC SYSTEM USED FOR MUC-6</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> ALEMBIC'S OVERALL ARCHITECTURE </SectionTitle> <Paragraph position="0"> For all the changes that the system has undergone, the coarse architecture of the MUC-6 version of Alembic is remarkably close to that of its predecessors. As illustrated in Fig. 1, below, processing is still divided into three main steps: a UNIX- and C-based preprocess, a Lisp-based syntactic analysis, and a Lisp-based inference phase. Beyond these coarse-grain similarities, the system diverges significantly from earlier incarnations. We replaced our categorial grammar pseudo-parser, as suggested above. We also redesigned the preprocess from the ground up. Only the inferential back end of the system is largely unchanged.</Paragraph> <Paragraph position="1"> The internal module-by-module architecture of the current Alembic is illustrated in Fig. 2, below. The central innovation in the system is its approach to syntactic analysis, which is now performed through a sequence of phrase-finding rules that are processed by a simple interpreter. The interpreter has somewhat less recognition power than a finite-state machine, and operates by successively relabeling the input according to the rule actions--more on this below. In support of the syntactic phrase finder, or phraser as we call it, the input text must be tagged for part-of-speech.
This part-of-speech tagging is the principal role of the UNIX preprocess, and it is itself supported by a number of pretaggers (e.g., for labeling dates and title words) and zoners (e.g., for word tokenization, sentence boundary determination and headline segmentation).</Paragraph> <Paragraph position="2"> The phrases that are parsed by the phraser are subsequently mapped to facts in the inferential database, a mapping mediated by a simple semantic interpreter. We then exploit inference to instantiate domain constraints and resolve restricted classes of coreference. The inference system also supports equality reasoning by congruence closure, and this equality machinery is in turn exploited to perform TE-specific processing, in particular, acronym and alias merging. Finally, the template generation module forms the final TE and ST output by a roughly one-to-one mapping from facts in the inferential database to templates.</Paragraph> </Section></Paper>