File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/98/w98-1118_concl.xml
Size: 3,320 bytes
Last Modified: 2025-10-06 13:58:15
<?xml version="1.0" standalone="yes"?> <Paper uid="W98-1118"> <Title>Exploiting Diverse Knowledge Sources via Maximum Entropy in Named Entity Recognition</Title> <Section position="12" start_page="158" end_page="159" type="concl"> <SectionTitle> 10 CONCLUSIONS AND FUTURE WORK </SectionTitle> <Paragraph position="0"> MENE is a very new and, we feel, still immature system. Work started in October 1997, and the system described above was not in place until mid-February 1998. We believe that we can push the score of the MENE-only system higher by incorporating long-range reference resolution on MENE's output. We are also missing a large number of acronyms which could be picked up by dynamically building them from entities which MENE had tagged elsewhere and then pulling that data in as a new class of feature. The other key element missing from the current system is a set of general compound features, which, as discussed above, would require the use of a more sophisticated feature selection algorithm. All three of these elements are present in systems such as IsoQuest's (Krupka and Hausman, 1998), and their absence from MENE probably explains much of the reason why the MENE-only system failed to perform at the state of the art. We intend to add all of these elements to MENE in the near future to test this hypothesis.</Paragraph> <Paragraph position="1"> Nevertheless, we believe that we have already demonstrated some very useful results. MENE is highly portable, as we have already demonstrated with our result on upper-case English text, and even in its current state its results are already comparable to those of the only other purely statistical English NE system of which we are aware (Miller et al., 1998). 
As shown by our result on running MENE with only the lexical features that it learns from the training corpus, porting MENE can be done with very little effort if appropriate training data is provided; it isn't even necessary to provide it with dictionaries to generate an acceptable result.</Paragraph> <Paragraph position="2"> We are working on a port to Japanese NE to further demonstrate MENE's flexibility.</Paragraph> <Paragraph position="3"> However, we believe that the results on combining MENE with other systems are some of the most intriguing. We hypothesize that, given sufficient training data, any hand-coded system would benefit from having its output passed to MENE as a final step. MENE also opens up new avenues for collaboration whereby different organizations could focus on different aspects of the problem of NE recognition, with the maximum entropy system acting as an arbitrator. MENE also offers the prospect of achieving very high performance with very little effort. Since MENE starts out with a fairly high base score just on its own, we speculate that a MENE user could then construct a hand-coded system which focused only on MENE's weaknesses, while skipping the areas in which MENE is already strong.</Paragraph> <Paragraph position="4"> Finally, one can imagine a user acquiring licenses to several different NE systems, generating some training data, and then combining it all under a MENE-like system. We have shown that this approach can yield performance which is competitive with that of a human tagger.</Paragraph> </Section> </Paper>