<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3305">
<Title>A Priority Model for Named Entities</Title>
<Section position="7" start_page="123" end_page="123" type="evalu">
<SectionTitle> 4 Results </SectionTitle>
<Paragraph position="0"> We ran all three methods on the SemCat sets gp1, gp2, and gp3. Results are shown in Table 2. For evaluation we applied the standard information retrieval measures: precision, recall, and F-measure.</Paragraph>
<Paragraph position="1"> For name classification, rel_ret refers to true positive entities, non-rel_ret to false positive entities, and rel_not_ret to false negative entities.</Paragraph>
</Section>
<Section position="8" start_page="123" end_page="123" type="evalu">
<SectionTitle> 5 Discussion </SectionTitle>
<Paragraph position="0"> Using a variable-order Markov model for strings improved the results for all methods (results not shown). The gp1-3 results are similar within each method, yet it is clear that the overall performance of these methods is PM > PCFG-8 > LM > PCFG-3. The very large size of the database and the very uniform results obtained over the three independent random splits of the data support this conclusion.
The improvement of PCFG-8 over PCFG-3 can be attributed to the considerable ambiguity in this domain. Since there are many cases of term overlap in the training data, a grammar incorporating some of this ambiguity should outperform one that does not. In PCFG-8, additional production rules allow phrases beginning as CATPs to be overall NotCATPs, and vice versa.</Paragraph>
<Paragraph position="1"> The Priority Model outperformed all other methods on F-measure. This supports our impression that the right-most words in a name should be given higher priority when classifying names. A decrease in performance is expected when applying this model to the named entity recognition (NER) task, since the model is based on terminology alone and not on the surrounding natural language text.
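(As an illustrative aside, not part of the original paper: the standard measures applied in Section 4 follow directly from the counts defined there, with rel_ret as true positives, non-rel_ret as false positives, and rel_not_ret as false negatives. The counts below are hypothetical.)

```python
def prf(rel_ret, non_rel_ret, rel_not_ret):
    """Return (precision, recall, F-measure) for one entity class.

    rel_ret      -- true positives
    non_rel_ret  -- false positives
    rel_not_ret  -- false negatives
    """
    precision = rel_ret / (rel_ret + non_rel_ret)
    recall = rel_ret / (rel_ret + rel_not_ret)
    # Balanced F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical counts, for illustration only.
p, r, f = prf(rel_ret=80, non_rel_ret=20, rel_not_ret=20)
print(round(p, 2), round(r, 2), round(f, 2))
```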
In our classification experiments, there is no context, so disambiguation is not an issue. However, applying our model to NER will require addressing this problem.</Paragraph>
<Paragraph position="2"> SemCat has not been tested for accuracy, but we retain a set of manually assigned scores that attest to the reliability of each contributing list of terms. Table 2 indicates that good results can be obtained even with noisy training data.</Paragraph>
</Section>
</Paper>