<?xml version="1.0" standalone="yes"?>
<Paper uid="A97-1028">
  <Title>Language Chinese English French Japanese Portuguese</Title>
  <Section position="6" start_page="192" end_page="192" type="concl">
    <SectionTitle>
5 Discussion
</SectionTitle>
    <Paragraph position="0"> The results of this analysis indicate that it is possible to perform much of the task of named-entity recognition with a very simple analysis of the strings composing the NE phrases; even more is possible with an additional inspection of the common phrasal contexts. The underlying principle is Zipf's Law; due to the prevalence of very frequent phenomena, a little effort goes a long way and very high scores can be achieved directly from the training data. Yet according to the same Law that gives us that initial high score, incremental advances above the baseline can be arduous and very language specific. Such improvement can most certainly only be achieved with a certain amount of well-placed linguistic intuition.</Paragraph>
    <Paragraph position="1"> The analysis also demonstrated large differences among languages for the NE task, suggesting that we need to examine not only the overall score but also the ability to surpass the limitations of word lists, especially since extensive lists are available in very few languages. It is particularly important to evaluate system performance beyond a lower bound, such as that proposed in Section 4. Since baseline scores will differ across languages and corpora, scores on different corpora that appear equal may not be comparable.</Paragraph>
  </Section>
</Paper>