<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2053">
  <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. Towards the Orwellian Nightmare: Separation of Business and Personal Emails</Title>
  <Section position="8" start_page="409" end_page="410" type="concl">
    <SectionTitle>
7 Conclusion
</SectionTitle>
    <Paragraph position="0"> This paper describes the process of creating an email corpus annotated with business or personal labels. By measuring inter-annotator agreement, it shows that this process was successful. Furthermore, analysing the disagreements in the fine categories has allowed us to characterise the areas where the business/personal decision is difficult.</Paragraph>
    <Paragraph position="1"> In general, the separation of business and personal mails is a task that humans can perform. Part of the project has identified the areas where humans cannot make this distinction (as demonstrated by inter-annotator agreement scores), and one would not expect machines to perform the task under these conditions either. In all other cases, where the language is not ambiguous as judged by human annotators, the challenge for automatic classifiers is to match this performance.</Paragraph>
    <Paragraph position="2"> Some initial results were reported for machines attempting exactly this task. They showed that the system achieved accuracy almost as high as human agreement. Further work, using much larger sets and incorporating all types of business and personal emails, is the next logical step.</Paragraph>
    <Paragraph position="3"> Any annotation project will encounter problems in deciding appropriate categories. This paper described the various stages of evolving these categories to a point where they are both intuitive and logical and also produce respectable inter-annotator agreement scores. Work is still in progress to ensure maximal consistency within the data set and to refine the precise definitions of the categories to avoid possible overlaps.</Paragraph>
  </Section>
</Paper>