<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1101">
  <Title>Segmentation of Chinese Long Sentences Using Commas</Title>
  <Section position="4" start_page="3" end_page="3" type="intro">
    <SectionTitle>
2 Related Work
2.1 Related Work for Clause Segmentation
</SectionTitle>
    <Paragraph position="0"> Syntactic ambiguity increases drastically as the input sentence becomes longer, and segmenting long sentences is one way to mitigate the problem. Many studies have addressed clause segmentation (Carreras and Marquez, 2002; Leffa, 1998; Sang and Dejean, 2001). In addition, many studies have segmented long sentences by means of certain patterns (Kim and Zhang, 2001; Li and Pei, 1990; Palmer and Hearst, 1997).</Paragraph>
    <Paragraph position="1"> However, some researchers simply ignore punctuation, including the comma, while others use the comma as only one feature among several for detecting segmentation points and so do not fully exploit the information it carries.</Paragraph>
    <Section position="1" start_page="3" end_page="3" type="sub_section">
      <SectionTitle>
2.2 Related Work for Punctuation Processing
</SectionTitle>
      <Paragraph position="0"> Several researchers have provided descriptive treatments of the role of punctuation: Jones (1996b) determined the syntactic function of punctuation marks, and Bayraktar and Akman (1998) classified commas by means of the syntax-patterns in which they occur. However, theoretical forays into the syntactic roles of punctuation have been limited.</Paragraph>
      <Paragraph position="1"> Many researchers have used punctuation marks in syntactic analysis and argue that punctuation carries useful information. Jones (1994) shows that a grammar with punctuation outperforms one without punctuation. Briscoe and Carroll (1995) also show the importance of punctuation in reducing syntactic ambiguity. Collins (1999), in his statistical parser, treats the comma as an important feature. Shiuan and Ann (1996) separate complex sentences with respect to link words, including the comma. As a result, their syntactic parser achieves an error reduction of 21.2%.</Paragraph>
      <Paragraph position="2"> Say (1997) provides a detailed introduction to using punctuation for a variety of other natural language processing tasks.</Paragraph>
      <Paragraph position="3"> All of these approaches show that analyzing punctuation improves performance on various natural language processing tasks, especially complex sentence segmentation.</Paragraph>
    </Section>
  </Section>
</Paper>