File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/04/w04-1101_abstr.xml
Size: 1,083 bytes
Last Modified: 2025-10-06 13:43:49
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1101"> <Title>Segmentation of Chinese Long Sentences Using Commas</Title> <Section position="2" start_page="3" end_page="3" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> The comma is the most common form of punctuation. As such, it may have the greatest effect on the syntactic analysis of a sentence. Because Chinese is an isolating language, its sentences offer fewer cues for parsing, and the cues for segmenting a long Chinese sentence are fewer still. However, the average frequency of comma usage is higher in Chinese than in other languages, so the comma plays an important role in long Chinese sentence segmentation. This paper proposes a method that classifies commas in Chinese sentences by their context and then segments a long sentence according to the classification results. Experimental results show that comma classification accuracy reaches 87.1 percent, and that with our segmentation model, our parser's dependency parsing accuracy improves by 9.6 percent.</Paragraph> </Section> </Paper>