<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-5001">
  <Title>Support Vector Machines for Paraphrase Identification and Corpus Construction</Title>
  <Section position="6" start_page="6" end_page="7" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have shown that supervised machine learning techniques such as SVMs can significantly expand available paraphrase corpora and reduce noise, as measured by AER on non-identical words.</Paragraph>
    <Paragraph position="1"> Although the present research has focused on &quot;ready-made&quot; news clusters found on the web, nothing in this paper depends on the availability of such clusters. Given standard clustering techniques, the approach that we have described for inductive classifier learning should in principle be applicable to any flat corpus that contains multiple sentences expressing similar content. We also expect that the techniques described here could be extended to identify bilingual sentence pairs in comparable corpora, helping automate the construction of corpora for machine translation.</Paragraph>
    <Paragraph position="2"> The ultimate test of paraphrase identification technologies lies in applications. These are likely to be in fields such as extractive multi-document summarization, where paraphrase detection might eliminate sentences with comparable content, and Question Answering, both for identifying sentence pairs with comparable content and for generating unique new text. Such practical applications will only be possible once large corpora are available to permit the development of robust paraphrase models on the scale of the best SMT models. We believe that the corpus construction techniques that we have described here represent an important contribution to this goal.</Paragraph>
    <Paragraph position="3"> young female chimps learn skills earlier , spend more time studying and tend to do better than young male chimpanzees - at least when it comes to catching termites . young female chimpanzees are better students than males , at least when it comes to catching termites , according to a study of wild chimps in tanzania 's gombe national park . Paraphrase (accepted)</Paragraph>
    <Paragraph position="4"> a %%number%% -year-old girl was arrested , handcuffed and taken into custody on charges of stealing a rabbit and a small amount of money from a neighbor 's home . sheriff 's deputies in pasco county , fla. , this week handcuffed and questioned a %%number%% -year-old girl who was accused of stealing a rabbit and %%money%% from a neighbor 's</Paragraph>
    <Paragraph position="5"> roy moore , the chief justice of alabama , installed the two-ton sculpture in the rotunda of his courthouse in montgomery , and has refused to remove it . the eight associate justices of alabama 's supreme court voted unanimously %%day%% to overrule moore and comply with u.s. district judge myron thompson 's order to remove the monu-</Paragraph>
  </Section>
</Paper>