<?xml version="1.0" standalone="yes"?> <Paper uid="W03-1609"> <Title>Paraphrase Acquisition for Information Extraction</Title> <Section position="8" start_page="0" end_page="0" type="concl"> <SectionTitle> 7 Conclusions </SectionTitle> <Paragraph position="0"> In this paper, we described a method for obtaining paraphrases automatically from corpora. Our key idea is to use comparable articles that report the same event on the same day. Some noun phrases, especially Extended Named Entities such as names, locations and numbers, are preserved across articles even when the event is reported using different expressions. We used these noun phrases as anchors and extracted the portions of sentences that share them. We then generalized the extracted expressions into usable paraphrases.</Paragraph> <Paragraph position="1"> We adopted dependency trees as a format for expressions that preserves syntactic constraints when extracting paraphrases. We generate possible subtrees from the dependency trees and find pairs that share the anchors. However, simply generating all subtrees yields many inappropriate portions of sentences. We tackled this problem by calculating a score that indicates how plausible each extracted candidate is. We confirmed that this score contributed to the overall accuracy. It was also useful for trimming the search space when matching subtrees. In addition, we used a simple coreference resolver to handle some additional anchors such as pronouns.</Paragraph> </Section> </Paper>