<?xml version="1.0" standalone="yes"?> <Paper uid="H05-1106"> <Title>Language &amp; Information Engineering</Title> <Section position="6" start_page="848" end_page="849" type="concl"> <SectionTitle> 5 Conclusions </SectionTitle> <Paragraph position="0"> We proposed a new terminology extraction method and showed that it significantly outperforms two of the standard approaches in distinguishing terms from non-terms in the biomedical literature. While mining scientific literature for new terminological units and assembling those in controlled vocabularies is a task involving several components, one essential building block is to measure the degree of termhood of a candidate. In this respect, our study has shown that a criterion which incorporates a vital linguistic property of terms, viz. their limited paradigmatic modifiability, is much more powerful than linguistically less informed measures. This is in line with our previous work on general-language collocation extraction (Wermter and Hahn, 2004), in which we showed that a linguistically motivated criterion based on the limited syntagmatic modifiability of collocations likewise outperforms alternative standard association measures.</Paragraph> <Paragraph position="1"> We also collected evidence that the superiority of the P-Mod method relative to other term extraction approaches holds independently of the underlying corpus size (given a reasonable offset). This is a crucial finding because other domains might lack large volumes of free-text material but still provide sufficient corpus sizes for valid term extraction. Finally, since we only require shallow syntactic analysis (in terms of NP chunking), our approach may well be easily portable to other domains. Hence, we may conclude that, although our methodology has been tested on the biomedical domain only, there are essentially no inherent domain-specific restrictions.</Paragraph> <Paragraph position="2"> Acknowledgements. 
This work was partly supported by the European Network of Excellence Semantic Mining in Biomedicine (NoE 507505).</Paragraph> </Section></Paper>