<?xml version="1.0" standalone="yes"?> <Paper uid="W04-1203"> <Title>Analysis of Link Grammar on Biomedical Dependency Corpus Targeted at Protein-Protein Interactions</Title> <Section position="10" start_page="19" end_page="20" type="concl"> <SectionTitle> 8 Conclusion </SectionTitle> <Paragraph position="0"> We have presented an analysis of Link Grammar (LG) performance using a custom dependency corpus targeted at protein-protein interactions.</Paragraph> <Paragraph position="1"> We introduced the concept of the interaction subgraph and reported parser performance for three criteria: recovery of dependencies, interaction subgraphs and fully correct linkages. While LG was able to recover 73% of dependencies in the first linkage, only 7% of sentences had a fully correct first linkage. However, fully correct linkages are not required for information extraction, and we found that 25% of interaction subgraphs were recovered in the first linkage.</Paragraph> <Paragraph position="2"> Resource exhaustion was found to be a significant cause of poor performance. Furthermore, an evaluation assuming optimal heuristics for ordering linkages indicated that the fraction of recovered interaction subgraphs could be more than doubled (to 57%).</Paragraph> <Paragraph position="3"> To further analyze the cases where the parser cannot produce a correct linkage, we carefully examined the sentences and were able to identify five problem types. For each identified case, we discussed potential modifications for addressing the problems. 
We also considered the possibility of using a named entity recognition system to improve parser performance and found that 28% of LG failures would be avoided by a flawless named entity recognition system.</Paragraph> <Paragraph position="4"> We evaluated the effect of the dictionary extension proposed by Szolovits (2003), and found that while it significantly reduced ambiguity and improved performance for the most ambiguous sentences, overall improvement was only 2.5%. This indicates that extending the dictionary is not sufficient to address the performance problems and that modifications to the grammar and parser are necessary.</Paragraph> <Paragraph position="5"> The quantitative analysis of LG performance confirms that, in its current state, LG is not well suited to the IE task discussed. However, in the failure analysis we have identified a number of specific issues and problematic areas for LG in parsing biomedical publications, and suggested improvements for adapting the parser to this domain. The examination and implementation of these improvements is a natural follow-up of this study. Our initial experiments suggest that it is indeed possible to implement general solutions to many of the discussed problems, and such modifications would be expected to lead to improved applicability of LG to the biomedical domain.</Paragraph> </Section> </Paper>