<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1304"> <Title>Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics, pages 25-31, Detroit, June 2005. ©2005 Association for Computational Linguistics A Machine Learning Approach to Acronym Generation</Title> <Section position="11" start_page="27" end_page="29" type="evalu"> <SectionTitle> 5 Experiments </SectionTitle> <Paragraph position="0"> To evaluate the performance of the acronym generation method presented in the previous section, we ran five-fold cross-validation experiments using the manually curated data set. The data set consists of 1,901 definition-acronym pairs.</Paragraph> <Paragraph position="1"> For comparison, we also tested the popular heuristic for acronym generation, in which the first letter of each word in the definition is taken and capitalized.</Paragraph> <Section position="1" start_page="28" end_page="29" type="sub_section"> <SectionTitle> 5.1 Generated Acronyms </SectionTitle> <Paragraph position="0"> Tables 2 to 5 show some examples of generated acronyms together with their probabilities. They are sorted by probability, and the top ten acronyms are shown. The correct acronym given in the training data is shown in the bottom row of each table.</Paragraph> <Paragraph position="1"> In Table 2, the definition is &quot;traumatic brain injury&quot; and the correct acronym is &quot;TBI&quot;. This is the simplest case in acronym generation, where the first letter of each word in the definition is to be capitalized. Our acronym generator assigns a high probability to the correct acronym, which is ranked at the top. Table 3 shows a slightly more complex case, where the generator needs to convert the space between 'F' and '1' into a hyphen. 
The correct answer is ranked third.</Paragraph> <Paragraph position="2"> The definition in Table 4 is &quot;RNA polymerase&quot; and the correct acronym is &quot;RNAP&quot;, so the generator needs to leave the first three letters unchanged. The correct answer is ranked fourth, and its probability is not far from that of the top-ranked acronym.</Paragraph> <Paragraph position="3"> Table 5 shows a more difficult case, where the generator needs to output the first letter in lowercase and choose appropriate letters from a string containing no delimiters (e.g. spaces or hyphens). Our acronym generator outputs the correct acronym at the ninth rank, but the probability assigned to it is very low compared to that of the top-ranked string.</Paragraph> <Paragraph position="4"> Table 6 shows a similar case. The probability given to the correct acronym is very low.</Paragraph> </Section> <Section position="2" start_page="29" end_page="29" type="sub_section"> <SectionTitle> 5.2 Coverage </SectionTitle> <Paragraph position="0"> Table 7 shows what percentage of the correct acronyms is covered if we take the top N candidates output by the acronym generator.</Paragraph> <Paragraph position="1"> The bottom row (BASELINE) shows the coverage achieved by generating a single acronym with the standard heuristic rule for acronym generation. Note that the coverage achieved with a single candidate (Rank 1) is better than that of BASELINE.</Paragraph> <Paragraph position="2"> If we take the top five candidates, we achieve a coverage of 75.4%, which is considerably better than that achieved by the heuristic rule. 
This suggests that the acronym generator could be used to significantly improve the performance of systems for information retrieval and information integration.</Paragraph> </Section> <Section position="3" start_page="29" end_page="29" type="sub_section"> <SectionTitle> 5.3 Features </SectionTitle> <Paragraph position="0"> To evaluate how individual feature types affect generation performance, we ran experiments using different feature types. Table 8 shows the results. Overall, they show that the various feature types have been successfully incorporated into the MEMM model and that each type contributes to improving performance.</Paragraph> <Paragraph position="1"> The performance achieved with only unigram features is almost the same as that achieved by the heuristic rule. Note that the features on the previous state improve performance, suggesting that our choice of states in the Markov model is a reasonable one for this task.</Paragraph> </Section> <Section position="4" start_page="29" end_page="29" type="sub_section"> <SectionTitle> 5.4 Learning Curve </SectionTitle> <Paragraph position="0"> Figure 2 shows the learning curve of our acronym generator: the relationship between the number of training samples and the performance of the system. The curve clearly indicates that performance improves consistently as the training data grows, and that it is still improving when the full training set is used. This suggests that increasing the amount of annotated training data would yield further gains.</Paragraph> </Section> </Section></Paper>
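The BASELINE heuristic evaluated in Section 5.2 (take the first letter of each word in the definition and capitalize it) can be sketched as follows. This is a minimal illustration of the heuristic as described, not the authors' code; the function name and whitespace tokenization are our assumptions:

```python
def baseline_acronym(definition: str) -> str:
    """Standard heuristic: the first letter of each whitespace-separated
    word in the definition, uppercased (hypothetical helper)."""
    return "".join(word[0].upper() for word in definition.split())

# The simple case from Table 2:
print(baseline_acronym("traumatic brain injury"))  # TBI
```

The sketch also makes the heuristic's limits visible: for the Table 4 definition "RNA polymerase" it yields "RP" rather than the correct "RNAP", which is exactly the kind of case the learned generator handles.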
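The top-N coverage metric of Table 7 (the fraction of pairs whose correct acronym appears among the top N generated candidates) can be computed as follows. This is our own sketch under stated assumptions; the function name and the toy candidate lists are hypothetical, not the paper's data:

```python
def coverage_at_n(ranked_candidates, gold_acronyms, n):
    """Fraction of definition-acronym pairs whose gold acronym
    appears among the top-n ranked candidates (illustrative helper)."""
    hits = sum(gold in ranked[:n]
               for ranked, gold in zip(ranked_candidates, gold_acronyms))
    return hits / len(gold_acronyms)

# Toy illustration with made-up candidate rankings:
ranked = [["TBI", "TB", "Tbi"], ["RP", "RNAP", "RNAp"]]
gold = ["TBI", "RNAP"]
print(coverage_at_n(ranked, gold, 1))  # 0.5
print(coverage_at_n(ranked, gold, 2))  # 1.0
```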