<?xml version="1.0" standalone="yes"?> <Paper uid="W04-0509"> <Title>Analysis of Semantic Classes in Medical Text for Question Answering</Title> <Section position="3" start_page="1" end_page="1" type="intro"> <SectionTitle> 2 Identifying Semantic Classes in Medical Text </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 2.1 Diseases and Medications </SectionTitle> <Paragraph position="0"> The identification of named entities (NEs) in the biomedical area, such as PROTEINS and CELLS, has been extensively explored; e.g., Lee et al. (2003), Shen et al. (2003). However, we are not aware of any satisfactory solution that focuses on the recognition of semantic classes such as MEDICATION and DISEASE. To straightforwardly identify DISEASE and MEDICATION in the text, we use the knowledge base Unified Medical Language System (UMLS) (Lindberg et al., 1993) and the software MetaMap (Aronson, 2001).</Paragraph> <Paragraph position="1"> UMLS contains three knowledge sources: the Metathesaurus, the Semantic Network, and the Specialist Lexicon. Given an input sentence, MetaMap separates it into phrases, identifies the medical concepts embedded in the phrases, and assigns proper semantic categories to them according to the knowledge in UMLS. For example, for the phrase immediate systemic anticoagulants, MetaMap identifies immediate as a TEMPORAL CONCEPT, systemic as a FUNCTIONAL CONCEPT, and anticoagulants as a PHARMACOLOGIC SUBSTANCE. More than one semantic category in UMLS may correspond to MEDICATION or DISEASE. For example, either a PHARMACOLOGIC SUBSTANCE or a THERAPEUTIC OR PREVENTIVE PROCEDURE can be a MEDICATION; either a DISEASE OR SYNDROME or a PATHOLOGIC FUNCTION can be a DISEASE.</Paragraph> <Paragraph position="2"> We use some training text to find the mapping between UMLS categories and the two semantic classes in the treatment scenario. 
The training text was tagged for us by a clinician to mark DISEASE and MEDICATION. It was also processed by MetaMap. After that, the annotated text was compared with the output of MetaMap to find the corresponding UMLS categories. Medical text containing these categories can then be identified as either MEDICATION or DISEASE. In the example above, anticoagulants will be taken as a MEDICATION. The problem of identification of medical terminology is still a big challenge in this area. MetaMap does not provide a full solution to it. For cases in which the output of MetaMap is not consistent with the judgment of the clinician who annotated our text, our decisions rely on the latter.</Paragraph> </Section> <Section position="2" start_page="1" end_page="1" type="sub_section"> <SectionTitle> 2.2 Clinical Outcome </SectionTitle> <Paragraph position="0"> The task of identifying clinical outcomes is more complicated. Outcomes are often not just noun phrases; instead, they usually are expressed in complex syntactic structures. The following are some examples: (1) Thrombolysis reduces the risk of dependency, but increases the risk of death. (2) The median proportion of symptom free days improved more with salmeterol than with placebo.</Paragraph> <Paragraph position="1"> In our analysis of the text, we found another type of outcome which is also very important: the outcome of clinical trials: (3) Several small comparative RCTs [randomized clinical trials] have found sodium cromoglicate to be less effective than inhaled corticosteroids in improving symptoms and lung function.</Paragraph> <Paragraph position="2"> (4) In the systematic review of calcium channel antagonists, indirect and limited comparisons of intravenous versus oral administration found no significant difference in adverse events.</Paragraph> <Paragraph position="3"> We treat these as a special type of clinical outcome. 
For convenience, we refer to them as &quot;results&quot; in the following description when necessary. A &quot;result&quot; might contain a clinical outcome within it, as results often involve a comparison of the effects of two (or more) interventions on a disease.</Paragraph> <Paragraph position="4"> In medical text, the appearance of some words is often found to be a signal of the occurrence of an outcome, and usually several words signal the occurrence of one single outcome. The combination approach that we applied for identifying outcomes is based on this observation. Our approach does not extract the whole outcome at once. Instead, it tries to identify the different parts of an outcome that may be scattered in the sentence, and then combines them to form the complete outcome.</Paragraph> <Paragraph position="5"> Rule-based methods and machine-learning approaches have been used for similar problems. Gildea and Jurafsky (2002) used a supervised learning method to learn both the identifier of the semantic roles defined in FrameNet, such as theme, target, and goal, and the boundaries of the roles (Baker et al., 2003). A set of features was learned from a large training set, and then applied to the unseen data to detect the roles. The performance of the system was quite good. However, it requires a large training set for related roles, which is not available in many tasks, including tasks in the medical area.</Paragraph> <Paragraph position="6"> Rule-based methods are explored in information extraction (IE) to identify roles to fill slots in some pre-defined templates (Català et al., 2003). The rules are represented by a set of patterns, and template role identification is usually conducted by pattern matching. Slots indicating roles are embedded in these patterns. Text that satisfies the constraints of a pattern will be identified, and the contents corresponding to the slots are extracted. This approach has proven effective in many IE tasks. 
However, pattern construction is very time-consuming, especially for complicated phrasings. In order to select the roles and only the roles, their expression has to be customized specifically in patterns. This results in increasing difficulties in pattern construction, and reduces the coverage of the patterns.</Paragraph> <Paragraph position="7"> Different pieces of an outcome are identified by various cue words. Each occurrence of a cue word suggests a portion of the expression of the outcome. Detecting all of them will increase the chance of obtaining the complete outcome. Also, different occurrences of cue words provide more evidence of the existence of an outcome.</Paragraph> <Paragraph position="8"> The first step of the combination approach is to collect the cue words. Two sections of CE (stroke management, asthma in children) were analyzed for detection of outcome. The text was annotated by a clinician in the EpoCare project. About two-thirds of each section (267 sentences in total) was taken as the analysis examples for collecting the cue words, and the rest (156 sentences) as the test set. Some words we found in the analysis are the following: Nouns: death, benefit, dependency, outcome, evidence, harm, difference.</Paragraph> <Paragraph position="9"> Verbs: improve, reduce, prevent, produce, increase. Adjectives: beneficial, harmful, negative, adverse, superior.</Paragraph> <Paragraph position="10"> After the cue words are identified, the next question is what portion of text each cue word suggests as the outcome, which determines the boundary of the outcome. The text was pre-processed by the Apple Pie parser (Sekine, 1997) to obtain the part-of-speech and phrase information. We found that for the noun cues, the noun phrase that contains the noun will be part of the outcome. For the verb cue words, the verb and its object together constitute one portion of the outcome. 
For the adjective cue words, often the corresponding adjective phrase or the noun phrase belongs to the outcome. Cue words for the results of clinical trials are processed in a slightly different way. For example, for difference and superior, any immediately following prepositional phrase is also included in the results of the trial.</Paragraph> <Paragraph position="11"> Our approach does not rely on specific patterns, so it is more flexible than pattern-matching techniques in IE systems, and it does not need a large training set. A limitation of this approach is that some connections between different portions of an outcome may be missing.</Paragraph> <Paragraph position="12"> 2.2.3 Evaluation and analysis of results We evaluated the cue word method of detecting the outcome on the remaining one-third of the sections of CE. (The test set is rather small because of the difficulty in obtaining the annotations.) The outcome detection task was broken into two sub-tasks, each evaluated separately: to identify the outcome itself and to determine its textual boundary. The result of identification is shown in Table 1. Eighty-one sentences in the test set contain either an outcome or result, which is 52% of all the test sentences. This was taken as the baseline of the evaluation: taking all sentences in the test set as positive (i.e., containing an outcome or result). By contrast, the accuracy of the combination approach is 83%.</Paragraph> <Paragraph position="13"> There are two main reasons why some outcomes were not identified. One is that some outcomes do not have any cue word: (5) Gastrointestinal symptoms and headaches have been reported with both montelukast and zafirlukast.</Paragraph> <Paragraph position="14"> The other reason is that although some outcomes contained words that might be regarded as cue words, we did not include them in our set; for example, fewer and higher. Adjectives were found to have the most irregular usages. 
It is normal for them to modify both medications and outcomes, as shown in the following examples: (6) . . . children receiving higher dose inhaled (7) . . . mean morning PEFR was 4% higher in the salmeterol group.</Paragraph> <Paragraph position="15"> Other adjectives such as less, more, lower, shorter, longer, and different have similar problems. If they are taken as identifiers of outcomes then some false positives are very likely to be generated. However, if they are excluded, some true outcomes will be missed. There were 14 samples of false positives. The main cause was sentences containing cue words that did not have any useful information: (8) We found that the balance between benefits and harms has not been clearly established for the evacuation of supratentorial haematomas.</Paragraph> <Paragraph position="16"> (9) The third systematic review did not evaluate these adverse outcomes.</Paragraph> <Paragraph position="17"> Table 2 shows the result of boundary detection for those outcomes that were correctly identified. The true boundary is the boundary of an outcome that was annotated manually. The no match case means that there is a true outcome in the sentence but the program missed the correct portions of text and marked some other portions as the outcome.</Paragraph> <Paragraph position="18"> The program identified 39% of the boundaries exactly the same as the true boundaries. In 19% of the samples, the true boundaries were entirely within the identified fragments. The spurious text in them (the text that was not in the true boundary) was found to be small in many cases, both in terms of number of words and in terms of the importance of the content. The average number of words correctly identified was 7 for each outcome and the number of spurious words was 3.4. The most frequent content in the spurious text was the medication applied to obtain the outcome. 
In the following examples, text in &quot;⟨ ⟩&quot; is the outcome (result) identified automatically, and text in &quot;{ }&quot; is spurious. (10) The RCTs found ⟨no significant adverse effects {associated with salmeterol}⟩.</Paragraph> <Paragraph position="19"> (11) The second RCT . . . also found ⟨no significant difference in mortality at 12 weeks {with lubeluzole versus placebo}⟩ ...</Paragraph> <Paragraph position="20"> Again, adjectives are most problematic. Even when a true adjective identifier is found, the boundary of the outcome is hard to determine by an unsupervised approach because of the variations in the expression. In the following examples, the true boundaries of outcomes are indicated by &quot;[ ]&quot;, adjectives are highlighted.</Paragraph> <Paragraph position="21"> (12) Nebulised . . . , but [⟨serious adverse effects⟩ are rare].</Paragraph> <Paragraph position="22"> (13) Small RCTs . . . found that [. . . was ⟨effective⟩, with ...].</Paragraph> <Paragraph position="23"> The correctness of the output of the parser also had an important impact on the performance, as shown in the following example: (14) RCTs found no evidence that lubeluzole improved clinical outcomes in people with acute ischaemic stroke.</Paragraph> <Paragraph position="24"> (S ... (NPL (DT that) (JJ lubeluzole) (JJ improved) (JJ clinical) (NNS outcomes)) ...) In this parse, the verb improve was incorrectly assigned to be an adjective in a noun phrase. Thus improve as a verb cue word was missed in identifying the outcome. However, another cue word outcomes was matched, so the whole noun phrase of outcomes was identified as the outcome. On the one hand, the example shows that the wrong parsing output directly affects the identification process. 
On the other hand, it also shows that missing one cue word in identifying the outcome can be corrected by the occurrence of other cue words in the combination approach.</Paragraph> </Section> </Section> </Paper>