<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0508">
  <Title>Answering Questions in the Genomics Domain</Title>
  <Section position="5" start_page="0" end_page="0" type="relat">
    <SectionTitle>
4 Related Work
</SectionTitle>
    <Paragraph position="0"> Question Answering in Biomedicine is surveyed in detail in (Zweigenbaum, 2003), in particular regarding clinical questions. An example of a system applied to such questions is presented in (Niu et al., 2003), where it is used in a setting for Evidence-Based Medicine. This system identifies specific 'roles' within the document sentences and the questions; determining the answers is then a matter of comparing the roles in each. To this end, natural language questions are translated into the PICO format (Sackett et al., 2000).</Paragraph>
    <Paragraph position="1"> Approaches to automatic knowledge extraction over Medline articles (and strategies for improving them) are numerous. For example, (Craven and Kumlien, 1999) identifies possible drug-interaction relations (predicates) between proteins and chemicals using a 'bag of words' approach applied at the sentence level. This produces inferences of the type: druginteractions (protein, pharmacologic-agent), where an agent has been reported to interact with a protein. (Sekimizu et al., 1998) uses frequently occurring predicates and identifies the subject and object arguments in the predication; in contrast, (Rindflesch et al., 2000) uses named entity recognition techniques to identify drugs and genes, then identifies the predicates which connect them. This type of 'object-relation-object' inference may also be implied (Cimino and Barnet, 1993). This method uses 'if-then' rules to extract semantic relationships between medical entities depending on which MeSH headings these entities appear under. For example, if a citation has &quot;Electrocardiography&quot; with the subheading &quot;Methods&quot; and has &quot;Myocardial Infarction&quot; with the subheading &quot;Diagnosis&quot;, then &quot;Electrocardiography&quot; diagnoses &quot;Myocardial Infarction&quot;.</Paragraph>
    <Paragraph position="2"> (Spasić et al., 2003) uses domain-relevant verbs to improve terminology extraction. The co-occurrence of selected verbs and candidate terms within sentences reinforces the termhood of the candidates. But whereas such linguistic inferences are stored in a KB as facts, statistical inferences are used only to visualize possible relations between objects for further investigation. (Stapley and Benoit, 2000) measures statistical gene name co-occurrence and graphically displays the results for an expert to investigate the dominant patterns. The PubMed system uses the UMLS to relate metathesaurus concepts to a controlled vocabulary used to index the abstracts. This allows efficient retrieval of abstracts from medical journals, but it makes use of hyponymy and lexical synonymy to organize the terms. It collects terminologies from differing sub-domains in a metathesaurus of concepts. All such inferences (especially statistical ones) need to be verified by an expert to ensure their validity. Syntactic parsing, if any, is restricted to shallow NP-identification strategies (Sekimizu et al., 1998), or possibly supplemented with PP information (Rindflesch et al., 2000). Semantic interpretation of the documents is only attempted through their MeSH headings (Mendonca and Cimino, 1999).</Paragraph>
  </Section>
</Paper>