<?xml version="1.0" standalone="yes"?>
<Paper uid="A83-1021">
  <Title>Appendix: A Case of Methanol Poisoning</Title>
  <Section position="3" start_page="125" end_page="125" type="metho">
    <SectionTitle>
4. The Current RECONSIDER Implementation
4.1. The Inverted File
</SectionTitle>
    <Paragraph position="0"> Using abstract syntax to represent the structure in the text, CMIT was scanned and &quot;parsed&quot; to produce a sequence of terms, each with the following attributes: ordinal position of term in phrase; ordinal position of phrase in clause; ordinal position of clause in sentence; ordinal position of sentence in part; name of part; disease; body system(s) of disease. Thus, a dictionary (containing in excess of 20,000 such terms) was formed and CMIT &quot;inverted&quot;, so that each dictionary entry was followed by pointers to every occurrence of that entry in CMIT. Included with every pointer were the seven attributes associated with each occurrence of that term. There are 333,211 term occurrences in CMIT, for an average of about 102 terms per disease, or 79 unique terms per disease, the difference being terms that are used more than once in a given definition. In principle, this &quot;dictionary&quot; could be used to reconstruct CMIT, as it represents, in alternative format, exactly the same information. This large inverted file allows efficient searching for terms in the text. The searches can be (1) constrained to a context (diseases of the skin), (2) constrained to textual proximity (adjacency, or membership within a clause), or (3) constrained to a definition part (symptoms only).</Paragraph>
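    <Paragraph> As a rough illustration of the inverted file described above, the following Python sketch maps each term to its occurrences and filters them by definition part, body system, or clause membership. It is a minimal sketch only: the record layout, the field names, the function names, and the toy data are all assumptions made for illustration and are not taken from the RECONSIDER implementation or from CMIT.

```python
from collections import defaultdict, namedtuple

# Hypothetical record for one term occurrence, carrying the seven
# attributes listed in the text. Field names are illustrative only.
Posting = namedtuple(
    "Posting",
    "term_pos phrase_pos clause_pos sentence_pos part disease body_systems",
)

def build_inverted_file(occurrences):
    """Map each term to the list of its occurrences (postings)."""
    index = defaultdict(list)
    for term, posting in occurrences:
        index[term].append(posting)
    return index

def search(index, term, part=None, body_system=None):
    """Return the diseases containing `term`, optionally restricted
    to one definition part and/or one body system."""
    hits = set()
    for p in index.get(term, []):
        if part is not None and p.part != part:
            continue
        if body_system is not None and body_system not in p.body_systems:
            continue
        hits.add(p.disease)
    return hits

def cooccur_in_clause(index, term_a, term_b):
    """Return the diseases where both terms occur within a single clause."""
    def clause_keys(term):
        return {(p.disease, p.part, p.sentence_pos, p.clause_pos)
                for p in index.get(term, [])}
    shared = clause_keys(term_a).intersection(clause_keys(term_b))
    return {key[0] for key in shared}

# Toy data, invented for illustration (not taken from CMIT).
occurrences = [
    ("itching", Posting(1, 1, 1, 1, "symptoms", "disease A", ("skin",))),
    ("itching", Posting(2, 1, 1, 1, "signs", "disease B", ("skin",))),
    ("itching", Posting(1, 2, 1, 1, "symptoms", "disease C", ("digestive",))),
    ("abdominal", Posting(1, 1, 2, 1, "symptoms", "disease C", ("digestive",))),
    ("pain", Posting(2, 1, 2, 1, "symptoms", "disease C", ("digestive",))),
    ("pain", Posting(1, 1, 1, 1, "symptoms", "disease A", ("skin",))),
]
index = build_inverted_file(occurrences)
```

Calling search(index, "itching", part="symptoms") restricts the lookup to one definition part, and cooccur_in_clause(index, "abdominal", "pain") corresponds to the clause-level proximity constraint. Adjacency would compare phrase and term positions in the same way.</Paragraph>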
    <Paragraph position="1"> 4.2. Synonym Dictionary  A 15,388 term &quot;synonym&quot; dictionary includes words not in CMIT which are synonyms of words used in the CMIT definitions and words already in CMIT that are synonyms of each other (e.g. pruritus and itching). These are partitioned amongst 4,165 &quot;synonym classes&quot; (the two or more words within each class are synonyms of each other). Search options allow searches with or without equivalencing the synonyms, and with or without invoking hierarchical synonyms. The term &quot;synonym&quot; is used generously, as the dictionary is actually functioning as a kind of semantic net, connecting words with strong conceptual links. It should also be noted that RECONSIDER does not employ &quot;stemming&quot;. All variants of a term (and some phrases, e.g. abdominal pain), including, in some cases, misspellings, appear within a single &quot;synonym class&quot;. Though we have not proven this, it is our opinion that this synonym dictionary is what converts an interesting tool for research into medical term-use into something that functions not unlike an expert system.</Paragraph>
    <Paragraph position="2"> [Footnote on &quot;parsed&quot; in Section 4.1: Once again, this parse is not identifying &quot;parts of speech&quot; in the conventional sense. Rather, the abstract syntax (a BNF grammar akin to those defining programming languages) encodes the meaning of the external markers and punctuation conventions employed in CMIT.] [Footnote on the synonym dictionary: Constructed by Rodney Ludwig, M.D. and [name illegible], M.D.]</Paragraph>
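    <Paragraph> The partition of words into synonym classes, and the option of searching with or without them, can be sketched as follows. This is a hedged illustration rather than the authors' data structure: the function names are hypothetical, the two example classes use only words that the paper itself mentions as synonyms, and hierarchical synonyms are not modelled.

```python
def build_synonym_lookup(classes):
    """Map every word to the full synonym class that contains it.

    Assumes each word belongs to exactly one class (a partition)."""
    lookup = {}
    for synonym_class in classes:
        members = frozenset(synonym_class)
        for word in members:
            lookup[word] = members
    return lookup

def expand(term, lookup, use_synonyms=True):
    """Return the set of terms to search for.

    With use_synonyms=False, or for a word in no class, only the
    term itself is returned."""
    if not use_synonyms:
        return {term}
    return set(lookup.get(term, {term}))

# Example classes built from words the paper names as synonyms.
classes = [
    ["pruritus", "itching"],
    ["abdominal pain", "colic", "colicky", "pain in abdomen"],
]
lookup = build_synonym_lookup(classes)
```

Because no stemming is applied, every variant spelling would have to be listed explicitly in its class, which matches the paper's remark that variants and some misspellings appear within a single class.</Paragraph>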
    <Section position="1" start_page="125" end_page="125" type="sub_section">
      <SectionTitle>
4.3. Searches
</SectionTitle>
      <Paragraph position="0"> Searches for a set of terms can require a match on every term, or a match on one or more of the terms in the set. In the latter case, matches are scored in a manner reminiscent of techniques used for literature and information retrieval by Salton, Sparck-Jones and others, and in particular Doszkocs [8].</Paragraph>
      <Paragraph position="1"> The scoring algorithm is illustrated in the next section.</Paragraph>
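      <Paragraph> The two matching modes can be pictured as set operations over the diseases retrieved for each term. The sketch below is an assumption-laden illustration: the function names and the toy retrieval results are hypothetical, and the paper does not say how RECONSIDER implements either mode internally.

```python
def match_all(term_hits):
    """Diseases matched by every term.

    term_hits maps each entered term to the set of diseases it retrieved."""
    sets = list(term_hits.values())
    if not sets:
        return set()
    return set.intersection(*sets)

def match_any(term_hits):
    """Diseases matched by one or more of the terms."""
    return set().union(*term_hits.values())

# Toy retrieval results, invented for illustration.
term_hits = {
    "abdominal pain": {"disease A", "disease B"},
    "acidosis": {"disease B", "disease C"},
}
```

In the match-on-any mode the result is usually much larger, which is why the retrieved diseases are then scored and ranked.</Paragraph>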
      <Paragraph position="2">  4.4. The User-Interface  RECONSIDER is an interactive user interface running on top of the inverted file and the search algorithms. It accepts terms, search modifiers, and requests for one of the two matching algorithms, formulates the appropriate query, searches the inverted files, computes the score of the diseases retrieved (if requested), constructs a body-system histogram (if requested), ranks the diseases if appropriate, and displays any disease definitions selected for viewing or browsing by the user.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="125" end_page="127" type="metho">
    <SectionTitle>
5. Performance
5.1. A Comparison with Two [illegible] Expert
</SectionTitle>
    <Paragraph position="0"> When applied to the published cases diagnosed by INTERNIST and PIP [10,17,16], RECONSIDER produced the correct diagnosis (or diagnoses) at, or near, the top of the disease list produced by entering the positive findings given to these programs [5]. (Again, CADUCEUS considers 300 diseases from internal medicine, and PIP considers 20 diseases featuring edema.) While these cases were often complex, a large amount of clinical information was available for each patient.</Paragraph>
    <Section position="1" start_page="125" end_page="127" type="sub_section">
      <SectionTitle>
5.2. Diagnostic Prompting: An Example
</SectionTitle>
      <Paragraph position="0"> We believe that RECONSIDER performs better, and much more usefully, at an earlier point in the diagnostic process, at a time prior to any extensive patient work-up, when the physician's &quot;cognitive span&quot; is widest [2]. For example, a patient presents with findings as noted at the beginning of the appendix. RECONSIDER begins by prompting for terms. The prefix ss/ is used by the physician-user to indicate that the succeeding terms are to be searched for in either the symptoms or signs portions of the disease descriptions. This grouping, a union of the two vocabularies, was necessitated by the non-consistent usage of terms in these contexts. [Footnote: The use of terms within CMIT did not follow the medical dogma as to what was a symptom, and what was a sign.] The phrase abdominal pain will match (given the RECONSIDER options selected to run this case) any co-occurrence of these two words (or their synonyms) within a single clause. RECONSIDER responds with the synonyms it knows for the terms entered, and the number of diseases containing one or more occurrences of each of the terms within the ss/ context. The response abdominal pain[191+80] indicates that the phrase abdominal pain occurs in 191 diseases and that 80 additional diseases have been retrieved by the synonyms for abdominal pain, namely colic[35], colicky[16], and pain in abdomen[48]. The fact that 35+16+48 exceeds 80, and 191+35+16+48 exceeds 191+80, indicates that some disease definitions contain more than one term from this synonym class.</Paragraph>
      <Paragraph position="1"> The score (a measure of selectivity) for</Paragraph>
      <Paragraph position="3"> where 271 is the number of &quot;disease occurrences&quot; of abdominal pain, and 3252 is the total number of diseases in CMIT. A disease's score is the sum of the scores of the terms its description matched.</Paragraph>
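      <Paragraph> The formula for the term score is not preserved in this text, so the sketch below uses an assumed form only: one minus the fraction of diseases that contain the term. That form was chosen because it rewards selective terms and keeps each term's score below 1, which is at least compatible with the five-term total of 4.738 mentioned later in this section; it should not be read as the authors' actual formula. The function names are hypothetical. Only the counts 271 and 3252 and the rule that a disease's score is the sum of its matched terms' scores come from the paper.

```python
def term_score(diseases_with_term, total_diseases):
    """ASSUMED selectivity score: 1 - (diseases containing term / all diseases).

    The paper's own formula is missing from this text; this form is a guess."""
    return 1.0 - diseases_with_term / total_diseases

def disease_score(matched_terms, term_counts, total_diseases):
    """Sum of the scores of the terms a disease description matched
    (the summation rule is stated in the paper)."""
    return sum(term_score(term_counts[term], total_diseases)
               for term in matched_terms)

TOTAL_DISEASES = 3252                     # total diseases in CMIT (from the paper)
term_counts = {"abdominal pain": 271}     # 191 + 80 disease occurrences (from the paper)

# 271 / 3252 is exactly 1/12, so under the assumed form the score is 11/12.
abdominal_pain_score = term_score(271, TOTAL_DISEASES)
```

Under this assumption a term found in few diseases scores close to 1 and a term found in most diseases scores close to 0, so rare terms dominate the ranking.</Paragraph>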
      <Paragraph position="4"> Most physicians would probably conclude that the observation that the patient smoked was not relevant to the patient's illness, but the term smoking was entered here to show its obvious effect on the disease list (it brings nicotine, toxicity and drug dependence, marihuana nearer to the top, partly because it is so &quot;selective&quot;). It is not clear which &quot;part&quot; of the disease descriptions the term smoking will be found in, so its search context is all/, and the same decision is made with respect to acidosis.</Paragraph>
      <Paragraph position="5"> Anion gap acidosis is not used in CMIT, so we enter the more general form. Entering smoking in the all/ context has the disadvantage that it brings in a reference to smoky, which is used as an adjective.</Paragraph>
      <Paragraph position="6"> The histogram displays the body system frequencies for the diseases near the top of the disease list (the top 4% was selected by the user to include about the first &quot;screen's worth&quot; of the disease list, which held 8?9 diseases containing one or more of the terms entered, or their synonyms).</Paragraph>
      <Paragraph position="7"> A physician-user viewing the first screenful of this list (the portion shown in the appendix) would next formulate a strategy for resolving it, assuming the diagnosis was still not immediately apparent. A methodical approach would note first that no disease matched all five entries (as no disease has a score of 4.738).</Paragraph>
      <Paragraph position="8"> Similarly, diseases #1, #2, and #3 would be ruled out by asking the patient appropriate questions.</Paragraph>
      <Paragraph position="9"> (If the patient were from Marin County, here in the Bay Area, we might focus our initial attention on #2, mushroom, toxicity, in response to recent news reports of cases of it there, knowledge that is not available to RECONSIDER.) [Footnote on entering the more general form acidosis: An attempt on the part of the user to enter anion gap, while laudable (it would be very selective), would be greeted by a message that the term was not found in CMIT or its synonym dictionary, in this case because CMIT predates wide use of this test. At this point the physician-user must use his or her own knowledge of medicine to know that the term acidosis is the best substitute under these circumstances. Looked at differently, our evaluation seems to confirm that, in general, more medical knowledge makes one a more effective RECONSIDER user. If true, we regard this as a positive feature of RECONSIDER.]</Paragraph>
      <Paragraph position="10"> Disease #4, eclampsia, raises a more interesting issue. RECONSIDER does not have a model of gender (or of anything else), so a disease that occurs during pregnancy is not automatically ruled out when the patient is male. While such inclusions are understandably distracting at first, users are soon comfortable ignoring them, especially since it's easy to understand why RECONSIDER put the disease there.</Paragraph>
      <Paragraph position="11"> Viewing the CMIT definition of disease #5, nephritis, salt losing, reveals that it is usually accompanied by a rich complex of symptoms, so while it cannot be ruled out at this point, it becomes extremely unlikely. Since the patient is not an alcoholic, the definition of disease #6, methyl alcohol, toxicity, suggests the possibility of occupational exposure (perhaps percutaneous or respiratory). Once considered, an appropriate test would confirm the existence of the toxic substance in the body.</Paragraph>
      <Paragraph position="12"> 6. End-User Experience We have not permitted RECONSIDER to be used &quot;live&quot; in a clinical context. In addition to the fact that evaluation of the program is not complete, the knowledge base is known to be out of date. Nonetheless, since we have been able to move RECONSIDER to the MIS-UCSF VAX 11/750 running UNIX (Berkeley 4.1), students, postdoctoral fellows and some faculty have been able to use the program. The initial reaction usually consists of the following three observations: (1) &quot;Why is that disease there?&quot; (sometimes it's there legitimately, and sometimes not), (2) &quot;How does such a dumb program do so well?&quot; (referring to RECONSIDER's lack of evident reasoning power), and (3) &quot;What I need to be able to do now is ...&quot; (fill in your favorite interactive-knowledge-base user-feature).</Paragraph>
      <Paragraph position="13"> We tolerate the problem alluded to by question (1) because it is more important, at this stage of development, not to miss important diseases, and because it is easier for a physician-user to reject totally inappropriate diseases than it is for the program to do so.</Paragraph>
      <Paragraph position="14"> Question (2) alludes to the point raised by the title of this paper. RECONSIDER can only be considered an &quot;expert&quot; (if at all) because its knowledge base is so large (relative to what a physician can keep readily available in his or her head), and because of its performance. It is obviously not like a human &quot;expert&quot; in the way it arrives at the disease list. And question (3) we take to be a compliment that reveals, among other things, that occasionally the utility of RECONSIDER is limited not by the knowledge it contains, but by the means we currently have of accessing it through the narrow window of a 23-line CRT terminal.</Paragraph>
      <Paragraph position="15"> Question (1) deserves further comment.</Paragraph>
      <Paragraph position="16"> The author (MST) has observed considerable user-discomfort caused by CMIT having diseases from several body systems near the top of a sorted disease list. Apparently, the cognitive dissonance is usually avoided by thinking about diseases by system, and the discomfort can be relieved by restricting the search (and thus the sorted list) to a single body system. The problem with the latter practice is that the preliminary results of our evaluation reveal that contextless (all/) searches are the most efficacious, on average. [Footnote on UNIX: UNIX is a product of Bell Telephone Laboratories, Inc.]</Paragraph>
      <Paragraph position="17"> As this is also the opposite of the behavior predicted by our model of context in a nominal-attribute knowledge base, further study is suggested. In any case, it may prove necessary to re-design the user-interface to accommodate some users' need to view diseases by system, within a contextless search.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="127" end_page="128" type="metho">
    <SectionTitle>
7. Evaluation
</SectionTitle>
    <Paragraph position="0"> A formal evaluation of RECONSIDER on 100 serial admissions to a tertiary care medical ward is in progress (and will be reported elsewhere), but the preliminary results are both encouraging and interesting. They are encouraging because the correct diagnosis is included so often in the first frame or two (and usually higher), and interesting because the difference between diagnostic programs and diagnostic prompting programs is made quite clear. The former have a very specific goal, and it is easy to tell whether it is reached or not. A prompting program is evaluated against a different standard: not whether it is correct but whether it is helpful. And judging whether something is helpful or not may be a subtle matter. If the correct diagnosis is included high on the list, the performance can be given a high score.</Paragraph>
    <Paragraph position="1"> But if, instead, a listed disease closely related to the correct one has the result of directing the physician's attention to the correct body system, and finally the correct diagnosis, how is this to be scored? 8. Suspected Limitations:</Paragraph>
    <Section position="1" start_page="127" end_page="128" type="sub_section">
      <SectionTitle>
8.1. The Knowledge
</SectionTitle>
      <Paragraph position="0"> As has been the experience with similar projects, computer processing subjects &quot;knowledge&quot; to a harsh and unyielding light. We anticipate that half a man-year of &quot;tuning&quot; would significantly improve RECONSIDER's performance, but that the next and much more serious limitation will be the quality, uniformity, completeness, and timeliness of CMIT and the synonym dictionary. Given the choice between rewriting CMIT (and continuing to do so on an on-going basis) and introducing AI techniques to RECONSIDER (we have received many suggestions), we would choose the former.</Paragraph>
      <Paragraph position="1"> 8.2. Other Limitations Our experience to date has taught us that, in this context, negatives are important. Terms such as fever absent are treated as if fever were a positive finding; while not fatal, such retrievals increase the number of false positives. Also, users often wish to search using &quot;rule-out&quot;, e.g. eliminate all diseases from consideration  containing a certain term or terms. Especially tricky would be interactions between these two uses of negation.</Paragraph>
      <Paragraph position="2"> On a more global level, CMIT's homogenization of diseases contributes to confusion and loss of information. Congestive heart failure is listed as a disease under heart, failure, congestive, as a symptom under ~art. AMoe~tm,~mye. ~e--e., as a sign under Hart, AV/m,-tT~pA V, /mm-t. f~tv 0kge~rt and ~ sta~a'~. ,fu.bv,dvuZar, and as a complication in, for example, trypanosomiasis, american.</Paragraph>
      <Paragraph position="3"> And to illustrate the stress on the process of attempting to form a closed set of synonyms, the symptoms and signs of congestive heart failure are described at various points, as in cm-dio~.,~opaPSA!@, but the phrase congestive heart failure does not occur in that description.</Paragraph>
    </Section>
  </Section>
</Paper>