<?xml version="1.0" standalone="yes"?> <Paper uid="C90-1019"> <Title>Deep Sentence Understanding in a Restricted Domain*</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Semantic rules for Syntactic Disambiguation </SectionTitle> <Paragraph position="0"> Structural ambiguity is ubiquitous in our target texts, since they contain descriptions that often use series of prepositional phrases to qualify a noun. We have therefore decided to submit ambiguous attachments to semantic approval and ranking before building complete parses.</Paragraph> <Paragraph position="1"> An ultimate test of semantic validity would consist in comparing the complete semantic representations built for each attachment proposal [1]. However, such a method is too expensive to allow systematic application. Our system implements a more tractable approach that generalizes selectional restrictions (or preferences). Evaluation is performed by executing a set of heuristic positive and negative rules that vote for or against each proposal. Rule conditions embody criteria that refer to the semantic components (see below) of the predicates to be attached, and include the notion of isotopy [8]. They apply not only to predicate-argument selection, but also to predicate-adjunct combination.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Semantic Construction </SectionTitle> <Paragraph position="0"> Semantic processing of a sentence results in the activation of a relevant body of domain knowledge with related inferences within the knowledge base.</Paragraph> <Paragraph position="1"> Domain knowledge (here concerning a single disease: thyroid cancer) is embedded in a model [6,4] describing domain objects, actions operating on them and specific processes involving these objects. Such a model is thus a dynamic causal model rather than a memory structure devoted to object and event integration [9]. 
It is analogous to deep-knowledge models used in modern expert systems [2].</Paragraph> <Paragraph position="2"> Domain objects are represented in a frame-like formalism. Actions and operative aspects of processes are described as production rules simulating a distributed parallel activation [4]. The whole model corresponds to a dynamic, data-driven environment.</Paragraph> <Paragraph position="3"> Some domain concepts specifically represent states, relationships between objects, or state transitions. They can be triggered by their occurrence as word meanings in the sentence.</Paragraph> <Paragraph position="4"> Implicit occurrence of these concepts may also be recognized by observing the evolution of the model. In this case default procedures create the corresponding concepts inside the model just as if these elements were explicitly stated in the proposition. These concepts subsume important situations in the model and translate them into a higher description level, thus allowing them to be output to the user as a trace of correct understanding. No deep understanding would be possible without at least a partial treatment of common sense, which in this application is concerned mainly with part-whole relationships [5], reasoning about transitions and change, and elementary physical actions (e.g., removing, touching). Default knowledge about actions, roles and reference (used, e.g., in the resolution of pragmatic anaphora) is associated with the common sense module.</Paragraph> <Paragraph position="5"> Common sense mechanisms are incorporated as production systems similar to those describing other active elements of the model, and can thus recombine freely with them in order to complete or modify existing representations.</Paragraph> <Paragraph position="6"> Domain representations are built from the assembly of lexical contents along the syntactic structure of the proposition. 
Words contain semantic components [8], which are markers referring to elements of the knowledge base or properties of these representations. The existence of explicit common sense concepts in the knowledge base makes it possible to decompose technical and ordinary words homogeneously.</Paragraph> <Paragraph position="7"> The lexical contents are assembled by heuristic rules to form candidate domain objects which are recognized as instances of prototypes in the representation. Lexical content itself is loosely structured; the association of the components is completed according to their type (which is derived from the type of entity they refer to) and the dependency (predicate-argument, predicate-adjunct) relations between the lexemes that contain such components, as provided by the LFG.</Paragraph> <Paragraph position="8"> As such elements of representation are recognized by the model, the reactive environment is triggered and interprets data until new information is analyzed.</Paragraph> <Paragraph position="9"> The prototype currently runs on a small set of 30 sentences taken from patient discharge summaries. These sentences were selected for the linguistic issues they illustrate and the domain inferences they trigger. A fully compiled version of the program running on a VAX 8810 processes a sentence in an average of 12 s of CPU time.</Paragraph> </Section> </Paper>