<?xml version="1.0" standalone="yes"?> <Paper uid="A00-2043"> <Title>An Empirical Assessment of Semantic Interpretation</Title> <Section position="6" start_page="332" end_page="332" type="relat"> <SectionTitle> 4 Related Work </SectionTitle> <Paragraph position="0"> After a period of active research within the logic-based paradigm (e.g., Charniak and Goldman (1988), Moore (1989), Pereira and Pollack (1991)), work on semantic interpretation has almost ceased with the emergence of the empiricist movement in NLP (cf. Bos et al. (1996) for one of the more recent studies dealing with logic-based semantic interpretation in the framework of the VERBMOBIL project). Only a few methodological proposals for semantic computations have been made since then (e.g., higher-order colored unification as a mechanism to avoid the over-generation inherent in unconstrained higher-order unification (Gardent and Kohlhase, 1996)).</Paragraph> <Paragraph position="1"> An issue that has lately received more focused attention is how to cope with the tremendous complexity of semantic interpretation in the light of an exploding number of (scope) ambiguities. Within the underspecification framework of semantic representations, for example, Dörre (1997) proposes a polynomial algorithm which constructs packed semantic representations directly from parse forests.</Paragraph> <Paragraph position="2"> All the previously mentioned studies (with the exception of the experimental setup in Dörre (1997)), however, lack an empirical foundation for their various claims. Though the MUC evaluation rounds (Chinchor et al., 1993) have the flavor of an empirical assessment of semantic structures, their scope is far too limited to count as an adequate evaluation platform for semantic interpretation. Nirenburg et al. 
(1996) already criticize the 'black-box' architecture underlying MUC-style evaluations, which precludes drawing serious conclusions from the shortcomings of MUC-style systems as far as single linguistic modules are concerned. More generally, in this paper the rationale underlying size (of the lexicons, knowledge or rule bases) as the major assessment category is questioned. Rather, dimensions relating to the depth and breadth of the knowledge sources involved in complex system behavior should be taken more seriously into consideration. This is exactly what we intended to provide in this paper.</Paragraph> <Paragraph position="3"> Few evaluation studies dealing with the assessment of semantic interpretations have been carried out, and some of those only under severe restrictions. For instance, Bean et al. (1998) narrow semantic interpretation down to a very limited range of spatial relations in anatomy, while Gomez et al. (1997) bias the result by preselecting only those phrases that were already covered by their domain models, thus optimizing for precision while shunting aside recall considerations.</Paragraph> <Paragraph position="4"> 7 At least for the medical domain, we are currently actively pursuing research on the semiautomatic creation of large-scale ontologies from weak knowledge sources (medical terminologies); cf. Schulz and Hahn (2000).</Paragraph> <Paragraph position="5"> A recent study by Bonnema et al. (1997) comes closest to a serious confrontation with a wide range of real-world data (Dutch dialogues on a train travel domain). This study proceeds from a corpus of annotated parse trees to which type-logical formulae are assigned that express the corresponding semantic interpretation. The goal of this work is to compute the most probable semantic interpretation for a given parse tree. Accuracy (i.e., precision) is rather high and ranges between 89.2% and 92.3%, depending on the training size and depth of the parse tree. 
Our accuracy criterion is weaker (the intended meaning must be included in the set of all readings), which might explain the slightly higher rates we achieve for precision. However, their study does not distinguish between different syntactic constructions that undergo semantic interpretation, nor does it consider the level of conceptual interpretation (on which we focus) as distinguished from the level of semantic interpretation to which Bonnema et al. refer.</Paragraph> </Section> </Paper>