<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0204"> <Title>Hanshin Publishing Co., Seoul, South Korea.</Title> <Section position="7" start_page="21" end_page="23" type="concl"> <SectionTitle> 5 Conclusion </SectionTitle> <Paragraph position="0"> We have suggested a theoretical basis and a working methodology for coming up with an appropriate set of semantic tags for the semantic frame elements, and believe that such frames may constitute a sort of &quot;basic level&quot; of lexical semantic description. As such they would be an appropriate starting-point both for a broad-coverage semantic lexicon and for the semantic tagging of corpora.</Paragraph> <Paragraph position="1"> We have also pointed out the importance of incorporating the notions of inheritance and other substructuring conventions in tagsets to reduce the size and complexity of the descriptions and to capture generalizations over natural classes.</Paragraph> <Paragraph position="2"> We recognize several shortcomings in our approach, which we hope to address in the future.</Paragraph> <Paragraph position="3"> First, it is clear that the size of the descriptions will increase rapidly as the annotation proceeds, and we will need to find some explicit means of abbreviating representations, of collapsing FEGs in a principled way, and of relating frames to one another (both within and across semantic fields). This is both a practical and a theoretical problem. We have shown a few clear examples in which the judicious use of the notion of inheritance, along the general lines of the ACQUILEX Project (Briscoe et al., 1993), should permit the concise representation of the lexical knowledge required to give a useful and relatively complete description of a word's semantic range.
If the valence description (the FEG together with links to grammatical functions) associated with individual words is attached to each valence-bearing lexical token in a corpus, and the corpus is parsed according to the same criteria by which the linking has been stated, then we can avoid the problem of actually tagging the phrases that instantiate frame elements (and hence the problem of multiple tagging for constituents that figure in more than one frame in the same sentence), because the constituents that play specific semantic roles in the sentence can be computed from the parse. Such a capability is desirable, but we are not presently committed to it.</Paragraph> <Paragraph position="4"> We intend first to focus on prototypical or core uses of the words. Our preliminary research indicates, however, that it would be both difficult and undesirable to exclude metaphorical uses, if only because they can often shed light on the structure of the core uses. At the same time, we are restricting our attention to a limited number of semantic domains, and metaphorical extensions from the words in our wordlist that go far beyond our semantic fields will probably have to be set aside.</Paragraph> <Paragraph position="5"> Finally, we should make a few remarks on the scope of our intended effort. We plan to create a &quot;starter lexicon&quot; containing some 5,000 lexical items indexed to examples of their use. For each entry we shall record token frequencies for the various FEGs of each word sense, in order to assist NLP programs in picking likely interpretations. Initially the frequencies would be derived from our hand-tagged corpus examples; eventually we hope to train on those examples and to automate, at least partially, the tagging of instances, at least for a preliminary word sense disambiguation that a researcher would then review.
The automatic categorization of the arguments would use such information as WordNet synonyms and hypernyms (cf. Resnik, 1993), machine-readable thesauri, etc.</Paragraph> </Section> </Paper>