<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0612"> <Title>Population Testing: Extracting Semantic Information On Near-Synonymy From Native Speakers</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The problem of near-synonym discrimination presents a formidable challenge to computer-based natural language processing systems (Edmonds 1999; Edmonds and Hirst, 2002), as well as to humans who are attempting to acquire near-native competency in a foreign language. In both cases, a comprehensive lexical database specifically designed for near-synonymy in the target language is a pre-requisite for the further development of practical applications in their respective domains.</Paragraph> <Paragraph position="1"> Some promising approaches have appeared in recent literature. These include corpus based procedures (Inkpen and Hirst 2001, 2002), and applied componential analysis, in particular continuing work on cross-lingual semantic primitives by Wierzbicka and her colleagues (Wierzbicka 1996, 1999).</Paragraph> <Paragraph position="2"> Corpus-based approaches are, however, constrained by the kind and scope of pre-existing corpora and tools that are currently available; while componential analysis necessarily depends heavily on the subjective judgment of its investigators. Under such conditions, it may prove difficult to achieve complete and evenly distributed lexical coverage that truly reflects the diversity of the language community.</Paragraph> <Paragraph position="3"> In this paper we propose another approach that we hope would complement these existing methods. In this approach, we go directly and repeatedly, in an iterative process, to the native speakers of the speech community to acquire and to verify the semantic information thus collected.</Paragraph> <Paragraph position="4"> We also briefly describe a visualization tool (a Java applet) that we are currently developing to aid us in analyzing the collected data, and in further refining the semantic model.</Paragraph> <Paragraph position="5"> tion that the semantics of human language is intersubjective in nature. The term intersubjectivity has long been associated with theories and practice in philosophy, cognitive science, and experimental and developmental psychology derived from, or influenced by phenomenology, a branch of philosophical thinking pioneered by the German philosopher Edmund Husserl in early 20th Century. It is also known, in the field of semiotics, as a central concern in the works of Walker Percy (Percy 1976). In this paper, however, we generally use this term in a more restricted sense, namely, to refer to the guiding principles for a specific empirical method, due to Raukko, for acquiring semantic information from non-expert informants in a speech community (Raukko 1999).</Paragraph> <Paragraph position="6"> Another background framework of PTM is inspired by an idea from Wierzbicka's Natural Semantic Meta-language (NSM) -- that all complex meanings are decomposable into constituent parts that can be readily expressed in natural language.</Paragraph> <Paragraph position="7"> Unlike NSM, however, PTM has a more practical goal and a more narrow scope, namely, that of extracting information to help differentiate a relatively small group of closely related words. Thus, instead of searching for and verifying whether a semantic feature is a proper universal primitive, we take a more ad-hoc approach, i.e. 
if it is evident from empirical data that a new feature would help distinguish one group of words from another, then we adopt it at the next iteration of our investigation as one of the dimensions on which to test the population, and deal with the theoretical issues later.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Practical Considerations </SectionTitle> <Paragraph position="0"> Since the very nature of PTM is to examine the productive use of actual everyday words in their natural settings, the tests need to be specifically tailored both for the words of interest and for the study population. It should also be noted that PTM is by design an iterative process, where each test round generates hypotheses to be tested in the following round.</Paragraph> <Paragraph position="1"> 2.2.1 Tailoring the Tests for Features Specific to the Words under Investigation While some semantic dimensions are common to all vocabulary, many words or word groups also have their own unique semantic characteristics that are not apparent at first, even to a trained semanticist. These subtle nuances often do not come out automatically in conscious explanations, but can nevertheless be drawn out very prominently with the right kind of testing (see Vanhatalo 2002a, 2002b).</Paragraph> <Paragraph position="2"> For instance, while most native English speakers would instinctively choose either shout or yell in their speech, they are often at a loss at first when asked to explain why one choice is made over the other.</Paragraph> <Paragraph position="3"> To draw out such hidden linguistic intuition, we need a non-rigid way of testing that encourages creative brainstorming. In PTM this often takes the form of a natural-sounding, open-ended task given in a non-pressured setting, such as a free-form question framed in a plausible context, e.g.</Paragraph> <Paragraph position="4"> &quot;You've just met an exchange student from Japan. She would like to know what the difference is between shout and yell. How would you explain the difference to her?&quot; Finally, a practical concern is that the number of semantic features for words in any given word group can be quite large. However, since we are only interested in differentiation among these closely related words, we can restrict ourselves to features that contribute to such differentiation. Furthermore, because of the iterative nature of PTM, the feature set for each group of words can grow or shrink as we go.</Paragraph> <Paragraph position="5"> 2.2.2 Tailoring the Tests for the Informants In order to generate data that are as authentic as possible, our test settings are tailored so that they are natural for each informant group. For instance, since it would appear more natural for high school students to explain the difference between words to their friends, or to place themselves in situations that are plausible for an average teenager, our tests for them are designed accordingly.</Paragraph> <Paragraph position="6"> Below in Fig.
1 is an example of a multiple choice task in which various Finnish near-synonyms of the verb &quot;to nag&quot; are used in a plausible, realistic setting (for the Finnish high school students who were our informants): Yesterday I came home late and Mom jakatti.</Paragraph> <Paragraph position="7"> __ __ __ __ __ Yesterday I came home late and Mom valitti.</Paragraph> <Paragraph position="8"> __ __ __ __ __ Yesterday I came home late and Mom marisi.</Paragraph> <Paragraph position="9"> __ __ __ __ __ ...</Paragraph> <Paragraph position="10"> Fig. 1 A Multiple Choice Question</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Details of One Pilot Study: Procedures and Results </SectionTitle> <Paragraph position="0"> We have conducted several pilot studies with over 450 subjects in Finland and Estonia to date. One such study was carried out with 154 high school students in Finland. The tests were delivered on paper. The tested vocabulary comprised 18 Finnish speech act verbs that describe complaining (e.g. English &quot;to nag&quot; or &quot;to carp&quot;); see Appendix A for the list with English glosses. According to existing dictionaries, these words are considered near-synonyms. The tasks were either production tasks or multiple choice tasks.</Paragraph> <Paragraph position="1"> In most production tasks (i.e. open-ended tests), the informants were asked to compare two or more near-synonyms, often by explaining them to their non-native peers. In the analysis phase, the features mentioned in their descriptions were extracted and collected into matrices, which were then used to generate frequency charts for the compilation of further test series. Semi-quantitative comparisons were also performed with the results from multiple choice tasks. The most surprising observations were the abundance of discriminating features between words and the high frequency of some answers (e.g. reasons for a certain speech act).</Paragraph> <Paragraph position="2"> In multiple choice tasks (i.e. difference evaluation tests), the informants were requested (1) to choose the best word for a given context, (2) to choose the best context for a given word, or (3) to rate or rank the word along a given semantic dimension. All these results were analyzed statistically. Tasks requiring word ranking or rating yielded direct numerical values with measures of variance.</Paragraph> <Paragraph position="3"> An example of numerical rating along a semantic dimension is given in Figure 2, where the informants were asked to rate the volume of the speech act on a scale of 1 to 5. It appears that the assumed near-synonyms are clearly distinguishable along this dimension, and the calculated confidence intervals (short vertical bars) demonstrate the high consensus among the informants.</Paragraph> <Paragraph position="4"> Fig. 2 Volume of the Speech Act An example of ranking between near-synonyms is given in Figure 3, which shows the result of a task to select the gender (of the agent) in the speech act. The result reveals that some verbs are clearly associated with female or male gender, while others are not as clearly gender-associated.</Paragraph> <Paragraph position="5"> Fig. 3 Gender (of Agent) in the Speech Act</Paragraph> </Section> </Section> </Paper>
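The analysis steps described in Section 2.3 can be made concrete with a small script. The following is a minimal illustrative sketch only, not the authors' actual code; the verb and feature names are hypothetical placeholders. It (1) tallies the features mentioned in open-ended production-task answers into a word-by-feature frequency matrix, and (2) computes, for a 1-to-5 rating task such as the volume rating in Fig. 2, the mean rating per verb together with a normal-approximation 95% confidence interval.

```python
# Minimal illustrative sketch of the analysis steps described in Section 2.3
# (not the authors' code; verb and feature names are hypothetical placeholders).
from collections import Counter
from math import sqrt
from statistics import mean, stdev

# (1) Production tasks: each informant's open-ended answer is reduced to the
# set of semantic features it mentions for a given verb.
answers = {
    "verb_a": [{"loud", "angry"}, {"loud"}, {"loud", "repetitive"}],
    "verb_b": [{"quiet", "repetitive"}, {"repetitive"}],
}
feature_matrix = {verb: Counter() for verb in answers}
for verb, informant_answers in answers.items():
    for features in informant_answers:
        feature_matrix[verb].update(features)
# feature_matrix now holds per-verb feature frequencies, ready for charting
# and for choosing dimensions to test in the next iteration.

# (2) Rating tasks: informants rate each verb on a 1-5 scale (e.g. volume of
# the speech act); report the mean and a normal-approximation 95% CI.
ratings = {
    "verb_a": [5, 4, 5, 4, 5, 3, 4],
    "verb_b": [2, 1, 2, 2, 3, 1, 2],
}

def mean_with_ci(values, z=1.96):
    m = mean(values)
    half_width = z * stdev(values) / sqrt(len(values))
    return m, half_width

for verb, vals in ratings.items():
    m, hw = mean_with_ci(vals)
    print(f"{verb}: mean = {m:.2f}, 95% CI = [{m - hw:.2f}, {m + hw:.2f}]")
```

With many informants per verb, high consensus shows up as a narrow interval, which is what the short vertical bars in Fig. 2 indicate.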