<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0212"> <Title>Sense Tagging in Action: Combining Different Tests with Additive Weightings</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2. Methodology </SectionTitle> <Paragraph position="0"> The tagger, at present, works on one sentence at a time.</Paragraph> <Paragraph position="1"> Each word in the sentence has a certain number of possible senses. The tagger assigns a score (initially 0) to each possible sense of each word. A number of different tagging processes can then adjust any of these scores, increasing them for a positive match (e.g. a collocation that indicates a particular sense) and decreasing them for a negative match (e.g. capitalisation indicating a particular sense to be unlikely). At the end of all these processes, each sense of each word will have a particular score. For each word, the sense with the highest score is assumed to be the sense meant in the context.</Paragraph> <Paragraph position="2"> Simple additive weightings are also commonly used in the evaluation of chess positions by computers, where, for example, a pawn less could score -100 and an open file for a rook +15. It is thus possible for a number of positional factors to outweigh more concrete material factors.</Paragraph> <Paragraph position="3"> It would be possible to use multiplicative probabilities rather than additive weightings. Chess programmers tend to prefer additive weightings because they are far simpler to program and also more efficient. There are more rigorous rules for combining probabilities, but it is not clear how much benefit this gives if the original probabilities are only rough estimates anyway. Probabilities can be derived from training corpora, but it is acknowledged that these can vary enormously from corpus to corpus, e.g. on grounds of register (Biber 1993). 
Such methods are far more appropriate for work in restricted contexts, where representative training corpora can be more easily derived.</Paragraph> </Section> </Paper>
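<!--
Editorial sketch of the additive scoring scheme described in Section 2. This is a minimal illustration under stated assumptions, not the authors' implementation: the names tag_sentence, SENSES, collocation_test and capitalisation_test, the tiny sense inventory, and the weights +2.0 and -1.0 are all hypothetical. Each test proposes adjustments of the form (word index, sense, delta); scores start at 0 and the highest-scoring sense of each word is chosen.

```python
# Hypothetical sense inventory; a real tagger would draw this from a dictionary.
SENSES = {
    "china": ["country", "porcelain"],
    "bank": ["financial", "riverside"],
}


def collocation_test(words):
    """Positive match: 'river' in the sentence supports bank/riverside."""
    lowered = [w.lower() for w in words]
    for i, w in enumerate(lowered):
        if w == "bank" and "river" in lowered:
            yield (i, "riverside", 2.0)  # assumed weight


def capitalisation_test(words):
    """Negative match: lower-case 'china' makes the country sense unlikely."""
    for i, w in enumerate(words):
        if w == "china":
            yield (i, "country", -1.0)  # assumed weight


def tag_sentence(words, senses, tests):
    """Return the top-scoring sense per word, or None if the word has no senses."""
    # Every possible sense of every word starts with a score of 0.
    scores = [{s: 0.0 for s in senses.get(w.lower(), [])} for w in words]
    # Each test may raise or lower any score; the adjustments simply add up.
    for test in tests:
        for index, sense, delta in test(words):
            if sense in scores[index]:
                scores[index][sense] += delta
    # Ties resolve to the sense listed first in the inventory.
    return [max(sc, key=sc.get) if sc else None for sc in scores]


sentence = "She put the china on the bank of the river".split()
tags = tag_sentence(sentence, SENSES, [collocation_test, capitalisation_test])
for word, tag in zip(sentence, tags):
    if tag is not None:
        print(word, tag)
```

Because the adjustments are summed, a new test can be added to the list without changing the existing ones, which is the simplicity the section attributes to additive weightings.
-->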