File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/06/w06-0905_metho.xml
Size: 18,742 bytes
Last Modified: 2025-10-06 14:10:34
<?xml version="1.0" standalone="yes"?> <Paper uid="W06-0905"> <Title>Evaluating Knowledge-based Approaches to the Multilingual Extension of a Temporal Expression Normalizer</Title> <Section position="4" start_page="31" end_page="31" type="metho"> <SectionTitle> 3 The starting point: TERSEO </SectionTitle> <Paragraph position="0"> As a starting point for our experiments we used TERSEO, a system originally developed for the automatic annotation of TEs appearing in Spanish written text in compliance with the TIMEX2 standard (see (Saquete, 2005) for a thorough description of TERSEO's main features and functionalities). Basically (see Figure 1), the TE recognition and normalization process is carried out in two phases. The first phase (recognition) includes a pre-processing of the input text, which is tagged with lexical and morphological information that will be used as input to a temporal parser. The temporal parser is implemented using a bottom-up technique (chart parser) and relies on a language-specific temporal grammar. As TEs can be divided into absolute and relative ones, this grammar is tuned to discriminate between the two groups. On the one hand, absolute TEs directly provide and fully describe a date. On the other hand, relative TEs require some degree of reasoning (as in the case of anaphora resolution). In the second phase of the process, in order to translate these expressions into their normalized form, the lexical context in which they occur is considered. At this stage, a normalization unit is in charge of determining the appropriate reference date (anchor) associated with each anaphoric TE, calculating its value, and finally generating the corresponding TIMEX2 tag.</Paragraph> <Paragraph position="1"> From a multilingual perspective, an important feature of TERSEO is the distinction between recognition rules, which are language-specific, and normalization rules, which are language-independent and potentially reusable for any other language.
Taking full advantage of the modular architecture of the system, a first multilingual extension has been evaluated over the English TERN 2004 test set. In that extension, the English temporal model was automatically obtained from the Spanish one, through the automatic translation into English of the Spanish TEs recognized by the system (Saquete et al., 2004). The resulting English TEs were then mapped onto the corresponding language-independent normalization rules, with good results (compared with other participants in the competition) both in terms of precision and recall. These results are shown in The positive results of this experience demonstrated the viability of the adopted solutions, and motivated our further investigation with Italian as a new target language.</Paragraph> </Section> <Section position="5" start_page="31" end_page="34" type="metho"> <SectionTitle> 4 Porting TERSEO to Italian </SectionTitle> <Paragraph position="0"> Due to the separation between language-specific recognition rules and language-independent normalization rules, the bulk of the porting process consists of adapting the recognition rules to the new target language. Taking advantage of different knowledge sources (either alone or in combination), an incremental approach has been adopted, in order to determine the contribution of additional information to the performance of the resulting system for Italian.</Paragraph> <Section position="1" start_page="32" end_page="33" type="sub_section"> <SectionTitle> 4.1 Using online translators </SectionTitle> <Paragraph position="0"> As a first experiment, the same procedure adopted for the extension to English has been followed.</Paragraph> <Paragraph position="1"> This represents the simplest approach for porting TERSEO to other languages, and will be considered as a baseline for comparison with the results achieved in further experiments.
The only minor difference with respect to the original procedure is that now, since two aligned sets of recognition rules (i.e. for Spanish and for English) are available, both models have been used. Both models are considered because they complement each other: on the one hand, the Spanish model was obtained manually and showed high precision values in detection (88%); on the other hand, although the English model showed lower precision results in detection (77%), the online translators from English to Italian perform better than translators from Spanish to Italian.</Paragraph> <Paragraph position="2"> The process is carried out in the following four steps.</Paragraph> <Paragraph position="3"> 1. Eng-Ita translation. All the English TEs known by the system are translated into Italian. Starting from English, the probability of obtaining higher-quality translations is maximized. 2. Spa-Ita translation. For each English TE without an Italian translation, the corresponding Spanish expression is translated into Italian. The Spanish TEs that do not have an English equivalent are also translated from Spanish into Italian. This way, the coverage of the resulting model is maximized, becoming comparable to the hand-crafted Spanish model.</Paragraph> <Paragraph position="4"> 3. TE Filtering. A filtering module is used to guarantee the correctness of the translations.</Paragraph> <Paragraph position="5"> For this purpose, the translated expressions are searched for on the Web with Google. If an expression is not found by Google, it is discarded; otherwise it is considered a valid Italian TE. The drawback of this simple filtering strategy arises with ambiguous expressions, i.e. when a correct expression is obtained through translation, and Google returns at least one document containing it, but the expression is not a temporal one. In these cases the system will erroneously store non-temporal expressions in its database.
In this experiment the results returned by Google have not been analyzed (only the number of hits has been taken into account), nor has the impact of these errors been estimated. A more precise analysis of the output of the web search is left as a direction for future improvement.</Paragraph> <Paragraph position="6"> 4. Normalization rules assignment. Finally, the resulting Italian translations are mapped onto the language-independent normalization rules associated with the original English and Spanish TEs.</Paragraph> <Paragraph position="7"> The development of this first automatic porting procedure required one person-week for software implementation, and less than an hour to obtain the new model for Italian. The performance of the resulting system, evaluated over the test set of I-CAB, is shown in Table 2.</Paragraph> <Paragraph position="8"> The results achieved by the translation-based approach are mixed. On the one hand, we observe a detection performance in line with the English version of the system. The timex2 attribute, which indicates the proportion of detected TEs (at least one overlapping character between the extents of the reference and the system output is required for tag alignment), has even higher scores, both in terms of precision (+5%) and recall (+11%), with respect to the English system. On the other hand, both bracketing (see the text attribute, which indicates the quality of extent recognition) and normalization (described by the other attributes) show a performance drop. Unfortunately, the reasons for this drop are still unclear. One possible explanation is that, due to the intrinsic difficulties presented by the Italian language, the translation-based approach falls short of providing adequate coverage of the many possible TE variants. While the presence of lexical triggers denoting a TE appearing in a text (e.g.
the Italian translations of &quot;years&quot;, &quot;Monday&quot;, &quot;afternoon&quot;, &quot;yesterday&quot;) can be easily captured by this approach, the complexity of many language-specific constructs is out of its reach.</Paragraph> </Section> <Section position="2" start_page="33" end_page="33" type="sub_section"> <SectionTitle> 4.2 Using an annotated corpus </SectionTitle> <Paragraph position="0"> In a second experiment, the annotations of the training portion of I-CAB have been used as a primary knowledge source. The main purpose of this approach is to maximize the coverage of the Italian TEs, starting from language-specific knowledge mined from the corpus. The basic hypothesis is that a bottom-up porting methodology, led by knowledge in the target language, is more effective than the top-down approach based on knowledge derived from models built for other languages.</Paragraph> <Paragraph position="1"> The former, in fact, is in principle more suitable to capture language-specific TE variations. In order to test the validity of this hypothesis, the following two-step process has been set up: 1. TE Collection and translation. The Italian expressions are collected from the I-CAB training portion, and translated both into Spanish and English.</Paragraph> <Paragraph position="2"> 2. Normalization rules assignment. Italian TEs are assigned to the appropriate normalization rules. For each Italian TE mined from the corpus, the selection is done considering the normalization rules assigned to its translations. If both the Spanish and English expressions are found in their respective models, and are associated with the same normalization rule, then this rule is also assigned to the Italian expression. Likewise, when only one of the translated expressions is found in the existing models, the corresponding normalization rule is assigned. In case of discrepancies, i.e.
if both expressions are found, but are not associated with the same normalization rule, then one of the languages must be prioritized. Since the manually obtained Spanish model has shown a higher precision, Spanish rules are preferred. As the corpus-based approach is mostly built on the same software used for the translation-based porting procedure, it did not require additional time for implementation. Also in this case, the new model for Italian has been obtained in less than one hour. Performance results calculated over the I-CAB test set are reported in Table 3.</Paragraph> <Paragraph position="3"> These results partially confirm our working hypothesis, showing a performance increase in terms of the Italian TEs correctly recognized by the system. In fact, both the timex2 attribute, which indicates the coverage of the system (detection), and the text attribute, which refers to TE extent determination (bracketing), show a slight increase. This may lead to the conclusion that automatic porting procedures can actually benefit from language-specific knowledge derived from a corpus. However, looking at the other TIMEX2 attributes, the situation is less clear, due to the less coherent behaviour of the system on normalization. While for two attributes (anchor_dir and anchor_val) the system performs better, for the other two (set and val) a performance drop is observed. One possible reason is the limited number of TE examples that can be extracted from the Italian corpus (whose size is relatively small compared to the annotated corpora available for English). In fact, compared to the sum of English and Spanish examples used for the translation-based porting procedure, the Italian expressions present in the corpus are fewer and more repetitive.
For instance, with 131, 140, and 30 occurrences, the expressions &quot;oggi&quot; (&quot;today&quot;), &quot;ieri&quot; (&quot;yesterday&quot;), and &quot;domani&quot; (&quot;tomorrow&quot;) represent around 12.5% of the 2,393 Italian TEs contained in the I-CAB training set.</Paragraph> </Section> <Section position="3" start_page="33" end_page="34" type="sub_section"> <SectionTitle> 4.3 Combining online translators and an annotated corpus </SectionTitle> <Paragraph position="0"> In light of the previous considerations, a third experiment has been conducted combining the top-down approach proposed in Section 4.1 and the bottom-up approach proposed in Section 4.2. The underlying hypothesis is that the induction of an effective temporal model for Italian can benefit from the combination of the large number of examples coming from translations on the one side, and from the more precise language-specific knowledge derived from the corpus on the other. To check the validity of this hypothesis, the process described in Section 4.2 has been modified by adding a further phase. In this phase, the set of TEs derived from I-CAB is augmented with the expressions already available in the Spanish and English TE sets. The new porting process is carried out in the following steps: 1. TE Collection and translation. The Italian expressions are collected from the I-CAB training portion, and translated both into Spanish and English.</Paragraph> <Paragraph position="1"> 2. Normalization rules assignment. With the same methodology described in Section 4.2 (step 2), the Italian TEs mined from the corpus are mapped onto the appropriate normalization rules assigned to their translations. 3. TE set augmentation. The set of Italian TEs is automatically augmented with new expressions derived from the Spanish and English TE sets. As described in Section 4.1, these expressions are first translated into Italian using online translators, then filtered through Web searches.
The remaining TEs are included in the Italian model, and related to the same normalization rules assigned to the corresponding Spanish or English TEs.</Paragraph> <Paragraph position="2"> This porting experiment was also carried out with minimal modifications to the existing code. The automatic acquisition of the new model for Italian required around one hour. Evaluation results, calculated over the I-CAB test set, are presented in Table 4.</Paragraph> <Paragraph position="3"> As can be seen from the table, the combination of the two approaches leads to an overall performance improvement with respect to the previous experiments. Apart from a slight decrease in terms of detection (timex2 attribute), both bracketing and normalization performance benefit from this combination. The improvement on bracketing (text attribute) is around 4% with respect to both the previous experiments. On average, the improvement for the normalization attributes is around 15% with respect to the translation-based method (ranging from +4.5% for the set attribute, to +20% for the val attribute), and 20% with respect to the corpus-based method (ranging from +11% for the anchor_dir attribute, to +30% for the set attribute). These performance improvements are summarized in Table 5, which reports the F-Measure scores achieved by the three porting approaches.</Paragraph> <Paragraph position="4"> F-Tran.
F-Corpus F-Comb.</Paragraph> <Paragraph position="5"> These results confirm the validity of our working hypothesis, showing that: * taken in isolation, both the knowledge derived from models built for other languages, and the language-specific knowledge derived from an annotated corpus, have a limited impact on the system's performance; * taken in combination, the top-down and the bottom-up approaches can complement each other, making it possible to cope with the complexity of the porting task.</Paragraph> </Section> </Section> <Section position="6" start_page="34" end_page="35" type="metho"> <SectionTitle> 5 Comparing TERSEO with a language-specific system </SectionTitle> <Paragraph position="0"> For the sake of completeness, the results achieved by our combined porting procedure have been compared with those achieved, over the I-CAB test set, by a system specifically designed for Italian. The ITA-Chronos system (Negri and Marseglia, 2004), a multilingual system for the recognition and normalization of TEs in Italian and English, has been used for this purpose. To date, being one of the two top-performing systems at TERN 2004, Chronos represents the state of the art with respect to the TERN task. In addition, to the best of our knowledge, this is the only system effectively dealing with the Italian language. Like all the other state-of-the-art systems addressing the recognition/normalization task, ITA-Chronos is a rule-based system. From a design point of view, it shares with TERSEO a rather similar architecture which relies on different sets of rules. These are regular expressions that check for specific features of the input text, such as the presence of particular word senses, lemmas, parts of speech, symbols, or strings satisfying specific predicates (for instance, the predicates &quot;Weekday-p&quot; and &quot;Time Unit-p&quot; are satisfied, respectively, by strings such as &quot;Monday&quot;, &quot;Tuesday&quot;, ..., &quot;Sunday&quot;, and &quot;second&quot;, &quot;minute&quot;, &quot;hour&quot;, &quot;day&quot;, ..., &quot;century&quot;; this also holds for the Italian equivalents of these expressions). Each set of rules is in charge of dealing with different aspects of the problem.
In particular, a set of around 350 rules is designed for TE recognition and is capable of recognizing a broad variety of TEs with high Precision/Recall rates. Other sets of regular expressions, for a total of around 700 rules, are used in the normalization phase, and are in charge of handling each specific TIMEX2 normalization attribute. The results obtained by the Italian version of Chronos over the I-CAB test set are shown in Table 6.</Paragraph> <Paragraph position="1"> As expected, the distance between the results obtained by ITA-Chronos and the best Italian system automatically obtained from TERSEO (F-Comb) is considerable. On average, in terms of F-Measure, the scores obtained by ITA-TERSEO are 20% lower, ranging from -1.3% for the anchor_val attribute, to -57% for the text attribute. However, going beyond the raw numbers, a comprehensive evaluation must also take into account the great difference in the time, effort, and resources required to develop the two systems. While the implementation of the manual one took several months, the automatic porting procedure of TERSEO to Italian (in all three modalities described in this paper) is a very fast process that can be accomplished in less than an hour. Considering the trade-off between performance and the effort required for system development, the proposed methodology represents a viable solution to the porting problem.</Paragraph> </Section> </Paper>
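The normalization-rule assignment described in Section 4.2 (step 2) can be sketched as follows. This is only a minimal illustration in Python, not TERSEO's actual implementation: the function name `assign_rule`, the dictionary-based models, the sample expressions, and the rule identifiers are all hypothetical and chosen just to show the decision logic (agreement, single match, and Spanish priority in case of discrepancy).

```python
from typing import Dict, Optional

# Hypothetical toy models mapping a TE to a normalization rule identifier.
SPA_MODEL: Dict[str, str] = {
    "ayer": "R_YESTERDAY",
    "hoy": "R_TODAY",
    "el lunes": "R_WEEKDAY_PAST",
}
ENG_MODEL: Dict[str, str] = {
    "yesterday": "R_YESTERDAY",
    "tomorrow": "R_TOMORROW",
    "on Monday": "R_WEEKDAY_FUTURE",
}


def assign_rule(
    spa_te: str,
    eng_te: str,
    spa_model: Dict[str, str],
    eng_model: Dict[str, str],
) -> Optional[str]:
    """Pick a normalization rule for an Italian TE from its two translations."""
    spa_rule = spa_model.get(spa_te)
    eng_rule = eng_model.get(eng_te)

    # Neither translation is known: no rule can be assigned.
    if spa_rule is None and eng_rule is None:
        return None
    # Only one translation is known: use its rule.
    if spa_rule is None:
        return eng_rule
    if eng_rule is None:
        return spa_rule
    # Both are known and agree: the shared rule is assigned.
    if spa_rule == eng_rule:
        return spa_rule
    # Discrepancy: the manually built Spanish model is preferred.
    return spa_rule
```

For example, with these toy models `assign_rule("el lunes", "on Monday", SPA_MODEL, ENG_MODEL)` falls into the discrepancy branch and returns the Spanish rule, while a TE whose translations appear in neither model is left without a rule.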