<?xml version="1.0" standalone="yes"?>
<Paper uid="W93-0107">
  <Title>HIERARCHICAL CLUSTERING OF VERBS</Title>
  <Section position="6" start_page="76" end_page="77" type="concl">
    <SectionTitle>
4 Discussion
</SectionTitle>
    <Paragraph position="0"> Appendix 2 shows all the basic-level categories derived from a small learning set, named DPR633, that belongs to the legal corpus. CIAULA receives as input 293 examples of 30 verbs. The reason for showing DPR633 rather than an excerpt of the results derived from the full corpus is that there was no objective way to select among the more than 300 basic-level classes. In Appendix 2, the relatively low values of μ1 and μ2 are due to the small size of the example set, rather than to errors in parsing, as remarked in the previous section. Of course, the basic-level classes extracted from the larger corpora exhibit a more striking similarity among their members, indicated by higher values of global and local membership. An example of a cluster extracted from the whole legal corpus was shown in Figure 1.</Paragraph>
    <Paragraph position="1"> The example shown in Appendix 2 is, however, &quot;good enough&quot; to highlight some interesting properties of our clustering method. Each cluster has a semantic description, and the degree of local and global membership of verbs gives an objective measure of the similarity among cluster members. It is interesting to observe that the algorithm classifies different verb usages in distinct clusters. For example, clusters 4 and 6 classify two different usages of the verb indicare, e.g. indicare un'ammontare (to indicate an amount) and indicare un motivo (to specify a motivation), where &quot;ammontare&quot; is a type of AMOUNT (AM) and &quot;motivo&quot; is a type of ABSTRACT_ENTITY (AE).</Paragraph>
    <Paragraph position="2"> Clusters 13 and 14 capture the physical and abstract uses of eseguire, e.g. eseguire un'opera (to build a building (= REAL_ESTATE)) vs. eseguire un pagamento (to make a payment (= AMOUNT, ACT)).</Paragraph>
    <Paragraph position="3"> Clusters 3 and 6 classify two uses of the verb tenere, i.e. tenere un registro (to keep a record (= DOCUMENT)) vs. tenere un discorso (to give a speech (= ABSTRACT_ENTITY)). Many other (often domain-dependent) examples are reflected in the derived classification. To sum up, we believe that CIAULA has several advantages over other clustering algorithms presented in the literature.</Paragraph>
    <Paragraph position="4"> (1) Each derived cluster has a semantic description, i.e. the predicted thematic roles of its members.</Paragraph>
    <Paragraph position="5"> (2) The clustering algorithm incrementally assigns instances to classes, evaluating its choices on the basis of a formal criterion, the global utility.</Paragraph>
    <Paragraph position="6"> (3) The defined measures of typicality and generalization power make it possible to select the basic-level classes of a hierarchy, i.e. those that are the repository of most of the lexical information about their members. These classes proved substantially stable with respect to the order of presentation of the instances.</Paragraph>
    <Paragraph position="7"> (4) It is possible to discriminate different usages of verbs, since verb instances are considered individually.</Paragraph>
    <Paragraph position="8"> The hierarchy, as obtained by CIAULA, is not usable as is by an NLP system; however, class descriptions and basic-level categories appear to be very useful for guiding the intuition of the linguist.</Paragraph>
  </Section>
</Paper>