<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2212">
  <Title>Hierarchical Clustering of Words</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> This paper describes a data-driven method for hierarchical clustering of words in which a large vocabulary of English words is clustered bottom-up, with respect to corpora ranging in size from 5 to 50 million words, using a greedy algorithm that tries to minimize average loss of mutual information of adjacent classes. The resulting hierarchical clusters of words are then naturally transformed to a bit-string representation of (i.e. word bits for) all the words in the vocabulary. Introducing word bits into the ATR Decision-Tree POS Tagger is shown to significantly reduce the tagging error rate. Portability of word bits from one domain to another is also discussed.</Paragraph>
  </Section>
</Paper>