<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0605">
  <Title>Using eigenvectors of the bigram graph to infer morpheme identity</Title>
  <Section position="8" start_page="3" end_page="3" type="concl">
    <SectionTitle>
4 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have presented a simple yet mathematically sound method for representing the similarity of local syntactic behavior of words in a large corpus, and suggested one practical application.</Paragraph>
    <Paragraph position="1"> We have by no means exhausted the possibilities of this treatment. For example, it seems very reasonable to adjust the number of nearest neighbors permitted in the graph based on word frequency: the higher a word's frequency, the fewer nearest neighbors it would be permitted in the graph. We leave this and other questions for future research.</Paragraph>
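The frequency-dependent neighbor limit suggested above could take many forms, and the paper does not specify one. As a minimal sketch in Python, assuming a hypothetical function `neighbors_allowed` and illustrative bounds `k_min` and `k_max` (none of these names or values come from the paper), one might interpolate on a log-frequency scale so that the limit shrinks as frequency grows:

```python
import math


def neighbors_allowed(freq, max_freq, k_min=5, k_max=50):
    """Hypothetical rule, not taken from the paper.

    Returns k_max for a word seen once and k_min for the most frequent
    word, interpolating linearly on a log-frequency scale in between.
    """
    # Clamp the inputs so the logarithms below are always defined.
    max_freq = max(1, max_freq)
    freq = max(1, min(freq, max_freq))
    if max_freq == 1:
        return k_max
    # share is 0.0 for a word seen once and 1.0 for the most frequent word.
    share = math.log(freq) / math.log(max_freq)
    return round(k_max - share * (k_max - k_min))


if __name__ == "__main__":
    # Illustrative corpus in which the most frequent word occurs 10000 times.
    for f in (1, 10, 1000, 10000):
        print(f, neighbors_allowed(f, 10000))
```

The logarithmic scale is only one possible choice, made here because word frequencies in large corpora span several orders of magnitude. Any monotonically decreasing mapping from frequency to neighbor count would fit the description equally well.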
    <Paragraph position="2"> This method does not appear strong enough at present to establish syntactic categories with sharp boundaries, but it is strong enough to determine with some reliability whether sets of words proposed by other, independent heuristics (such as the presence of suffixes determined by unsupervised learning of morphology) are syntactically homogeneous.</Paragraph>
    <Paragraph position="3"> The reader can download the files discussed in this paper and a graphical viewer from http://humanities.uchicago.edu/faculty/goldsmith/eigenvectors/.</Paragraph>
    <Paragraph position="4"> Appendix 1 Typical examples from corners of Figure 1.</Paragraph>
    <Paragraph position="5"> Bottom: be do me make see get take go say put find give provide keep run tell leave pay hold live
Left: was had has would said could did might went thought told took asked knew felt began saw gave looked became
Right: world way same united right system city case church problem company past field cost department university rate center door surface
Top: and to in that for he as with on by at or from but I they we there you who
most number kind full type secretary amount front instead member sort series rest types piece image lack
Right: of in for on by at from into after through under since during against among within along across including near
Top: going want seems seemed able wanted likely difficult according due tried decided trying related try</Paragraph>
  </Section>
</Paper>