<?xml version="1.0" standalone="yes"?>
<Paper uid="J93-1007">
  <Title>Retrieving Collocations from Text: Xtract</Title>
  <Section position="4" start_page="146" end_page="148" type="intro">
    <SectionTitle>
3. Three Types of Collocations
</SectionTitle>
    <Paragraph position="0"> Collocations come in a large variety of forms. The number of words involved, as well as the way they are involved, can vary a great deal.
Computational Linguistics Volume 19, Number 1
Figure 4. Some examples of phrasal templates:
&quot;The NYSE's composite index of all its listed common stocks rose *NUMBER* to *NUMBER*&quot;
&quot;On the American Stock Exchange the market value index was up *NUMBER* at *NUMBER*&quot;
&quot;The Dow Jones average of 30 industrials fell *NUMBER* points to *NUMBER*&quot;
&quot;The closely watched index had been down about *NUMBER* points in the first hour of trading&quot;
&quot;The average finished the week with a net loss of *NUMBER*&quot;</Paragraph>
    <Paragraph position="1"> Some collocations are very rigid, whereas others are very flexible. For example, a collocation such as the one linking &quot;to make&quot; and &quot;decision&quot; can appear as &quot;to make a decision,&quot; &quot;decisions to be made,&quot; &quot;made an important decision,&quot; etc. In contrast, a collocation such as &quot;The New York Stock Exchange&quot; can appear in only one form; it is a very rigid collocation, a fixed expression. We have identified three types of collocations: rigid noun phrases, predicative relations, and phrasal templates. We discuss the three types in turn, and give some examples of collocations.</Paragraph>
    <Section position="1" start_page="147" end_page="147" type="sub_section">
      <SectionTitle>
3.1 Predicative Relations
</SectionTitle>
      <Paragraph position="0"> A predicative relation consists of two words repeatedly used together in a similar syntactic relation. These lexical relations are the most flexible type of collocation. They are hard to identify since they often correspond to interrupted word sequences in the corpus. For example, a noun and a verb will form a predicative relation if they are repeatedly used together with the noun as the object of the verb. &quot;Make-decision&quot; is a good example of a predicative relation. Similarly, an adjective repeatedly modifying a given noun such as &quot;hostile-takeover&quot; also forms a predicative relation. Examples of automatically extracted predicative relations are given in Figure 3 (see footnote 3). This class of collocations is related to Mel'čuk's lexical functions (Mel'čuk 1981), and Benson's L-type relations (Benson, Benson, and Ilson 1986b).</Paragraph>
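      <Paragraph position="1"> As a rough illustration of why such relations correspond to interrupted word sequences, the following Python sketch counts two words as co-occurring when at most max_gap other words separate them, in either order. It is not Xtract's procedure: the function name count_pairs, the max_gap default, the three toy sentences, and the tiny LEMMAS table are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical lemma table for this example only; a real system would
# need a proper morphological analyzer.
LEMMAS = {"made": "make", "decisions": "decision"}


def count_pairs(sentences, max_gap=4):
    """Count unordered word pairs separated by at most max_gap other words."""
    counts = Counter()
    for sentence in sentences:
        tokens = [LEMMAS.get(t, t) for t in sentence.lower().split()]
        for i, left in enumerate(tokens):
            # Pair the word with each later word inside the allowed gap.
            for right in tokens[i + 1 : i + 2 + max_gap]:
                if left != right:
                    # A frozenset key ignores the order of the two words.
                    counts[frozenset((left, right))] += 1
    return counts


corpus = [
    "they had to make a decision",
    "decisions to be made",
    "she made an important decision",
]
pairs = count_pairs(corpus)
print(pairs[frozenset(("make", "decision"))])  # prints 3
```

The pair is found in all three sentences even though the two words are never adjacent and appear in both orders. Raw counts of this kind say nothing about the syntactic relation between the words, which the definition above also requires.</Paragraph>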
    </Section>
    <Section position="2" start_page="147" end_page="148" type="sub_section">
      <SectionTitle>
3.2 Rigid Noun Phrases
</SectionTitle>
      <Paragraph position="0"> Rigid noun phrases involve uninterrupted sequences of words such as &quot;stock market,&quot; &quot;foreign exchange,&quot; &quot;New York Stock Exchange,&quot; and &quot;The Dow Jones average of 30 industrials.&quot; They can include nouns and adjectives as well as closed class words, and are similar to the type of collocations retrieved by Choueka (1988) and Amsler (1989). They are the most rigid type of collocation. Examples of rigid noun phrases (see footnote 4) are: &quot;The NYSE's composite index of all its listed common stocks,&quot; &quot;The NASDAQ composite index for the over the counter market,&quot; &quot;leveraged buyout,&quot; &quot;the gross national product,&quot; and &quot;White House spokesman Marlin Fitzwater.&quot; In general, rigid noun phrases cannot be broken into smaller fragments without losing their meaning; they are lexical units in and of themselves. Moreover, they often refer to important concepts in a domain, and several rigid noun phrases can be used to express the same concept.
Footnote 3: In the examples, the &quot;[]&quot; sign represents a gap of zero, one, or several words. The &quot;⇔&quot; sign means that the two words can be in any order.
Footnote 4: All the examples related to the stock market domain have actually been retrieved by Xtract.</Paragraph>
    </Section>
    <Section position="3" start_page="148" end_page="148" type="sub_section">
      <SectionTitle>
Frank Smadja Retrieving Collocations from Text: Xtract
</SectionTitle>
      <Paragraph position="0"> In the New York Stock Exchange domain, for example, &quot;The Dow industrials,&quot; &quot;The Dow Jones average of 30 industrial stocks,&quot; &quot;the Dow Jones industrial average,&quot; and &quot;The Dow Jones industrials&quot; represent several ways to express a single concept. As we have seen before, these rigid noun phrases do not seem to follow any simple construction rule; for example, the forms given in sentences 6-8 at the beginning of the paper are all incorrect.</Paragraph>
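      <Paragraph position="1"> One way to picture several rigid noun phrases expressing a single concept is a lookup table keyed by surface form, matched only against uninterrupted token sequences. The Python sketch below is a minimal illustration under stated assumptions: the concept label DOW, the table VARIANTS, and the function find_concepts are hypothetical names, and only the four surface forms come from this section.

```python
# Hypothetical concept label; the four surface forms are quoted from the text.
DOW = "DOW_JONES_INDUSTRIAL_AVERAGE"
VARIANTS = {
    "the dow industrials": DOW,
    "the dow jones average of 30 industrial stocks": DOW,
    "the dow jones industrial average": DOW,
    "the dow jones industrials": DOW,
}


def find_concepts(sentence):
    """Return the concept of every listed phrase found as a contiguous run."""
    tokens = sentence.lower().split()
    found = []
    for phrase, concept in VARIANTS.items():
        words = phrase.split()
        n = len(words)
        # A rigid noun phrase must match as an uninterrupted token sequence.
        if any(tokens[i : i + n] == words for i in range(len(tokens) - n + 1)):
            found.append(concept)
    return found


print(find_concepts("The Dow Jones industrial average fell sharply"))
print(find_concepts("The Dow Jones closely watched industrial average fell"))
```

The first call reports the concept once; the second reports nothing, because an inserted modifier breaks the fixed sequence. Tokenization here is plain whitespace splitting, so punctuation attached to a word would prevent a match.</Paragraph>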
    </Section>
    <Section position="4" start_page="148" end_page="148" type="sub_section">
      <SectionTitle>
3.3 Phrasal Templates
</SectionTitle>
      <Paragraph position="0"> Phrasal templates consist of idiomatic phrases containing one, several, or no empty slots. They are phrase-long collocations. Figure 4 lists some examples of phrasal templates in the stock market domain. In the figure, the empty slots must be filled in by a number (indicated by *NUMBER* in the figure). More generally, phrasal templates specify the parts of speech of the words that can fill the empty slots. Phrasal templates are quite representative of a given domain and are very often repeated in a rigid way in a given sublanguage. In the domain of weather reports, for example, the sentence &quot;Temperatures indicate previous day's high and overnight low to 8 a.m.&quot; is actually repeated before each weather report. Unlike rigid noun phrases and predicative relations, phrasal templates are specifically useful for language generation. Because of their slightly idiosyncratic structure, generating them from single words is often a very difficult task for a language generator. As pointed out by Kukich (1983), in general, their usage gives an impression of fluency that could not be equaled with compositional generation alone.</Paragraph>
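      <Paragraph position="1"> The slot mechanism can be sketched in a few lines of Python. The example below takes one template from Figure 4 and shows both directions: recognizing a sentence that instantiates the template, and generating a sentence by filling its slots. The function names template_to_regex and fill_template, the accepted number format, and the sample figures 12.5 and 2,645.9 are assumptions made for illustration; only *NUMBER* slots are handled, not the more general part-of-speech slots mentioned above.

```python
import re

TEMPLATE = "The Dow Jones average of 30 industrials fell *NUMBER* points to *NUMBER*"


def template_to_regex(template, slot="*NUMBER*"):
    """Build a regex where each slot matches a number and the rest is literal."""
    literal_parts = [re.escape(part) for part in template.split(slot)]
    # Assumed number format: digits with optional commas and a decimal part.
    number = r"(\d[\d,]*(?:\.\d+)?)"
    return re.compile(number.join(literal_parts))


def fill_template(template, values, slot="*NUMBER*"):
    """Fill the slots left to right; the fixed wording is kept verbatim."""
    parts = template.split(slot)
    if len(values) != len(parts) - 1:
        raise ValueError("one value is required per slot")
    filled = parts[0]
    for value, part in zip(values, parts[1:]):
        filled += str(value) + part
    return filled


pattern = template_to_regex(TEMPLATE)
sentence = fill_template(TEMPLATE, ["12.5", "2,645.9"])
print(sentence)
print(pattern.fullmatch(sentence).groups())  # prints ('12.5', '2,645.9')
```

Because the fixed wording is stored whole, a generator using such a table never has to assemble the phrase from single words, which is the difficulty the paragraph above describes.</Paragraph>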
    </Section>
  </Section>
</Paper>