<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0412">
  <Title>PhraseNet: Towards Context Sensitive Lexical Semantics</Title>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 The Design of PhraseNet
</SectionTitle>
    <Paragraph position="0"> Context is one important notion in PhraseNet. While the context may mean different things in natural language, many previous work in statistically natural language processing defined &amp;quot;context&amp;quot; as an n-word window around the target word (Gale et al., 1992; Brown et al., 1991; Roth, 1998). In PhraseNet, &amp;quot;context&amp;quot; has a more precise definition that depends on the grammatical structure of a sentence rather than simply counting surrounding words.</Paragraph>
    <Paragraph position="1"> We define &amp;quot;context&amp;quot; to be the syntactic structure of the sentence in which the word of interest occurs. Specifically, we define this notion at three abstraction levels.</Paragraph>
    <Paragraph position="2"> The highest level is the abstract syntactic skeleton of the sentence. That is, it is in the form of the different combinations of six syntactic components. Some components may be missing as long as the structure is from a legitimate English sentence. The most complete form of the abstract syntactic skeleton is:</Paragraph>
    <Paragraph position="4"> which captures all of the six syntactic components such as Subject, Verb, Direct Object, Indirect Object, Preposition and Noun(Object) of Preposition, respectively, in the sentence. And all components are assumed to be arranged to obey the word order in English. The lowest level of contexts is the concrete instantiation of the stated syntactic skeleton, such as [Mary(S)!give(V)!</Paragraph>
    <Paragraph position="6"> which are extracted directly from corpora with grammatical lemmatization done during the process. Therefore, all word tokens are in their lemma format. The middle layer(s) consists of generalized formats of the syntactic skeleton. For example, the first example given above can be generalized as [Peop(S)!give(V)!Peop(DO)!</Paragraph>
    <Paragraph position="8"> some of its components with more abstract semantic concepts. null PhraseNet organizes nouns and verbs into &amp;quot;consets&amp;quot; and a &amp;quot;conset&amp;quot; is defined as a context with all its corresponding pointers (edges) to other consets. The context that forms a conset can be either directly extracted from the corpus, or at a certain level of abstraction. For example, both [Mary(S) ! eat(V) !</Paragraph>
    <Paragraph position="10"> Two types of relational pointers are defined currently in PhraseNet: Equal and Hyper. Both of these two relations are based on the context of each conset. Equal is defined among consets with same number of components and same syntactic ordering, i.e, some contexts under the same abstract syntactic structure (the highest level of context as defined in this paper). It is defined that the Equal relation exists among consets whose contexts are with same abstract syntactic skeleton, if there is only one component at the same position that is different. For example, [Mary(S)!give(V)!John(DO)!</Paragraph>
    <Paragraph position="12"> cause the syntactic skeleton each of them has is the same, i.e., [(S) ! (V) ! (DO) ! (IO) ! (P) ! (N)] and except one word in the verb position that is different, i.e., &amp;quot;give&amp;quot; and &amp;quot;send&amp;quot;, all other five components at the corresponding same position are the same. The Equal relation is transitive only with regard to a specific component in the same position. For example, to be transitive to the above two example consets, the Equal conset should be also different from them only by its verb. The Hyper relation is also defined for consets with same abstract syntactic structure. For conset A and conset B, if they have the same syntactic structure, and if there is at least one component of the context in A that is the hypernym of the component in that of B at the corresponding same position, and all other components are the same respectively, A is the Hyper conset of B. For example, both [Molly(S) ! hit(V) !  arrow denotes the Hyper relation and the dotted two-way arrow with a V above denotes the Equal relation that is transitive with regard to the V component.</Paragraph>
    <Paragraph position="13"> tion can cluster a list of words which occur in exactly the same contextual structure and if the extreme case occurs, namely when the same context in all these equal consets with regard to a specific syntactic component groups virtually any nouns or verbs, the Hyper relation can be used here for further disambiguation.</Paragraph>
    <Paragraph position="14"> To summarize, PhraseNet can be thought of as a graph on consets. Each node is a context and edges between nodes are relations defined by the context of each node.</Paragraph>
    <Paragraph position="15"> They are either Equal or Hyper. Equal relation can be derived by matching consets and it is easy to implement while building the Hyper relation requires the assistance of WordNet and the defined Equal relation. Semantic relations among words can be generated using the two types of defined edges. For example, it is likely that the target words in all equal consets with transitivity have similar meaning. If this is not true at the lowest lower of contexts, it is more likely to be true at higher, i.e., more generalized level. Figure 1 shows a simple example reflecting the preliminary design of PhraseNet.</Paragraph>
    <Paragraph position="16"> After we get the similar meaning lists based on their contexts, we can build interaction from this word list to WordNet and inherit other semantic relations from Word-Net. However, each member of a word list can help to disambiguate other members in this list. Therefore, it is expected that with the pruning assisted by list members, i.e., the disambiguation by truncating semantic relations associated with each synset in WordNet, the extract meaning in the context together with all other semantic relations such as hypernyms, holonyms, troponyms, antonyms can be derived from WordNet.</Paragraph>
    <Paragraph position="17"> In the next two sections we describe our current implementation of these operations and preliminary experiments we have done with them.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Accessing PhraseNet
</SectionTitle>
      <Paragraph position="0"> Retrieval of information from PhraseNet is done via several access functions that we describe below. PhraseNet is designed to be accessed via multiple functions with flexible input modes set by the user. These functions may allow users to exploit several different functionalities of PhraseNet, depending on their goal and amount of resources they have.</Paragraph>
      <Paragraph position="1"> An access function in PhraseNet has two components.</Paragraph>
      <Paragraph position="2"> The first component is the input, which can vary from a single word token to a word with its complete context. The second component is the functionality, which ranges over simple retrieval and several relational functions, modelled after WordNet relations.</Paragraph>
      <Paragraph position="3"> The most basic and simplest way to query PhraseNet is with a single word. In this case, the system outputs all contexts the word can occur in, and its related words in each context.</Paragraph>
      <Paragraph position="4"> PhraseNet can also be accessed with input that consists of a single word token along with its context information. Context information refers to any of the elements in the syntactic skeleton defined in Eq. 1, namely, Subject(S), Verb(V), Direct Object(DO), Indirect Object(IO), Preposition(P) and Noun(Object) of the Preposition(N). The contextual roles S, V, DO, IO, P or N or any subset of them, can be specified by the user or derived by an application making use of a shallow or full parser. The more information the user provides, the more specific the retrieved information is.</Paragraph>
      <Paragraph position="5"> To ease the requirements from the user, say, in case no information of this form is available to the user, PhraseNet will, in the future, have functions that allow a user to supply a word token and some context, where the functionality of the word in the context is not specified.  functions along with their input and output. [i] denotes optional input. PN RL is a family of functions, modelled after WordNet relations.</Paragraph>
      <Paragraph position="6"> Table 1 lists the functionality of the access functions in PhraseNet. If the user only input a word token without any context, all those designed functions will return each context the input word occurs together with the wordlist in these contexts. Otherwise, the output is constrained by the input context. The functions are described below: PN WL takes the optional contextual skeleton and one specified word in that context as inputs and returns the corresponding wordlist occurring in that context or a higher level of context. A parameter to this function specifies if we want to get the complete wordlist or those words in the list that satisfy a specific pruning criterion. (This is the function used in the experiments in Sec. 4.) PN RL is modelled after the WordNet access functions.</Paragraph>
      <Paragraph position="7"> It will return all words in those contexts that are linked in PhraseNet by their Equal or Hyper relation. Those words can help to access WordNet to derive all lexical relations stored there.</Paragraph>
      <Paragraph position="8"> PN SN is modelled after the semantic concordance in (Landes et al., 1998). It takes a word token and an optional context as input, and returns the sense of the word in that context. Similarly to PN RL this function is implemented by appealing to WordNet senses and pruning the possible sense based on the wordlist determined for the given context.</Paragraph>
      <Paragraph position="10"> to output a sentence that has same structure as the input context, but use different words. It is inspired by the work on reformulation, e.g., (Barzilay and McKeown, 2001).</Paragraph>
      <Paragraph position="11"> We can envision many ways users of PhraseNet can make use of the retrieved information. At this point in the life of PhraseNet we focus mostly on using PhraseNet as a way to acquire semantic features to aid learning based natural language applications. This determines our priorities in the implementation that we describe next.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Constructing PhraseNet
</SectionTitle>
    <Paragraph position="0"> Constructing PhraseNet involves three main stages: (1) extracting syntactic skeletons from corpora, (2) constructing the core element in PhraseNet: consets, and (3) developing access functions.</Paragraph>
    <Paragraph position="1"> The first stage makes use of fully parsed data. In constructing the current version of PhraseNet we used two corpora. The first, relatively small corpus of the 1:1 million-word Penn-State Treebank which consists of American English news articles (WSJ), and is fully parsed. The second corpus has about 5 million sentences of the TREC-11 (Voorhees, 2002), also containing mostly American English news articles (NYT, 1998) and parsed with Dekang Lin's minipar parser (Lin, 1998a).</Paragraph>
    <Paragraph position="2"> In the near future we are planning to construct a much larger version of PhraseNet, using Trec-10 and Trec-11 data sets, which cover about 8 GB of text. We believe that the size is very important here, and will add significant robustness to our results.</Paragraph>
    <Paragraph position="3"> To reduce ostensibly different contexts, two important abstractions take place at this stage. (1) Syntactic lemmatization to get the lemma for both nouns and verbs in the context defined in Eq. 1. For data parsed via Lin's minipar, the lexeme of each word is already included in the parser. (2) Sematic categorization to unify pronouns, proper names of people, locations and organization as well as numbers. This semantic abstraction captures the underlying semantic proximity by categorizing multitudinous surface-form proper names into one representing symbol.</Paragraph>
    <Paragraph position="4"> While the first abstraction is simple the second is not.</Paragraph>
    <Paragraph position="5"> At this point we use an NE tagger we developed ourselves based on the approach to phrase identification developed in (Punyakanok and Roth, 2001). Note that this abstraction handles multiword phrases. While the accuracy of the NE tagger is around 90%, we have yet to experiment with the implication of this additional noise on PhraseNet.</Paragraph>
    <Paragraph position="6"> At the end of this stage, each sentence in the original corpora is transformed into a single context either at the lowest level or a more generalized instantiation (with name entity tagged). For example, &amp;quot;For six years, T. Marshall Hahn Jr. has made corporate acquisitions in the George Bush mode: kind and gentle.&amp;quot;, changes to: [Peop!make!acquisition!in!mode]: The second stage of constructing PhraseNet concentrates on constructing the core element in PhraseNet: consets.</Paragraph>
    <Paragraph position="7"> To do that, for each context, we collect wordlists that contain those words that we determine to be admissible in the context(or contexts share the equal relation). The first step in constructing the wordlists in PhraseNet is to follow the most strict definition - include those words that actually occur in the same context in the corpus. This involves all Equal consets with the transitive property to a specific syntactic component. We then apply to the wordlists three types of pruning operations that are based on (1) frequency of word occurrences in identical or similar contexts; (2) categorization of words in wordlist based on clustering all contexts they occur in, and (3) pruning via the relational structure inherited from WordNet - we prune from the wordlist outliers in terms of this relational structure. Some of these operations are parameterized and determining the optimal setting is an experimental issue.</Paragraph>
    <Paragraph position="8"> 1. Every word in a conset wordlist has a frequency record associated with it, which records the frequency of the word in its exact context. We prune words with a frequency below k (with the current corpus we choose k = 3). A disadvantage of this pruning method is that it might filter out some appropriate words with a low frequency in reality.</Paragraph>
    <Paragraph position="9"> For example, for the partial context [strategy ! involve!/!/!/], we have: [strategy - involve - * - * - *, &lt; DO : advertisement 4, abuse 1, campaign 2, compromise 1, everything 1, fumigation 1, item 1, membership 1, option 3, stockoption 1&gt; ] In this case,&amp;quot;strategy&amp;quot; is the subject and &amp;quot;involve&amp;quot; is the predicate and all words in the list serve as the direct object. The number in the parentheses is the frequency of the token. With k = 3 we actually get as a wordlist only: &lt; advertisment;option &gt;.</Paragraph>
    <Paragraph position="10">  2. There are several ways to prune wordlists based on the different contexts words may occur in. This involves a definition of similar contexts and thresholding based on the number of such contexts a word occurs in. At this point, we implement the construction of PhraseNet using a clustering of contexts, as done in (Pantel and Lin, 2002). An exhaustive PhraseNet list is intersected with word lists generated based on clustered contexts given by (Pantel and Lin, 2002).</Paragraph>
    <Paragraph position="11"> 3. We prune from the wordlist outliers in terms of the  relational structure inherited from WordNet. Currently, this is implemented only using the hypernym relation. The hypernym shared by the highest number of words in the wordlist is kept in the database. For example, by searching &amp;quot;option&amp;quot; in WordNet, we get its three senses. Then we collect the hypernyms of 'option' from all the senses as follows: 05319492(a financial instrument whose value is based on another security) 04869064(the cognitive process of reaching a decision) null 00026065(something done) We do this for every word in the original list and find out the hypernym(s) shared by the highest number of words in the original wordlist. The final pick in this case is the synset 05319492 which is shared by both &amp;quot;option&amp;quot; and &amp;quot;stock option&amp;quot; as their hypernym. The third stage is to develop the access functions. As mentioned before, while we envision many ways users of PhraseNet can use the retrieved information, at this preliminary stage of PhraseNet we focus mostly on using PhraseNet as a way to supply abstract semantic features that learning based natural language applications can benefit from.</Paragraph>
    <Paragraph position="12"> For this purpose, so far we have only used and evaluated the function PN WL. PN WL takes as input as specific word and (optionally) its context and returns a lists of words which are semantically related to the target word in the given context. For example,</Paragraph>
    <Paragraph position="14"> [protest, resist, dissent, veto, blackball, negative, forbid, prohibit, interdict, proscribe, disallow ].</Paragraph>
    <Paragraph position="15"> This function can be implemented via any of the three pruning methods discussed earlier (see Sec. 4). This wordlists that this function outputs, can be used to augment feature based representations for other, learning based, NLP tasks. Other access functions of PhraseNet can serve in other ways, e.g., expansions in information retrieval, but we have not experimented with it yet.</Paragraph>
    <Paragraph position="16"> With the experiments we are doing right now, PhraseNet only takes inputs with the context information in the format of Eq. 1. Semantic categorization and syntactic lemmatization of the context is required in order to get matched in the database. However, PhraseNet will, in the future, have functions that allow a user to supply a word token and more flexible contexts.</Paragraph>
  </Section>
class="xml-element"></Paper>