<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-1042">
  <Title>On compositional semantics</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
WLODZ@WATSON.IBM.COM
</SectionTitle>
    <Paragraph position="0"> Abstract. We prove a theorem stating that any semantics can be encoded as a compositional semantics, which means that, essentially, the standard definition of compositionality is formally vacuous. We then show that when one requires compositional semantics to be &quot;systematic&quot; (that is, the meaning function cannot be arbitrary, but must belong to some class), one can easily distinguish between compositional and non-compositional semantics. We also present an example of a simple grammar for which there is no &quot;systematic&quot; compositional semantics. This implies that it is possible to distinguish &quot;good&quot; and &quot;bad&quot; grammars on the basis of whether they can have compositional semantics.</Paragraph>
    <Paragraph position="1"> As a result, we believe that the paper clarifies the concept of compositionality and opens a possibility of making systematic comparisons of different systems of grammars and NLU programs.</Paragraph>
    <Paragraph position="2"> 1. Introduction.</Paragraph>
    <Paragraph position="3"> Compositionality is defined as the property that the meaning of a whole is a function of the meaning of its parts (cf. e.g. Keenan and Faltz (1985), pp. 24-25).¹ This definition, although intuitively clear, does not work formally. For instance, Hirst (1987), pp. 27-43, claims that the semantics of Woods (1967) and Woods (1969) is not compositional, because &quot;the interpretation of the word depart varies as different prepositional phrases are attached to it&quot;: AA-57 departs from Boston =&gt; depart(aa-57, boston).</Paragraph>
    <Paragraph position="4"> AA-57 departs from Boston to Chicago =&gt; connect(aa-57, boston, chicago).</Paragraph>
    <Paragraph position="5"> AA-57 departs from Boston on Monday =&gt; dday(aa-57, boston, monday).</Paragraph>
    <Paragraph position="6"> AA-57 departs from Boston at 8:00 a.m.</Paragraph>
    <Paragraph position="7"> =&gt; equal(dtime(aa-57, boston), 8:00am).</Paragraph>
    <Paragraph position="8"> AA-57 departs from Boston after 8:00 a.m.</Paragraph>
    <Paragraph position="9"> =&gt; greater(dtime(aa-57, boston), 8:00am).</Paragraph>
    <Paragraph position="10"> AA-57 departs from Boston before 8:00 a.m.</Paragraph>
    <Paragraph position="11"> =&gt; greater(8:00am, dtime(aa-57, boston)).</Paragraph>
    <Paragraph position="12"> Although this semantics does look noncompositional, it is easy to create a function that produces the meanings of all these sentences from the meanings of their parts: we can simply define such a function by cases, where the meaning of departs/from/to is connect, the meaning of departs/from/on is dday, and so on. Hirst therefore changes the definition of compositionality to &quot;the meaning of a whole is a systematic meaning of the parts&quot; (op. cit. p. 27; the emphasis is ours), but without defining the meaning of the word &quot;systematic.&quot; In this paper we show that, indeed, Hirst was right in assuming that the standard definition of compositionality has to be amended. Namely, we prove a theorem stating that any semantics can be encoded as a compositional semantics, which means that, essentially, the standard definition of compositionality is formally vacuous. We then show that when one requires compositional semantics to be &quot;systematic&quot; (i.e. the meaning function must belong to some class), one can easily distinguish between compositional and non-compositional semantics. We also give an example of a simple grammar for which there is no &quot;systematic&quot; compositional semantics. This result implies that it is possible to distinguish &quot;good&quot; and &quot;bad&quot; grammars on the basis of whether they can have a compositional semantics with a meaning function belonging to a certain class. As a result, we believe that the paper finally clarifies the concept of compositionality and opens a possibility of making systematic comparisons of different systems of grammars and NLU programs. (Footnote 1: An equivalent definition, e.g. Partee et al. (1990), postulates the existence of a homomorphism from syntax to semantics.) [ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992; PROC. OF COLING-92, NANTES, AUG. 23-28, 1992]</Paragraph>
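    <Paragraph> The definition by cases described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the table CASES, the function name meaning_by_cases, and the choice of a tuple of prepositions as the lookup key are hypothetical and are not taken from Woods or Hirst. Only the first three example translations are covered.

```python
# Hypothetical lookup table: the predicate contributed by "departs",
# keyed by the prepositions attached to it (an assumption for illustration).
CASES = {
    ("from",): "depart",
    ("from", "to"): "connect",
    ("from", "on"): "dday",
}


def meaning_by_cases(subject, phrases):
    """Build a logical form from a subject and (preposition, object) pairs.

    The predicate is chosen by cases on the prepositions alone, so the
    result is trivially a function of the parts.
    """
    prepositions = tuple(prep for prep, _ in phrases)
    predicate = CASES[prepositions]
    arguments = [subject] + [obj for _, obj in phrases]
    return predicate + "(" + ", ".join(arguments) + ")"


print(meaning_by_cases("aa-57", [("from", "boston")]))
print(meaning_by_cases("aa-57", [("from", "boston"), ("to", "chicago")]))
```

    Nothing in such a table is systematic: any pattern can be mapped to any predicate, which is exactly why the bare functional definition constrains so little.</Paragraph>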
    <Paragraph position="13"> 2. Some compositional meaning function can always be found. Compositional semantics, or CS, is usually defined as a functional dependence of the meaning of an expression on the meanings of its parts. One of the first natural questions we might want to ask is whether a set of NL expressions, i.e. a language, can have some CS.</Paragraph>
    <Paragraph position="14"> This question has been answered positively by van Benthem (1982). However, his result says nothing about what kinds of things should be assigned e.g. to nouns, where, obviously, we would like nouns to be mapped into sets of entities, or something like that. That is, we want semantics to encode some basic intuitions, e.g.</Paragraph>
    <Paragraph position="15"> that nouns denote sets of entities, and verbs denote relations between entities, and so on.</Paragraph>
    <Paragraph position="16"> So what about having a compositional semantics that agrees with intuitions? That is, the question is whether, after deciding what sentences and their parts mean, we can find a function that would compose the meaning of a whole from the meanings of its parts.</Paragraph>
    <Paragraph position="17"> The answer to this question is somewhat disturbing. It turns out that whatever we decide that some language expressions should mean, it is always possible to produce a function that would give CS to it (see below for a more precise formulation of this fact). The upshot is that compositionality, as commonly defined, is not a strong constraint on a semantic theory.</Paragraph>
    <Paragraph position="18"> The intuitions behind this result can be illustrated quite simply: Consider the language of finite strings of digits from 0 to 7. Let's fix a random function (i.e. an intuitively bizarre function) from this language into {0,1}. Let the meaning function be defined as the value of the string as the corresponding number in base 8 if the value of the function is 0, and in base 10 otherwise. Clearly, the meaning of any string is a composition of the meanings of digits (notice that the values of the digits are the same in both bases). But, intuitively, this situation is different from standard cases when we consider only one base and the meaning of a string is given by a simple formula referring only to digits and their positions in the string. The theorem we prove below shows that however complex the language is, and whatever strange meanings we want to assign to its expressions, we can always do it compositionally.</Paragraph>
    <Paragraph position="19"> One of the more bizarre consequences of this fact is that we do not have to start building compositional semantics for NL beginning with assigning meanings to words. We can do equally well by assigning meanings to phonemes or even LETTERS, assuring that, for any sentence, the intuitive meaning we associate with it would be a function of the meaning of the letters from which this sentence is composed.</Paragraph>
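    <Paragraph> The digit-string illustration can be made concrete with a minimal Python sketch. The selector below (parity of the digit sum) is our own stand-in for the &quot;random&quot; function into {0,1}; any other 0/1-valued function on strings would serve equally well, and the function names are hypothetical.

```python
def selector(digits):
    """An arbitrary 0/1-valued function on digit strings.

    Stand-in for the 'random' function in the text; this particular
    rule (parity of the digit sum) is an assumption for illustration.
    """
    return sum(int(d) for d in digits) % 2


def meaning(digits):
    """Read the string in base 8 if selector gives 0, in base 10 otherwise."""
    base = 8 if selector(digits) == 0 else 10
    return int(digits, base)


for numeral in ["17", "12", "11", "10"]:
    print(numeral, selector(numeral), meaning(numeral))
```

    Each digit keeps the same value in both bases, yet the value of the whole string depends on a choice that no simple formula over digits and positions captures.</Paragraph>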
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
PROVING EXISTENCE OF COMPOSITIONAL SEMANTICS
</SectionTitle>
    <Paragraph position="0"> Let S be any collection of expressions (intuitively, sentences and their parts). Let M be a set s.t. for any s ∈ S, there is m = m(s) which is a member of M s.t. m is the meaning of s. We want to show that there is a compositional semantics for S which agrees with the function associating m with m(s), which will be denoted by m(x).</Paragraph>
    <Paragraph position="1"> Since elements of M can be of any type, we do not automatically have (for all elements of S) m(s.t) = m(s)#m(t) (where # is some operation on the meanings). To get this kind of homomorphism we have to perform a type raising operation that would map elements of S into functions and then the functions into the required meanings. We begin by trivially extending the language S by adding to it an &quot;end of expression&quot; character $, which may appear only as the last element of any expression. The purpose of it is to encode the function m(x) in the following way: The meaning function μ that provides compositional semantics for S maps it into a set of functions in such a way that μ(s.t) = μ(s)(μ(t)). We want the original semantics to be easily decoded from μ(s), and therefore we require that, for all s, μ(s.$) = m(s). Note that such a type raising operation is quite common both in mathematics (e.g. 1 being a function equal to 1 for all values) and in mathematical linguistics. Secondly, we assume here that there is only one way of composing elements of S, by concatenation,² but all our arguments work for languages with many operators as well.</Paragraph>
    <Paragraph position="2"> THEOREM. Under the above assumptions,</Paragraph>
    <Paragraph position="3"> there is a function μ s.t., for all s, μ(s.t) = μ(s)(μ(t)), and μ(s.$) = m(s).</Paragraph>
    <Paragraph position="4"> Proof. See Section 5.1.</Paragraph>
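    <Paragraph> The two conditions of the theorem can be illustrated with a small Python sketch. It is not the construction used in the proof, which solves a system of set equations; here each raised meaning is simply an object that remembers its string. The names Raised, make_mu and odd_meaning are hypothetical, and odd_meaning is a deliberately arbitrary meaning function.

```python
END = "$"  # end-of-expression marker; the theorem sets mu($) = $


class Raised:
    """Type-raised meaning of the expression `text` (class name is ours)."""

    def __init__(self, text, m):
        self.text = text
        self.m = m

    def __call__(self, argument):
        if argument == END:
            # mu(s.$) = mu(s)(mu($)) = m(s): decode the original meaning
            return self.m(self.text)
        # mu(s.t) = mu(s)(mu(t)): applying to mu(t) gives the meaning of s.t
        return Raised(self.text + argument.text, self.m)

    def __eq__(self, other):
        return isinstance(other, Raised) and self.text == other.text


def make_mu(m):
    """Return a compositional mu for an arbitrary meaning function m."""
    def mu(s):
        return END if s == END else Raised(s, m)
    return mu


def odd_meaning(s):
    """An intentionally arbitrary assignment of meanings to strings."""
    return s[::-1].upper() if len(s) % 2 else len(s)


mu = make_mu(odd_meaning)
print(mu("a")(mu("b")) == mu("ab"))
print(mu("c")(mu("a"))(mu("t"))(mu(END)))
```

    The sketch starts from letters rather than words, which mirrors the remark that meanings can be assigned compositionally even at that level.</Paragraph>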
    <Paragraph position="5"> 3. What do we really want from compositional semantics? In view of the above theorem, any semantics is equivalent to a compositional semantics, and hence it would be meaningless to keep the definition of compositionality as the existence of a homomorphism from syntax to semantics without imposing some conditions on this homomorphism. Notice that requiring the computability of the meaning function won't do.³ Proposition. If the original function m(x) is computable, so is the meaning function μ(x).</Paragraph>
    <Paragraph position="6"> Proof. See the proof of the solution lemma in Aczel (1987).</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 What do we really want?
</SectionTitle>
      <Paragraph position="0"> We have some intuitions and a bunch of examples associated with the concept of compositionality; e.g. for NP -&gt; Adj N, we can map nouns and adjectives into sets and the concatenation into set intersection, and get an intuitively correct semantics for expressions like &quot;grey carpet&quot;, &quot;blue dog&quot;, etc.</Paragraph>
      <Paragraph position="1"> There seem to be two issues here: (1) Such procedures work for limited domains like &quot;everyday solids&quot; and colors; (2) The function that composes the meanings should be &quot;easily&quot; definable, e.g. in terms of boolean operations on sets. This can be made precise for instance along the lines of Manaster-Ramer and Zadrozny (1990), where we argue that one can compare the expressive power of various grammatical formalisms in terms of relations that they allow us to define; the same approach can be applied to semantics, as we show below.</Paragraph>
      <Paragraph position="2"> 3.2. A simple grammar without a systematic semantics. If meanings have to be expressed using a certain natural, but restricted, set of operations, it may turn out that even simple grammars do not have a compositional semantics.</Paragraph>
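      <Paragraph> The set-intersection reading of NP -&gt; Adj N can be sketched as follows. The toy denotations and the function name adj_noun are our own assumptions, chosen only to make the idea concrete.

```python
# Toy denotations: each word denotes a set of entities (invented data).
GREY = {"carpet1", "dog2", "cloud3"}
BLUE = {"dog5", "carpet4"}
CARPET = {"carpet1", "carpet4"}
DOG = {"dog2", "dog5"}


def adj_noun(adjective, noun):
    """Meaning of 'Adj N' as the intersection of the two denotations."""
    return adjective.intersection(noun)


print(adj_noun(GREY, CARPET))
print(adj_noun(BLUE, DOG))
```

      Here the composing function is a single boolean operation on sets, which is the kind of &quot;easily&quot; definable function the discussion has in mind.</Paragraph>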
      <Paragraph position="3"> Consider two grammars of numerals in base 10, ND and DN. [The grammar listings are missing from the extracted text.] [Footnote 2, beginning missing: ... semantics to parse trees, not to strings of words. But the method of proof can be modified to handle the case when concatenation is associative.]</Paragraph>
      <Paragraph position="4"> [Footnote 3: Also, note that in mathematics (where semantics obviously is compositional) we can talk about noncomputable functions, and it is usually clear what we postulate about them.] For the grammar ND, the meaning of any numeral can be expressed in the model as μ(N D) = 10 × μ(N) + μ(D),</Paragraph>
      <Paragraph position="6"> that is a polynomial in two variables with coefficients in natural numbers.</Paragraph>
      <Paragraph position="7"> For the grammar DN, we can prove that no such polynomial exists, that is: Theorem. There is no polynomial p in the variables μ(D), μ(N) such that μ(D N) = p(μ(D), μ(N)) and such that the value of μ(D N) is the number expressed by the string D N in base 10.</Paragraph>
      <Paragraph position="8"> Proof. See Section 5.2.</Paragraph>
      <Paragraph position="9"> But notice that there is a compositional semantics for the grammar DN that does not agree with intuitions: μ(D N) = 10 × μ(N) + μ(D), which corresponds to reading the number backwards. And there are many other semantics corresponding to all possible polynomials in μ(D) and μ(N).</Paragraph>
      <Paragraph position="10"> Also observe that (a) if we specify enough values of the meaning function we can exclude any particular polynomial; (b) if we do not restrict the degree of the polynomial, we can write one that would give any values we want on a finite number of words in the grammar.</Paragraph>
      <Paragraph position="11"> The moral is that it is natural not only to restrict meaning functions to, say, polynomials, but to restrict them further, e.g. to polynomials of degree 1. Then by specifying only three values of the meaning function we can (a) have a unique compositional semantics for the first grammar; (b) show that there is no compositional semantics for the second grammar (directly from the proof of the above theorem).</Paragraph>
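      <Paragraph> The effect of restricting meaning functions to polynomials of degree 1 can be checked with a short Python sketch. It assumes, since the grammar listings are not available here, that ND combines a numeral with a following digit (N -&gt; N D) and DN combines a digit with a following numeral (N -&gt; D N). The function names are hypothetical.

```python
def fit_degree_one(v10, v20, v11):
    """Solve p(x, y) = a*x + b*y + c from the values p(1,0), p(2,0), p(1,1)."""
    a = v20 - v10
    c = v10 - a
    b = v11 - a - c
    return a, b, c


def p(coeffs, x, y):
    """Evaluate the degree-1 polynomial with coefficients (a, b, c)."""
    a, b, c = coeffs
    return a * x + b * y + c


# The numerals "10", "20" and "11" impose the same three constraints under
# both grammars, because each splits into two one-digit parts.
coeffs = fit_degree_one(10, 20, 11)

# Assumed grammar ND (N -> N D): "112" splits as N = "11", D = "2".
nd_prediction = p(coeffs, 11, 2)

# Assumed grammar DN (N -> D N): "112" splits as D = "1", N = "12".
dn_prediction = p(coeffs, 1, 12)

print(coeffs, nd_prediction, dn_prediction)
```

      Under these assumptions the three values pin down a single polynomial, which reproduces the intended value of &quot;112&quot; for ND but not for DN.</Paragraph>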
      <Paragraph position="12"> 4. Conclusions. 4.1. Relevance for theories of grammar. 4.1.1. On reduction of syntax to lexical meanings. T. Wasow on pp. 204-205 of Sells (1985) writes: It is interesting that contemporary syntactic theories seem to be converging on the idea that sentence structure is generally predictable from word meanings [...]. [...] The surprising thing (to linguists) has been how little needs to be stipulated beyond lexical meaning. [...] The reader should notice that the meaning function m in our main theorem is arbitrary. In particular we can take m(s) to be the preferred syntactic analysis of the string s. The theorem then confirms the above observation: indeed, the syntax can be reduced to lexical meanings. At the same time, it both trivializes it and calls out for a deeper explanation. It trivializes the reduction to lexical meanings, since it also says that with no restriction on the types of meanings permitted, syntax can be reduced to the meaning of phonemes or letters. The benefits of the reduction to lexical meanings would have to be explained, especially if such meanings refer to abstract properties such as binding features, BAR, or different kinds of subcategorization.</Paragraph>
      <Paragraph position="13"> It is the view of this author (cf. Zadrozny and Manaster-Ramer (1997)) and, implicitly, of Fillmore et al. (1988), that such a reductionist approach is inappropriate. But we have no room to elaborate on it here.</Paragraph>
      <Paragraph position="14"> 4.1.2. On good and bad grammars. By introducing restrictions on semantic functions, i.e. demanding the systematicity of semantics, we can for the first time formalize the intuitions that linguists have had for a long time about &quot;good&quot; and &quot;bad&quot; grammars (cf. Manaster-Ramer and Zadrozny (1992)). This allows us to begin dealing in a rigorous way with the problem (posed by Marsh and Partee) of constraining the power of the semantic as well as the syntactic components of a grammar.</Paragraph>
      <Paragraph position="15"> We can show for instance (ibid.) that some restrictions have the effect of making it in principle impossible to assign correct meanings to arbitrary sets of matched singulars and plurals if the underlying grammar does not have a unitary rule of reduplication. Thus, grammars such as wrapping grammars (Bach (1979)), TAGs (e.g., Joshi (1987)), head grammars, LFG (e.g., Sells (1985)) and queue grammars (Manaster-Ramer (1986)), all of which can generate such a language, all fail to have systematic semantics for it. On the other hand, we can exhibit other grammars (including old-fashioned transformational grammars) which do have systematic semantics for such a language.</Paragraph>
      <Paragraph position="16"> 4.2. What kind of semantics for NL? In view of the above results we should perhaps discuss some of the options we have in semantics of NL, especially in the context of NLU by computers. To focus our attention, let's consider the options we have to deal with the semantics of depart as described in Section 1.</Paragraph>
      <Paragraph position="17"> * Do nothing. That is, assume that the semantics is given by sets of procedures associated with particular patterns; e.g. &quot;X departs from Y&quot; gets translated into &quot;depart(X,Y)&quot;.</Paragraph>
      <Paragraph position="18"> * Give it semantics a la Montague, for instance, along the lines of Dowty (1979) (see esp. Chapter 7). Such a semantics is quite complicated, not very readable, and it is not clear what would be accomplished by doing this. However, note that this doesn't mean that it would not be computational: Hobbs and Rosenschein (1977) show how to translate Montagovian semantics into Lisp functions. * Restrict the meaning of compositionality, requiring for example that the meaning of a verb is a relation with the number of arguments equal to the number of arguments of the verb. If the PP following the verb is treated as one argument, there is no compositional semantics that would agree with the intended meanings of the example sentences. This would formally justify the arguments of Hirst.</Paragraph>
      <Paragraph position="19"> * Recognize that depending on the PPs the meaning of &quot;X departs PP&quot; varies, and describe this dependence via a set of meaning postulates (Bernth and Lappin (1991) show how to do it in a computational context). In such a case the semantics is not given directly as a homomorphism from syntax into some algebra of meanings, but indirectly, by restricting the class of such algebras by the meaning postulates.</Paragraph>
      <Paragraph position="20"> * Admit that the separation of syntax and semantics does not work in practice, and work with representations in which form and meaning are not separated, that is, there cannot be a separate syntax, except in restricted domains or for small fragments of language.</Paragraph>
      <Paragraph position="21"> This view of language has been advocated by Fillmore et al. (1988), and shown in Zadrozny and Manaster-Ramer (1997), Zadrozny and Manaster-Ramer (1997) to be computationally feasible.</Paragraph>
      <Paragraph position="22"> 5.1. [...] meaning functions. Let S be any collection of expressions (intuitively, sentences and their parts). Let M be a set s.t. for any s which is a member of S, there is m = m(s) which is a member of M s.t. m is the meaning of s. We want to show that there is a compositional semantics for S which agrees with the function associating m with m(s), which will be denoted by m(x). To get the homomorphism from syntax to semantics we have to perform a type raising operation that would map elements of S into functions and then the functions into the required meanings.</Paragraph>
      <Paragraph position="23"> As we have described it in Section 2, we extend S by adding to it the &quot;end of expression&quot; character $, which may appear only as the last element of any expression. Under these assumptions we prove: THEOREM. There is a function μ s.t., for all s, μ(s.t) = μ(s)(μ(t)), and μ(s.$) = m(s).</Paragraph>
      <Paragraph position="25"> Proof. Let t(0), t(1), ..., t(α) enumerate S. We can create a big table specifying meaning values for all strings and their combinations. Then the conditions above can be written as a set of equations shown in the figure below.</Paragraph>
      <Paragraph position="27"> Continuing the proof: By the solution lemma (Aczel (1987) and Barwise and Etchemendy (1987)) this set of equations has a solution (unique), which is a function.</Paragraph>
      <Paragraph position="28"> To finish the proof we have to make sure that the equation μ($) = $ holds. Formally, this requires adding the pair &lt;$, $&gt; into the graph of μ that was obtained from the solution lemma. □</Paragraph>
      <Paragraph position="29"> We have directly specified the function as a set of pairs with appropriate values. Note that there is place for recursion in syntactic categories. Also, if a certain string does not belong to the language, we assume that the corresponding value in this table is undefined; thus μ is not necessarily defined for all possible concatenations of strings of S.</Paragraph>
      <Paragraph position="30"> Note: The above theorem has been proved in set theory with the anti-foundation axiom, ZFA.</Paragraph>
      <Paragraph position="31"> This set theory is equiconsistent with the standard system of ZFC, thus the theorem does not assume anything more than what is needed for &quot;standard mathematical practice&quot;. Furthermore, ZFA is better suited as foundations for semantics of NL than ZFC (Barwise and Etchemendy (1987)).</Paragraph>
      <Paragraph position="32"> 5.2. A grammar without compositional semantics. For the grammar DN, we can prove that no such polynomial exists, that is: Theorem. There is no polynomial p in the variables μ(D), μ(N) such that μ(D N) = p(μ(D), μ(N)) and such that the value of μ(D N) is the number expressed by the string D N in base 10.</Paragraph>
      <Paragraph position="33"> Proof. We are looking for μ(D N) = p(μ(D), μ(N)),</Paragraph>
      <Paragraph position="35"> where the function p must be a polynomial in these two variables. But such a polynomial does not exist, since it would have to be equal to μ(D) × 10 + μ(N) for μ(N) in the interval 0..9, to μ(D) × 100 + μ(N) for μ(N) in 10..99, to μ(D) × 1000 + μ(N) for μ(N) in 100..999, and so on. And if the degree of this polynomial were less than 10^n, for some n, it would have to be identically equal to μ(D) × 10^(n+1) + μ(N), since it would share with it all the values in 10^n..10^(n+1) - 1, and therefore could not give correct values on the other intervals.</Paragraph>
      <Paragraph position="36"> Acknowledgements. Alexis Manaster Ramer brought to my attention the need to clarify the meaning of compositionality, and commented on various aspects of this work. I also benefited greatly from a discussion with John Nerbonne and exchanges of e-mail with Fernando Pereira and Nelson Correa. (Needless to say, all the remaining faults of the paper are mine.)</Paragraph>
      <Paragraph position="37"> Parts of this paper were written at the University of Kaiserslautern; I'd like to thank Alexander von Humboldt Stiftung for supporting my visit there.</Paragraph>
    </Section>
  </Section>
</Paper>