<?xml version="1.0" standalone="yes"?>
<Paper uid="P84-1061">
  <Title>I SPECIFICATION OF THE INPUT STRUCTURES FOR INFERENCING</Title>
  <Section position="3" start_page="0" end_page="292" type="intro">
    <SectionTitle>
I SPECIFICATION OF THE INPUT STRUCTURES
FOR INFERENCING
</SectionTitle>
    <Paragraph position="0"> A. Outline of the TIBAQ Method. When the TIBAQ (Text-and-Inference based Answering of Questions) project was designed, the main emphasis was laid on the automatic build-up of the stock of knowledge from the (non-pre-edited) input text. The experimental system based on this method automatically converts the natural language input (both the questions and new pieces of information, i.e. Czech sentences in their usual form) into representations of meaning (tectogrammatical representations, TR's); these TR's serve as input structures for the inference procedure that enriches the set of TR's selected by the system itself as possibly relevant for an answer to the input question. In this enriched set, suitable TR's for direct and indirect answers to the given question are retrieved and then transferred by a synthesis procedure into the output (surface) form of sentences (for an outline of the method as such, see Hajičová, 197~; Hajičová and Sgall, 19~i; Sgall, 1982).</Paragraph>
    <Paragraph position="1"> B. What Kind of Structure Inferences Should Be Based on. To decide on what kind of structures the inference procedure should operate, one has to take into account several criteria, some of which seemingly contradict each other: the structures should be as simple and transparent as possible, so that inferencing can be performed in a well-defined way, and at the same time these structures should be as &quot;expressive&quot; as the natural language sentences are, so as not to lose any piece of information captured by the text.</Paragraph>
    <Paragraph position="2"> Natural language has a major drawback in its ambiguity: when a listener is told that the criticism of the Polish delegate was fully justified, s/he does not know (unless indicated by the context or situation) whether s/he should infer that someone criticized the Polish delegate, or whether the Polish delegate criticized someone/something. On the other hand, there are means in natural language that are not preserved by most languages that logicians have used for drawing consequences, but that are critical for the latter to be drawn correctly: when a listener is told that Russian is spoken in SIBERIA, s/he draws conclusions partly different from those when s/he is told that in Siberia, RUSSIAN is spoken (capitals denoting the intonation center); or, to borrow one of the widely discussed examples in linguistic writings, if one hears that John called Mary a REPUBLICAN and that then she insulted HIM, one should infer that the speaker considers &quot;being a Republican&quot; an insult; this is not the case if the speaker said that then she INSULTED him.</Paragraph>
    <Paragraph position="3"> These and similar considerations have led the authors of TIBAQ to a strong conviction that the structures representing knowledge and serving as the base for inferencing in a question-answering system with a natural language interface should be linguistically based: they should be deprived of all ambiguities of natural language and at the same time they should preserve all the information relevant for drawing conclusions that the natural language sentences encompass. The experimental system based on TIBAQ, which was carried out by the group of formal linguistics at Charles University, Prague (implemented on the EC 1040 computer, compatible with the IBM 360), works with representations of meaning (tectogrammatical representations, TR's) worked out in the framework of functional generative description, or FGD (for the linguistic background of this approach we refer to Sgall, 1964; Sgall et al., 1959; Hajičová and Sgall, 19:~O).</Paragraph>
    <Paragraph position="4"> C. Tectogrammatical Representations. One of the basic tenets of FGD is the articulation of the semantic relation, i.e. the relation between sound and meaning, into a hierarchy of levels, connected with the relativization of the relation of 'form' and 'function' as known from the writings of Prague School scholars. This relativization makes it possible to distinguish two levels of sentence structure: the level of surface syntax and that of the underlying or tectogrammatical structure of sentences.</Paragraph>
    <Paragraph position="5"> As for a formal specification of the complex unit of this level, that is the TR, the present version (see Plátek, Sgall and Sgall, in press) works with the notion of basic dependency structure (BDS), which is defined as a structure over the alphabet $A$ (corresponding to the labels of nodes) and the set of symbols $C$ (corresponding to the labels of edges). The set of BDS's is the set of the tectogrammatical representations of sentences containing no coordinated structures. The BDS's are generated by the grammar $G = (V_N, V_T, S, R)$, where $V_T = A \cup C$, $A = \{(a^g, gr)\}$, $a$ is interpreted as a lexical unit, $g$ is a variable standing for $t$ and $f$ (contextually bound and non-bound, respectively) and $gr$ is interpreted as a set of grammatemes belonging to $a$; $C$ is a set of complementations ($c \in C$, where $c$ is an integer denoting a certain type of complementation, called a functor), $C''$ denotes the set $\{\langle, \rangle, {}_c\langle, \rangle_c\}$ for every $c \in C$.</Paragraph>
    <Paragraph position="6"> To represent coordination, the formal apparatus for sentence generation is to be complemented by another alphabet $Q$, where $q \in Q$ is interpreted as types of coordination (conjunctive, disjunctive, adversative, ..., apposition), and by a new kind of brackets denoting the boundary of coordinated structures; $Q'' = \{[, ]_q\}$ for every $q \in Q$. The structures generated by the grammar are then called complex dependency structures (CDS).</Paragraph>
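    <Paragraph> The definitions above can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions, not the TIBAQ implementation: the class names Node, Edge and Coordination and the function is_basic are hypothetical, and functors are written as readable strings although the text describes them as integers.
<![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A node label (a^g, gr): lexical unit, boundness, grammatemes."""
    lexical_unit: str
    bound: bool  # True for t (contextually bound), False for f (non-bound)
    grammatemes: frozenset = frozenset()
    dependents: List["Edge"] = field(default_factory=list)


@dataclass
class Coordination:
    """A coordinated structure of type q (e.g. conjunctive)."""
    kind: str
    members: List[Node] = field(default_factory=list)
    dependents: List["Edge"] = field(default_factory=list)


@dataclass
class Edge:
    """An edge labelled with a functor (type of complementation)."""
    functor: str  # assumption: a string name instead of an integer
    dependent: Union[Node, Coordination]


def is_basic(item) -> bool:
    """True if no coordination occurs in item or below it (a BDS, not a CDS)."""
    if isinstance(item, Coordination):
        return False
    return all(is_basic(edge.dependent) for edge in item.dependents)


# "Charles lived in PRAGUE": a structure without coordination.
charles = Node("Charles", True, frozenset({"sing", "def"}))
prague = Node("Prague", False, frozenset({"sing", "def", "in"}))
live_prague = Node("live", True,
                   dependents=[Edge("Act", charles), Edge("where", prague)])

# "... and Jane in BERLIN": coordinating two such structures.
jane = Node("Jane", True, frozenset({"sing", "def"}))
berlin = Node("Berlin", False, frozenset({"sing", "def", "in"}))
live_berlin = Node("live", True,
                   dependents=[Edge("Act", jane), Edge("where", berlin)])
both = Coordination("conjunctive", [live_prague, live_berlin])
```
]]>
    In this sketch is_basic(live_prague) holds, while is_basic(both) does not, which mirrors the distinction between basic and complex dependency structures.</Paragraph>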
    <Paragraph position="7"> Coming back to the notions of elementary and complex units of the tectogrammatical level, we can say that the complex unit of the TR is the complex dependency structure as briefly characterized above, while the elementary units are the symbols of the shapes $a$, $g$, $c$, $q$, the elements of $Q''$, and the parentheses. The lexical units $a$ are conceived of as elementary rather than complex, since for the time being we do not work with any kind of lexical decomposition. Every lexical unit is assigned the feature 'contextually bound' or 'non-bound'. The set of grammatemes GR covers a wide range of phenomena; they can be classified into two groups.</Paragraph>
    <Paragraph position="8"> Grammatemes representing morphological meaning in the narrow sense are specific for different (semantic) word classes: for nouns, we distinguish grammatemes of number and of delimitation (indefinite, definite, specifying); for adjectives and adverbs, grammatemes of degree; for verbs, we work with grammatemes of aspect (processual, complex, resultative), iterativeness (iterative, non-iterative), tense (simultaneous, anterior, posterior), immediateness (immediate, non-immediate), predicate modality (indicative, possibilitive, necessitive, voluntative), assertive modality (affirmative, negative), and sentential modality (declarative, interrogative, imperative). The other group of grammatemes is not, with some exceptions, word-class specific and, similarly to the set of the types of complementations, is closely connected with the kinds of the dependency relations between the governor and the dependent node; thus the Locative is accompanied by one member of the set {in, on, under, between, ...}.</Paragraph>
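    <Paragraph> The first group of grammatemes lends itself to a simple lookup table. The sketch below is illustrative only: the names GRAMMATEMES and check are hypothetical, and None marks categories (number, degree) whose values the text does not enumerate, so no values are invented for them.
<![CDATA[
```python
# Word-class specific grammatemes as listed in the text.
# None means: the category is named, but its values are not enumerated.
GRAMMATEMES = {
    "noun": {
        "number": None,
        "delimitation": {"indefinite", "definite", "specifying"},
    },
    "adjective": {"degree": None},
    "adverb": {"degree": None},
    "verb": {
        "aspect": {"processual", "complex", "resultative"},
        "iterativeness": {"iterative", "non-iterative"},
        "tense": {"simultaneous", "anterior", "posterior"},
        "immediateness": {"immediate", "non-immediate"},
        "predicate modality": {"indicative", "possibilitive",
                               "necessitive", "voluntative"},
        "assertive modality": {"affirmative", "negative"},
        "sentential modality": {"declarative", "interrogative",
                                "imperative"},
    },
}


def check(word_class, values):
    """Return a list of problems for a {category: value} assignment."""
    allowed = GRAMMATEMES[word_class]
    problems = []
    for category, value in values.items():
        if category not in allowed:
            problems.append(f"{category}: not a {word_class} grammateme")
        elif allowed[category] is not None and value not in allowed[category]:
            problems.append(f"{category}: unknown value {value!r}")
    return problems
```
]]>
    For instance, check("verb", {"tense": "anterior", "aspect": "complex"}) reports no problems, whereas assigning a tense to a noun is flagged.</Paragraph>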
    <Paragraph position="9"> The dependency relations are very rich and varied, and it is no wonder that there were many efforts to classify them.</Paragraph>
    <Paragraph position="10"> In FGD, a clear boundary is being made between participants (deep cases) and (free) modifications: participants are those complementations that can occur with the same verb token only once and that have to be specified for each verb (and similarly for each noun, adjective, etc.), while free modifications are those complementations that may appear more than once with the same verb token and that can be listed for all the verbs once for all; for a more detailed discussion and the use of operational criteria for this classification, see Panevová, 1974; 1980; Hajičová and Panevová, in press; Hajičová, 1977; 1983.</Paragraph>
    <Paragraph position="11"> Both participants and modifications can be (semantically) optional or obligatory; both optional and obligatory participants are to be stated in the case frames of verbs, while modifications belong there only with such verbs with which they are obligatory.</Paragraph>
    <Paragraph position="12"> In the present version of FGD, the following five participants are distinguished: actor/bearer, patient (objective), addressee, origin, and effect. The list of modifications is by far richer and more differentiated; a good starting point for this differentiation can be found in Czech grammars (esp. Šmilauer, 1947). Thus one can arrive at the following groupings: (a) local: where, direction, which way; (b) temporal: when, since when, till when, how long, for how long, during; (c) causal: cause, condition real and unreal, aim, concession, consequence; (d) manner: manner, regard, extent, norm (criterion), substitution, accompaniment, means (instrument), difference, benefit, comparison.</Paragraph>
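    <Paragraph> The distinction between participants and free modifications can be stated as a check on the complementations of one verb token. This is a sketch under assumptions, not the authors' procedure: CaseFrame and check_complementations are hypothetical names, the frame for &quot;give&quot; is invented for illustration, and treating an obligatory complementation as one that must be present in the representation is a simplification.
<![CDATA[
```python
from collections import Counter
from dataclasses import dataclass

# The five participants named in the text (actor stands for actor/bearer).
PARTICIPANTS = {"actor", "patient", "addressee", "origin", "effect"}


@dataclass
class CaseFrame:
    verb: str
    obligatory: frozenset  # obligatory participants and modifications
    optional_participants: frozenset = frozenset()


def check_complementations(frame, functors):
    """Check the functors attached to one verb token against its frame."""
    problems = []
    counts = Counter(functors)
    for functor, n in counts.items():
        if functor not in PARTICIPANTS:
            continue  # free modifications may repeat and need no frame entry
        if n > 1:
            problems.append(f"{functor}: participant occurs {n} times")
        listed = frame.obligatory | frame.optional_participants
        if functor not in listed:
            problems.append(
                f"{functor}: participant not in the frame of {frame.verb!r}")
    for functor in sorted(frame.obligatory):
        if counts[functor] == 0:
            problems.append(f"{functor}: obligatory complementation missing")
    return problems


# Hypothetical frame, for illustration only.
give = CaseFrame("give", frozenset({"actor", "patient"}),
                 frozenset({"addressee"}))
```
]]>
    With this frame, two temporal modifications on the same token are accepted, while a second actor, a missing patient, or an unlisted participant such as origin is reported.</Paragraph>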
    <Paragraph position="13"> In our discussion on types of complementations we have up to now concentrated on complementations of verbs; within the FGD framework, however, all word classes have their frames. Specific to nouns (cf. Piťha, 1980), there is the partitive participant (a glass of water) and the free modifications of appurtenance (a leg of the table), of general relationship (nice weather), of identity (the city of Prague) and of a descriptive attribute (golden Prague).</Paragraph>
    <Paragraph position="14"> To illustrate the structure of the representation on the tectogrammatical level of FGD, we present in Fig. 1 a complex dependency structure of one of the readings of the sentence &quot;Before the war began, Charles lived in PRAGUE and Jane in BERLIN&quot; (which it has in common with &quot;Before the beginning of the war, Charles lived in PRAGUE and Jane lived in BERLIN&quot;); to make the graph easier to survey, we omit there the values of the grammatemes.</Paragraph>
    <Paragraph position="15"> [Figure: dependency graph; the surviving node labels are AND, live^t (twice), Charles^t, Prague^f, Jane^t, Berlin^f.] The linearized form: &lt;&lt;(war^t, {sing, def})&gt;_Act (begin^t, {anter, compl, noniter, nonimmed, indic, affirm, before})&gt;_when [&lt;(Charles^t, {sing, def})&gt;_Act (live^t, {anter, compl, noniter, nonimmed, declar, indic, affirm}) _where&lt;(Prague^f, {sing, def, in})&gt; &lt;(Jane^t, {sing, def})&gt;_Act (live^t, {anter, compl, noniter, nonimmed, declar, indic, affirm}) _where&lt;(Berlin^f, {sing, def, in})&gt;]_AND Fig. 1</Paragraph>
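    <Paragraph> A linearized form of the kind shown in Fig. 1 could be produced by a short recursive function. The sketch below rests on assumptions read off the example rather than on a stated rule: contextually bound (t) dependents are written to the left of their governor with the functor on the closing bracket, non-bound (f) dependents to the right with the functor on the opening bracket, and ^ and _ stand for superscripts and subscripts. The names Node, label and linearize are hypothetical, and coordination is left out.
<![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    lexical_unit: str
    bound: bool  # True for t, False for f
    grammatemes: Tuple[str, ...] = ()
    # (functor, dependent node) pairs
    dependents: List[Tuple[str, "Node"]] = field(default_factory=list)


def label(node):
    """Render the node label (a^g, {gr})."""
    mark = "t" if node.bound else "f"
    gr = ", ".join(node.grammatemes)
    return f"({node.lexical_unit}^{mark}, {{{gr}}})"


def linearize(node):
    """Bound dependents to the left, non-bound ones to the right (assumed)."""
    left = [f"<{linearize(d)}>_{c}" for c, d in node.dependents if d.bound]
    right = [f"_{c}<{linearize(d)}>" for c, d in node.dependents
             if not d.bound]
    return " ".join(left + [label(node)] + right)


charles = Node("Charles", True, ("sing", "def"))
prague = Node("Prague", False, ("sing", "def", "in"))
live = Node("live", True, ("compl", "noniter"),
            [("Act", charles), ("where", prague)])
```
]]>
    Calling linearize(live) yields the Charles and Prague conjunct of the example with a shortened grammateme set on the verb.</Paragraph>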
  </Section>
</Paper>