File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/94/c94-1076_intro.xml
Size: 8,545 bytes
Last Modified: 2025-10-06 14:05:34
<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1076"> <Title>Minimal Change and Bounded Incremental Parsing Mats Wirén</Title> <Section position="2" start_page="0" end_page="461" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="461" type="sub_section"> <SectionTitle> 1.1 Background </SectionTitle> <Paragraph position="0"> Natural-language computing has traditionally been understood as a &quot;batch-mode&quot; or &quot;once-only&quot; process, in which a problem instance P (say, a text) is mapped as a whole to a solution S (such as an analysis of the text). However, in highly interactive and real-time applications -- for example, grammar checking, structure editing and on-line translation -- what is required is efficient processing of a sequence of small changes to a text. Exhaustive recomputation is then not a feasible alternative. Rather, to avoid as much recomputation as possible, each update cycle must re-use those parts of the previous solution that are still valid. We say that an algorithm is incremental if it uses information from an old solution in computing the new solution.</Paragraph> <Paragraph position="1"> The problem of incremental processing can be stated as follows, using a notation similar to that of Alpern et al. [1]: Assume that we are given a problem instance P (a representation of the current input), a solution S (the current output), and a modification $\Delta_P$ to P.² The modification results in a new problem instance $P' = P \oplus \Delta_P$, where $\oplus$ is a composition operator. The task of an incremental algorithm is then to produce an update $\Delta_S$ in the old solution such that $S \oplus \Delta_S$ is a solution to $P \oplus \Delta_P$ (see figure 1). At this point, nothing is stipulated about the amount of information in S that should be re-used in S'.</Paragraph> <Paragraph position="2"> ¹ I would like to thank Ralph Rönnquist as well as Gregor Erbach and other colleagues in Saarbrücken for discussions on the material presented here, Peter Fritzson for originally drawing my attention to Ramalingam and Reps' paper, and the anonymous referees. This research has been funded by the German Science Foundation (DFG) through the Sonderforschungsbereich 314, project N3 (BiLD). ² A terminological note: we use &quot;input change&quot; and &quot;modification&quot; as well as &quot;output change&quot; and &quot;update&quot; synonymously.</Paragraph>
<Paragraph position="3"> To show properties such as correctness and complexity of incremental algorithms, it is necessary to establish a formal measure of &quot;the set of things changed&quot;. This measure should capture the minimal change resulting from a modification and, moreover, should be independent of any particular algorithm for incremental update. One way of achieving this is to compare the results obtained by batch-mode processing of the inputs before and after the change, respectively (Wirén and Rönnquist [15, 17]): By forming the &quot;difference&quot; between the batch-mode solutions S and S' obtained before and after a modification $\Delta_P$ to P, we obtain a parameter $\Delta_{S_{\min}}$ which captures the minimal change in a way which is indeed independent of the incremental update. 
Given that $\Delta_{S_{\min}}$ corresponds precisely to what any sound and complete incremental algorithm must do, it can be used as a basis for correctness proofs for such algorithms (given that the batch-mode algorithm is correct).</Paragraph> <Paragraph position="4"> Furthermore, $\Delta_{S_{\min}}$ can be used as a basis for complexity analyses: Ideally, each update cycle of an incremental algorithm should expend an amount of work which is a polynomial function of the size of the change, rather than, say, the size of the entire current input.</Paragraph> <Paragraph position="5"> However, making this notion precise in a way which is independent of particular incremental algorithms is not always straightforward. Two early approaches along these lines are Goodwin [3, 4] (reason maintenance) and Reps [11] (language-based editing). More recently, Alpern et al. [1] and Ramalingam and Reps [9, 10] have provided a framework for analysing incremental algorithms, in which the basic measure used is the sum of the sizes of the changes in the input and output. This framework assumes that the modification of the input can be carried out in $O(|\Delta_P|)$ time, where the generic notation $|X|$ is used for the size of X. Furthermore, it assumes that $|\Delta_{S_{\min}}|$ denotes the minimal $|\Delta_S|$ such that $S \oplus \Delta_S$ solves $P \oplus \Delta_P$. Alpern et al. then define $\delta = |\Delta_P| + |\Delta_{S_{\min}}|$</Paragraph> <Paragraph position="7"> as the intrinsic size of a change.</Paragraph> <Paragraph position="8"> The choice of $\delta$ is motivated as follows: $|\Delta_P|$, the size of the modification, is in itself too crude a measure, since a small change in the problem instance may cause a large change in the solution or vice versa. $|\Delta_{S_{\min}}|$ is then chosen as a measure of the size of the change in the solution, since the time for updating the solution can be no less than this. 
The $\delta$ measure thus makes it possible to capture how well a particular algorithm performs relative to the amount of work that must be performed in response to a change.</Paragraph> <Paragraph position="9"> An incremental algorithm is said to be bounded if it can process any change in time $O(f(\delta))$, that is, in time depending only on $\delta$. Intuitively, this means that it only processes the &quot;region&quot; where the input or output changes. Algorithms of this kind can then be classified according to their respective degrees of boundedness (see Ramalingam and Reps [10, section 5]). For example, an algorithm which is linear in $\delta$ is asymptotically optimal. Furthermore, an incremental algorithm is said to be unbounded if the time it takes to update the solution can be arbitrarily large for a given $\delta$.</Paragraph> <Paragraph position="10"> It might seem that what has been discussed so far has little relevance to natural-language processing, where incrementality is typically understood as the piecemeal assembly of an analysis during a single left-to-right³ pass through a text or a spoken utterance. In particular, incrementality is often used as a synonym for interleaved approaches, in which syntax and semantics work in parallel such that each word or phrase is given an interpretation immediately upon being recognized (see, for example, Mellish [7] and Haddock [5]). However, the two views are closely related: The &quot;left-to-right view&quot; is an idealized, psycholinguistically motivated special case, in which the only kind of change allowed is addition of new material at the end of the current input, resulting in piecemeal expansion of the analysis. 
Moreover, the interleaving is just a consequence of the fact that every piece of new input must, in some sense, be fully analysed in order to be integrated with the old analysis.</Paragraph> <Paragraph position="11"> To distinguish this special case from the general case, in which arbitrary changes are allowed, Wirén [15] refers to them as left-to-right (LR) incrementality and full incrementality, respectively. The former case corresponds to on-line analysis -- that is, each prefix of a string is parsed (interpreted) before any of the input beyond that prefix is read (Harrison [6, page 433]).</Paragraph> <Paragraph position="12"> ³ Strictly speaking, front-to-back or beginning-to-end.</Paragraph> <Paragraph position="13"> The latter case has long been studied in interactive language-based programming environments (for example, Ghezzi and Mandrioli [2]), whereas the only previous such work that we are aware of in the context of natural-language processing is Wirén and Rönnquist [14, 15, 16, 17].</Paragraph> </Section> <Section position="2" start_page="461" end_page="461" type="sub_section"> <SectionTitle> 1.2 The Problem </SectionTitle> <Paragraph position="0"> The aim of this paper is to begin to adapt and apply the notion of bounded incremental computation to natural-language parsing, using a method for establishing minimal change previously introduced by Wirén and Rönnquist [15, 17]. To this end, the paper shows how the $\delta$ parameter can be defined in a fully incremental, chart-based parsing framework, briefly describes a previous, unbounded algorithm, and then shows how a polynomially bounded algorithm can be obtained.</Paragraph> </Section> </Section> </Paper>
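The measure of change size discussed in section 1.1 (the sum of the sizes of the changes in the input and in the output) can be illustrated with a small sketch. This is only a toy illustration under stated assumptions, not the definition developed in the paper: a solution is assumed to be a set of chart edges, the edge representation (category, start, end, covered words) is hypothetical, and the size of the minimal output change is assumed to be the number of edges removed plus the number of edges added when the two batch-mode solutions are compared. The function names are likewise made up for this example.

```python
from typing import FrozenSet, Tuple

# Hypothetical edge representation: (category, start, end, covered words).
Edge = Tuple[str, int, int, str]


def minimal_output_change(
    old: FrozenSet[Edge], new: FrozenSet[Edge]
) -> Tuple[FrozenSet[Edge], FrozenSet[Edge]]:
    """Return (removed, added): the difference between two batch-mode solutions."""
    return old - new, new - old


def intrinsic_change_size(
    input_change_size: int, old: FrozenSet[Edge], new: FrozenSet[Edge]
) -> int:
    """Size of the input change plus size of the minimal output change."""
    removed, added = minimal_output_change(old, new)
    return input_change_size + len(removed) + len(added)


# Toy batch-mode solutions for "the dog barks" and "the dog sleeps".
OLD_SOLUTION: FrozenSet[Edge] = frozenset({
    ("Det", 0, 1, "the"),
    ("N", 1, 2, "dog"),
    ("NP", 0, 2, "the dog"),
    ("V", 2, 3, "barks"),
    ("VP", 2, 3, "barks"),
    ("S", 0, 3, "the dog barks"),
})
NEW_SOLUTION: FrozenSet[Edge] = frozenset({
    ("Det", 0, 1, "the"),
    ("N", 1, 2, "dog"),
    ("NP", 0, 2, "the dog"),
    ("V", 2, 3, "sleeps"),
    ("VP", 2, 3, "sleeps"),
    ("S", 0, 3, "the dog sleeps"),
})

# One word is replaced; counting this as an input change of size 1 is an assumption.
delta = intrinsic_change_size(1, OLD_SOLUTION, NEW_SOLUTION)
print(delta)  # 1 + 3 + 3 = 7
```

In this toy case the edges for "the", "dog" and "the dog" are shared by both solutions and so do not contribute to the measure. A bounded incremental algorithm in the sense described above would ideally do work proportional to some function of this number, not of the total size of the chart.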