<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-1060">
  <Title>An Algorithm for Functional Uncertainty</Title>
  <Section position="5" start_page="2981" end_page="2981" type="metho">
    <SectionTitle>
5. Performance Considerations
</SectionTitle>
    <Paragraph position="0"> We have outlined a general, abstract procedure for solving uncertainty descriptions, making the smallest number of assumptions about the details of its operation. The efficiency of any implementation will depend in large measure on just how details of data structure and explicit computational control are fixed.</Paragraph>
    <Paragraph position="1"> There are a number of obvious optimizations that can be made.</Paragraph>
    <Paragraph position="2"> First, although not required by the abstract procedure, performance will clearly be better if deterministic, minimal-state finite-state machines are used to represent the uncertainties. This reduces the size of the state cross-products, which is the leading term in the number of disjunctions that must be processed. Second, the cases in the Free operator are not mutually distinct: if identical strings belong to the two uncertainty languages, those would fall into both cases (a) and (b) and hence be processed twice with exactly equivalent results.</Paragraph>
    <Paragraph position="3"> The solution to this redundancy is to restrict one of the cases (say (a)) so that it only handles proper prefixes, consigning the identical strings to the other case. Third, when pairs of symbols are enumerated in the (c) case, there is obviously no point in even considering symbols that are in the alphabet but are not First symbols of the suffix uncertainties. This optimization is applied automatically if only the transitions leaving the start states are enumerated and the finite-state machines are represented with partial transition functions pruned of transitions to failure states.</Paragraph>
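The third optimization can be illustrated with a minimal sketch. The dictionary encoding of the machines, the attribute names, and the helper names first_symbols and candidate_pairs are assumptions made for this illustration, not part of the procedure described above. The machines are assumed to be already pruned, so a missing dictionary entry stands for a transition to a failure state.

```python
from itertools import product

# Hypothetical encoding: state -> {symbol: next_state}.
# Transitions to failure states are simply absent (partial function).
ALPHA = {"q0": {"COMP": "q0", "SUBJ": "q1"}, "q1": {}}  # roughly COMP* SUBJ
BETA = {"r0": {"OBJ": "r1"}, "r1": {}}                  # just OBJ
ALPHABET = ["SUBJ", "OBJ", "OBJ2", "COMP", "XCOMP", "ADJ"]


def first_symbols(machine, state):
    """Symbols on transitions leaving `state` in a pruned machine."""
    return sorted(machine.get(state, {}))


def candidate_pairs(machine_a, state_a, machine_b, state_b):
    """Symbol pairs worth considering for one pair of states."""
    return list(product(first_symbols(machine_a, state_a),
                        first_symbols(machine_b, state_b)))


pairs = candidate_pairs(ALPHA, "q0", BETA, "r0")
naive_count = len(ALPHABET) ** 2  # pairs drawn from the whole alphabet
```

Here only two pairs are produced, against 36 when every pair of alphabet symbols is tried. Any further filtering of the pairs that case (c) may call for is left out of the sketch.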
    <Paragraph position="4"> Fourth, a derivative uncertainty produced by the Free operator will sometimes be empty. Since equations with empty uncertainties are unsatisfiable by definition, this case should be detected and that disjunctive branch immediately discarded. Fifth, the same derivative suffix and prefix languages of a particular state may appear in pursuing different branches of the disjunction or processing different combinations of equations. Some computational advantage may be gained by saving the derivative finite-state machines in a cache associated with the states they are based on. Finally, successive iterations of the Free procedure may lead to transparent inconsistencies (an assertion of equality between two distinct symbols, or equating a symbol to a variable that is also used as a function). It is important to detect these inconsistencies when they first appear and again discard the corresponding disjunctive branch. In fact, if this is done systematically, iterated application of the Free operator by itself simulates the effect of traditional unification algorithms, with variables corresponding to f-structures or nodes of a directed graph.</Paragraph>
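The fourth and fifth points (discarding empty derivatives and caching derivative machines per state) might be sketched as follows. The machine, the state names, and the functions suffix_machine and live_branches are hypothetical. As a simplification, the cached "derivative machine" is represented only by its start state, its set of reachable states, and an emptiness flag.

```python
from functools import lru_cache

# Hypothetical machine: state -> {symbol: next_state}; q1 is the only final state.
TRANSITIONS = {
    "q0": {"COMP": "q0", "SUBJ": "q1", "ADJ": "q2"},
    "q1": {},
    "q2": {"ADJ": "q2"},  # q2 can never reach a final state
}
FINALS = frozenset({"q1"})


@lru_cache(maxsize=None)  # the cache is keyed on the state, as suggested above
def suffix_machine(state):
    """Return (start, reachable_states, is_empty) for the suffix language of `state`."""
    reachable, stack = set(), [state]
    while stack:
        current = stack.pop()
        if current in reachable:
            continue
        reachable.add(current)
        stack.extend(TRANSITIONS.get(current, {}).values())
    return state, frozenset(reachable), reachable.isdisjoint(FINALS)


def live_branches(state):
    """Keep only transitions whose derivative uncertainty is non-empty."""
    return {symbol: target
            for symbol, target in TRANSITIONS[state].items()
            if not suffix_machine(target)[2]}


branches = live_branches("q0")
```

In this example the ADJ branch is dropped because its suffix language is empty, and a second call to live_branches("q0") is answered entirely from the cache.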
    <Paragraph position="5"> There are also some less obvious but also quite important performance considerations. What we have described is an equational rewriting system that is quite different from the usual recursive unification algorithm that operates on directed graph representations.</Paragraph>
    <Paragraph position="6"> Directed graph data structures index the information in the equations so that related structures are quickly accessible through the recursive control structure. Since our procedure does not depend for its correctness on the order in which interacting equations are chosen for processing, it ought to be easy to embed Free as a simple extension of a traditional algorithm. However, traditional unification algorithms do not deal with disjunction gracefully. In particular, they typically do not expect new disjunctive branches to arise during the course of a recursive invocation; this would require inserting a fork in the recursive control structure or saving a complete copy of the current computational context for each new disjunction. We avoid this awkwardness by postponing the processing of the functional uncertainty until all simple unifications are complete. Before performing a simple unification step, we remove from the data structures all uncertainties that need to be resolved and store them with a pointer to their containing structures on a queue or agenda of pending unifications. Uncertainty processing can be resumed at a later, more convenient time, after the simpler unifications have been completed. (Indeed, if one of the simpler unifications fails, the uncertainty may never be processed at all.) Waiting until simpler unifications are done means that no computational state has to be preserved; only data structures have to be copied to ensure the independence of the various disjunctive paths.</Paragraph>
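The agenda mechanism can be pictured with a deliberately small sketch. It flattens f-structures to a single dictionary and treats a "simple unification" as an attribute-value check, both of which are assumptions made only to keep the example short. The function solve, the equation triples, and the resolve callback are likewise hypothetical.

```python
from collections import deque


def solve(equations, resolve):
    """Unify simple equations first; run postponed uncertainties afterwards."""
    fstructure, agenda = {}, deque()
    for kind, attribute, value in equations:
        if kind == "uncertain":
            # Store the uncertainty with a pointer to its containing structure.
            agenda.append((fstructure, attribute, value))
        elif fstructure.setdefault(attribute, value) != value:
            return None  # a simple unification failed: the agenda is never run
    while agenda:
        resolve(*agenda.popleft())
    return fstructure


resolved = []


def record(structure, path, value):
    """Stand-in for uncertainty resolution: note what was visible when it ran."""
    resolved.append((path, sorted(structure)))


ok = solve([("uncertain", "COMP* OBJ", "v"),
            ("simple", "TENSE", "past"),
            ("simple", "PRED", "see")], record)
failed = solve([("uncertain", "COMP* OBJ", "v"),
                ("simple", "TENSE", "past"),
                ("simple", "TENSE", "present")], record)
```

In the first call the uncertainty is resolved only after TENSE and PRED are in place; in the second the TENSE clash ends the computation before any uncertainty work is done.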
    <Paragraph position="7"> We also note that as long as the machinery for postponing functional uncertainty for some amount of time is needed, it is often advantageous to postpone it even longer than is absolutely necessary. In particular, we found that if uncertainties are postponed until predicates (semantic form values for PRED attributes) are assigned to the f-structures they belong to, the number of cases that must be explored is dramatically reduced. This is because of the coherence condition that LFG imposes on f-structures with predicates: an f-structure with a predicate can only contain those governable functions that are explicitly mentioned by the predicate. Any other governable functions are considered unacceptable. Thus, if we wait until the predicate is identified, we need only consider the small number of governable attributes that any particular predicate allows, even though the initial attributes in an uncertainty may include the entire set of governable functions (SUBJ, OBJ, and various kinds of obliques and complements), and this may be quite large. The effect is to make the processing of long-distance dependencies sensitive to the subcategorization frame of the predicate; we have observed enormous overall performance improvements from applying this delay strategy. Note that in a left-to-right parsing model, the processing load therefore increases in relative clauses just after the predicate is seen, and this might have a variety of interesting psycholinguistic implications.</Paragraph>
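A rough sketch of the coherence-based pruning follows. The inventory of governable functions, the two-entry lexicon, and the function admissible_first are invented for the illustration. The sketch also assumes that non-governable attributes such as ADJ are unaffected by coherence, which the text above does not spell out.

```python
# Hypothetical inventory of governable functions and a toy lexicon of frames.
GOVERNABLE = {"SUBJ", "OBJ", "OBJ2", "OBL", "COMP", "XCOMP"}
FRAMES = {"see": {"SUBJ", "OBJ"}, "say": {"SUBJ", "COMP"}}


def admissible_first(first_set, pred=None):
    """Initial attributes of an uncertainty still worth exploring.

    Without a predicate nothing can be ruled out; with one, governable
    functions that the predicate does not mention would violate coherence.
    """
    if pred is None:
        return set(first_set)
    allowed = FRAMES[pred]
    return {attribute for attribute in first_set
            if attribute not in GOVERNABLE or attribute in allowed}


uncertainty_first = GOVERNABLE | {"ADJ"}
before_pred = admissible_first(uncertainty_first)
after_pred = admissible_first(uncertainty_first, "see")
```

With these toy data, seven initial attributes must be explored before the predicate is known and only three afterwards.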
    <Paragraph position="8"> Finally, we observe that there is a specialization of the Free operator that applies when an uncertainty interacts with several non-uncertainty equations (equations whose attribute expressions have singleton First sets). Instead of separating one interaction from the uncertainty with each application of Free, the uncertainty is divided in a single step into a minimum number of disjunctive possibilities, each of which interacts with just one of the other equations. The disjunction contains one branch for each symbol in the uncertainty's First set that is an initial attribute in one of the other equations, plus a single branch for all of the residual initial symbols: (12) $(f\ \alpha) = v$ iff $(f\ s_1\ \mathrm{Suffix}(\alpha, \delta(q_\alpha, s_1))) = v \lor \dots \lor (f\ s_n\ \mathrm{Suffix}(\alpha, \delta(q_\alpha, s_n))) = v \lor (f\ \alpha - \{s_1, \dots, s_n\}\Sigma^{*}) = v$. The statement of the generic Free algorithm (10) is simplified by considering specific attributes as trivial regular languages, but this suggests that complex finite-state machinery would be required to process them. This alternative works in the opposite direction: it reduces leading terms in an uncertainty to simple attributes before pursuing their interactions, so that efficient attribute matching routines of a normal unification procedure can be applied. This alternative has a second computational advantage. The generic algorithm unwinds the uncertainty one attribute at a time, constructing a residual regular set at each step, which is then processed against the other non-uncertain equations. The alternative processes them all at once, avoiding the construction of these intermediate residual languages. This is a very important optimization, since we found it to be the most common case when we embedded uncertainty resolution in our recursive unification algorithm.</Paragraph>
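The single-step division in (12) might be sketched as below. The function split_uncertainty and its inputs are hypothetical: start_transitions stands for the transitions leaving the start state of the uncertainty's machine, and other_attributes for the initial attributes of the non-uncertain equations on the same f-structure.

```python
def split_uncertainty(start_transitions, other_attributes):
    """Divide an uncertainty once, in the spirit of (12).

    Returns one (symbol, next_state) branch per First symbol that also begins
    another equation, plus the residual transitions, which together stand for
    the single branch that cannot interact with those equations.
    """
    branches = [(symbol, target)
                for symbol, target in start_transitions.items()
                if symbol in other_attributes]
    residual = {symbol: target
                for symbol, target in start_transitions.items()
                if symbol not in other_attributes}
    return branches, residual


start = {"COMP": "q0", "SUBJ": "q1", "OBJ": "q1", "ADJ": "q2"}
branches, residual = split_uncertainty(start, {"SUBJ", "TENSE"})
```

Each entry of branches names a simple attribute and the state whose suffix language continues the path, so an ordinary attribute-matching routine can take over from there. The residual transitions are kept together rather than expanded one symbol at a time.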
    <Paragraph position="9"> Uncertainty specifications are a compact way of expressing a large number of disjunctive possibilities that are uncovered one by one as our procedure operates. It might seem that this is an extremely expensive descriptive device, one which should be avoided in favor of apparently simpler mechanisms. But the disjunctions that emerge from processing uncertainties are real: they represent independent grammatical possibilities that would require additional computational resources no matter how they were expressed. In theories in which long-distance dependencies are based on empty phrase-structure nodes and implemented, for example, by gap-threading machinery, ATN HOLD lists, and the like, the exact location of these empty nodes is not signaled by any information directly visible in the sentence. This increases the number of phrase-structure rules that can be applied.</Paragraph>
    <Paragraph position="10"> What we see as the computational cost of functional uncertainty shows up in these systems as additional resources needed for phrase-structure analysis and for functional evaluation of the larger number of trees that the phrase-structure component produces.</Paragraph>
    <Paragraph position="11"> Unlike phrasally-based specifications, functional uncertainties in LFG are defined on the same level of representation as the subcategorization restrictions that constrain how they can be resolved, which our coherence-delay strategy easily takes advantage of. But the fact remains that functional uncertainties do generate disjunctions, and thus strongly highlight the already perceived need for efficient disjunction-processing techniques if acceptable performance is to be achieved with LFG and related grammatical formalisms. Recent disjunction proposals by /Kasper 1987/ and /Eisele and Dörre 1988/ are important steps in the development of the necessary computational technology.</Paragraph>
  </Section>
</Paper>