File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/evalu/94/c94-2145_evalu.xml
Size: 9,784 bytes
Last Modified: 2025-10-06 14:00:13
<?xml version="1.0" standalone="yes"?> <Paper uid="C94-2145"> <Title>ON THE PORTABILITY OF COMPLEX CONSTRAINT-BASED GRAMMARS</Title> <Section position="5" start_page="902" end_page="903" type="evalu"> <SectionTitle> 4 Performance </SectionTitle> <Paragraph position="0"> The second class of issues which affect the porting of a grammar from one formalism to another is connected with the relative performance of the two instantiations. We consider two aspects of this topic: the provision of explicit modules for processing in a particular domain, such as syntactic or morphological analysers, and the complex and thorny issue of control information, or who gets control of control. First, though, it is worth emphasising why we consider performance to be a significant issue at all. We are not -- yet, anyway -- particularly concerned with the real-time performance of &quot;end-user&quot; applications. We view all of the systems that implement these formalisms as development environments, even if they were originally developed as &quot;academic&quot; prototypes, in several cases with a view to demonstrating a particular theoretical perspective. Accordingly, we feel that it is more appropriate to evaluate their performance with respect to the development loop associated with grammar writing.</Paragraph> <Paragraph position="1"> More concretely, if either the analysis or compilation times exceed certain acceptable bounds (determined by pragmatic, external considerations like the attention span of a grammar developer or lexicographer), then the grammar under development should be regarded as being, in a purely practical sense, no longer extensible. 
These may be rather harsh criteria, but we believe they reflect a more realistic sense of what these systems are good for.</Paragraph> <Section position="1" start_page="902" end_page="903" type="sub_section"> <SectionTitle> 4.1 Dedicated Modules </SectionTitle> <Paragraph position="0"> A further explicit distinction arises between those formalisms which include explicit modules for treating either phrasal or morphological structure (UD, ALE), and those which only provide a theorem prover over linguistic constraints (TFS, CUF). In general, we expect that, other things being equal, a formalism whose implementation contains dedicated processors for phrase structure parsing and/or string processing will have better run-time performance than one which does not, and this is indeed borne out empirically in the behaviour of the systems we considered.</Paragraph> <Paragraph position="1"> The presence or absence of an explicit parser also has obvious consequences for porting experiments. If there is a parser in the target system and not in the source system then some phrase structure component must be supplied. This may just be a vacuous structure or it may be derived from existing components of the source description. Hence we have produced three instantiations of the UD translation of the TFS-HPSG grammar: one involving a vacuous phrase structure description, one in which grammar rules are derived from the phrase structure definitions of the TFS encoding and one in which full strings are associated with a lexicon of garbage tokens to avoid invoking either of UD's dedicated modules for morphology and syntax.</Paragraph> <Paragraph position="2"> Portability in the other direction poses considerably greater problems, since not only must the phrase structure description be encoded, but some parsing strategy must also be defined. 
In translating the UD grammar into CUF we encoded a head corner parser (cf. e.g. \[van Noord, 1994\]) directly in the CUF formalism.</Paragraph> <Paragraph position="3"> In order to obtain adequate results with this strategy it was necessary to make use of all the facilities offered for determining both global and local process control.</Paragraph> <Paragraph position="4"> This casts a certain amount of doubt on the possibility of replicating the CUF results within TFS, where explicit local control statements are not permitted. We address the more general problems with the incorporation of control information in the next section.</Paragraph> <Paragraph position="5"> While the question of translating more or less explicit phrase structure information is already a difficult one, the issue of porting morphological information is quite chaotic. There is even less agreement on the information structure of morphological regularities than there is on syntactic patterning, and this is reflected in the fact that two of the systems we have been working with do not offer any apparatus at all for dealing with sub-word-level phenomena. Moreover, the two formalisms in our sample which do admit explicit morphological descriptions differ so greatly in the form that these components take that they are not directly comparable even with each other (see footnote 7).</Paragraph> </Section> <Section position="2" start_page="903" end_page="903" type="sub_section"> <SectionTitle> 4.2 Control Information </SectionTitle> <Paragraph position="0"> The final issue that we turn to is one which is in effect most revealing about how system developers view their users. In terms of our sample formalisms, we once again can distinguish a two-way split, which actually cuts across all of the groupings that we have observed above. 
The crude characterisation of this distinction is that some formalisms permit the grammar writer to influence the local processing strategy, either in the good, old-fashioned Prolog manner of ordering clauses, as in ALE, or by providing additional control information, such as delay statements in CUF. The other two systems eschew this kind of local tweaking of the processing strategy and rely on a global specification of processing behaviour. Of course, this apparent dichotomy is to some extent illusory. Those systems which retain global control usually permit the user to modify certain parameters of this behaviour, and those that permit local control information must also assume a global control strategy which may be less forgiving than that in an apparently more totalitarian system. We have two observations in respect of the control strategies adopted by these systems.</Paragraph> <Paragraph position="1"> The first of these is that some form of lazy evaluation, such as that assumed as a global strategy in both UD and TFS, can become a requirement of a target system when the source system permits lazy evaluation.</Paragraph> <Paragraph position="2"> More explicitly, a description may rely on a particular evaluation strategy that cannot be emulated in the target system. This situation actually occurred in the porting of the UD French grammar to ALE. The lack of a lazy evaluation strategy in ALE required a change in the analysis of verbal structure (see footnote 8), so the ALE description is actually different from the original UD one. In a very real sense the port failed, in that, even though in terms of the declarative formalism a compatible description was definable, it turned out that this was not runnable. 
The class of portable descriptions between ALE and any of the other formalisms is therefore further constrained by ALE's underlying evaluation strategy.</Paragraph> <Paragraph position="3"> The second point we would like to make harks back, in many ways, to the warnings inherent in Kaplan's &quot;procedural seduction&quot;. Kaplan \[Kaplan, 1987\] reports experiences with the use of ATN parsers which ended with both grammar writers and system developers attempting to improve the performance of the same parser and effectively getting in each other's way.</Paragraph> <Paragraph position="4"> Footnote 7: In the case of ALE it would probably be incorrect to speak of a morphological analyser since lexical forms are expanded at compile time. Footnote 8: At the corresponding point in the CUF translation lazy evaluation had to be explicitly enforced by the use of a delay statement.</Paragraph> <Paragraph position="5"> More generally, every time we think we may be making a smart move by some kind of local fix to the control strategy we also make it more difficult for a really smart optimising controller to do its job properly. Of course we have progressed considerably in the declarativity and monotonicity of our formalisms, which we now tend to view as specialised logics, but where we have not learnt so much is in our view of the kind of people who are going to use the implemented system and what they are capable of. Where local control information is specified in the ordering of statements in definitions, we are effectively requiring that the grammar writer be an accomplished logic programmer. 
Where local control information is added to supplement an existing grammar description the implicit assumption is even more demanding: that there are individuals capable of appending local control information to descriptions that other people have written -- or worse still, translated -- and of getting it right.</Paragraph> <Paragraph position="6"> Both of these approaches ultimately assume that it is not only possible but relatively easy to retain a detailed picture of the behaviour of a complex constraint solver.</Paragraph> <Paragraph position="7"> When translating to a formalism which permits local control from one which does not, the issue may come down simply to a question of relative speed of computation, which is important enough in itself in practical situations, as we have already pointed out.</Paragraph> <Paragraph position="8"> In cases where the target formalism, like ALE, requires local control information in order to guarantee termination, much more is at stake.</Paragraph> </Section> </Section> </Paper>