<?xml version="1.0" standalone="yes"?>
<Paper uid="A88-1034">
  <Title>CANONICAL REPRESENTATION IN NLP SYSTEM DESIGN: A CRITICAL EVALUATION</Title>
  <Section position="6" start_page="257" end_page="258" type="concl">
    <SectionTitle>
5 Conclusion
</SectionTitle>
    <Paragraph position="0"> We believe that the Lucy experiment with canonical representations has generally succeeded in lowering the amount of effort Lucy spends on search. The parser usually returns a single analysis, instead of many, and the semantics module usually succeeds in ruling out most of the possibilities when they are finally unpacked. A further benefit is that debugging some individual modules has been made easier.</Paragraph>
    <Paragraph position="1"> We have found, in particular, that debugging a grammar that typically produces only one or a very small number of parses is much easier than when the grammar returns, say, hundreds of parses for a given sentence.</Paragraph>
    <Paragraph position="2"> But what of the hidden costs to the system? The course of our research has caused us to step back and question the whole idea of canonical structures for two primary reasons: first, canonical structures tend to let declarative information be far too influenced by processing concerns; second, modules leak in such designs, essentially doing away with one of the main arguments for such control models in the first place.</Paragraph>
    <Paragraph position="3"> There are rather serious practical, as well as theoretical, consequences when canonical forms make their way into the grammar in the way discussed in Section 4.1. First is the problem of lexical acquisition when lexical category assignments become so off-beat. Second, with arcane relations between syntactic output and semantic result as discussed in Section 4.2, it becomes difficult to see how such systems could be easily used for other purposes than the specific ones they have been written for. For instance, it is hard to see how multilingual systems could relate grammars when individual grammars have been so heavily influenced by the accidental vagaries of processing concerns in that language. It is also hard to see how a generation system could easily make use of such grammars, since the mapping rules will tend to be complicated and fundamentally unidirectional.</Paragraph>
    <Paragraph position="4"> The moral to be drawn from the remarks in Section 4.3 seems to be that a canonical structure model, at least in its extreme form, does not permit us to maintain the modularity of a traditional conduit model. If we return to Figure 2 above, it is clear that when we finally begin enumerating the branching that has simply been delayed in the canonical output of module A, we will still have to use the information that fundamentally belongs in module A, even though we are doing this processing in module B. The effect is that we will require passing along information from box to box. Thus, we end up doing interleaving whether we want to or not. [Footnote 8: Furthermore, almost any physical object can serve as a container: &quot;We had lunch at the dump. I drunk a hubcap of beer and ate a distributor cap of pate.&quot;] [Footnote 9: There is some evidence for treating &quot;of&quot; as a member of a distinct syntactic class. For one thing, &quot;of&quot;, unlike other prepositions, cannot attach to sentences (though it can mark an argument of the verb: &quot;the time has come, the Walrus said, to talk of many things...&quot;).]</Paragraph>
    <Paragraph position="5"> Although these conclusions seem to be damning for the general design philosophy, we should note that our attempts at evaluation here are open to the criticism that a single case history does not necessarily justify general conclusions about a design philosophy. There is always the possibility that the design wasn't applied &quot;right&quot; in the case at hand. In particular, we should distinguish the proposal for hand-tooling canonical representations into a grammar as we have done in Lucy from the proposal for automatically inferring higher level generalizations from modules that themselves have still been driven by principled linguistic concerns. The proposals of Church and Patil (1982) fall more into this latter camp, and it is a goal of the ongoing redesign efforts in Lucy to incorporate some version of automatic generalization.</Paragraph>
    <Paragraph position="6"> Despite the negatives, it is possible that for some NLP applications the balance could still tip in favor of using canonical representations for some limited set of structures such as noun compounding or PP modifier attachment. Applications that have no pretensions of being fully general or easily extensible may be willing to pay the price that canonicalization exacts in order to avoid a more complex design and still achieve acceptable performance results. In fact, we expect that the need for methods that incorporate some form of delayed evaluation will continue to be pressing in natural language analysis, and in view of the short supply of such methods currently available, canonicalization may continue to have its place in the near term. However, our conclusion after two years of pursuing such techniques is that conduit control models using canonical structures ultimately offer no real alternative to more complex designs in which control is interleaved among modules.</Paragraph>
  </Section>
</Paper>