<?xml version="1.0" standalone="yes"?>
<Paper uid="E91-1027">
  <Title>THE RECOGNITION CAPACITY OF LOCAL SYNTACTIC CONSTRAINTS</Title>
  <Section position="8" start_page="0" end_page="0" type="concl">
    <SectionTitle>
ALL OLD PEOPLE LIKE BOOKS ABOUT FISH
</SectionTitle>
    <Paragraph position="0"/>
    <Paragraph position="2"> We are left with four valid paths through the sentence, out of the 256 tentative paths in SG.</Paragraph>
    <Paragraph position="3"> Two paths represent legal syntactic interpretations (of which one is &amp;quot;the intended&amp;quot; meaning). The other two are locally valid but globally incorrect, having either two verbs or no verb at all, in contrast to the grammar. SCr(t,2) would have rejected one of the two wrong paths.</Paragraph>
    <Paragraph position="4"> Note that in this particular example the method was quite effective in reducing sentence-wide interpretations (leaving an easy job even for a deterministic parser), but it was not very good at disambiguating individual word tags. These two sub-goals of tagging disambiguation - reducing the number of paths and reducing word-level possibilities - are not identical. It is possible to construct sentences in which all words are two-way ambiguous and only two disjoint paths out of the 2^N possible paths are legal, thus preserving all word-level ambiguity.</Paragraph>
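The path-filtering idea can be sketched as follows. This is a minimal illustration, not the paper's SG grammar or SCr implementation: the tag candidates and the table of licensed tag bigrams are hypothetical, chosen only to show how local pair constraints prune the Cartesian product of tag assignments while word-level ambiguity may persist.

```python
from itertools import product

# Hypothetical per-word tag candidates (each word is two-way ambiguous)
# and an illustrative set of licensed tag bigrams.
candidates = [["det", "adj"], ["adj", "noun"], ["noun", "verb"], ["verb", "noun"]]
allowed = {("det", "adj"), ("det", "noun"), ("adj", "adj"),
           ("adj", "noun"), ("noun", "verb")}

def valid_paths(candidates, allowed):
    """Keep only tag paths all of whose adjacent pairs are licensed."""
    return [p for p in product(*candidates)
            if all((p[i], p[i + 1]) in allowed for i in range(len(p) - 1))]

paths = valid_paths(candidates, allowed)
# Of the 2^4 = 16 tentative paths, only 2 survive the bigram constraints,
# yet the first word remains ambiguous between "det" and "adj".
```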
    <Paragraph position="5"> We demonstrated the potential of efficient path reduction for a pre-parsing filter. But short-context techniques can also be integrated into the parsing process itself. In this mode, when the parser hypothesizes the existence of a constituent, it will first check whether local constraints rule out that hypothesis. In the example above, a more sophisticated method could have used the fact that our grammar does not allow verbs in constituents other than VP, or that it requires one and only one verb in the whole sentence.</Paragraph>
    <Paragraph position="6"> The motivation for this method, and its principles of operation, are similar to those behind different techniques combining top-down and bottom-up considerations. The performance gains depend on the parsing technique; in general, allowing early decisions regarding inconsistent tag assignments, based on information which may be only implicit in the grammar, offers considerable savings.</Paragraph>
    <Paragraph position="7"> 7. Educated Guess of Unknown Words Another interesting aid which local syntactic constraints can provide for practical parsers is &amp;quot;an oracle&amp;quot; which makes &amp;quot;educated guesses&amp;quot; about unknown words. It is typical for language analysis systems to assume a noun whenever an unknown word is encountered. There is sense in this strategy, but the use of LCA, even LCA(1), can do much better.</Paragraph>
    <Paragraph position="8"> To illustrate this feature, we go back to the princess and the frog. Suppose that an adjective unknown to the system, say &amp;quot;Transylvanian&amp;quot;, was used rather than &amp;quot;charming&amp;quot; in example (1), yielding the input sentence: (3) The Transylvanian princess kissed a frog.</Paragraph>
    <Paragraph position="9"> Checking out all tags in T in the second position of the tag image of this sentence, the only tag that satisfies the constraints of LCA(1) is adj.</Paragraph>
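The oracle idea reduces to a small filter. The sketch below is illustrative rather than the paper's actual LCA(1): the tag inventory and bigram table are hypothetical stand-ins for constraints derived from a grammar. For an unknown word, only those tags are kept whose bigrams with the neighboring tags are licensed.

```python
# Hypothetical tag set and licensed tag bigrams (not the paper's grammar).
TAGS = ["det", "adj", "noun", "verb"]
ALLOWED = {("det", "adj"), ("det", "noun"), ("adj", "adj"),
           ("adj", "noun"), ("noun", "verb"), ("verb", "det")}

def guess_tags(prev_tag, next_tag):
    """Educated guess: tags for an unknown word that satisfy both
    neighboring bigram constraints."""
    return [t for t in TAGS
            if (prev_tag, t) in ALLOWED and (t, next_tag) in ALLOWED]

# "The Transylvanian princess ...": the unknown word sits between a
# determiner and a noun; only "adj" survives the two constraints.
guesses = guess_tags("det", "noun")
```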
    <Paragraph position="10"> 8. &amp;quot;Context Sensitive&amp;quot; Spelling Verification A related application of local syntactic constraints is spelling verification beyond the basic word level (which is, in fact, SCr(t,0) ).</Paragraph>
    <Paragraph position="11"> Suppose that while typing sentence (1), a user made a typing error and instead of the adjective &amp;quot;charming&amp;quot; wrote &amp;quot;charm&amp;quot; (or &amp;quot;arming&amp;quot;, or any other legal word which is interpreted as a noun): (4) The charm princess kissed a frog.</Paragraph>
    <Paragraph position="12"> This is the kind of error that a full parser would recognize but a word-based spell-checker would not. But in many such cases there is no need for the full power (and complexity) of a parser; even LCA(1) can detect the error. In general, an LCA which is based on a detailed grammar offers cheap and effective means for invalidating a large set of ill-formed inputs.</Paragraph>
    <Paragraph position="13"> Here too, one may want to get another point of view by considering the simple formal language L = {a^m b^m}. A single typo results in a string with one &amp;quot;a&amp;quot; changed to a &amp;quot;b&amp;quot;, or vice versa. Since LCA(1) recognizes strings of the form {a^j b^k} for 1 &lt;= j,k, given arbitrary strings of length n over T = {a, b}, LCA(1) will detect all but two of the n single typos possible - those on the borderline between the a's and b's.</Paragraph>
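The {a^m b^m} example can be checked mechanically. Below is a minimal sketch of an LCA(1)-style bigram filter for this toy language; the set of licensed bigrams {aa, ab, bb} plus the boundary conditions (start with a, end with b) is exactly what characterizes {a^j b^k} with j,k >= 1.

```python
# Bigrams licensed by the toy language {a^j b^k}, j,k >= 1.
ALLOWED = {"aa", "ab", "bb"}

def lca1_accepts(s):
    """LCA(1)-style check over {a,b}: valid boundaries and adjacent pairs."""
    if not s or s[0] != "a" or s[-1] != "b":
        return False
    return all(s[i:i + 2] in ALLOWED for i in range(len(s) - 1))

word = "aaabbb"  # a^3 b^3, a member of L
# Generate all n single-typo variants by flipping one symbol.
typos = ["".join(("b" if c == "a" else "a") if i == j else c
                 for i, c in enumerate(word))
         for j in range(len(word))]
undetected = [t for t in typos if lca1_accepts(t)]
# Exactly the two borderline typos survive: "aabbbb" and "aaaabb".
```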
    <Paragraph position="14"> Remember that everything is relative to the toy grammar used throughout this paper. Hence, although &amp;quot;the charm princess&amp;quot; may be a perfect noun phrase, it is illegal relative to our grammar. 9. Assistance to Tagging Systems Tagged corpora are important resources for many applications. Since manual tagging is a slow and expensive process, it is a common approach to try automatic heuristics and resort to user interaction only when there is no decisive information. A well-built tagging system can &amp;quot;learn&amp;quot; and improve its performance as more text is processed (e.g. by using the already tagged corpus as a statistical knowledge base).</Paragraph>
    <Paragraph position="15"> Arguments such as those given in sections 7 and 8 above suggest that the use of local constraints can resolve many tagging ambiguities, thus increasing the &amp;quot;speed of convergence&amp;quot; of an automatic tagging system. This seems to be true even for the rather simple and inexpensive LCA(1) for languages with a relatively rigid word order. For related work cf. \[Greene/Rubin 71\], \[Church 88\], \[DeRose 88\], and \[Marshall 83\].</Paragraph>
    <Paragraph position="16"> 10. Final Remarks To make our presentation simpler, we have limited the discussion to straightforward context free grammars. But the method is more general.</Paragraph>
    <Paragraph position="17"> It can, for example, be extended to CFGs augmented with conditional equations on features (such as agreement) - either by translating such grammars to equivalent CFGs with a more detailed tag set (assuming a finite range of feature values), or by augmenting our automata with conditions on arcs. It can also be extended to a probabilistic language model, generating probabilistic constraints on tag sequences from a probabilistic CFG (such as that of \[Fujisaki et al. 89\]).</Paragraph>
    <Paragraph position="18"> Perhaps more interestingly, the method can be used even without an underlying grammar, if a large corpus and a lexical analyzer (which suggests pre-disambiguated cohorts) are available. This variant is based on a technique of invalidating tag pairs (or longer sequences) which satisfy certain conditions over the whole language L, and on the fact that L can be approximated by a large corpus. We cannot elaborate on this extension here.</Paragraph>
  </Section>
</Paper>