<?xml version="1.0" standalone="yes"?> <Paper uid="C94-1082"> <Title>References</Title> <Section position="6" start_page="505" end_page="506" type="concl"> <SectionTitle> 5 Discussion </SectionTitle> <Paragraph position="0"> The ability to deal with large amounts of possibly ill-formed text is one of the principal objectives of current NLP research. Recent proposals include the use of probabilistic methods (see e.g.</Paragraph> <Paragraph position="1"> Briscoe and Carroll, 1993) and large robust deterministic systems like Hindle's Fidditch (Hindle, 1989). 4 Experience so far suggests that systems like LHIP may in the right circumstances provide an alternative to these approaches. LHIP combines the advantages of Prolog-interpreted DCGs (ease of modification, parser output suitable for direct use by other programs, etc.) with the ability to relax the adjacency constraints of that formalism in a flexible and dynamic manner.</Paragraph> <Paragraph position="2"> LHIP is based on the assumption that partial results can be useful (often much more useful than no result at all), and that an approximation to complete coverage is more useful when it comes with indications of how approximate it is.</Paragraph> <Paragraph position="3"> This latter point is especially important in cases where a grammar must be usable to some degree at a relatively early stage in its development, as is the case, for example, with the development of a grammar for the Map Task Corpus. In the near future, we expect to apply LHIP to a different problem, that of defining a restricted language for specialized parsing.</Paragraph> <Paragraph position="4"> The rationale for the distinction between sanctioned and unsanctioned non-coverage of input is twofold. First, the 'ignore' facility permits different categories of unidentified input to be distinguished. For example, it may be interesting to separate material which occurs at the start of the input from that appearing elsewhere. 
Ignore rules have a similar functionality to that of normal rules. In particular, they can have arguments, and may therefore be used to assign a structure to unidentified input so that it may be flagged as such within an overall parse. Secondly, by setting a threshold value of 1, LHIP can be made to perform like a standardly interpreted Prolog DCG, though somewhat more efficiently due to the use of the chart. (Footnote 4: Indeed, the ability of Fidditch to return a sequence of parsed but unattached phrases when a global analysis fails has clearly influenced the design of LHIP.)</Paragraph> <Paragraph position="5"> A number of possible extensions to the system can be envisaged. Whereas at present each rule is compiled individually, it would be preferable to enhance preprocessing in order to compute certain kinds of global information from the grammar. One improvement would be to determine possible linking of 'root-to-head' sequences of rules, and index these to terminal items for use as an oracle during analysis. A second would be to identify those items whose early analysis would most strongly reduce the search space for subsequent processing, and to scan the input to begin parsing at those points rather than proceeding strictly from left to right. This further suggests the possibility of a parallel approach to parsing.</Paragraph> <Paragraph position="6"> We expect that these measures would increase the efficiency of LHIP.</Paragraph> <Paragraph position="7"> Also, results are currently returned in an order determined by the order of rules in the grammar.</Paragraph> <Paragraph position="8"> It would be preferable to arrange matters in a more cooperative fashion so that the best results (those with the highest coverage-to-span ratio) are displayed first. Support for bidirectional parsing (see Satta and Stock, to appear) is another candidate for inclusion in a later version. These appear to be longer-term research goals, however. 
6 Acknowledgments: The authors would like to thank Louis des Tombe and Dominique Estival for comments on earlier versions of this paper.</Paragraph> </Section> </Paper>