File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/02/c02-1071_intro.xml
Size: 5,413 bytes
Last Modified: 2025-10-06 14:01:23
<?xml version="1.0" standalone="yes"?> <Paper uid="C02-1071"> <Title>Integrating Shallow Linguistic Processing into a Unification-based</Title> <Section position="3" start_page="0" end_page="3" type="intro"> <SectionTitle> syntactic and semantic analysis of the sentences </SectionTitle> <Paragraph position="0"> it processes; however, it fails to produce a result when the linguistic structure being processed and/or words in the input sentences fall beyond the coverage of the grammatical resources. Natural Language Processing (NLP) systems with monolithic grammars, in addition, have to deal with a huge search space due to several sources of non-determinism (i.e. ambiguity). This is particularly true of broad-coverage unification-based grammars where all dimensions of linguistic information are interleaved, as theories such as HPSG propose. Lack of robustness and inefficient processing make such systems inadequate for practical applications, e.g. Natural Language Interfaces (NLI).</Paragraph> <Paragraph position="1"> This paper presents an NLP system which integrates a linguistic (as opposed to data-driven) Part-of-Speech (PoS) tagger and chunker as a preprocessing module of a broad-coverage unification-based grammar of Spanish.</Paragraph> <Paragraph position="2"> By integrating shallow and deep processing, the efficiency of the overall analysis process improves significantly, since we can release the parser from certain tasks that may be efficiently and reliably dealt with by computationally less expensive techniques. 
The integration of shallow processing, in addition, provides the unification-based grammar with larger coverage for syntactic structures and allows us to implement default lexical entry templates for virtually unlimited lexical coverage while avoiding an increase in ambiguity.</Paragraph> <Paragraph position="3"> The system we present is inspired by (Abney,</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Coverage of the Grammar </SectionTitle> <Paragraph position="0"> The range of linguistic phenomena that the grammar handles includes: all types of subcategorization structures, determination (simple and complex), full coverage of agreement (subject-verb, subject-attribute, agreement within the NP), null-subjects (pro-drop, impersonal sentences), compound tenses and periphrastic forms, clausal complements (completive clauses and indirect questions), control and raising structures, support verb constructions, passive constructions (with the copula, with or without the 'by-agent' complement, and reflexive passive), modifiers of verbs, nouns, adjectives and adverbs, negation, sentential adjuncts, topicalization, relative and interrogative clauses, surface word order variation, coordination (binary, enumeration and coordination of unlike categories), clitics (clitic-NP alternation, clitic doubling, clitic climbing, enclitics), NPs with no noun-head, non-sentential input strings and special constructions (numbers, dates, ...).</Paragraph> </Section> <Section position="2" start_page="0" end_page="2" type="sub_section"> <SectionTitle> 2.2 The ALEP Architecture </SectionTitle> <Paragraph position="0"> ALEP distinguishes preprocessing operations and linguistic processing operations. 
The former, Text Handling (TH) and orthographemic analyses, account for surface properties of input text (document formatting, delimitation of textual structural elements, orthographemic aspects of morphology), while the latter, parsing and refinement, deal with its non-surface properties (morphosyntactic analysis, constituent structure, semantic representation). A special rule-based operation, Lifting, interfaces the output of the preprocessing operation with the parsing operation.</Paragraph> </Section> <Section position="3" start_page="2" end_page="3" type="sub_section"> <SectionTitle> 2.3 The ALEP Linguistic Formalism </SectionTitle> <Paragraph position="0"> The ALEP linguistic formalism has been developed on the basis of the specifications result- A distinctive feature of the ALEP processing architecture is the division of the analysis task into two sub-tasks: 'parsing', which builds up a complete but shallow phrase structure tree, and 'refinement', which traverses the structure top-down, thus monotonically performing feature decoration, typically with semantic information.</Paragraph> <Paragraph position="1"> 1991). It is a so-called &quot;lean&quot; formalism, compilable into first-order (Prolog) terms and thus avoiding computationally expensive formal devices. An ALEP grammar is implemented by specifying lexical entries and grammar rules, based on a type system that constitutes a monotonic simple type hierarchy with appropriateness conditions. Lexical entries are based on the data structure Linguistic Description (LD), collecting constraints on the type system. The lexical component of our grammar plays a crucial role in the grammatical description needed for processing. It is a highly lexicalized grammar where linguistic phenomena, such as subject-verb agreement, subcategorization, modification, control relations, etc., traditionally dealt with by means of specialized phrase structure rules, are treated in the lexicon. 
Grammar rules are thus reduced to a small set of binary-branching context-free phrase structure rules, which are based on the data structure Linguistic Structure (LS).</Paragraph> <Paragraph position="2"> The adopted approach in the grammar we present follows HPSG proposals (Pollard and Sag, 1994).</Paragraph> </Section> </Section> </Paper>