<?xml version="1.0" standalone="yes"?> <Paper uid="J80-2001"> <Title>Toward Natural Language Computation 1</Title> <Section position="4" start_page="0" end_page="0" type="intro"> <SectionTitle> 3. The Scanner </SectionTitle> <Paragraph position="0"> The scanner collects the string of tokens from the input and identifies each one as well as it can. A token may be a number or ordinal in any of several forms, a name known to the system, a punctuation mark, or a dictionary word, possibly abbreviated or slightly misspelled. The scanner outputs a set of alternative definitions for each incoming token, and the syntax stage attempts to select the intended meaning for each one.</Paragraph> <!-- Page 74. American Journal of Computational Linguistics, Volume 6, Number 2, April-June 1980. Alan W. Biermann and Bruce W. Ballard, Toward Natural Language Computation. --> <Paragraph position="1"> Each dictionary entry consists of a set of pairs of features. Two examples appear in Figure 3: the definitions of the word &quot;zero&quot; as an imperative verb and as an adjective. As a verb, &quot;zero&quot; takes one argument and no particle (type OPS1). The meaning of an imperative verb is built into the execution code of the matrix computer, as explained in Section 6. As an adjective, the meaning of &quot;zero&quot; is embedded in the semantics code described in Section 5. That code will execute a routine associated with the name in the example input sentence. Associated with each token is the set of alternate definitions proposed by the system, and the syntax stage attempts to choose among them so that the sentence is meaningful. Most tokens are found in the dictionary, but the string &quot;thee&quot; is not, so the spelling corrector selects dictionary entries that are similar to the unknown string. The token &quot;y&quot; is also not found in the dictionary but is recognized as the name of an existing matrix entity. 
The words &quot;zero&quot; and &quot;to&quot; appear in the dictionary with multiple meanings.</Paragraph> <Paragraph position="2"> Input sentence: &quot;Add y to thee zero entries.&quot;
INTERPRETATION(S)
add: add - verb
y: y - propname
to: to - verbicle; to - prep
thee: thee - propname; the - art; them - pron; then - etc; there - etc; these - pron; these - art; three - num
zero: zero - verb; zero - adj; zero - num
entries: entries - noun
.: . - punctuation
Scanner output for a sample sentence giving alternate interpretations for each word.</Paragraph> </Section> </Paper>