<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0504"> <Title>Legend: Language Swedish Swedish</Title> <Section position="3" start_page="23" end_page="25" type="metho"> <SectionTitle> 2 Testing the Current Version of Profet </SectionTitle> <Paragraph position="0"> First, a study conducted by a speech pathologist with a number of subjects is presented.</Paragraph> <Paragraph position="1"> Two quantitative studies without subjects follow. Profet, previously called Predict, has been evaluated for several years, initially together with individuals whose writing is slow and laborious because of a motor dysfunction. As slow writing speed is often believed to be a very important issue for individuals with motor impairments, the program's main purpose was to accelerate the writing process. In an effort to systematically investigate the aid provided by this program, a study was conducted in which time-saving and effort-saving were chosen as parameters. Time-saving was measured as the number of output characters produced during a given time, and effort-saving (efficiency) as a decrease in the number of keystrokes for a given text. Eight persons with motor disabilities participated in the study, six with cerebral palsy and one with a muscular disease, two of them also evidencing writing difficulties of a linguistic nature.</Paragraph> <Paragraph position="2"> A &quot;single-case design&quot; was used. Before word prediction was introduced to the writer, a baseline was established during repeated sessions with texts written without any writing support. This made it possible to compare texts written with and without Profet. 
The baseline test consisted of two tasks: a) to copy from a given text and b) to write about a topic that was chosen freely before the test began.</Paragraph> <Paragraph position="3"> Tests of the same type were then administered at three separate sessions with two months of training between tests.</Paragraph> <Paragraph position="4"> The degree of improvement in speed and efficiency was found to vary considerably among subjects, depending on their underlying writing abilities and the strategies they employed. With subjects A and B, the number of characters in text per minute increased and the total number of keystrokes decreased, as expected. Subject C, however, was too fast a typist to benefit from the program. Subject D, who was not extremely slow, felt that the program helped her because it forced her to use a more efficient typing strategy. For subject E, who was extremely slow and very easily exhausted, the program had only begun to have an effect but was expected to continue to improve performance even after the study had ended. However, contrary to our expectations, subject F, who had a severe motor disability, showed no improvement. 
For subject G, the only difference was decreased writing speed.</Paragraph> <Paragraph position="5"> Lastly, although the improvements exhibited by subject H were small, they motivated him to increase his writing significantly.</Paragraph> <Paragraph position="6"> In summary, the results of this first study indicate that a) there was most often a reduction of keystrokes, which meant less effort; b) a reduction in the number of keystrokes did not necessarily mean a saving in time; c) the writing strategy had to be changed due to a higher cognitive load on the writing process, i.e., the time saved by fewer keystrokes was consumed by the longer time spent looking for the right alternative, which involved shifting one's gaze from the keyboard to the screen and back to the keyboard, then having to make a decision and hit the right key; d) the most important aspects to the user were not speed but the effort-saving (as typing is often very laborious for a person with a motor impairment; one comment was: &quot;I get less exhausted when I write with Profet&quot;) and the possibility of producing more correct texts; e) the written texts were often better spelled and, on the whole, had a better linguistic structure, which was an unexpected, positive finding; f) a typical Profet error occurred when the subject chose an incorrect prediction (This type of error, where the word is spelled correctly but completely unrelated to the context, gives the text a bizarre look, and the text actually ends up being less intelligible than if the word had merely been misspelled. However, the improvement in spelling outweighs this problem); and g) the possibility of adding speech synthesis to the other functions of Profet was an important and helpful feature for severely dyslexic individuals. 
The implication of these findings is that the effect and efficiency of a writing aid of this type depend to a great extent on the underlying writing strategy and skills of the user.</Paragraph> <Paragraph position="7"> Two subjects who participated in the speed enhancement evaluation study turned out to have</Paragraph> <Paragraph position="9"> severe writing difficulties at different linguistic levels: the character level (spelling errors), morphological level (agreement and occasional inflection errors), and syntactic level (incorrect word order, poor grammatical variability and incorrect handling of function words).</Paragraph> <Paragraph position="10"> Of the two subjects who had difficulties with spelling and text construction, one showed substantial improvement and the other showed moderate improvement but reported a significant difference in ease of writing. These results indicated the power of prediction techniques as linguistic support for writing and stimulated interest in the present focus on the use of word prediction for persons with reading and writing difficulties and/or dyslexia. In a follow-up study, the potential to use the program as a support for spelling and sentence construction was also investigated by comparing spelling and word choice as well as qualitative aspects such as intelligibility and general style. Subsequent studies have also included individuals with writing difficulties due to linguistic and/or dyslexic difficulties. In these linguistically oriented studies, the focus has been on spelling and morphosyntactic improvement or strategy changes. Qualitative aspects of the texts, such as intelligibility and stylistics, were judged by readers who had not been told the purpose of the study. 
To summarize the findings from this follow-up study: the use of Profet resulted in considerably better spelling, not much morphological improvement, inclusion of the usually non-existent function words, and more correct word order as well as positive subjective experiences such as &quot;Profet helps me write more independently.&quot; Recently, two strictly quantitative comparative studies without subjects were also performed. In the first one, which was a preliminary test conducted at our laboratory, the Swedish, British English, Danish, and Norwegian versions of Profet were run automatically with a statistical evaluation program on text excerpts approximately 6000 characters in length. The results are presented in Table 1, where Preds is the number of suggestions presented in the prediction window, Chars the number of characters in the text, Keys the number of keystrokes required with word prediction, and Saved the keystroke savings expressed as a percentage of the number of keystrokes that would have been required, had word prediction not been used. As can be seen, keystroke savings range roughly from 33% to 38% for 5 predictions, and from 35% to 42% for 9 predictions. The cross-language variations in the results could stem from several factors, one undoubtedly being an unfortunate non-reversible character conversion error for &quot;ø&quot;, which, for Danish, resulted in predictions with the letter &quot;o&quot; and, for Norwegian, no predictions, for words with this character. A more linguistically valid factor would be differences in morphosyntactic language typology. For instance, the lower keystroke savings in Swedish compared to English might be explained in part by the fact that compounding (the formation of a new word, i.e., string, through the concatenation of two or more words) is a highly productive word creation strategy in Swedish, but not in English. 
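As a minimal sketch of the arithmetic behind the Saved column, assuming that typing without word prediction costs exactly one keystroke per character (so that the baseline equals Chars), the percentage could be computed as follows. The function name and the sample figures are illustrative and are not taken from Table 1.

```python
def keystroke_savings(chars, keys):
    """Return keystroke savings as a percentage of the unaided keystroke count.

    Assumption: without prediction, every character costs one keystroke,
    so the baseline is simply `chars`.
    """
    if not chars > 0:
        raise ValueError("chars must be positive")
    return 100.0 * (chars - keys) / chars


# Hypothetical figures, not taken from Table 1.
example = keystroke_savings(chars=6000, keys=3900)
print(round(example, 1))  # 35.0
```

Under a scanning interface, where selecting a letter may cost more than one keystroke, the baseline would have to be the actual unaided keystroke count rather than the character count.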
Another factor might be the difference in test text style, the Swedish consisting of adolescent literature with a sizable amount of dialogue, the English of newspaper text from the electronic version of the Daily Telegraph, and the Danish and Norwegian of articles on language teaching. Likewise, the style of the texts from which the lexica were built must be taken into consideration. The Swedish lexicon was created from a balanced corpus of 4 million running words, augmented with a 10,000-word frequency list and a 6,500-word high-school word list. The English lexicon was also built from a balanced corpus of some 4 million words, while the Danish was derived from a conglomerate of some 132,000 running words of newspaper text, prose, research reports, and legal and IT texts.</Paragraph> <Paragraph position="11"> The Norwegian lexicon was created from a 4 million-word corpus with a similar composition.</Paragraph> <Paragraph position="12"> The second study, conducted at the Universidad Politécnica de Madrid within the VAESS project, analyzed, on the one hand, keystroke savings obtained with different prediction systems that had been tested at various research sites, and, on the other hand, factors affecting keystroke savings (see also Boekestein, 1996). The lack of standardization of test conditions prevented any cross-linguistic or cross-product comparison of keystroke savings.</Paragraph> <Paragraph position="13"> The predictors included in the study were the Dutch (Boekestein, 1996) and Spanish (VAESS version) versions of Profet, and JAL-1 and JAL-2 for Spanish. Results from a test by Higginbotham (Higginbotham, 1992) of five word prediction systems were included as well; the systems were EZ Keys (Words, Inc.), Write 100, Predictive Linguistic Program (Adaptive Peripherals), Word Strategy (Prentke Romich Company &amp; Semantic Corporation), and GET, all of which seem to have been tested on American or British English. 
Keystroke savings for these systems are presented below.</Paragraph> <Paragraph position="14"> Factors affecting keystroke savings are test text size, test text subject (lexicon coverage), prediction method, maximum number of prediction suggestions, method for selecting prediction suggestions, amount of time needed to write the test text, and type of interface. An example is the difference between an interface with automatic row-and-column scanning, which requires two keystrokes to select a letter, and an interface with linear scanning and keystrokes on a keyboard, which requires only one keystroke per letter. Differences in morphosyntactic typology should logically also influence keystroke savings. Relevant examples are inflectional paradigm size and word order flexibility.</Paragraph> <Paragraph position="15"> Spanish, for instance, has both a significantly larger verb inflection paradigm and a freer word order than English.</Paragraph> <Paragraph position="16"> Keystroke savings are here presented for the various prediction systems. First of all, with the Dutch version of Profet, they varied between 35% and 45%, depending on the setting of the test parameters. In the testing of the Spanish VAESS version of Profet, savings were 50.34% - 51.3% for texts with lengths of 2300 - 3100 characters and the number of prediction suggestions set to 5. With the number of suggestions set to 10, the savings were 53.71% - 55.14%. It should be noted that the test texts belonged to the same corpus from which the lexicon had been built, thus assuring good lexicon coverage. For perfect adaptation of lexicon to test text, maximum savings of around 70% were obtained. The input method used was linear scanning. Testing JAL-1, JAL-2 for Spanish with frequency-based prediction yielded savings of 56.55% and 60.61%, with the number of predictions set to 5 and 10, respectively. 
Testing the same system with automaton-based syntactic prediction yielded savings of 57.83% and 61.63% with 5 and 10 predictions, respectively. With syntactic prediction based on the chart parsing method, the savings were 58.47% with 5 predictions and 61.84% with 10.</Paragraph> <Paragraph position="17"> Information on test text size was unavailable for this system. For the following five predictors, no information on test conditions was available: EZ Keys 45%, Write 100 45%, Predictive Linguistic System 41%, Word Strategy 36%, and GET 31%.</Paragraph> </Section> <Section position="4" start_page="25" end_page="25" type="metho"> <SectionTitle> 3 Why a New Version of Profet? </SectionTitle> <Paragraph position="0"> The current project started in July 1995 and originated in the search for new applications and the desire for more accurate prediction and for enhancement of the pedagogical aspects of the user interface. The goal of our research is grammatically more accurate prediction, psychological user support, and integration with the spellchecking developed by HADAR in Malmö, Sweden, resulting in a writing support tool for dyslexics. The project is funded by the National Labour Market Board (AMS), The Swedish Handicap Institute (HI), and the National Social Insurance Board (RFV).</Paragraph> </Section> <Section position="5" start_page="25" end_page="26" type="metho"> <SectionTitle> 4 Hypothesis </SectionTitle> <Paragraph position="0"> Our hypothesis is that certain aspects of the disabled individual's writing will improve with the appropriate</Paragraph> <Paragraph position="2"> use of, and training with, the new version of Profet with its augmented functionality. 
The purpose of this study is to find out a) if the user's spelling can be improved further by integrating Profet with a spellchecker, b) if the user's use of morphology (including the presence of required endings, the choice of endings and degree of agreement) improves with extension of scope and addition of grammatical tags, and c) if the subjects will approve of the predictions to a greater extent after incorporation of semantic tags.</Paragraph> <Paragraph position="3"> Test results of a first version of the new Profet show an increase in keystroke savings compared with the current version. (See Testing the New Version of Profet below). However, as previously mentioned, there is also a qualitative, non-quantifiable aspect to writing that has to be evaluated.</Paragraph> <Paragraph position="4"> 5 Description of the New Version of To date, the modifications of the prediction system include extension of scope, addition of grammatical and semantic information, and automatic grammatical tagging of user words. To accommodate the weighting of multiple information sources, the strictly frequency-based program has been replaced by one based on probabilities. Furthermore, an efficient lexicon development algorithm has been developed, facilitating the creation of new lexica from either untagged or grammatically tagged text. The word lexicons (unigrams and bigrams) were created with the new lexicon creation algorithm from a union corpus of the 300,000-word subset of the Stockholm-Umeå Corpus (SUC) 1, while awaiting the forthcoming 1 million-word final version, and a 150 million-word conglomerate of electronic texts 2, including running text from newspapers, legal documents, novels, adolescent literature, and cookbooks. 
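The lexicon creation algorithm itself is not described here. Purely as an illustration of what unigram and bigram word lexicons of a fixed size might look like, the following sketch counts words and adjacent word pairs in running text and keeps the most frequent entries. The function name build_lexicons, the whitespace tokenization, and the size parameters are assumptions made for this example, not the method used for Profet.

```python
from collections import Counter


def build_lexicons(text, max_words, max_bigrams):
    """Count unigrams and bigrams and keep only the most frequent entries."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    # most_common(n) returns the n highest-count (entry, count) pairs.
    return dict(unigrams.most_common(max_words)), dict(bigrams.most_common(max_bigrams))


# Toy input; a real lexicon would be built from a large corpus.
words, pairs = build_lexicons("the cat sat on the mat and the cat slept", 3, 2)
print(words["the"], pairs[("the", "cat")])  # 3 2
```

Capping both lexicons by frequency keeps them small enough to compare across versions, at the cost of dropping rare words and word pairs.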
For comparison with the present version of Profet, the size of the new lexicons was set to 7000 words and 14,000 bigrams, respectively.</Paragraph> <Paragraph position="5"> Grammatical and/or semantic knowledge has been used in advanced systems worldwide since the early 1990s (Tyvand and Demasco, 1993) (Guenthner et al, 1993) (Guenthner et al, 1993a) (Booth, Morris, Ricketts and Newell, 1992) and has proven able to increase communication rate (Arnott et al, 1993) (Tyvand and Demasco, 1993) (Le Pévédic and Maurel, 1996).</Paragraph> <Paragraph position="6"> 1 Currently available on CD-ROM through the European Corpus Initiative (ECI). 2 Sources: Språkdata 24 million words, SRF Tal &amp; Punkt 37 million words, Göteborgsposten 5 million words, and Pressens Bild 100 million words.</Paragraph> <Paragraph position="7"> The grammatical information that was added to our system consisted of a set of 146 grammatical tags based on that of SUC. The tag statistics for the database were derived from the SUC subset. Tag unigram (146), bigram (5163), and trigram (43,862) lexicons were created with the same lexicon-creating algorithm as the word lexicons. The inclusion of trigrams involved an extension of scope compared with the current version of Profet. Another new feature is the automatic grammatical classification of user words, which is based on n-gram statistics.</Paragraph> <Paragraph position="8"> Thirdly, a tentative effort was made to incorporate semantic information about the noun phrase into the prediction algorithm. Four semantic categories were established for nouns and adjectives: inanimate, animate, human, and inanimate behaving as human, an example of the last being &quot;company&quot; as in &quot;The company laid off 20% of its employees.&quot; The unigram word lexicon was then hand-tagged and prediction tests were run with and without semantic information. 
As stated earlier, the addition of semantic information was not motivated by a desire for further keystroke savings (Hunnicutt, 1989b). Rather, the goal was to promote coherent thinking in the writing process by demoting semantically incongruous word choices.</Paragraph> <Paragraph position="9"> As expected, fewer of these words appeared in the list of suggestions, and no keystroke savings were gained. In fact, the results exhibited a 1% decrease in savings, which seems to have two explanations.</Paragraph> <Paragraph position="10"> First of all, the addition of semantic tags increased the total number of tags from 146 to 338, resulting in sparser training data. Secondly, the semantic tagging was done statically, i.e., each word received one and only one semantic tag, independent of context.</Paragraph> <Paragraph position="11"> A large percentage of the words belonged to all four categories. It would therefore be useful to expand the semantic classification system.</Paragraph> </Section> </Paper>