<?xml version="1.0" standalone="yes"?> <Paper uid="W01-0903"> <Title>Usability Evaluation in Spoken Language Dialogue Systems</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 User involvement </SectionTitle> <Paragraph position="0"> In general, representative users from the target user group(s) should be involved in evaluation from early on. The developers themselves can certainly discover many of the usability problems with the early design and implementation, especially when supported by state-of-the-art usability standards, evaluation criteria and design support tools. The problem is that they know all too well how to interact with the system and thus avoid creating interaction problems which the system cannot handle. For the time being, there is no alternative to involving the target users in all or most system evaluation phases and for most usability evaluation purposes. This is costly and complex to do. However, the data analysis, which is crucial to benefiting from trials with the system, is as necessary after trials with developers as it is after trials with representative users. Even the early involvement of representative users is no guarantee that the system will ultimately produce sufficient user satisfaction. For one thing, the data distribution they generate may not match the behaviour of the users of the system, once installed. For another, experimental user trials are different from real situations of use in which time, money and trust are really at stake. For these reasons, and particularly when introducing SLDSs which are innovative in some respect, it is necessary to prepare and budget for field trials with the implemented system as well as for the subsequent data analysis and fine-tuning of the system. Users who are &quot;only&quot; involved in a test can be much more indifferent to, or more positive towards, a system with poor usability characteristics than real users who have something to lose if the system lets them down (Bernsen et al. 1998).</Paragraph> <Paragraph position="1"> 5 How to evaluate Evaluation, including usability evaluation, is non-trivial and cannot be explained simply by stating what to evaluate (cf. Section 6) and what the developers' options are. One of the most difficult questions in evaluation probably is how to do it properly. We have developed a template which supports consistent and detailed description of each evaluation criterion (Bernsen and Dybkjaer 2000). The template includes the following issues: what is being evaluated (e.g. feedback adequacy), the system part evaluated (e.g. the dialogue manager), type of evaluation (e.g. qualitative), method(s) of evaluation (e.g. controlled user experiments), symptoms to look for (e.g. user clarification questions), life cycle phase(s) (e.g. simulation), importance of evaluation (e.g. crucial), difficulty of evaluation (e.g. easy), cost of evaluation (e.g. expensive), and support tools (e.g. SMALTO), see (www.disc2.dk/tools). 
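To make the template concrete, the following minimal sketch (not part of the DISC deliverables; all field names and the example instance are assumptions for illustration) shows how one criterion could be recorded as a structured object:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EvaluationCriterion:
    """Template-based characterisation of one usability evaluation criterion."""
    what_is_evaluated: str        # e.g. "feedback adequacy"
    system_part: str              # e.g. "the dialogue manager"
    evaluation_type: str          # e.g. "qualitative"
    methods: List[str]            # e.g. ["controlled user experiments"]
    symptoms: List[str]           # e.g. ["user clarification questions"]
    life_cycle_phases: List[str]  # e.g. ["simulation"]
    importance: str               # e.g. "crucial"
    difficulty: str               # e.g. "easy"
    cost: str                     # e.g. "expensive"
    support_tools: List[str] = field(default_factory=list)  # e.g. ["SMALTO"]

# Hypothetical instance mirroring the examples given in the text above.
feedback_adequacy = EvaluationCriterion(
    what_is_evaluated="feedback adequacy",
    system_part="the dialogue manager",
    evaluation_type="qualitative",
    methods=["controlled user experiments"],
    symptoms=["user clarification questions"],
    life_cycle_phases=["simulation"],
    importance="crucial",
    difficulty="easy",
    cost="expensive",
    support_tools=["SMALTO"],
)
```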
The idea is that the combined set of (i) design options for SLDS usability, (ii) usability evaluation criteria, and (iii) template-based characterisation of each criterion, will provide developers with sufficient information for proper evaluation of their SLDSs.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 6 What to evaluate </SectionTitle> <Paragraph position="0"> In general terms, a usable SLDS must satisfy user needs which are similar to those which must be satisfied by other interactive systems.</Paragraph> <Paragraph position="1"> The SLDS must be easy to understand and interact with. Interaction should be smooth and the user should feel in control throughout the dialogue with the system. It is the task of the SLDS developer to meet those user needs considered as overall usability design goals.</Paragraph> <Paragraph position="2"> However, SLDSs are very different from more traditional interactive systems whose usability aspects have been investigated for decades, such as systems controlled through graphical user interfaces involving screen, keyboard and mouse. Perhaps the most important difference is that speech is perceptually transient rather than static. This means that the user must pick up the output information provided by the system the moment it is being provided or else miss it altogether. Moreover, the user has no way of inspecting the interface prior to interaction. If the interface is not self-evident all the way through the dialogue, it must be learnt by trial-and-error through repeated interaction, which is unsatisfactory for the casual walk-up-and-use user. Secondly, the processing (recognition, language understanding, dialogue management) of spoken input remains difficult to design and error-prone in execution, which is why SLDSs must be crafted with extreme care to ensure that users do not produce spoken input which the system is incapable of handling.</Paragraph> <Paragraph position="3"> In the following we present a set of usability evaluation criteria which are based on results achieved in the European DISC project (www.disc2.dk) on best practice in the development and evaluation of SLDSs (Failenschmid et al. 1999). Our claim is that quality usability of SLDSs may be pursued by focusing on a comprehensive set of 15 usability issues which include all major usability perspectives on SLDSs. These usability issues - or a subset thereof, depending on how advanced and complex the system is to be - should be represented in the system specification and should therefore also be reflected in a set of evaluation criteria for the system which would appear mandatory for evaluating the usability of the SLDS. More details on the usability issues can be found in (Dybkjaer and Bernsen 2000).</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.1 Modality appropriateness </SectionTitle> <Paragraph position="0"> The majority of task-oriented SLDSs have so far used speech as the only input/output modality.</Paragraph> <Paragraph position="1"> However, an increasing number of systems now combine spoken input/output with other modalities.</Paragraph> <Paragraph position="2"> It is well-known that speech-only interaction is not appropriate for all tasks and applications, and the same is true for any particular modality combination which includes speech input and speech output. Few users would e.g. be happy to speak aloud their pin code to the bank teller machine in the street. 
The developers should attempt to make sure that spoken input and output, possibly combined with other input/output modalities, is an appropriate modality choice for the planned application. If the chosen modalities are inappropriate, chances are that the users either will not accept the application or will refrain from using some of the modalities it offers. Common sense, experimentation and/or the use of the tool SMALTO (www.disc2.dk/tools) may help the developers in making the right modality choice.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.2 Input recognition adequacy </SectionTitle> <Paragraph position="0"> From the user's point of view, good speech recognition means that the system rarely gets the user's spoken input wrong or fails to recognise what the user just said. Recognition success, as perceived by the user, not only depends on recogniser quality but also on how other parts of the SLDS handle the user's input. Good recogniser quality nevertheless remains the key factor in making users confident that the system will successfully get what they say.</Paragraph> <Paragraph position="1"> Walk-up-and-use systems may be used by many different users in highly different environments. The speech recogniser, therefore, and depending on more specific information on its intended users and environments of interaction, must be trained to recognise a variety of dialects and accents, speakers of different gender, age and voice quality, speaking with a low or a loud voice, in noisy or quiet environments, and with varying channel quality.</Paragraph> <Paragraph position="2"> Adequate information on users and environments is essential input to the selection and creation of training data. To assess the quality of the system's recognition capabilities prior to running the full system, speech recognition accuracy may be tested on the recogniser with users from the target group(s).</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.3 Naturalness of user speech </SectionTitle> <Paragraph position="0"> Speaking to an SLDS should feel as easy and natural as possible. It does not help the user that the system's speech recognition is perfect in principle if the input vocabulary and grammar expected from the user are not the ones which the user is likely to use and thus cannot be understood. Depending on, e.g., the task and users' experience, what is &quot;natural&quot; input language may vary considerably.</Paragraph> <Paragraph position="1"> What is being experienced as natural input speech is also highly relative to the system's output phrasing. For example, lengthy and/or overly polite system utterances are likely to invite similar linguistic user behaviour, thereby burdening input recognition and understanding unnecessarily. The system's output language thus should be used to control users' input language so that the latter becomes manageable for the system whilst still feeling natural to the user. 
If the minimal constraints imposed by the task are satisfied and the system's output language adequately controls the user's input language, users may well feel that the dialogue is natural even if they are not inclined to engage in lengthy conversation.</Paragraph> <Paragraph position="2"> Analysis of data from system simulations, questionnaires and interviews is a useful tool for obtaining information on users' input language and on what they perceive as being natural input language.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.4 Output voice quality </SectionTitle> <Paragraph position="0"> From the user's point of view, good SLDS output voice quality means that the system's speech is clear and intelligible, does not demand an extra listening effort, is not particularly noise sensitive or distorted by clicks and other extraneous sounds, has natural intonation and prosody, uses an appropriate speaking rate, and is pleasant to listen to (Karlsson 1999). Taken together, these requirements are difficult to meet today.</Paragraph> <Paragraph position="1"> There are three main types of output speech: recordings of entire system utterances, concatenation of recorded words and phrases, and text-to-speech (TTS). Concatenated speech is the most frequently used type of speech in today's SLDSs. For walk-up-and-use systems in particular, TTS may simply be too difficult to understand for infrequent users while full recordings are much too inflexible. Moreover, output speech which is too natural, such as full recordings, may suggest to users that the system is far more capable and human-like than it actually is, encouraging them to address the system in a way which is more conversational and talkative than it can handle.</Paragraph> <Paragraph position="2"> The type of output voice chosen is likely to affect users' perception of the system as a whole. In particular, and together with the quality of the speech output, the voice type has a major influence on how pleasant users find the &quot;system's voice&quot;. Voice type includes features such as male/female, deep/high voice, speaking rate, and emotions.</Paragraph> <Paragraph position="3"> In order to gather input on user preferences with respect to the system's output voice, representative users of the system under development may be asked to listen to different &quot;system voices&quot; and provide feedback on which one they prefer and what they like and dislike about each of them.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.5 Output phrasing adequacy </SectionTitle> <Paragraph position="0"> Regardless of the topic, the system should express itself co-operatively in order to maximise the likelihood that the task is achieved as smoothly and efficiently as possible. To facilitate successful interaction, the contents of the system's output should be correct, relevant and sufficiently informative without being overinformative. Users have good reason for dissatisfaction if the system provides false information, e.g., if the database is not being properly updated. Lack of relevance of system output caused by, e.g., misrecognition, will typically lead to meta-communication dialogue.</Paragraph> <Paragraph position="1"> System output should be sufficiently informative. 
Otherwise, misunderstandings may occur which are only detected much later during interaction, if at all, or which, at best, lead to immediate requests for clarification by the user.</Paragraph> <Paragraph position="2"> Conversely, the system should not provide too much or overly verbose information. Users may then e.g. become inattentive, try to take the dialogue initiative, or become confused and initiate clarification meta-communication.</Paragraph> <Paragraph position="3"> The form of system expressions should be clear and unambiguous, and language and, as far as possible, terminology should be consistent and familiar to the user (Bernsen et al. 1998).</Paragraph> <Paragraph position="4"> Unclarity naturally leads to uncertainty and a need for clarification. So does ambiguity if detected by the user. If undetected, as often happens, the effects of ambiguity can be severe. If the user unknowingly selects a non-intended meaning of a word or phrase uttered by the system, all sorts of things can go wrong. To help avoid ambiguity it is, moreover, advisable to use the same expressions for the same purposes throughout the dialogue. The system should preferably not use terms and expressions which are not familiar to most or all of its users. If the system must do that, unfamiliar terminology should be explained either proactively (before users ask) or through adequate measures for clarification meta-communication. Developers may use CODIAL - a tool based on Cooperativity Theory - as support for the design and evaluation of co-operative system dialogue (www.disc2.dk/tools).</Paragraph> <Paragraph position="5"> It is important to realise that the system's output language tends to have a massive priming effect on the user's language. It is, therefore, crucial that the words and grammar used in system output can be recognised and understood by the system itself. Similarly, the system should have a speaking style which induces users to provide input that is to the point and can be handled by the system.</Paragraph> <Paragraph position="6"> Interaction data analysis is needed to assess the efficiency of the input control strategies adopted. User contacts through interviews and questionnaires are good means for obtaining early input on how users experience the system's output.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.6 Feedback adequacy </SectionTitle> <Paragraph position="0"> Adequate feedback is essential for users to feel in control during interaction. The user must feel confident that the system has understood the information input in the way it was intended, and the user must be told which actions the system has taken and what the system is currently doing. One difficulty is that telling the user is not always good enough - the user must be told in such a way that the user notices what the system says. It may therefore be a good thing for SLDSs to provide several different kinds of feedback to their users. We distinguish between process feedback and information feedback.</Paragraph> <Paragraph position="1"> When the system processes information received from the user and hence may not be speaking for a while, process feedback - which may be provided in many different ways - keeps the user informed on what is going on. A user who is uncertain about what the system is doing, if anything, is liable to produce unwanted input or to believe that the system has crashed and decide to hang up. 
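As a small, hedged illustration of one way process feedback might be realised (the function names, prompt wording and the two-second threshold are assumptions, not taken from any particular system), a slow backend lookup could trigger a spoken holding prompt so that silence is not mistaken for a crash:

```python
import threading
import time

def play_prompt(text: str) -> None:
    # Placeholder for the system's recorded-prompt or text-to-speech output.
    print(f"SYSTEM: {text}")

def run_with_process_feedback(backend_call, wait_threshold: float = 2.0):
    """Run a possibly slow backend query; if it exceeds the threshold,
    give spoken process feedback while the user is waiting."""
    result = {}

    def worker():
        result["value"] = backend_call()

    t = threading.Thread(target=worker)
    t.start()
    t.join(timeout=wait_threshold)
    while t.is_alive():
        play_prompt("Please wait, I am still searching the database.")
        t.join(timeout=wait_threshold)
    return result["value"]

# Hypothetical usage with a lookup that takes five seconds.
if __name__ == "__main__":
    slow_lookup = lambda: (time.sleep(5), "Three departures found")[1]
    print(run_with_process_feedback(slow_lookup))
```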
Moreover, the uncertainty itself is likely to negatively affect the user's satisfaction with the system.</Paragraph> <Paragraph position="2"> Feedback on the system's understanding of what the user just said and on the actions taken by the system helps ensure that, throughout the dialogue, the user is left in no doubt as to what the system has understood and is doing.</Paragraph> <Paragraph position="3"> Information feedback can be provided in different ways and more or less explicitly. The amount and nature of the information feedback depends e.g. on the cost and risk involved in the user-system transaction. A user who is uncertain as to what the system has understood, or done, is liable to produce unwanted input and to react negatively to the way the system works.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.7 Adequacy of dialogue initiative </SectionTitle> <Paragraph position="0"> To support natural interaction, an SLDS needs a reasonable choice of dialogue initiative, an appropriate dialogue structure, sufficient task and domain coverage, and sufficient reasoning capabilities.</Paragraph> <Paragraph position="1"> Spoken human-human dialogue is prototypically mixed-initiative. However, many task-oriented dialogues tend to be directed primarily by one of the interlocutors. Users may even feel satisfied with less initiative when interacting with an SLDS than when talking to a person as long as the dialogue initiative distribution fits the task(s) the system and the user must solve together, and provided that the rest of the best practice issues proposed in this paper are properly attended to. Thus, system directed dialogue can work well for tasks in which the system simply requires a series of specific pieces of information from the user, in particular if the user is new to the system. To satisfy experienced users, the system may have to be able to cope with the larger packages of input information which are natural to these users.</Paragraph> <Paragraph position="2"> In principle, a (mainly) user directed dialogue is as much of an aberration from mixed initiative dialogue as is the (mainly) system directed dialogue. Currently, user directed dialogue would seem to be appropriate primarily for applications designed for experienced users who know how to make themselves understood by the system. Unless supported by screen graphics or other additional modalities, inexperienced users are likely to address the system in ways it cannot cope with.</Paragraph> <Paragraph position="3"> Mixed initiative dialogue, i.e. a mixture of system and user initiative, is often both desirable and technically feasible. At some points in the dialogue it may be appropriate that the system takes the initiative to guide the user, obtain missing information, or handle an error. At other points, such as when the user needs information from the system, is already familiar with the system or wants to correct an error, it is appropriate for the user to take the initiative.</Paragraph> </Section> <Section position="8" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.8 Naturalness of the dialogue structure </SectionTitle> <Paragraph position="0"> As long as we cannot build fully conversational systems, dialogue designers may have to impose some kind of structure onto the dialogue, determining which topics (or sub-tasks) could be addressed when. 
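As a hedged sketch of what such an imposed structure might look like, the fragment below hard-codes the sub-task order of a system-directed, slot-filling dialogue; the domain, slot names and prompts are invented for illustration, and recognition and understanding are assumed to happen elsewhere:

```python
# Each sub-task pairs a slot to be filled with the prompt that asks for it,
# in the fixed order a system-directed dialogue would impose.
DIALOGUE_STRUCTURE = [
    ("departure_city", "Where do you want to travel from?"),
    ("arrival_city", "Where do you want to travel to?"),
    ("travel_date", "On which date do you want to travel?"),
    ("number_of_tickets", "How many tickets do you need?"),
]

def run_system_directed_dialogue(get_user_answer):
    """Walk through the sub-tasks in a fixed order and collect the answers."""
    filled_slots = {}
    for slot, prompt in DIALOGUE_STRUCTURE:
        filled_slots[slot] = get_user_answer(prompt)
    return filled_slots

# Hypothetical usage with canned answers instead of a real recogniser.
canned = iter(["Copenhagen", "Aalborg", "next Friday", "two"])
print(run_system_directed_dialogue(lambda prompt: next(canned)))
```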
It is important that the structure imposed on the dialogue is natural to the user, reflecting the user's intuitive expectations, especially in system directed dialogue in which the user is not supposed to interfere with the dialogue structure. Unnatural dialogue structure will often cause users to try to take the initiative in ways which the system cannot cope with.</Paragraph> </Section> <Section position="9" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.9 Sufficiency of task and domain coverage </SectionTitle> <Paragraph position="0"> Sufficient task and domain coverage is also crucial to natural interaction. Even if unfamiliar with SLDSs, users normally have rather detailed expectations of the information or service which they should be able to obtain from the system. It is important that the system meet these expectations. If, for some reason, the system is not able to perform a certain sub-task which users would expect the system to handle, this has to be stated clearly. Even then, user satisfaction is likely to suffer.</Paragraph> </Section> <Section position="10" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 6.10 Sufficiency of the system's reasoning capabilities </SectionTitle> <Paragraph position="0"> Contextually adequate reasoning is a classical problem in the design of natural interaction. Even when users have been appropriately primed to expect a rather primitive interlocutor, they tend to assume that the system is able to perform the bits and pieces of reasoning which humans are able to do without thinking and which are inseparable parts of natural dialogue about the task. Typically, therefore, SLDSs must incorporate both facts and inferences about the task as well as general world knowledge in order to act as adequate interlocutors. Defining which kinds of reasoning the system must be capable of is part and parcel of defining the system's task and domain coverage and subject to similarly difficult decisions on task delimitation. It is possible to get rough ideas on initiative distribution, users' models of the task, and how to delimit the domain from studying recorded human-human dialogues on tasks similar to those which the system is intended to cover.</Paragraph> <Paragraph position="1"> However, the recordings should only be considered possible starting points. In particular, as task complexity grows, developers are likely to find themselves forced to adopt more restrictive task delimitations and impose a more rigid dialogue structure than those which they found in the human-human dialogues. Having done that, the resulting interaction model needs early testing and evaluation. In particular, if the developer is aiming at relatively high task complexity compared to the state of the art, early testing is strongly recommended.</Paragraph> <Paragraph position="2"> 6.11 Sufficiency of interaction guidance Sufficient interaction guidance is essential for users to feel in control during interaction.</Paragraph> <Paragraph position="3"> Interaction guidance can be particularly hard to get right in speech-only, walk-up-and-use SLDSs. Speech is inappropriate for providing lengthy and complex &quot;user manual&quot; instructions up front for first-time users. Moreover, at any given time some users will already be familiar with the system whereas others will be novices.</Paragraph> <Paragraph position="4"> Issues to consider include cues for turn-taking vs. 
barge-in; help facilities; and highlighting of non-obvious system behaviour.</Paragraph> <Paragraph position="5"> Barge-in allows the user to speed up the interaction, e.g. by interrupting already familiar instruction prompts. If the system does not allow barge-in, it must provide clear cues for turn-taking, making it completely clear to the user when to speak and when to refrain from speaking because the system is not listening.</Paragraph> <Paragraph position="6"> Cues can be explicit or implicit. If the user starts speaking while the system is still listening but processing the previous user input, the user's new input may cause problems for the dialogue manager which has to generate an appropriate response to disjoint pieces of user input. And if the system is not listening any more, important input could be lost in cases when users do not merely repeat themselves.</Paragraph> <Paragraph position="7"> General and explicit instructions on what the system can and cannot do and how to interact with it may be provided in a spoken introduction which can be repeated on request or be skipped by experienced users. In fact, most speech-only SLDSs strictly need some up-front introduction to guide interaction. We already mentioned the when-(not)-to-speak issue above. Just as importantly, the system should be perfectly clear about the task(s) which the user can accomplish through interaction. The introduction should not be too long because then users cannot remember the instructions. Moreover, the instructions must be feasible for the user to follow. If the instructions needed by the walk-up-and-use user are too many to be presented in the system's introduction, some of them may be relocated for presentation at particular points during interaction and only when needed.</Paragraph> <Paragraph position="8"> Providing useful help mechanisms is a difficult interaction design task. Help may be an implicit part of the dialogue, be available on request by saying &quot;help&quot;, or be automatically enabled if the user is having problems repeatedly, for instance in being recognised. In this case the system may, e.g., propose how to express input or inform the user on what can be said.</Paragraph> <Paragraph position="9"> Sufficiency of interaction guidance should be carefully evaluated by exposing the SLDS to interaction with representative users.</Paragraph> <Paragraph position="10"> 6.12 Error handling adequacy Even if the best practice issues discussed so far have been taken into account carefully during specification, design and implementation, the SLDS and its users will still make errors during dialogue. In human-system interaction, error prevention is far preferable to error correction, and what those best practice issues do is to help prevent errors from occurring during interaction. Also as regards error handling, current SLDSs are far inferior to their human interlocutors. This is why adequate error handling remains a difficult issue in SLDS development. 
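Anticipating the repair strategies discussed below, the following is a minimal, hedged sketch of one widely used approach to system-initiated repair, in which the recogniser's confidence score decides whether to accept the input, ask for confirmation, or ask the user to repeat; the thresholds and function name are assumptions for illustration only:

```python
def choose_repair_action(hypothesis: str, confidence: float,
                         accept_threshold: float = 0.8,
                         confirm_threshold: float = 0.5) -> str:
    """Map a recognition result to a dialogue act: accept silently,
    ask for explicit confirmation, or ask the user to repeat."""
    if confidence >= accept_threshold:
        return f"ACCEPT: {hypothesis}"
    if confidence >= confirm_threshold:
        return f"CONFIRM: Did you say '{hypothesis}'?"
    return "REPAIR: Sorry, I did not understand. Could you please repeat that?"

# Hypothetical recognition results at three confidence levels.
for hyp, conf in [("two tickets to Aalborg", 0.92),
                  ("two tickets to Aalborg", 0.63),
                  ("two tickets to Aalborg", 0.31)]:
    print(choose_repair_action(hyp, conf))
```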
Intuitively, this issue can be decomposed along two dimensions: (a) either the system initiates error-handling meta-communication or the user initiates error-handling meta-communication.</Paragraph> <Paragraph position="11"> And (b) when error-handling meta-communication is initiated, it is either because one party has failed to hear or understand the other, because what was heard or understood is false, or because what was heard or understood is somehow in need of clarification.</Paragraph> <Paragraph position="12"> We distinguish, therefore, between system or user initiated repair meta-communication and system or user initiated clarification meta-communication. System-initiated repair meta-communication is needed whenever the system either did not understand or was uncertain that it understood correctly what was said. In such cases, the system must ask for repetition, ask the user to speak louder or modify the way the input is being expressed in other specified ways, or tell the user what it did understand and ask for confirmation or correction. In case of a repeated misunderstanding the system may either choose to fall back on a human operator, close the dialogue, or, better, start graceful degradation, i.e. change the level of interaction into a simpler one. If users simply fail to respond, then the system should tell them that it is expecting their input. Users may also be understood by the system to have said something which is false and hence needs to be corrected. User-initiated repair meta-communication can be designed in several different ways. Ideally, users should just initiate repair the same way they would have done in dialogue with a human, but since users may express their corrections in many different ways this is very difficult. Some systems require the user to use specifically designed keywords. The problem is that using keywords for correction is unnatural and hence difficult for the user to remember. A third approach is the &quot;eraser&quot; principle where the user simply repeats his input until the system has received the message.</Paragraph> <Paragraph position="13"> Whilst this solution may work well for low-complexity tasks, it may be difficult to keep track of in high-complexity tasks. And it will not work unless the system is able to recognise input on any sub-task at any time.</Paragraph> <Paragraph position="14"> Very roughly speaking, clarification meta-communication is more difficult to design for than repair meta-communication, and user-initiated clarification meta-communication is more difficult to design for than system-initiated clarification meta-communication. System-initiated clarification is needed when the user's input is inconsistent, ambiguous or underspecified. In such cases, the system must ask for clarification, for instance by pointing out that an expression is inconsistent. User-initiated clarification is needed whenever the system produces inconsistent or ambiguous utterances, or uses terms with which the user is not familiar. Unfortunately, handling user clarification questions is difficult for SLDSs and the system developers might not have discovered all the potential problems in the first place. If they had, they could have tried to prevent all or most of the problems from occurring through adequate output phrasing or other means. Due to the nature of their domain, some tasks inherently require facilities for clarifying the terminology used because it may not be a practical option for e.g. 
a car sales system to explain all domain terms as it goes along.</Paragraph> <Paragraph position="15"> Most SLDSs need abilities for handling system- and user-initiated repair, and many SLDSs need system-initiated clarification abilities. There is no simple decision procedure for deciding which mechanisms to include in a particular SLDS. Sensible decisions very much depend on factors such as domain, task complexity, user population and peculiarities of user behaviour which can only be discovered through interaction data analysis.</Paragraph> <Paragraph position="16"> 6.13 Sufficiency of adaptation to user differences It is useful to distinguish between four types of user: system expert/domain expert, system expert/domain novice, system novice/domain expert and system novice/domain novice. An SLDS need not support all four groups, of course. If the target user group is domain and system experts only, then, obviously, the system is not a walk-up-and-use system and thus falls outside the group of SLDSs considered in this paper. If the primary target group is system novice users, on-line instructions and other help information are likely to be needed. This need tends to increase even further when the system novices are also domain novices who need explanation of domain technicalities.</Paragraph> <Paragraph position="17"> Given the relative simplicity of current SLDSs, walk-up-and-use users may quickly become (system) experts. This means that interaction should be supported and facilitated for both system novices and system experts.</Paragraph> <Paragraph position="18"> Special shortcuts for expert interaction can be a good solution. Such shortcuts include e.g. introductions which can be skipped easily through barge-in or explicit de-selection.</Paragraph> <Paragraph position="20"> 6.14 Number of interaction problems Lack of co-operativity in the system's output may be diagnosed from the occurrence of communication problems in simulated or real user-system interaction. Data capture and analysis is costly, however, especially because large amounts of data may be needed for triggering most of the communication problems which the system is likely to cause. To reduce cost, and to help identify those kinds of lack of co-operativity which are less likely to cause communication problems, CODIAL may be used both for walk-throughs of the interaction design prior to data capture and for the actual data analysis.</Paragraph> <Paragraph position="21"> 6.15 User satisfaction Objectively measured quality, technical and otherwise, does have an impact on user satisfaction, but this is far from being the whole story. User satisfaction is inherently subjective, building on personal preferences and contextual factors. Unfortunately, some of the most difficult usability issues exactly concern contextual adequacy, i.e. adequacy of the full set of contextual factors which contribute to making an SLDS acceptable to its users. These factors remain insufficiently explored both as regards what they are and as regards their individual contributions to user satisfaction. It is possible that contextual factors, such as service improvements or economic benefits, are among the most important factors influencing users' satisfaction with SLDSs.</Paragraph> <Paragraph position="22"> Much still remains to be discovered about how the behaviour of SLDSs affects the satisfaction of their users. Therefore, subjective evaluation remains a cornerstone in SLDS evaluation. 
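As one hedged illustration of how such subjective data might be summarised (the questionnaire statements, the five-point scale and the ratings are invented for this example), per-statement ratings from several test users can be reduced to simple means:

```python
from statistics import mean

# Hypothetical five-point Likert ratings (1 = strongly disagree, 5 = strongly agree)
# from three test users for a few usability statements.
responses = {
    "The system was easy to understand": [4, 5, 3],
    "I felt in control during the dialogue": [3, 4, 2],
    "The system's voice was pleasant": [5, 4, 4],
}

for statement, ratings in responses.items():
    print(f"{statement}: mean = {mean(ratings):.1f} (n = {len(ratings)})")
```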
User questionnaires and interviews remain core tools for gathering information on user satisfaction.</Paragraph> </Section> </Section> </Paper>