HUMAN-COMPUTER INTERACTION SECOND EDITION
Dix, Finlay, Abowd and Beale




Chapter 15 Out of the glass box 15.3.4 Uninterpreted speech Page 561

Recordings of users' speech can also be very useful, especially in collaborative applications; for example, many readers will have used voice-mail systems. Recordings can also be attached to other artefacts as audio annotations, either to communicate with others or as a reminder to oneself at a later time. For example, audio annotations can be attached to Microsoft Word documents.


Chapter 15 Out of the glass box 15.4 Non-speech sound Page 562

Non-speech sound has traditionally been used in the interface to provide warnings and alarms, or status information. For example, there is experimental evidence to suggest that the addition of audio confirmation of modes, in the form of changes in key clicks, reduces errors [161]. Video games offer further evidence, since experts tend to score lower when the sound is turned off than when it is on; they pick up vital clues and information from the sound while concentrating their visual attention elsewhere. Dual-mode displays are, in general, thought to be better since the presentation of similar information along different channels allows the brain to search along two paths, with the best path finishing first and therefore minimizing response time. The presentation of redundant information in this way may increase a user's performance since, for example, he may be able to remember the sound associated with a particular icon but not its visual representation. Ambiguity in one mode can also be resolved by using the information presented in the other. One such example is a speech recognition system that also uses a camera to video the lip movements of the speaker: indistinct words or phrases can be resolved more accurately by using the visual information as well as analyzing the sound.
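
This audio-visual disambiguation can be sketched as simple late fusion: each channel produces a probability distribution over candidate words, and multiplying the two resolves an ambiguity that neither channel could resolve alone. The Python sketch below uses invented words and probabilities purely for illustration; real recognizers would supply these distributions.

    # Late fusion of two recognition channels (illustrative sketch).
    # Candidate words and probabilities are invented for this example.

    def fuse(audio_probs, visual_probs):
        """Multiply per-word posteriors from the two channels, then
        renormalize. Words missing from one channel get a small floor
        so that channel cannot veto the other outright."""
        floor = 1e-3
        words = set(audio_probs) | set(visual_probs)
        scores = {w: audio_probs.get(w, floor) * visual_probs.get(w, floor)
                  for w in words}
        total = sum(scores.values())
        return {w: s / total for w, s in scores.items()}

    # 'mail' and 'nail' are acoustically confusable (/m/ versus /n/), but
    # the closed lips of the bilabial /m/ are easy to spot on video.
    audio = {"mail": 0.48, "nail": 0.47, "fail": 0.05}
    visual = {"mail": 0.80, "nail": 0.15, "fail": 0.05}
    fused = fuse(audio, visual)
    print(max(fused, key=fused.get))  # -> mail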


Chapter 15 Out of the glass box 15.4 Non-speech sound Page 562

We have previously discussed the role of speech in the interface, but non-speech sounds offer a number of inherent advantages. Speech is serial: we have to listen to most of a sentence before we can extract its meaning, and since a message may comprise many words, this can take a relatively long time. Non-speech sounds, on the other hand, can be associated with a particular action and assimilated much more quickly. Non-speech sounds can also be universal: in much the same way as visual icons carry the same meaning in many different countries, so can non-speech sounds. The same is not true of speech, which we must understand and interpret, and so we have to know the language used. Non-speech sound can also exploit the phenomenon of auditory adaptation, in which unchanging sounds fade into the background, becoming evident only when they alter or cease. One problem is that non-speech sounds have to be learned, whereas the meaning of a spoken message is obvious (at least to a user conversant in the language used). However, since users are able to learn the visual icons associated with things, this should not be seen as too great a disadvantage.


Chapter 15 Out of the glass box 15.4 Non-speech sound Page 563

Soundtrack is an early example of a word processor with an auditory interface, designed for visually disabled users [76]. The visual items in the display have been given auditory analogs, made up of tones, with synthesized speech also being used. Soundtrack's main screen is a grid of four columns and two rows (see Figure 15.3); each cell makes a different tone when the cursor is in it, and by using these tones the user can navigate around the system. The tones increase in pitch from left to right, while the two rows have different timbres. Clicking on a cell makes it speak its name, giving precise information that can reorient a user who is lost or confused. Double clicking on a cell reveals a submenu of items associated with the main screen item. Items in the submenu also have tones: moving down the menu causes the tone to fall, whilst moving up makes it rise. A single click causes the cell to speak its name, as before, whilst double clicking executes the associated action. Soundtrack allows text entry, speaking the words or characters as they are entered, with the user having control over the degree of feedback provided. It was found that users tended to count the different tones in order to locate their position on the screen, rather than just listen to the tones themselves, though one user with musical training did use the pitch. Soundtrack provides an auditory solution to representing a visually based word processor, though the results are not extensible to visual interfaces in general. However, it does show that the human auditory system is capable of coping with the demands of highly interactive systems, and that the notion of auditory interfaces is a reasonable one.
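
The grid-to-tone mapping is easy to express in code. The actual frequencies and timbres Soundtrack used are not given here, so the base pitch, the whole-tone step per column and the two waveforms in the sketch below are assumptions; the point is only that pitch rises from left to right while the row selects the timbre.

    import math

    # Hypothetical mapping from Soundtrack-style grid cells to audio cues.
    # Base frequency, step size and waveforms are illustrative assumptions.

    COLS, ROWS = 4, 2
    BASE_HZ = 262.0          # start around middle C (assumption)
    STEP = 2 ** (2 / 12)     # rise a whole tone per column (assumption)

    def cell_tone(row, col):
        """Return (frequency, timbre) for a grid cell: pitch increases
        left to right, and each row uses a different waveform."""
        freq = BASE_HZ * STEP ** col
        timbre = "sine" if row == 0 else "square"
        return freq, timbre

    def sample(row, col, t):
        """One audio sample at time t (seconds) for the given cell."""
        freq, timbre = cell_tone(row, col)
        phase = 2 * math.pi * freq * t
        if timbre == "sine":
            return math.sin(phase)
        return 1.0 if math.sin(phase) >= 0 else -1.0  # naive square wave

    for r in range(ROWS):
        print([round(cell_tone(r, c)[0]) for c in range(COLS)])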


Chapter 15 Out of the glass box 15.5.2 Recognizing handwriting Page 568

These problems are reminiscent of those already discussed in speech recognition, and indeed the recognition problem is not dissimilar. The equivalent of co-articulation is also prevalent in handwriting, since letters are written differently according to those that precede and follow them. This causes problems for recognition systems, which work by trying to identify the lines that contain text and then segment the digitized image into separate characters. This is so difficult to achieve reliably that there are no systems in use today that are good at general cursive script recognition. However, when letters are written individually, with a small separation, systems achieve more respectable success rates, although they have to be trained to recognize the characteristics of different users; tested on a user for whom they have not been trained, their success is again limited. Many of the solutions being attempted in speech recognition are also being tried in handwriting recognition systems, such as whole-word recognition, the use of context to disambiguate characters, and neural networks, which learn by example.
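
The use of context to disambiguate characters can be illustrated with a character bigram model: when two letter shapes score equally, the letter more likely to follow the preceding one wins. The probabilities below are toy values, not drawn from any corpus.

    # Toy illustration of contextual disambiguation in character
    # recognition. Bigram probabilities are invented for this sketch.

    BIGRAM = {
        ("q", "u"): 0.95, ("q", "v"): 0.001,
        ("t", "h"): 0.30, ("t", "k"): 0.002,
    }

    def disambiguate(prev_char, candidates):
        """Pick the candidate maximizing shape score x bigram probability.
        `candidates` maps each plausible letter to its shape score."""
        def score(c):
            return candidates[c] * BIGRAM.get((prev_char, c), 0.01)
        return max(candidates, key=score)

    # The stroke after 'q' looks equally like 'u' or 'v' in isolation,
    # but context makes 'u' overwhelmingly more likely.
    print(disambiguate("q", {"u": 0.5, "v": 0.5}))  # -> u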


Chapter 15 Out of the glass box Background Page 571

In all of these cases, the emphasis on ubiquity is clearly seen in the capture and integration phases. Electronic capture is moved away from traditional devices like the keyboard and brought closer to the user in the form of pen-based interfaces or actual pen and paper. There is not so much emphasis on ubiquity of access, mainly because the focus is on supporting a small group of users, usually only one. We can look at another application domain for capture, integration and access that emphasizes group access. That application domain is education, and is the concern of the Classroom 2000 project at the GVU Center at Georgia Tech [4]. One way to view classroom teaching and learning is as a group multimedia authoring activity. Before class, teachers prepare outlines, slides, or notes and students read textbooks or other assigned readings. During the lecture, the words and actions of the teacher and students expound and clarify the lessons underlying the prepared materials. It is common practice to annotate the prepared material during the lecture and to create new material as notes on a whiteboard or in a student notebook. These different forms of material - printed, written and spoken - are all related to the learning experience that defines a particular course, and yet there are virtually no facilities provided automatically to record and preserve the relationships between them. Applying a variety of ubiquitous computing technologies - electronic whiteboards, personal pen-based interfaces, digital audio and video recording, and the World Wide Web - would allow us to test whether ubiquitous computing positively affects the teaching and learning experience.


Chapter 15 Out of the glass box 15.9 Interfaces for users with special needs Page 576

For users with speech and hearing impairments, multimedia systems provide a number of tools for communication, including synthetic speech and text-based communication and conferencing systems (see Chapter 13). Textual communication is slow, which can reduce its effectiveness. Predictive algorithms have therefore been used to anticipate the words being typed and fill them in, reducing the amount of typing required. Conventions can help to restore context that is present in face-to-face communication but lost in text; for example, the 'smilie' :-) indicates a joke. Facilities to allow turn-taking protocols to be established also help natural communication [173].
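
A minimal form of such a predictive algorithm is prefix completion against a frequency-ranked vocabulary, as sketched below; the vocabulary and counts are invented stand-ins for a real lexicon.

    # Minimal prefix-based word prediction (sketch; vocabulary invented).

    VOCAB = {  # word -> usage frequency (toy numbers)
        "hello": 120, "help": 95, "held": 40, "helmet": 12,
        "meeting": 80, "message": 75,
    }

    def predict(prefix, k=3):
        """Return up to k vocabulary words starting with `prefix`,
        most frequent first, so the user can accept one rather than
        type the whole word."""
        matches = [w for w in VOCAB if w.startswith(prefix)]
        return sorted(matches, key=VOCAB.get, reverse=True)[:k]

    print(predict("hel"))  # -> ['hello', 'help', 'held']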


Chapter 15 Out of the glass box 15.9 Interfaces for users with special needs Page 578

Finally, users with learning disabilities such as dyslexia can find textual information difficult. In severe cases, speech input and output can remove the need to read and write, and allow more accurate input and output. In less severe cases, spelling correction facilities can help. However, these need to be designed carefully: conventional spelling correction programs are often useless for dyslexic users, since they do not recognize the idiosyncratic ways in which such users construct words. As well as simple transpositions of characters, dyslexic users may spell phonetically, and correction programs must be able to deal with these errors too.
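
One way to catch phonetic spellings is to compare phonetic codes rather than letter sequences. The sketch below uses a simplified version of the classic Soundex code (it omits the special handling of 'h' and 'w'); a corrector designed for dyslexic users would need something richer, but the principle is the same.

    # Simplified Soundex: phonetically similar spellings map to the same
    # code, so a phonetic misspelling can be matched to the intended word.

    CODES = {c: d for d, letters in enumerate(
        ["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1)
        for c in letters}

    def soundex(word):
        """Four-character code: initial letter plus consonant classes,
        skipping vowels and collapsing repeats."""
        word = word.lower()
        out = word[0].upper()
        prev = CODES.get(word[0], 0)
        for c in word[1:]:
            code = CODES.get(c, 0)
            if code and code != prev:
                out += str(code)
            prev = code
        return (out + "000")[:4]

    def phonetic_match(misspelling, dictionary):
        """Dictionary words whose code matches the misspelling's code."""
        target = soundex(misspelling)
        return [w for w in dictionary if soundex(w) == target]

    # Letter-by-letter comparison misses this, but the codes agree.
    print(phonetic_match("definately", ["definitely", "define", "deafen"]))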


Chapter 15 Out of the glass box 15.11.2 Structured information Page 585

One common approach is to convert the discrete structure into some measure of similarity. For a hypertext network this might be the number of links that must be traversed between two nodes; for free text, the similarity of two documents may be the proportion of words they have in common. A range of techniques can then be applied to map the data points into two or three dimensions, preserving the similarity measures as well as possible (similar points end up closer together). These techniques include statistical multi-dimensional scaling, some kinds of self-organizing neural networks, and simulated gravity. Although the dimensions that arise from these techniques are arbitrary, the visual mapping allows users to see clusters and other structures within the dataset.
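
A minimal sketch of the free-text case: take word overlap (here the Jaccard measure) as similarity, convert it to a distance, and embed the documents in two dimensions with classical multi-dimensional scaling. The documents are invented, and a real system would tokenize and weight words more carefully.

    import numpy as np

    # Word-overlap similarity -> distance matrix -> classical MDS.
    docs = [
        "the cat sat on the mat",
        "the cat lay on the rug",
        "stock markets fell sharply today",
        "markets fell as stocks slid today",
    ]
    bags = [set(d.split()) for d in docs]
    n = len(bags)

    # Similarity: proportion of words in common (Jaccard measure).
    sim = np.array([[len(a & b) / len(a | b) for b in bags] for a in bags])
    dist = 1.0 - sim  # dissimilarity; zero on the diagonal

    # Classical MDS: double-center squared distances, take top eigenvectors.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (dist ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:2]            # two largest eigenvalues
    coords = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0))

    print(np.round(coords, 2))  # cat documents cluster apart from market ones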


Chapter 16 Hypertext, multimedia and the World Wide Web 16.2 Text, hypertext and multimedia Page 594

There are many different ways of traversing the network, and so there are many different ways of reading a hypertext document - the intention is that the user is able to read it in the way that suits him best. Links can sit at the end of pages, with the user choosing which one to follow, or can be embedded within the document itself. For example, in an on-line manual, all the technical words may be linked directly to their definitions in the glossary: simply clicking on an unknown word takes the user to the relevant place in the glossary. Another unknown word encountered there can likewise be followed to its definition, and the user can then easily return to his original place in the manual. The positions of these links are known as hot-spots, since they respond to mouse clicks. Hot-spots can also be embedded within diagrams, pictures or maps, allowing the user to focus his attention on aspects that interest him.
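
The glossary round trip described here is, at bottom, a walk over a graph of pages with a history stack: following a hot-spot pushes the current page, and going back pops it. The page names in the sketch below are invented.

    # Toy hypertext: pages with embedded hot-spot links, plus a history
    # stack so the user can return to where he came from.

    pages = {
        "manual:ch3": {"modem": "glossary:modem"},
        "glossary:modem": {"baud": "glossary:baud"},
        "glossary:baud": {},
    }

    class Browser:
        def __init__(self, start):
            self.current, self.history = start, []

        def follow(self, hotspot):
            """Follow a hot-spot link, remembering the page we left."""
            self.history.append(self.current)
            self.current = pages[self.current][hotspot]

        def back(self):
            """Return to the previously visited page."""
            self.current = self.history.pop()

    b = Browser("manual:ch3")
    b.follow("modem"); b.follow("baud")  # chase two definitions
    b.back(); b.back()                   # and return to the manual
    print(b.current)                     # -> manual:ch3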

