We have previously discussed the role of speech in the interface, but non-speech sounds offer a number of inherent advantages. Speech is serial: we have to listen to most of a sentence before we can extract its meaning, and since many words make up a message this can take a relatively long time. Non-speech sounds, on the other hand, can be associated with a particular action and assimilated much more quickly. Non-speech sounds can also be universal: in much the same way that visual icons can be recognized across languages, the same sound can convey its meaning regardless of the language the user speaks.
Sound can be used to provide a second representation of actions and objects in the interface, supporting the visual mode and providing confirmation for the user. It can be used for navigation round a system, either giving redundant supporting information to the sighted user or providing the primary source of information for the visually impaired. Experiments on auditory navigation [196] have demonstrated that purely auditory cues are sufficient for a user to locate up to eight targets on a screen with reasonable speed and accuracy, so there is no excuse for ignoring sound in the interface on the grounds that it is too vague or inaccurate.
Soundtrack is an early example of a word processor with an auditory interface, designed for visually disabled users [76]. The visual items in the display are given auditory analogs, made up of tones, with synthesized speech also being used. Soundtrack's main screen is a grid of two rows of four columns (see Figure 15.3); each cell makes a different tone when the cursor enters it, and by using these tones the user can navigate around the system. The tones increase in pitch from left to right, while the two rows have different timbres. Clicking on a cell makes it speak its name, giving precise information that can reorient a user who is lost or confused. Double clicking on a cell reveals a submenu of items associated with the main screen item. Items in the submenu also have tones; moving down the menu causes the tone to fall whilst moving up makes it rise. A single click causes the cell to speak its name, as before, whilst double clicking executes the associated action. Soundtrack allows text entry by speaking the words or characters as they are entered, the user having control over the degree of feedback provided. It was found that users tended to count the different tones in order to locate their position on the screen, rather than learning to recognize the tones themselves.
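To make the mapping concrete, the sketch below plays a tone whose pitch rises with the column of a two-by-four grid and whose timbre changes with the row, in the spirit of Soundtrack's navigation tones. The specific notes and General MIDI instruments are illustrative assumptions, not Soundtrack's actual values.

    import javax.sound.midi.MidiChannel;
    import javax.sound.midi.MidiSystem;
    import javax.sound.midi.Synthesizer;

    public class GridTones {
        // Illustrative values only: pitch rises with column, timbre changes with row.
        static final int[] ROW_INSTRUMENTS = {0, 11};  // General MIDI: piano, vibraphone
        static final int BASE_NOTE = 60;               // middle C

        public static void main(String[] args) throws Exception {
            Synthesizer synth = MidiSystem.getSynthesizer();
            synth.open();
            MidiChannel channel = synth.getChannels()[0];
            // Sweep the cursor across both rows of the 2 x 4 grid.
            for (int row = 0; row < 2; row++) {
                channel.programChange(ROW_INSTRUMENTS[row]);   // row identified by timbre
                for (int col = 0; col < 4; col++) {
                    int note = BASE_NOTE + 2 * col;            // column identified by pitch
                    channel.noteOn(note, 80);
                    Thread.sleep(300);
                    channel.noteOff(note);
                }
            }
            synth.close();
        }
    }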
Auditory icons [93] use natural sounds to represent different types of objects and actions in the interface. The SonicFinder [94] for the Macintosh was developed from these ideas. It is intended as an aid for sighted users, providing support through redundancy. Natural sounds are used since people recognize, not timbre and pitch, but the source of a sound and its behaviour [250]. They will recognize a particular noise as glass breaking or a hollow pipe being tapped; a solid pipe will give a different noise indicating not only the source but also the behaviour of the sound under different conditions. In the SonicFinder, auditory icons are used to represent desktop objects and actions. So, for example, a folder is represented by a papery noise, and throwing something in the wastebasket by the sound of smashing. This helps the user to learn the sounds since they suggest familiar actions from everyday life. However, this advantage also creates a problem for auditory icons. Some objects and actions do not have obvious, naturally occurring sounds that identify them. In these cases a sound effect can be created to suggest the action or object, but this moves away from the ideal of using familiar everyday sounds that require little learning. Copying has no immediate analogous sound; in the SonicFinder it is indicated by the sound of pouring a liquid into a receptacle, with the pitch rising to indicate the progress of the copying. These non-speech sounds can convey vast amounts of meaning very economically; a file arrives in a mailbox, and being a large file it makes a weighty sound. If it is a text file it makes a rustling sound.
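A minimal sketch of the auditory icon idea is to keep a table from interface events to recordings of natural sounds and play the appropriate clip when an event occurs. The event names and .wav files below are placeholders, not the SonicFinder's actual resources.

    import java.io.File;
    import java.util.Map;
    import javax.sound.sampled.AudioSystem;
    import javax.sound.sampled.Clip;

    public class AuditoryIcons {
        // Hypothetical event-to-sound mapping; the .wav files are placeholders.
        static final Map<String, String> SOUNDS = Map.of(
            "open-folder", "paper-rustle.wav",
            "delete",      "glass-smash.wav",
            "copy",        "pouring-liquid.wav");

        public static void play(String event) throws Exception {
            Clip clip = AudioSystem.getClip();
            clip.open(AudioSystem.getAudioInputStream(new File(SOUNDS.get(event))));
            clip.start();                       // play the natural sound for this action
        }

        public static void main(String[] args) throws Exception {
            play("delete");                     // e.g. dropping a file in the wastebasket
            Thread.sleep(2000);                 // let the clip finish before the JVM exits
        }
    }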
An alternative to using natural sounds is to devise synthetic sounds. Earcons [25] use structured combinations of notes, called motives, to represent actions and objects (see Figure 15.4). These vary according to rhythm, pitch, timbre, scale and volume. Earcons can be combined in two ways. Compound earcons combine different motives to build up a specific action, for example combining the motives for 'create' and 'file'. Family earcons group earcons of similar types: operating system errors and syntax errors, for example, would both belong to the 'error' family. In this way, earcons can be hierarchically structured to represent menus. Earcons are easily grouped and refined owing to their compositional and hierarchical nature, but they may be harder to associate with a specific task in the interface since the mapping is arbitrary. Conversely, auditory icons have a semantic relationship with the function that they represent, but can suffer from there being no appropriate sound for some actions.
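The following sketch illustrates a compound earcon: two short motives, one standing for 'create' and one for 'file', are played in sequence. The note sequences are invented for illustration; real earcon design would also vary rhythm, timbre and volume systematically.

    import javax.sound.midi.MidiChannel;
    import javax.sound.midi.MidiSystem;
    import javax.sound.midi.Synthesizer;

    public class Earcons {
        // Invented motives: short note sequences standing for an action and an object.
        static final int[] CREATE = {60, 64, 67};   // rising triad for 'create'
        static final int[] FILE   = {72, 72};       // repeated high note for 'file'

        static void playMotive(MidiChannel ch, int[] motive) throws InterruptedException {
            for (int note : motive) {
                ch.noteOn(note, 80);
                Thread.sleep(150);
                ch.noteOff(note);
            }
        }

        public static void main(String[] args) throws Exception {
            Synthesizer synth = MidiSystem.getSynthesizer();
            synth.open();
            MidiChannel ch = synth.getChannels()[0];
            // A compound earcon: the 'create' motive followed by the 'file' motive.
            playMotive(ch, CREATE);
            Thread.sleep(100);
            playMotive(ch, FILE);
            synth.close();
        }
    }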
We first introduced the notion of ubiquitous computing in Chapter 4. The interest in ubiquitous computing has surged over the past few years, thanks to some influential writings and plenty of experimental work. The defining characteristic of ubiquitous computing is the attempt to break away from the traditional desktop interaction paradigm and move computational power into the environment that surrounds the user. Rather than force the user to search out and find the computer's interface, ubiquitous computing suggests that the interface itself can take on the responsibility of locating and serving the user.
There has been a good deal of research related to this general capture, integration, and access theme, particularly for meeting room environments and personal note taking. Work at Xerox PARC has resulted in a suite of tools to support a scribe at a meeting [159, 164], as well as some electronic whiteboard technology - the LiveBoard [81] - to support group discussion. The Marquee note-taking system from PARC [254] and the Filochat prototype at Hewlett-Packard Labs [260] both supported individual annotation. A simple pen-based interface produced automatic indexes into either a video (for Marquee) or an audio (for Filochat) stream that could be traversed later on during access and review. Stifelman used an even more natural interface of pen and paper to produce a stenographer's notepad that automatically indexed each penstroke to a digital audio record [229]. The implicit connection between the note-taking device and alternate information streams (audio and/or video) is a common theme that has also been explored at MIT's Media Lab [113] and at Apple [60].
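The common indexing idea can be sketched very simply: each note is stored with its offset into the ongoing audio recording, so that selecting the note later tells the player where to seek. This is only an illustration of the principle, not the implementation used by Marquee, Filochat or Stifelman's notepad.

    import java.util.ArrayList;
    import java.util.List;

    public class IndexedNotebook {
        // Each note remembers how far into the audio recording it was written.
        record Note(String text, long audioOffsetMillis) {}

        private final List<Note> notes = new ArrayList<>();
        private final long recordingStart = System.currentTimeMillis();

        // Called whenever the user writes something during the meeting.
        public void addNote(String text) {
            notes.add(new Note(text, System.currentTimeMillis() - recordingStart));
        }

        // During review, selecting a note yields the position to seek to in the audio.
        public long seekPositionFor(int noteIndex) {
            return notes.get(noteIndex).audioOffsetMillis();
        }
    }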
In all of these cases, the emphasis on ubiquity is clearly seen in the capture and integration phases. Electronic capture is moved away from traditional devices like the keyboard and brought closer to the user in the form of pen-based interfaces, electronic whiteboards or even ordinary pen and paper.
The desktop paradigm leaves it to the user to find the interface to a computational service (such as an email browser or a calendar manager) when it is needed. A ubiquitous software service, on the other hand, finds the user. Two important characteristics of such a service are: its availability on any device handy to the user; and its adaptability to a changing set of services that the user wants. The former characteristic is referred to as the scaleable interface problem. The creation of an architecture-neutral virtual machine, such as the Java Virtual Machine, solves part of the scaleable interface problem because it now becomes possible to execute the same program on many different devices.
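One way to picture the remaining part of the problem is that the same program must still present itself sensibly on very different displays. The sketch below, with invented class names, selects a presentation appropriate to the device at run time while the underlying service code stays the same.

    // One calendar service, several hypothetical presentations chosen at run time.
    interface CalendarView {
        void show(String appointment);
    }

    class DesktopView implements CalendarView {
        public void show(String appointment) {
            System.out.println("[window] " + appointment);   // full graphical display
        }
    }

    class PhoneView implements CalendarView {
        public void show(String appointment) {
            // Tiny display: truncate the entry to fit.
            System.out.println("[2-line LCD] "
                + appointment.substring(0, Math.min(20, appointment.length())));
        }
    }

    public class CalendarService {
        public static void main(String[] args) {
            // The same bytecode runs on either device; only the view differs.
            CalendarView view = args.length > 0 && args[0].equals("phone")
                    ? new PhoneView() : new DesktopView();
            view.show("10:00 project meeting, room 2.11");
        }
    }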
There is work in the user interface development community to remove both the programmer burden and the user inflexibility of integrated software suites. Work at Georgia Tech (the CyberDesk project [5, 63]), Apple (Data Detectors [12]) and Intel (Pandit and Kalbag's Selection Recognition Agent [189]) all show the beginnings of providing automatic integration of separate computer applications based on what information the user is currently interacting with on the graphical display.
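A much simplified sketch of the kind of recognition these systems perform is shown below: the current selection is matched against a few patterns and a plausible integration is suggested. The patterns and suggested actions are illustrative only; CyberDesk, Data Detectors and the Selection Recognition Agent each have their own, richer mechanisms.

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.regex.Pattern;

    public class DataDetector {
        // Illustrative patterns: recognize what kind of data the user has selected.
        static final Map<String, Pattern> TYPES = new LinkedHashMap<>();
        static {
            TYPES.put("email address", Pattern.compile("[\\w.+-]+@[\\w.-]+\\.[a-z]{2,}"));
            TYPES.put("phone number",  Pattern.compile("\\+?\\d[\\d -]{6,}\\d"));
            TYPES.put("URL",           Pattern.compile("https?://\\S+"));
        }

        // Suggest an application that could act on the current selection.
        public static String suggestAction(String selection) {
            for (Map.Entry<String, Pattern> e : TYPES.entrySet()) {
                if (e.getValue().matcher(selection).find()) {
                    return "Selection looks like a " + e.getKey()
                         + "; offer to open the matching application.";
                }
            }
            return "No integration suggested for this selection.";
        }

        public static void main(String[] args) {
            System.out.println(suggestAction("contact feedback@hcibook.com for details"));
        }
    }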