CHAPTER 9 evaluation techniques
EXERCISE 9.1
In groups or pairs, use the cognitive
walkthrough example, and what you know about user
psychology (see Chapter 1), to discuss the design
of a computer application of your choice (for example,
a word processor or a drawing package). (Hint:
Focus your discussion on one or two specific tasks
within the application.)
answer
This exercise is intended to give you
a feel for using the technique of cognitive walkthrough
(CW). CW is described in detail in Chapter 9 and the
same format can be used here. It is important to focus
on a task that is not too trivial, for example creating
a style in a word processing package. Also assume
a user who is familiar with the notion of styles, and
with other applications on the same platform (e.g. Mac,
PC or UNIX), but not with the particular word
processing package. Attention should be given to instances
where the interface fails to support the user in resolving
the goal and where it presents false avenues.
EXERCISE 9.2
What are the benefits and problems of
using video in experimentation? If you have access
to a video recorder, attempt to transcribe a piece
of action and conversation (it does not have to be
an experiment - a soap opera will do!). What
problems did you encounter?
answer
The benefits of video include: an accurate,
realistic representation of task performance, especially
where more than one camera is used; and a permanent record
of the observed behaviour.
The disadvantages include: vast amounts
of data that are difficult to analyse effectively;
the effort of transcription; obtrusiveness; and the
need for special equipment.
By carrying out this exercise, you will
experience some of the difficulties of representing
a visual record in a semi-formal written format. If
you are working in a group, discuss which parts of
the video are most difficult to represent, and how
important these parts are to understanding the clip.
EXERCISE 9.3
In Section 9.4.2 (An example: evaluating
icon designs), we saw that the observed results
could be the result of interference. Can you think
of alternative designs that may make this less likely?
Remember that individual variation was very high,
so you must retain a within-subjects design, but you
may perform more tests on each participant.
answer
Three possible ways of reducing interference
are:
- During the initial training period,
swap back and forth between learning the two sets
of icons, with the aim of getting the subjects used
to swapping between the two sets of remembered icons.
However, this design could be argued to suffer the
same flaws as the original. If the abstract icons
had been taught in isolation perhaps they might
have fared far better.
- We could invent a third set of 'random'
icons (call them R). We could then interpose them
in the experiment, that is, present the icons in
the orders RARN and RNRA. The intention is to swamp
any transfer effect in the 'noise' of the random
icons. It could be argued that our experiment then
measures the robustness of the icon sets to such
'noise'!
- We could give the subjects multiple
presentations, for example ANAN and NANA presentation
orders. This would not remove transfer effects,
but it would give us some way to quantify them.
Imagine that in the ANAN group the second presentation
of the abstract icons was significantly worse than
the first, but there was not a similar effect for
natural icons in the NANA group. This would give
us both positive evidence of a transfer effect,
and perhaps some quantitative measure. However,
even going from this additional evidence to a strong
conclusion will be difficult.
Notice that all the above measures require
additional subject time and one has to constantly
weigh up the advantages of richer experiments against
those of larger subject groups.
EXERCISE 9.4
Choose an appropriate evaluation method
for each of the following situations. In each case
identify
(i) The participants.
(ii) The technique used.
(iii) Representative tasks to be examined.
(iv) Measurements that would be appropriate.
(v) An outline plan for carrying out the evaluation.
(a) You are at an early stage in the
design of a spreadsheet package and you wish to test
what type of icons will be easiest to learn.
(b) You have a prototype for a theatre booking system
to be used by potential theatre-goers to reduce queues
at the box office.
(c) You have designed and implemented a new game system
and want to evaluate it before release.
(d) You have developed a group decision support system
for a solicitor's office.
(e) You have been asked to develop a system to store
and manage student exam results and would like to
test two different designs prior to implementation
or prototyping.
answer
Note that these answers are illustrative;
there are many possible evaluation techniques that
could be appropriate to the scenarios described.
Spreadsheet package
(i) Subjects: Typical users (secretaries, academics, students,
    accountants, home users, schoolchildren)
(ii) Technique: Heuristic evaluation
(iii) Representative tasks: Sorting data, printing spreadsheet,
    formatting cells, adding functions, producing graphs
(iv) Measurements: Speed of recognition, accuracy of recognition,
    user-perceived clarity
(v) Outline plan: Test the subjects with examples of each icon
    in various styles, noting responses.
Theatre booking system
(i) Subjects: Theatre-goers, the general public
(ii) Technique: Think aloud
(iii) Representative tasks: Finding next available tickets for a
    show, selecting seats, changing seats, changing date of booking
(iv) Measurements: Qualitative measures of users' comfort with
    system, measures of cognitive complexity, quantitative
    measures of time taken to perform task, errors made
(v) Outline plan: Present users with prototype system and tasks,
    record their observations whilst carrying out the tasks and
    refine results into categories identified in (iv).
New game system
(i) Subjects: The game's target audience. Age, sex and typical
    profile should be determined for the game in advance and the
    test users should be selected from this population, plus a few
    from outside to see if it has wider appeal.
(ii) Technique: Think aloud
(iii) Representative tasks: Whatever gameplay tasks there
    are (character movement, problem solving, etc.)
(iv) Measurements: Speed of response, scores achieved,
    extent of game mastered
(v) Outline plan: Allow subjects to play game and talk
    as they do so. Collect qualitative and quantitative
    evidence, follow up with questionnaire to assess satisfaction
    with gaming experience, etc.
Group decision support system
(i) Subjects: Solicitors, legal assistants, possibly clients
(ii) Technique: Cognitive walkthrough
(iii) Representative tasks: Anything requiring shared decision
    making: compensation claims, plea bargaining, complex
    issues with a diverse range of expertise needed
(iv) Measurements: Accuracy of information presented and
    accessible, veracity of audit trail of discussion,
    screen clutter and confusion, confusion owing to turn-taking
    protocols
(v) Outline plan: Evaluate by having experts walk through
    the system performing tasks, commenting as necessary.
Exam result management
(i) Subjects: Exams officer, secretaries, academics
(ii) Technique: Think aloud, questionnaires
(iii) Representative tasks: Storing marks, altering marks,
    deleting marks, collating information, security protection
(iv) Measurements: Ease of use, levels of security and
    error correction provided, accuracy of user
(v) Outline plan: Users perform the set tasks, giving a running
    verbal commentary on their immediate thoughts; considered
    views are gathered by questionnaire at the end.
EXERCISE 9.5
Complete the cognitive walkthrough
example for the video remote control design.
answer
Continue to ask the four questions for
each Action in the sequence. Work out what the user
will do and how the system will respond. If you can
analyse B and C, you will find that Actions D to I
are similar.
Hint: Remember that there is no
universal format for dates.
Action J: Think about the first question.
Will the user even know they need to press the transmit
button? Isn't it likely that the user will reach closure
after Action I?
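One way to keep the walkthrough systematic is to record an answer to each of the four questions for every Action. The sketch below is a hypothetical record-keeping aid, not part of the technique itself: the class and method names are invented for illustration, and the question wording is assumed from the usual statement of the cognitive walkthrough, so check it against the chapter.

```python
from dataclasses import dataclass, field

# Question wording is an assumption; verify it against the chapter text.
QUESTIONS = (
    "Is the effect of the action the same as the user's goal at that point?",
    "Will users see that the action is available?",
    "Once users have found the correct action, will they know it is the one they need?",
    "After the action is taken, will users understand the feedback they get?",
)


@dataclass
class ActionRecord:
    """Walkthrough notes for one Action (hypothetical structure)."""
    label: str
    description: str
    answers: dict = field(default_factory=dict)  # question index -> (ok, note)

    def answer(self, question, ok, note=""):
        """Record whether the design passes one question, with a note."""
        self.answers[question] = (ok, note)

    def problems(self):
        """Return (question text, note) for every question answered 'no'."""
        return [(QUESTIONS[q], note)
                for q, (ok, note) in sorted(self.answers.items()) if not ok]

    def complete(self):
        """True once all four questions have been answered."""
        return set(self.answers) == set(range(len(QUESTIONS)))


action_j = ActionRecord("J", "Press the transmit button")
action_j.answer(0, False, "User may reach closure after Action I.")
```

Here only the first question has been answered for Action J, so `action_j.complete()` is still false and the remaining three questions are left to work through.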
EXERCISE 9.6
In defining an experimental study,
describe
(a) how you as an experimenter would formulate the
hypothesis to be supported or refuted by your study
(b) how you would decide between a within-groups or
between-groups experimental design with your subjects
answer available for tutors only
EXERCISE 9.7
What are the factors governing the
choice of an appropriate evaluation method for different
interactive systems? Give brief details.
answer available for tutors only