Methods for evaluating functional prototypes

In this post I will look at three methods for evaluating functional prototypes.

The methods are:
1) System usability scale;
2) AttrakDiff; and
3) Think aloud protocol.

When usability is measured, its general classes should be covered: effectiveness, efficiency and satisfaction. Vastly different metrics can be used to measure these attributes. This context-specificity makes comparing the usability of different systems very difficult. It also means that a design that works very well in one system, with one set of users and use cases, might not work well in another context. Copying usable solutions is therefore very risky.

The usability of an artefact is defined by the context it is used in. Strictly speaking, every usability study would therefore require its own detailed approach. To make usability testing easier and more universally comparable, the System Usability Scale (SUS) was created. SUS is a reliable, low-cost usability scale that can be used universally to assess the usability of systems. It is a ten-item Likert-scale questionnaire and the overall outcome is a score between 0 and 100. Altogether it is a valuable, reliable and robust tool that allows usability evaluation to be performed effectively.
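To make the 0 to 100 score concrete, here is a minimal sketch of the standard SUS scoring rule: odd-numbered items contribute their answer minus 1, even-numbered items contribute 5 minus their answer, and the sum is multiplied by 2.5. The function name `sus_score` is my own choice for illustration.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one participant.

    responses: ten answers in questionnaire order, each an integer 1-5.
    """
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 responses")
    if any(answer not in (1, 2, 3, 4, 5) for answer in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded.
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded.
            total += 5 - answer
    # The summed contributions range from 0 to 40; scale to 0-100.
    return total * 2.5


if __name__ == "__main__":
    print(sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2]))
```

Scores from several participants are usually averaged afterwards to get one figure per system.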

AttrakDiff is used to assess users' feelings about a system with a questionnaire. The questionnaire studies both hedonic and pragmatic dimensions with semantic differentials. The data acquired is quantitative and comparable, much like with the System Usability Scale, but the weakness of AttrakDiff is that it relies on the users' reflections rather than on the real experiences themselves. The approach can be used in both lab and field studies.
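As a rough illustration of how semantic differential answers turn into comparable numbers, the sketch below averages ratings per dimension. The word pairs, the grouping into two dimensions and the -3 to +3 rating range are assumptions I made for the example; they are not the official AttrakDiff item set or its scoring procedure.

```python
from statistics import mean

# Hypothetical word pairs grouped by dimension (illustration only).
DIMENSIONS = {
    "pragmatic": ["confusing-clear", "complicated-simple"],
    "hedonic": ["dull-captivating", "conventional-inventive"],
}


def dimension_means(ratings):
    """Average one participant's ratings per dimension.

    ratings: dict mapping each word pair to a rating from -3 to +3,
    where -3 is the left-hand word and +3 the right-hand word.
    """
    result = {}
    for dimension, pairs in DIMENSIONS.items():
        values = [ratings[pair] for pair in pairs]
        if any(value < -3 or value > 3 for value in values):
            raise ValueError("ratings must lie between -3 and +3")
        result[dimension] = mean(values)
    return result


if __name__ == "__main__":
    answers = {
        "confusing-clear": 2,
        "complicated-simple": 3,
        "dull-captivating": -1,
        "conventional-inventive": 1,
    }
    print(dimension_means(answers))
```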

Think aloud protocol adds another dimension to the studies. During usability testing, users are asked to talk aloud and say what comes to mind while performing the set tasks. The thoughts are often not even related to the task itself. They might include what users are looking at, thinking, doing, and feeling. A set of questionnaires is also used, either before or after the tasks, as necessary.


Methods for Collecting UX Data

There are three articles about physiological measures for collecting UX data on the table today. I will try to seek out the pros and cons to see which would be best for our own UX evaluation process. Not all of these methods are used to evaluate every type of prototype; some of them are used mainly in web design, for example. I need to find the most suitable approach for evaluating user experience in my own case.

The three physiological measuring approaches are:

Visual Complexity Evaluation
Pupillary dilation monitoring
Eye tracking techniques

Visual Complexity Evaluation is often used in website design, but its outcomes and effects are not always fully utilized or understood. In the studies a hypothesis was proposed and tested to see whether increasing a website's complexity would have a detrimental cognitive and emotional impact on users. Users want their web environments to be usable and appealing. Visual complexity may play a huge role in their first impressions and even in later usage decisions. If so, then designing appealing and simple webpages might work the other way around and attract users to a webpage regardless of its content. The studies also included passive viewing task (PVT) and visual search task (VST) methods.

The second study looked at pupillary dilation monitoring during music-induced aesthetic responses (chills). It concentrated on the correlation between music-induced chills and pupil reaction. While listening to different songs, participants were asked to press a button whenever they got a chill. The moment at which participants pressed the button was then correlated with pupil reaction. The study concluded that pupil reaction during passive music listening can be monitored and translated into aesthetic responses.
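To picture how button presses and pupil data could be lined up, here is a small sketch that compares the average pupil diameter shortly before and shortly after each press. This is my own simplified illustration, not the analysis used in the study; the function name, the data layout and the two-second window are all assumptions.

```python
from statistics import mean


def dilation_around_events(samples, press_times, window=2.0):
    """Compare pupil diameter before and after each button press.

    samples: list of (time_in_seconds, diameter_in_mm) tuples.
    press_times: list of times (seconds) at which a chill was reported.
    window: seconds to average on each side of a press (assumed value).

    Returns a list of (press_time, mean_after - mean_before) tuples.
    Presses without samples on both sides are skipped.
    """
    changes = []
    for press in press_times:
        before = [d for t, d in samples if press - window <= t < press]
        after = [d for t, d in samples if press <= t < press + window]
        if before and after:
            changes.append((press, mean(after) - mean(before)))
    return changes


if __name__ == "__main__":
    recording = [(0.0, 3.0), (1.0, 3.0), (2.0, 3.5), (3.0, 3.5)]
    print(dilation_around_events(recording, press_times=[2.0]))
```

A positive difference would suggest that the pupil dilated around the reported chill, although a real analysis would also need to handle blinks, lighting changes and reaction delay.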

The third study was performed with touch-screen devices and soft keyboards. Eye tracking was used to evaluate the user experience of participants when using different soft keyboard layouts. The aim of the study was to provide input into soft keyboard layout design to help users type more effectively.

I think that our concept, which is a collaborative music making experience, could benefit a lot from pupillary dilation monitoring. It would show whether people get engaged and enjoy the process itself. On the other hand, the experience needs to be as simple as possible so that users can step in and start using our machine. Visual complexity therefore needs to be toned down to the very minimum in order to attract users. Either of these methods could then be used to evaluate the user experience of our concept or prototype.



Methods for evaluating early prototypes (Reading assignment 2)

This discussion aims to learn about three methods for evaluating early prototypes and to select one of them. The methods are:

(1) Multiple Sorting
(2) Contextual laddering
(3) Wizard of Oz

Multiple Sorting
The first method addressed is multiple sorting. It relies on the assumption that people’s conceptualising and understanding of their world, and therefore their knowledge, is based on categorisation. A simple linear scale method (a semantic differential such as awkward vs. easy to use) was used to collect information on how we see the world. Since a person is constantly updating their understanding of their surroundings and the world, Kelly (Kelly, G. A. The Psychology of Personal Constructs, Volume One: Theory and Personality. (1). 1955. New York, Norton.) developed a technique for eliciting personal constructs in an interview context which would be more multidimensional. This technique is known as the ‘Repertory Grid Test’.
In the Repertory Grid Test, participants are asked to think about triads of items. They need to describe what is similar between two of them and why the third one is different. This method brings out contrastive dimensions, and participants are then asked to rate the items on those dimensions. The data collected forms a grid and helps to explain how people interpret the items and the connections generated by other people on the same items.
Both of these methods rely on the assumption that our constructs are polar, whereas categorization can often be multidimensional. The Repertory Grid Test is also quite time-consuming, since the participants need to be interviewed, and verbalizing the answers is often difficult. To address the linear approach of the earlier tests, new variations have been devised. A newer version of the test provides the participants with a wide variety of objects and allows them to put as many into a group as they like. After each sorting, participants are asked for the reasons behind their decisions. Multidimensional Scalogram Analysis (MSA) is then performed on the resulting sort data, to yield spatial maps of constructs for interpretation alongside interview discussions.
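To get a feel for what the sort data looks like before any analysis, the sketch below tallies how often two items end up in the same group. This is only a simple pairwise co-occurrence count, not MSA itself, and the data layout (a list of sorts, each a list of groups) is an assumption made for the example.

```python
from collections import Counter
from itertools import combinations


def co_occurrence(sorts):
    """Count in how many sorts each pair of items shares a group.

    sorts: list of sorts; each sort is a list of groups;
    each group is a set of item names.
    Returns a Counter keyed by alphabetically ordered item pairs.
    """
    counts = Counter()
    for sort in sorts:
        pairs_in_sort = set()
        for group in sort:
            # Sorting the names keeps each pair in one canonical order.
            for first, second in combinations(sorted(group), 2):
                pairs_in_sort.add((first, second))
        # Each pair counts at most once per sort.
        counts.update(pairs_in_sort)
    return counts


if __name__ == "__main__":
    example_sorts = [
        [{"phone", "tablet"}, {"laptop"}],
        [{"phone", "tablet", "laptop"}],
    ]
    for pair, count in sorted(co_occurrence(example_sorts).items()):
        print(pair, count)
```

Pairs with high counts are items that participants tend to see as belonging together, which is the kind of structure a spatial map would then make visible.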
Contextual laddering 
Laddering is a well-known interview technique in which customers are asked repeatedly to explain why an attribute that they have given to a product is important. After the background of an attribute has been explained, the question why is asked again, e.g. "Why is that important to you?" By probing into the reasons why, the interviewee will ‘climb up the ladder’. This way, the reasons (consequences) why certain attributes are important will first be revealed, followed by an expression of how these consequences serve personal values. UX Laddering is useful for designing attributes that offer value and meaningful experiences to users. The goal of laddering is to identify and understand the links between key perceptual elements across the range of attributes, consequences and values.
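The attribute, consequence and value links collected in such interviews can be tallied quite simply. The sketch below counts how often one element leads directly to the next across all ladders; the example ladders and the function name are made up for illustration.

```python
from collections import Counter


def direct_links(ladders):
    """Count direct links between neighbouring ladder steps.

    ladders: list of ladders; each ladder is a list of elements ordered
    from the attribute at the bottom up to the personal value at the top.
    Returns a Counter keyed by (lower_element, higher_element).
    """
    links = Counter()
    for ladder in ladders:
        # Pair every element with the one directly above it.
        for lower, higher in zip(ladder, ladder[1:]):
            links[(lower, higher)] += 1
    return links


if __name__ == "__main__":
    interviews = [
        ["large buttons", "fewer mistakes", "feeling competent"],
        ["large buttons", "fewer mistakes", "peace of mind"],
    ]
    for link, count in direct_links(interviews).most_common():
        print(link, count)
```

Links that many interviewees mention are the strongest candidates for the attributes that actually carry value for users.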
Wizard of Oz (WOz)
The WOz method helps designers avoid getting locked into a particular design or working under an incorrect set of assumptions about user preferences, because it lets them explore and evaluate designs before investing the considerable development time needed to build a complete prototype. The method is particularly useful in exploring user interfaces for pervasive, ubiquitous, or mixed-reality systems that combine complex sensing and intelligent control logic. The problem with the method is that it requires engineering an interface and integrating it with an incomplete system. When the system under investigation is developed further, the WOz interface usually is not, so it ends up being used only once or twice.
In WOz a person acts as a wizard and performs the steps that have not yet been automated or implemented by the system itself, manually closing the gaps left by the current state of development. As development progresses, the gaps that the wizard has to fill during testing shrink.
In our case, the last method seems a good one to use since we need to test the overall system on potential users without having certain very important functionality developed. In order to see if it makes sense to go forward with the same plan, we need to test the overall perception of the system.

Three methods for evaluating design. Which would I choose?

There are three methods for evaluating design currently on the table. I will explore them and decide which one I would use to evaluate our future work and designs.

The methods are as follows:

(1) Sentence Completion
(2) AXE (Anticipated eXperience Evaluation)
(3) Exploration test

Sentence completion technique is often used in psychology and marketing. The method can be developed and applied for evaluating symbolic meaning. In the paper “Sentence Completion for Evaluating Symbolic Meaning” by Sari Kujala and Piia Nurkka, sentence completion is used in two case studies to evaluate how people give symbolic meaning to objects based on their design and associations. Respondents are given a questionnaire with the beginnings of sentences, which they complete in ways that are meaningful to them.


Sentence completion technique can be used in interviews and nowadays often also via electronic channels (for example questionnaires sent by e-mail). Interviewing allows for collecting large amounts of information and for following how people react and associate things and events with a design. The downside of interviewing is of course the organizational side. Gathering the necessary resources, getting people to attend and scheduling the meetings is a very resource-heavy process. Using web-based questionnaires is an easier way to go, but it has downsides too. The amount of attention and input you get from participants is often lower and depends on the associations people have with the objects or design being studied.

Interviewing users in studies often has many drawbacks that need to be addressed. When evaluating concepts, it is necessary to get feedback from potential users. The perceived product character and its individual features are especially important. With this information, it is possible to identify potential issues early on and make modifications to avoid them. But the bottom line is that concepts are being evaluated, and concepts are abstract. The presentation of a concept or the visual look of an early prototype might sway the feedback one way or the other. Even the storytelling behind the necessity or creation of a concept might put new and confusing thoughts into the mix. There is also an issue with talking about the future or putting it into words. It becomes very difficult when people are asked to formulate their future needs, especially because using words to describe experiences is hard already and adding imagination makes it harder still.

To overcome these difficulties in concept evaluation, Lutz Gegner and Mikael Runonen propose the AXE (Anticipated eXperience Evaluation) method. It is an approach to gaining insights into the perceived value of concepts by using image-pairs as stimuli in user interviews. The approach has three steps: concept briefing, concept evaluation, and data analysis.

First, in concept briefing, the concept is presented to every participant in the same manner and order. All information, such as concept narratives and extra material, is also handed to the participants so that they can access it at any time during the session. In the second and main part of the AXE method, concept evaluation, participants are given image-pairs with a scale in between. The images ensure that all input towards the participants is similarly structured and help steer them to talk about the experiential aspects they perceive. The generative visuals and the enabling scales help solve some of the challenges that normal interviews have. The image-pairs are composed to display a contrast and are linked by a scale to strengthen the idea of bipolarity. By choosing a position between the images, people express their perception of the product and their preference.

An evaluation interview is also carried out during concept evaluation, to get a deeper understanding of why the participant has chosen one image or the other. The visuals that participants associate with the concept are not always the ones they prefer. When there is a difference between the chosen and preferred images, the interviewer can gain insight into what would make the concept better. It is very important to use only adjectives and information provided by the participant during the interviews, otherwise the validity of the feedback can be lowered significantly.

During data analysis, the data is transcribed and partitioned into manageable segments, and then an analytical framework is built. In the framework every segment is coded and categorized. The main categories reflect the current state of the concept and comprise perceived product features, associated attributes and anticipated consequences. All these categories and subcategories provide input for UX development of the evaluated concept.

The third method, Vermersch’s ‘explicitation’ interviewing technique (“Vermersch’s ‘explicitation’ interviewing technique used in analysing human-computer interaction”, Ann Light, December 1999), relies more heavily on interviewing and offers HCI in particular a broader range of application. HCI researchers are trying to understand the use of technologies, and regularly use qualitative research methods to do so. This approach helps them investigate how tasks are completed. The interviewee is encouraged to think of a particular episode involving the activity under investigation and to go into a state of evocation so that the episode can be described in detail. This gives researchers a clearer overview of what goes on in people's minds when they perform certain tasks. The interview has to be carried out with extreme care to avoid misguiding the interviewee.

Each of the described methods has tasks that it performs best. In our case, where we have an idea that has not been fully described and formulated, we find that we should use the first method, Sentence Completion. We would be able to use it to find out the perceived qualities and symbolic meaning of our proposed concept or idea. We wish to see whether users give the concept meanings like having fun or spending quality time with friends or family. By giving participants a questionnaire made up of open-ended sentence beginnings, we can see whether our own vision goes hand in hand with the perception of possible users and whether the concept is given the symbolic meaning we hope for.

30 important concepts in HCI (for me)

  1. body of knowledge
  2. interaction design
  3. Interface design
  4. user interaction
  5. User-centered design
  6. cognitive theory
  7. (cognitive) frameworks
  8. cognitive modeling
  9. external representation
  10. social approach
  11. affordances
  12. Distributed cognition
  13. Artefacts
  14. Embodiment
  15. Design/Designer
  16. Group behavior
  17. Social loafing
  18. Holistic experience (Compositional, Sensual, Emotional, Spatio-temporal)
  19. Tangible manipulation
  20. Heuristics (recognition, search, choice)
  21. Innovation
  22. Predictive
  23. Collaborative work (design)
  24. Perception
  25. Augmentation
  26. Data collection
  27. Research (methods)
  28. Aesthetics
  29. Prototype
  30. Culture



Conclusions on HCI Theory [M7]

Having read the paper and reflected on it through the concepts and maps, I must agree that applying different conceptual frameworks has been effective and even impressive in shaping research, but that their application in practice is indeed a bit below par. I did enjoy the overview of how user experiences are shaped and defined through sensor-based conceptualization, which nowadays contributes a lot to innovation in user input and interaction design.

More theoretical approaches, though not very successful, are to my liking because they allow different angles of attack. The issue is that one might often be misunderstood and, as stated in the discussion, your ideas and thoughts are best followed if all the participants have an overview of the basic theoretical concepts. Many terms and concepts are often interpreted differently, and that prolongs and complicates the application of such approaches in studies or even real life.

The proposed and used frameworks help to understand how many similar but still different works or approaches come from the same background and, with the same basics, arrive at different solutions and generalizations. By understanding the basics, I am now able to spot at least some of them being used (past or present) in my work, studies and surroundings.

Contemporary theories (part II) [M6]

Känd [M6] Contemporary theory (Part II)

The last two turns in Contemporary theory are a bit easier to comprehend and follow than the first two. For example, the turn to the wild is largely concerned with observing situations in everyday life and proposes a logical, cyclical re-contextualizing process. The heuristic approach in ecological rationality is comprehensible and quite easy to explain with real-life scenarios. The way we interact with our surroundings, and how we recognize, search and choose without having to go through all the information provided, is interesting and fun to follow. The turn to embodiment emphasizes learning through doing and tries to understand how engagement with the social and physical environment works in practice.

Both of these turns, and the theories that they embody, bring forth the risk that the researcher or designer finds themselves slipping into over-simplified interpretations. In-the-wild studies also run the risk of being too focused on the interpretation of the researcher. Also, in laboratory research there is always a person who has to explain the “rules of the game”, which threatens the credibility of the outcome as a natural, in-the-wild way of interacting.

Contemporary theories in HCI [M5]

Känd [M5] Contemporary theory pt.1

The step into Contemporary theories is a fascinating one. It requires building and relying on the previous developments of HCI. It also incorporates and expands on previously examined theories and approaches. Taking into account the relationship between technology and experience is more to my liking. The circumstances surrounding design, and the environment designed in and for, are more complex. Contemporary theory accepts more involvement from previous approaches and sciences, which in turn makes it harder to comprehend and grasp overall. One has to be fluent in the history and development of HCI to be competent within the realm of Contemporary theory.

Now to the articles.


To exemplify the turn to design, I chose the article “Developing the Drift Table” by Andrew Boucher and William Gaver.

Firstly, a drift table is a table that displays aerial images as if you were drifting across landscapes. The article describes designing the table and its evolution from the first sketch to the final design over a period of months.

There were many considerations to be addressed during that time, including experience issues about the imagery to be displayed and ways to interact with it, aesthetic issues around designing domestic electronic furniture, and engineering issues concerning how to produce a prototype that would be quiet, safe, and reliable. The issues were deeply intertwined. Potential aesthetic choices were constrained by technical feasibility, while engineering solutions were constrained by the need to achieve a desirable aesthetic.

The first change made in the project was to abandon the idea of a dinner table and make a coffee table instead. The reason was simple but important in this context: they wanted to move away from the task-oriented environment of dining and closer to the relaxed space around a coffee table, in order to promote ludic values through the design.

They also didn't want to make the table so modern that it would stand out. It was supposed to be simple and usable. It was built using white laminate and wood veneer to give it an inherently domestic look. To make the prototype as close to the real thing as possible, they also went for a self-contained computer setup, meaning that the computer internals and aerial photography are all stored within the table itself.

One of the main lessons learned during the design process concerned the engineering and making of a full-sized, fully operational prototype that was as close to the real thing as possible. That meant that users, and indeed future customers, would be able to suspend disbelief and engage fully with the device over the long term. Such attention to detail and finish is perhaps not normally associated with an experiment or prototype, but there are real benefits to be gained from it.


To exemplify the turn to culture, I chose the article “Hatching Scarf: A Critical Design about Anxiety and Persuasive Computing” by Andrew Boucher and William Gaver.

The idea was to create a tangible object which would measure quantified data about the user but instead of persuading the user to perform certain tasks, which might create unhealthy anxiety, they wanted the object/device to make the user reflect on their person as objects and subjects of knowledge. As a computational object, it contains a range of metaphors, symbols, and concepts to help critically interrogate the relations among technology, social norms, and comforting habits.

The scarf contains many pouches and pockets. The way the scarf works is by opening and closing in response to the motion of the hand within the pockets. Its visual and interactive vocabularies are inspired by the way a mother bird brings food to the nest for her chicks.

The most important aim of the designers was to intimately incorporate an artifact into bodily practice. They want the user (or wearer in this case) to perceive the object as an extension of themselves. The design is agnostic about what it means, what is correct, etc. It does not give definite feedback to the user, but rather sends cues or signs to help re-conceptualize the user's actions and perhaps see them in a new light.

Modern theories (part II) [M4]

Känd [M4] Modern theories (part II) vol.2.cmap

As these maps get more complex it becomes easier to get lost in them. Some notions and artifacts are very similar and even overlap, so I come to a point where I want to bring them together. Since they are quite far apart, the map would become unreadable if they were connected. That is why some parts might be a bit repetitive and even left unconnected although they should be connected.

Also as the theories and approaches pile up, I find it necessary to write more detailed descriptions of them to the map. That makes the nodes larger and the picture a bit harder to read, but once I get into it, I am at least able to distinguish between them.

I also find, while going back to previous maps, that I should have been a bit more descriptive. Some concepts might be a bit confusing at first and I need to think back in order to thoroughly understand them. I believe that this might be the case for everyone while looking at it.

As far as the map is concerned it is making a lot more sense of the overall build-up of theories and approaches in HCI. It is now a more comprehensible overall picture with most of the modern theories and approaches also listed. If I compare the concepts to my own knowledge and what I have read in “HCI Theory” by Yvonne Rogers, it is now a better comparable picture to the relations between HCI, design theories and computer sciences.

What I fear most of all is mapping the contemporary theory, mainly because I don't know how I will be able to portray those concepts on the same map as well. It might be a bit of a challenge.

Modern theories (distributed cognition) [M3]

Känd [M3] Modern theories (distributed cognition) vol.2 - How did distributed cognition evolve

The map first reflects on three theoretical approaches among the modern theories in HCI: social approaches like external cognition, distributed cognition and ecological psychology. Each was, in a way, derived from the others. External cognition was quite concrete and focused on the relations between internal and external representations. Distributed cognition took into account the environment in which the task was carried out as well as the interactions among people and the artifacts used. The main limitation of the distributed approach brought out in the article was the need for extensive fieldwork: the analysis needed to be carried out in the environment itself, and it also required a good overview of the different aspects and relations within it. Because it did not have a set of out-of-the-box interlinked concepts, it was also difficult to apply. The ecological psychology approach, on the other hand, was more focused on invariant structures in the environment and on using them as tools to guide and design the surroundings or visuals. The approach used ecological constraints to guide actions, affordances to give usability clues, and entry points to attract and trigger interaction. There were some limitations for usability and interaction, like being unable to find similarities between certain visual artefacts on the screen and real-world analogies (e.g. a door knob and the grasping action).

As far as the theories and approaches are concerned, it was a bit easier to comprehend than classic theories. Mainly because I am able to easily visualize the point and put it into a real world perspective. Some aspects of the social approaches are quite new and modern (that is why they must be called that) and are easily applicable even today.