A Case Study: Usability Problems in an Electronic Store

A cornerstone of NCR’s Electronic Store offerings is the notion that the store can only maximize its potential if consumers find it easy to use. Electronic retailing is more than just providing consumers with access to products. The design of an Electronic Store must support all aspects of the retail experience. Too little emphasis is currently given in most web sites to how people will actually use the application – that is, how people shop.

To illustrate the importance of this idea, cognitive engineers at NCR’s Human Interface Technology Center (HITC) recently conducted a series of usability evaluations of several well-known existing Electronic Stores. The sites selected for the studies represent a broad range of the merchandise available on the Internet: a catalog retailer, a specialty store (a card and gift retailer), a book store, and a general merchandiser.


The cognitive engineering team approached the evaluations expecting to uncover some usability problems. Even so, the results were startling: for three of the sites, evaluators were able to complete, without assistance, only about half of the standard shopping tasks they attempted. In those three sites, over one-third of the tasks could not be completed at all, even with assistance. Evaluators of the fourth site had better success completing tasks without assistance, but still experienced a 16% failure rate. Details of the studies follow.

Figure 1. Task Success for Four Well-Regarded Web Sites.

What is usability testing?

In a usability evaluation, a sample of the population of intended users of a product or application is recruited to perform a series of representative tasks with the product or application. The evaluation is performed in a usability laboratory, a controlled environment in which end user behavior can be observed and videotaped through a one-way mirror as the users attempt to complete tasks using existing or simulated applications. The data generated by these evaluations (e.g., whether or not users complete each task, user errors, and subjective ratings) are used to identify features of the application design that lead to errors, confusion, and frustration on the part of the end users. Usability testing is a powerful methodology for understanding the problems end users have in operating technology-based systems.
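To make this kind of data concrete, the short sketch below tallies per-task outcomes from one evaluator session into the unassisted-success, assisted, and failure rates that reports like this one cite. The task names and outcomes are invented for illustration; they are not the actual study data.

    from collections import Counter

    # Hypothetical per-task outcomes for a single evaluator session.
    # Each outcome is "completed" (no help), "assisted" (completed with help),
    # or "failed" (not completed within the time limit).
    session_results = [
        ("find product by specification", "completed"),
        ("add item by catalog number",    "assisted"),
        ("remove item from basket",       "completed"),
        ("submit an order",               "failed"),
    ]

    def summarize(results):
        """Aggregate task outcomes into the rates a usability report cites."""
        counts = Counter(outcome for _, outcome in results)
        total = len(results)
        return {
            "unassisted success rate": counts["completed"] / total,
            "assisted rate": counts["assisted"] / total,
            "failure rate": counts["failed"] / total,
        }

    print(summarize(session_results))
    # {'unassisted success rate': 0.5, 'assisted rate': 0.25, 'failure rate': 0.25}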


In the paragraphs that follow, we describe the procedure and results from the four usability evaluations referenced above. The specifics of the retailers and their products have been omitted from this description. Similar results have been found for other sites that NCR has evaluated.


What was done in the usability evaluation?

The evaluators who participated in the four studies were recruited to match a standard demographic profile of Internet shoppers. In each study, between 12 and 17 experienced Internet shoppers evaluated a single web site. Approximately half of the evaluators were male and half were female. Evaluators represented a range of age categories and income levels.


Each session began with a brief questionnaire about the evaluator’s Internet shopping behavior and familiarity with the retailer’s web site. The evaluator then attempted to perform between 6 and 8 representative shopping tasks within the web site of a prominent Internet retailer. At the end of the session, the evaluator completed a second questionnaire to rate the usability of the site. Evaluators each spent about an hour in the study and were paid between $35 and $40 for participating.


The tasks that each evaluator performed were selected to reflect tasks that shoppers typically perform (e.g., search for alternatives, compare alternatives) and to exercise the core functionality of the site. Because each site was unique, the tasks were tailored for each study, and tasks that a given site did not support were eliminated. The following list is a sample of the types of tasks evaluators performed:


  1. Find a product that meets a set of specifications and add it to your shopping basket.
  2. Add an item to the shopping basket when you know its number from the mail-order catalog.
  3. Look for an item for which you have incomplete criteria (e.g., a t-shirt in three different colors, the latest book from your favorite author).
  4. Buy an item for a person with a general need (e.g., someone who likes bears, wants a gardening item, or wants a toolbox for a truck).
  5. Learn about a product you cannot buy on the Internet.
  6. Get suggestions for an item to purchase.
  7. Delete an item from the shopping basket.
  8. Submit an order.

During the test, HITC cognitive engineers noted evaluator errors, comments, and task success statistics (i.e., whether or not the evaluator completed the task or needed assistance). If an evaluator became frustrated or was sufficiently lost, the cognitive engineer offered assistance. Cognitive engineers stopped evaluators who could not accomplish a task in about ten minutes.


What were the results?

As noted above, the overall results show that for three sites, an average of 34% of the tasks could not be completed within the allowed time period. The book seller site had a 16% failure rate. Success rates for similar tasks varied widely across the sites. For example, evaluators who attempted to remove an item from the shopping basket in the catalog retailer site failed 9% of the time. Evaluators who attempted the same task in the card and gift retailer’s site failed 46% of the time.


Even a task as seemingly simple as adding an item to the shopping basket when a catalog number is known had widely varying success rates. Evaluators of the catalog retailer site experienced a 42% failure rate for this task, while evaluators of the general merchandiser site had a 0% failure rate.


Reasons for the high failure rates can be grouped into three categories:

  1. Visual Design – User errors or inefficiencies resulting from a visual design that is not perceived by end users as intended.

    For example, in the catalog retailer’s site, buttons at the bottom of the screen led evaluators to expect that clicking them would open new pages of related information. Instead, when evaluators clicked the "Shop" button, a list of related links appeared on the same page. Because the links were at the bottom of the page, displayed in relatively small text, and not what evaluators expected, many evaluators did not even realize that clicking the "Shop" button had any effect.

  2. Technology/Database – User errors or inefficiencies that result from the way content is represented in the database or the way in which a search is carried out.

    For example, some evaluators who performed Task 3 (searching with incomplete criteria) in the catalog retailer’s site attempted to search for a particular kind of shirt. When evaluators entered the name of the shirt into the search field, the system responded with a single, inappropriate link. Evaluators who had this kind of trouble with the search function tended not to use it again.

  3. Cognitive Engineering – User errors or inefficiencies that result from site operation and organization that are inconsistent with the way users expect to perform tasks.

    In the search example above, many of the evaluators resorted to browsing through a list of similar products (i.e., shirts). Unfortunately, these products were listed by "marketing" names instead of descriptive product names, so users could not scan the names of the shirts and determine which ones might meet the task criteria. Instead, evaluators were forced to select a product name, press the "View" button, wait for the product information to be displayed, and then decide whether or not that product met their criteria. If it did not, the evaluator had to return to the page with the list of shirts, select the next product in the list, and repeat the process. To compound the problem, evaluators were unable to make direct comparisons between products, since the list of product names contained no other helpful information such as a picture or a price.


This approach may make sense from a development standpoint, since content that already exists in support of the store’s published catalog can easily be re-purposed for the electronic store. It does not, however, support the way people shop, and evaluators found it time-consuming and frustrating. The sketch below contrasts the two ways of presenting the same catalog content.
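The following is a purely hypothetical sketch, not drawn from any of the evaluated sites; the product records, field names, and prices are invented for illustration. It contrasts a listing that re-purposes print-catalog marketing names alone with one that also carries a descriptive name and a price, which is what lets shoppers scan and compare in place.

    # Hypothetical catalog records; field names and values are invented.
    catalog = [
        {"marketing_name": "Weekend Breeze",
         "descriptive_name": "Men's striped cotton t-shirt", "price": 14.95},
        {"marketing_name": "Harbor Mist",
         "descriptive_name": "Women's long-sleeve denim shirt", "price": 29.95},
        {"marketing_name": "Summit Trail",
         "descriptive_name": "Unisex fleece pullover", "price": 39.95},
    ]

    def marketing_listing(products):
        """The re-purposed print-catalog view: names only, nothing to scan or compare."""
        return [p["marketing_name"] for p in products]

    def shopper_listing(products, keyword=""):
        """A shopper-oriented view: descriptive name plus price, filterable by keyword."""
        keyword = keyword.lower()
        return [
            f'{p["descriptive_name"]} - ${p["price"]:.2f}'
            for p in products
            if keyword in p["descriptive_name"].lower()
        ]

    print(marketing_listing(catalog))
    # ['Weekend Breeze', 'Harbor Mist', 'Summit Trail']
    print(shopper_listing(catalog, "shirt"))
    # ["Men's striped cotton t-shirt - $14.95", "Women's long-sleeve denim shirt - $29.95"]

In the first listing a shopper must open each product page in turn to learn anything useful; in the second, a single page answers the "which shirts might fit my criteria, and what do they cost?" question at a glance.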


Conclusions

NCR completed the assessment of these sites to demonstrate the benefits of usability evaluation and to illustrate the sometimes-subtle ways in which user interface design can affect usability. The examples cited above only scratch the surface of the types of problems that the usability evaluations uncovered. By evaluating the systems with real users performing realistic tasks, NCR was able to identify numerous problems in the design of the sites. Each evaluation took about three weeks to complete (from planning to results), a relatively small investment in time and resources to discover factors that strongly affect the overall success of a site. The bottom line is that if users can complete their shopping tasks effectively and efficiently, the store will see greater sales.


If you are interested in Usability Testing, see the HITC service description.




copyright 1997, NCR Human Interface Technology Center