
Extract from Blackwell, A.F. (1998). Metaphor in Diagrams
Unpublished PhD Thesis, University of Cambridge.

Chapter 7: Conclusions

Then thought I to understand this:
but it was too hard.
Psalm 73:15

This thesis set out to investigate a widely held belief about the diagrams in graphical user interfaces: that they are interpreted in terms of metaphor. As with other types of diagram, user interfaces cannot be treated simply as if they are a form of language, or as if they are objects to be perceived without interpretation. The user of any diagram must interpret it according to the intention with which it was constructed (Ittelson 1996), and there are many cases in which diagrams are intended as figurative rather than literal representations. The figurative intention in a diagram might be regarded either as analogical, transferring the structure of graphical variables to the domain of a problem (Bertin 1981, Larkin & Simon 1987), or as allegorical, embodying abstractions as some analogue of objects in physical space (Jackendoff 1983, Johnson 1987).

When user interface design textbooks recommend the use of metaphor as a basis for new designs, they do not investigate these theoretical alternatives too carefully. Such investigation might even seem foolhardy, because empirical research projects in HCI often fail to find significant benefits when metaphorical interfaces are compared to non-metaphorical interfaces (Simpson & Pellegrino 1993, Potosnak 1988). At a time when many technical resources are being devoted to extending allegorical metaphor into three-dimensional virtual worlds, it is worrying that such projects similarly fail to find advantages of 3-D metaphors (e.g. Sutcliffe & Patel 1996).

The demonstrable advantages of graphical user interfaces, as with many types of diagram, can be explained in other terms. Direct manipulation - representing abstract computational entities as graphical objects that have a constant location on the screen until the user chooses to move them - facilitates reasoning about the interface by reducing the number of possible consequences of an action (Lindsay 1988). Similar constraints provide the cognitive benefits of most types of diagrammatic representational system (Stenning & Oberlander 1995), and valuable new diagrams can be invented by selecting geometric constraints that apply to a specific abstraction (e.g. Cheng 1998). Where these types of cognitive benefits arise from a notational system, there is little need to invoke theories of metaphor to explain them. Many of the Cognitive Dimensions of Notations (Green 1989, Green & Petre 1996) are little affected by the degree to which the notation is intended to be figurative.

These pragmatic principles for the analysis of notations are only gradually becoming accepted among designers of new diagrammatic user interfaces. Chapter 3 reported a survey of research publications describing visual programming languages. These publications reflected widespread superlativist theories of diagram use (Green, Petre & Bellamy 1991). They also appeared to reflect a consensus of opinion within this community about the nature of diagram use - much of which can be attributed to the pioneering work of Smith (1977), where he argued that visual images support creative abstract thought by metaphorical processes. Despite this consensus, two surveys of professional programmers showed that these beliefs appear to be restricted to the research community. Programmers who have no specific allegiance to visual programming are sceptical about any potential benefits, probably out of simple professional conservatism. Those who do use a visual language professionally are not sceptical, but they often describe its pragmatic advantages and disadvantages in terms of notational features that are better explained by the Cognitive Dimensions of the language than by theories of metaphor.

Review of experimental findings

Visual programming languages have often been described as suitable for inexperienced programmers because the diagram expresses the behaviour of the program in terms of some metaphorical virtual machine. In chapter 4, this was tested by making the metaphor more or less available to inexperienced programmers, and evaluating the resulting changes in performance by comparison to experienced programmers. In experiment 1, the metaphor was conveyed by making the elements of the diagram more pictorial, while in experiment 2 the metaphor was explicitly described in instructional material. In neither case did the provision of the metaphor result in appreciable performance improvements relative to more experienced programmers, despite the fact that the computational concepts included in the diagrams of experiment 2 were heavily disguised, and that the diagrams were equally novel to all participants.

Early research publications in HCI regularly equated graphical user interfaces with metaphor on a basic cognitive level: "the use of appropriate verbal metaphors was enhanced by the use of diagrams (which are of course also metaphors)" (Carroll & Thomas 1982, p. 111); "images are metaphors for concepts" (Smith 1977, p. 23); "visual imagery is a productive metaphor for thought" (ibid. p. 6); "concepts in the short term memory are metaphorical images derived from sense perceptions" (ibid. p. 11). The role of visual imagery in the production of diagrams was investigated in the experiments of chapter 5. These found little evidence for the use of visual imagery when planning diagrams. Experiment 3 did find some evidence that a physical metaphor might inhibit abstract reasoning, but this appears to have been an artefact of a specific combination of stimulus and task, as several attempts to replicate the effect failed.

Whereas experiment 6 found that using metaphorical elements in a diagram had less effect on solution elaboration than either task complexity or experimental demand, experiment 5 failed even to find any effect resulting from the use of severely incongruent metaphors. The tasks used in these experiments were admittedly simple by comparison to the demands of design and problem-solving in many computer applications, but they suggest that many people construct diagrams without systematic use of either mental imagery or physical metaphor.

The experiments in chapter 6 returned to the question of metaphor as an instructional device. A number of variations on experiment 2 found that metaphor did provide some advantages in learning the conventions of a new diagram. Rather than supporting complex problem solving, however, those advantages seemed to be simple mnemonic ones. Performance in comprehension tasks was scarcely improved by systematic metaphors that explained diagrams in terms of virtual machines and other metaphorical conventions. Furthermore, the mnemonic advantage was equally great where participants created their own metaphors rather than being given an explicit instructional metaphor. If the diagram included pictorial elements, this facilitated self-generated metaphor - a factor that seems to have greater mnemonic value than systematic explanations.

Related results

These findings are consistent with a small number of previous studies. Marschark and Hunt (1985) found superior recall of verbal metaphors in cases when it was easy to form a concrete visual image of the metaphor subject, but inferior recall of metaphors with strong semantic relations. If theories of verbal metaphor can be extended to diagrams, their work supports the contention that diagrammatic metaphor brings more advantage for mnemonic tasks than for systematic problem solving. Payne (1988) found that metaphorical instruction had a mnemonic benefit when learning a command language, although this effect was smaller than the benefit of making the language itself systematic and consistent.

Martin and Jones (1998) observed that memory for even familiar symbolic conventions is influenced by people constructing their own systematic interpretations. These interpretations will often reflect some schema that was not intended in the original symbol design, resulting in surprisingly inaccurate memory among the general population for common symbols such as road signs.

Self-generated metaphor has been employed to great advantage in an educational variation of the visual programming language. Ford (1993) asked introductory computing students to create their own animated visual language, metaphorically expressing fundamental computational concepts. He reported significant benefits to his students. On the basis of the experimental evidence reported in this thesis, such an approach would be expected to be far more valuable than many proposed visual programming languages.

Implications

The findings described in this thesis have often encountered scepticism from other HCI researchers. Metaphor is so widely recognised as forming the basis for graphical user interfaces that it might seem unreasonable to question this assumption. Metaphor is almost always recommended as a basis for design in user interface textbooks, and publications describing the benefits of metaphor have appeared in the research literature for many years. Substantial commercial empires are attributed to the success of the "desktop metaphor", and millions of personal computer users have experienced dramatic improvements in usability when moving to this interface from previous generations of command-line system.

My contention is that the benefits of the "desktop metaphor" are misattributed. These systems were the first commercial products to allow direct manipulation of elements on the display, and I believe that their benefits derive almost completely from direct manipulation rather than from metaphor. Indeed, I have encountered anecdotal evidence that personal computer users outside the computer industry are often completely unaware of the intended metaphor in the systems they use. They understand "Windows" to be an arbitrary trade name rather than a metaphorical description of some aspect of the user interface (perhaps someone might compare the usability advantages of the "Apple" metaphor for fruitier user interfaces).

To those who have experienced the benefits of direct manipulation, and are aware of the supposed benefits of user interface metaphor, it is still difficult to explain those benefits in terms of cognitive theories. There are several competing theories of metaphor, with differences that may appear overly subtle by comparison to the obvious benefits of direct manipulation. Some respected cognitive scientists explicitly discount the application of their theories to computer interfaces (e.g. Gentner, Falkenhainer & Skorstad 1988), but this leaves the HCI researcher with numerous other theories to choose from, some of which claim a very strong link to diagram interpretation (e.g. Lakoff 1993).

The most important factor in the general acceptance of diagrammatic metaphor as a design principle in HCI, however, is the fact that it has not been empirically tested. The current project appears to have been the first attempt to isolate diagrammatic metaphor (rather than more general instructional metaphor) as a factor in controlled experiments. Previous experiments have compared metaphorical graphical interfaces to non-metaphorical interfaces, with disappointing results (Potosnak 1988). It has always been possible, however, to attribute those negative results to other aspects of the systems that were being compared (e.g. Simpson & Pellegrino 1993), or to confounding individual differences in the experimental sample (e.g. Rohr 1987).

The findings of this study do suggest some limited advantages from employing metaphor in diagrams - pictorial metaphor brings some mnemonic advantages, especially if the pictures are realistic (more realistic than those usually included in graphical user interfaces). In the absence of strong evidence for the broader benefits of metaphor, it might be wiser for user interface designers to turn to empirical findings that have shown advantages resulting from other aspects of notation design - particularly the correspondence between geometric structure, tools for manipulating those structures, and cognitive information processing tasks.

Further Investigation

Most of the experiments described in this thesis have taken at face value some of the "superlativist" claims of the HCI and visual languages community. In particular, they have tested the assumption that diagrams are universally beneficial in problem solving and design. This is obviously not the case - even within the scope of these studies, a simple distinction between "experts" and "novices" has revealed not universal benefits, but a large difference between individuals with different amounts of experience. Some of this difference may well be attributable to imagery strategies, as found in studies of experts by Hishitani (1990) and Saariluoma and Kalakoski (1997). Other studies have also found differences between the strategies of individuals given the opportunity to employ imagery when using diagrammatic representations (Schwartz & Black 1996b, Stenning & Gurr 1997). It would be valuable to investigate the relationship between individual strategies and the type of metaphorical diagrams studied here - the result of such a study may well explain the vehemence of superlativist positions, as well as the historical context of diagram research within the imagery debate.

This study also shares a practical deficiency with many other experimental studies in psychology and HCI. Although it purports to investigate learning effects, the period of learning in each experimental trial is extremely short - the longest experiment reported here lasted two hours, and almost all were shorter than an hour. No real design notation or user interface would be learned in such a short time, and as a consequence the experimental tasks have been almost trivially simple. Other experiments have shown significant effects of practice when using similar notations (Scaife & Rogers 1996, Schwartz & Black 1996a, van der Veer 1990). Despite this weakness, the current study is valuable because many researchers have claimed large and immediate effects for the benefits of metaphor. The conclusion of this study is that, although there may well be effects to be observed, they are not necessarily large or immediate.

The notations used in this study have been rather artificially simple, in order to be acquired and applied within an experimental session. They have also been made unlike any widely used diagrammatic notation, in order not to introduce a confound of individual experience beyond the basic level of professional background (Cox 1997). For any real notation, however, many of its elements are based on diagrammatic conventions that are already familiar to users from other contexts (S. Smith 1981, Wang, Lee & Zeevat 1995). Consistent application of these conventions is an essential aspect of the professional skills of information designers and other developers of new notations. This is also an area that deserves further investigation.

Finally, this study has only partially explained the almost universal acceptance of metaphor as a basis for user interface design. It is possible that part of this popularity results from confusion between the functions of metaphor and direct manipulation in HCI. It is also possible that the pioneering work of D.C. Smith (1977), which was popularised and achieved huge commercial success through the Xerox Star, Apple Macintosh, and Microsoft Windows, has had more influence than is justified by the psychological evidence he presented. It is still possible, however, that the advantages attributed to metaphor are really due to some unexplored aspect of systematic conceptual metaphors. One candidate for this is the principle of abstraction congruence proposed by Simos and Blackwell (1998), which will be the subject of extended investigation in the near future (Blackwell 1998).
