
Extract from Blackwell, A.F. (1998). Metaphor in Diagrams
Unpublished PhD Thesis, University of Cambridge.

Chapter 5: Diagrams and Abstract Structure Generation

Those weird designs, they only show
What's going on in weirder minds,
Cause when you doodle then
Your noodle's flying blind.
Every little thing that you write
Just conceivably might
Be a thought that you catch
If while caught in a wink.
Doodling takes you beyond what you think
Then you draw what you feel.
Doodlin' - Horace Silver

The experiments described in the previous chapter found large differences in performance between experts and novices, but only small relative changes in novice performance as a result of using metaphorical notations. The apparent lack of educational benefits is disappointing. It is possible, however, that the notations used the wrong kind of metaphors.

In experiment 1, evidence from think-aloud protocols suggested that novice programmers paid little attention to implicit visual metaphors, yet regarded the problem in less abstract terms when using a physical metaphor. These findings concur with some of the intuitions of visual programming language designers and users - one of the themes that I found in the surveys of chapter 3 was the claim that visual languages would reduce the degree of abstraction required in programming. Diagrams may help users avoid abstraction by depicting an abstract concept in terms of physical experience (Lakoff 1993). Alternatively, diagrams may be more computationally tractable because they have less potential for expressing abstraction than symbolic languages (Stenning & Oberlander 1995).

We can thus distinguish between diagrams in which the components illustrate a simple physical metaphor, as in experiments 1 and 2, and diagrams whose geometric structure acts directly as an alternative concrete metaphor for some abstract structure. The latter is more like the case of an electrical schematic, where the possible set of causal relationships is constrained by the connection paths in the diagram.

In experiment 1, participants appeared to use fewer abstract terms when using an overtly metaphorical pictorial notation. It is possible that an increased level of visual detail constrains the representation to refer to a specific situation, rather than an abstract set of potential situations. It is for precisely this reason that pictures have long been considered not to be abstract representations; Berkeley's early 18th century theory of vision (1705/1910) distinguished between the visible abstractions of geometry and perceptual experience of the real world. Bartlett quotes Napoleon as distrusting the use of mental images: "those who form a picture of everything are unfit to command" (Bartlett 1932, p. 220). Modern work in cognitive psychology has also observed that mental image-based strategies are unable to represent indeterminacy (Mani & Johnson-Laird 1982), and that generalisation from multiple examples requires the translation of descriptions from images into verbal abstractions (Goldberg & Costa 1981) - the consequent loss of these visual-turned-verbal abstractions has been observed in psychiatric patients during left hemisphere ECT suppression (Deglin & Kinsbourne 1996).

The inability of mental images to support abstraction is considered by Stenning & Oberlander (1991, 1995) to be their principal advantage, because reducing the range of possible interpretations (they call this specificity) makes reasoning with a diagram more computationally efficient. Restricting the range of interpretations can also be a disadvantage, of course. Pimm (1995) believes that using concrete representations in mathematical settings may prevent children from forming necessary abstractions. Other theories, however, emphasise that mental images are at least more abstract than visual percepts because they do not specify all possible details (Paivio 1971, Miller 1993). This observation has also been made of representational conventions in drawing (Arnheim 1970) and of diagram use (Wang, Lee & Zeevat 1995).

If diagrammatic images are interpreted metaphorically, which of these possibilities would be the most relevant? The interpretation of metaphor is itself a process of abstraction from one situation to some interpretive domain (Verbrugge & McCarrell 1977, Gentner & Wolff 1997), but this abstraction makes metaphor difficult to understand because of the range of potential interpretations (Winner & Gardner 1993). If images could be used as intermediaries when interpreting metaphors (Beck 1978), this might provide the advantage of specificity - constraining potential interpretations. In fact, many theories of metaphor comprehension propose that mental images are central to use of metaphor (Cacciari & Glucksberg 1995, Gibbs & O'Brien 1990, Kaufmann 1979, Walsh 1990). Tentative proposals have been made of a functional relationship between the cognitive resources applied in diagram use and metaphorical image use (Lewis 1991, Lakoff 1993), but these have not been as confident as the claims made by computer scientists about the benefits of HCI metaphor.

The use of strategies based on mental imagery to solve verbal problems has historically been one of the central issues in the mental imagery debate. Much of the existing research into diagram use appears to have been motivated by entrenched positions in that debate, as reviewed by Blackwell (1997b). This discussion can only briefly summarise that review, which considered experimental tasks involving picture naming (e.g. Potter & Faulconer 1975), identity judgements (e.g. Theios & Amrhein 1989), evaluating sentences about diagrammatic situations (e.g. Clark & Chase 1972) and problem solving (e.g. Frandsen & Holder 1969, Schwartz 1981). Blackwell also reviewed the main theoretical positions in the debate (Kieras 1978, Pylyshyn 1981, Kosslyn 1981) and some of the philosophical approaches to resolving it (Dennett 1981, Goodman 1990, Sloman 1995).

The most convincing evidence in the imagery debate has come from purely visual tasks such as mental rotation (Shepard & Metzler 1971) and map construction (Kosslyn, Ball & Reiser 1978), but many experiments have investigated the diagrammatic use of images to represent logical propositions (Huttenlocher 1968, Shaver, Pierson & Lang 1974/75, Mani & Johnson-Laird 1982, Fuchs, Goschke & Gude 1988, Matsuno 1987). Many computational models of mental imagery have been constructed as supporting evidence that images can be used for logical reasoning (Lindsay 1988, Greeno 1989, Glasgow & Papadias 1995), as well as for reasoning about the abstract structure of physical situations (Koedinger & Anderson 1990, McDougal & Hammond 1995, Novak 1995, Gardin & Meltzer 1995, Forbus 1983, Faltings 1987, Blackwell 1989).

The most ambitious claims found in the surveys of chapter 3 extend well beyond such restricted problem-solving activities, however. Some researchers apparently claim that all software design problems are solved by thinking in images, and that visual programming languages directly facilitate the solution process. This intuition is consistent with the introspections of programmers who use conventional languages (Petre & Blackwell 1997). Several other studies have also found evidence for use of mental images during software design (Gilmore & Green 1984a, Buckingham Shum et al. 1997, Green & Navarro 1995, Saariluoma & Sajaniemi 1994).

When mental images are reported by expert programmers, the activities they refer to are not simple problem-solving, but large-scale design. The processes of system design in programming have more in common with other design disciplines, such as engineering and architecture, than with the type of experimental tasks described earlier in this review. Ferguson (1992) has described the way in which the development of modern engineering depended on the ability to publish pictorial representations of engineering designs. Ferguson, along with many eminent engineers whom he quotes, believes that engineering designs are constructed as mental images, and that communicating those designs depends on non-verbal representations. Similar claims have been made regarding the use of visual representations in architecture. Goel (1992, 1995) challenges the computational theory of mind on the basis that it cannot account for the way that architects use sketches, as documented in protocol studies of architects at work by Goldschmidt (1991, 1994) and Suwa and Tversky (1997). Fish and Scrivener (1990) have proposed a general model of the use of sketches in creative design - they claim that perception of sketches interacts directly with mental imagery to enable creative problem solutions.

The use of both visual representations and mental images to discover creative solutions has also been proposed as a fundamental mechanism of scientific discovery (Dreistadt 1968, Gooding 1996, Nersessian 1995, Qin & Simon 1995), as well as in other fields of creativity (Koestler 1964, Shepard 1978, Johnson-Laird 1988). It has even been proposed that almost all problem solving involves structural analogies constructed from mental images (Paivio 1971, Kaufmann 1979). Recent experimental investigations of this proposal have concentrated on a single question, however: once a mental image has been formed, is it possible to reinterpret this image in order to discover new properties? This question is crucial to proposed models of image-based creativity, and highly relevant to the theories of engineering and architectural design described above. Finke, with various colleagues, has carried out a series of experiments in which he has found evidence for discovery of new structure in images when subjects are shown apparently unrelated elements, then asked to combine them in working memory (Finke & Slayton 1988, Finke, Pinker & Farah 1989, Finke 1996). Other experiments, however, have found that memorised ambiguous images cannot be reinterpreted, although the subject can later reproduce the image on paper and then reinterpret their own drawing (Chambers & Reisberg 1985, Slezak 1992, Walker et al. 1997).

The experiments in this chapter investigate the way that diagram use interacts with mental imagery during design tasks. They address several of the questions that have been discussed in this introduction, but concentrate on their relevance to diagram use, rather than speculating on general properties of mental images.

Experiment 3: Visual imagery during planning

Is there any evidence that diagrams are direct expressions of image-like mental representations? One way to investigate this question is by analysing external signs of cognition associated with both diagrams and imagery. Brandt and Stark (1997), for example, found that the same sequence of gaze fixations was involved in imagining a simple diagram as in observing it. A second alternative is to use dual-task studies: if a certain task has been observed to impair the formation of mental images (presumably, but not necessarily, because it uses the same cognitive resources), will that task also impair the planning of diagrams? This experiment takes the second approach; if diagram planning is impaired by the secondary task, we can infer that diagrams express image-like mental representations.

It also addresses two further questions arising from experiment 1 by using tasks that may involve different encodings in working memory. The first is related to the possible distinction between physical information and abstract information. Experiment 1 suggested that pictorial representations may cause physical information to be emphasised rather than abstract information. Previous research into working memory has found experimental and neurological evidence that spatial information is encoded separately from categorical information (McNamara 1986, Mecklinger & Muller 1996, Kosslyn et al. 1989) but also that the two are combined when abstract information must be memorised in association with a spatial context, as when functions are assigned to buildings on a map (McNamara, Halpin & Hardy 1992). It seems likely that visual material presented in diagrams involves both categorical and spatial information. Must a combination of abstract information within a spatial metaphor hence rely on different working memory resources?

The second working memory question arising from experiment 1 is the distinction between encoding the spatial arrangement of the elements in a diagram and encoding their visual appearance. Just as there is strong evidence for separate working memory resources for categorical and spatial information, there is also substantial evidence for a distinction between the visual and spatial components of working memory, including neurological (Farah et al. 1988), developmental (Logie & Pearson 1997), anatomical (Mishkin, Ungerleider & Macko 1983) and functional imaging (Smith & Jonides 1997) studies, as well as evidence from conventional cognitive experiments. An example of the latter is the report by Tresch, Sinnamon and Seamon (1993) that memory for objects or for location is selectively impaired after tasks involving colour identification and motion detection respectively. In experiment 1, the mental animation process that was postulated as a basis for analysing pictures of a physical machine can be identified as primarily visual, while the process of arranging nodes and connections into a complete diagrammatic solution is primarily spatial.

There is a diverse spectrum of hypotheses relating the two distinctions: coordinate/categorical and spatial/visual. It is quite possible that there is only a single representational dichotomy, but that it is simply poorly understood. Either distinction, however, may be relevant to the current investigation - the distinction between abstract and physical information, or between pictorial metaphor and simple geometry, might easily interact as a result of their respective working memory requirements. This experiment addresses these questions by considering separately abstract and physical situations, and by using separate secondary tasks that exercise either visual or spatial short term memory.

Notation

The notation used in this experiment was designed to be as simple as possible while maintaining the visual dataflow metaphor introduced in experiment 1. It was intended for use by participants with no experience of computer programming, without requiring that they learn any computational concepts. The form of the notation was nodes connected by arcs, as in experiment 1, but these were given only minimal semantics. Four different types of node were defined, but these had no semantic implication - I told participants that they could choose whichever node they liked, and use them to stand for anything they liked. Each node type included a selection of terminals to which arcs could be connected. Terminals on the left hand side of a node were described as "inputs", and those on the right hand side as "outputs", so that flow implicitly proceeded from left to right, even though (as in experiment 1) I never explicitly mentioned flow. Each terminal could have any number of arcs connected to it.

As in experiment 1, there were two forms of this notation, each with identical semantics, but with different pictorial images representing nodes and arcs. The first of these used simple geometric shapes, connected by plain lines, shown in figure 5.1.

Figure 5.1. Simple geometric nodes and arcs

In the second form of the notation, nodes were connected together by images of cylindrical ducts (actually miniaturised bitmap images produced from digitised photographs of air-conditioning ducts). The nodes themselves were also small photographic images, designed to be obviously mechanical, and plausibly producing flow through the attached ducts, but without having any identifiable function. They were produced from digitised photographs of air conditioning components and garage tools, but the original devices would only be identified by an experienced engineer - participants in the experiment did not recognise them. This implicit data flow version is shown in figure 5.2.

Figure 5.2. Mechanical nodes and arcs with implicit data flow

Participants created diagrams by manipulating the appropriate set of nodes with an editor on a computer screen. The editor screen is illustrated in Figure 5.3. The editor included a palette in one corner with four different node images; participants created new nodes by clicking on any one of these images with the mouse. Nodes could be moved to any location on the screen by clicking in the middle of the image, and dragging it. Connections between nodes could be made by dropping one node so that its input coincided with the output of another node, or by clicking on the output of one node, and dragging from there to the input of another node. If the node at either end of a connection was dragged to a new location, the arc would move to follow it.

Figure 5.3. Simple node and arc editor

Most participants in this experiment had little experience of computers, and some had never used a mouse before. To make it easier to arrange nodes and connect them together, the screen was therefore divided into a grid of points with approximately 1 cm spacing, as shown in Figure 5.3 (for the geometric version, the grid was simple dots, while for the pictorial version, it resembled a grid of rivet heads on a steel plate). When a node was moved to a new position, it would "jump" to the nearest location on the grid. This made it relatively easy to connect terminals together - the participant only needed to click within the 1 cm² region around the terminal.
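The "jump" to the nearest grid location can be illustrated with a minimal sketch. This is not the Lingo code used in the experiment; the function name and the spacing of 40 pixels (standing in for "approximately 1 cm") are assumptions made purely for illustration.

```python
import math

def snap_to_grid(x, y, spacing=40):
    """Return the grid point nearest to the pixel position (x, y).

    `spacing` is a hypothetical grid pitch in pixels; the original
    editor used a grid of roughly 1 cm, whose pixel size is not stated.
    """
    # Adding 0.5 before flooring rounds to the nearest grid index,
    # with exact ties resolved towards the larger index.
    col = math.floor(x / spacing + 0.5)
    row = math.floor(y / spacing + 0.5)
    return col * spacing, row * spacing
```

Under these assumptions, a node dropped at (57, 22) would land on the grid point (40, 40), so any release within the square cell centred on a grid point selects that point.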

The editor also included an "erase mode" allowing nodes or arcs to be removed (the erase mode button is at the top right of figure 5.3). If the participant moved the mouse cursor over any shape on the screen after selecting erase mode the shape would turn into a cloud containing the words "click me"; clicking would then erase it. When not in erase mode, clicking on a node caused a selection box to be drawn around that shape. For the geometric version, this was a simple box and arrow, and for the pictorial version, it was a piece of paper and a pencil. The participant could assign a label to each node by clicking to select it, and then typing the label. Labels were only required after the diagram was complete, however, as in experiments by Szlichinski (1979).

These various features of the diagram editors are illustrated in appendix B.3.

Tasks

Participants were asked to use the editor to draw diagrams describing the workings of six different devices. Three devices incorporated moving parts and physical processes, while the other three had no internal moving parts, and only abstract processes. Table 5.1 shows the six devices used and the categories they were assigned to.

Concrete	Abstract
washing machine	telephone
motorbike	calculator
coffee vending machine	television

Table 5.1. Abstract/concrete devices

Participants were told that the diagram should show the way that the named device worked on the inside, and should not be a picture of the device.

Participants were also asked to perform secondary tasks while planning their diagrams. The first of these tasks, spatial tracking, was designed to interfere with spatial working memory, as in the motion detection task described by Tresch et al. (1993). The participant moved the mouse pointer to follow a circle moving slowly around the computer screen with random changes in direction. If the mouse pointer moved outside the circle, the circle changed colour - the participant was instructed not to let this happen (more positive alarms were also tried, but were found during pilot testing to induce excessive anxiety). The exact form of this task was proposed by Professor Alan Baddeley (personal communication 18 June 1996). The second task, visual noise, has been shown by Quinn and McConnell (1996) to interfere with memory for a list of items specifically when subjects are asked to use a visual mnemonic strategy. The participant watched a continually changing random grid of black and white squares. In the third task, blank screen, the participant simply watched a blank screen with a fixation cross in the centre.

Equipment

The editor was implemented using the animation package MacroMedia Director, version 4.0. I captured the pictorial node and arc images using a Kodak DC500 digital camera, reduced the resolution to 40 pixels square using Adobe Photoshop, and edited them to provide uniform connection points on each node before importing them as bitmaps for use in Director. The animated behaviour of the nodes and arcs in response to user actions was implemented in the Lingo scripting language provided with the Director product.

The experiment procedure was controlled by a presentation sequence implemented in Director. This provided an animated tutorial demonstrating the use of the editor (the tutorial script is reproduced in appendix B.3), then invoked the editor program. The editor program maintained a log of all actions made by the participant, with the system time (reported as the previous whole second) recorded at the time of each action. The experimental software ran on a Macintosh PowerPC 8200/120 computer with a 17 inch monitor displaying a resolution of 832 x 624 pixels at 24-bit colour.

The secondary tasks to be performed while planning diagrams were also implemented as movies in MacroMedia Director. The spatial tracking task was designed to have minimal visual contrast. A dark grey circle slowly followed a randomised path over a light grey field. The speed of motion was kept constant, with the direction of motion changing by small randomised increments. As long as the mouse pointer was kept over the top of the moving circle, it would stay the same colour, but if the participant let the pointer move away, the circle would change to a slightly different shade of grey.
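The motion of the tracking target (constant speed, with the direction changing by small randomised increments) can be sketched as follows. This is an illustrative reconstruction and not the original Director movie: the step size, maximum turn per step, circle radius, and the choice to reflect the path at the screen edges are all assumptions. Only the 832 x 624 screen size is taken from the description above.

```python
import math
import random

def tracking_path(steps, speed=2.0, max_turn=0.3,
                  width=832, height=624, seed=0):
    """Generate target centre positions for a constant-speed random path.

    `speed` (pixels per step) and `max_turn` (radians per step) are
    hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    x, y = width / 2, height / 2
    heading = rng.uniform(0, 2 * math.pi)
    path = [(x, y)]
    for _ in range(steps):
        # Small randomised change of direction; speed stays constant.
        heading += rng.uniform(-max_turn, max_turn)
        nx = x + speed * math.cos(heading)
        ny = y + speed * math.sin(heading)
        # Assumed behaviour: reflect the heading at the screen edges.
        if not 0 <= nx <= width:
            heading = math.pi - heading
            nx = x + speed * math.cos(heading)
        if not 0 <= ny <= height:
            heading = -heading
            ny = y + speed * math.sin(heading)
        x, y = nx, ny
        path.append((x, y))
    return path

def pointer_on_target(pointer, centre, radius=20):
    """True while the mouse pointer is over the circle (radius assumed)."""
    return math.dist(pointer, centre) <= radius
```

In a sketch of this kind, the circle would be redrawn in a slightly different shade of grey whenever `pointer_on_target` returns false.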

The visual noise task was based on a C program originally developed for the IBM PC by Jean McConnell (McConnell, personal communication 26 July 1996), as used in the research described by Quinn & McConnell (1996). The original program was not reused, but I created a close visual equivalent using the facilities of Director. I first wrote a LISP program to generate a series of images consisting of a grid of black and white squares. In each image the squares of the grid were randomly coloured either black or white. These images were then imported as a sequence of animated frames in Director, with the transition between frames taking place as a random fade, where the fade grid corresponded exactly to the grid of random squares. The result of this was a randomly changing sequence of grid squares practically indistinguishable from the stimulus used by Quinn and McConnell.
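The generation of random grids and the cell-by-cell fade between them can be sketched as below. This is written in Python for illustration and is neither the LISP program nor the Director movie described above; the function names are hypothetical and the grid dimensions are left as parameters because the original dimensions are not stated here.

```python
import random

def random_grid(rows, cols, rng):
    """Return a grid in which each square is randomly black (0) or white (1)."""
    return [[rng.choice((0, 1)) for _ in range(cols)] for _ in range(rows)]

def transition(current, target, rng):
    """Yield intermediate grids for a random fade from `current` to `target`.

    Squares are switched to their target colour one at a time, in a random
    order, so the fade grid coincides exactly with the grid of squares.
    """
    grid = [row[:] for row in current]
    cells = [(r, c) for r in range(len(grid)) for c in range(len(grid[0]))]
    rng.shuffle(cells)
    for r, c in cells:
        grid[r][c] = target[r][c]
        yield [row[:] for row in grid]
```

Chaining successive transitions between freshly generated grids gives a continually changing display of the kind described, although the timing of the original stimulus is not reproduced.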

Hypotheses

Pictorial representations inhibited reasoning about abstract tasks in experiment 1; the same effect should produce an interaction between the type of editor and the type of device being explained.
On the basis of the previous literature describing use of mental images, a secondary task that has been shown to impair visuo-spatial short term memory should inhibit planning of the diagram.
Abstract and concrete diagrams might be prepared as different image types, in which case different secondary tasks would have different inhibitory effects on planning each device type, possibly interacting with the two types of editor.

Participants and design

Twenty-four participants were recruited from the APU panel of volunteers. None of them had any experience of computer programming. Two further participants were recruited after two of the original cohort misinterpreted the description of the editor palette, concluding that each diagram should have exactly four nodes.

One independent variable was assigned randomly between subjects - twelve participants used the editor with simple geometric shapes and twelve used the pictorial editor with implicit data flow. A second independent variable was the nature of the diagram drawing task. Each participant drew six diagrams, three of which explained physical processes and three abstract processes. The third independent variable was the secondary task performed by the participant while planning each diagram: spatial tracking, designed to interfere with spatial working memory, visual noise, designed to interfere with visual working memory, and blank screen, in which the participant simply watched a fixation cross in the centre of an otherwise blank screen.

The experiment included three dependent variables. The first was the degree of elaboration - the number of nodes in the diagram, as measured in experiment 1. The second was the speed with which the diagram was created - the average interval between addition of new nodes or arcs. The third was the proportion of changes made to the diagram - the proportion of nodes or arcs that were moved or erased after their initial creation.
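As an illustration of how these three measures could be derived from the editor's action log, the following sketch assumes a hypothetical log format of `(seconds, action, kind, element_id)` tuples. The actual log format is not described in this chapter, and the decision to exclude erased nodes from the elaboration count is likewise an assumption.

```python
def diagram_measures(log):
    """Compute elaboration, mean interval between additions, and
    proportion of elements changed from a list of log events.

    Each event is (seconds, action, kind, element_id), where action is
    'add', 'move' or 'erase' and kind is 'node' or 'arc' (assumed format).
    """
    added = [e for e in log if e[1] == "add"]
    erased_ids = {e[3] for e in log if e[1] == "erase"}
    changed_ids = {e[3] for e in log if e[1] in ("move", "erase")}

    # Elaboration: nodes that remain in the finished diagram.
    elaboration = sum(1 for e in added
                      if e[2] == "node" and e[3] not in erased_ids)

    # Speed: average interval between successive additions.
    times = [e[0] for e in added]
    intervals = [b - a for a, b in zip(times, times[1:])]
    mean_interval = sum(intervals) / len(intervals) if intervals else None

    # Changes: share of created elements later moved or erased.
    proportion_changed = len(changed_ids) / len(added) if added else 0.0
    return elaboration, mean_interval, proportion_changed
```

For a log in which four elements are added at 0, 4, 6 and 12 seconds, one node is later moved and another erased, this sketch gives an elaboration of 2, a mean interval of 4.0 seconds, and a proportion of changes of 0.5.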

In order to test the hypothesis of an interaction between the type of editor and the type of process being explained, I tested for different degrees of diagram elaboration for each device type in the two groups. In order to test the second hypothesis, that the secondary tasks would inhibit planning, I compared the speed of creation of the first ten additions to the diagram immediately after the end of the planning period. I also compared the proportion of changes made to the diagram. The third hypothesis of an interaction between the type of secondary task and the pictorial representation or process type was tested in terms of diagram elaboration.

Procedure

The experiment started with an explanation of the editor functions, at a pace controlled by the participant. The first part of the explanation covered basic mouse operation - clicking on a shape and dragging it from one place to another. This was followed by an animated demonstration of the editor functions, using the appropriate (geometric or pictorial) version of the editor. The secondary tasks were then demonstrated, so that the participant could practise following the moving circle, and see the random grid display. Finally, the participant was asked to draw a diagram as a practice exercise using the editor. The instructions for the exercise specified that the diagram should show how a toaster works on the inside. They stressed that the diagram should not be a picture of a toaster, and that it did not need to look like any part of a toaster - it would simply show how a toaster worked. No further instructions were given regarding the intended use of the nodes and arcs.

During this instructional sequence, I sat beside the participant and answered any questions. Most participants had no questions. Some needed assistance with the procedure for dragging using the mouse. Some asked for clarification of the difference between an input terminal of a node and an output terminal. Some asked for clarification of the instruction that the diagram should show how the toaster works, rather than what it looks like. Several participants were quite anxious about the task, and protested that they would be unable to draw any diagrams. I reassured these participants in general terms, and all of them proved able to draw acceptable diagrams using the editor - it was not necessary to remove any participants as a result of inability to use the editor.

During the remainder of the experiment, participants worked on their own, drawing diagrams to describe the workings of the six different devices. The presentation sequence displayed a description of the required diagram, including a reminder that the diagram should show the way that the named device worked on the inside, and should not be a picture of the device. The participant was then given 60 seconds to plan their diagram, during which they had to perform one of the three secondary tasks. The allocation of secondary task to device description was balanced across subjects, and the presentation order of both devices and secondary tasks was also balanced. After the planning period, the editor screen was displayed. The participant then had a period of five minutes in which to draw the diagram they had planned. At the end of the five minutes, they were given a further two minutes in which to type labels for each node in the diagram. During this second period, the node creation and erase functions were disabled. This planning / drawing / labelling sequence was repeated six times by each participant.

When participants had completed the diagram drawing tasks, I asked them to complete a debriefing questionnaire. This questionnaire asked:

how easy it was to plan the diagrams in advance;
which was the most difficult diagram to plan;
whether they felt that the secondary task had made the diagrams more difficult to plan;
how they decided which shapes to use;
whether shapes were chosen during planning, or only while drawing the diagram; and
whether they had been able to assign names to shapes in advance.
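The procedure states that the allocation of secondary tasks to devices, and the presentation order, were balanced across subjects, but does not give the scheme. One possible scheme, offered only as a hypothetical sketch and not as the allocation actually used, rotates the three tasks within each device category and rotates the presentation order by participant number.

```python
CONCRETE = ["washing machine", "motorbike", "coffee vending machine"]
ABSTRACT = ["telephone", "calculator", "television"]
TASKS = ["spatial tracking", "visual noise", "blank screen"]

def schedule(participant):
    """Return (device, secondary task) pairs in presentation order.

    Assumed scheme: within each category every task is used once, the
    device-to-task pairing rotates with the participant number, and the
    presentation order is rotated by the same number.
    """
    pairs = []
    for group in (CONCRETE, ABSTRACT):
        for i, device in enumerate(group):
            task = TASKS[(i + participant) % len(TASKS)]
            pairs.append((device, task))
    shift = participant % len(pairs)
    return pairs[shift:] + pairs[:shift]
```

With this rotation, any three consecutive participants between them pair each device with each of the three secondary tasks exactly once, and every participant meets each task once for a concrete device and once for an abstract one.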

Results

The analysis approach is a multivariate analysis of variance (MANOVA) with repeated measures, having two within subjects factors (abstract/concrete processes and blank screen/visual noise/spatial tracking secondary task), and one between subjects factor (geometric/pictorial editor). The three dependent variables were elaboration (number of nodes), initial speed of production, and proportion of diagram elements changed. The MANOVA results reported here are calculated using Pillai's Trace method; alternative MANOVA techniques did not result in any difference of significance values.

Initial univariate tests showed that both the editor type and the type of process had significant effects on elaboration F(1,22)=5.74 and 5.81 respectively, p<.05. As shown in figure 5.4, abstract processes were drawn with more nodes (an average of 8.74) than were concrete processes (7.90). Diagrams drawn with the geometric editor were also more elaborate (9.46 nodes) than those drawn with the pictorial editor (7.18 nodes). The hypothesised interaction between these two factors did occur in the predicted direction - diagrams describing abstract processes were more elaborate when using the geometric editor. Although the difference in the means was relatively large, this interaction was not significant F(1,22)=2.33, p=.14. A similar interaction effect was observed for the other dependent variables. Participants constructed the diagram more quickly when using the geometric editor for abstract processes, and they made a smaller proportion of changes to their diagrams, as shown in Figure 5.5. MANOVA analysis indicates that these interactions when taken together are significant, F(3,20)=5.77, p<.01.

Figure 5.4. Effects of editor type and process type on diagram elaboration

Figure 5.5. Interactions of editor type and process type:
elaboration, speed and number of changes, taken together, show a significant effect

The second hypothesis was that secondary tasks during the planning period would have an effect on speed of production and the proportion of changes made. There was no evidence that the secondary task had any effect on either speed of production F(2,44)=0.034, p=.96 or on proportion of changes F(2,44)=0.116, p=.89. There was also no evidence of the interactions postulated in the third hypothesis - a multivariate analysis of variance found no interaction of secondary task with process type F(6,86)=1.373, p=.51 or with editor type F(6,86)=0.562, p=.21. Univariate ANOVA tests on each variable also found no significant interactions with the secondary task.

When answering the questions in the debriefing questionnaire, only five of the 24 participants said that they had been able to choose shapes in advance while planning the diagram, although a further six said that they could do so occasionally. Twelve of the participants said that it was "not easy", "hard", "difficult" or "impossible" to plan diagrams in advance. The performance of these twelve was then considered separately from the twelve who reported that advance planning was relatively easy. As can be seen in figure 5.6, there was no overall difference between the performance of the groups in any measure, with the MANOVA test result F(3,20)=0.425, p=.73. There were, however, marginally significant covariances of self-reported planning with the effect of secondary tasks, F(6,82)=2.074, p=.065. Those who reported that it was easy to plan in advance actually performed slightly better (more elaborate diagrams and faster production) when carrying out secondary tasks, while the reverse was true of those who had difficulty planning.

Figure 5.6. Effect on performance of self-reported advance planning

Discussion

As in experiment 1, there is evidence in this experiment that novices are not completely happy when asked to use pictorial elements diagrammatically. In experiment 1, several participants commented that they found the level of detail in the pictorial notation confusing. At the end of the experiment I showed some of these participants the geometric version of the language, and they claimed that they would prefer to use that version. I also showed two of the participants in the geometric condition the pictorial version, and they said that they would prefer not to use it. This supports the finding of Strothotte and Strothotte (1997), amongst others, who have noted that pictogram users tend to choose more "abstract" symbols such as asterisks or arrows when representing abstract concepts.

In this experiment, the informally expressed preference has been supported by performance measures. Participants were less productive when using a pictorial metaphor to create diagrams than when using simple geometric shapes. This difference was most pronounced in the case of diagrams describing abstract processes. My explanation for this is that participants regard the illustrations as being literal rather than metaphorical, and that this results in incongruity between the task and the notation. No participants commented that they found the pictorial notation inappropriate for particular tasks, but this hypothesis is tested in greater detail in experiment 5.

The main intention of this experiment, however, was to test the way in which choice of notation affects the user's ability to form diagrams as mental images. Only some participants in this experiment appeared to carry out any planning using mental imagery. Secondary tasks during the planning period had no overall effect on speed of production, despite the fact that these tasks have reliably been shown to impair mental images in short-term visuo-spatial memory. Those participants who reported that they found it easy to carry out advance planning actually improved their performance when a secondary task was given. It is possible that the plans involved verbal rehearsal rather than images - this is investigated further in experiment 6B, in which some participants were given no planning time at all.

Overall, this experiment found a reduction in performance when pictorial elements were used to describe abstract processes. A further relationship had been expected between performance and the use of mental images for diagram planning, but no clear evidence was found for this.

The different reports regarding advance planning reflect a wider range of individual differences that are relevant to this experiment. The underlying causes of these differences may be complex. Several researchers have reported differences in mental imagery ability correlated with gender (Casey 1996, Delgado & Prieto 1996, Paivio & Clark 1991, Silverman, Phillips & Silverman 1996), handedness, or an interaction between the two (Halpern 1996). Further proposed distinctions include the difference between verbalizer and visualizer "cognitive styles" (Richardson 1977), interaction of cognitive style with handedness (Casey et al. 1993), cognitive style with gender (Winner & Casey 1992), with age (Johnson 1991) or self-reported vividness of mental imagery (Katz 1983). This is a very complex issue, and I found no obvious correlations with (for example) gender in post hoc tests. It is certainly true that there was a wide range of individual variation in performance in this task, and this variation has contributed to the marginal significance of the reasonably large effects observed. Stenning and Gurr (1997) have also observed the difficulty of evaluating external representation use in the face of individual differences such as these.

Van der Veer (1990) has explicitly investigated the effect of cognitive style on the interpretation of software diagrams, but found that individual differences in mathematical experience had a greater effect than cognitive style. Previous experience of mathematics notations has also been identified as a factor in image memory by Winner and Casey (1992) and by Manger and Eikeland (1998). Most of these studies have also noted an interaction of experience with gender or gender image. This is in accordance with casual comments made by many of the female participants in this and later experiments, along the lines of "I'm not much good with mathematical things - you should have got my husband/son to do this experiment".

Experiment 4: Comparing diagrams to text

Participants in experiment 3 were selected on the basis that they had no experience of computer programming - indeed most had little experience of computers, and some had never used a mouse before taking part in the experiment. The training phase did take account of this, and all participants successfully completed the experiment. Nevertheless, the environment was not one with which they were comfortable. This may have caused participants to produce diagrams that were unusually simple. The use of a computer-based editor may also have removed the potential in pencil sketches for discovery through ambiguity, as has been suggested by Goel (1995) in the case of architectural CAD systems. In this related experiment, participants were therefore asked to explain the same devices, but by either drawing diagrams using pencil and paper or writing a verbal explanation.