I’m looking for graphics to add – this is just about incomprehensible to a “visual learner”
Left: New labeling, maybe? Right: Old labeling (1960s) but still in use.
One brain – two visual systems / Mel Goodale and David Milner outline their research.
Terms: agnosia – inability to interpret sensations and hence to recognize things, typically as a result of brain damage; ataxia – the loss of full control of bodily movements; saccade (French for ‘jerk’) – a quick, simultaneous movement of both eyes between two phases of fixation in the same direction. The phenomenon can be associated with a shift in frequency of an emitted signal or a movement of a body part or device.
Why would anyone think we have two visual systems? After all, we have only one pair of eyes – and clearly we have only one indivisible visual experience of the world. Surely it would be more sensible to assume, as most scientists throughout the history of visual science have assumed, that we have only one visual system. But of course, what seems obvious is not always correct.
The assumption of a single visual system began to be challenged in the late 1960s and early 1970s. According to one influential account (Schneider, 1967), the more ancient subcortical visual system (in particular, the superior colliculus) enables animals to localise objects whereas the newer cortical visual system allows them to identify those objects. The time was certainly right for such ideas and a number of other related schemes were put forward, coming from a variety of experimental traditions (Trevarthen, 1968; Held, 1970; Ingle, 1973). These ideas were revolutionary at the time and many investigators, including ourselves, were inspired by the notion of two distinct visual pathways, each with a different job description. Indeed, our first collaborative work in the 1970s was an attempt to specify more precisely the role of the superior colliculus in guiding different kinds of motor behaviour in rodents (Goodale & Milner, 1982).
But the eye does not just send input to the superior colliculus and the visual cortex. In fact, messages are sent to at least 10 different target brain areas, each of which appears to be involved in the control of its own separate class of behaviours. For example, whereas the superior colliculus has been shown to be involved in guiding eye and head movements towards potentially important visual events or objects, another subcortical structure, the pretectum, plays a crucial role in guiding animals around potential obstacles as they move around their environment. Indeed, lower vertebrates, from frogs to gerbils, show good evidence for the existence of several independent and parallel visuomotor pathways (Goodale, 1983).
But what about primates?
The breakthrough arose from several studies in the monkey by Leslie Ungerleider and Mortimer Mishkin (1982), which led them to retain the earlier distinction between localisation and identification but to move it entirely into the cerebral cortex. From then on, the distinction between cortical and subcortical visual pathways that had been so much the rage during the previous decade fell out of fashion, and suddenly the phrase Ungerleider and Mishkin used to describe the division of labour between their two cortical systems – ‘what versus where’ – began to fall from every psychology student’s lips.
According to Ungerleider and Mishkin’s scheme, the ventral system, passing from primary visual cortex (V1) to the inferior temporal lobe, is concerned with object identification, while the dorsal system, passing from V1 to the posterior parietal lobe, is charged with object localisation. Thus, these two essential and complementary aspects of visual perception were allocated to separate processing ‘streams’.
Throughout the 1980s and 1990s, the existence of these two streams in the monkey brain was amply confirmed, and several new visual areas belonging to one or the other stream were discovered. Nobody now disputes the existence of the ventral and dorsal visual streams. By the early 1990s, however, we and others began to question the appropriateness of the ‘what versus where’ story in capturing the functional distinction between the two cortical streams.
In December 1990, two apparently unrelated discoveries, one made by us and one made by another research group, suddenly made it clear to us that a new formulation based on a distinction between ‘what versus how’ would do a much better job of characterising the division of labour between the ventral and dorsal streams.
We had two years earlier begun a series of neuropsychological studies of a patient with severe visual agnosia, and indeed by the end of 1990 had written two articles based on this work (Milner et al., 1991; Goodale et al., 1991). Our patient, D.F., closely resembled an earlier patient Mr S., described by Benson and Greenberg (1969) as having ‘visual form agnosia’, and who had been examined in admirable detail by Efron (1968). Mr S. was characterised by a profound problem not just in recognising objects, but more fundamentally in discriminating between even quite simple geometrical shapes, such as rectangles of different aspect ratio but identical surface area. D.F. not only shared this perceptual deficit, but also, just like Mr S., had incurred her disabling brain damage from carbon monoxide poisoning (from a faulty water heater) while taking a shower.
In our two papers, we documented not only what D.F. could not do, using a range of perceptual tasks, but also explored what she could do, which turned out to be far more interesting. Even though she was very poor at describing or demonstrating the orientation of a line or slot, she could still reach out and post a card into the same slot without error. Similarly, despite being unable to report (verbally or manually) the width of a rectangular block, she would still tailor her finger-thumb grip size perfectly in advance of picking it up. In short, she could guide her movements using visual cues of which she seemed completely unaware.
While these two papers were still in press, we came across a dramatic report published by a group of Japanese neurophysiologists led by Hideo Sakata. They were following up the classic work of Mountcastle, who had discovered several different classes of neurons in the monkey’s dorsal stream, each of which was activated when the monkey performed a particular kind of visually-guided act (Mountcastle et al., 1975). Some neurons were active when the monkey reached towards a target, others when it made a saccadic eye movement to the target, and others when it pursued a moving target. Reasonably, Ungerleider and Mishkin had interpreted these results as reflecting the ‘where’ function of the dorsal stream, since the activity of the neurons appeared to be related to where in space the targets were. But another class of neurons that Mountcastle had discovered did not fit so well with their account. These neurons, which responded whenever the monkey grasped a target object with its hand, were not concerned at all with where the object was. And perhaps worse for the ‘what versus where’ account, the work by Sakata and his colleagues showed that these grasp-related neurons did concern themselves with the shape and size of the objects the monkey was grasping. In fact, they found that there were many neurons of this kind, mostly clumped together at the front end of the dorsal stream (Taira et al., 1990). What these neurons had in common with the other neurons in the dorsal stream was not their spatial properties but rather their visuomotor ones.
Putting two and two together, we realised that D.F.’s perceptual problems could have arisen from severe damage to the ventral stream, while her spared visuomotor skills could perhaps be attributed to a functionally spared dorsal stream. Things were beginning to fall into place. We could now also begin to explain the striking deficits seen in another group of patients, those with so-called optic ataxia; these deficits are in many ways the converse of those seen in D.F. The term optic ataxia was coined by the Hungarian neurologist Rudolph Bálint (1909) in his description of a patient with bilateral damage to the parietal lobes. This patient had difficulty pointing towards or grasping objects presented to him visually, even though he had no trouble reaching out and touching parts of his body touched by the examiner. Bálint argued that his patient suffered from a visuomotor rather than a visuospatial impairment, thus foreshadowing the debates that were to emerge much later in the century.
Bálint’s foresight was amply borne out by subsequent research, notably by Marc Jeannerod and his colleagues in the Lyon group (see Jeannerod, 1988, 1997). Optic ataxia patients not only have difficulty making spatially accurate reaches, but also are unable to rotate their wrist or pre-configure their hand posture when reaching out to grasp objects of different orientation or size. At the same time, many of them can still report where those targets are located relative to themselves and what they look like. In other words, these patients, in direct contrast to D.F., had presumably sustained damage to the neuronal hardware in the dorsal stream while still retaining an intact ventral stream.
We set out these ideas in two early theoretical papers (Goodale & Milner, 1992; Milner & Goodale, 1993). Marc Jeannerod and Yves Rossetti independently published closely similar ideas in 1993 (Jeannerod & Rossetti, 1993). We developed the model in much more detail in book form soon afterwards (Milner & Goodale, 1995).
Essentially, we see the ventral stream as supplying suitably abstract representations of the visual world, which can then not only serve to provide our immediate visual experience but also be stored for future reference. By this means the ventral stream enables the brain to create the mental furniture that allows us to think about the world, recognise and interpret subsequent visual inputs, and to plan our actions ‘off-line’.
In contrast, we see the dorsal stream as acting entirely in real time, guiding the programming and unfolding of our actions at the instant we make them, and thus enabling the smooth and effective movements that allowed our primate ancestors to survive in a hostile and unpredictable world. In short, ours is a distinction between vision for perception and vision for action.
In the years since our 1995 book, the field of visual neuroscience has advanced rapidly. Part of the reason for publishing a new book 10 years later, Sight Unseen (Oxford University Press), was to give a retrospective view of our original ideas in the light of these more recent developments. At the same time, we wanted to bring our ideas to a wider audience.
In our own work, we had continued to study D.F. on and off over the intervening years, as well as patients with optic ataxia. Their complementary patterns of impairment and spared ability continued to impress us. One good example concerns the necessary role of the ventral stream in providing our visual memories – in other words its role in bridging a temporal gap, however short. We tested this by briefly presenting a visual object to the subject and then taking it away. A few seconds later the subject was asked to pick up the object as if it were still there. Remarkably, D.F. failed completely in this task. Unlike her behaviour in the normal situation of reaching to grasp a visible object, she now showed no tailoring of her finger-thumb separation at all as she reached out and pretended to pick it up (Goodale et al., 1994). Evidently she had no working memory of the object – not because her memory wasn’t working properly, but because she hadn’t consciously perceived the dimensions of the object in the first place.
This is how my “Asperger” clumsiness occurs: objects within a foot or two of my body are a problem, like a coffee cup next to the computer. I should know where it is, but upon reaching for it, my hand goes too far and knocks it over. Or if I’m reaching for something near it, it’s as if I don’t see the cup and my hand collides with it. This doesn’t happen often, but when it happens, it’s this type of location error. The odd thing is that I have really good aim when throwing an object toward a target. When I “lose” something like keys or my phone, the item is usually right under my nose. It’s as if my attention is rarely applied close to my body, but instead at a distance.
More recently, we have found that optic ataxic patients have exactly the converse problem. As expected, they show no sign of adapting their handgrip to the size of objects in ‘real time’, but when asked to perform the delayed pantomime task, they perform just like healthy subjects (Milner et al., 2001). In fact they even do this when the object is still present at the end of the delay. Evidently, once the healthy ventral stream has a chance to become involved, it tends to dominate their actions even when the patients are subsequently faced with a visible object. We verified this conjecture by secretly switching between different-sized objects during the delay on some trials – the patient’s hand opened according to the size of the previewed (remembered) object rather than the one actually present.
So if we are going to act on the basis of a visual memory, we need to use our ventral stream – the dorsal stream has no visual memory. In fact, this may be the most important job of the ventral stream; it allows us to use vision ‘off-line’, providing a bridge from the past to the present. But although the dorsal stream does not appear to have a visual memory, the visuomotor activities to which it contributes clearly do benefit from experience. In fact most of our visually guided actions are as skilled as they are precisely because of their being well-honed by practice. It seems likely that as our initially slow and painstaking efforts become more automatic, the contribution of the ventral stream retreats and is replaced by more streamlined circuitry involving the dorsal stream, related frontal cortical areas, and subcortical structures in the brainstem and cerebellum.
Seeing inside the head
The greatest advances in cognitive neuroscience over the years since we first published our ideas have come about through the development of functional MRI. This new technology has confirmed that the dorsal and ventral streams really do exist in humans – an implicit assumption that we made in our model. Functional MRI has also allowed us to test our hypothesis about what was no longer working – and what was still working well – in D.F.’s visual brain.
In collaboration with Tom James and Jody Culham at the University of Western Ontario, we started by carrying out an accurate anatomical scan of D.F.’s brain, and then used functional imaging to try to identify which visual areas were still working, and which were not. First we looked for a specific area in the ventral stream that is known to be activated when a person looks at objects, or even at line drawings or pictures of objects. To find this object-recognition area, we contrasted the pattern of brain activity that occurred when subjects looked at line drawings of real objects with the activity that occurred when they looked at scrambled versions of those same pictures. Not surprisingly, the brains of our healthy volunteers showed strong activity for the line drawings in this object-recognition area.
When we looked at D.F.’s brain, however, we could see that not only was this area severely damaged, but the remaining areas in her ventral stream showed no more activity for the line drawings than they did for the scrambled versions (James et al., 2003). In other words, she had lost her shape-recognition system. Just as we had inferred from our original testing many years ago, her brain can process lines and edges at early levels of the visual system, but it cannot put these elements together to form perceived ‘wholes’, due to the damage in her ventral stream.
A completely different story emerged when we looked for brain activation in D.F.’s dorsal stream, which we had hypothesised must be working well. When we looked at activity in her brain during a scanning session in which she was asked to reach out and grasp objects placed in different orientations, we saw lots of activity in the front part of her dorsal stream just as we did in healthy volunteers. This activity presumably reflected the normal operation of human ‘grasp’ neurons.
What you see is not always what you get
The most difficult aspect of our ideas for many people to accept has been the notion that what we are consciously seeing is not what is in direct control of our visually guided actions. The idea seems to fly in the face of common sense. After all, our actions are themselves (usually) voluntary, apparently under the direct control of the will; and the will seems intuitively to be governed by what we consciously experience. So when we claimed that a visual illusion of object size (the Ebbinghaus illusion) did not deceive the hand when people reached out to pick up objects that appeared to be larger or smaller than they really were (Aglioti et al., 1995), vision scientists around the world embarked on a series of experiments to prove that this could not possibly be true.
Of course, our model does not predict that actions are immune to all visual illusions. Some illusions, after all, arise so early in visual processing that they would be expected to affect both perception and action. Indeed, we have shown exactly that (Dyde & Milner, 2002). But our model does predict that some illusions, those that arise deep in the ventral stream, might well not affect visuomotor processing. Whenever this does happen, as it often does, it dramatically illustrates our claim that what we see is not necessarily what is in charge of our actions.
One recent example of this dissociation between perception and action is particularly striking. In collaboration with Richard Gregory, we used the powerful ‘hollow face’ illusion, in which knowledge of what faces look like impels observers to see the inside of a mask as if it were a normal protruding face (Kroliczak et al., 2006). Despite the fact that observers could not resist this compelling illusion, actions that they directed at the face were not fooled. Thus, when they were asked to flick off a small (‘bug-like’) target stuck on the face, they unhesitatingly reached out to the correct point in space (see picture). This striking dissociation between what you see and what you do provides a dramatic demonstration of the simultaneous engagement of two parallel visual systems – each constructing its own version of reality.
Where do we go from here?
Much of our work to date has focused on the differences between the two visual streams – establishing where they go, why they are there, and how they work. This side of the story has depended crucially on evidence from patients who have suffered damage to one or the other stream. But even though studying the visual deficits and spared visual abilities in these patients has told us a great deal about the systems working in isolation, it has told us nothing about how the two systems interact. The big unanswered question for the future is how the two streams work together in all aspects of our visual life.
– Professor Mel Goodale is at the University of Western Ontario.
– Professor David Milner is at the University of Durham.
Weblinks
Mel Goodale homepage: www.psychology.uwo.ca/faculty/goodale
David Milner homepage: tinyurl.com/hmfoj
The visual brain in action: tinyurl.com/evgu4