I saw a dog

—as I walked down a quiet side-street in Cambridge, not far from Central Square. I was glued to my phone and couldn’t make out many details without looking up, but I could see that it was middle-sized and black, facing me and angled to the north-east.

I could tell this was a dog not only from its shape, but also from that primitive thwang that dogs trigger in my bones. I’m not afraid of dogs – I’ve spent most of my life around them – but I’m still wary around arbitrary canines on the street, leashed or not.

I felt that thwang as I registered the dog’s basic features. Black, medium size – maybe a black labrador. I raised my head, ready to step out of the way, smile at the owner, follow the basic program. But there was no dog in front of me.

What was in front of me was not a black labrador, but a commuter bike locked to a slightly oblique street sign. The bike had a thin black seat and narrow road tires, with a rusty pannier rack framing its back wheel. Its handlebars – drop bars, taped black – were angled away from me. No recognizable dog-features in sight, let alone an actual dog.

How could my own experience of the world be so wrong?

Am I pathological? I don’t think so. I’ve been noticing more of these experiences over the past few months. Sights, sounds, and sensations occasionally reveal themselves to be little fibs: reasonable, but ultimately inaccurate, pictures of what is actually out there in the real world.

There are at least two pictures of perception that such fib-experiences might suggest. Both views suggest that sensations and beliefs combine to produce our visual experience—this much is uncontroversial. They differ, though, on how much credit is assigned to each of those sources.

In one picture, my brain takes in an abundant amount of detail about the visual world at all times. On top of that abundant stream of information, some higher-level system sprinkles on the conceptual details: that cube is a cardboard box, that wiggling object is dangerous, and so on. Serious mistakes in those higher-level attributions – like the dog-percept presented above – can temporarily paint over my sensory inputs and cause me to see things that aren’t there.

A second picture suggests that the sensory information reaching my brain at any moment is actually quite sparse. On top of this sparse stream, most of the work of perception is performed at higher levels, with the mind “filling in” all of the gaps in my sensory data on the basis of beliefs and expectations. In this view, it’s not that the mind overwrites sensory data that is already there — rather, the mind is continuously tasked with filling in perceptual information not present in the original sensory data.

To further develop these two pictures, I’ll turn to some details on the human eye.

The human retina contains two major types of light-sensitive cells:¹ rods and cones. Rods are responsible for vision in low light, and are not sensitive to color. Cones function well only in high-light situations, and uniquely support color vision.

It turns out that these two types of cells are distributed unequally in the retina. Cones cluster around the area of the retina which maps to the center of our visual field (the fovea), while rods dominate everywhere else.

Spatial distribution of rods and cones in the human retina. From Cmglee on Wikipedia.

This spatial distribution suggests that, at any moment, most of the color information my retina receives comes from the very center of my visual field.²

This is one case, then, in which the brain seems to receive rather sparse sensory information. That’s puzzling, because it doesn’t seem to map onto my experience. I certainly don’t think that my color vision is limited to the very center of my visual field—I really hope yours isn’t, either.³

How is it that I perceive the world as fully colored, if my sensory machinery cannot possibly yield such an image? If that underlying hardware is yielding only a sparse picture of the real world, why does color feel so abundant in my visual experience?

Balas & Sinha (2007) present a simple experiment which will help us better draw out this sparse view. Their results offer behavioral evidence that some higher-level mental module actively fills in color information, turning what is originally a rather sparse sensory representation into an abundant visual experience.

(Unfortunately, the paper is not open-access, and the publisher demands hundreds of dollars for the rights to reproduce the figures. So I’ll do my best to summarize the procedure and results here.)

The authors prepared modified images of natural scenes like the one in the figure below. They took full-color images and imposed a randomly sized color mask, such that a circle in the center of the image remained in color while the rest of the image appeared in grayscale.

Partially-colored chimera image like those used in Balas & Sinha (2007).
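For the curious, the masking step described above can be sketched in a few lines of Python. This is only my own illustration, not the authors' code. The names `to_gray` and `make_chimera` are made up, the image is a plain list of rows of (r, g, b) tuples, the radius is measured in pixels rather than degrees of visual angle, and the grayscale conversion uses the common Rec. 601 luma weights, which the paper may or may not have used.

```python
def to_gray(pixel):
    """Convert an (r, g, b) pixel to gray using Rec. 601 luma weights (an assumption)."""
    r, g, b = pixel
    y = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (y, y, y)


def make_chimera(image, radius):
    """Keep color inside a centered circle of `radius` pixels; gray everywhere else.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(image), len(image[0])
    # Geometric center of the pixel grid.
    cy, cx = (height - 1) / 2, (width - 1) / 2
    chimera = []
    for y, row in enumerate(image):
        new_row = []
        for x, pixel in enumerate(row):
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            new_row.append(pixel if inside else to_gray(pixel))
        chimera.append(new_row)
    return chimera


# A tiny 5x5 all-red "image" standing in for a natural scene.
red = (255, 0, 0)
image = [[red] * 5 for _ in range(5)]
chimera = make_chimera(image, radius=1)
```

In this toy example the center pixel and its four direct neighbors stay red, while every other pixel is replaced by its gray equivalent. Drawing `radius` at random for each image would mimic the randomly sized mask.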

These “chimera” images were rapidly presented to adult subjects, and the subjects were asked after each presentation to report whether the image they had just seen was in grayscale, full color, or a mix of the two.

The crucial metric of interest here is the rate of color “false alarms”: how often subjects perceive an image with only its center in color as a fully colored picture. These false alarms would be evidence of the brain filling in color percepts.
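To make the metric concrete, here is a minimal sketch of how such a false alarm rate could be computed. The trial format, the labels, and the function name `false_alarm_rate` are my own assumptions, and the data is invented for illustration. None of it comes from the paper.

```python
def false_alarm_rate(trials):
    """Fraction of chimera trials on which the subject answered 'color'.

    `trials` is a list of (stimulus, response) pairs, where stimulus is
    'gray', 'color', or 'chimera' and response is 'gray', 'color', or 'mixed'.
    """
    chimera_responses = [resp for stim, resp in trials if stim == "chimera"]
    if not chimera_responses:
        raise ValueError("no chimera trials to score")
    return sum(resp == "color" for resp in chimera_responses) / len(chimera_responses)


# Made-up responses, for illustration only.
trials = [
    ("chimera", "color"),
    ("chimera", "mixed"),
    ("color", "color"),
    ("gray", "gray"),
    ("chimera", "color"),
    ("chimera", "mixed"),
    ("chimera", "mixed"),
]
rate = false_alarm_rate(trials)
```

Only the chimera trials count toward the rate. The fully colored and fully gray trials are there to keep subjects honest, and they are filtered out before scoring.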

What would we expect to find? We know that the actual sensory data is rather sparse — recall that the majority of color-sensitive photoreceptors cluster in the fovea, in the center of the visual field. We might guess, then, that if the region of the image perceived by this color-sensitive area is appropriately colored, then the brain would serve to fill in the rest of the percept.

This is what Balas & Sinha find in their main result: even when nontrivial portions of the image are presented in grayscale, people are likely to report that the entire image is in color. For example, when the color mask covers 17.5 degrees of the visual field, subjects report that the entire image is colored almost 40% of the time. These false alarm rates reach 60% as the size of the color mask increases.

There’s much more to the paper: the authors present further experiments attempting to work out the source of the information used to fill in color. For our purposes, though, this headline result is already interesting.

We have evidence from both directions for the sparse view, then:

At the neural level we can see that the hardware to support color vision is clustered around a small central area of the retina, yielding rather sparse information about color in the rest of the visual field.
At the behavioral level we see that people often perceive these partially-colored images as fully colored.

It seems, then, that higher-level mechanisms are doing quite a bit of work in the brain to “fill in” the sparse information which manages to filter through the retina.

Why am I writing about this? I think the “filling in” metaphor is a useful tool for the mental toolbox.⁴ While this sort of phenomenon shows up again and again in psychology, I feel like I’ve only just begun to internalize it — to start to actually see footprints of the process in my own experience.

This shift is likely due only in small part to my intellectual understanding of the process. It’s more likely, I think, that regular meditation and introspection are what is actually helping me see my own experience more clearly.

In any case, it’s quite the thrilling ride. I am catching my mind for the regular fibber that it is, as it paints pretty pictures over messy and sparse arrays of input from my sensory hardware. Happy hallucinating!

¹ Fun fact. There is actually a third type with quite a long name: intrinsically photosensitive retinal ganglion cells. These cells (a ~1% minority in the retina) help regulate circadian rhythms and contribute to melatonin production/suppression. They were first hypothesized after scientists discovered that supposedly blind mice were still able to respond to changes in their visual environment.
² This is not exactly correct, of course. We rapidly and subconsciously microsaccade, even when we feel we are fixating our eyes on one position in our visual field. It’s possible that these microsaccades function in part to gather information about colors and forms in our periphery. I don’t pretend to cover all my bases as a vision scientist here – I only hope to get the broad strokes of this argument correct.
³ I also don’t think that my peripheral vision is especially acute in low-light conditions.
⁴ Psychologists and cognitive scientists might be reminded of terms like “top-down processing” and “predictive processing.” I’m not sure this metaphor adds anything on top of those, but it does sound quite a bit more intuitive. Anyway, the point of this post is to share some fun facts and ideas, not to present a novel metaphor.