Human cognitive ability seems unique, but our brains don’t differ that much from those of other mammals. Evolutionary remodeling popped out the roof to add a roomful of neocortex, but it left the knob-and-tube wiring, awkward kitchen, and tiny closets. The original floor plan is intact. So any account of advanced cognition must be consistent with its emerging from brain structures that developed long ago, structures that evolved to perceive the world and move bodies around.
One possibility is that thought is just perception and action, but in attenuated form. This is an old idea. The 19th-century philosopher Alexander Bain argued that thought is “covert” or weak behavior, and William James referenced the view that “[imagination] is only a milder degree of the same process which took place when the thing now imagined was sensibly perceived.” 20th-century behaviorists held a similar position.
Modern neuroscience does seem to indicate that our private mental world - imagery, inner dialogue, plans, concepts - runs on the same old set of sensory and motor hardware. To use another analogy, imagine a brain in thought as a car in neutral gear. The cabin lights turn on, the mirrors work, the engine revs, the brakes pump, but it’s not going anywhere. There’s no extra thinking system; it’s the same apparatus, active in the same way, but disconnected from the physical world.
The best explication of this theory I’ve found comes from neuroscientist Germund Hesslow, and this essay relies on his work to lay out the argument and evidence. And of course I try to tie it back to psychiatry.
Thinking consists of simulated interaction with the environment
A lot of thought is “simulated” behavior, in various tenses. We can mentally rewrite an argument with our spouse, rehearse an athletic move, and play out what might happen if we ask for a raise. More concretely, brain activity when we think about doing something resembles the activity when we actually do that thing. Structures involved in physical movement, like the basal ganglia and cerebellum, also light up when we think and plan. Their output, however, is inhibited by frontal cortex areas, so “a simulated action is a suppressed or unfinished action.”
Thinking also means simulating perception. We hear music in our head, we anticipate the taste of dessert, and we bring familiar images to our mind’s eye. The resolution varies here a lot; my recall of film trivia is excellent, but my recall of film scenes is blurry at best. The important part is that imagining a thing uses the same parts of the brain as perceiving it, but without external stimuli.
These two phenomena link up. Just as actual perception elicits a behavioral reaction that changes our next perception (because we take a step, pick up a coffee mug, speak out loud, etc.), internally created “perceptions” evoke internal “behavior,” which elicits new “perceptions,” and so on. All that’s required is a generic mechanism to associate different brain states in sequence. Instead of physically reacting to every event, the organism can model a response and its likely consequences. This saves time and energy while creating the ability to anticipate.
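To make that loop concrete, here is a toy sketch in Python. It is my own illustration, not a model taken from Hesslow or anyone else: the two lookup tables and every entry in them are invented, and real brains obviously don't store tidy string pairs. The only point is that two associative mappings (perception to behavior, behavior to expected perception) are enough to produce a chain of “thought” once the behavior is prepared but never executed.

```python
# Hypothetical associations, invented purely for illustration.
RESPONSE = {  # a perception -> the behavior it tends to evoke
    "see mug": "reach for mug",
    "feel mug in hand": "lift mug",
    "smell coffee": "sip coffee",
}
CONSEQUENCE = {  # a behavior -> the perception it usually produces
    "reach for mug": "feel mug in hand",
    "lift mug": "smell coffee",
    "sip coffee": "taste coffee",
}


def simulate(perception, max_steps=10):
    """Chain perception -> suppressed behavior -> predicted perception."""
    chain = [perception]
    for _ in range(max_steps):
        behavior = RESPONSE.get(perception)
        if behavior is None:
            break  # nothing is associated with this perception
        # The behavior is prepared but not carried out (output is inhibited).
        chain.append(behavior)
        predicted = CONSEQUENCE.get(behavior)
        if predicted is None:
            break  # no learned consequence for this behavior
        # The predicted perception stands in for real sensory input.
        chain.append(predicted)
        perception = predicted
    return chain


print(" -> ".join(simulate("see mug")))
```

Starting from “see mug,” the chain runs all the way to “taste coffee” without a single muscle moving, which is the anticipation the paragraph above describes.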

Planning and executive function are also forms of behavior. The thought “I’ll make lasagna for dinner,” itself a projected action, stands in for concrete steps like stopping by the store after work, picking out groceries, and choosing a route home. In the kitchen, the headline goal “make lasagna” contains innumerable sub-tasks.

Once this function is available, longer and more nebulous chains of simulation are only a matter of degree. As stated elsewhere, “human thinking can be seen as a form of navigation in a space of abstract concepts. It is like executing movements without activating the muscles.” The capacity for abstraction is inherent: “make lasagna” is already more symbol than action, with concepts like “impress dinner guests” and “become a better cook” just further layers on top.
Cognitive tasks use motor and sensory pathways
When people imagine walking into a room and locating a specific object, the time the imagined search takes tracks the time the real one would take. Mentally rotating a 3D object takes as long as manually rotating one. On fMRI, imagined movements light up premotor cortex and supplementary motor areas, and mentally rehearsing a piano piece uses the same frontal and parietal areas as actually playing it. As other authors conclude, “motor imagery corresponds to a subliminal activation of the motor system.” However, the intensity is weaker than in overt movement, consistent with these pathways simultaneously being inhibited.
The same pattern holds for perception. Imagining flashes of light activates the visual cortex, imagining touch activates somatosensory cortex, and imagining sound activates auditory cortex. Visualizing a face even activates the cortical fusiform face area, just like seeing one in real life. (Further review of all this here.) Even better, recalling a specific scene leads to eye movements that track the location of objects in the scene; the “mind’s eye” moves your real eyes in parallel.
Some implications
The existence of internal monologue falls out neatly here. Prepared speech from our expressive language area goes straight to receptive language and auditory processing regions, so we “hear” our own thoughts. The important point is that internal speech, a primary means of thinking, is just suppressed speech. This makes sense considering how hard it is to avoid voicing certain thoughts, and how people with diminished frontal lobes lack a filter.

Simulation theory also recasts the nature of autobiographical memory. Computer metaphors treat memory as discrete information stored away in a hard drive of sorts. But this hard drive doesn’t seem to exist, and people with amnesia from hippocampal damage also lack the ability to imagine new experiences. Long-term memory can instead be seen as simulated perception activated by pointers from other brain regions. Recall and imagination are the same “program,” which certainly makes sense of the former’s mutability.
Conclusion
This theory has a lot of nice features. For one, it avoids the need for evolutionary leaps or novel structures to explain human cognitive ability. Thinking is performed by the same systems as action and perception - thinking is just inhibited action and perception. For another, it fits really well with the idea that emotional feelings are simulated body states. In addition, it complements other connectionist paradigms like Perceptual Control Theory and Predictive Coding. Like those, simulation theory dispenses with philosophical constructs like mental objects or internal representations. It has an elegance.
As a model of brain function, it should also have something to say about brain dysfunction. And I think it does: psychosis is when the simulation breaks down. If thought is a form of behavior, it makes more sense why delusions tend to go with disorganized speech and bizarre behaviors. If hearing voices is a form of inner speech, even better. The overlap of psychosis and the abnormal movements of catatonia is also easier to understand. This is just one theory, but psychiatry has been stuck for decades in an empiricist-reductionist crouch, and new interpretations are the only way forward.
Really interesting work.
Definitely raises questions about the strength of the simulation: do people with aphantasia have a more suppressive system? Do people with vivid imaginations have weaker suppression systems?
Is the suppression factor constant for all behaviours/actions of the brain, or can you be more suppressive in one domain than another?
Can we modify this suppressive factor?
Is the suppressive factor a feature of the neural architecture or an extra isolated feature (i.e. paint on the walls vs a new room)? I'd argue it's more the former than the latter, since we would probably have observed the latter in fMRI scans, but it raises the question of how something like that evolved so uniformly given our similarity to non-conscious organisms.
Woah. Must read.