Surfing Uncertainty: Prediction, Action, and the Embodied Mind

Andy Clark's new book advances a sweeping vision of the mind as geared most fundamentally towards prediction error minimization. When we interact with the world, we have expectations regarding what sensory input we will receive. The expectations differ to a greater or lesser extent from actual sensory input. The mind strives to minimize prediction error, bringing expected sensory input into better alignment with actual sensory input. This picture informs a class of Predictive Processing (PP) models that have attracted much attention within psychology and neuroscience. Clark wants us to know about these models and to appreciate their implications. He applies the PP framework to perception, motor control, imagination, social cognition, and other mental phenomena. He also discusses how it bears upon central debates about mind (e.g. the explanatory status of mental representation, the embodied mind, direct perception).

Clark's past writings have repeatedly helped set the agenda for philosophy of cognitive science, and this book promises to do so as well. Taken together with Jakob Hohwy's The Predictive Mind (2013), which likewise explores the significance of prediction error minimization, Clark's discussion will surely ignite great philosophical interest in PP modeling. Striking attributes that have garnered Clark's previous work so much acclaim are equally evident here: utter mastery of the relevant scientific theories; a genius for explaining those theories in exciting and perspicuous terms, often with a memorable analogy or anecdote; and prose that frequently approaches poetry in eloquence and rhetorical flair. All philosophers concerned with how the mind works will want to read this brilliantly written book so as to evaluate Clark's case for the PP framework. For my own part, I am not convinced that the framework is as promising as Clark suggests, despite the enormous skill with which he defends it. I find that the PP models Clark emphasizes are often speculative and unsupported by compelling evidence. In some key cases, they conflict with current scientific knowledge. A book review cannot address all the areas discussed by Clark, so I will focus on the two mental phenomena that he most emphasizes: perception and motor control.

Bayesian modeling of perception

The perceptual system transits from proximal sensory inputs to perceptual states that represent the distal environment as being a certain way. How does it do this? For example, how does it estimate an object's shape based upon retinal stimulations caused by the object? Helmholtz (1867) proposed that the perceptual system estimates environmental conditions through an "unconscious inference." Recently, perceptual psychologists have developed Helmholtz's suggestion by modeling perception as unconscious Bayesian inference. On a Bayesian approach, the perceptual system maintains prior probabilities regarding the distal environment (e.g. certain distal shapes are deemed likelier than others) and prior likelihoods that relate the distal environment to sensory input (e.g. certain retinal inputs are deemed likelier given certain distal shapes and certain lighting conditions). The perceptual system deploys these priors to transit from sensory input to a posterior probability (e.g. the posterior may assign high probability to the perceived object having a convex shape). Based on the posterior, the perceptual system chooses a privileged estimate of distal conditions. For an overview of Bayesian perceptual psychology, with citations to the scientific literature, see Rescorla (2015).
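
The Bayesian scheme just described can be illustrated with a toy calculation. What follows is only a sketch: the two shape hypotheses, the numerical prior, and the likelihood values are invented for illustration and are not taken from any model in the literature. Selecting the most probable hypothesis as the privileged estimate is likewise just one common option among several.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a finite set of hypotheses.

    prior[h]      -- prior probability of distal hypothesis h
    likelihood[h] -- probability of the observed sensory input given h
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # probability of the input itself
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical numbers: convex shapes are deemed likelier a priori,
# and the retinal input is somewhat likelier under the convex hypothesis.
prior = {"convex": 0.7, "concave": 0.3}
likelihood = {"convex": 0.6, "concave": 0.4}

post = posterior(prior, likelihood)
estimate = max(post, key=post.get)  # maximum a posteriori estimate
print(post, estimate)
```

With these made-up numbers the posterior favors the convex hypothesis even more strongly than the prior did, because the sensory input is also likelier under that hypothesis.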

Bayesian perceptual models are highly idealized. In most cases, calculating the posterior with exact precision is a computationally intractable task. The brain cannot literally implement a typical Bayesian model. At best, it can approximately implement the model. Accordingly, researchers have pursued neurophysiologically plausible models that approximately implement idealized Bayesian inference. See Sanborn (forthcoming) for a helpful survey.

PP models (more typically called predictive coding models) form one strand in this literature. The basic idea is that the brain uses priors to make predictions about sensory input, comparing the predictions with actual sensory input to compute prediction error. Predictions are then changed so as to minimize prediction error. The formal models that concern Clark have hierarchical structure. Higher levels pass predictions down to lower levels, and lower levels pass prediction errors back up to higher levels. For example, we might employ a three-level model where the highest level encodes the category to which an object belongs, the next level down encodes salient observable features of the object, and the lowest level encodes proximal sensory input caused by the object. If we construct PP models in the right way, then they approximately implement Bayesian inference.
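
To see how error minimization can approximate Bayesian inference in the simplest case, consider a deliberately stripped-down sketch. It assumes a single level rather than a hierarchy, a Gaussian prior, a Gaussian likelihood, and an identity mapping from the hidden cause to the sensory input. All function names, parameter values, and the step size are illustrative choices, not drawn from Clark or from any published PP model. Under those assumptions, repeatedly nudging the estimate so as to reduce precision-weighted prediction error converges on the exact Bayesian posterior mean.

```python
def predictive_coding_estimate(u, prior_mean, prior_var, sensory_var,
                               step=0.1, n_steps=200):
    """Iteratively revise an estimate to reduce prediction error.

    u is the sensory input. The sensory prediction is the estimate itself
    (identity mapping, an assumption of this toy model).
    """
    phi = prior_mean  # start from what the prior expects
    for _ in range(n_steps):
        sensory_error = (u - phi) / sensory_var       # input vs. prediction
        prior_error = (phi - prior_mean) / prior_var  # estimate vs. prior
        phi += step * (sensory_error - prior_error)
    return phi


def exact_posterior_mean(u, prior_mean, prior_var, sensory_var):
    """Closed-form posterior mean for the same Gaussian model."""
    w_prior, w_sense = 1.0 / prior_var, 1.0 / sensory_var
    return (w_prior * prior_mean + w_sense * u) / (w_prior + w_sense)


approx = predictive_coding_estimate(u=2.0, prior_mean=0.0,
                                    prior_var=4.0, sensory_var=1.0)
exact = exact_posterior_mean(u=2.0, prior_mean=0.0,
                             prior_var=4.0, sensory_var=1.0)
print(approx, exact)
```

In this linear Gaussian case the iterative estimate and the closed-form answer agree. With nonlinear mappings or several levels, one would expect such a scheme to deliver only an approximation.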

I have seen some philosophers equate Bayesian modeling with PP modeling. This is a serious mistake. The PP framework offers one way that the brain might approximately implement Bayesian inference. Other approximation schemes are possible and under active scientific investigation. Furthermore, most models offered within Bayesian perceptual psychology do not presuppose any particular approximation scheme. They certainly do not assume that prediction error plays an explicit computational role in perceptual processing. Thus, evidence for Bayesian perceptual psychology is not in itself evidence for the PP framework. Clark acknowledges these points (pp. 27-28, pp. 298-299, pp. 302-303). In practice, though, he tends to move rather freely between evidence for Bayesian perceptual psychology and evidence for PP modeling, sometimes engendering a spurious impression of support for the latter.

How strong is the evidence for a predictive coding account of perception, as opposed to a generic Bayesian account? Bayesian perceptual psychology offers detailed models that explain numerous perceptual phenomena, including various illusions and constancies. Many of the models are well-confirmed, at least as idealized models to which the perceptual system approximates. Thus, I believe that current evidence supports a generic Bayesian approach to perception. Evidence for the PP approach seems much weaker. To date, predictive coding has generated comparatively few successful models of the perceptual phenomena that constitute the core explananda for perceptual psychology, such as illusions and constancies. Moreover, some widely discussed PP models strike me as less successful than non-PP Bayesian alternatives. For example, Clark (pp. 33-37) emphasizes a PP model of binocular rivalry offered by Hohwy, Roepstorff, and Friston (2008), but this model is arguably much less rigorous and explanatorily powerful than the non-PP model given by Gershman, Vul, and Tenenbaum (2012).

In several passages, Clark tacitly concedes that current evidence for PP perceptual modeling is fairly inconclusive (p. 10, pp. 298-299). This made me wonder why he decided to feature PP models so prominently in his discussion. In my opinion, Clark's focus upon PP modeling as opposed to generic Bayesian modeling is problematic for two reasons. First, it obscures how much explanatory power one already gains from generic Bayesian modeling of perception, absent any particular theory of implementation mechanisms. Second, it deflects attention from promising non-PP theories of implementation mechanisms currently on the market. The end result is that Clark's discussion overinflates how much explanatory value prediction error minimization contributes to current scientific theorizing about perception.

Optimal feedback control versus active inference

Let us now consider motor control, i.e. the ability to control one's own body. I claim that Clark's PP analysis of motor control conflicts with current scientific knowledge.

It is intuitively evident that we have goals and that we often achieve these goals by moving our bodies. Researchers seek to explain how the motor system selects motor commands that promote achievement of one's goals. Converting a goal into an appropriate sequence of motor commands is a non-trivial undertaking, due partly to a redundancy problem highlighted by Bernstein (1967). There are innumerable ways to achieve a given goal. For example, if my goal is to push an elevator button with my finger, then there are many paths my finger might follow to the button, many trajectories along each path, and many patterns of muscle activation that would produce a given trajectory. To achieve a goal, the motor system must select from among many possible sequences of motor commands. How does it do so?

The most compelling answer is given by a framework called optimal feedback control (OFC), introduced by Todorov and Jordan (2002). According to OFC, the motor system engages in unconscious inference and decision-making. It uses Bayesian inference to estimate environmental conditions. On that basis, it picks "optimal" motor commands for promoting the overall goal. "Optimality" is quantified by reference to a cost function. The cost function rewards achievement of the goal. It also reflects task-independent constraints, such as efficiency of energetic expenditure. When charged with achieving a goal, the motor system picks motor commands that minimize expected costs. Thus, the motor system solves Bernstein's redundancy problem by implementing (or approximately implementing) expected cost minimization. For a philosophical introduction to OFC, see Rescorla (2016).

OFC has proved explanatorily successful. Perhaps most notably, it explains a fundamental observation made by Bernstein: an agent who repeatedly performs some task will exhibit more trial-by-trial variation along task-irrelevant dimensions than task-relevant dimensions. For example, suppose the task is to move your hand through a sequence of widely spaced targets arranged in a plane. In such a case, hand position across trials varies much more as your hand passes between targets than as your hand passes near targets. The asymmetry between task-relevant and task-irrelevant variation follows naturally from OFC. Noise constantly perturbs motor execution, but from an OFC perspective the motor system should only correct task-relevant perturbations -- correcting a task-irrelevant perturbation is a useless expenditure of energy. Since perturbations constantly arise but only task-relevant perturbations are corrected, the result is much more variation along task-irrelevant dimensions.
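
The asymmetry can be reproduced in a toy simulation. The sketch below is not the Todorov and Jordan model. It assumes a two-dimensional state whose sum must equal a target (so the sum is task-relevant and the difference task-irrelevant), additive Gaussian noise, and a myopic controller that minimizes expected cost one step ahead rather than over the whole movement. The names, the effort weight, and the noise level are all illustrative assumptions.

```python
import random
import statistics


def ofc_command(x1, x2, target, effort_weight):
    """One-step command minimizing (x1 + x2 + u1 + u2 - target)**2
    + effort_weight * (u1**2 + u2**2).

    The minimizer splits the correction equally and depends only on the
    task-relevant error; deviations in x1 - x2 are left uncorrected.
    """
    error = x1 + x2 - target
    u = -error / (2.0 + effort_weight)
    return u, u


def simulate(n_trials=500, n_steps=50, target=1.0, effort_weight=0.1,
             noise_sd=0.05, seed=0):
    """Return across-trial variance of the final state along the
    task-relevant (sum) and task-irrelevant (difference) dimensions."""
    rng = random.Random(seed)
    sums, diffs = [], []
    for _ in range(n_trials):
        x1 = x2 = target / 2.0
        for _ in range(n_steps):
            u1, u2 = ofc_command(x1, x2, target, effort_weight)
            x1 += u1 + rng.gauss(0.0, noise_sd)  # noise perturbs execution
            x2 += u2 + rng.gauss(0.0, noise_sd)
        sums.append(x1 + x2)
        diffs.append(x1 - x2)
    return statistics.variance(sums), statistics.variance(diffs)


relevant_var, irrelevant_var = simulate()
print(relevant_var, irrelevant_var)
```

Because the cost penalizes effort, the controller never spends energy on the difference between the two coordinates, so noise accumulates along that dimension while the sum stays close to the target.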

Clark recognizes the appeal of OFC (pp. 117-120). Nevertheless, like Hohwy, he favors an alternative approach espoused by Friston (2011) called active inference. Friston extends the PP framework from perception to action. He proposes that motor control seeks to minimize sensory prediction error (especially proprioceptive prediction error). As Friston notes, there are two ways to reduce prediction error: change the predictions, or change the incoming sensory signal. Friston holds that perception follows the first strategy while motor control follows the second. The motor system expects motor organs to follow some trajectory, and it expects certain sensory consequences to result from that trajectory. If motor organs deviate from the expected trajectory, then sensory prediction error occurs. Motor reflexes are engaged to suppress prediction error, steering motor organs towards the expected trajectory.

How do desires, intentions, and other conative mental states fit into the active inference scheme? Some passages in Friston's writing suggest an eliminativist stance (e.g. Friston, Mattout, and Kilner, 2011, p. 157). On the eliminativist view, there are no conative mental states. There are only belief-like states, including priors. You do not have any goals at all. You simply expect that you will act a certain way. Eliminativism is a very radical view, and Clark does not endorse it. He appears to favor an alternative reductionist view: conative mental states are to be reduced to expectations, priors, and other belief-like states (p. 125, pp. 129-130). Certain passages in Friston's writings also suggest this reductionist view. On the reductionist view, you do have goals, but having a goal is just a matter of having certain expectations about how you will act. Clark provides no hint as to how the envisaged reduction might proceed.

My main reservation about Clark's position does not turn upon this issue. My main reservation is that active inference is much less explanatory than OFC. OFC invokes expected cost minimization to describe how the motor system converts a goal into a particular bodily trajectory. It thereby explains how the motor system selects a particular bodily trajectory. In contrast, active inference invokes prior expectations regarding bodily trajectory. As Adams, Shipp, and Friston (2013, p. 617) put it: "In active inference schemes, the cost functions are replaced by prior beliefs about desired trajectories in extrinsic frames of reference, which emerge naturally during hierarchical perceptual inference." You move your body a certain way because your motor system expects you to move that way! But why does your motor system expect you to move a certain way? Why does it expect one particular bodily trajectory as opposed to the many other bodily trajectories that would have accomplished the same goal? Friston never explains in any serious way how this is supposed to work. He never explains in any detail how "hierarchical perceptual inference" yields prior expectations regarding bodily trajectories. Instead, his models simply assume a prior over bodily trajectories. Expected bodily trajectory is treated as an exogenous given, rather than an endogenous variable to be explained. By assuming an expected bodily trajectory, these models presuppose that the motor system has already solved a huge portion of Bernstein's redundancy problem. They assume a solution to the very problem that drives scientific research in this area.[1]

Clark acknowledges such worries. He responds:

The PP story shifts much of the burden onto the acquisition of those prior 'beliefs' . . . The PP bet is, in effect, that this is a worthwhile trade-off since PP describes a biologically plausible architecture maximally suited to installing and subsequently tuning the requisite suites of generative-model based prediction (p. 132).

In my opinion, Clark's response elides the vast explanatory differential between OFC and active inference. OFC has produced explanatorily successful models of specific motor tasks. The models describe in considerable detail how the motor system transits from the goal (encoded by the cost function) to a particular bodily trajectory. Active inference has not yet produced comparably successful models. Proponents have not supplied detailed models describing how the motor system arrives at its supposed expectations regarding bodily trajectory. At present, active inference has demonstrated no ability to generate well-confirmed, explanatorily fruitful models of specific motor tasks.

Perhaps such models will eventually emerge. But there are grounds for doubt. Recall the asymmetry between task-relevant variation and task-irrelevant variation. From an active inference viewpoint, it is unclear why this asymmetry prevails. If the motor system seeks to eliminate proprioceptive prediction errors, then one would expect the motor system to correct all deviations from the expected bodily trajectory, including task-irrelevant deviations. So one would expect comparable variation along task-relevant dimensions and task-irrelevant dimensions. Clark does not say how he envisages handling the task-relevant/task-irrelevant asymmetry within the active inference framework. Like Friston and Hohwy, he simply ignores the issue.
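
The worry can be made concrete with a toy comparison, offered only as a caricature and not as a rendering of any model published by Friston or his collaborators. The sketch assumes a two-dimensional state whose sum must equal a target, additive Gaussian noise, and a controller that pulls each coordinate back toward a fixed expected value, thereby correcting every deviation from the expected trajectory. The gain, the noise level, and all names are illustrative assumptions.

```python
import random
import statistics


def tracking_command(x, expected, gain):
    """Correct any deviation from the expected value, whether or not the
    deviation matters for the task."""
    return -gain * (x - expected)


def simulate_tracking(n_trials=500, n_steps=50, target=1.0, gain=0.9,
                      noise_sd=0.05, seed=0):
    """Return across-trial variance of the final state along the
    task-relevant (sum) and task-irrelevant (difference) dimensions."""
    rng = random.Random(seed)
    expected = target / 2.0  # expected trajectory holds each coordinate here
    sums, diffs = [], []
    for _ in range(n_trials):
        x1 = x2 = expected
        for _ in range(n_steps):
            x1 += tracking_command(x1, expected, gain) + rng.gauss(0.0, noise_sd)
            x2 += tracking_command(x2, expected, gain) + rng.gauss(0.0, noise_sd)
        sums.append(x1 + x2)   # task-relevant dimension
        diffs.append(x1 - x2)  # task-irrelevant dimension
    return statistics.variance(sums), statistics.variance(diffs)


relevant_var, irrelevant_var = simulate_tracking()
ratio = irrelevant_var / relevant_var
print(relevant_var, irrelevant_var, ratio)
```

A scheme with this structure yields comparable variance along both dimensions. Whether actual active inference models can avoid that result is the question raised in the preceding paragraph.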

Conclusion

The PP framework is an intriguing conjecture about how the brain might approximately implement Bayesian inference. However, Clark exaggerates its merits as applied to perception, and he overstates its worth as a rival to the orthodox OFC analysis of motor control. Clark's rousing advocacy notwithstanding, it is premature to elevate prediction error minimization into an overarching principle for how the mind works.

Acknowledgements

I presented some of this material at the conference "Is the Brain Bayesian?", hosted by the NYU Center for Mind, Brain, and Consciousness. Thanks to participants for their feedback, especially Ned Block, Rosa Cao, Karl Friston, and Steven Gross. Thanks also to an anonymous referee for helpful suggestions.

References

Adams, R., Shipp, S., and Friston, K. 2013. "Predictions, not Commands: Active Inference in the Motor System." Brain Structure and Function 218: 611-643.

Bernstein, N. 1967. The Coordination and Regulation of Movements. Pergamon.

Friston, K. 2011. "What is Optimal about Motor Control?" Neuron 72: 488-498.

Friston, K., Mattout, J., and Kilner, J. 2011. "Action Understanding and Active Inference." Biological Cybernetics 104: 137-160.

Gershman, S., Vul, E., and Tenenbaum, J. 2012. "Multistability and Perceptual Inference." Neural Computation 24: 1-24.

Helmholtz, H. von. 1867. Handbuch der Physiologischen Optik. Leipzig: Voss.

Hohwy, J. 2013. The Predictive Mind. Oxford: Oxford University Press.

Hohwy, J., Roepstorff, A., and Friston, K. 2008. "Predictive Coding Explains Binocular Rivalry: An Epistemological Review." Cognition 108: 687-701.

Rescorla, M. 2015. "Bayesian Perceptual Psychology." The Oxford Handbook of the Philosophy of Perception, ed. M. Matthen. Oxford University Press.

---. 2016. "Bayesian Sensorimotor Psychology." Mind and Language 31: 3-36.

Sanborn, A. Forthcoming. "Types of Approximation for Probabilistic Cognition: Sampling and Variational." Brain and Cognition.

Todorov, E., and Jordan, M. 2002. "Optimal Feedback Control as a Theory of Motor Coordination." Nature Neuroscience 5: 1226-1235.

[1] My argumentation in this paragraph is heavily influenced by unpublished work by Emanuel Todorov and Tom Erez, who argue along similar lines.