The Innocent Eye: Why Vision is not a Cognitive Process

Nico Orlandi's book is a notable contribution to the burgeoning philosophical literature on perception. Orlandi advances a novel theory of vision, centered on the crucial role that the external environment plays in shaping visual activity. She defends her theory with an impressive array of empirical and theoretical considerations.

The Embedded View

Perceptual states represent the distal environment as standing a certain way. For example, perception represents objects as having certain shapes, sizes, colors, and locations. Helmholtz (1867) proposed that the perceptual system executes an "unconscious inference" from proximal sensory stimulations (e.g., retinal stimulations) to perceptual representations. The inference deploys "implicit assumptions" concerning the distal environment or the relation between environment and perceiver. To illustrate, suppose that a pattern of retinal stimulation is compatible with two conflicting hypotheses: light comes from overhead and the perceived object is convex; or light comes from below and the perceived object is concave. Despite this ambiguity, humans normally perceive the object as convex. On Helmholtz's approach, the perceptual system executes an unconscious inference based upon an "implicit assumption" that light comes from overhead, yielding a perceptual estimate that the object is convex.^[1]

Helmholtz's approach has long been orthodox within perceptual psychology, where it is often called "constructivism." Orlandi calls it "cognitivism," because it attributes cognition-like activity to the perceptual system. She advances a rival Embedded View (EV), which places explanatory weight on environmental conditions rather than unconscious inferences. EV holds that the visual system is wired to reflect certain environmental regularities without representing or encoding those regularities. By analogy, an optical smoke detector is wired to reflect the fact that smoke particles deflect light, but we gain no explanatory value by saying that the smoke detector represents or encodes this fact. Similarly, the visual system is wired to reflect the environmental regularity that light comes from overhead, but (according to EV) we gain no explanatory value by saying that the visual system represents or encodes this environmental regularity. Instead, we should cite the environmental regularity itself to explain why the visual system is wired to transit a certain way from retinal stimulations to percepts.

In some respects, EV resembles Gibson's direct perception framework (1979). Like Gibson, Orlandi denies that perception involves cognition-like mental activity. Like Gibson, she denies that representational mental states mediate between proximal sensory stimulations and visual percepts. However, Orlandi disagrees with Gibson on certain key points. Most importantly, she differs from Gibson by vigorously insisting that visual percepts are representational mental states. She thereby also rejects the relationalist viewpoint espoused by Martin (2004), Travis (2004), and others. I think that Orlandi chooses wisely by giving short shrift to direct perception, relationalism, and other views that deny the representational character of perceptual states.

Chapter 1 introduces the dialectic between constructivism and EV. Chapter 2 argues that vision science supports EV over constructivism. Orlandi places particular emphasis upon Natural Scene Statistics, a scientific movement that seeks to illuminate perception by delineating statistical regularities about environmental conditions. Chapter 3 is a nuanced analysis of the distinctive value that mental representation contributes to psychological explanation. Based on her analysis, Orlandi concludes that we lack warrant to postulate representational intermediaries between retinal stimulations and percepts. Chapter 4 argues that EV can explain various perceptual phenomena, including certain well-known illusions. Chapter 5 argues that EV is compatible with a suitably modest form of the view that mental processes are computational.

Many scientists and philosophers have criticized constructivism. This book is among the most scientifically informed and philosophically sophisticated critiques. Orlandi's rich, provocative discussion merits careful study. Opponents of constructivism will find many new arguments to fuel their opposition. Constructivists must consider how best to answer those arguments. Particularly salutary is Orlandi's emphasis upon the myriad ways that our highly structured environment shapes the visual system's operations. Burge (2010) sounds the same theme, but Orlandi develops it in distinctive and interesting ways. Her focus on the embedding environment is a welcome counterbalance to the many contemporary philosophical discussions that instead prioritize phenomenological, environment-independent aspects of perception.

Still, I was not convinced that EV is a compelling competitor to constructivism. I will now detail my main reservations regarding Orlandi's position and her arguments.

Bayesian models of perception

Beginning in the 1990s, perceptual psychologists have developed constructivism using Bayesian decision theory. The basic idea is that the perceptual system executes an unconscious statistical inference from proximal sensory stimulations to percepts. The perceptual system assigns subjective probabilities to hypotheses regarding the distal environment (e.g., hypotheses regarding the shapes, sizes, and colors of perceived objects), and it updates those subjective probabilities in light of sensory input. Talk about "implicit assumptions" is transmuted into talk about prior probabilities and prior likelihoods. For example, rather than posit an implicit assumption that light comes from overhead, we posit a prior that assigns higher probability to overhead lighting directions. Similarly, we posit a prior likelihood that assigns probabilities to retinal stimulations given a certain shape and a certain lighting direction. When the perceptual system receives sensory input, it reallocates probabilities across the hypothesis space in approximate accord with Bayes's Rule, yielding a posterior probability. Based on this posterior, the perceptual system selects a single determinate hypothesis (e.g., a determinate estimate of some object's shape). Researchers have used the Bayesian framework to explain a wide variety of perceptual illusions and perceptual constancies. See (Knill and Richards, 1996) and (Rescorla, forthcoming) for more details.
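
As a rough illustration of the Bayesian scheme just described, the following Python sketch works through the convex/concave example. It is a toy under stated assumptions, not a model drawn from Orlandi's book or from the cited literature: the four-element hypothesis space, the function name `posterior`, and every numerical value are invented for illustration.

```python
# Toy Bayesian model of the convex/concave shading example.
# All probabilities below are illustrative assumptions, not empirical values.

def posterior(prior, likelihood):
    """Apply Bayes's Rule over a finite hypothesis space."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Prior: lighting direction is probably overhead; shape is unbiased.
light_prior = {"overhead": 0.8, "below": 0.2}
shape_prior = {"convex": 0.5, "concave": 0.5}
prior = {
    (shape, light): shape_prior[shape] * light_prior[light]
    for shape in shape_prior
    for light in light_prior
}

# Prior likelihood: probability of the observed retinal shading pattern
# given each (shape, light) pair. The pattern is ambiguous between
# (convex, overhead) and (concave, below).
likelihood = {
    ("convex", "overhead"): 0.9,
    ("concave", "below"): 0.9,
    ("convex", "below"): 0.1,
    ("concave", "overhead"): 0.1,
}

post = posterior(prior, likelihood)

# Marginalize over lighting direction to get a posterior over shape.
shape_posterior = {}
for (shape, _light), p in post.items():
    shape_posterior[shape] = shape_posterior.get(shape, 0.0) + p

# Select a single determinate hypothesis: the most probable shape.
estimate = max(shape_posterior, key=shape_posterior.get)
print(estimate, shape_posterior)
```

With these assumed numbers the likelihood alone cannot decide between the two readings of the stimulus. The light-from-overhead prior breaks the tie, and the selected estimate is "convex".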

Bayesian perceptual psychology poses a serious challenge to EV. Bayesian models postulate that perceptual processing approximately conforms to Bayesian norms. The postulated mental activity looks a lot like an unconscious inference (although one might debate what exactly it takes for mental activity to count as "inference-like"). Thus, Bayesian perceptual models seem to accord more closely with constructivism than EV. In particular, consider the hypotheses to which probabilities get assigned. The most natural construal takes these hypotheses to represent possible distal conditions. For example, a Bayesian model of shape perception posits a prior over hypotheses that represent specific distal shapes. Apparently, then, our Bayesian model postulates a mental state with representational properties (a probability assignment over representationally contentful hypotheses). This representational mental state causally influences which percept results from given retinal stimulations. EV denies that any such representational mental state mediates between retinal stimulations and percepts.

To her credit, Orlandi directly confronts this challenge. She acknowledges the explanatory power of Bayesian perceptual psychology, but she contends that one can interpret the Bayesian framework as compatible with EV. Rather than regard priors as inputs to an unconscious statistical inference, Orlandi suggests that we instead view them as "biases" (p. 83) or "simple constraints" (p. 82) wired into the perceptual system. I am not totally sure how to construe this suggestion, but I believe that Orlandi's position amounts to a kind of instrumentalism regarding Bayesian perceptual psychology. The visual system proceeds as if it executes Bayesian inferences, so we can use Bayesian models to predict how various sensory inputs will cause various percepts. We should not conclude that the perceptual system literally encodes priors or literally executes statistical inferences. Talk about priors and inferences is just a useful metaphor for describing how the perceptual system is wired.

To what extent does the scientific literature support an instrumentalist stance? Some Bayesian perceptual psychologists advance their models in "as if" fashion, without claiming that the perceptual system literally reallocates probabilities over a hypothesis space. Overall, though, I think that the science favors a more realist attitude towards priors. We reap important explanatory benefits if we construe talk about priors in literal rather than metaphorical fashion.

A good illustration is given by perceptual adaptation. Abundant evidence shows that suitable manipulation can change the visual system's priors. For example, Adams, Graf, and Ernst (2004) altered the light-from-overhead prior by exposing subjects to a deviant environment where lighting direction was perturbed. In another striking experiment, Ernst (2007) exposed subjects to stimuli where two normally uncorrelated properties (luminance and stiffness) were correlated. These deviant stimuli caused a change in the joint prior over luminance and stiffness, thereby causing changes in the percept (e.g., stiffer objects are seen as having greater luminance, because the prior now treats stiffness and luminance as correlated). There are many similar examples along these lines, including manipulations that change the prior likelihood while leaving the prior probability fixed (Beierholm, Quartz, and Shams, 2009). Researchers have also offered well-confirmed Bayesian models that describe how priors evolve in response to environmental perturbations (Ernst and Di Luca, 2011).
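
To make vivid how a realist about priors might describe adaptation, here is a minimal Python sketch in which exposure to a deviant environment shifts a lighting-direction prior and thereby changes the percept. It is a toy under stated assumptions and not the model used in any of the studies cited above: the pseudo-count scheme, the function names `light_prior` and `perceived_shape`, and all numbers are invented for illustration.

```python
# Toy model: a prior over lighting direction, stored as pseudo-counts,
# is revised by exposure to a deviant environment.
# All counts and probabilities are illustrative assumptions.

def light_prior(overhead_count, below_count):
    """Turn pseudo-counts of past lighting directions into a prior."""
    total = overhead_count + below_count
    return {"overhead": overhead_count / total, "below": below_count / total}

def perceived_shape(prior_over_light):
    """Return the most probable shape for one ambiguous shading pattern."""
    # Likelihood of the retinal pattern given each (shape, light) pair.
    likelihood = {
        ("convex", "overhead"): 0.9,
        ("concave", "below"): 0.9,
        ("convex", "below"): 0.1,
        ("concave", "overhead"): 0.1,
    }
    shape_prior = 0.5  # unbiased between convex and concave
    scores = {}
    for (shape, light), lik in likelihood.items():
        weight = shape_prior * prior_over_light[light] * lik
        scores[shape] = scores.get(shape, 0.0) + weight
    return max(scores, key=scores.get)

# Before adaptation: experience favors overhead lighting.
before = light_prior(overhead_count=8, below_count=2)

# After adaptation: 30 further exposures in which light comes from below.
after = light_prior(overhead_count=8, below_count=2 + 30)

print(perceived_shape(before))  # convex
print(perceived_shape(after))   # concave
```

On this way of describing things, the same retinal pattern yields a different percept after exposure because the stored prior has changed, while the likelihood and the selection rule stay fixed.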

Orlandi acknowledges that experience can change the mapping from sensory inputs to perceptual states. She says that, in such cases, the perceptual system's wiring changes in response to altered environmental conditions (pp. 146-149). I think that Orlandi's analysis entails a significant loss in explanatory power. If we adopt a realist perspective on priors, we can explain why various changes in environmental conditions yield various changes in the mapping from sensory stimulations to percepts: namely, because the priors change a certain way. In contrast, Orlandi simply says that the perceptual system that was formerly wired one way is now wired another way. She provides no explanation of why a given stimulus history yields a given mapping from retinal inputs to percepts. What is it about the perceptual system that leads a given stimulus history to produce a given wiring?

Orlandi says relatively little about the psychological processes that carry retinal stimulations into percepts. She mentions connectionist networks as systems "wired" to reflect various environmental conditions without representing those conditions (pp. 46-50). However, no one has yet shown how a connectionist theory might explain the varied perceptual phenomena explained by Bayesian models, except insofar as the relevant connectionist network encodes suitable priors. Thus, although a few perceptual psychologists pursue something resembling Orlandi's embedded approach (Purves, Wojtach, and Lotto, 2011), we presently have little solid idea how one might develop EV into a satisfying competitor with constructivism.

Nevertheless, Orlandi raises important questions that a realist interpretation of Bayesian modeling must face. For example, a prior assigns subjective probabilities (real numbers) to hypotheses. As Orlandi notes, it is questionable whether the perceptual system manipulates symbols that represent real numbers (p. 83). In what sense, then, does the perceptual system "encode" a prior? There are many questions in this vicinity that realists about Bayesian perceptual psychology should pursue. I hope that Orlandi's discussion will impel more philosophers to investigate these questions.

Explanation through environmental regularities rather than priors?

Orlandi's main objection to constructivism is that "implicit assumptions" cannot do the explanatory work typically assigned to them (pp. 40-41). If the perceptual system deploys implicit assumptions, then where do implicit assumptions come from? The natural answer, she says, is the environment itself. For example, why does the perceptual system implicitly assume that light comes from above? Because light does in fact typically come from above. However, if we are citing environmental conditions to explain implicit assumptions, why not bypass the implicit assumptions and appeal directly to environmental conditions?

Orlandi applies this argument to Bayesian perceptual psychology (pp. 89-94). One obvious question is where the priors employed by the perceptual system come from. As Orlandi argues, an equally important question is where the hypothesis space at work in a given perceptual task comes from. Orlandi thinks that Bayesian perceptual psychology should draw upon Natural Scene Statistics when answering these questions. But then environmental conditions will be doing the explanatory work, with priors no longer playing a central role.

I agree with Orlandi that we must explain how the hypothesis space and the priors arise. I also agree that Natural Scene Statistics may prove helpful in this regard. However, I do not think that Orlandi's anti-constructivist conclusion follows. I have three main worries:

1. Orlandi suggests that Bayesian models have no explanatory power until one explains how the hypothesis space and the priors arise (pp. 91-93). In general, though, one can explain X by citing Y without in turn explaining Y. For example, a physicist can at least partially explain the acceleration of some planet by citing the planet's mass and the net force acting on the planet, even if she does not explain why that net force arises. Similarly, while we would like to explain the hypothesis space and the priors at work in some perceptual phenomenon, doing so is not required before we use the hypothesis space and priors to explain the phenomenon.

2. Suppose we supplement a Bayesian model with a Natural Scene Statistics explanation of the hypothesis space and the priors. It does not follow that the priors play no significant explanatory role in the resulting supplemented model. If we explain X by citing Y and then explain Y by citing Z, it hardly follows that Y is explanatorily irrelevant. To continue my earlier example: if we explain the net force acting on some planet by citing all masses in the physical system, then net force is still explanatorily relevant to acceleration.

3. Even if we cite environmental regularities to explain the hypothesis space and the priors, we must consider the psychological processes through which those regularities impact the priors. How does past exposure to the environment (either by the subject or her progenitors) help determine which priors are currently encoded by the perceptual system? It seems likely that here again our best theory will involve Bayesian modeling, so that we will end up postulating further inference-like mental processes defined over representational mental states.

I submit that the study of environmental regularities is a valuable complement to constructivism, rather than a satisfying replacement for constructivism.

Conclusion

Orlandi argues convincingly that philosophical theorizing about vision should highlight how the external environment molds visual activity. I think she would have done better to showcase the embedding environment in conjunction with the constructivist paradigm, not as the basis for a rival paradigm. Nevertheless, I found her discussion enjoyable and thought-provoking at every turn. All philosophers interested in perception should read this book.

REFERENCES

Adams, W., Graf, E., and Ernst, M. 2004. "Experience Can Change the 'Light-From-Above' Prior." Nature Neuroscience 7: 1057-1058.

Beierholm, U., Quartz, S., and Shams, L. 2009. "Bayesian Priors Are Encoded Independently from Likelihoods in Human Multisensory Perception." Journal of Vision 9: 1-9.

Burge, T. 2010. Origins of Objectivity. Oxford: Oxford University Press.

Ernst, M. 2007. "Learning to Integrate Arbitrary Signals from Vision and Touch." Journal of Vision 7: 1-14.

Ernst, M., and Di Luca, M. 2011. "Multisensory Perception: From Integration to Remapping." In Sensory Cue Integration, eds. J. Trommershäuser, K. Körding, and M. Landy. Oxford: Oxford University Press.

Gibson, J. J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.

Helmholtz, H. von. 1867. Handbuch der Physiologischen Optik. Leipzig: Voss.

Knill, D., and Richards, W., eds. 1996. Perception as Bayesian Inference. Cambridge: Cambridge University Press.

Martin, M. 2004. "The Limits of Self-Awareness." Philosophical Studies 120: 37-89.

Purves, D., Wojtach, W. T., and Lotto, R. B. 2011. "Understanding Vision in Wholly Empirical Terms." Proceedings of the National Academy of Sciences 108: 15588-15595.

Rescorla, M. Forthcoming. "Bayesian Perceptual Psychology." In The Oxford Handbook of the Philosophy of Perception, ed. M. Matthen. Oxford: Oxford University Press.

Travis, C. 2004. "The Silence of the Senses." Mind 113: 57-94.

^[1] Actually, the visual system "implicitly assumes" that light comes from overhead and slightly to the left.