Body Language: Representation in Action

Mark Rowlands, Body Language: Representation in Action, MIT Press, 2006, 218pp., $36.00 (hbk), ISBN 0262182556.

Reviewed by Shaun Gallagher, University of Central Florida


Mark Rowlands, in his book Body Language (2006), provides an analysis of representation as it enters into action.  For anyone interested in action theory or the philosophy of action, or the concept of representation, this is an important book that will help you get a bearing on the recent embodied, enactive, extended approaches to these topics.

The central question in this book is whether the concept of representation is required for an account of action.  This question is the subject of an ongoing debate that pits the more recent approaches based on embodied, enactive and extended analyses of cognition against traditional accounts of representation.  Rowlands champions the former approaches, and along with several others, like Michael Wheeler (2005), Richard Menary (2007), and Andy Clark and Rick Grush (1999), attempts to defend a minimalist account of representation in embodied action.

First, let's set aside two forms of representation that would shift the focus away from action.  What is not at stake in Rowlands' book is what we might call internal representation, that is, a sort of representation that may (or may not) occur as the basis for propositional attitudes "inside" a cognitive system traditionally defined as mind or brain.  We can also leave aside the question of external representations such as physical signs, diagrams, and such (see Menary 2007 for the importance of such representations).  Rather, the focus is on representation in action and what role representational elements might play in the structure and control of action.

Consider then the classical concept of representation as outlined and rejected by Rowlands (5-10), a concept that is modeled on language, on how words work.  On such a conception, representation:

1.  is internal (it's an image, symbol, neural configuration)

2.  has duration (it's an identifiable thing)

3.  is intentional (it refers to something other than itself -- it has content)

4.  requires interpretation (its meaning derives from the semantic economy of the subject -- like a word or an image its meaning gets fixed in context)

5.  is passive (it's produced, enacted, called forth by some particular situation – à la Dretske; or we do something with it – à la Millikan).

To this list, we should add that representation is decoupleable, that is, it functions even when the feature or object of which it is a representation is absent from the environment.  Rowlands does not include decoupleability in his definition of the classic conception, but as his discussion of decoupleability (157ff.) suggests, it seems to apply to all concepts of representation.  The idea of decoupleability is that one can go "off-line" and represent (imagine or remember) an action or object or context even if that action, object, or context is not currently present.  Accordingly, representation would involve a form of decoupling away from action, away from the target of action, or away from the current context.  One question about representation in action is whether action itself depends on this kind of representation, traditionally defined -- and another question is whether there can be a decoupled element within action itself.

There are some theorists who give negative answers to these questions and take an anti-representationalist view.  Hubert Dreyfus (2002), for example, argues that for practiced or skillful intentional action one does not require representation, traditionally defined as internal.

A phenomenology of skill acquisition confirms that, as one acquires expertise, the acquired know-how is experienced as finer and finer discriminations of situations paired with the appropriate response to each.  Maximal grip [Merleau-Ponty] names the body's tendency to refine its responses so as to bring the current situation closer to an optimal gestalt.  Thus, successful learning and action do not require propositional mental representations.  They do not require semantically interpretable brain representations either.  (Dreyfus 2002, 367)

Dreyfus associates the idea of representation with a failed Cartesian philosophy -- the concept of representation (as used in AI) remains context-independent and bound up with epistemic states of knowing-that (propositional knowledge), when everything about intelligent action and knowing-how depends on being-in-the-world, not standing back and representing the world.

A similar anti-representationalist stance is explicated by Alain Berthoz and Jean-Luc Petit in their recent Phénoménologie et physiologie de l'action (2006).  They argue that the brain is an organ for action rather than an organ for representation.  Berthoz and Petit want to move away from a philosophy that puts language in first place and that models action on language-like representation, where action is equivalent to movement plus representation.  Representations are not the dynamic processes required to explain action.  Yet, even if action involves dynamic anticipation, as it certainly does according to Berthoz and Petit, there is a certain way, as we'll see, that anticipation itself might be considered representational.

What takes the place of representations in non-representationalist accounts of action is a form of perceptually based online intelligence which generates action "through complex causal interactions in an extended-body-environment system" (Wheeler 2005, 193).  But can this sort of system do everything it needs to do without any form of representation?  According to Rowlands, Wheeler, and others, the answer is "no."  One requires some minimal kind of representation in action.

Wheeler (2005), for example, is certainly a friend of Dreyfus' anti-representationalist view, but following Clark (1997), he suggests that there has to be something like "action-oriented representations" (AORs).  AORs are temporary egocentric motor maps of the environment that are fully determined by the situation-specific action.  On this model, it is not that the AORs re-present the pre-existing world in an internal image, or that they map it out in a neuronal pattern:  rather, "how the world is is itself encoded in terms of possibilities for action" (Wheeler 2005, 197).  What is represented in AORs is not knowledge that the environment is x, but knowledge of how to negotiate the environment.  AORs are action specific, egocentric relative to the agent, and context dependent.

But what sort of thing is an AOR?  Is it a neural firing pattern, a motor schema, or something like a bodily movement?  In contrast to Wheeler (2005, 209), Rowlands argues that certain kinds of bodily movements are themselves representational -- not, however, in the traditional sense of representation.  AORs (I'll retain Wheeler's terminology here for convenience) are not internal, enduring, intentional, in need of interpretation, or passive.  For Rowlands (113-114), the following characteristics define the representational nature of action oriented representation.

·      AOR carries information about something other than itself (x)

·      AOR is teleological -- it tracks or has a specific function towards x

·      AOR can misrepresent x

·      AOR occurs within a more general representational framework

·      AOR is decoupleable from x (x may be absent from the immediate environment)

To make the case that certain kinds of bodily movements can count as representational, Rowlands distinguishes between intentional actions, sub-intentional acts, and pre-intentional acts.  Sub-intentional acts (O'Shaughnessy 1980) are non-intentional movements, e.g., of tongue or fingers, of which we are not aware, for which there is no reason, and which serve no purpose connected with action.  Pre-intentional acts or "deeds" include such things as the positioning of fingers in catching a ball that is flying toward you at a high rate of speed, or the movement of your fingers while playing Chopin's Fantaisie-Impromptu in C# Minor on the piano.

Deeds, or pre-intentional acts, include an array of "on-line, feedback-modulated adjustments that take place below the level of intention, but collectively promote the satisfaction of [an] antecedent intention" (103).  Rowlands provides a detailed example: Yarbus' (1967) experiments on saccadic eye movements.  In these experiments subjects view a painting of six women and the arrival of a male visitor; subjects are asked to do certain tasks, such as: view the picture at will; judge the age of the people in the painting; remember the clothing worn; indicate which person the man came to visit.  Yarbus found that subjects used different visual scan paths/saccades for each task.  E.g., subjects who were asked about the age of the people focused on their faces.  The scan paths varied systematically with the nature of the task.  The saccades are in some way governed by the intention/task, but they are not intentional in the sense that we do not decide to use this visual tactic, and we are not conscious that we are making the saccades: they are pre-intentional acts.

Rowlands argues that such "deeds" or pre-intentional acts are representational.  Pre-intentional acts:

·      carry information about x (the trajectory, shape, size of the ball, the keyboard, a specific aspect of people in the painting)

·      track x or function in a way that allows me to accomplish something in virtue of tracking x

·      can misrepresent (get it wrong)

·      can be combined into a more general representational structure (I catch the ball and throw it back; I continue to play the music; I can systematically scan a painting)

·      are decoupleable from x (x may be absent from the immediate environment -- e.g., I can later remember and demonstrate how I caught the ball replicating the same act).

Let me focus on this question of decoupleability.  According to any definition of representation, a representation is decoupleable from x (x may be absent from the immediate environment).  But it is difficult to see how pre-intentional acts can decouple from x (the ball, the piano keys, the painting) or the context, without becoming something other than what they are and failing to serve the action originally at stake.  Imagining, remembering, or even re-enacting an action outside of its original context and absent x, may (or may not) require representation -- but this says nothing about representation in action and the role that pre-intentional acts play in constituting that action.

One might be tempted, although Rowlands is not, to appeal to a model developed by Andy Clark and Rick Grush (1999).  They offer a model of representation that puts decoupleability directly into action at a sub-personal level.  They propose that anticipation in motor control, specifically in the working of a forward emulator, involves a decoupled representation.  Since the emulator anticipates (represents) an x that is not there (a future x), or a predicted motor state, it is decoupled from x or the current movement.  Thus, "emulators seem to be a nice, biologically detailed example of the sort of disengagement that Brian Cantwell Smith (1996) has recently argued to be crucial for understanding representation" (Clark and Grush 1999, 7).  On this view, it seems that in the very structure of action (and motor control) one finds an aspect of decoupled representation.

It is difficult, however, to see how an aspect of motor control that is a constitutive part of the action can be considered decoupled from x, the context, or the action itself.  Doesn't the anticipation of a future state or location of x (e.g., anticipating where the ball will be in the next second), or of the predicted motor state (anticipating where to strike the keyboard in the next measure), require reference to the present state or location of x and my hand, or more generally my body's motor state?  Furthermore, the idea of decoupleability seems to interfere with the concept of teleological tracking in this regard.  Nor is it clear in what sense this sort of anticipatory simulation/emulation is "off-line" rather than part of the online process of action.  If one does decouple the emulation, it ceases to be part of a forward motor control mechanism, although it may turn into part of a truly off-line representational process, that is, we may use a decoupled emulation process in memory or imagination.

Decoupleability aside, Rowlands' pre-intentional acts could be considered an example of Wheeler's AORs.[1]  As such they are not reducible to neural firing patterns, although they do not exclude such patterns.  They clearly involve the body-schematic motor control system.  So they belong to a system that includes brain and body, but also environment.  "The vehicles of representation do not stop at the skin; they extend all the way out into the world" (Rowlands 2006, 224).  Here Rowlands joins Clark and Wheeler, and the extended mind (vehicle externalism) hypothesis, where AORs are complex causal interactions in an extended-body-environment system -- where the causality is spread around.

Wheeler calls this the "threat from [non-trivial] causal spread" (2005, 200).  Precisely the commitment to some version of this idea is what motivated anti-representationalism in the first place.  Let's say, for example (à la Haugeland 1995), that I am riding my horse from Aix to Ghent.  Getting from one place to another involves having some kind of strategy.

Strategy 1:  I have a stored inner representation of the directions

Strategy 2:  I follow the road and road signs, which are external representations

Strategy 3:  Having decided to go to Ghent, I jump on my horse and start off, and having done it many times before, we (my horse and I) go on automatic pilot and allow the landscape and roads to guide us (no representations required since the landscape, which is not representing anything, does the work)

The third strategy involves non-trivial causal spread since we allow the world to do some of the work.  Does this not rule out a role for representations?  Rowlands would seemingly point to certain pre-intentional body-schematic aspects of my riding and guiding the horse along the road, and call them representational.  Likewise Wheeler argues that in order to go anti-representationalist in an extended cognition paradigm one would have to construe representations as involving (1) strong instructionalism (i.e., the idea that representations provide a full and detailed description of how to achieve the outcome) -- and this is certainly not implied in Rowlands' concept of pre-intentional acts; and (2) the neural assumption (i.e., that neuronal processes play a central and close to exclusive role) -- and this too is not the case on Rowlands' account, since the shape of my body and gravity certainly play a role in pre-intentional acts.  The neural assumption is already weakened in favor of non-neural elements on the extended cognition hypothesis; and we can easily give up (1) -- no need for anything like a fully-specifying representation.  Giving up (1) and (2), however, still leaves room for minimal kinds of representation that are distributed across brain, body, and environment.

But, we can surely ask, what's the point in retaining the term 'representation' in this case?  What work does the concept of representation really do since nothing is being re-presented to the subject, since it is not consistent with the classical notion of representation, and since in working out the justification for characterizing this as representation one is already explaining action in non-representational terms of perception-based complex causal interactions in an extended-body-environment system?  A facetious economic argument against either the representationalist, or even the minimal representationalist, would suggest that the work that the concept of representation does is less than the work it takes to justify the use of the term 'representation'.

Dreyfus (2007), arguing against the minimalist representationalism of Wheeler, appeals to Merleau-Ponty's work, and a non-representational Heidegger, for an account of the way the body and the world are coupled.  As an agent acquires skills, those skills are "stored," not as representations in the agent's mind, but as the solicitations of situations in the world.  If the situation does not clearly solicit a single response or if the response does not produce a satisfactory result, the learner is led to further refine his discriminations, which, in turn, solicit ever more refined responses.

On this model, in our action, we can still get things wrong, but not because our representation of the world misrepresents the world.  Rather, the world itself appears ambiguous in the light of our particular abilities and projects.  From a particular distance and perspective, or in a certain light, the mountain appears to be climbable.  Once I begin to climb, however, I can discover that the mountain is not climbable.  On the representationalist view this is explained by saying that my original representation of the mountain had been wrong.  On the non-representationalist view, the mountain presented a certain affordance relative to my embodied skills, at a certain distance, in a certain light, from a certain perspective.  Once we change the distance, light and/or perspective, the affordance disappears.  These are physically determined factors that involve being embodied and in-the-world; they need not be representational (although I can certainly represent them cognitively).  The affordance disappears not because I changed the representation of my distance from the mountain -- I actually have to change my distance, and when I do so, the body-mountain relation, which defines the affordance, changes.

Within such embodied-embedded-extended approaches, what role does a minimal representation play?  For Wheeler, the AOR is a perception-based, short-lived, egocentric (spatial) mapping of the environment calibrated strictly in terms of possible actions.  Clark and Grush suggest that representational elements are to be found in the anticipation that is built into a forward emulator for online motor control.  Rowlands argues that the pre-intentional movement that constitutes the current structure of intentional action is representational.  When we consider these three aspects of action together we should notice that they constitute the dynamic temporal structure of action itself.  On a phenomenological, non-representational model of this temporal structure the short-term mapping of the environment is a function of a pragmatic (i.e., in terms of possible actions) retentional maintenance (holding in perceptual presence) of the relevant aspects of the environment, where those aspects themselves may be doing some of the work; the anticipation that is essential to motor control is a protentional aspect that is an implicit characteristic of my immediate project-determined coupling with the environment.  And the pre-intentional movement is an occurrent contribution to the very structure of the action.  As Berthoz and Petit (2006) make clear, none of these dynamically dissipating aspects amounts to a representation, if we take representation to involve:

·      an internal image, symbol, or neural configuration

·      an enduring thing

·      decoupleability

·      interpretation, where some other process takes it as content.

Could these aspects add up to a minimal representation?  Rowlands' concept of pre-intentional deeds is consistent with Wheeler's definition of a minimal representation as (1) richly adaptive, (2) "arbitrary" or ad hoc -- in the sense that it is not predefined, but processes current information about the world, and (3) employing a homuncular mechanism, i.e., a mechanism that is hierarchically compartmentalized but contributes to a collective achievement (see Wheeler 2005, 252ff.).  The idea of the homuncular mechanism is an attempt to preserve the criterion of interpretability.  Representational interpretation is usually conceived of as involving modularity -- processing in one module independent of processing in another, but each communicating results to (and mutually interpreting) another module.  The homuncular mechanism thus takes some information "off-line" and manipulates it to anticipate possible actions -- much like Clark and Grush's emulator.

But modularity can be given up for the dynamic systems concept of a self-organizing continuous reciprocal causation (Varela, Clark).  On-line sensory-motor processes that are serving intentional action and are temporally structured in dynamic relation to the environment are in fact richly adaptive and arbitrary in the relevant sense, but are not homuncular, which means they involve no interpretational element -- one part of the mechanism doesn't interpret the information presented by another part -- the relation is more causal than communicative.

Once again we can ask what's left of the idea of representation in action.  The kind of processes that make up action:

1.  are not internal -- they extend to include embodied-environmental aspects and are only "weakly" neuronal

2.  cannot be characterized in terms of simple duration -- rather they are temporal, dynamic, and distributed processes

3.  are not passive -- they are pragmatically enactive -- proactively contributing to the adaptability of the system

4.  are not decoupleable -- indeed, if such processes are to remain teleological, they have to continue tracking x or they have to involve a continuing and online anticipation or protention of a predicted motor state

5.  are not strongly instructional

6.  do not involve interpretation.

In effect, the processes described as minimally representational no longer conform to the criteria that would make them representational.  Action does involve processes that are intentional (action refers to something other than itself) at the personal level, and in a way that contributes to the organization of the sub-personal processes that support the intentional action.  But if representation is one form of intentionality, not all intentionality is representational.  There is an intentionality of the body-in-action that is not internal, decoupleable, or instructional, that does not involve interpretation in the relevant sense, and that is dynamically linked with the environment in a way that reflects a specific temporal structure at the subpersonal level.

Action, thus, involves temporal processes that can be better explained in terms of dynamic systems of self-organizing continuous reciprocal causation at the subpersonal level.  Action does involve a retentional, short-lived, egocentric orientation to the environment calibrated in terms of possible actions (but this does not require an action oriented representation); action does involve an anticipatory (protentional) aspect that is built into a non-modular forward emulator for online motor control (but this does not require representational modularity); and these two aspects are dynamically tied to occurrent pre-intentional movements that serve the intentional action (but are not representational since they are not decoupleable).

It might still be argued that representational accounts provide a helpful short-cut for explaining action.  In this regard, however, at best, it is just one way -- a scientifically abstract way -- of explaining the action process.  I submit that the concept of a representation is not an explanans that does any work itself; rather, it's a concept which itself requires explanation.  The risk is that representational accounts come with ontological claims -- there really are representations in the system and they are something more than what a motor control system does as part of the action itself.

The problem is that a majority of cognitive scientists, and many philosophers, including Rowlands, continue to use the R-word as if it is an explanation.  In this regard, however, even if you think that the concept of representation does do some explanatory work, what I called the facetious economic argument against representationalism in fact suggests a scientific pragmatism.  It may take more energy to define and distinguish any legitimate sense of representation from amongst the plethora of uses of that term, and to justify its special use (which is the central task that Rowlands takes on in Body Language), than it would take to explain the phenomenon in non-representationalist terms.  And if one can explain the phenomenon in non-representationalist terms, then the concept of representation is not necessary, and is at best redundant.


Berthoz, A. and Petit, J-L. 2006. Phénoménologie et physiologie de l'action. Paris: Odile Jacob.

Clark, A. 1997. Being There: Putting Brain, Body and World Together Again. Cambridge, MA: MIT Press.

Clark, A. and Grush, R. 1999. Towards a cognitive robotics. Adaptive Behavior 7 (1): 5-16.

Dreyfus, H. 2007. Why Heideggerian AI Failed and How Fixing it Would Require Making it More Heideggerian. Philosophical Psychology 20 (2): 247-268.

Dreyfus, H. 2002. Intelligence without representation: Merleau-Ponty's critique of mental representation. Phenomenology and the Cognitive Sciences 1 (4): 367-83.

Gallagher, S. 2005. A new movement in perception: Review of Alva Noë's Action in Perception. Times Literary Supplement (London), 9 September 2005.

Haugeland, J. 1998. Mind embodied and embedded. Having thought: Essays in the metaphysics of mind (pp. 207-237). Cambridge, MA: Harvard University Press.

Menary, R. 2007. Cognitive Integration. London: Palgrave Macmillan.

Merleau-Ponty, M. 1962. Phenomenology of perception (C. Smith, Trans.). London: Routledge & Kegan Paul.

Merleau-Ponty, M. 1966. The Structure of Behavior (A. L. Fisher, Trans., 2nd ed.). Boston: Beacon Press.

O'Shaughnessy, B. 1980. The Will, 2 volumes. Cambridge: Cambridge University Press.

Rowlands, M. 2006. Body Language. Cambridge, MA: MIT Press.

Varela, F. J., Thompson, E. and Rosch, E. 1991. The Embodied Mind: Cognitive Science and Human Experience. Cambridge, MA: MIT Press.

Wheeler, M. 2005. Reconstructing the Cognitive World: The Next Step. Cambridge, MA: MIT Press.

Yarbus, A. 1967. Eye Movements and Vision. New York: Plenum Press.

[1] Wheeler gives up the criterion of decoupleability in his characterization of a minimal (or weak) representation (2005, 219).