Semantic Perception: How the Illusion of a Common Language Arises and Persists

You have a great deal of semantic knowledge. You know, for example, that the sentence 'Hasselhoff is handsome' is meaningful. But do you see that 'Hasselhoff is handsome' is meaningful? If I were to utter it in your presence, would you experience that 'Hasselhoff is handsome' is meaningful? More generally, do the contents of human perception sometimes involve semantic properties, such as meaningfulness? According to Jody Azzouni, the answer is 'yes'. "My thesis is that we (human beings) involuntarily see uttered words, among other things, as possessing certain monadic meaning-properties, and that we involuntarily see uttered sentences as possessing other (but related) monadic meaning-properties" (p. 1).

Azzouni's position is complicated by a background commitment: "No object . . . has the property of meaning some other thing in the way that we experience words to so mean what they refer to. Nothing real, that is, refers; nothing real monadically means anything" (p. 5, emphasis in original). Much later in the book, Azzouni claims that "we experience language artifacts (and gestures) to have meaning-properties even though they don't" (p. 351, emphasis in original). Azzouni's position, then, is that semantic perception is illusory; language users regularly and incorrigibly "experience objects and events . . . as endowed with monadic properties that they don't have" (p. 6). One might naturally wonder, then, how this commitment could possibly cohere with the claim -- expressed on the very first page of the book -- that we see words and sentences, "among other things", as possessing monadic meaning-properties. Doesn't seeing that p entail that p? Not according to Azzouni:

It's not an unusual or specialized or metaphorical application of 'see' to complain about seeing afterimages, or hallucinations. If, after a blow to the head, I say that I'm seeing stars, the "metaphorical" stress is on "the stars," not on the "seeing." (p. 45, emphasis in original)

The central thesis of Azzouni's book (that we perceptually represent semantic properties) isn't new. It's one of several closely related doctrines in the philosophy of perception (such as, for example, that human perceptual experience can represent natural kinds or causal relations), which have received some degree of attention in recent years. To appreciate what's distinctive about these doctrines, it helps to distinguish "low-level" properties, such as colors, shapes, odors, pitches, motion, and warmth, from "high-level" properties, such as being a pine tree, causing the fire, being a face, meaningfulness, and referring to Hasselhoff. The distinction isn't sharp, and may be difficult to spell out on closer inspection, but it conveniently allows us to characterize a substantive and interesting debate. Philosophers of perception seem to agree, by and large, that human perceptual experience represents low-level properties, but what about high-level properties? Are any of them represented?

Over the last few years, some authors have defended an affirmative response to the question by appealing to the method of phenomenal contrast. The method begins with a description of two perceptual experiences. The description is supposed to elicit the intuition that the experiences differ with respect to phenomenal character (or what it's like for a subject to have the experiences) but not with respect to which low-level properties the experiences represent. The claim that one experience involves a particular high-level property (meaningfulness, say, or being a pine tree) as part of its content is then justified on the grounds that it provides the most compelling explanation of why the two experiences differ in phenomenal character. I recommend that interested yet uninitiated readers have a look at Susanna Siegel (2006, 2009), Tim Bayne (2009), and Casey O'Callaghan (2010).

O'Callaghan (2010) discusses the perceptual experience of semantic properties at some length, though he primarily wants to recommend the view that human perceptual experience sometimes involves the awareness of phonemes. At one point, O'Callaghan constructs an interesting phenomenal contrast argument for the semantic perception view by appealing to "sinewave speech". He reports,

To test what is acoustically important in speech, Remez et al. (1981) devised a kind of synthetic speech that replaces a complex human voice with a few simple sine waves whose frequencies and amplitudes vary with components (formants) of the original signal. (O'Callaghan 2010, p. 309)

The result, when one first hears it, is an inarticulate high-pitched computer-generated whistle that sounds a lot like a theremin (the instrument some people believe was used to record the original Star Trek theme song). But if one listens to a recording of the unaltered voice and then listens to the sinewave speech again, one's experience of the very same auditory stimulus is quite different. I certainly feel the temptation to say that, the second time around, the synthetic whistles are heard as meaningful speech. But could there be an alternative explanation of the phenomenal difference? Could it be that exposure to the unaltered voice recording disposes me to attend to low-level properties that were present all along but were simply unnoticed when I initially heard the sinewave speech? How are we to decide between these competing accounts? I don't know. Judge for yourself.

Surprisingly, and to my disappointment, there's no mention of any of this literature in Azzouni's book. Instead, Azzouni relies on his intuitions about particular cases and on "aspects of the intellectual history of the study of language as evidence for what we experience in the understanding of uttered expressions" (p. 44, emphasis in original). So, for example, one of several considerations used to support his claim that language users perceive literal content to be context independent is the observation that Frege, Russell, and Carnap "underestimated" the context sensitivity of sentences (p. 202). Why rely on the history of language studies for evidence about the contents of perceptual experience? Azzouni justifies the method by claiming that "the earliest theories in a subject area often more directly reflect our experience . . . than later more sophisticated ones do" (p. 44). This comparative judgment may well be true -- I remain agnostic -- but even early theories might reflect the contents of our experiences rather poorly. One would like a more reliable method for determining which properties are constituents of perceptual content. In any case, the theories of Frege, Russell, and Carnap are hardly the earliest theories in the history of language studies.

Semantic Perception consists of a "General Introduction" in which the aims of the book are identified, a zeroth chapter, entitled "Methodological Preliminaries", in which Azzouni discusses his background commitments and how they bear on his present undertaking, and nine proper chapters interrupted by the occasional "Methodological Interlude". Here I provide a chapter-by-chapter summary, but I restrict my critical remarks to the first half of the book.

Chapters 1 and 2 are about the experience of understanding language. Chapter 1 focuses on the type-token distinction, and on the ways in which ordinary language use fails to reflect an understanding of the difference between types and tokens. Despite our "systematic fumblings over types and tokens" (p. 9), Azzouni maintains that speech perception involves "'looking through' the token [expression] to the type that it instantiates" (p. 51). What he means, I gather, is that one's perception of word boundaries doesn't coincide with the physically real acoustic boundaries in a speech event. In Chapter 2 Azzouni claims that these "oddities" about ordinary language use and speech perception are explained "if it's meaning-propertied physical objects that we experience to be words, sentences, and, generally, expressions" (p. 88). To develop and motivate the point, Azzouni compares the experience of speech with the perception of tools, such as screwdrivers and hammers. I confess that I don't understand how Azzouni's explanation is supposed to work. It's not that I think some step in the explanation is false or otherwise objectionable. My incomprehension is primitive: I don't see an illuminating connection between explanandum and explanans. The crucial parts of the discussion move far too quickly to be helpful.

Chapters 3 and 4 are about one of the central topics in the philosophy of language: what is said by an utterance of an expression. In Chapter 3 Azzouni presents a number of examples that are supposed to motivate the thesis that human perceptual experience represents what is said by an utterance of an expression. He then claims,

We don't experience the presence of a speaker's intentions -- even when we are aware of them -- as causing or influencing or determining what is said in these cases. Rather, we simply experience the expression as just meaning this or as just meaning that. (p. 130, emphasis in original)

This claim is the major premise in his objection to Gricean accounts of what is said.

The Gricean tells us that what is said by an utterance of a sentence is constitutively determined by the speaker's communicative intentions. According to Azzouni, the Gricean also requires that what is said be accessible to consciousness, since interpreters retrieve implicatures by reasoning about what is said in light of conversational norms. But, Azzouni argues, since the experience of understanding a sentence doesn't involve an experience of the speaker's communicative intentions "causing or influencing or determining" what is said, Griceans can't maintain both the claim that what is said is determined by speaker intentions and the claim that what is said is consciously accessible.

Knowing, on the basis of (extended) study, that the speaker's intentions have effects on what is said is quite different from one's experiencing them as doing so. An honest appraisal of the experience of understanding uttered expressions requires denying that we experience speaker's intentions as causally efficacious in the ways that Recanati describes us as so experiencing them. The conclusion is straightforward. No notion of what is said -- if stipulated as perceived to derive its properties from the speaker's intentions -- can be posited as consciously accessible. (p. 130, emphasis in original)

Insofar as I understand the notion of conscious accessibility, Azzouni's argument strikes me as invalid. Compare: water is constituted by H₂O; furthermore, the experience of water doesn't involve an experience of H₂O as constituting it (Thales was wrong to believe that water is a fundamental element, but we needn't suppose that his belief clashed with the contents of his water experiences); nevertheless, water is consciously accessible to us. At any rate, it doesn't follow that water is consciously inaccessible; one can coherently maintain that we see it. Perhaps I'm missing something, but how is Azzouni's argument any better than the fallacious one we just considered?

Unlike water, what is said by an utterance of an expression is a non-natural kind. Maybe a background assumption of Azzouni's argument is that the constitutive basis for a non-natural kind has to be consciously accessible if the kind itself is. Perhaps. But why think that? Suppose your visual experience represents my gender. Suppose you see that I'm a man. Thus my gender is consciously accessible to you. Still, your visual experience needn't represent that my gender is socially constructed.

Chapter 4 is about, among other things, the experience of retrieving implicatures. Here Azzouni maintains that "it's part of the phenomenological data that what is said by an uttered expression is both robustly perceived and robustly perceived as distinct from implicational content" (p. 147, emphasis in original). Azzouni also discusses context sensitivity more generally. The chapter includes a lengthy critical discussion of the claim -- defended by Herman Cappelen and Ernie Lepore -- that one can provide a "context-shifting argument" for "any sentence whatsoever" (Cappelen and Lepore 2004, p. 40).

Contextualism about a fragment of language is often defended by appeal to a context-shifting argument. One begins by drawing attention to a sentence, S, in which a certain term of interest ('red', 'knows', or whatever you like) occurs. Perhaps S is 'Tile 500 is red', or maybe it's 'Moore knows that he has hands'. One then describes two speech contexts. In the first, the use of S is intuitively true; in the other, the use of S is intuitively false. One then explains the difference between our intuitive judgments about the truth or falsity of S by hypothesizing that the term of interest is context sensitive. Its interpretation is influenced by the context of utterance in a way that allows the use of S to express different propositions in the different settings. If one could provide such an argument for just about any sentence of English, then, according to Cappelen and Lepore, the most common motivation for contextualism would be undermined, since it would over-generalize in a way that defies credulity.

Azzouni argues that context-shifting arguments aren't as easy to construct as Cappelen and Lepore claim. "My suspicion is that if a sentence has -- if the terms in that sentence have -- a sufficiently localized and specific use, then what that sentence says will be immune to contextuality effects" (p. 162). Azzouni invites us to consider four examples (pp. 162-163).

(1) Two plus two is four.

(2) Protons have positive charge.

(3) Genetic bar codes and ecological data revealed that the neotropical skipper butterfly Astraptes fulgerator is actually a complex of 10 species.

(4) According to conventional paleontological wisdom, an asteroid or comet 10 to 14 kilometers wide crashed into the present-day Yucatán Peninsula 65 million years ago and wiped out the dinosaurs.

Azzouni suggests that

the reader can provide many sentences of English -- along these lines -- that seem immune to [context-shifting arguments]. For those readers unsure of their abilities in this respect, I recommend early issues of Scientific American. . . . [and] various technical journals in biology or chemistry. (p. 163, emphasis in original)

Examples (3) and (4) are excerpts from the May 2005 issue of Scientific American.

Azzouni considers a response on behalf of Cappelen and Lepore. (The response is credited to John Collins.) "One might respond to these counterexamples by demarcating 'technical vocabulary' as having 'stipulated properties' and treating such vocabulary items as limit cases" (p. 163). As I understand it, the suggestion is to restrict Cappelen and Lepore's claim about the range of terms for which one can construct context-shifting arguments to non-technical vocabulary. Azzouni dismisses the response:

But how is one to independently demarcate 'technical vocabulary'? I think the suggested countermove is hopeless if only because new vocabulary is being coined for natural languages on a daily basis . . . . Is 'e-mail' technical vocabulary? Is 'iPad'? (pp. 163-164)

Suppose everything Azzouni says is right. (I certainly don't want to quibble over how to properly demarcate technical vocabulary.) Still, I fail to understand why Cappelen and Lepore should be worried, since all that's required for their purpose is that one be able to construct context-shifting arguments for a range of English sentences wide enough to make the contextualist's argumentative strategy look unconstrained and implausible. It seems to me that the range of English sentences for which one is able to construct context-shifting arguments can be wide enough to suit Cappelen and Lepore's purpose and still fall way short of being so wide as to include just about any English sentence. In fact, for all that Azzouni says, the range of sentences for which one is able to construct context-shifting arguments may be wide enough to suit Cappelen and Lepore's purpose without including any sentences whose terms have, as Azzouni puts it, "a sufficiently localized and specific use" (p. 162). Here my point isn't that Azzouni's argument is fallacious; it's that his argument is irrelevant.

Chapters 5 and 6 are about literal content, our "truth sharing practices", compositionality, and lexical concepts. The central claim in Chapter 5 is that an ordinary language user's notion of literal content "isn't the result of experiencing such content during language transactions", but is, rather, a "rational reconstruction" that requires language users to reconsider uttered sentences (p. 187). In this respect, the notion of literal content differs from both the notion of what is said and the notion of what is implicated but not said. According to Azzouni, language users grasp the latter notions as the result of experiencing content during language transactions. Chapter 6 discusses the "truth sharing practices" of language users. According to Azzouni, ordinary language users have the "impression" that any truth expressible in one context can be expressed in another context. Azzouni tries to explain what the source of this impression is. We're also given a very quick explanation of why ordinary language users perceive expressions as "public objects" (pp. 239-240). Presumably, the explanation is supposed to address the question Azzouni poses in the subtitle of his book. Chapter 6 also includes a discussion of how ordinary language users perceive the relationship between word meaning and sentence meaning. The chapter ends with a section on the average language user's impression of what concepts are and how it bears on the analytic-synthetic distinction.

In Chapters 7 and 8 Azzouni presents several old objections to Gricean "intention-based semantics" and claims that, at bottom, the problems stem from the Gricean's inaccurate understanding of the ordinary experience of language comprehension.

Chapter 9 discusses the relationship between the artificial languages that specialists use in the seminar room and the natural languages that we use in the marketplace. Azzouni rejects the Montagovian presumption that there's no theoretically important difference between the two (p. 322). But the main purpose of the chapter is to defend the claim that "our experience of language explains why it's so easy for us to think of artificial languages as extensions of ordinary language . . . rather than as radically discontinuous with natural languages" (p. 12).

Semantic Perception is long, and this review is short. There's a lot more to say that can't be said -- at least not here. But, at all events, I hope to have said enough to allow readers to make an informed decision about whether the book is worth reading.

ACKNOWLEDGEMENTS

Thanks to Corine Besson, Aidan Gray, Dave Hilbert, and Adam Hosein for helpful discussions.

REFERENCES

Bayne, Tim. 2009. 'Perception and the Reach of Phenomenal Content'. Philosophical Quarterly 59: 385-404.

Cappelen, Herman and Ernie Lepore. 2004. Insensitive Semantics. Oxford: Wiley-Blackwell.

O'Callaghan, Casey. 2010. 'Experiencing Speech'. Philosophical Issues 20: 305-322.

Siegel, Susanna. 2006. 'Which Properties are Represented in Perception?' In Tamar Szabó Gendler and John Hawthorne, eds., Perceptual Experience. Oxford: Oxford University Press.

-------. 2009. 'The Visual Experience of Causation'. Philosophical Quarterly 59: 519-540.