Moral Reality and the Empirical Sciences

In this book, Thomas Pölzler evaluates recent empirically informed work that purports to address the existence and nature of moral reality. The first two chapters introduce the project and address metatheoretical worries, and subsequent chapters evaluate particular arguments that ostensibly support moral realism or moral anti-realism. Pölzler maintains that, while it is possible for empirical work to contribute to the moral realism/anti-realism debate, thus far, studies have made problematic conceptual assumptions, have not been scientifically well-founded, or both.

Pölzler has produced an outstandingly useful book. It is clearly written and painstakingly researched, with detailed notes and references. Anyone working on these topics will want to read the sections relevant to their own interests. Pölzler identifies multiple problems with influential arguments and positions, and he makes helpful suggestions about how empirical studies might be improved. While his numerous criticisms are of somewhat uneven quality, he provides a complete catalogue of possible objections to the views he canvasses.

A recurrent theme is that empirical researchers routinely make significant conceptual assumptions concerning the nature of moral judgments. For example, it is sometimes assumed that simple expressions like "okay" or "allowed" express specifically moral senses of permissibility, or that moral judgments must be beliefs, or that moral facts must be strongly objective (independent of the favoring attitudes of actual and possible people). Additionally, researchers are unclear on whether they take moral judgments by their very nature to entail motivation -- or the existence of categorical reasons for action. Some assume that moral judgments must be backed up by considerations of harm, rights, or fairness, while others are more permissive. All of this is problematic, given the goals of this body of research.

This review describes the contents of the substantive chapters, before returning to some questions about the material in Chapter 2 and the book's conclusion.

Chapter 3, "Folk Moral Realism," concerns arguments for moral realism based on the experiential hypothesis -- the allegedly universal appearance of objective moral truths. Philosophers like David McNaughton argue that, if this is how things appear, there is prima facie reason to believe that this is how they are.

Now, the idea that people experience morality as objective looks like an empirical claim -- one that could be verified experimentally. And in principle, Pölzler says, it could be. However, existing studies are inconclusive. If anything, they suggest that people do not consistently experience morality as objective: perhaps most people experience some moral sentences according to realism, though they experience others according to anti-realism.

Pölzler argues that many of the relevant studies lack construct validity -- i.e., they do not measure the right thing, given their purposes. If the aim is to establish that most people experience morality as a realm of objective truths in the sense endorsed by robust metaethical realists, their instruments are too simplistic. Representative studies present subjects with a hypothetical moral disagreement and ask them to choose between "both beliefs are right" and "only one belief is right," or to evaluate possible assertions as "true," "false," and "opinion." They fail to include options that would draw out answers associated with views like non-cognitivism, quasi-realism, speaker-relativism, cultural relativism, ideal-observer and constructivist theories, and non-standard conceptions of truth (e.g., truth as coherence or warranted assertibility). These possibilities must be investigated and ruled out.

There are also questions concerning these studies' external validity -- i.e., whether their results generalize beyond the particular subjects and materials tested. Thus far, studies have been run on a limited range of populations. Many involve unrealistic stories or humorous examples that might influence subjects in unintended ways. To remedy such problems, Pölzler makes numerous constructive suggestions about experimental design.

Chapter 4, "Moral Disagreement," focuses on one of Mackie's famous disagreement-based arguments for error theory. In the face of the widespread and fundamental moral disagreement that does exist, the moral realist must hold that people err in their moral judgments in systematic ways. This suggests moral truths are difficult or impossible to access. But in this case, they cannot be practical: they cannot make a difference to what we ought to do. Yet, as Richard Rowland has argued, being practical in this sense is a necessary condition for the existence of these truths (93).

The idea that there is widespread and fundamental moral disagreement is an empirical claim. And indeed, some empirical studies appear to support it. But do they really establish that there are suitably deep moral disagreements that are neither premised on non-moral disagreement, nor due to cognitive biases or irrationality? Pölzler says: No. Faced with the results of these studies, moral realists can reply that disagreement has been illegitimately inferred from behavior, that the disagreement is due to differences in non-moral beliefs, that it merely concerns the degree of rightness/wrongness, or that it is not even moral (but instead about what is all-things-considered rational, or conventionally right).

For example, some claim that the Hopi, in contrast to most Westerners, do not judge harming animals for trivial reasons to be morally wrong. But in Richard Brandt's famous study of the Hopi, many did say it was wrong, even though this had a very weak influence on their behavior. So perhaps the Hopi did typically judge animal cruelty to be wrong, but they were simply not motivated to behave accordingly. In this scenario, they would not be so different from Westerners who eat factory-farmed food.

It is also claimed, e.g. by Richard Nisbett and Dov Cohen, that U.S. Southerners are more likely than other Westerners to believe it is morally okay to respond to threats and insults with violence. However, the data is open to other interpretations. Perhaps Southerners are just more aroused or stressed by such threats and insults, or more ready to forgive such violence when it is perpetrated. Or, when they judge that a person is "justified" in fighting or shooting an insulter, they may mean that the person was socially justified in acting this way, or even rationally justified in acting immorally (113).

Error theorists need studies that isolate moral disagreements that cannot be traced back to differences in non-moral beliefs, and that are not merely differences in degree. Alternatively, they might focus -- as Brian Leiter also suggests -- on moral philosophers' disagreements about the truth of moral principles (123). For they do not need a great quantity of disagreement, just persistent disagreements of the right kind. Philosophers could even aid in the design of studies that target this sort of disagreement.

Chapter 5, "Moral Judgements and Emotions," targets views according to which moral judgments always co-occur with -- or are causally influenced by, or are constituted by -- emotions, where emotions are understood in a non-cognitivist way (135). A leading example, here, is Prinz's sentimentalism.

Such views are apparently committed to the truth of certain empirical claims. But which claims depends on precisely what they propose when it comes to frequency (whether all moral judgments are associated with emotions in this way), intensity (how intense the relevant emotions are), and effect sizes. Accordingly, Pölzler distinguishes and evaluates several possible sentimentalist hypotheses. He argues that existing studies only establish small effects that occur for some subjects in some situations.

Many of the experiments in this area, Pölzler says, are lacking in internal validity. Several, such as those conducted by Simone Schnall (et al.), focus on the idiosyncratic emotion of disgust, especially where it is elicited by "incidental" sources, i.e., sources besides the relevant moral stimuli (145). These studies did not account for the possibility that feeling disgust might influence one's interpretations of non-moral facts, or the possibility that people might reason from their experience of disgust to certain conclusions (154). Studies have also used flawed scenarios and item-statements (e.g., ones originally designed for children), as well as unrealistic and humorous scenarios that introduce various sorts of experimental noise (145). Some have focused on idiosyncratic sub-populations (e.g., people high in "private body consciousness"), and others cannot be replicated (171-2).

There are further problems with the studies that purport to establish that emotions co-occur with or causally influence moral judgments. Influential fMRI studies run separately by Alan Sanfey (et al.) and Joshua Greene make the problematic assumption that brain areas are functionally specific in certain ways. Prompts used in these studies asked subjects about the generic "appropriateness" of actions, not specifically about moral rightness. Furthermore, when subjects in Sanfey's study judged that an unequal split was acceptable -- or, in Greene's study, that pulling the switch was morally permissible -- this was not classified as making a moral judgment, even though the subjects were apparently judging that some behavior was morally okay. This is problematic, because unlike subjects' judgments of impermissibility, these permissibility judgments were not accompanied by supposedly emotional brain-states.

There are also problems with studies that aim to show that emotions are causally sufficient for moral judgments. This hypothesis predicts that induction of emotion will prompt moral judgment. Jonathan Haidt has conducted relevant experiments. For example, subjects are hypnotized to feel disgust when confronted with a neutral stimulus, or a prompt about incest is used to induce moral dumbfounding. But these studies, too, have only found relatively rare and small effects (163).

Prinz argues for the claim that emotions are synchronically necessary for moral judgments on the grounds that, if this claim were false, "we would expect more moral convergence cross-culturally" (quoted on 165). Pölzler says this is a stretch. The lack of convergence could be due to religious, metaphysical, or empirical disagreements. It could be due to partiality, or to the difficulty of grasping or establishing certain moral truths.

What about the weaker claim that certain basic emotions are diachronically necessary for moral judgments in a developmental sense, i.e., necessary in order that people acquire and deploy moral concepts? Prinz (like Haidt, Michael Gill, and Shaun Nichols) proposes that psychopaths lack certain emotions and thereby suffer a moral impairment. In particular, Prinz says, they are deficient in fear and sadness. But even supposing this is true, this argument turns on whether psychopaths also lack moral concepts. The traditional evidence for this is that they cannot sort moral from conventional permissions and prohibitions, as these were famously distinguished by Elliot Turiel (1983). In outline, Turiel holds that moral prohibitions are characteristically: (1) serious, (2) universal, (3) authority-independent, and (4) justified in terms of harm, justice, or rights. Unlike most people, psychopaths -- according to a famous study by Robert Blair -- believe that conventional transgressions are just as authority-independent as moral ones.

Pölzler urges caution, here. Blair's study uses "okay for x to do y" in a way that might prompt judgments about x's evaluative perspective, not moral permissibility. Also, recent work shows that other sub-populations fail to draw the moral / conventional distinction in the way that Turiel describes (167). Some philosophers, such as Joseph Heath, also criticize Turiel's way of drawing the distinction. Absent clarification on what is really necessary for a person to have and deploy moral concepts, perhaps we should conclude, not that psychopaths lack moral concepts, but that "psychopaths think differently about moral concepts than non-psychopaths" (168).

Chapter 6, "The Evolution of Morality," examines evolutionary arguments for moral skepticism. These typically run as follows: moral judgments are adaptations caused by natural selection. But natural selection is insensitive to truth: it might lead one to judge that p whether or not p is true. Therefore, moral judgments are unjustified.

But if moral judgments are adaptations, which trait linked to them is supposed to be fitness-enhancing? On Richard Joyce's view, humans evolved a functionally specialized capacity to make judgments to the effect that certain behaviors are morally obligatory and/or morally wrong. This capacity was a heritable adaptation with a differential impact on fitness, because it made our ancestors more likely to help each other: the tendency to make moral judgments would have reliably motivated useful pro-social actions in the environment of ancestral adaptation.

This view is empirically committed in obvious ways. Can we establish that this capacity is an adaptation? Pölzler says we cannot, unless we are able to rule out competing hypotheses. These include the non-evolutionary hypothesis that the capacity for moral judgment, like the one for handwriting, emerged but did not evolve for any specific purpose. They also include the byproduct hypothesis favored by Darwin himself, according to which the capacity for moral judgment was a byproduct of "well-marked social instincts" and "intellectual powers" (196-7).

Additionally, Joyce's argument depends on the conceptual claim that moral judgments are intrinsically motivating. Joyce appears to recognize this: he is explicit that moral judgments express non-cognitive motivational states, that they posit (and could only be made true by) the existence of external, objective reasons for action, and that these reasons are very strong (204). But these are significant assumptions. We do not know whether Ancient Mesopotamians -- or even earlier peoples -- made moral judgments in this rich sense. Furthermore, as Joyce acknowledges, even if they did, the universality of moral judgment would not prove it is an adaptation: wearing clothes is near-universal, but that's because it is beneficial in virtually all human environments.

What if it could also be shown that the capacity to make moral judgments is developmentally canalized, i.e., such that it develops in a way that exceeds information available in children's environments? This would show developmental nativism, but Pölzler claims it would not discriminate between the adaptationist hypothesis Joyce needs and the by-product hypothesis. Furthermore, to establish developmental canalization, we would need to know when we can attribute this capacity to children, and this would require us to know the general features of moral judgments. But this remains unsettled. There is evidence that neither children nor adults sharply distinguish moral and conventional judgments in the way that Joyce -- following Turiel -- proposes (209). Pölzler also criticizes Joyce for not clarifying potential inconsistencies between Turiel's conception of moral judgments (on which they need to be justified in terms of harm, justice, and/or rights) and his own. Finally, if moral judgments really evolved to increase the likelihood of ancestors helping each other, why are so many judgments not about helping?

This summary should make clear that Pölzler's book constitutes a significant contribution to the surrounding literature. He clearly identifies the empirical commitments of various science-based arguments for moral realism and moral anti-realism; he evaluates scientific studies that have been thought to probe these commitments; and he suggests how to improve these studies.

However, Chapter 2 ("Metatheoretical Considerations") and the book's conclusion both raise questions about the relevance of empirical studies to metaethics that go beyond the particular arguments considered in the various chapters. Pölzler's stance on these more general issues is somewhat unclear.

One such issue is whether the conceptual is logically prior to the empirical. Antti Kauppinen has argued that we must settle controversies in moral semantics (e.g., about the meaning of 'moral judgment') before we frame empirical hypotheses for testing. But scientific data, he says, is irrelevant to moral semantics. For only competent speakers' semantic intuitions under ideal circumstances have probative value, here. Experimental circumstances are not ideal, targeting semantic intuitions is very difficult, and competence is itself a normative concept that cannot be operationalized. Therefore, there is no better approach to moral semantics than the consideration of convincing stories and plausible descriptions by trained philosophers (31).

In contrast, Pölzler wants to claim that scientific data is relevant to moral semantics, and so also to larger metaethical issues. With the help of philosophers, he says, we might refine experiments to target semantic intuitions more effectively. We might improve the experimental circumstances by providing better instructions and fewer distractions. We could use larger samples, include validity checks, and so on. Still, since these improvements will probably fail to settle all conceptual controversies, Pölzler says, scientists may need to acknowledge their conceptual assumptions and frame their conclusions in a conditional way -- e.g., if moral rightness is this, then judgments about it are adaptations. Each study could note its assumptions, and we could save some conceptual controversies for later.

This seems reasonable enough. However, at later points, Pölzler indicates that he actually prefers a different response to Kauppinen. He suggests we abandon the strict priority of the conceptual. We should mutually adjust our conceptual and empirical claims until they best explain each other. "Scientific findings about paradigmatic instances of moral judgments may then plausibly be claimed to support accounts of what moral judgments mean" (33-4, 229). Presumably, we should consider our intuitions about which judgments count as moral, but also theoretical considerations relating to our best normative theories, our theories about moral learning, and the role of moral judgments in anthropological, sociological, and psychological theorizing.

There is doubtless much to be said for this approach to moral semantics. However, despite Pölzler's announced preference for it, he never illustrates what it would mean in the context of studies like the ones he considers -- e.g., how we might improve on Turiel's conception of moral judgment in light of more recent research. Indeed, Pölzler sometimes leans in the opposite direction, entertaining conceptual constraints on moral judgments that are parochial and demanding, and using these to criticize the work of others. This is not what one would expect from someone who takes the reflective equilibrium approach to moral semantics sketched above.

Here are three examples. First, Pölzler follows Rowland in claiming that "practicality is a necessary condition for the existence of moral truths," and that unknowable moral truths could not make a difference to what we ought to do (93). Depending on how exactly this is interpreted, this practicality may be at odds with a naturalistic moral metaphysics and semantics. Second, Pölzler mentions Foot's idea that genuinely moral judgments must be about harms and benefits (97). This seems unduly restrictive: it implies that many people have been seriously confused about which judgments count as moral. In a third case, Pölzler criticizes Haidt's hypnosis experiments on the grounds that the subjects may not have been making moral judgments, because "in order for a person's judgment to qualify as moral, the person would have to be willing to retract it if he or she came to know that he or she did not adopt it on the basis of relevant reasons" (161). Or rather, he mentions this criticism to show that we must balance the results of the hypnosis experiment "against all relevant conceptual evidence and all other logically related claims" (161). But to mention this possible constraint seems a little impertinent if we are to take the reflective equilibrium approach. For it would likely force us to adopt an overly intellectualized account of moral judgments.

Of course, using empirical studies or theories from other disciplines to triangulate in on the true nature of moral judgments would be an enormous project in its own right. Pölzler cannot be faulted for not having undertaken it, here. Still, some of his criticisms seem gratuitous, given the methodology he ultimately favors. In any case, he has demonstrated that we urgently need a better understanding of moral judgments in order to evaluate empirical studies relating to moral realism and anti-realism. Until we have such an understanding, many of our evaluations will be inconclusive, even if scientific studies are improved in the ways Pölzler recommends.