# Self-Control, Decision Theory, and Rationality: New Essays

José Luis Bermúdez (ed.), Self-Control, Decision Theory, and Rationality: New Essays, Cambridge University Press, 2018, 268pp., \$105.00 (hbk), ISBN 9781108420099.

Reviewed by Richard Pettigrew, University of Bristol

2019.09.11

When José Luis Bermúdez invited the authors of this volume to contribute, he described for them the paradigm sort of case he wished the volume to consider and the two questions about it that he wanted them to answer. Here is the version of the paradigm case that Johanna Thoma considers in her chapter ('Temptation and Preference-Based Instrumental Rationality'). You have been working all morning and it is now time for your coffee break. You want to watch an episode of your favourite TV programme during the break but you're worried that, if you do, you'll be tempted to watch a second episode straight after the first has finished, and currently you don't want that because you more strongly want to finish your work. Should you watch the first episode? If you do, should you watch the second? It seems that, whether or not rationality requires it of us, we do sometimes resist temptation in such situations and stop watching after the first episode. If that's right, how do we do that?

Thoma's example has the following structure. At time t1, you face a decision between action A (watch one episode) and action B (watch none). If you choose B, you will receive outcome O0 (in which you watch no episodes). On the other hand, if you choose A, you will then face a further decision at the later time t2 between action C (watch no further episodes) and action D (watch a second episode). Action C leads to outcome O1 (in which you watch just one episode) and action D leads to outcome O2 (in which you watch two episodes). Here are your preferences over the three outcomes at the three relevant times, where t1 is the start of your coffee break, t2 is one episode's length later, and t3 is another episode's length after that:

t1: O1 > O0 > O2

t2: O2 > O1 > O0

t3: O1 > O0 > O2

Thus, your preferences switch between t1 and t2, and then switch back between t2 and t3. We can think of many more examples with a similar structure: I want to have only one piece of cake, but I know that, if I do, I'll want another; ditto cigarettes; I want to check Twitter for a maximum of 5 minutes but know that if I log in, I'll want to stay on for 30 minutes; and so on. In each case, if at the earlier time (t1) I choose the action (A) that will put me in a position to obtain the outcome I most prefer at that earlier time (O1), I thereby also put myself in a position at a later time (t2) where I face a choice between an action (C) that will deliver the outcome (O1) I initially most preferred and an action (D) that will deliver the outcome (O2) I initially least preferred; but, at this later time (t2), my preferences will have reversed and I will now prefer the outcome I initially least preferred (O2); and by a still later time (t3), my preferences will have switched back again. In many such situations, we say that I am tempted at the middle time (t2); and if I resist that temptation, we say that I display self-control. We often think of self-control as a virtue and its exercise as the best course of action, and yet it seems to involve acting against your preferences at the time of action, something that decision theory usually judges to be irrational.

With the paradigm case in hand, we can now meet Bermúdez's questions to his contributors:

1. When is it rational to exercise self-control?
2. How is it possible to exercise self-control?

The resulting bookful of answers is wonderfully rich. It is genuinely and productively multi-disciplinary, drawing contributions from philosophy, decision theory, and psychology. For the reader, studying all of the contributions together, it is the best sort of edited collection: a thrilling tour of the unifying questions, prompting you to make all manner of interesting connections between the entries.

Let's begin by enumerating some of the answers to the normative questions here. How should I choose at t1? Should I choose action A (watch the first episode) or action B (watch no episodes)? If I choose action A at t1, how should I choose at t2? Should I choose action C (watch no further episodes) or action D (watch the second episode)?

One option is what Thoma calls the two-tier account. On this, it is not only actions that we might choose to perform; we can also choose to adopt plans or resolutions or strategies or intentions. These specify, for current and future possible choices, how you should choose if you are faced with those choices. Thus, in the paradigm example from above, you choose a plan at t1, and then the acts it is rational for you to choose at t1 and t2 are those specified by the plan. Four varieties of two-tier accounts are considered in the book. Thoma considers David Gauthier's version (Gauthier 1994), which exhorts you to choose at t1 the plan that you will consider best at all future times. However, as Thoma notes, in cases of temptation, there will be no such plan. At t1 and t3, you will consider it best to plan to watch just one episode and then stop; at t2, you'll consider it best to plan to watch two episodes. Martin Peterson and Peter Vallentyne ('Self-Prediction and Self-Control') consider Edward McClennen's resolute choice account (McClennen 1990), but note that it is narrow in its scope. It can apply only to individuals with perfect self-control, and such individuals are rare indeed! In Bermúdez's own contribution to the volume ('Frames, Rationality, and Self-Control'), he considers Richard Holton's account of resolutions (Holton 2009), mental states that are supposed to provide the psychological mechanism by which we effect McClennen's resolute choice. Bermúdez raises three objections to it, chief among them that Holton gives no account of how it might be possible to reconsider at t2 a resolution you made at t1 but then come to rationally reaffirm it at t2. Finally, Kenny Easwaran and Reuben Stern ('The Many Ways to Achieve Diachronic Unity') treat Michael Bratman's norm of diachronic continence, which says that, at a later time, you should follow through on any intentions made at an earlier time concerning what you will do at the later time (Bratman 2012). Among other things, they argue that you cannot be rationally required not to reopen deliberation about what to do at the later time if such deliberation will have a negligible cost or even positive benefit -- in their example, reopening the deliberation saves you from boredom!

A second option is the sophisticated choice account. According to this, when you know you face a sequence of decisions, you should treat your future choices in the face of those decisions as part of the world that you can predict and that you should take into account when you make your initial choice. For instance, in Thoma's example, you would reason as follows: if I choose A at t1, then I'll face the choice between C and D at t2; my preferences at t2 will lead me to choose D, which has outcome O2; if, on the other hand, I choose B at t1, this will have outcome O0; at t1, I prefer O0 to O2, so I should choose B at t1. That is, I should choose to watch no episodes since I prefer that to watching two and, if I watch one, I'll end up watching another. However, as Peterson and Vallentyne point out, this assumes that you can be certain you'll choose at t2 in line with the preferences you'll have at that time. And given that we know we manage to resist temptation at least sometimes, we shouldn't be sure of this. They then propose a generalisation of sophisticated choice on which we still treat our future choices as simply other future events whose outcomes we must factor into our decision-making, but we do not assume that we are certain how we will make those choices.
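The backward-induction reasoning behind sophisticated choice can be sketched in a few lines of Python. This is only an illustration of the argument as summarised above, not anything taken from the volume: the function names, the cardinal utilities in `U_T1`, and the probability `p_resist` are assumptions introduced here, and the expected-utility reading of Peterson and Vallentyne's generalisation is one simple way to model it rather than their own formalism.

```python
# Ordinal rankings from Thoma's example: earlier in the list = more preferred.
PREFS = {
    "t1": ["O1", "O0", "O2"],
    "t2": ["O2", "O1", "O0"],
}


def prefers(time, x, y):
    """True if outcome x is ranked above outcome y at the given time."""
    ranking = PREFS[time]
    return ranking.index(x) < ranking.index(y)


def sophisticated_choice():
    """Predict the t2 choice from the t2 ranking, then choose at t1."""
    # At t2, action C yields O1 and action D yields O2.
    t2_act, t2_outcome = ("C", "O1") if prefers("t2", "O1", "O2") else ("D", "O2")
    # At t1, action A leads to the predicted t2 outcome; action B yields O0.
    t1_act = "A" if prefers("t1", t2_outcome, "O0") else "B"
    return t1_act, t2_act


# Hypothetical cardinal utilities at t1, consistent with O1 > O0 > O2.
U_T1 = {"O1": 2.0, "O0": 1.0, "O2": 0.0}


def uncertain_choice(p_resist):
    """Choose at t1 when you resist at t2 only with probability p_resist."""
    eu_a = p_resist * U_T1["O1"] + (1 - p_resist) * U_T1["O2"]
    return "A" if eu_a > U_T1["O0"] else "B"


print(sophisticated_choice())  # ('B', 'D')
print(uncertain_choice(0.8))   # A
print(uncertain_choice(0.3))   # B
```

With these made-up utilities, watching the first episode becomes the better choice at t1 once the probability of resisting at t2 exceeds one half; the threshold is an artefact of the numbers chosen, not a claim from the chapter.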

A third option is what Thoma calls intrapersonal optimality accounts. According to these, we should consider the two time-slices of the agent, one at t1 and one at t2, as distinct agents who face different decisions. We then seek an account of how these two separate agents might cooperate in such a way that the t2 time slice will choose against its preferences and resist temptation. In her chapter ('Putting Willpower into Decision Theory'), Natalie Gold gives a fascinating overview of team reasoning, the theory of group decision making developed independently by Michael Bacharach as well as Gold herself, working together with Robert Sugden (Bacharach 1997, Sugden 1993, Gold and Sugden 2007). According to this theory, individuals can sometimes identify with a group of which they are a member and make decisions using group agency and group preferences. Gold applies this theory when the group in question is the collection of time slices of an individual.

Thoma poses a puzzle for any such theory. If we conceive of the time slices as genuinely separate entities, rationality cannot require that they use team reasoning to make their decisions, just as it cannot require that I use team reasoning when I'm making decisions that will affect others at my university. If that's right, then intrapersonal optimality accounts can at most explain how it's possible to resist temptation at t2, but not why it's rationally required. On the other hand, if we conceive of the time slices as different stages of the same person, we might wonder why the new preferences at t2 don't simply override the old ones at t1. Thoma asks for an account of intrapersonal cooperation that would sit between these two extremes and require the time slice at t2 to choose in line with the preferences of the time slices at t1 and t3. Here's a proposal. An individual agent should see the relationship between herself and the collection of time slices that she comprises in the same way we see the relationship between certain collectives, such as activist groups, and the individual members that they comprise. In these collectives, executive power passes from one member to another for a specified time period: Sarah is in charge for one month, then Adil, then Cleo, and so on. The individual in charge can place some extra weight on their own preferences, but they must also give weight to the preferences of the others in the collective. Understood in such a way, time slices will give weight in the way Gold proposes in her team reasoning account and the numbers will often work out so that the t2 time slice is required to resist temptation.
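To see how "the numbers" might work out under the rotating-executive proposal, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the cardinal utilities assigned to each time slice, the weighted-sum rule, and the parameter `extra_weight` that the slice in charge places on its own preferences. It is not a rendering of Gold's team reasoning account.

```python
# Hypothetical cardinal utilities for each time slice, consistent with the
# ordinal rankings in Thoma's example.
UTILITIES = {
    "t1": {"O1": 2.0, "O0": 1.0, "O2": 0.0},
    "t2": {"O2": 2.0, "O1": 1.0, "O0": 0.0},
    "t3": {"O1": 2.0, "O0": 1.0, "O2": 0.0},
}


def collective_score(outcome, in_charge, extra_weight):
    """Weighted sum of utilities; the slice in charge counts extra_weight times."""
    return sum(
        (extra_weight if slice_name == in_charge else 1.0) * utils[outcome]
        for slice_name, utils in UTILITIES.items()
    )


def t2_choice(extra_weight):
    """The t2 slice chooses between C (outcome O1) and D (outcome O2)."""
    score_c = collective_score("O1", "t2", extra_weight)
    score_d = collective_score("O2", "t2", extra_weight)
    return "C" if score_c > score_d else "D"


print(t2_choice(2.0))  # C: the t2 slice resists temptation
print(t2_choice(5.0))  # D: the t2 slice gives in
```

On these assumed numbers the t2 slice resists so long as it weights its own preferences less than four times as heavily as those of the other slices.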

A fourth option, defended by Thoma and also Alfred R. Mele ('Exercising Self-Control'), is to distinguish evaluative preferences from motivational preferences. Motivational preferences have causal force that plays the primary role in determining action; evaluative preferences are your all-things-considered judgment about the options. These can often come apart, and Thoma and Mele both think that this happens in cases of temptation. While your motivational preferences switch between t1 and t2 and then back again between t2 and t3, your evaluative preferences are the same as your motivational preferences at t1 and t3, but instead of switching in between, they remain constant. Thus, you are rationally required at t2 to choose in line with your evaluative preferences, even though your motivational preferences make it very likely you won't achieve that.

A fifth and, to my knowledge, novel option is presented by Arif Ahmed ('Self-Control and Hyperbolic Discounting'). Suppose you are currently at t2, having watched one episode; and suppose, moreover, that you know that you'll face similar sequences of choices many times in the future -- during every coffee break for the rest of your working life, for instance. Then, at t2, you prefer to succumb to temptation this time and watch the second episode; but you also prefer not to succumb to temptation next time nor for any of the other future times. That's because the temptation to watch a second episode is felt only when you are faced with the decision. One consequence of this: if by resisting temptation this time I can make it likely enough that I'll resist again in the future, I'll actually prefer to resist this time and stop watching after one episode. Now, Ahmed doesn't think that's likely. But what he does think is likely is that resisting this time gives you evidence that you'll resist next time. Thus, while causal decision theory would lead you to give in to temptation at t2, evidential decision theory might well lead you to resist. You can then either argue from the rational requirement to resist temptation to evidential decision theory, or from evidential decision theory to the rational requirement to resist temptation.
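The contrast between the two decision theories can be made concrete with a toy calculation. The following Python sketch is a deliberate simplification of the argument as summarised here, not Ahmed's own model: the payoffs (`gain_now`, `benefit_per_resist`), the number of future breaks, and all the probabilities are hypothetical values chosen for illustration.

```python
def expected_value(act_now, p_future_resist, n_future=100,
                   gain_now=1.0, benefit_per_resist=0.5):
    """Value, judged at t2, of acting now given a chance of resisting later.

    gain_now: extra value of watching the second episode this time.
    benefit_per_resist: value placed now on resisting at one future break.
    """
    now = gain_now if act_now == "indulge" else 0.0
    return now + n_future * benefit_per_resist * p_future_resist


ACTS = ("resist", "indulge")

# Causal decision theory: the present act does not change the chance of
# future resistance, so the same probability is used for both acts.
cdt = {act: expected_value(act, 0.5) for act in ACTS}

# Evidential decision theory: the present act is evidence about future
# behaviour, so the probability is conditional on the act chosen.
p_given_act = {"resist": 0.7, "indulge": 0.3}
edt = {act: expected_value(act, p_given_act[act]) for act in ACTS}

print(max(cdt, key=cdt.get))  # indulge
print(max(edt, key=edt.get))  # resist
```

Under the causal calculation indulging always comes out ahead by exactly `gain_now`, whatever the shared probability is; under the evidential calculation resisting wins whenever the evidential difference, multiplied by the stakes of the future breaks, outweighs that gain.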

A sixth and, again, novel account is provided by Chrisoula Andreou ('Why Temptation?'). Andreou describes a model of the motivational mechanisms that might underpin temptation and argues that they are rational mechanisms that function correctly when we give in to temptation. For her, temptation occurs when we have multiple competing desires. How should a decision-making system work in such a case? Andreou argues that it must balance fairness to the various competing desires, ensuring that each gets its turn to drive choice, while also ensuring that the agent can take advantage of situations in which there is a lot of opportunity to satisfy a particular desire when it is not that desire's turn to drive choice. She argues that a mechanism that does this may well lead us to give in to temptation.

So much, then, for the normative side of the question. The contributions that deal most obviously with the descriptive side are those by the psychologists, Leonard Green and Joel Myerson ('Preference Reversals, Delay Discounting, Rational Choice, and the Brain') and Howard Rachlin ('In What Sense Are Addicts Irrational?'). These authors have long-established research programmes in this area and their contributions to this volume provide valuable overviews of the central results they've established. They are primarily concerned with the ways in which individuals discount future goods -- I value 1 bar of chocolate tomorrow more than 10 bars in one year's time. This phenomenon is often held responsible for creating temptation cases. Some highlights from Green and Myerson's piece: (1) When the future good is larger, we discount it less than when it's smaller, which means that even exponential discounting functions can result in preference reversal; (2) the hyperboloid curve that they describe fits the empirical data on discounting better than the standard hyperbolic curve; (3) the ease with which we can imagine future outcomes is often thought to determine how much we discount future outcomes, but Green and Myerson's experiments involving someone with no ability to imagine the future suggest that this isn't so.
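A short numerical sketch may help with highlights (1) and (2). The functional forms below are the standard ones from the discounting literature, with the hyperboloid written as A/(1 + kD)^s, the form usually associated with Green and Myerson's work; the review itself does not state the formulas. The amounts, delays, and rate parameters are hypothetical, chosen only to make the pattern visible.

```python
import math
from functools import partial


def exponential(amount, delay, k):
    """Exponential discounting: amount * exp(-k * delay)."""
    return amount * math.exp(-k * delay)


def hyperbolic(amount, delay, k):
    """Simple hyperbolic discounting: amount / (1 + k * delay)."""
    return amount / (1 + k * delay)


def hyperboloid(amount, delay, k, s):
    """Hyperboloid discounting; reduces to the hyperbolic form when s == 1."""
    return amount / (1 + k * delay) ** s


# Hypothetical choice: a smaller reward after `wait`, or a larger reward
# GAP time units after that.
SMALL, LARGE, GAP = 50.0, 100.0, 10.0


def choice(value_small, value_large, wait):
    """Return which reward has the higher discounted value at this distance."""
    ss = value_small(SMALL, wait)
    ll = value_large(LARGE, wait + GAP)
    return "small" if ss > ll else "large"


hyp = partial(hyperbolic, k=0.5)
exp_same = partial(exponential, k=0.1)          # one rate for both rewards
exp_small = partial(exponential, k=0.2)         # steeper rate for the small reward
exp_large = partial(exponential, k=0.15)        # shallower rate for the large reward

for wait in (20, 0):
    print(wait,
          choice(hyp, hyp, wait),
          choice(exp_same, exp_same, wait),
          choice(exp_small, exp_large, wait))
# 20 large small large
# 0 small small large -> see note: third column is 'small' at wait 0
```

Read from far away (wait of 20) to close up (wait of 0), the hyperbolic chooser switches from the larger to the smaller reward, which is the temptation pattern; the exponential chooser with a single rate never switches; and the exponential chooser whose rate is lower for the larger reward switches too, illustrating highlight (1). The trailing comment in the block gives the first printed row exactly; for the second row all three columns are `small`.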

Rachlin's piece provides psychological evidence that individuals think of themselves, at least in some ways, along the lines of the time-slice model used by Gold and discussed by Thoma in their chapters. When an outcome will give you something good in the future, you value that outcome less the greater the distance in time between now and when you'll receive that good; when an outcome will give someone else something good now, you value that outcome less the greater the 'social distance' is between you and that individual. Rachlin describes experiments that show that the function of temporal distance you use to discount future goods for yourself is the same as the function of social distance you use to discount current goods for others. This strongly suggests that, when we consider outcomes that give us goods in the future, we think of ourselves in the future as a different person to whom we are socially connected to a certain extent.

Combining Rachlin's results and Green and Myerson's, a natural question arises: do we discount large goods for others less than small goods in the way that Green and Myerson show that we discount large goods for future selves less than small goods? This sort of question occurred to me countless times as I read through this book. The results surveyed, the approaches proposed, and the arguments given are rich and varied enough to keep interest at all times; but they are also focussed very closely on a single issue, and that suggests very fruitful ways in which we might combine different parts of the material presented here to make further progress.

REFERENCES

Bacharach, M. (1997) 'We' Equilibria: A Variable Frame Theory of Cooperation. Oxford: Institute of Economics and Statistics, University of Oxford.

Bratman, M. (2012) 'Time, Rationality, and Self-Governance'. Philosophical Issues 22:73-88.

Gauthier, D. (1994) 'Assure and Threaten'. Ethics 104(4):690-721.

Gold, N. and R. Sugden (2007) 'Collective Intentions and Team Agency'. Journal of Philosophy 104:109-37.

Holton, R. (2009) Willing, Wanting, Waiting. Oxford: Oxford University Press.

McClennen, E. F. (1990) Rationality and Dynamic Choice. Cambridge: Cambridge University Press.

Sugden, R. (1993) 'Thinking as a Team: Towards an Explanation of Nonselfish Behavior'. Social Philosophy and Policy 10:69-89.