Seeing Things: The Philosophy of Reliable Observation

Robert Hudson, Seeing Things: The Philosophy of Reliable Observation, Oxford University Press, 2014, 274pp., $69.00 (hbk), ISBN 9780199303281.

Reviewed by Jacob Stegenga, University of Victoria

2015.09.17


A scientific inference based on multiple lines of independent evidence converging on a similar result is often described as robust. Such 'robustness reasoning' is widely held to be valuable by philosophers of science. Robustness has been appealed to as a defense of various forms of scientific realism and as a response to various forms of scientific skepticism such as underdetermination. Robert Hudson's book sets out to challenge the importance of robustness reasoning.

Despite the ubiquity of appeals to robustness in philosophy of science, as far as I am aware there are no book-length works with a sustained focus on robustness other than Soler et al. (2012), which is a collected volume. Seeing Things is the first of its kind, and for that reason alone it is a valuable contribution. More than this, however, it challenges the mainstream view of robustness. It is, therefore, an ambitious project.

Most of the philosophical arguments come in Chapters One, Six, and Seven. Chapter One criticizes various philosophical defenses of robustness, and Chapter Six picks up where One left off. Chapter Seven asks what remains of arguments for scientific realism without robustness (not much). Chapters Two through Five are detailed descriptions of case studies from twentieth-century science. The central point of these case studies, as Hudson would have it, is that scientists themselves do not place much weight on robustness reasoning. If true, this would be surprising, since scientists often claim to employ robustness reasoning, and philosophers of science have relied on the importance of robustness for many of their arguments (recent examples include Schupbach forthcoming; Claveau 2013; Lisciandra 2014). Indeed, one of the cases that Hudson describes is the work of Jean Perrin, who famously argued for the reality of atoms by estimating Avogadro's number with a variety of independent lines of evidence. Perrin claimed that he was reasoning robustly (though he did not use that term), and philosophers such as Nancy Cartwright, Wesley Salmon, Deborah Mayo, and Peter Achinstein have appealed to the case of Perrin as exemplifying robustness reasoning. Hudson revisits the case and concludes that Perrin did not in fact employ robustness reasoning.

Hudson promotes two epistemological principles which he seems to think are competitors to robustness: 'reliable process reasoning' and 'targeted testing'. Reliable process reasoning involves making an inference based on the single most reliable method available to scientists in the relevant domain. Targeted testing involves choosing among several hypotheses that are presently underdetermined by the available evidence: one acquires independent evidence that increases confidence in one of the available methods, and thereby increases confidence in one hypothesis over another (Hudson recognizes that targeted testing sounds similar to robustness, but he labors to show that they are distinct principles of scientific reasoning). These two principles are undoubtedly important to scientific reasoning, though it is bold to claim that their epistemological merit entirely supplants that of robustness.

While many philosophers seem to hold that robustness reasoning is obviously valuable, Hudson notes that there have been few attempts to justify the value of robustness. Four arguments that have thus far been proposed to justify robustness are (in Hudson's terminology): no-miracles, probabilistic, pragmatic, and epistemic. In Chapter One Hudson claims that he finds all of these wanting. He identifies what he calls the 'core' argument underlying each of these justifications of robustness, which is akin to the no-miracles argument, and which he claims to undermine. In Chapter One he says that he will leave the details of this criticism until Chapter Six.

The core argument for robustness, in Hudson's words, is:

if an observational report is the product of two (or more) different physical process [sic] (or, in epistemic terms, the product of two or more distinct theoretical assumptions), then there is less of a chance the report is only an artifact of one of the processes (or simply a byproduct of one of these assumptions) since the independent production of the same artifact, despite a change in the physical process (or in the assumptions used), is highly unlikely. (170)

The best explanation for the fact that multiple independent lines of evidence produced the same report, according to the core argument for robustness, is "the reliability of the processes that generate this report along with this report's truth" (170). After five chapters, Hudson finally articulates the problem that he has with the core argument. The problem has to do with the notion of independence. The core argument that warrants robustness is compelling only if the multiple lines of evidence are independent. But, claims Hudson -- and this is the crux of his complaint -- the notion of independence "lacks clarity" (173). When the crux comes, the reader is referred back to Chapter One.

One way of explicating independence is probabilistic, and Hudson purports to show that such approaches have thus far failed, though the strength of his thesis about this is ambiguous: "So the lesson we might derive here is that, to understand the independence that underwrites robustness reasoning, we need to comprehend this independence in a nonprobabilistic way" (24). This is in Chapter One. He waits until he has discussed his case studies in detail, in Chapters Two, Three, Four, and Five, to assess a nonprobabilistic way of explicating independence, in Chapter Six.

I found many of Hudson's arguments unconvincing. Consider his case against probabilistic approaches to robustness (sometimes called the 'variety of evidence thesis' among formal philosophers of science): those of Colin Howson and Allan Franklin, Elliott Sober, and Luc Bovens and Stephan Hartmann, as well as an unpublished approach articulated twenty years ago in personal correspondence (on which Hudson dwells for almost three pages). For brevity I discuss only his criticism of Sober's approach.

Let P be a proposition of interest, and Wi(P) stand for 'witness Wi claims P'. Sober (2008) gives a formal argument that supports the ubiquitous intuition undergirding robustness, namely, that "two independent and (at least minimally) reliable witnesses who agree that P is true provide stronger evidence in favor of P than either witness does alone." Sober's argument relies on this plausible representation of independent witness reports: that W1(P) is independent of W2(P) once we conditionalize on P. Hudson retorts that if this premise is true, then W1(P) must also be independent of W1(P) once we conditionalize on P. Skipping some details, Hudson concludes that "Sober's argument for the value of retrieving independent reliable witness reports, as opposed to sticking with just one witness, breaks down at its most crucial point" (20). But why must W1(P) be independent of W1(P) once we conditionalize on P? This is a strange idea for which Hudson gives no argument, though he does restate his claim: "P (or ¬P) screens off the impact learning W1(P) might have on our assessment of the probability of W1(P) just as it does with W2(P)" (19-20). To see that this is false, one can compare the following probabilities: p(W1(P)|P & W1(P)), which is always 1, versus p(W1(P)|P), which is not necessarily 1 and indeed will be 1 only in unusual cases. Thus, in general, p(W1(P)|P & W1(P)) ≠ p(W1(P)|P); in other words, W1(P) is usually dependent on W1(P) once we conditionalize on P (this is intuitive: the impact that a witness's report has on one's assessment of P ought to depend on the impact that the same report, expressed earlier, had on one's assessment of P).
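To make the point concrete, here is a minimal numerical sketch of my own (not Hudson's or Sober's; the prior and reliability figures are illustrative assumptions). Two witnesses are modelled as conditionally independent given P; conditioning on P then screens off W1(P) from W2(P), but it does not screen off W1(P) from itself:

    # A minimal sketch (mine, not from the book): two witnesses W1 and W2,
    # each possibly reporting P, modelled as conditionally independent given
    # the truth-value of P. All numbers are illustrative assumptions.
    from itertools import product

    prior_P = 0.5         # p(P)                      (assumed)
    hit = 0.8             # p(Wi reports P | P)       (assumed reliability)
    false_alarm = 0.3     # p(Wi reports P | not-P)   (assumed)

    def joint(p_is_true, w1, w2):
        """Joint probability of (P-value, W1, W2) under conditional independence given P."""
        base = prior_P if p_is_true else 1 - prior_P
        r = hit if p_is_true else false_alarm
        return base * (r if w1 else 1 - r) * (r if w2 else 1 - r)

    def prob(pred_num, pred_den):
        """Conditional probability: mass of numerator-worlds over denominator-worlds."""
        worlds = list(product([True, False], repeat=3))
        num = sum(joint(*w) for w in worlds if pred_num(*w))
        den = sum(joint(*w) for w in worlds if pred_den(*w))
        return num / den

    # p(W1 | P) is 0.8, but p(W1 | P & W1) is trivially 1, so P does not
    # screen off W1(P) from itself -- the premise Hudson's retort needs.
    print(prob(lambda p, w1, w2: p and w1, lambda p, w1, w2: p))           # 0.8
    print(prob(lambda p, w1, w2: p and w1, lambda p, w1, w2: p and w1))    # 1.0

    # By contrast, P does screen off W1(P) from W2(P) in this model:
    print(prob(lambda p, w1, w2: p and w2, lambda p, w1, w2: p))           # 0.8
    print(prob(lambda p, w1, w2: p and w1 and w2, lambda p, w1, w2: p and w1))  # 0.8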

Thus Hudson's case against Sober is unsound. Unfortunately he relies on the same move when arguing against the probabilistic approach to robustness advanced by Bovens and Hartmann. In short, if you are among those readers whose eyes glazed over when reading the prior paragraph: the most promising probabilistic approaches to warranting robustness remain unscathed by Hudson.

Here is one of Hudson's central arguments against robustness reasoning which appears throughout the book: if a scientist has inferred P using the most reliable method available (call this method M), then that scientist ought not use, in addition to M, non-M methods of inferring P because those methods are (by stipulation) less reliable. A representative quote: "These other approaches can be simply and safely ignored" (107). In many places throughout the book Hudson goes so far as to claim that any non-M method is unreliable. Sometimes Hudson seems to assume that reliability of empirical methods is all-or-nothing -- methods are "either reliable or not" (6) -- and since non-M methods are less reliable than M, they are unreliable and thus should be ignored: "robustness is not a valuable approach when we are considering two observational procedures, one of which is deemed reliable and the other unreliable" (211). This basic argument appears throughout the book (see 97, 101, 102, 107, 174, 211, and 243, as examples).

It could be that in some domains of science there is one gold-standard method while all other methods are mere shots in the dark, little better than palm reading or simply guessing. But most of science is not like that. Suppose, for example, that a single small randomized controlled trial -- controversially considered to be the gold-standard method in medical research -- shows that a drug reduces risk of death by 2%, while ten large case-control studies -- controversially considered to be less reliable than randomized controlled trials -- show that the same drug increases risk of death by 1%. Should those who must estimate the effect of this drug on risk of death -- physicians and patients and policy-makers, say -- ignore the ten large case-control studies and pay heed only to the single randomized controlled trial? Surely not.
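To make the worry vivid, here is a toy calculation of my own. The effect sizes echo the hypothetical example above, while the standard errors and the use of simple inverse-variance pooling are illustrative assumptions rather than anything Hudson or the medical literature is committed to; the point is only that the ten larger studies carry most of the evidential weight:

    # Toy fixed-effect (inverse-variance) pooling of risk-difference estimates.
    # Effect sizes follow the hypothetical example; standard errors are my own
    # illustrative assumptions, not from the book or any real trial.
    small_rct = (-0.02, 0.03)              # one small RCT: -2 points, assumed SE 0.03
    case_controls = [(0.01, 0.01)] * 10    # ten large studies: +1 point each, assumed SE 0.01

    studies = [small_rct] + case_controls
    weights = [1 / se ** 2 for _, se in studies]
    pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)

    print(round(pooled, 4))  # ≈ +0.0097: the ten larger studies dominate the single RCT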

Hudson's central argument violates the well-known 'principle of total evidence'. The present review is no place to argue about this principle. Nevertheless, violating it is, to say the least, controversial. In a book-length work that consistently urges us to set the principle aside, one would hope that philosophical reservations about doing so would be addressed.

What is robustness reasoning supposed to establish? For most philosophers and scientists who appeal to robustness, the structure of inference is supposed to increase confidence in an experimental finding or observational result -- but to use a handy distinction, it is not the data itself which is rendered more secure by robustness, but rather a more general and abstract phenomenon which is inferred from the data. Suppose a single observational process or experimental method (M1) generates evidence (E1) which supports an inference to phenomenon P. Is P true? How confident ought we be in P? The trouble is that, despite our inference, P might not be the case because E1 might be merely the result of systematic error in M1. Now suppose we have a second method, M2, independent of the first, which generates evidence (E2) that also supports the inference to P. We have a case of robustness. Usually, robustness is said to be about P: P is more probably true given E1 and E2 than given either line of evidence alone.
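The probabilistic gloss on this is straightforward. Here is a minimal sketch of my own, assuming the two lines of evidence are conditionally independent given P and each is at least minimally reliable (more likely if P is true than if it is false); the specific numbers are arbitrary:

    # Minimal Bayesian sketch (illustrative numbers, not from the book):
    # n conditionally independent, minimally reliable lines of evidence for P.
    prior = 0.5
    lik_true = 0.7     # p(Ei | P)      (assumed)
    lik_false = 0.4    # p(Ei | not-P)  (assumed)

    def posterior(n_lines):
        """p(P | E1, ..., En) by Bayes' theorem, assuming conditional independence given P."""
        num = prior * lik_true ** n_lines
        den = num + (1 - prior) * lik_false ** n_lines
        return num / den

    print(posterior(1))  # ~0.64 with one line of evidence
    print(posterior(2))  # ~0.75 with two convergent lines

On this reading, convergence raises the probability of the inferred phenomenon P; it need not say anything directly about the methods themselves.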

Hudson, on the other hand, seems to think that robustness is about M1 and M2. That is, Hudson seems to think that, in cases like ours, robustness reasoning holds that the convergence of E1 and E2 on P indicates that our confidence in the reliability of M1 and M2 ought to be greater than it would have been given a lack of such convergence. Robustness reasoning, claims Hudson, involves "grounding the reliability of independent observational processes on their capacity to generate of robust results [sic]" (195). In virtue of our increased confidence in the reliability of M1 and M2, we might increase our confidence in P, but it is the increase of confidence in the reliability of M1 and M2 that Hudson seems preoccupied with. This view of robustness is repeated throughout (see 7, 173, 190, 200, 211, and 243, as examples), though Hudson does occasionally claim that robustness is about both the methods (M1 and M2) and the phenomenon P. Is robustness about increasing our confidence in P or about increasing our confidence in the reliability of M1 and M2?

I doubt that there is a definitive answer. A lesson from confirmational holism is that confirming evidence can increase confidence in a hypothesis, in one's background assumptions regarding one's experimental methods, or sometimes both. If that is the case with a single kind of confirming evidence, it is the case when one has multiple kinds of confirming evidence that converge on a similar result. If so, then even if we grant that Hudson's arguments against robustness with respect to reliability of methods are convincing (which, as I have suggested, would be generous), that would leave untouched robustness with respect to inferred phenomena. And it is this use of robustness that most philosophers and scientists are concerned with, though arguing the point here would take me astray.

Most of the philosophical literature that Hudson grapples with is dated. His interlocutors include books and articles published in the early 1980s by Hacking, Franklin and Howson, McMullin, and Salmon. With a few exceptions, his citations to more recent work tend to lack engagement. For instance, in the introduction he cites an article of mine (2009) identifying me as a 'pro-robustness supporter' despite the fact that the central point of the cited article is to articulate problems with robustness reasoning. His engagement with the recent probabilistic approaches to robustness is, as noted above, problematic.

Contrary to Hudson's thesis that robustness reasoning is useless, the right thing to say about robustness reasoning is that it can be useful though it is fallible. This is what the mesosome case, as described by Hudson in Chapter Two (and by others before him), shows. But if that is the most critical thing one can say about robustness, why should we care? Aren't we all fallibilists about empirical science? Not so fast. Many arguments for scientific realism of one flavor or another have been based on robustness-type considerations, as Hudson aptly shows. Putting aside one's intuitions about the no-miracles argument, it is hard to see why one would convert to realism upon consideration of a fallible method of scientific reasoning (robustness). Noting the fallibility of robustness reasoning, as modest as this may seem, is a contribution to the realism-antirealism debate since defenders of realism have placed so much metaphysical weight on robustness.

Thus far I have had little to say about the four case studies that Hudson discusses. They are about: mesosomes, once thought to be a structure of cells but now thought to be an artifact of preparing cells for microscopy (Chapter Two); weakly interacting massive particles, one of the leading hypothetical candidates for 'dark matter' (Chapter Three); Perrin's famous estimation of Avogadro's number and subsequent argument for the reality of atoms (Chapter Four); and purported observations of dark matter, dark energy, and the accelerating expansion of the universe (Chapter Five). These case studies are described in detail, and the first two have been extensively discussed by previous historians and philosophers of science. The point of these chapters "is to see if robustness reasoning is in fact used by practicing scientists" (52). Hudson claims that it is not. The implication seems to be that robustness reasoning is not a compelling mode of inference in science (see, for example, 65, 80 and 94). This sort of historical naturalism about scientific reasoning has become popular in contemporary philosophy of science, so fewer philosophers of science will be offended by it than one might suspect.

In short, this is an ambitious book which stumbles. It is the first sustained, single-author book on robustness, in which a widely believed platitude is challenged. Its flaws, however, are numerous. Nevertheless, I anticipate that it will interest those philosophers of science concerned with scientific reasoning and the realism-antirealism debate.

REFERENCES

Claveau, Francois. 2013. "The Independence Condition in the Variety-of-Evidence Thesis." Philosophy of Science 80: 94-118.

Lisciandra, Chiara. 2014. "Robustness Analysis and Tractability in Modeling." Philosophy of Science Association Meeting, Chicago.

Schupbach, Jonah. Forthcoming. "Robustness Analysis as Explanatory Reasoning." The British Journal for the Philosophy of Science.

Sober, Elliott. 2008. Evidence and Evolution. Cambridge University Press.

Soler, Léna, Trizio, E., Nickles, T., and Wimsatt, W. (eds.). 2012. Characterizing the Robustness of Science. Springer.

Stegenga, Jacob. 2009. "Robustness, Discordance, and Relevance." Philosophy of Science 76: 650-661.