Evidence, Inference and Enquiry

Evidence, Inference and Enquiry includes seventeen essays and is a product of a multiyear, multidisciplinary evidence project housed primarily at University College London and directed by Philip Dawid and William Twining.[1] The UCL Evidence Programme was funded by the Leverhulme Trust in response to the Trust's call for proposals on the "Nature of Evidence." This is not, strictly speaking, a collection of philosophy essays -- the participants in the project were from many disciplines (e.g., jurisprudence, physiology, cognitive science, classics, engineering). Nonetheless, evidence, inference, and inquiry are clearly of interest to philosophers, and so the collection is likely to be as well.

The foreword and Philip Dawid's introduction alert the reader to various benefits and drawbacks of tensions among the participants. These tensions make the first few chapters somewhat surprising. One expects essays about evidence -- and these are there -- but one does not expect essays about the project itself, and either whole essays or parts of the first few essays are indeed devoted to the difficulties of the project. The defensive tone of Schum's "Classifying Forms and Combinations of Evidence: Necessary in a Science of Evidence" reveals at least one source of these difficulties. He reports that in an early paper for the project he had argued for a "substance-blind" classificatory system of evidence. Schum tells us, "my use of the term 'substance-blindness' regarding evidence aroused some controversy. What I had said about there being a scheme for classifying all evidence regardless of its substance or content has occasionally been misinterpreted." (15) In the present essay he proposes to clarify both what he intended by "science" and by "substance-blindness" and to argue for the importance of a uniform classification system that can be applied to any discipline.

For his clarification of "science" he turns to The New Shorter Oxford English Dictionary. There he finds five characterizations, several of which he acknowledges might be seen as problematic. The one that he finds resonates most closely with his goals focuses on science as a system of classification, and he goes on to cite Poincaré for a clearer notion of what he means by "substance-blind." It is the recurrent nature of patterns of relations that he seeks to identify in evidence. Schum proposes two dimensions along which classification of evidence should proceed: credibility and relevance. The credibility dimension includes five types: tangible (real or demonstrative), testimonial (unequivocal or equivocal), missing tangibles or testimony, and accepted facts. Each of these is also judged across the dimension of relevance, as either directly or indirectly relevant.

In addition to classifying items of evidence, Schum's system classifies evidence in terms of its relation to other evidence. Evidence may be related harmoniously, dissonantly, or redundantly. The great virtue of employing such a classification system, according to Schum, is particularly clear when it is necessary to sort through large bodies of evidence (in any discipline or subject matter area). He finishes the essay with two examples: evidence in the Sacco and Vanzetti case and the use of his classificatory system across six studies from a wide variety of cases in different disciplines in the work of Twining and Hampsher-Monk. He argues both that without the classification system it would have been much more difficult to sort through the evidence and that his ability to use the system on such different content shows that a substance-blind science of evidence is possible.

Jason Davies's "Disciplining the Disciplines" is entirely focused on the project. (He has a second essay later in the volume that deals with evidence in one of his disciplines.) He considers objections raised against the presupposition that there could be a "science of evidence" and offers some diagnosis. He suggests, for example, that the lack of focus and inability to come to agreement partly resulted from the participants not being focused on solving a particular problem -- e.g., how to train intelligence officers. His example is not chosen at random. We learn through Davies's essay that while one source of the Leverhulme Trust's interest in funding a project on evidence was the ongoing call for evidence-based policy and questions about just what this might mean, another motivator was questions raised after 9/11 about how to sort through intelligence evidence. Claims had been made that if available data had been properly sorted and interpreted as evidence, the attacks might have been prevented.

Davies expresses uncertainty that the project was worthwhile, arguing that the necessary boundaries between the disciplines prevent a general account of evidence. However, he also argues that the problematics of interdisciplinarity that make progress unlikely also make it hard to judge if progress has been made. The essay ends with the suitably ambivalent suggestion that those who participated in the project from the beginning were more able to handle the ambiguities surrounding evidence than those joining for the final symposium. "If there is any truth in it, such a qualitative change is extremely difficult to substantiate -- we have next-to-no 'evidence'. The reader will have to decide whether to take my word for it." (70)

William Twining ("Moving Beyond the Law: Interdisciplinarity and the Study of Evidence") also considers the project reflexively.

When I started, I had assumed that the main focus would be on exploring the extent to which there are concepts, principles and methods relating to evidence and inference that could be developed and applied broadly across many disciplines. I expected that scholars with different specialisms could both contribute to and learn from such an enterprise. (96-97) . . . My personal expectations were disappointed, but as Davies makes clear, this is far from saying that the programme was a failure; but the criteria of 'success' are elusive (98).

Some might think that discussions about the organization of and problems with the project are inappropriate in the anthology, but it seems difficult to dismiss the idea that the topic itself had something to do with these issues. Additional funding from the Economic and Social Research Council (ESRC) ultimately allowed for the support of a second, "sister," evidence project: a project on facts at the LSE led by Mary Morgan. The LSE facts project also produced an anthology -- How Well Do Facts Travel? (Cambridge 2011). I mention the LSE project for two reasons. First, it is referred to in the foreword, the introduction, and in several other chapters, indicating an awareness of the other project among the participants in the UCL project. Second, while it shared many of the characteristics of the UCL project (funding, multidisciplinarity), it apparently progressed much more smoothly. What was different about the two projects?

First, some of the difference appears to result from the framing of the project. "'Evidence, Inference and Enquiry: towards an integrated science of evidence'; led by Philip Dawid and William Twining at UCL" was the originally proposed title, and the concerns about the use of "science" have already been noted. Twining identifies suspicion of scientism as one of the fears creating tension among the participants. Suspicion of scientism (or empiricism) is fundamentally a concern about pushing all of human enquiry into one mode. It is sometimes expressed through resistance to the idea that all data (and so evidence?) must be quantitative or at least quantifiable. Different conceptions of evidence often go with strong views on the autonomy of the disciplines -- the source of the difficulty that Davies highlights. Debates of this sort about evidence are not merely debates that take place when different disciplines are brought into contact with each other but are also debates that are taking place within disciplines themselves. For example, political methodologists debate the relative merits of methodologies for producing evidence to establish causal claims: is qualitative evidence relevant? Should arguments be statistical (econometric) or can such claims only be established through randomised controlled trials? These debates are not merely inter- but also intra-disciplinary, and that may account for some of the heat surrounding the UCL evidence project.

But I think that there are specifically philosophical issues about evidence that are not merely a result of disciplinary boundaries. Consider a comparison with the LSE facts "sister" project. In that project the question posed was how facts might travel to other disciplines -- that is, once they had been deemed to be facts within their home discipline. How they were so established and whether they were deserving of the honorific title was not up for grabs within the context of the project. Howlett and Morgan spell this out in the introduction to their volume: "The essays here do not adopt any one theoretical or disciplinary approach. We are not committed as a group to any particular sociological theory about knowledge transfer, nor to the establishment of a philosophical test of the truth or falsity of facts, nor to the provision of an epistemic history of facts as a category" (2011, xvi). In the case of the UCL evidence project, the idea that there might be standards of evidence that do or should apply across all disciplines appeared threatening to those who had reason to believe that what was understood by "evidence" in their discipline might not be seen as evidence in another (particularly if the standards were somehow to be scientific). But this is not just an issue of different views about what counts as evidence in different disciplines (or different approaches or methodologies). It is really a result of a puzzle about evidence that many of the essays in this volume run up against.

Whether there are recurrent patterns that are characteristic of evidence wherever it appears, as Schum claims, requires drawing a distinction between form and content (Schum's "substance" for which the categories are "blind"). But drawing such a distinction runs into the same sorts of problems that Quine identified in "Two Dogmas of Empiricism" -- it is a bit like drawing the distinction between the form and content of a sentence to nail down the analytic/synthetic distinction. Another way to see this is to note that often we can ask the question about whether something is evidence or really a warrant for inference. We can often answer such questions within a particular context but the answers are typically not general enough to be portable across all contexts -- that is, across disciplines or sometimes even within disciplines. The underlying problem is a philosophical problem, I think, at least in so far as it requires making distinctions and thinking clearly about differences and similarities in concepts and contexts. But apparently philosophy did not fare that well in this setting. Davies notes, "Some thought these were philosophical questions -- but why privilege philosophy?" (53) The facts project explicitly put aside such philosophical questions and asked other questions, and perhaps this is one of the distinguishing features between the levels of harmony. I now turn to the other essays, which look at evidence rather than at the UCL Evidence Programme.

William Twining's essay, "Moving Beyond the Law: Interdisciplinarity and the Study of Evidence," is really two essays in one. I have discussed the reflexive second half above. The first half offers a history of evidence in the law. Twining gives a clear and concise account, focusing on several different approaches (including Bentham's work in this area which is largely unknown among philosophers). It is also clear from this section why Twining and others may have thought that starting from the law made sense. Evidence in the law has always been interdisciplinary (scientific, medical, engineering, statistical).

As his title suggests, the next question was whether ideas of evidence could be taken from the law and used elsewhere. But in fact there are various characteristics of evidence in the law that are quite unlike those in other arenas. For example, in legal contexts the hypotheses for which we seek evidence are clearly defined and limited to a few (guilty or not guilty in the most simple form, but more complex questions such as guilty of manslaughter vs. guilty of murder in the first degree and so on). In the sciences, this is not the case, and the question of whether all the hypotheses that could be considered have actually been considered is relevant for determining the degree of certainty that we have for accepted hypotheses. Another dissimilarity is that in the case of the law we are typically seeking evidence in support of a particular conclusion -- "Did the butler do it?" -- whereas in the sciences we are looking for support for more general hypotheses. Additionally and relatedly, the conclusions that we seek in the law are not intended to support predictions whereas in other areas they may well be.

The focus on law continues in "Inference Networks: Bayes and Wigmore" by Philip Dawid, David Schum, and Amanda Hepler. John Wigmore (1863-1943), an American scholar of evidence, devised a method of "charts" for representing evidential relations. The authors compare Wigmore's charts with Bayesian networks. They conclude that while Bayesian networks can be problematic in that they require the assignment of specific probabilities, they allow for representation of greater complexity in evidential relations than the Wigmore charts do. While both representational methods are useful for organizing complex evidence, the authors suggest that Bayesian networks may well be better suited to scientific evidence. The primary reason they offer is that the probative questions for Wigmore's method provide one outcome that is being sought (and evidence for that). The outcome is something that has already happened. This is likely because of the subject matter for which the charts were developed -- legal evidence. In the case of Bayesian networks, the chart branches from roots rather than aiming at a specific goal. For this reason they may appear to be more relevant to scientific evidence.

John Fox's essay, "Arguing about the Evidence: A Logical Approach," and the next essay, "Thinking About Evidence," by David Lagnado, also consider issues of evidence that are raised by Bayesian network analysis. These authors agree that the value of Bayesian networks rests more clearly on their relational aspects than on the calculations of probabilities or degrees of belief based on Bayesian updating. Both Fox and Lagnado note that there is increasing empirical evidence that people are neither good estimators of probability nor good Bayesian updaters. However, the directionality that Bayesian networks capture does give some insight into at least some types of evidential reasoning -- for example, the way that jurors make judgments.

Lagnado notes that empirical studies strongly suggest that the order in which evidence is received makes a difference to the role it is given in a causal model. Empirical studies on discrediting evidence suggest that jurors group evidence based on directionality -- they treat evidence as cohering independently of causal relatedness so long as the evidence shares a common direction. They use causal models to explain away discredited evidence and will tend to explain away evidence that shares the same directionality as the evidence that is discredited (219). In other words, jurors do not integrate evidence in a fully Bayesian way. Lagnado suggests that the grouping that jurors appear to do allows for conservation of mental energy (specifically memory). He concludes by hypothesizing that we use Bayesian networks for determining interrelations among causal relations but use coherence-based reasoning when we make inferences.

Terence Anderson's "Generalisations and Evidential Reasoning" examines the usefulness of Wigmore charts for the analysis of evidence in fields other than law by applying them to an argument for a hypothesis in Assyriology. He concludes that "the principal barriers to cross-disciplinary analysis and communication stemmed from the fact that the outsider does not share the stock of knowledge and the knowledge-based generalisations, that are common and shared by those within the discipline" (243). Anderson's essay would seem to support the point about form and content that I made above. What is taken as evidence within one discipline may be seen as a principle of inference in another.

Peter Tillers's "Are there Universal Principles or Forms of Evidential Inference? Of Inference Networks and Onto-Epistemology" closes out this second group of essays. Tillers argues that while Schum's goal of a science of evidence might be possible in some subject areas, there are at least three areas where it is unlikely to work: inference about human meaning, unconscious inference (implicit), and inference in the special sciences. In the last case, for example, a node might represent a particular equation as evidence; but from the perspective of some, the solution to the equation would really be the evidence, whereas for others it would only constitute part of the evidence. For Tillers, the lesson of these exceptions has to do with the inherent intentionality of human interaction with the world and the consequent humility that we should maintain about the limitations of human reason in understanding that world.

The next two articles are focused on evidence-based policy -- one of the ostensible motivators for the project. I would have liked for policy to be a more prominent theme in the volume and I take this lack to be one of its weaknesses. "Rhetoric, Evidence and Policymaking: A Case Study of Priority Setting in Primary Care" by Jill Russell and Trisha Greenhalgh begins by arguing that in decision-making the relevant uncertainties are not those that have to do with belief but uncertainties about what to do. Practical reason intersects with rhetoric and so a scientistic approach is not fully able to capture important features of policy decision-making. They illustrate the point with an empirical study. Their research setting is a Priorities Forum of an NHS Primary Care Trust (PCT). "Priorities Forums . . . 'provide a mechanism within the PCT to ensure a robust ethical and evidence-based process for identifying treatment priorities'" (272). The authors attended nine meetings of the forum (made up of specialists in public health, commissioning and finance managers of the PCT, local general practitioners, and patient/public representatives, and chaired by the PCT's Medical Director). The authors report that they charted the way that priorities were "named and framed" in the belief that a "study of 'evidence in use' is a study of language in use. . . . 'evidence does not speak for itself, but must be spoken for.'"[2] Their analysis thus focuses on the use of language and the way the framework for understanding the problem is changed through dialogue. They document a shift from what they refer to as a scientistic framework to an interpretivist framework -- a shift that they report as taking place when the significance of data is questioned from the perspective of the experience of the general practitioners and district nurses, to whom the authors attribute practical knowledge. They conclude that within the context of the forum, the voices of the experts became evidence and resulted in different policies than might have been the case otherwise.

"A Theory of Evidence for Evidence-Based Policy," by Nancy Cartwright and Jacob Stegenga also addresses policy decision-making, sketching a theory of evidence for use, which they contrast with the more usual approach, which is a theory of evidence for belief. While the power of randomized controlled trials is clearly established in large-scale pharmaceutical studies, for example (though such studies may have other problems), it is less clear that they should be treated as the "gold standard" of evidence for social policy decisions. While they may provide good evidence for belief, they do not, by themselves, provide good evidence for use. A primary goal of the essay is to provide a theory of evidence for use that is accessible to policy-makers -- those who are the potential users of such a theory of evidence. To make this clear, Cartwright and Stegenga distinguish three questions. When are evidence claims credible? When is a credible claim relevant to the truth of a claim to effectiveness? What is the probability that a policy will be effective given a body of evidence of varying credibility relevant in different ways? (291)

Cartwright and Stegenga take three principles from philosophy of science and apply them to formulate their theory of evidence for use: truth values for causal counterfactuals are fixed by a causal model; causes are INUS conditions (Insufficient but Necessary parts of Unnecessary but Sufficient conditions) (J.L. Mackie); and the importance of understanding how causes operate and operate together (causal mechanisms) (296). The key benefits that come from the first two principles are an appreciation of the complexity of causality, a way to think about that complexity (through causal modeling), and sensitivity to the multitude of factors that may be causally relevant in any given situation (Mackie's INUS condition). They emphasize the importance of seeing causes as INUS conditions. "Usually when discussing policy one focuses on a single cause, that is, a single INUS condition. But it is not possible to predict the effect of a cause without considering all the other INUS conditions and the relations among them" (310). The focus on INUS conditions highlights the importance of other variables and the consideration of causal interactions that are crucial to judging the effectiveness of policy. They offer as an example the class-size reduction policy implemented in California in the early 2000s. The program was intended to improve learning based on evidence that a lower student-to-teacher ratio improved learning. But the policy was implemented quickly, forcing the hiring of under-trained teachers who were less effective in the classroom. A more careful consideration of causal interactions might have led to anticipation of the way the negative effects of this policy could counteract the expected positive effects (307).

These borrowed philosophical ideas are developed into three principles, the first two of which are primarily concerned with facilitating causal reasoning around policy-making.

Principle 1: A good way to evaluate whether a policy will be effective for a targeted outcome is to employ a 'causal model' made up of:

a list of causes of the targeted outcome that will be at work when the policy is implemented;

a rule for calculating the resultant effect when these causes operate together.

Principle 2: Causes are INUS conditions. (311)

Cartwright and Stegenga phrase the final principle informally -- Principle 3: Mechanisms matter. They note that policy-makers are typically not interested in how a particular policy produces an effect but rather that it does. However, they argue that understanding how the effect is produced is crucial to knowing whether it will be produced in a particular context. "A good answer to the question 'How will the policy variable produce the effect?'… can help elicit the set of auxiliary factors that must be in place along with the policy variable if the policy variable is to operate successfully" (320-321).
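To make the first two principles concrete, here is a minimal sketch of my own (it is not taken from the essay). It assumes a toy Boolean causal model loosely inspired by the class-size example: the factor names, the two "sufficient clusters," and the combination rule are all hypothetical and chosen only for illustration. The model pairs a list of causes with a rule for combining them (Principle 1), and the check tests Mackie's four INUS clauses for a given factor (Principle 2).

```python
from itertools import product

# Hypothetical factors; the names are illustrative, not from the essay.
FACTORS = ("small_classes", "trained_teachers", "intensive_tutoring", "tutor_funding")

# Assumed clusters of factors that are jointly sufficient for the effect.
SUFFICIENT_CLUSTERS = [
    {"small_classes", "trained_teachers"},
    {"intensive_tutoring", "tutor_funding"},
]


def learning_improves(small_classes, trained_teachers, intensive_tutoring, tutor_funding):
    """Toy combination rule: either cluster, if complete, produces the effect."""
    return (small_classes and trained_teachers) or (intensive_tutoring and tutor_funding)


def outcome(true_factors):
    """Evaluate the rule with exactly the given factors present."""
    values = {name: (name in true_factors) for name in FACTORS}
    return learning_improves(**values)


def is_inus(factor):
    """Check the four INUS clauses for `factor` against the toy model."""
    for cluster in SUFFICIENT_CLUSTERS:
        if factor not in cluster:
            continue
        # The cluster is Sufficient for the effect.
        sufficient = outcome(cluster)
        # The factor is a Necessary part of that cluster.
        necessary_part = not outcome(cluster - {factor})
        # The factor is Insufficient on its own.
        insufficient_alone = not outcome({factor})
        # The cluster is Unnecessary: the effect can occur without it being complete.
        cluster_unnecessary = any(
            learning_improves(*bits)
            for bits in product([False, True], repeat=len(FACTORS))
            if not all(dict(zip(FACTORS, bits))[name] for name in cluster)
        )
        if sufficient and necessary_part and insufficient_alone and cluster_unnecessary:
            return True
    return False


if __name__ == "__main__":
    for name in FACTORS:
        print(name, "is an INUS condition:", is_inus(name))
    # Reducing class size without trained teachers does not produce the effect.
    print("small classes alone:", outcome({"small_classes"}))
```

In this toy model, reducing class size is an INUS condition, yet it yields no improvement unless its partner factor is also in place, which is the kind of interaction the California example is meant to highlight.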

Given the Leverhulme Trust's original call for proposals, this essay and the previous one seem more directed to the concerns that gave rise to the evidence project in the first place. The Cartwright and Stegenga essay also gets my recommendation because it is a genre of philosophy of science that we are beginning to see more of and one which I heartily endorse -- philosophy of science in practice. The theory of evidence for use that they propose is explicitly intended for users of knowledge, not those who produce it, although I think it is not necessarily a bad thing for producers of knowledge to be thinking about these issues as well.[3] Cartwright and Stegenga acknowledge that the standards they are proposing for using evidence to make policy decisions constitute a tall order for users.

That just makes our job hard. We need to do the best we can to help those who need to evaluate effectiveness do so as well as possible, even if the process will inevitably be flawed. Recognising that it will be flawed means making clear that policy effectiveness judgments will almost never be very secure; and so far as possible, one should hedge one's bets on them (321).

While in some ways it is understandable why "In Praise of Randomisation: The Importance of Causality in Medicine and its Subversion by Philosophers of Science," by David Colquhoun follows the Cartwright and Stegenga essay, it is unfortunate. Colquhoun's essay is a defense of randomized controlled trials in precisely the area where they are most successful -- large scale pharmaceutical studies -- and so seems to completely miss the point of recent criticisms of such trials. In addition, he appears to equate all philosophy of science with postmodernism.

The remaining articles deal with various discipline-specific issues of evidence. The first of these (by Hasok Chang and Grant Fisher) deals with a philosophical problem. "What Ravens Really Teach Us: the Intrinsic Contextuality of Evidence" offers a solution to the ravens paradox in which they argue that a key lesson of the paradox is that a full understanding of evidence requires more than a formal analysis. This idea is not new -- for example, Helen Longino's (1990) contextual empiricism provides an account of evidence that is context dependent -- however, their account of how context functions does have some new elements. As a reminder, the paradox is that, since "All ravens are black" is formally equivalent to "All non-black things are non-ravens", a white shoe (or any other non-black, non-raven) should serve as evidence for "All ravens are black". Chang and Fisher agree with Hempel that in fact the white shoe does confirm "All ravens are black" sometimes, but not always -- it depends on context. They offer three levels of analysis to spell out a role for context in determining when observations are evidence.
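The formal equivalence that generates the paradox can be checked mechanically. The following is a small sketch of my own, not anything from Chang and Fisher: it represents each object as a hypothetical pair (is_raven, is_black) and confirms by brute force, over every "world" of up to three objects, that the two generalisations always share a truth value.

```python
from itertools import product

# Each object is a pair (is_raven, is_black); a white shoe is (False, False).
KINDS = list(product([False, True], repeat=2))


def all_ravens_are_black(world):
    """True when every raven in the world is black."""
    return all(is_black for is_raven, is_black in world if is_raven)


def all_nonblack_are_nonravens(world):
    """True when every non-black thing in the world is a non-raven."""
    return all(not is_raven for is_raven, is_black in world if not is_black)


def equivalent_in_all_worlds(max_objects=3):
    """Compare the two generalisations in every world of up to max_objects objects."""
    for size in range(max_objects + 1):
        for world in product(KINDS, repeat=size):
            if all_ravens_are_black(world) != all_nonblack_are_nonravens(world):
                return False
    return True


if __name__ == "__main__":
    print("equivalent in all small worlds:", equivalent_in_all_worlds())
    white_shoe = (False, False)
    # A white shoe is an instance of the contrapositive and falsifies neither claim.
    print("white shoe consistent with both:",
          all_ravens_are_black([white_shoe]) and all_nonblack_are_nonravens([white_shoe]))
```

The check only establishes the logical equivalence; whether the white shoe counts as evidence is exactly the further, contextual question the essay takes up.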

The first level solution involves awareness of the fact that observations must be rendered as propositions in order to participate as evidence (think of the H-D model of confirmation, which they work with in the essay for the sake of simplicity). How we observe may affect the proposition that appropriately renders the observation. "If we first observe that something is not a raven, then any further information about that object is useless: if we first observe that it is not black, then it is of interest to find out whether it is a raven or not" (353). The second level solution begins by noting that when we say that an observation counts as evidence for a hypothesis, we are committing a category mistake. It is an equivocation between two meanings of "evidence" -- "sense 1 is about an entity being evidence of another entity, and sense 2 is about a proposition being evidence for another proposition. What is fundamental to evidence₁ is an intuitive sense of causality; what is fundamental to evidence₂ is warrant for inference" (356). Resistance to thinking of evidence as contextual results from the intuitions of objectivity that are associated with evidence₁. These are mistakenly transferred to evidence₂, which has to do with reasons for belief; thus acknowledging contextuality in the sense of evidence₂ does not commit one to any lack of objectivity in causal laws. Finally, the third level solution notes that testing is an activity. "In such a framework of intentional action, a piece of information or an observation-statement can function as evidence only in the context of an evidential epistemic activity" (361).

This contextual solution to the ravens paradox is interesting in comparison to other essays in the volume. So, for example, one factor that emerges at all three of these levels is that the order in which information is presented makes a difference as to whether it serves as evidence or not. This point about the order of information appears in several other essays earlier in the volume, for example, in Lagnado's discussion of empirical studies of jurors.

The final four essays focus on evidence in specific disciplines: archaeology, the study of ancient religions, economics, education, science, and law.

Those who are familiar with Alison Wylie's work on evidence in archaeology will find that her contribution "Critical Distance: Stabilising Evidential Claims in Archaeology" is a coherent and concise summary of that work. Those not familiar with her work will find the essay an excellent introduction. She makes good use of Clark Glymour's idea of "bootstrapping" to deal with an issue of circularity of evidence that is particularly worrying in archaeology where the data are in need of so much interpretation. In archaeology, "much of the action is off-stage. It is at least as crucial to establish the security and relevance of a robust body of background knowledge -- on the source side of the equation -- as it is to work in the foreground, recovering and recording the material record that survives of an archaeological subject" (389).

Jason Davies's second essay, "Believing the Evidence," considers whether a recent trend in the study of religion to return to thinking about religion as identified through a set of beliefs rather than a set of rituals is a good idea or not. He concludes it is not, primarily because of the perils of attributing beliefs to others. But the essay also illustrates how background assumptions about what should count as evidence, assumptions about which phenomena to focus on, shape the knowledge produced by a discipline.

Assumptions are also in the forefront of Michael Joffe's comparison of biology and neoclassical economics in "What Would a Scientific Economics Look Like?" In thinking through how neoclassical economics works, he revisits criticism of Milton Friedman's instrumentalism about the assumptions made in economic theory and offers a new take on the role that the unrealistic assumptions of neoclassical economics play. "Everything here hinges on assumptions. This is not the familiar criticism of economic theory that its assumptions are unrealistic -- that would merely be a reactive response. It is giving assumptions the role that evidence should have" (451). Unlike the assumptions Davies identifies, which determine what counts as evidence, Joffe claims that assumptions in neoclassical economics actually play the role of evidence -- a role that assumptions should not be playing. He argues that neoclassical economics is actually seeking an account of mechanisms operating at the level of the individual but dismisses psychological reality as a way of formulating that account. Joffe claims that in this way neoclassical economics differs sharply from biology, which actively seeks causal mechanisms. (As Cartwright and Stegenga remind us -- mechanisms matter.) He notes that the recent turn to behavioral economics goes against this tendency and hopes that this change will be sustained.

Tony Gardner-Medwin's "Reasonable Doubt: Uncertainty in Education, Science and Law" closes out the volume with a reminder of the value of paying attention to uncertainty. He considers the value of acknowledging uncertainty in several contexts, and ends with a very specific recommendation for jurors. He points out that "the usual focus on the probability that the defendant is guilty (a hypothesis probability) is inadequate. Clear benefits arise if a verdict is considered to be ultimately constrained by data probability: how likely is it that such incriminating evidence could have arisen for an innocent person?" (481) While this example takes us back to where we started -- the law -- it also resonates with several other essays in that relevant uncertainty here has to do with the gap between belief and action. While the degree of certainty may seem more salient if all we are concerned about is what we should believe, the uncertainty and the concomitant risk if we are wrong come to the fore when we consider action and its consequences.
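The difference between a hypothesis probability and a data probability can be illustrated with Bayes' theorem. The sketch below is my own and uses invented numbers; it is not Gardner-Medwin's calculation. It holds the data probability fixed (the assumed chance that such evidence would arise for an innocent person) and shows how much the hypothesis probability then depends on the prior that a juror brings to the case.

```python
def posterior_guilt(prior_guilt, p_evidence_if_guilty, p_evidence_if_innocent):
    """Hypothesis probability P(guilty | evidence), computed by Bayes' theorem."""
    numerator = prior_guilt * p_evidence_if_guilty
    denominator = numerator + (1.0 - prior_guilt) * p_evidence_if_innocent
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical data probabilities, chosen only for illustration.
    p_evidence_if_innocent = 0.001  # P(evidence | innocent), the "data probability"
    p_evidence_if_guilty = 1.0      # P(evidence | guilty)

    # Same evidence, three different (hypothetical) prior beliefs in guilt.
    for prior in (0.5, 0.01, 1e-6):
        post = posterior_guilt(prior, p_evidence_if_guilty, p_evidence_if_innocent)
        print(f"prior={prior:g}  P(guilty | evidence)={post:.4f}")
```

With the data probability unchanged, the posterior ranges from near certainty to near zero as the prior shrinks, which gives one way of seeing why a verdict criterion stated in terms of data probability asks a different question from one stated in terms of the probability of guilt.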

The apparent fragmentation of this volume reflects the project as a whole, judging from what the participants report. Yet I have to say that I came away from reading these essays with a sense that I do understand evidence better. Though there are articles here that stand apart from the project (the Cartwright and Stegenga and the Chang and Fisher essays, for example), I do recommend reading the entire collection. While they do not offer us a science of evidence in the sense that Schum sought, they collectively do contribute to an understanding of evidence.

REFERENCES

Cartwright, Nancy. "Will This Policy Work for You? Predicting Effectiveness Better: How Philosophy Helps," Philosophy of Science, 79 (December 2012), 973-989.

Howlett, Peter and Mary S. Morgan (eds.) How Well Do Facts Travel? The Dissemination of Reliable Knowledge. Cambridge: Cambridge University Press, 2011.

Longino, Helen. Science as Social Knowledge: Values and Objectivity in Scientific Inquiry. Princeton: Princeton University Press, 1990.

[1] The website for the project details all parts of the project. This volume was primarily produced from the final symposium.

[2] They quote this from J. Green (2000) "Epistemology, evidence and experience: evidence based health care in the work of Accident Alliances," Sociology of Health and Illness, 22(4): 453-76.

[3] Those who are interested in this topic might also appreciate Cartwright's Presidential Address to the 2010 Philosophy of Science Association, "Will This Policy Work for You? Predicting Effectiveness Better: How Philosophy Helps," Philosophy of Science, 79 (December 2012), 973-989.