In 1970, David Hackett Fischer published a meaty and entertaining book, Historians’ Fallacies: Toward a Logic of Historical Thought (well and briefly reviewed by Philip Jenkins at Patheos). I highly recommend it. He’s funny, but correct. It’s not a parody but a serious (albeit witty) survey of the alarming frequency with which all fields of history engage in fallacious reasoning. And his overall theme is that history has no method: its practitioners are not taught logic, and have no formal idea why their conclusions ever follow from their premises. Indeed, at the top of chapter 6 of my book Proving History (p. 207) I quote his conclusion (from Fischer, p. 264):

No presently articulated system of formal logic is really very relevant to the work historians do. The probable explanation is not that historical thought is nonlogical or illogical or sublogical or antilogical, but rather, I think, that it conforms in a tacit way to a formal logic which good historians sense but cannot see. Some day somebody will discover it, and when that happens, history and formal logic will be reconciled by a process of mutual refinement.

Bayes’ Theorem, I argue, is what he was looking for. It’s the tacit logic of historical reasoning, and thus ought to be formally understood, to help avoid the very plethora of fallacies (hundreds of pages’ worth) that Fischer documents, to improve the logical soundness of historical argument and debate, and to increase the reliability of historical conclusions. So I wrote a whole book on how that is and how it can be done. And philosophers before me had already realized this was what was needed. I’ve now been hearing laments from historians in several fields about the very same thing: their field has no method, no one studies logic, it’s consequently awash with fallacies, and there are no controls in place or even being contemplated to rein that in. And they are starting to see how Bayes’ Theorem is exactly what the field needs for that purpose (for the field of Jesus studies specifically, I showed this for its current dysfunctional methods in chapter 5 of PH).

The Need for Formal Logic in History

Fischer even invented (or better, “discovered”) another (now famous) fallacy: the Historian’s Fallacy, “when one assumes that decision makers of the past viewed events from the same perspective and having the same information as those subsequently analyzing the decision” (Wikipedia). In Bayesian terms, this is a mistake in conditioning probabilities on the background knowledge. Our background knowledge includes the fact that people in the past did not know what we know, and consequently may have made different decisions than we would, based on what their knowledge and beliefs were. Estimates of what was typical of the time, or of what evidence is expected on any given theory of events, must be calibrated to the conditions at the time.
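To make that conditioning explicit, here is the theorem in its standard form, with the background knowledge b written out (h is a hypothesis, e the evidence):

```latex
P(h \mid e, b) = \frac{P(h \mid b) \cdot P(e \mid h, b)}{P(e \mid b)}
```

Every term is conditioned on b. So if b is wrongly allowed to include knowledge the historical actors did not have, both the prior and the likelihoods come out miscalibrated, and the conclusion is invalid no matter how careful the rest of the argument is.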

And, I will extend his point now, also for the conditions of evidence survival in the intervening time. For example, the biased decisions of manuscript preservers in the Middle Ages have hugely distorted our picture of the ancient world, particularly with respect to its philosophy and science. In my new book Science Education in the Early Roman Empire I bring up an important example: because Medieval Christians were uninterested in the advanced logical treatises of the Roman period, and uninterested in the Stoic fascination with physics and science, but obsessed with Stoic philosophers who wrote a kind of moral philosophy they liked, they almost always selected the latter to copy and ditched the rest. This has led modern historians to mistakenly think Roman Stoicism had largely abandoned an interest in science and logic and become almost singularly obsessed with moral philosophy. In fact, it was Medieval Christians who had abandoned an interest in science and logic and become almost singularly obsessed with moral philosophy. When we look at the surviving circumstantial evidence, we find this was not the case in the Roman period, but a distortion created by Christian selectivity hundreds of years later.

As I wrote in SEERE (pp. 102-03, notes and references removed):

Roman scientists were thus fully engaged in this rage for logic, Galen writing treatises in logic, Sextus surveying the whole field, Hero employing formal logic in his proofs and demonstrations, and Ptolemy writing works on epistemology and even designing his Optics with the overriding purpose of resolving logical debates over the accessibility of empirical knowledge.

The point of this digression is that when evaluating the different degrees of science education to which an ancient student of philosophy would be exposed, we must account for the effect of medieval Christian selectivity in preserving texts from antiquity, which has skewed our perception of what was normal. For example, the assumption is sometimes made that Roman Stoicism shifted its focus almost entirely to ethics and more or less frowned on logic and natural philosophy as pedantic distractions. But this is an illusion created by a later Christian preference for preserving the treatises of Stoics who took that position (a position which the Christians largely shared, as we’ll see in chapter nine), when really those authors (and usually named here are Epictetus and Seneca) were freaks in comparison with their peers. In truth, even the great moralist Seneca was an advocate of science education and the advancement of scientific knowledge, making Epictetus a freak even next to him. And yet Epictetus was not so wholly opposed to a useful education in science and logic, either. And he was at the furthest extreme, and thus as unrepresentative as anyone could be.

[And] just as [Jonathan] Barnes found [this to be the case] for logic, a Stoic passion for physics is likewise documented, which has also been obscured by a subsequent Christian disinterest in preserving very many Stoic treatises in that field [too]. Hence across the empire Stoic education did not avoid scientific knowledge but was immersed in it.

Thus it becomes a fallacy to reason about the ancient record without including this distorting medieval filter in our background knowledge.

Another example of that same effect is the outrageous frequency with which Christians forged and altered documents (OHJ, Element 44, pp. 214-22), even in their own sacred scriptures; so common, in fact, that apart from homilies and commentaries, the considerable majority of Christian literature is forged, and the vast majority of it was doctored (OHJ, ibid. and ch. 8.3-4 and 8.12-13, pp. 293-308 and 349-58). And also, of course, a great deal was destroyed, not randomly but selectively in favor of Medieval doctrinal preferences. This has an enormous effect on our estimates of every probability in a Bayesian equation. For example, it completely destroys any claim that we should assume any statement in a Christian document is true until we have evidence against it. And that does not hold with anywhere near the same force for non-Christian literature of the same period…except when it’s Christians meddling with it! (OHJ, ch. 8.9-10, pp. 332-46)

The logical import of all this is crucial for any historian to grasp. Not only the fact of it (the danger of committing the fallacy), but the effect of it (how do we incorporate this background knowledge into our reasoning validly; that is, what does it change about our judgments, and to what degree?). And when you try to answer that question, you end up with Bayesian reasoning as the answer.
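To illustrate with hypothetical numbers (mine, chosen only to show the structure, not drawn from OHJ): suppose background knowledge told us that 60% of documents in some class are forged, and that a given mark of authenticity appears in 90% of authentic documents but is also successfully faked in half of forgeries. Then:

```latex
P(\text{authentic} \mid \text{mark}, b)
  = \frac{0.4 \times 0.9}{0.4 \times 0.9 + 0.6 \times 0.5}
  = \frac{0.36}{0.66} \approx 0.55
```

Even strong-seeming internal evidence then yields barely better than even odds. The base rate of forgery in our background knowledge does real work, and no valid method can ignore it.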

Historians Starting to Agree

By remarkable coincidence, completely unknown and unrelated to each other, two historians (one an established professor, the other a graduate student on his way to joining the ranks) spoke to me in the same month about their concerns regarding history’s lack of any logically developed methodology. When discussing his extension of my work in Hitler studies (see Hitler’s Table Talk: An Update), Professor Mikael Nilsson in Sweden spontaneously said a lot of the same things as that grad student, J. Little. They aren’t even in similar fields (Nilsson specializes in Cold War and WWII history; Little in Medieval Islamic history).

Nilsson plans to teach a course on my method recommendations. But it was Little’s emailed bullet list that really caught my attention, because it comes from his own experience, yet reads like a beautiful laundry list of everything that’s wrong with history as a field that Bayes’ Theorem solves. I asked for and received his approval to quote it. He wrote:

Over the past year or so, I have become frustrated with several historiography-related problems within my field, namely:

  • The use of implicit or unconscious methodologies by historians, who often fail to explicitly outline their methodological assumptions (or may even be unaware of them).
  • The intrusion of ideological propaganda (especially theology) into academic historiography.
  • Conflicting conclusion-tendencies within academic historiography, without clear debate and resolution on a more basic methodological level.
  • The absence of a rigorous and standardised historiographical terminology, resulting in the vague usage of terms such as “probably” and “likely”, and the erroneous conflation of ‘evidentiary data’ with ‘consistent data’.
  • The lack of scientific modelling within academic historiography—especially the lack of the concept of falsification, and the lack of replicability of results via the consistent usage of a methodology.

These are all spot on. And they correctly describe pretty much every field of history.

Ideology: Even item 2, “the intrusion of ideological propaganda,” is rife in every field. And not just religious ideology. Nilsson in particular complained to me of “interpretive frameworks,” like Marxist theory, being used to reach conclusions from data in history as if that were a valid method. But religious ideology is indeed everywhere too…for example, Christian revisionist history that tries to argue Christianity is responsible for every great thing that has ever happened in Western history, a phenomenon that spans a whole slew of historical subfields (e.g. on Christianity “causing” modern science, see my chapter in The Christian Delusion; on Christianity “causing” American democracy, see my chapter in Christianity Is Not Great). Historians need methods, an application of logical standards, that can control for the intrusion of ideological bias in argument and conclusion. You can’t just vaguely say bias exists and gosh that’s a problem.

Intuition: Item 1 is of course the fundamental problem. Historians only really use “implicit or unconscious methodologies,” and the consequence is that they themselves, and their critics and reviewers and colleagues, don’t even know what their logic is, and thus whether it’s valid, or prone to error, or how to avoid or fix that. Even when they think they are making their methodology explicit, they not only aren’t really, but are doing a really bad job of it, producing a travesty of logic that would not pass even a freshman course in logical reasoning.

I extensively show this with the supposed “logic” of the “method of criteria” invented in Jesus studies in chapter 5 of Proving History. And as I note on page 12 of the same book, “All criteria-based methods suffer this same defect, which I call the ‘Threshold Problem’: At what point does meeting any number of criteria warrant the conclusion that some detail is probably historical?” No one applying or inventing the method ever even realized they needed to solve that problem. Because they don’t know how logic works. And as a result, the method they invented isn’t logical. It “feels” logical, it somehow triggers their “implicit or unconscious” understanding of what should be a good argument, but that’s folly.
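To see what solving the Threshold Problem even requires, here is a minimal sketch in code, with entirely made-up numbers (this is the Bayesian skeleton any valid criteria-based method would need, not the method of any actual study): each criterion must be recast as a likelihood ratio, and a conclusion is warranted only when the accumulated odds cross a declared threshold.

```python
# Sketch: a "criterion" only carries weight as a likelihood ratio,
# P(criterion met | historical) / P(criterion met | not historical).
# All numbers below are hypothetical, for illustration only.

def posterior_probability(prior_odds, likelihood_ratios):
    """Multiply prior odds by each criterion's likelihood ratio,
    then convert the resulting odds back to a probability."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

# Three criteria met, each only weakly diagnostic (ratios near 1):
weak = [1.2, 1.1, 1.3]
print(posterior_probability(1.0, weak))    # ~0.63: hardly "probably historical"

# One genuinely diagnostic criterion outweighs all three combined:
strong = [8.0]
print(posterior_probability(1.0, strong))  # ~0.89
```

The point is not these particular numbers but the structure: without asking how diagnostic each criterion actually is, “meets N criteria” tells you nothing about whether any threshold of probability has been crossed.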

Historians need to develop, apply, advocate, and enforce an explicit methodology that conforms to proper canons of logic. Not pull shit out of their ass that just “feels” logical. And mind you, even attempts to articulate a method are rare in history as a whole. Usually one isn’t even stated. And on the phenomenally rare occasions when historians actually do articulate a method by legitimate logic, they tend to be ignored, and their methodological arguments are certainly never taught to historians in graduate schools.

For example, in Proving History (pp. 117-19) I provide the Bayesian model for the Argument from Silence, and was able to rely on the only instance I found of a historian correctly presenting the formal logic grounding a methodological tool: Gilbert Garraghan in his 1947 Guide to Historical Method. The principle filtered thence into most literature on historical methods, but without the formalization (ibid., p. 310, n. 20). But historians rarely take, much less are required to take, courses in which they ever read any of these books on historical methodology (of which I give a complete list in ibid., p. 306, n. 3, and a selective list here); and none of those books consistently breaks everything down like Garraghan did in this one single instance (even he does not do that in most of the rest of his own book). It’s worth noting that this one example of doing it right is Bayesian (Garraghan just wasn’t aware of the structure of what he was proposing).
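In odds form, the Bayesian structure of that argument can be sketched like this (my notation here, not a quotation of PH): let s be the observed silence of our sources about some event h.

```latex
\frac{P(h \mid s, b)}{P(\lnot h \mid s, b)}
  = \frac{P(h \mid b)}{P(\lnot h \mid b)}
    \times \frac{P(s \mid h, b)}{P(s \mid \lnot h, b)}
```

The silence is strong evidence against h only when P(s | h, b) is low, which requires what Garraghan’s conditions already implied: that the silent authors would probably have known of the event, and would probably have mentioned it if they knew. Otherwise the likelihood ratio sits near 1 and the silence proves next to nothing.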

But mostly, historians fail even to describe the logic of their method at all, much less validly.

Irresolution: Item 3 is a consequence of that failure. Historians in every field can look at the same evidence and come to contradictory conclusions about what happened and why, and there is very little effort to figure out why. That is, not merely to speculate why, but to actually break down what they are doing differently, methodologically, that gets them different results from the same data; to seriously debate and resolve within the field which set of methodological assumptions is correct or not, or better or worse; and thus to resolve which conclusion is the more credible, because it stands on all the same evidence and the most logically valid method.

I argue in PH that when we do this, we end up discovering all sound historical reasoning is Bayesian at its foundation. But even with that, there can be correct and incorrect, sound and flawed, applications of Bayesian methods in history. So realizing it’s Bayesian is just the first step. Arguing out whether and when Bayesian reasoning has been correctly used should then define the entire field of history (PH, index, “disagreement”; cf. p. 305, n. 33).

Confusion: By avoiding this conclusion (that history is Bayesian and its logic necessarily probabilistic), historians only exacerbate the problem, by communicating their reasoning and conclusions with vague, unstandardized words like “plausible” and “very likely” and “probably,” which are in themselves meaningless without identifying what exactly they refer to as regards probabilities or ranges of probabilities. One historian might mean a completely different probability by the phrase “very unlikely” than another historian does, producing miscommunication, an inability to vet the logic of a historian’s argument, and the risk of inconsistencies in a historian’s own reasoning.
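One fix is to publish an explicit calibration and stick to it. The mapping below is purely illustrative (these are made-up ranges of my own, not a published standard), but having any such declared table would already let readers vet an argument for internal consistency:

```python
# A hypothetical calibration of verbal probability terms to numeric ranges.
# The specific ranges are illustrative assumptions, not a field standard.
CALIBRATION = {
    "virtually certain": (0.99, 1.00),
    "very probable":     (0.90, 0.99),
    "probable":          (0.67, 0.90),
    "as likely as not":  (0.33, 0.67),
    "improbable":        (0.10, 0.33),
    "very improbable":   (0.01, 0.10),
}

def term_is_consistent(term, claimed_probability):
    """Check whether a stated probability falls in the declared term's range."""
    lo, hi = CALIBRATION[term]
    return lo <= claimed_probability <= hi

print(term_is_consistent("probable", 0.7))  # True
print(term_is_consistent("probable", 0.4))  # False: flag for review
```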

And Little is particularly spot on with his charge of a frequent “erroneous conflation of ‘evidentiary data’ with ‘consistent data’” in historical papers and monographs. What he means is the distinction between “this hypothesis is consistent with the data” and “this hypothesis is rendered the most probable by the data.” Those are not the same thing. And yet by confusing evidentiary consistency with evidentiary weight, all too often historians will conclude their hypothesis is increased in probability by the evidence being consistent with it, or even that their hypothesis is proved by this consistency (meaning, that it thereby becomes the most probable hypothesis of any, and therefore can be declared “true” or “probably true”).

That’s false. And Bayes’ Theorem proves and explains why it’s false. Dozens of contradictory theories can be entirely consistent with all the evidence. And even when there are variations in that evidential fit, (a) it can still be the case that a theory less consistent with the evidence is more probably true, and (b) it can still be the case that a theory less consistent with the evidence is probable enough to leave us uncertain. The second case is the easier to understand: we could find that theory A is consistent with the evidence and theory B slightly less so, and the effect of this may be that we find theory A 60% likely to be true and theory B 40% likely to be true. In such a case, we obviously cannot confidently assert that A is true. It’s slightly more likely, but B is also respectably likely, enough to give us pause.
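The arithmetic behind that example, with hypothetical likelihoods chosen to produce the stated split (and equal priors assumed):

```python
# Hypothetical numbers reproducing the 60/40 case in the text.
prior_A, prior_B = 0.5, 0.5   # equal priors, assumed for illustration
likelihood_A = 0.30           # P(evidence | A): a decent fit
likelihood_B = 0.20           # P(evidence | B): a slightly worse fit

total = prior_A * likelihood_A + prior_B * likelihood_B
print(prior_A * likelihood_A / total)  # 0.6 -> A is only 60% likely
print(prior_B * likelihood_B / total)  # 0.4 -> B remains a live option
```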

The first case is harder to explain, but is a direct consequence of the actual Bayesian logic of history.

First, prior probability affects the final probability. So, for instance, “aliens built all ancient monumental stone structures around the world” could be said to be “more consistent” with the evidence, insofar as it is a single explanation that seemingly entails the observations, whereas the mainstream alternative, that many different isolated societies of humans did it, without any communication among each other and with their primitive tools and knowledge, is a more complicated hypothesis, consistent with the evidence only by introducing many agents and assumptions as to their parallel motives and means. But the prior probability that aliens did anything on earth is astronomically small (pun intended). And it is so owing to the absence of any evidence they have ever done so. For instance, aliens wouldn’t likely build things out of stone but out of advanced polymers; and, generalized to the entire field of our background knowledge, if aliens had ever been here doing things, there should by now be tons of evidence of remarkable artifacts only alien civilizations could have produced. Whereas we have confirmed by multiple methods that primitive humans can indeed make all these things, and can have the motives to do so.
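In numbers (all hypothetical, chosen only to show the structure): even granting the alien theory a perfect fit to the evidence, the prior swamps it.

```python
# Hypothetical illustration: a tiny prior swamps a better evidential fit.
prior_aliens = 1e-12   # assumed: background evidence of aliens is nil
prior_humans = 1.0 - prior_aliens
fit_aliens   = 1.0     # suppose the evidence were fully expected on aliens
fit_humans   = 0.01    # and far less expected on independent human builders

total = prior_aliens * fit_aliens + prior_humans * fit_humans
print(prior_aliens * fit_aliens / total)  # ~1e-10: still effectively zero
```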

But even apart from that, evidentiary weight is not merely a product of consistency. What makes evidence weighty (in other words, what makes something “good” or “strong” evidence) is how much more probable that evidence is on, for example, theory A than on theory B. And that won’t be the same for every item of evidence. So, for example, theory A may be consistent with all the evidence, insofar as we can explain why all that evidence would be there and in the state it is on that theory, and theory B might be inconsistent with some of that evidence, insofar as we can’t as easily explain why those items would be there or in that state, but all the evidence together might still be less likely on A than on B. And therefore B might actually be the more probable theory!
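A sketch with hypothetical per-item likelihoods (independence of the items and equal priors assumed, purely for simplicity): theory A fits every item moderately well, theory B fits one item poorly but the rest far better, and B still wins on the total evidence.

```python
# Hypothetical per-item likelihoods P(e_i | theory); independence assumed.
evidence_on_A = [0.4, 0.4, 0.4]  # "consistent" with every item, but weakly
evidence_on_B = [0.9, 0.9, 0.1]  # fits one item badly, the rest strongly

def total_likelihood(per_item_likelihoods):
    """Product of per-item likelihoods = P(all evidence | theory)."""
    total = 1.0
    for p in per_item_likelihoods:
        total *= p
    return total

print(total_likelihood(evidence_on_A))  # 0.064
print(total_likelihood(evidence_on_B))  # 0.081 -> B beats A overall
```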

Unless, of course, we formally define “consistency” not as merely the ease of explanation but as a probability of the evidence being there and in that state. But then we are doing Bayesian history. If we say theory A is more consistent with the evidence than theory B because the evidence is more likely on A than on B, that’s Bayesian reasoning. And it’s correct reasoning. The vagueness with which “consistency” is defined in historical writing is thus a significant source of muddle and error. And attempting to fabricate consistency by elaborating your hypothesis with a bunch of convenient unevidenced assumptions is an easy folly to commit, if you don’t understand the formal consequences of doing so (on which see Proving History, index, “gerrymandering”).

But this is precisely why so much of the terminology used to generate conclusions from premises in historical writing needs to be standardized, or spelled out by some formal means that can be independently checked for consistency and consistently understood by other historians.

Ascientism: Little is quite correct to note “the lack of the concept of falsification” in history as a field. Historians have a vague notion of it, but it isn’t consistently applied or well worked out. Just what would lead other historians to conclude that a historian’s conclusion is false? (Other than that they intuitively feel that it isn’t correct.) There is some measure of this. For example, when it’s easy: if a historian has left data out, their conclusion can be falsified by reintroducing it and seeing what effect it has on the conclusion by that historian’s own reasoning (see, for example, what happens when I do this to Christian apologists pretending to be historians in Not the Impossible Faith and The Christian Delusion). But there are many subtle ways historians compose their arguments so as to avoid any clear falsification tests. This is most evident in Jesus studies, as I show in chapter 2.2 of On the Historicity of Jesus and chapter 1 of Proving History.
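A formal notion of falsification is in fact available in the same Bayesian framework (a sketch in my notation, not a quotation of anyone): a conclusion h is falsified by evidence e when the likelihood ratio collapses against it.

```latex
\frac{P(e \mid h, b)}{P(e \mid \lnot h, b)} \ll 1
\quad\Longrightarrow\quad
P(h \mid e, b) \ll P(h \mid b)
```

That is why reintroducing omitted data is a genuine falsification test: it changes e, and a valid method must then recompute the conclusion rather than defend it.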

And a related effect of this is just as Little says: “the lack of replicability of results via the consistent usage of a methodology.” As I wrote in PH (pp. 11-14) regarding the “method of criteria” that was supposed to solve this problem of contradictory conclusions reached by different historians:

Dale Allison … concludes, “these criteria have not led to any uniformity of result, or any more uniformity than would have been the case had we never heard of them,” hence “the criteria themselves are seriously defective” and “cannot do what is claimed for them.”

The fact that almost no one agrees with anyone else should compel all Jesus scholars to deeply question whether their certainty in their own theory is really even warranted, since everyone else is just as certain, and yet they should all be fully competent to arrive at a sound conclusion from the evidence. Obviously something is fundamentally wrong with the methods of the entire community. Which means you cannot claim to be a part of that community and not accept that there must be something fundamentally wrong with your own methods. Indeed, some critics argue the methods now employed in the field succeed no better than divination by Tarot Card reading—because scholars see whatever they want to see and become totally convinced their interpretation is right, when instead they should see this very fact as a powerful reason to doubt the validity of their methods in the first place.

When everyone picks up the same method, applies it to the same facts, and gets a different result, we can be certain that that method is invalid and should be abandoned.

That should not happen. When historians apply the same methods to the same evidence, they should always be getting the same results (or near enough: PH, pp. 88-93). And this should in fact be the way historians as a profession validate the work of their peers. Of course, this means weeding out error and bias must be fundamental to such a method. Often when historians get different results from the same methods and evidence, it’s because they aren’t controlling for their own biases or are sneaking fallacies into the mix. But continued debate can reveal and purge these effects, resulting in converging results on any given subject, rather than diverging ones.

Etcetera: Such were Little’s concerns, and the fact that they match exactly what Bayes’ Theorem teaches us lends support to its being the solution to this industry-wide problem. Professor Nilsson expressed to me, independently, many of the same points; and added some of his own.

Such as the distressing frequency with which historians over-trust a certain set of sources because there aren’t any other sources. In other words, historians are so annoyed that they don’t have good sources that they start unconsciously acting like the sources they do have are good. Because, you know, “it’s all we have,” and “we have to work with what we have.” Historians all too often leverage sources with hope rather than fact: a source sucks and is unreliable, but it’s all they have, so they treat it as authoritative and reliable. This has happened with Hitler’s Table Talk: the vast suspicion that surrounds its reliability is ignored, and it continues to be treated as the verbatim words of Hitler, when in fact it appears actually to be the words of minions recording their recollections of him, and of later editors who changed what they wanted.

Recent doubt about the truth of Josephus’s accounts of Gamla and Masada reflects a growing awareness of this problem (see Kenneth Atkinson’s 2006 analysis in “Noble Deaths at Gamla and Masada” in Making History: Josephus and Historical Method). This very same point was extensively argued by historian Michael Grant in his excellent treatise Greek and Roman Historians: Information and Misinformation (1995), pointing out that modern historians over-trust the likes of Tacitus or Josephus because such an author is often the only source they have. Grant argues we can work with bad sources, but only with commensurate caution and acceptance of the ensuing uncertainties. This realization was one part of the shift in historical methods in the 1950s toward more reliable procedures and assumptions (which I wrote about before; reproduced in HHBC): we started not trusting sources so gullibly as we had done before.

Nilsson also pointed out the over-reliance on the term “theory” in history in the sense of “interpretive model” rather than causal theory, as in a theory about what caused the evidence we have (or what caused that event to happen, although even that can only ultimately be a causal theory about the surviving evidence too: PH, pp. 45-49). The fact that history as a field hasn’t even made that crucial shift, of moving toward only talking about causal theories of the evidence, and leaving everything else in the realm of resulting conclusions rather than starting assumptions, is already alarming in itself. But it’s symptomatic of what monsters crawl in through the gaps left by not solving the problem of establishing a logically valid method across the field.

Since all statements about history are statements of probability, you cannot validly do history without a good understanding of probability theory, which in turn requires a good understanding of basic logic and the full gamut of standard fallacies. Any statement about what’s probable requires arguments that logically derive those probability statements from premises, which cannot be competently done without understanding at least basic probability theory and the fallacious ways it can be misused. Intuition is extremely unreliable, and especially prone to errors and biases in probabilistic reasoning; just look at Wikipedia’s list of cognitive biases for a scary start. That’s your brain. It sucks at thinking well. You need tools to fix how it works. Without those tools, you are using a flawed organ. And sound history can’t reliably be done that way.

History as a Science

History is, properly, a science. I will be mentioning this very point today at Edinboro University. History belongs as a department of anthropology, which we now divide between cultural and physical anthropology; and within the cultural branch we should have modern and historical. Either way, a factual study of humanity is always a science. Philosophy should also be moved into the science division, as essentially meta-science. The humanities should instead be devoted to the procedures and labors of literary and artistic appreciation, where questions of fact are not primary, but questions of impact and beauty and appreciation for the achievements of humanity are paramount, as well as the production of such art (from literature to film). A subdivision of that would of course be languages, not in their scientific aspect, which is linguistics, but in their cultural aspect and actual use in practice, which is more art than science.

But be that as it may, the difference between history and, say, physics, is the certainty of its results, because history often has to work with very damaged data. But anthropology (and sociology, and often even psychology) is not far better by that standard. Those fields benefit from having direct access to the subjects they study, and thus a much more rigorous and replicable process of acquiring and documenting data. History by definition is the science that remains when you can’t do that, but still have to answer factual questions about what human beings and their societies thought and did, so as to cause the evidence we have today, from artifacts and texts to systems, beliefs, attitudes, and institutions.

There of course already are historical sciences: cosmology, geology, paleontology, even planetology, and so on. And there already are human sciences: anthropology, sociology, economics, political science, and so on. Archaeology is quite obviously a subdivision of anthropology, and history is just the archaeology of texts: a branch of archaeology, and thus of anthropology, that deals with texts, applying its methods to them and to the artifacts recovered and analyzed by archaeology. All these sciences already answer questions about the past. Again, the only difference is that history often has dismal data, highly corrupted, and therefore can only get much more tentative results (except in certain limited respects, where the evidence is as strong as in, say, geology or paleontology or cosmology).

But even apart from this realization, history is supposed to be a fact-finding endeavor. And as such it needs a logically valid and consistent method, one that can be shown to ascertain what’s true more reliably than the alternatives; and this method needs to be taught to all historians, and upheld by their peer review standards, as what defines history as a field whose results are worth heeding. And if that’s not the method laid out in Proving History, or some method evolved or elaborated from it by way of improvement, then what should that method be?
