We Are All Bayesians Now: Some Bayes for Beginners

Cartoon with the title (Yet another) history of life as we know it... It then shows stages of ape evolving into hominid evolving into humans etc., walking increasingly upright, then suddenly hunched over a computer, and the five stages are humorously named Homo Apriorius, for the ape, Homo pragmaticus, for the hominid, Homo frequentistus for the cave man, Homo sapiens for the human, and then Homo Bayesianus for the equally naked computer user. Thought bubbles are shown emanating from each, showing a mathematical representation of how they think, which roughly translates to, for the ape, they just assume hypotheses are true, then for the hominid they just think about evidence and not hypotheses, and the cave man makes predictions of the evidence from hypotheses, then the human just asserts evidence with hypotheses, while the Bayesian correctly asks what hypothesis is likely given the evidence.

Two things happened recently. I was thinking about better ways to teach Bayesian thinking with minimal math in my upcoming class on historical reasoning (which starts in two days; if you want in, you can register now!). And I just finished reading an advance copy of the first proper academic review of my book Proving History, which is the main textbook I use for my course. That review, by fellow Bayesian philosopher of history Aviezer Tucker, will appear in the February 2016 issue of the academic journal History and Theory.

The review is an interesting mix of observations. I’ll comment on it in more detail when it is officially released. The abstract of his review is available, but it’s not a wholly accurate description of its content. In fact the review is mostly positive, and when critical, Tucker proposes what he thinks would be improvements. He’s uncertain whether a Bayesian approach will solve disagreements in Jesus studies, and he mentions some possible barriers to that that weren’t directly addressed in Proving History, but he is certain Bayesian methods do need to be employed there. The question becomes how best to do that. He makes some suggestions, which actually anticipate some aspects of how I did indeed end up arguing in Proving History‘s sequel, On the Historicity of Jesus (which Tucker hasn’t yet read, so he was unaware of that, but it’s nice to see he comes to similar conclusions about how to examine the evidence). He takes no side in the debate over the conclusion.

Both events converged. Tucker’s review reminded me of some ways to better discuss and teach Bayesian thinking. In truth, Everyone Is a Bayesian. They might think they’re not. Or they don’t know they are. But they are. Any time you make a correct argument to a conclusion in empirical matters, your argument is Bayesian, whether you realize it or not. It’s better to realize it. So you can model it correctly. And thus check that all its premises are sound; and be aware of where you might still be going wrong; and grasp all the ways someone could validly change your mind.

Bayesian Reasoning in a Nutshell

Here is just a basic guide for simple Bayesian reasoning about history…

  • Rule 1: Estimate probabilities as far against your assumptions as you can reasonably believe them to be.

I discuss this method in detail in Proving History (index, “a fortiori, method of”). But in short, what it means is this: You can’t be certain of any probability. But you know enough to know what that probability can’t be. Or can’t reasonably be. You may not know what the odds are of a meteor from outer space vaporizing your house tomorrow. But you certainly know it’s a lot less than 1 in 1000. Otherwise it’d have happened by now. To lots of people you know. If it’s important to test probabilities closer to what they actually are, or what you think they are, by all means do so. And engage whatever research and data gathering is needed to get a better estimate. There actually are some data-based stats on destructive meteor strike frequencies you can track down and average per house-lot of area, for example. But most of the time, that effort simply won’t be necessary. You can see where a conclusion is going just from an initial a fortiori estimate. And any greater precision will just make the probability stronger still (either lower than it already is, or higher, depending on which possibility you are testing).
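As a rough illustration of the a fortiori method, here is a minimal Python sketch. The function name a_fortiori_odds and every number in it are hypothetical, chosen only to show the idea: when you doubt a claim, grant it the most generous values you can reasonably believe, and see whether your conclusion survives anyway.

```python
from fractions import Fraction

def a_fortiori_odds(prior_odds_bounds, likelihood_ratio_bounds):
    """Odds on a claim you doubt, using the bounds most generous to it.

    Each argument is a (low, high) pair of values you could reasonably
    believe. Taking the high end of both argues as far against your own
    assumption (that the claim is false) as is reasonable.
    """
    return max(prior_odds_bounds) * max(likelihood_ratio_bounds)

# Hypothetical bounds, for illustration only.
prior_odds_bounds = (Fraction(1, 1_000_000), Fraction(1, 1000))
likelihood_ratio_bounds = (Fraction(1, 10), Fraction(1, 2))

generous_odds = a_fortiori_odds(prior_odds_bounds, likelihood_ratio_bounds)
print(generous_odds)  # 1/2000
```

If the claim still comes out at 2000 to 1 against on the most generous estimates, then any more precise estimate within those bounds can only push the odds further against it.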

  • Rule 2: Estimate the prior probability of the conclusion you are testing.

Based on past cases like the one you are looking at, what has usually been the case? And how often? When you get emails from relatives with some astonishing unsourced claim about the President of the United States being a member of a secret Devil Cult, what usually is the case with things like that? They are usually made up urban legends. To an astonishingly high frequency even. Some few might end up being true. But this is what you are estimating, when you are estimating a prior: How often do claims like the one you are testing turn out to be true? How often does something else turn out to be the case instead? From estimating that you get a probability.

In history, we are testing causal claims. There are two kinds of historical claims. There are historical claims that ask “What happened?” And those amount to asking “What caused the evidence we have?” Was the claim about the President of the United States being a member of a secret Devil Cult caused by someone actually discovering good evidence that the President of the United States was a member of a secret Devil Cult? Or was it caused by someone making that up? And there are historical claims that ask “Why did that happen?” And those also amount to asking “What caused the evidence we have?” but in this case we’re talking about a different body of evidence, since we are looking not at the historical event itself, but at other historical events leading up to it. But that just makes it the same question again. And in every case, “What usually happens in such cases?” is where you get your prior. If you don’t know (if there is literally no relevantly similar case at all), then the principle of indifference prevails, and in the simplest binary cases will just be 50/50.

Again, I say a lot more about this in Proving History. But this is something we already always do in all aspects of our lives. We make assumptions about what usually does or doesn’t cause the things we see, and adjust our certainty accordingly. It’s also done constantly by historians, even when they don’t realize it. Every time they dismiss a theory because it’s “implausible” or say things like “we don’t know for sure what happened in that case, but this is what usually happened in such cases,” they are reasoning about prior probability.

  • Rule 3: Prior probabilities are always relative probabilities. Because the prior is an estimate of the frequency of a claimed cause of the evidence relative to all other things that could have caused the same evidence.

In other words, the prior is a measure of how frequently something is the cause of the evidence we are looking at relative to all other causes of that same evidence. And the sum of the individual prior probabilities of every possible cause of the evidence must equal 100%, since we know the evidence exists and therefore something caused it, and there are no other possible things to have caused it but those.

This means, for example, that the prior probability of someone having gotten rich by winning the lottery is not the probability of winning the lottery. Rather, it is the relative frequency with which rich people got rich that way, as opposed to some other way. So if half of all rich people got rich by winning the lottery, then the prior probability that a rich person won the lottery is fully 50%. Regardless of how improbable winning lotteries is. Always think in these terms. So, for instance, if the only two ways to get rich each have a 1 in 1000 chance of occurring, and someone is rich, then method A is 1000 to 1 against and method B is 1000 to 1 against, but these balance out. As each is equally likely, the probability of having gotten rich by method A is simply 50%.

It’s too easy to get seduced by the unlikeliness of every possible explanation of a certain observation, and conclude they are all impossible. But that’s not how it works. What we want to know is the relative likeliness among all those explanations. So, for example, someone stealing the body of Jesus and someone else hallucinating seeing him alive again is, like a lottery, highly unlikely. But it’s still millions of times more likely than a space ghost magically regenerating corpse flesh. So if those were the only possibilities (they aren’t, but just for the sake of illustration), then the prior probability someone stole the body of Jesus and someone else hallucinated seeing him alive again is actually very nearly 100%. Because if that is, say, 2,000,000 times more likely than the alternative, then the ratio of the priors is 2,000,000/1. And since the priors must sum to 1 (because they exhaust all possible causes of the evidence), it follows that the prior probability of the “amazing conjunction of theft & hallucination” hypothesis is more than 99.99995% (and the prior probability of the space ghost theory is a dwindling 0.000049999975%). In other words, it doesn’t matter how unlikely the “amazing conjunction of theft & hallucination” hypothesis is. It only matters how likely it is relative to alternatives.
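The arithmetic behind relative priors can be sketched in a few lines of Python. The function name priors_from_relative_weights and the dictionary keys are hypothetical labels of my own; the weights are the ones used in the examples above (2,000,000 to 1, and two methods each with a 1 in 1000 chance).

```python
from fractions import Fraction

def priors_from_relative_weights(weights):
    """Turn relative frequencies of competing causes into priors summing to 1."""
    total = sum(weights.values())
    return {cause: Fraction(w) / total for cause, w in weights.items()}

# Theft-and-hallucination taken as 2,000,000 times more likely than the rival.
priors = priors_from_relative_weights({
    "theft_and_hallucination": 2_000_000,
    "space_ghost": 1,
})
for cause, p in priors.items():
    print(cause, f"{float(p):.10%}")

# Two equally improbable ways of getting rich still split the prior evenly.
lottery = priors_from_relative_weights({
    "method_A": Fraction(1, 1000),
    "method_B": Fraction(1, 1000),
})
print(lottery["method_A"])  # 1/2
```

Only the ratio between the weights matters here. Multiplying every weight by the same factor leaves the priors unchanged, which is the point of the rule.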

This is an important lesson in logic that understanding Bayesian reasoning teaches us.

  • Rule 4: Estimate the probability (also known as the “likelihood”) of all the evidence as a whole if the claim you are testing is true.

Literally assume the claim is true. Then ask, “How likely then is all this evidence?” You must mean all the evidence when you do that. You can’t leave any out—if it will make the claim more or less likely than alternative explanations (alternative causes) of the same evidence. And there are different ways to figure this probability (discussed in Proving History). But the question always comes down to this: Is the evidence pretty much or exactly what we’d expect? (All things considered.) Or is it in some ways not quite what we’d expect? If it’s at all unexpected, if there is anything unexpected about it, then it’s less likely. And you have to estimate that.

This is in fact what we always do anyway, in every aspect of life. And it’s what historians constantly are doing. When they say the evidence perfectly fits a hypothesis, they mean it would have had a very high likelihood (a very high probability) if that hypothesis is true. Whereas when historians say the evidence fits a hypothesis poorly, they mean it’s not very probable that the evidence would look like that, if the hypothesis were true. And this is what you mean, every time you have ever said that in your life, too.

  • Rule 5: Estimate the probability (also known as the “likelihood”) of all the same evidence if the claim you are testing is false. Which always means: if some other explanation is true.

Because you can only know whether a claim is true, by comparing it against other competing claims. This is true in estimating the prior probability, since that is always a relative probability (per rule 3). It is also true here. There are always at least two likelihoods you have to estimate before you can know if some claim is probably true or not. The first is the likelihood on the claim you are testing (rule 4). The other is the likelihood of all that same evidence on an alternative theory—the best alternative, at the very least; but every good alternative should be considered. A bad alternative, BTW, is one that either (A) makes the evidence we have extremely unlikely (and does not have a correspondingly remarkably high prior probability) or (B) has an extremely small prior probability (and does not have a correspondingly remarkably higher likelihood than every other competing hypothesis).

Since the evidence we have has to have been caused by something, such as the event you are claiming happened, the most likely alternative has to be some other event that could have produced the same evidence. If someone says “Joe must have had syphilis because he was observed to be suffering from dementia in his later years,” they are implicitly assuming no other causes of dementia are at all as likely as syphilis (which is not a sound assumption; there are many other common causes of dementia). They are also implicitly assuming there can be no other causes of the observed symptoms of dementia than having dementia—when, in fact, pretending to have dementia is an alternative that has to be accounted for (and there are many other possibilities as well).

So here, you are doing the same thing you did in rule 4. Except now you are “literally assuming” some other cause is true, and then asking “How likely then is all this evidence?” All the same principles apply as were discussed under rule 4. And this again is something you already do all the time; and that historians do constantly. Although, not as often as they should. One of the most common logical fails in history writing is failing to complete this step of reasoning, and assuming that because the evidence we have is exactly what we expect on hypothesis A, that therefore we’ve proved hypothesis A (that it is the most likely explanation of that evidence). No. Because hypothesis B might explain all the same evidence just as well. Or better. The evidence we have may in fact be exactly what we expect on B as well! So taking alternatives into account, and doing it seriously, is a fundamental requirement of all sound reasoning about evidence. You can’t use a straw man here, either. If you aren’t comparing your hypothesis to the best alternative, then your logic will be invalid.

  • Rule 6: The ratio between those likelihoods (generated by following rules 4 and 5) is how strongly the evidence supports the claim you are testing. This is called the likelihood ratio.

Whenever historians talk about a body of evidence or a particular item of evidence being weak or strong, or weighing a lot or a little, or anything like that, they mean by “weak” that this evidence is just as expected or almost as expected on many different hypotheses, and therefore doesn’t weigh very much in favor of one of those hypotheses over those others; and they mean by “strong” that this evidence is not very expected at all on any other hypothesis but the hypothesis it is supporting.

Thus, ironically, what you are looking for when you are looking for strong evidence for a claim—when you are looking for “really good” evidence—is evidence that’s extremely improbable … on any other explanation than the one you are testing. We already expect good evidence will fit the hypothesis. That is, that it will be just what we expect to see, if that hypothesis is true. But that’s not enough. Because as we just noted under rule 5, the evidence might fit other hypotheses equally well. And if that’s the case, then it isn’t good evidence after all. So the key step is this last one, where we look at the ratio of likelihoods among all credible explanations of the same evidence.

And so…

  • The odds on a Claim Being True = The Prior Odds times the Likelihood Ratio

The easiest way to think all this through on a napkin, as it were, is to use that formula, which is called the Odds Form of Bayes’ Theorem. It doesn’t let you see all the moving parts in the engine compartment, as it were. But if you just want to do a quick figuring, or if you already know how the engine works, then this is a handy way to do it.

The prior odds on a claim being true equals the ratio of priors (found through rules 2 and 3). So, for example, if one person gets rich by winning the lottery for every hundred other rich people (who get rich some other way), then the prior odds on a rich person having won the lottery equals 1/100. We can convert that to two prior probabilities that sum to 100%. But that’s next level. For now, just think, if it’s usually a hundred times more likely to have gotten rich some other way than winning the lottery, then the prior odds on having won the lottery are 1 to 100 (for anyone we observe to be rich).

The likelihood ratio is then the ratio of the two likelihoods (generated in rules 4 and 5). So, for example, if hypothesis A explains the evidence just as well as hypothesis B, then the likelihood ratio will be 1/1, in other words 50/50, because the likelihood of the evidence is the same on both hypotheses. The evidence then argues for neither hypothesis. But if the evidence is a hundred times more likely on A than on B, and A and B exhaust all possible causes of that evidence, then the likelihood ratio is 100/1. So, if we have really good evidence that Joe Rich won the lottery, evidence that’s a hundred times less likely to exist if he didn’t (and instead got rich some other way), then we get:

Prior Odds [x] Likelihood Ratio = 1/100 x 100/1 = 100/100 = 1/1

So with that evidence, it’s just as likely that Joe got rich by winning the lottery as that he got rich some other way. It’s 50/50. To get more certain than that, you need better evidence than that. For example, evidence that’s a thousand times less likely to exist if Joe didn’t win the lottery has this effect:

Prior Odds [x] Likelihood Ratio = 1/100 x 1000/1 = 1000/100 = 10/1

Then the odds Joe got rich by winning the lottery are ten to one. That means it’s ten times more likely he won the lottery, than anything else.
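The napkin math above can be checked with a short Python sketch. The function names posterior_odds and odds_to_probability are my own hypothetical labels; the numbers are the ones from the Joe Rich example. The last step, converting odds to a probability, is the “next level” conversion mentioned earlier, using the standard relation probability = odds / (1 + odds).

```python
from fractions import Fraction

def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' Theorem: prior odds times likelihood ratio."""
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds in favor (e.g. 10/1) into a probability (10/11)."""
    return odds / (1 + odds)

prior = Fraction(1, 100)  # one lottery winner per hundred other rich people

even = posterior_odds(prior, Fraction(100, 1))
strong = posterior_odds(prior, Fraction(1000, 1))

print(even)                         # 1
print(strong)                       # 10
print(odds_to_probability(even))    # 1/2
print(odds_to_probability(strong))  # 10/11
```

So odds of ten to one correspond to a probability of 10/11, or about 91%, that Joe won the lottery.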

Once you realize how to do this simple napkin math, you can analyze all kinds of questions, such as about how good the evidence has to be to reach a certain level of certainty in a claim, or about what it even means for evidence to be “good.” It also helps understand what a prior probability is (through the idea of the “prior odds” on a claim being true, something gamblers are always calculating for anything and everything), and how it affects the amount of evidence we need to believe a claim. You’ll start to get a sense, in other words, for the whole logic of evidence.
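For anyone who does want to see the moving parts, here is a minimal sketch of the long form of Bayes’ Theorem, which also handles more than two competing hypotheses. The function name posteriors and the individual likelihood values are assumptions for illustration; the source example only fixes their ratio at 1000 to 1.

```python
from fractions import Fraction

def posteriors(priors, likelihoods):
    """Posterior probability of each hypothesis given the evidence.

    priors:      prior probability of each hypothesis (must sum to 1)
    likelihoods: probability of all the evidence if that hypothesis is true
    """
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())  # overall probability of the evidence
    return {h: j / total for h, j in joint.items()}

# Prior odds of 1/100 expressed as two probabilities that sum to 1.
priors = {"lottery": Fraction(1, 101), "other": Fraction(100, 101)}

# Hypothetical likelihoods with a ratio of 1000 to 1.
likelihoods = {"lottery": Fraction(1), "other": Fraction(1, 1000)}

result = posteriors(priors, likelihoods)
print(result["lottery"])  # 10/11
print(result["other"])    # 1/11
```

The answer agrees with the odds form: a probability of 10/11 is the same thing as odds of ten to one.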

I’ve Said It Before

In my reply to Chris Guest’s remarks at TAM, one of the points I made was:

Guest is first bothered by not knowing where I get my estimates from [in a historical analysis]. But … they are just measures of what I mean by “unlikely,” “very unlikely,” and similar judgments. My argument is that “assigning higher likelihoods to any of these would be defying all objective reason,” … which is a challenge to anyone who would provide an objective reason to believe them more likely. In other words, when historians ask how much [a certain piece of] evidence weighs [for or against a conclusion], they have to do something like this. And whether they do it using cheat words like “it’s very unlikely that” or numbers that can be more astutely questioned makes no difference. The cheats just conceal the numbers anyway (e.g., no one says “it’s very unlikely that” and means the odds are 1:1). So an honest historian should pop the hood and let you see what she means.

This is why it is still important to learn at least the basics of probability mathematics (which really requires nothing more than sixth grade math, and some people think I’m joking but I’m actually serious when I say Math Doesn’t Suck will get you up to speed on that, even if all you do is read the sections on fractions and percentages). You really should start thinking about what you mean when you make colloquial assertions of probability and frequency throughout your life. “It happens all the time”; “that’s abnormal”; “that would be weird”; “I can imagine that”; “that’s not what I expected to happen”; and on and on; these are all mathematical statements of probability. You just don’t realize these are expressions of mathematical quantities (frequencies, specifically, and thus probabilities). But never asking what numbers they correspond to doesn’t make you a better thinker or communicator. It makes you a worse one.

Likewise, in Understanding Bayesian History, one of the points I made was:

Historians are testing two competing hypotheses: that a claim is true vs. the claim is fabricated (or in error etc.), but to a historian that means the actual hypotheses being tested are “the event happened vs. a mistake/fabrication happened,” which gives us the causal model “the claim exists because the event happened vs. the claim exists because a mistake/fabrication happened.” In this model, b contains the background evidence relating to context (who is making this claim, where, to what end, what kind of claim is it, etc.), which gives us a reference class that gives us a ratio of how often such claims typically turn out to be true, vs. fabricated (etc.), which historians can better estimate because they’ve been dealing with this kind of data for years. We can then introduce additional indicators that distinguish this claim from those others, to update our priors. And we can do that anywhere in the chain of indicators. So you can start with a really general reference class, or a really narrow one—and which you should prefer depends on the best data you have for building a prior, which historians rarely have any control over, so they need more flexibility in deciding that (I discuss this extensively in chapter 6 [of Proving History], pp. 229-56).

History is about testing competing claims about what caused the evidence we now have. Which is the first thing to consider. And that also means the second thing to consider is that we often have a lot of prior information about what usually is the cause of certain kinds of evidence. And that information can’t be ignored. It has to be factored in—and it factors in at the prior probability.

Thus, for example, when we look at literature like the Gospels (sacred stories written about revered holy men by religious fanatics, featuring endless wonders), we find that most other cases of that kind of writing (in fact, all of them) are highly unreliable historically. Just peruse the Medieval Hagiography literature for the closest parallels to the Gospels…though even just all the forty or so Gospels illustrate the statistical point. They therefore cannot be treated like personal memoirs or rationalist histories or sourced biographies of the era. Indeed, even secular biographies in antiquity are highly unreliable. As I show in On the Historicity of Jesus (p. 219, w. note 168), experts have demonstrated that “most biographies of philosophers and poets in antiquity were of this type: inventions passed off as facts,” and that fabrication was the norm even for them, even biographies of verifiably historical people.

A third point to close with is that there is still a difference between knowing a claim is fabricated and not knowing whether it is. This has caused much confusion. If I say we cannot tell that a particular story is true or fabricated, that it’s a 50/50 toss up in the absence of any other corroboration, I am not saying the story is fabricated. I’m saying we do not know whether it is true or fabricated, that either is as likely as the other on present information. This is a conclusion most difficult for people who suffer from ambiguity intolerance. They can’t compute the idea of “not knowing.” It either has to be true or fabricated. So if I say we can’t say it’s probably true, because we don’t know that it is more probably true than fabricated, they cannot comprehend me as saying anything other than “it’s fabricated.” If you catch people in that error early you can call their attention to it. Bayesian reasoning helps you realize this is a thing to watch for.

Concluding Observations for the Future

In If You Learn Nothing Else about Bayes’ Theorem, Let It Be This, I listed two things you should definitely get from Bayesian reasoning: theories cannot be argued in isolation; and prior assumptions matter. The first point is that all probabilities are relative probabilities. The probability that something you believe is true is relative to the probability that something else is true instead. So you cannot know how likely your belief is, if you don’t know how likely its alternatives are. All hypotheses have to be compared with other hypotheses. You can’t just rack up evidence for “your” hypothesis and conclude it’s true. You have to test your hypothesis against others, both on the measure of how well they explain the evidence (because they might both perform equally well…or something else might perform better than yours!), and on the measure of prior probability.

The conclusion that we really are all Bayesians now is a growing realization across disciplines. And that realization is linked to both of those features: hypotheses cannot be argued for in isolation from alternatives; and prior knowledge cannot be ignored. Legal scholar Enrique Guerra-Pujol of Puerto Rico (whose own research often touches on Bayesian reasoning in the law) just recently wrote a brief post on this point at Prior Probability, in positive reaction to a paper by F.D. Flam, The Odds, Continually Updated, which summarizes how, in fact…

Now Bayesian statistics are rippling through everything from physics to cancer research, ecology to psychology. Enthusiasts say they are allowing scientists to solve problems that would have been considered impossible just 20 years ago. And lately, they have been thrust into an intense debate over the reliability of research results.

Flam summarizes how frequentism fails to accurately describe reality, and that major scientific errors are being corrected by bringing problems back to a Bayesian framework where they should have started. I’ve often used the example myself of fraud: frequentism is completely helpless to detect fraud. It simply assumes fraud has a zero probability. Because frequentism ignores prior probability. But you can’t do that and claim to be making valid inferences.

Guerra-Pujol zooms right in on the same point made by Flam, in a sentence that truly captures why Bayesian reasoning is the superior framework for all empirical reasoning:

Bayesian calculations go straight for the probability of the hypothesis, factoring in not just the data from the coin-toss experiment but any other relevant information—including whether you’ve previously seen your friend use a weighted coin.

This captures the whole point about the importance of priors. That point is not simply that you have to account for the known frequency of fraud in a given case (though you do), but that ignoring prior information in general is simply not going to get you to an accurate inference about what’s happening. If you know your friend loves using weighted coins, that’s not data you can ignore. Likewise, if you know most remarkably successful drug trials end up being invalidated later, and not just because of fraud but even more because of the statistical anomaly that so many drug trials are being conducted that remarkable anomalies are inevitable by chance alone (a hidden example of the multiple comparisons fallacy), you have to take that into account. Mere frequentism will deceive you here. Similar points have exposed serious flaws across the sciences, where Bayesian analysis exposes the inadequacies of traditional frequentism time and again (I only recently mentioned just one instance, that this is happening in the field of psychology). Nate Silver has been trying to explain this to people for ages now. Alex Birkett has recently composed an article on this as well.
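The weighted-coin point can be made concrete with a small Python sketch. Everything numeric here is a hypothetical assumption of mine, not something from Flam or Guerra-Pujol: a weighted coin that lands heads 3/4 of the time, an observed run of 8 heads in 10 tosses, and two different sets of prior odds depending on what you already know about the person tossing.

```python
from fractions import Fraction
from math import comb

def binomial_likelihood(heads, tosses, p_heads):
    """Probability of seeing `heads` heads in `tosses` independent tosses."""
    tails = tosses - heads
    return comb(tosses, heads) * p_heads**heads * (1 - p_heads)**tails

heads, tosses = 8, 10
fair = binomial_likelihood(heads, tosses, Fraction(1, 2))
weighted = binomial_likelihood(heads, tosses, Fraction(3, 4))

# How much more expected the data are on "weighted" than on "fair".
likelihood_ratio = weighted / fair  # about 6.4

# Same evidence, different prior information (hypothetical prior odds).
odds_stranger = Fraction(1, 100) * likelihood_ratio  # weighted coins are rare
odds_friend = Fraction(1, 1) * likelihood_ratio      # friend is known to use them

print(float(likelihood_ratio), float(odds_stranger), float(odds_friend))
```

With a stranger, the same eight heads still leave a fair coin far more likely. With a friend known to use weighted coins, they tip the odds the other way. The data are identical; only the prior information differs.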

Another famous example of ignoring priors (also mentioned by Flam and Guerra-Pujol) is the famous Monty Hall Problem, which I love not only because my coming to understand it was a eureka moment for me, but also because it’s connected to one of the most jaw dropping examples of sexism in the sciences, when a woman pwned the entire male mathematics establishment, and they ate their foot over it. But the point here is that those men (actual PhDs in mathematics) who tried to mansplain to Marilyn vos Savant why she was wrong, failed to realize that conclusions about probability must take into account prior information. In the Monty Hall problem, that included the subtle fact that Monty is giving you information when he opens one of the doors for you. He’s telling you which door he didn’t open. And that changes everything.
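Here is a minimal Python sketch of the Monty Hall problem worked as a Bayesian update. It assumes the canonical rules: Monty knows where the prize is, always opens a door that is neither your pick nor the prize, and chooses at random when he has two such doors. The function name and the door numbering (you pick door 1, he opens door 3) are my own labels for illustration.

```python
from fractions import Fraction

def monty_hall_posterior():
    """Posterior for the prize location after you pick door 1 and
    Monty opens door 3 to reveal no prize (canonical rules assumed)."""
    # Before Monty acts, each door is equally likely to hide the prize.
    priors = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

    # Probability that Monty opens door 3, given where the prize is:
    likelihoods = {
        1: Fraction(1, 2),  # prize behind your door: he picks 2 or 3 at random
        2: Fraction(1),     # prize behind door 2: door 3 is his only option
        3: Fraction(0),     # prize behind door 3: he never reveals the prize
    }

    joint = {door: priors[door] * likelihoods[door] for door in priors}
    total = sum(joint.values())
    return {door: j / total for door, j in joint.items()}

posterior = monty_hall_posterior()
print(posterior[1])  # 1/3  (stay)
print(posterior[2])  # 2/3  (switch)
```

The information Monty gives you lives entirely in those likelihoods. Change the rules he follows and the likelihoods change, and so does the answer.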

Priors thus can’t be ignored. And anyone who thinks they can “do probability” without them, is not talking about the real world anymore. Yes, you have to justify your priors. They do have to come from actual prior information. Not just your unaccountable guessing. As the tagline reads at Guerra-Pujol’s blog Prior Probability, “Hey, where did you get your priors?” And a lot of pseudologic proceeds from inventing priors out of nowhere (theists do it all the time). But real logic still requires you to come up with a prior probability. And that prior had better be informed. I discuss the challenges of how to determine priors from many competing collections of data in Proving History (especially in chapter six).

I may publish some of my other materials on Bayesian teaching in future. But those who want to get started will find help at Bayes’ Theorem: Lust for Glory. For those who want to delve deeper and see what’s really going on under the hood of Bayesian historical theory, some of the heavier philosophical issues of Bayesian epistemology of history I tackle in responding to the Tim Hendrix review, which supplements Tucker’s as one of the few done by an actual expert in mathematics. You might also benefit from my article on Two Bayesian Fallacies, which shows how a philosopher stumbled badly in understanding how Bayesian reasoning works, and seeing how that’s a stumble can be educational when you want to understand what correct Bayesian reasoning is.



  1. Bill Jefferys January 31, 2016, 6:28 am

    On the Marilyn vos Savant incident with the Monty Hall problem. It’s true that many mostly male mathematicians fumbled this; but Marilyn herself was not blameless.

    Her presentation of the problem was incomplete. I do not recall the details, but she was not clear in her exposition that in the problem, Monty *always* opens a door that *does not* contain the prize. This is absolutely necessary for the puzzle to work.

    For example, if Monty were always to *randomly* pick a door and open it, and it *happened* not to have the prize on this particular occasion (I call this version “Ignorant Monty”), then the probability does revert to 1/2 and it doesn’t matter if you switch doors.

    Or, if Monty opens the door with the prize and says “Ha, ha, you lose” every time you pick the wrong door, but on this occasion you’ve picked the right door and he offers you the chance to switch (I call this version “Monty From Hell”) then you should *never* switch, since Monty From Hell’s opening a door is proof that you have picked the prize door.

    Or, Monty could open your door if you have guessed right and say, “Congratulations, you’ve won!”, and only offer you a chance to switch if you’ve chosen wrong. I call this version “Angelic Monty”, and obviously if that’s the version being played, you should switch if he offers you the chance because then you are sure to win (the door he opens and your original door don’t have the prize, so the only one left does).

    I challenge my students to tell me what they should do if they are offered “Mixture Monty”. In this version, Monty flips a coin backstage and if it comes up heads he behaves like “Angelic Monty”, and if it comes up tails he behaves like “Monty From Hell”. Assuming the coin is fair, should you switch if offered the chance? What’s the probability of winning if you do?

    So, as I said, Marilyn wasn’t blameless in this incident as she was not clear in her presentation of the problem.

    This problem was discussed in a professional journal (American Statistician, as I recall), and Monty himself weighed in, pointing out that he did not in fact use the rules of the standard mathematical puzzle, but behaved more like “Mixture Monty” in that he only offered a chance to switch sometimes, and not always when the contestant had chosen rightly (or wrongly). It was, therefore, just an additional occasional feature of the game.

    1. No, you’ve slightly misremembered. The article I linked to covers all this.

      Hall said what he did differently was bribe the contestants, and her analysis only skipped that part (since, being psychology, it wasn’t interesting to her as a mathematician; and indeed, no version of the Monty Hall problem since has incorporated the role of bribes).

      And Vos Savant was very clear on the rules of the game. Her letter is here. It’s of course obvious anyway, since it’s the only way for the game to work. But nevertheless she is explicit, that Hall only opens doors that don’t have the prize. She even used the million-doors example to show this.

    2. Bill Jefferys January 31, 2016, 2:10 pm

      I was referring to the *original Parade article*, which I did not misremember and which did *not* clearly state that Monty *always* opens a door that does *not* contain the prize.

      In fact, I wrote to her pointing this out.

      In her subsequent discussions, such as the one you linked to (which was not the original article), she was clearer in her statement of the problem. I can hope that what I sent to her after her original Parade article encouraged her to write more clearly.

    3. Bill Jefferys January 31, 2016, 2:28 pm

      Let me be a little more clear.

      If you look at the link you gave, the original problem was posed by Craig Whitaker. If you read the problem as he stated it, you will see that in no way does it say that Monty *always* opens a door that does *not* have the prize. All it says is that he opened a door (on this particular occasion) and that it did not have a prize (on this particular occasion).

      The problem as stated by Mr. Whitaker could have produced the evidence observed, regardless of which of the several different kinds of Montys actually pertained to the problem. That’s because, according to the statement of the problem by Mr. Whitaker, the contestant only knows (a) that Monty knows where the prize is, (b) that Monty has opened a door that is not the one you chose and (c) that the prize was not behind the door he opened. *Any* of the Montys I described could have produced this evidence for this particular contestant. And Marilyn made no effort to correct the problem *as stated by Mr. Whitaker* to bring it into the canonical form.

      Marilyn does not correct this, though I agree that her answer *assumes* the standard puzzle. But this is my complaint. She should have stated the complete problem including all side conditions (what we Bayesians know as background conditions, stuff on the right side of the conditioning bar). That was my complaint about her discussion of this problem.

      1. But Vos Savant does correct him, by explaining that of course he doesn’t throw the game, and that he only ever opens what is empty, which is how the game actually works. (And Monty Hall later corroborated her.) So the actual game Monty plays is the game she analyzes. As she ought to have. And she did make this clear. No one who challenged her said she was wrong about that.


    4. Bill Jefferys February 2, 2016, 3:42 pm

      You’re right that the link you gave does include both the original question and Marilyn’s answer, as published in Parade. Sorry to miss that fact.

      However, this supports what I am saying. Here’s what was originally published. First, Craig Whitaker’s question:


      Suppose you’re on a game show, and you’re given the choice of three doors. Behind one door is a car, behind the others, goats. You pick a door, say #1, and the host, who knows what’s behind the doors, opens another door, say #3, which has a goat. He says to you, “Do you want to pick door #2?” Is it to your advantage to switch your choice of doors?
      Craig F. Whitaker
      Columbia, Maryland


      Then, Marilyn’s answer (the answer as given in the original column):


      Yes; you should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?


      That’s all. Everything else in your link is stuff that was published *after* the original comment and *after* I also wrote to Marilyn to complain that she hadn’t been clear about the conditions under which her answer is correct (e.g., that Monty *always* makes this offer to *every* contestant after opening a door that he *knows* does not contain the prize). Alternatively, (and more generally) it needs to be clear that Monty’s decision to make this offer is *independent* of the contestant’s choice of door:

      P(Monty offers switch|Contestant chooses some particular door)=P(Monty offers switch).

      For example, Monty could decide to offer the contestant the switch only if the contestant had chosen right (he didn’t do that, but that’s immaterial to the point I am making). If he did that, and just didn’t offer the switch when the contestant chose the wrong door, then it would *never* be a good idea to switch (but his decision to offer the switch would no longer be independent of the door the contestant chose, violating the condition above).

      This is even true in the million door example. If you happened to choose the correct door, and Monty only offered the switch in this (unlikely) case, you’d still lose for sure by switching.

      There is nothing that I can see in either Mr. Whitaker’s statement of the problem or in Marilyn’s answer *to him* as given above that makes any of this clear. If I am mistaken, please point out where her answer *above* makes it clear that Monty makes his decision to offer the switch independently of the door the contestant chose.

      You’ve commented that she makes it clear “by explaining that of course hd [sic] doesn’t throw the game, and that he only ever opens what is empty,” but any comments like that are *not* in her original answer above, which is the only thing I am commenting on since it is her original comment that led all of those mathematicians to say that she was wrong (incorrectly…if you use the canonical form of the puzzle as given originally, I believe, by Martin Gardner).

      In other words, your comments that rely on later comments by Marilyn are not relevant to my original complaint…I could hardly have read them before she published her second column taking the many mathematicians to task and where she was more clear in her exposition…since I had long since written her to complain about her unclear answer in the first place!

      You’ve noted additionally that Monty himself made some comments, but they aren’t in the link you gave, and I haven’t been able to locate the exchange on this topic that appeared in *American Statistician*, though I’m pretty sure that’s where it was. Do you have a link to Monty’s comments?

      1. Even with that briefer version, when she said “who knows what’s behind the doors and will always avoid the one with the prize” she is correctly describing the scenario, and her math follows from what she describes. Which is what people should have attended to. They could have challenged her by saying “Yes, you are right in that case, but if he opens boxes randomly, and thus doesn’t prevent the prize being revealed when he offers to open one, then the result is different.” Did anyone correctly say that? Or did they all ignore the line “who knows what’s behind the doors and will always avoid the one with the prize”? 🙂

    5. Bill Jefferys February 2, 2016, 8:26 pm

      “Even with that briefer version, when she said “who knows what’s behind the doors and will always avoid the one with the prize” she is correctly describing the scenario, and her math follows from what she describes. ”

      Yes, that’s what she wrote. And it correctly describes the scenario on any occasion where Monty DECIDES to offer the contestant a switch.

      But “her math follows” ONLY if Monty chooses to offer the switch INDEPENDENTLY of which door the contestant has chosen.

      THAT is the whole point of my complaint about her response to Whitaker’s question.

      Where does her comment say, directly or indirectly, that Monty will DECIDE to offer this choice to any contestant INDEPENDENTLY of which door the contestant has chosen? Or, alternatively, that Monty ALWAYS DECIDES to offer this choice?

      As I said, if Monty had a rule (unknown to the contestant) that he would only offer the switch if the contestant chose the right door, or even if Monty had a rule that he would only offer the switch if the contestant chose the right door *sometimes* (with some probability that would keep the game going but would not be independent of what the contestant had chosen), then the “standard answer” to the Monty Hall problem will be wrong. It would be badly wrong if Monty *never* offers a switch.

      Chapter and verse, please.

      1. I’m not sure I follow you. The correspondent said “the host, who knows what’s behind the doors, opens another door.” There are only three doors. Monty offers to open one of the doors the contestant didn’t pick. That’s plainly stated in the problem, the problem she was asked to solve. She was asked should you switch, after that has happened. So whether he always does this is not relevant to what she was asked.

        Or am I misunderstanding something you mean?

      2. Oh wait, I think I see what you mean: that if Monty only ever does this when the prize is behind the contestant’s door, then it changes the probability to zero for switching.

        I think that’s why no one would ever think this. Because then everyone would know as soon as Monty offered to open another door, he was telling them they already chose correctly. That would be foolish. I can see how it remains logically possible, just not why anyone would ever think Monty would engage in such a self-defeating pattern. But yes, one could add that as a condition that would change outcomes (you would then need to know another variable: the frequencies with which Monty does this when the contestant’s box is correct and when it is not; which not being stated in the problem posed, are unknowns).
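The dependence on those two unknown frequencies can be written out explicitly. This is a sketch under assumptions not stated in the original problem: the symbols q_w and q_r are introduced here for illustration, the first pick is right with probability 1/3, and "offer" means Monty opens an unchosen door showing no prize and then offers the switch.

```latex
\[
P(\text{prize behind other door} \mid \text{offer})
  = \frac{\tfrac{2}{3}\, q_w}{\tfrac{2}{3}\, q_w + \tfrac{1}{3}\, q_r}
  = \frac{2 q_w}{2 q_w + q_r},
\qquad
q_w = P(\text{offer} \mid \text{first pick wrong}),\quad
q_r = P(\text{offer} \mid \text{first pick right}).
\]
```

Whenever q_w = q_r (the offer is independent of the pick, including the case where it is always made) this gives the familiar 2/3. If q_w = 0 (he only offers when you chose right) it gives 0, and if q_r = 0 it gives 1.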

    6. Bill Jefferys February 3, 2016, 6:54 am

      Right, you’ve got it in your last comment, Richard.

      You are also right that if Monty *only* does this when the contestant has chosen the right door, then (if contestants have memory of the show) after a while no contestant will take the bait.

      This is why I posed the challenge to my students, of a “mixed” strategy: Sometimes Monty behaves like “Angelic Monty” (offers the switch if the contestant chooses wrong) and sometimes behaves like “Monty From Hell” (offers the switch if the contestant chooses right). He could (and in my problem to my students, does) decide which strategy he’s going to employ, in advance of the contestant’s going on stage…it’s predetermined. He would do this by some chance operation (in the problem to my students, I say he flips a coin so it’s 50/50 Angelic/Hell, but it could be any random process with any odds, just as long as the decision is made in advance of the game).

      In the situation of the problem I pose, Monty always opens a door, and if the door he opens is empty, the contestant won’t know if he’s facing Angelic Monty and has picked the wrong door, or Monty From Hell and has picked the right door. Again, nothing in either Whitaker’s posing of the problem or Marilyn’s reply (applied to the 3-door version) will inform the contestant as to whether the game being played *this time* is Angelic/Hell or the standard Monty as described by Martin Gardner, since in this particular case an empty door has been opened. Nothing in Whitaker’s question or Marilyn’s reply says anything about multiple games; it is only about the particular game played with this particular contestant.

      My argument with Barnes hinges on the necessity to take into account prior information that you may have, since as Jaynes points out, failure to take such information into account often leads to nonsense answers, as it does in the “fine tuning” arguments for the existence of a deity.

      This happens because it is common for people to leave out background information when writing out Bayes’ theorem. Usually this does not pose a problem since everyone “knows” about the background information, and obviously much background information that we have will be irrelevant to the problem (i.e., the results of Bayes’ theorem do not depend on that information being specified since the data are independent of that information…the color of the socks I am wearing should not affect my inferences about whether the Sun will rise tomorrow! But the existence of the observer *does* affect inferences about fine tuning.)

      In the Monty Hall problem, it’s my point that one needs to specify as background information the rules that Monty is following in the game. The contestant only knows what’s happened in the particular case at hand…but the rules that Monty follows, whether he opens a door at all, for example, the probability that he opens a door that shows the prize or that does not show the prize, matters for the probabilities and therefore whether the contestant “should” switch. There’s an unstated bit of background information that’s needed to get the “standard” result that the prize is behind the remaining unopened door with 2:1 odds, and that was my original complaint to Marilyn. Unlike all the mathematicians, I knew that the answer she gave was “correct” for the *standard* problem, but I faulted her for not making the conditions of the *standard* problem clear. Had the problem been a different one, e.g., Monty From Hell, Angelic Monty, Mixture Monty, or Ignorant Monty (where he decides in advance to open a random door and it happens not to have the prize), then the answer would be different, given the same information available to the contestant (I chose a door, Monty opened another door that didn’t have the prize, should I switch?)

      So: Suppose the contestant is facing Mixture Monty, with the odds of Angelic/Hell decided by the flip of a fair coin. Richard, should he switch if he knows he is facing this particular Mixture Monty?

      Also, Richard, can you post a link to Monty Hall’s comments, which I haven’t been able to locate?
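For the Mixture Monty question posed above, here is a minimal sketch of the posterior calculation. It assumes the rules exactly as described in this thread: Angelic Monty offers the switch only after a wrong first pick, Monty From Hell only after a right one, and the mode is chosen in advance with probability `p_angelic`. The function name and parameter are hypothetical.

```python
from fractions import Fraction


def mixture_switch_win(p_angelic):
    """P(switching wins | an empty door was opened and a switch was offered)."""
    p = Fraction(p_angelic)
    # Wrong first pick (prob 2/3) leads to an offer only under Angelic Monty.
    wrong_and_offered = Fraction(2, 3) * p
    # Right first pick (prob 1/3) leads to an offer only under Monty From Hell.
    right_and_offered = Fraction(1, 3) * (1 - p)
    return wrong_and_offered / (wrong_and_offered + right_and_offered)


print(mixture_switch_win(Fraction(1, 2)))  # fair coin
print(mixture_switch_win(Fraction(1, 3)))  # coin biased toward Monty From Hell
```

Under these assumptions the fair coin gives 2/3, the same number as the standard puzzle, while a coin that picks Angelic only one time in three makes switching a 50/50 proposition. The answer depends on the mixing odds, which is background information the contestant needs.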

      1. Fascinating. Right, if we complicate the model with all that, we have a much bigger problem.

        On Monty’s comments, I thought those were in the article I cited at the start? If not, I can’t recall then where I read them.

  2. Steve Watson January 31, 2016, 5:24 pm

    Hiya Richard,

    My count of people hallucinating Jesus alive is 514 from Paul. I don’t have reason to think they were all liars and it is background that hallucinating gods happened a lot in that era. Doesn’t that make the odds of some Jesus believer hallucinating Jesus alive one hundred per cent because we know that is what actually happened, and more than once? I am not Everyman but after I understood to read Paul knowing the Gospels were a generation later at least, that they were ad-hoccery couldn’t be un-made. A stolen corpse is a more probable solution to the Gospel resurrection story, sure, but reading Paul in context blows the Gospels right out the water. I doubt if someone is not prepared to get that (and hardly any are; just look at the numbers of even atheists reading the Gospel Jesus back into Paul) that arguing for a stolen corpse being more likely than a resurrection is going to have any impact on their beliefs.

    Shivam, “Whats the deal with Ralph Ellis?”. Based on looking at his website for ten seconds, he’s a loony. Why are you asking incomplete, daft questions? If we are going to talk probabilities, I think the odds of the most extravagantly daft Jesus theory you or I could come up with having already been proposed by some loon on the internet are approaching 100%. The same that, if not the Monty Hall problem, some other part of this post will result in someone arguing the toss for pages and pages of comments and the maths getting more and more arcane. ‘Math Doesn’t Suck’ might get us all up to speed but I keep seeing plenty of folk educated beyond sixth grade arguing the maths and lining up authorities for those arguments. Seriously guys, if you have to go beyond a couple of comments or paragraphs, take it up with Richard in PM. It gets really old really fast.

    [P.S. ‘I discuss this method in detail in Proving History (index, “a foriori, method of”).’ You’ve dropped the t in “a fortiori”. — Thanks, fixed, RC]

    1. My count of people hallucinating Jesus alive is 514 from Paul.

      Thousands simultaneously hallucinate visions of the Virgin Mary. And when we look at the only record of a large hallucination event in Paul’s timeline (the only record of over a hundred “seeing Jesus” after his death), it’s Acts 2. In which what is happening is just like a stylized version of Marian apparitions, where it’s just a vague visual hallucination and a feeling.

      So there’s no difficulty there.

      But I doubt Paul wrote quite what we have in 1 Cor. 15:6 anyway, “then he appeared to above 500 brethren at once.” Acts says it was 120 (Acts 1:15), which is a stylized biblical number. The author of Acts used the letters of Paul. So if that author had seen “over 500” in verse 6 of Paul, they would have used that number in Acts, as it’s more impressive than 120. So this convinces me the “over 500” was not in the letter of Paul at that time. In fact the Greek spelling is just a few letters away from a completely different sentence, and scribal errors of a few letters are very common in textual transmission. With an easy mistake, the sentence could have become what it is now, from an original “then he appeared to all the brethren on the Pentecost at once,” matching Acts 2:1, that all the brethren shared a vision of the spirit of Jesus on “the day of the Pentecost, when they were all together in one place.” This would explain why Luke placed this event on the Pentecost (because it’s where Paul originally did; the words for “Pentecost” and “Five Hundred” are nearly identical), and why Paul would refer to Jesus as “the firstfruits of the resurrection” (1 Cor. 15:20, a reference to the Pentecost).

      For details see my analysis (and scholars supporting some elements) in The Empty Tomb, p. 192, and n. 371, p. 231.

      Note that this is the only event Paul says was simultaneous. That he specifically says only here “all at once” means none of the other appearances he lists were group appearances (so, just as verse 7 does not mean “James and all the apostles” at one occasion but a sum of individual appearances, so in verse 5 he does not mean “the twelve” saw Jesus on one occasion, but each of the twelve had a separate individual experience…or claimed to have; we should remember, not everyone need be honest).

  3. John MacDonald February 1, 2016, 10:56 am

    In Mark, the cry of dereliction from the cross, as well as the prayer in the Garden of Gethsemane, seem to speak against the interpretation (usually from Catholics) that Jesus was one third of the trinity. In these two cases, Jesus was clearly not crying out to himself (which would be absurd), but to another: to God.

    1. John MacDonald February 1, 2016, 2:11 pm

      The cry of dereliction in Mark also shows Mark’s Jesus did not believe he would be resurrected after three days, because there is no reason for Jesus to make the cry of dereliction if he is expecting a speedy resurrection.

  4. ArmyMan February 1, 2016, 11:49 am

    William Briggs, a professor of statistics who works at Cornell, has published a critique of your use of Bayes’ Theorem here: http://wmbriggs.com/post/13390/ . Though this has more to do with your use of it in the context of fine-tuning arguments, he also makes general challenges to your methodology. I am curious to know how you would respond to two of them, namely his claims that [1] while Bayes Theorem is useful in “updating” probabilities, it is not necessary for assessing them at the outset, and [2] that probability “measures information,” which can be but is not always in the form of frequencies.

    For convenience I include the relevant quotes. In the section of his paper titled “Bayes,” he says: [1]. “Bayes’s theorem is a simple means to update the probability of a hypothesis when considering new information. If the information comes all at once, the theorem isn’t especially needed, because there is no updating to be done. Nowhere does Carrier actually needs [sic] Bayes and, anyway, probabilistic arguments are never as convincing as definitive proof, which is what we seek when asking whether God exists.” He supports this with an example from card playing. And later he states [2]. “Carrier further shows he misunderstands his subject when he says ‘Probability measures frequency’. This is false: probability measures information, though information is sometimes in the form of frequencies, as in our card example. Suppose our proposition is ‘Just two-thirds of Martians wear hats, and George is a Martian.’ Given that specific evidence, the probability ‘George wears a hat’ is 2/3, but there can be no frequency because, of course, there are no hat-wearing Martians.”

    I am no expert here, but these statements seem plausible to me and they come from someone with relevant credentials, so I would be happy to see your responses.

    1. On those general questions, “[1] while Bayes Theorem is useful in “updating” probabilities, it is not necessary for assessing them at the outset” is not something a Bayesian would ever say, so it sounds like he has an axe to grind against Bayesians—obviously you can’t update a probability if you don’t start with one! Maybe you have to pick one by estimate rather than a previous run of Bayesian equation, but that’s exactly what I’m talking about doing. Although in reality that estimate is just the estimate of a previous intuitive run of the equation (e.g. you are estimating what the result would have been if you worked the Bayesian numbers all the way from raw data—which means all the way from properly basic uninterpreted sense experience). Of course, if you don’t know anything about what the starting probability is, then you can’t make a fine tuning argument in the first place. So the probability has to come from somewhere. And this is what I discuss when I treat questions like the threshold probability (hence read up here). Which is not derived by Bayes’ Theorem but from the data, which include information about the possibility space we are trying to run a probability distribution on.

      As to “[2] that probability “measures information,” which can be but is not always in the form of frequencies,” that’s an assertion with no evidence to back it. I guarantee, anything he says about information from which he derives a probability, will always translate to a frequency of something. That consequence can’t be avoided. I think again he is confusing one kind of frequency statement for another, e.g. he is ignoring the fact that stating degrees of belief is making a statement about the frequency of being wrong.

      I should add that Briggs seems to think this is my argument. It’s the argument of three expert mathematicians published separately in two sources. Which Briggs ignores and does not engage with at all.

      When I look over the Briggs article, he’s just blathering with insults and misrepresentations of their argument (as if it were my argument). It’s hard to take him seriously. He doesn’t actually engage with my argument as presented in TEC either. And his arguments make no sense. For instance, saying that “2/3rd of Martians wear hats” is not a frequency because Martians don’t exist is to confuse the meaning of a statement with its referent. This is a boner mistake for someone who is supposed to be a professor. Obviously saying that “2/3rd of Martians wear hats” is stating a frequency…about a hypothetical set; that’s what the sentence means. Whereas if he thinks this is a relevant analogy, he’s saying fine tuning arguments are impossible because we can’t make frequency statements about hypothetical sets, and yet fine tuning arguments require doing that, e.g. making statements about the frequency of biophilic universes in the set of all nonexistent universes (just like nonexistent Martians).

      I can see no merit to anything he’s arguing. It really is a bunch of illogical crap like that. And it doesn’t engage with my actual argument at any point (much less the arguments of Ikeda, Jefferys, or Sober).

      I say just read my article in TEC. You can see yourself how he isn’t actually making any valid objections to it.

    2. Bill Jefferys February 2, 2016, 3:54 pm

      I agree with your assessment of Briggs’ comments. He’s also got some ideas about physics that are incorrect.

      Luke Barnes and I do not agree about his “fine-tuning” arguments, which I believe to be fatally flawed. He doesn’t like my arguments against his arguments (see my article with Michael Ikeda, in The Improbability of God, edited by Michael Martin and Ricki Monnier, Prometheus 2006).

  5. Lamar Latrell February 1, 2016, 6:39 pm


    I am a longtime reader and am really interested in your novel approach to the study of history. I have also considered taking your course on Bayesian reasoning. Yet I have been reading some pretty powerful critiques of your work and I am starting to have doubts about your competence in statistical reasoning.

    I read your post a couple of weeks ago that criticized Luke Barnes, and I also read the comments. In the comments when you were talking with Tim Hendrix it appeared that you were confused about some concepts.

    I noticed today that Barnes has a response to you posted on his blog:

    These critiques of you are pretty damaging to your credibility. It again appears you have misunderstood basic concepts of Bayesian statistics. Barnes also links to a blog post where another person posted the texts of one of your courses on Bayesian reasoning and proceeded to point out scores of errors.

    Do you have a response to any of this? I really am a longtime fan, and this is all very disappointing to me.

    1. I’ve responded here.

      The McGrews are probably not worth responding to. They are haphazardly commenting on an obsolete six year old document I never use anymore. And their comments on it are mostly trash talk. That old document is pedagogically a nightmare of wording anyway. My book Proving History completely replaces it.

  6. Bill Buckner February 2, 2016, 5:58 am

    I have a question about how you ascribe uncertainties to calculations using Bayes’ Theorem. Let’s use a trivial example from the Wiki page:

    The entire output of a factory is produced on three machines. The three machines account for 20%, 30%, and 50% of the output, respectively. The fraction of defective items produced is this: for the first machine, 5%; for the second machine, 3%; for the third machine, 1%. If an item is chosen at random from the total output and is found to be defective, what is the probability that it was produced by the third machine?

    A solution is as follows. Let Ai denote the event that a randomly chosen item was made by the ith machine (for i = 1,2,3). Let B denote the event that a randomly chosen item is defective. Then, we are given the following information:

    P(A1) = 0.2, P(A2) = 0.3, P(A3) = 0.5.

    If the item was made by machine A1, then the probability that it is defective is 0.05; that is, P(B | A1) = 0.05. Overall, we have

    P(B | A1) = 0.05, P(B | A2) = 0.03, P(B | A3) = 0.01.

    To answer the original question, we first find P(B). That can be done in the following way:

    P(B) = Σi P(B | Ai) P(Ai) = (0.05)(0.2) + (0.03)(0.3) + (0.01)(0.5) = 0.024.

    Hence 2.4% of the total output of the factory is defective.

    We are given that B has occurred, and we want to calculate the conditional probability of A3. By Bayes’ theorem,

    P(A3 | B) = P(B | A3) P(A3)/P(B) = (0.01)(0.50)/(0.024) = 5/24.

    Given that the item is defective, the probability that it was made by the third machine is only 5/24.
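The arithmetic in this worked example can be reproduced with exact fractions. This is a minimal sketch using only the numbers quoted above; the dictionary and function names are illustrative.

```python
from fractions import Fraction

# Share of total output per machine: P(Ai)
priors = {"A1": Fraction(20, 100), "A2": Fraction(30, 100), "A3": Fraction(50, 100)}
# Defect rate per machine: P(B | Ai)
defect_rates = {"A1": Fraction(5, 100), "A2": Fraction(3, 100), "A3": Fraction(1, 100)}


def total_defect_probability(priors, defect_rates):
    # Law of total probability: P(B) = sum_i P(B | Ai) P(Ai)
    return sum(defect_rates[m] * priors[m] for m in priors)


def posterior(machine, priors, defect_rates):
    # Bayes' theorem: P(Ai | B) = P(B | Ai) P(Ai) / P(B)
    return defect_rates[machine] * priors[machine] / total_defect_probability(priors, defect_rates)


print(float(total_defect_probability(priors, defect_rates)))  # 0.024
print(posterior("A3", priors, defect_rates))                  # 5/24
```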

    My question is, suppose the inputs are given with uncertainties, e.g.

    P(A1) = 0.2 ± 0.1, P(A2) = 0.3 ± 0.1, P(A3) = 0.5 ± 0.05

    That is, there is some uncertainty about the relative fractional outputs. And also in the defect rates, viz.

    P(B | A1) = 0.05 ± 0.01, P(B | A2) = 0.03 ± 0.01, P(B | A3) = 0.01 ± 0.005.

    What would then be the uncertainty in the final answer of 5/24?

    1. I didn’t check that math to see what they mean (whether 5/24 is a probability or the odds).

      It sounds like you are asking how you convert odds to probabilities. That’s a procedural question. When you are asking about total odds, you sum the denominator and numerator and divide. So, 5/24 translates to a probability via 5/29 = 17% (approx.). (I didn’t check the Wiki’s math; I’m just assuming they correctly derived the ratio as 5/24 and that they mean this as the total odds, which might not be the case.)

      But rather than burn time on this elaborate example, there is a simpler answer to your question, and it’s precisely what I do in OHJ (see Chapter 12.1): run the math for the odds all at the margin in favor, then run the math for the odds all at the margin against; the two results will be your range of uncertainty: from best odds to worst (within your chosen margin of “reasonable certainty”).
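As a rough illustration of the margin-based approach applied to the factory example, the sketch below evaluates the posterior for the third machine at every combination of interval endpoints and reports the smallest and largest result. It rests on assumptions that are not in the question as posed: the ± values are treated as hard limits, and the output shares are treated as relative weights (the posterior does not change if they are rescaled, so they need not sum to exactly 1 at each corner). All names are hypothetical.

```python
from itertools import product

# (central value, margin) for each machine's output share and defect rate
shares = {"A1": (0.2, 0.1), "A2": (0.3, 0.1), "A3": (0.5, 0.05)}
defects = {"A1": (0.05, 0.01), "A2": (0.03, 0.01), "A3": (0.01, 0.005)}
MACHINES = ("A1", "A2", "A3")


def posterior_a3(weights, rates):
    # P(A3 | B) with shares as relative weights; normalisation cancels in the ratio.
    joint = [w * r for w, r in zip(weights, rates)]
    return joint[2] / sum(joint)


def corner_bounds():
    # Low and high endpoint for each of the six uncertain inputs.
    intervals = [(shares[m][0] - shares[m][1], shares[m][0] + shares[m][1]) for m in MACHINES]
    intervals += [(defects[m][0] - defects[m][1], defects[m][0] + defects[m][1]) for m in MACHINES]
    # The posterior is monotone in each input, so its extremes sit at corners.
    values = [posterior_a3(c[:3], c[3:]) for c in product(*intervals)]
    return min(values), max(values)


low, high = corner_bounds()
print(f"P(A3 | B) lies between {low:.3f} and {high:.3f}")
```

The range this produces is wide, which is the behavior both sides of the exchange that follows are arguing about.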

      Alternatively, don’t use odds or fractions. Just stick with probabilities. You can do that even with the Odds Form of BT, but it requires you to know the whole formula (see Proving History, Appendix), since you then are plugging in probabilities to get the odds. That full formula is: P(h|e.b)/P(~h|e.b) = P(h|b)/P(~h|b) x P(e|h.b)/P(e|~h.b).
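To make the odds form concrete, here is a small sketch that plugs the factory numbers into it (posterior odds = prior odds × likelihood ratio) and then converts the resulting odds to a probability. The function names are illustrative, and h is taken to be "made by the third machine".

```python
from fractions import Fraction


def posterior_odds(prior_h, like_e_given_h, like_e_given_not_h):
    # P(h|e.b)/P(~h|e.b) = P(h|b)/P(~h|b) x P(e|h.b)/P(e|~h.b)
    prior_odds = prior_h / (1 - prior_h)
    likelihood_ratio = like_e_given_h / like_e_given_not_h
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    # Odds of a:b correspond to a probability of a / (a + b).
    return odds / (1 + odds)


p_a3 = Fraction(50, 100)          # P(A3)
p_b_given_a3 = Fraction(1, 100)   # P(B | A3)
# P(B | not A3): defect rate among items from machines 1 and 2 combined
p_b_given_not_a3 = (
    Fraction(5, 100) * Fraction(20, 100) + Fraction(3, 100) * Fraction(30, 100)
) / (1 - p_a3)

odds = posterior_odds(p_a3, p_b_given_a3, p_b_given_not_a3)
print(odds)                       # 5/19
print(odds_to_probability(odds))  # 5/24
```

With these inputs the posterior odds come out as 5 to 19, so the 5/24 in the quoted example is already a probability rather than odds.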

    2. Tim Hendrix February 3, 2016, 6:21 am

      Hi Bill,

      This is a really good question. What you *should* do of course depends on various things, and I would rather not turn this into a long argument. However, you should be aware that what Richard is proposing is very different from what a Bayesian statistician would do. What any Bayesian statistician would do is as follows:

      You would first describe all your probabilistic statements as probabilities, for instance you would introduce a variable x such that
      P(A1) = x
      and then specify that x follows some distribution (typically a Beta distribution, since it takes values between 0 and 1). Then, when you do your computation, you would apply the only tool you have available, the rules of probability theory, and compute (for instance, and I am sorry this is looking so awful)

      p(B) = int [ p(B|A1,x)p(A1|x) + p(B|not A1,x)p(not A1|x) ]p(x)dx
      = int [ p(B|A1)x+ p(B|not A1)(1-x)]p(x)dx
      = p(B|A1)Ex + p(B|not A1)(1-Ex)

      where Ex is the mean of x under its prior distribution p(x). So a Bayesian approach to probabilities gives a self-contained approach to uncertainty in probabilities :-).
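The last line of that derivation can be checked numerically. In the sketch below, the Beta(2, 8) prior (mean 0.2) and the value used for p(B|not A1) are assumptions chosen only to line up with the factory example; nothing in the comment above fixes them. It compares the closed-form result with a Monte Carlo average over draws of x.

```python
import random


def marginal_defect_prob(p_b_given_a1, p_b_given_not_a1, alpha, beta):
    # p(B) = p(B|A1) E[x] + p(B|not A1) (1 - E[x]), with E[x] = alpha / (alpha + beta)
    mean_x = alpha / (alpha + beta)
    return p_b_given_a1 * mean_x + p_b_given_not_a1 * (1 - mean_x)


def monte_carlo_defect_prob(p_b_given_a1, p_b_given_not_a1, alpha, beta, n=200_000, seed=0):
    # Average the integrand over draws x ~ Beta(alpha, beta).
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.betavariate(alpha, beta)
        total += p_b_given_a1 * x + p_b_given_not_a1 * (1 - x)
    return total / n


# Assumed inputs: P(B|A1) = 0.05 from the example; P(B|not A1) = 0.0175 is the
# pooled defect rate of machines 2 and 3, held fixed here for simplicity.
analytic = marginal_defect_prob(0.05, 0.0175, alpha=2, beta=8)
simulated = monte_carlo_defect_prob(0.05, 0.0175, alpha=2, beta=8)
print(analytic, simulated)
```

Because B depends on x only linearly, only the mean of the prior matters for p(B). The spread of the prior would matter for quantities that are not linear in x, such as the posterior P(A1|B).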

      Regarding Richard’s proposal, the reason why this is not used is
      1) that a statement such as 0.1 +- 0.2 is likely a statement about a frequentist confidence interval or some such, which is avoided on a Bayesian account of probabilities (i.e. it should be translated as a statement about probabilities if this is information we have available)
      2) if the variable is 0.1 +- 0.2, normally we mean it could also be 0.9 (but with very low probability), and if these extended ranges are used our min/max statements are going to be overly wide
      3) if we take the min/max our result is not representative of the typical behavior of the model
      4) it’s just ad hoc with no theoretical foundation.

      1. Indeed.

        Also note that this question is about my use of margins of error, which I have only done in history, not the fine tuning argument. That context needs to be reintroduced:

        1) is also what I said: if you are going to use percentages as error margins, you have to run the whole thing with percentages and not odds. But that gets complicated for humanities majors, which is why I recommend simpler methods; we don’t usually need the precision of more technical methods there, because we don’t have the precise data that warrants it. It would be like using a three stage rocket to commute to work.

        2) this is exactly what I recommend in PH (and run in parallel to the odds method in OHJ). The concern that our min/max is going to be too wide is a category fallacy. It is simply the case that in history, our margins of error often are that wide. And we have to be honest about that. As I explain in PH, that’s what makes the difference between history and science: we often do not have scientific certainties.

        3) historians aren’t asking about the typical behavior of a model, they are asking whether a claim is true or false. There are only two options there.

        4) I don’t fathom what’s ad hoc about calculating the a fortiori posterior and calculating the a judicatiori posterior. The full theoretical foundation for this comes from the philosophy of history and is detailed in PH. Quite simply, this is necessarily what you must do when the two questions you are asking are, “what is the highest probability of being true that I can reasonably believe this claim has?” and “what is the lowest probability of being true that I can reasonably believe this claim has?” That’s simply what historians are doing. And I am modeling what historians do. Scientists can ask different questions and do different things.

  7. Scott Scheule February 2, 2016, 3:22 pm

    Yes, he always does, and you respond by calling him insane or a kook. The debate has devolved to name-calling on both sides, but I suppose that’s just what happens when two people are both sure they’re correct. Better if we could do without that, but that may be more than we can expect from mere mortals.

    Regardless, there are some things I think a lot of us are hoping to see explained, and since Barnes has been exhaustive (and prompt) in responding to your commentary, so far as I can tell, one hopes you can be similarly responsive to his posts, including his questions addressed to you:

    1. What are the “six constants of nature”? Why is it that no physicist thinks that?
    2. Where are the peer-reviewed scientific publications that show how fine-tuning has been “refuted by scientists again and again”?
    3. Where are the peer-reviewed scientific publications that “only get [a] “narrow range” by varying one single constant”?

    These seem to be simple factual claims of yours that Barnes is claiming you are wrong about. So far as I can tell, rather than 1. proving you’re correct by citing evidence in response, 2. retracting your original claim, or 3. showing how Barnes is misconstruing your claims, all you’ve done is simply ignore the questions. I find this, regretfully, rather similar to how you responded to Thom Stark during that back-and-forth. He too made several points on which you were simply factually wrong (I can list them if necessary), and rather than respond or admit error, you simply opted not to respond. In your responses to Ehrman, you’ve been incisive and complete. I find those markedly different from your response on these topics, and I think Lowder would agree with me on this. To be perfectly honest (and again, I am fully willing to be shown wrong, as someone who finds your work often wonderfully edifying), I suspect you have wandered outside your area of expertise and are too stubborn to admit it.

    Those questions from Barnes are over a year old, and, so far as I can tell, unanswered. I think we all would like to see your response, whether retraction or vindication of your claims, as well as the answers to the many criticisms he’s made in his most recent replies. My opinion, of course, doesn’t mean much, so I suppose we’ll have to wait for Lowder to opine once more. If he once again finds Barnes to be making devastating points, not indicative of a “kook,” maybe you’ll humor the suggestion that you may have made some mistakes here. Or, perhaps, the “kook” label will then be thrown at Jeff.

    1. I never said there are only six constants of nature. I said Christian apologist James “Hannam also criticises [Stenger’s model] for only addressing four rather than six constants, but in fact only four constants are relevant for generating long-lived stars.” That’s Hannam, not me. I never said there were only six constants or even that there were six (and that was indeed fourteen years ago—I trusted Hannam then that there were about six; I know better now). Barnes links only to this article. So once again, he doesn’t even know what I’m saying, and gets it wrong, and over-relies on material over a decade old. You can see why I question his competence to have an intelligible debate.

      I’ve answered the other questions. And Barnes has replied to that answer (finally; one of the few instances in which he has actually responded to something I actually said). I’ll discuss it soon.

  8. Bill Buckner February 2, 2016, 8:39 pm

    It sounds like you are asking how you convert odds to probabilities.

    Um, no. And you don’t have to check the math (well, go ahead if it pleases you.) The math is trivial. The answer is correct. I was just curious how you would calculate the uncertainties. Now I know.

  9. Tim Hendrix February 3, 2016, 6:40 am

    Hi Richard,

    Richard: On those general questions, “[1] while Bayes Theorem is useful in “updating” probabilities, it is not necessary for assessing them at the outset” is not something a Bayesian would ever say,

    Actually I have said this earlier and I am a Bayesian statistician :-). But it is also true! Bayes theorem (or the rules of probability theory) is a consistency requirement and not a way to assign probabilities from the onset. You even seem to agree on this point just a few sentences later :-).

    As to “[2] that probability “measures information,” which can be but is not always in the form of frequencies,” that’s an assertion with no evidence to back it. I guarantee, anything he says about information from which he derives a probability, will always translate to a frequency of something. That consequence can’t be avoided. I think again he is confusing one kind of frequency statement for another, e.g. he is ignoring the fact that stating degrees of belief is making a statement about the frequency of being wrong.

    Well, I would rather not get into this one again, but just for the sake of the casual reader I would like to point out that this particular interpretation of probabilities is your own and should be seen in contrast to all major Bayesian accounts of probabilities I am aware of. Perhaps you would agree that on the mainstream Bayesian view (for instance a Cox-axiom based view or a de Finetti gambling-based view), probabilities are thought to be a measure of states of belief/knowledge/information or some cognate thereof and only indirectly and not as a matter of their definition relate to frequencies?

    I should add that Briggs seems to think this is my argument. It’s the argument of three expert mathematicians published separately in two sources. Which Briggs ignores and does not engage with at all.

    I agree with the three articles from a formal point of view, but I have to disagree with your interpretation of these studies, as I think they do not support the conclusion you draw. You can find my argument here:


  10. Tim Hendrix February 4, 2016, 4:07 am

    Hi Richard,

    Bill asked how a person would typically perform inference when there was uncertainty about the probabilities, and I simply tried to give the answer suggested by Bayes theorem and what is done in practice in data analysis.

    Re. “this is exactly what I recommend in PH (and run in parallel to the odds method in OHJ). The concern that our min/max is going to be too wide is a category fallacy. It is simply the case that in history, our margins of error often are that wide.”

    As I said, I don’t want to comment on what historians ought or ought not to do. However, in Bayesian data analysis you wish to see how your modelling assumptions (such as the uncertainty in the probabilities) translate into uncertainty in the quantities of interest. The way to do this is using Bayes theorem, and it will lead to different answers than the min/max approach, whose answers are, from a Bayesian perspective, spurious. You can say that since we are doing history we ought to do something else; however, I was simply trying to provide the standard answer. If you are interested, I can recommend guides for how parameter estimation is normally done using Bayes theorem?

    Re. “I don’t fathom what’s ad hoc about calculating the a fortiori posterior and calculating the a judicatiori posterior. The full theoretical foundation for this comes from the philosophy of history and is detailed in PH. Quite simply, this is necessarily what you must do when the two questions you are asking are, “what is the highest probability of being true that I can reasonably believe this claim has?” and “what is the lowest probability of being true that I can reasonably believe this claim has?”

    Well, by ad hoc I simply mean a method for manipulating probabilities that is not Bayesian. From a data-analysis perspective, you wish to know how probable different values of the quantities of interest are under the assumptions you have made (model assumptions, assumptions about uncertainty in probabilities, whatever), and Bayes theorem gives you a way to compute this. If you would rather do the min/max procedure, you get numbers that do not represent this and are not computed using Bayes theorem. You can say this is what you want in history, but it is certainly not what you want in data analysis.


    1. …what is done in practice in data analysis.

      Which is fine. I just don’t want anyone to mistake that for being the same thing as modeling and testing historical inferences.

      You are describing a method correctly, that doesn’t pertain to ordinary inference procedures.

      This is rather like teaching someone calculus, when they ask you how to do geometry. Or, like I said, advising someone to commute to work on a three stage rocket, when walking will do.

      Most inferences in philosophy and history (and indeed even sometimes in the sciences) do not benefit from extremely advanced and elaborate procedures. Yes, those procedures are useful, when you have the data that they are designed to get the best out of. But they are far too complicated for most purposes.

  11. stevenjohnson2 February 4, 2016, 3:11 pm

    Reading Nate Silver on baseball in The Signal and the Noise, it was never clear to me that Silver’s methods could even ask the question, How meaningful is the notion of a “best” sports team?

    I’m not clear on how Marilyn vos Savant on the Monty Hall problem comes into it. But this is disconcerting because I could never understand any of her explanations. Nor could I ever understand how Bayesian reasoning does anything other than confuse the issues. For me, it was forgetting all the stuff about new information and updating that finally made it clear. Assuming random distribution of prizes, given enough trials, the frequency of any given door (left, right or center) hiding the grand prize would be 1/3. This real frequency does not change. The real frequency with which the grand prize would be found behind another door then is 2/3. That doesn’t change either. When Monty Hall opens one door, and offers a chance to switch to another door, the frequency with which the prize is found behind another door is still 2/3. Of course you switch. You can’t just update the information to think, two doors, therefore a frequency of 1/2 in a large series of random trials. Evidently what Bayesians mean by updating is not so evident to some of us. Are we still Bayesians?
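
The long-run frequencies described above can be checked by simulation. The sketch below assumes the standard rules: the prize is placed at random, and the host always opens a door that is neither the contestant's pick nor the prize. The function names are illustrative only.

```python
import random


def play(switch, rng):
    """Play one round of Monty Hall; return True if the contestant wins."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door hiding no prize that the contestant did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won over many random trials."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print(f"stay:   {win_rate(switch=False):.3f}")
print(f"switch: {win_rate(switch=True):.3f}")
```

Over many trials, staying should win about 1/3 of the time and switching about 2/3, which matches the frequency argument in the comment.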

    Prompted by this post I bought a copy of Jordan Ellenberg’s recent book and re-read the chapter on Bayesian inference. When Ellenberg declared that Bayesian reasoning leads you to take Nick Bostrom’s notions of simulated universes seriously, it inspired grave reservations. (This is similar to the recent declaration that Bayesian reasoning indicated some high probability the multiverse was real!) Bayes’ theorem is so easily read as frequentist that it is tempting to read it as a necessary corrective to mechanical use of null hypotheses and p-values. But as someone (Mario Bunge, I think) observed, propositions don’t have probabilities. Assigning prior probabilities to the data set of evidence is the hard work of reasoning, but it’s not clear how Bayes’ theorem helps with this.

    1. In a sense, you are right: Frequentism and Bayesianism are not so at odds as the battle-lines claim. But there are significant differences, which seriously matter (to life and limb even, as changes made in medical science and research due to introducing Bayesian reading have recently shown; but also to determining the actual rate of false positives in science research, which is being discussed everywhere in the literature now, as people realize the limited use of p-value reasoning). And there are general lessons one learns from thinking like a Bayesian, as it represents a complete logic, the entire machinery of a scientist’s inference from premise to conclusion, whereas frequentism by itself conceals much of that apparatus, leaving scientists prone to self-deception and error.

  12. Bill Jefferys February 4, 2016, 7:35 pm

    Steven, your approach is perfectly fine. In fact, it has been systematized in Gerd Gigerenzer’s book, “Calculated Risks”, where he calls this approach “Natural Frequencies.” I have used it (and his book) for many years in my honors class (for non-scientists) on Bayesian decision theory. I find that starting out from a “Natural Frequency” point of view is a very good way to introduce the ideas of Bayesian inference. In my class, which doesn’t use calculus, I can stick to finite state spaces so this works just fine. After getting the students comfortable with Gigerenzer’s approach and introducing decision trees, I can easily introduce the more conventional Bayesian notation and ideas.

    The one thing that is different from frequentism is that Bayesians can regard probability even if the event is unique, i.e., not the result of something repeated over and over. For example, a given football game is played only once. It still makes sense (if you are into this sort of thing) to estimate probabilities of a given team winning this game and thus to make bets at certain odds (but not other odds) as rational bets.

    1. P.S. For those interested, “Bayesians can regard probability even if the event is unique” also describes hypothetical frequentism. Which begins the debate over what the better way is to do hypothetical frequentism: by just adapting traditional frequentism to it, or going Bayesian on it. Of course, I think the latter.

    2. Bill Jefferys February 5, 2016, 4:53 am

      I agree with Richard on this. I think that going fully Bayesian is the better way.

      My use of Gigerenzer’s “Natural Frequencies” point of view is pedagogical. It is an easy way to introduce the ideas I want to present. Here is one example that I often use when people ask me what I do and I want to show them what Bayesian stats looks like (many will have had standard stats courses, which they hated).

      I tell them (correctly) that in 90% of the cases where a woman (in the general population) has breast cancer, a mammogram will come up positive, and in 90% of the cases where she does not have breast cancer, it will come up negative. I then ask, what if a woman has a mammogram and it comes up positive? How worried should she be?

      As Gigerenzer points out, a large fraction of physicians will incorrectly say “90% probability she has cancer”, which is of course wrong. To answer correctly, you need the additional information that in this population about 1% of women have undetected breast cancer (the prior). Then I say, in a group of 1000 women, 10 will have cancer, and 9 of those cancers will be detected by the mammogram. Of the 990 women who do not have cancer, 99 false positives will occur. So in that group, there will be 99+9=108 positives, of which only 9 are actual cancers. So the proportion of women that test positive and have cancer is 9/108, or about 8%. The beauty of this approach is that I can do it in the parking lot, without even pencil and paper, and it will be understood.

      Once the students understand this idea thoroughly, it is easy to transition to fully Bayesian calculations.
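
The mammogram example can be written both ways to show the transition. The sketch below uses the numbers from the comment (1% prevalence, 90% sensitivity, 90% specificity, a group of 1000 women). The function names are illustrative assumptions.

```python
def natural_frequency_posterior(population, prevalence, sensitivity, specificity):
    """Count people, as in the parking-lot version of the argument."""
    with_cancer = population * prevalence              # 10 of 1000
    without_cancer = population - with_cancer          # 990 of 1000
    true_positives = with_cancer * sensitivity         # 9 detected cancers
    false_positives = without_cancer * (1.0 - specificity)  # 99 false alarms
    return true_positives / (true_positives + false_positives)


def bayes_posterior(prevalence, sensitivity, specificity):
    """Same quantity from Bayes' theorem: P(cancer | positive test)."""
    numerator = sensitivity * prevalence
    denominator = numerator + (1.0 - specificity) * (1.0 - prevalence)
    return numerator / denominator


counted = natural_frequency_posterior(1000, 0.01, 0.9, 0.9)
formula = bayes_posterior(0.01, 0.9, 0.9)
print(f"natural frequencies: {counted:.3f}, Bayes' theorem: {formula:.3f}")
```

Both routes give 9/108, or about 8%. The population size cancels out of the natural-frequency version, which is why it agrees with the formula.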

  13. Bill Jefferys February 4, 2016, 7:39 pm

    Agree with Richard’s comments. In particular, p-values are very problematic for a number of reasons. See:


    Also, I recommend following Andrew Gelman’s blog; it is one of the few things that I read faithfully. Andrew talks extensively about how things like p-values are misused. Type “garden of forking paths” into his search engine, for example.


