The most important advice you could ever get for becoming a reliable critical thinker comes down to the following three tips, each of which depends on probabilistic reasoning. You might want to take my online course in Critical Thinking for the 21st Century to really dive into this, and ask me all the questions that come up for you too. But here I’ll start with a little primer on it. I’ll follow the three main tips with some more general advice on how to reason better about probability. Because in real life everything boils down to a probability. And anyone who does not know that, or understand what it entails, will not think reliably about their own beliefs, or anyone else’s.

  1. Of any claim or belief you think is true, or that you feel committed to or need or want to be true, ask yourself: “How do I know if I’m wrong?” Being able to answer that question, knowing you need to answer it, and the answer itself are all invaluable. You cannot know if you are right if you don’t even know how you’d know you were wrong.
  2. The most reliable way to prove a claim or belief true is by trying to prove it false. Failing to prove it false increases the probability of it being true, and does so the more unlikely it is that those attempts to prove it false would fail—unless the belief or claim were true. The scientific method was a crucial discovery in human history because it is almost entirely based on exactly this fact.
  3. A good way to simplify the task of trying to falsify a claim is to find the best opponent of that belief or claim and investigate the truth or validity of their case. They will likely have done a great deal of the work already, so all you have to do is logic-test and fact-check it.

For that last point, a “good” opponent is one who is informed, reasonable, and honest. So as soon as you confirm someone hasn’t been honest, or often relies on fallacious reasoning, or often demonstrates a fundamental ignorance of the pertinent subjects, you should stop listening to them. Go find someone better.

I do realize this advice can’t help the genuinely delusional; they will falsely believe any opponent is dishonest, fallacious, or ignorant regardless of all evidence otherwise, so as to avoid ever confronting what they say. Reason and evidence will have no effect on them, so advice to follow reason and evidence won’t either. The advice in this article can only help the sane.

Once you’ve found a good critic, so defined, it can be most helpful to build a personal relationship with them, or otherwise cultivate a charitable, sympathetic, patient dialogue with them, if either is available (it often won’t be; we all have limited time), and then make it your point to learn as much as possible, rather than reducing your interaction to combative debate. The best way to do this: instead of “refuting” them, aim to understand why they believe as they do. Then you can test for yourself the merits of their reasons, which you will then more clearly and correctly understand. This produces a good falsification test; combative debate, by contrast, tends toward rationalization and rhetoric, which makes for a bad falsification test. And you can’t verify your beliefs with a bad test.

Likewise, if you set about attempting to prove yourself wrong in a way already engineered to fail (e.g. you only go after straw men, you only apply tests that wouldn’t often reveal you were wrong even if you were, and so on), you are not really following these principles, but avoiding them. Surviving a falsification test only ups the probability your idea is right if that falsification test was genuinely hard for a false belief to pass. In fact, the degree to which you can be certain you are right about something is directly proportional to how likely it is that your tests would have exposed you as wrong, if you had been. That is, in fact, the only way to reliably increase the probability of any claim or belief.

These three tips all focus on one key task: looking for evidence that’s unlikely to exist, unless the claim or belief is true; or evidence that’s likely to exist, unless the claim or belief is false. Thus trying hard to find evidence you should expect to exist if it’s false, and not finding it, is the best evidence to look for. Because that’s unlikely…unless the claim or belief is true. Not impossible. But improbable. Which is why understanding probability is always crucial. Because it’s always going to be about that.
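
To see how that cashes out in numbers, here is a minimal sketch in Python, with purely hypothetical probabilities of my own choosing: it compares what surviving a genuinely hard falsification test does to a claim’s probability with what surviving an easy test does.

```python
# Hypothetical illustration: how much surviving a falsification test should
# raise our confidence depends on how unlikely a false claim was to survive it.
prior = 0.5              # how probable we thought the claim was beforehand
p_pass_if_true = 0.95    # a true claim would almost always survive the test
p_pass_if_false = 0.10   # a false claim would rarely survive a genuinely hard test

# Probability the claim is true, given that it survived the hard test
posterior_hard = (prior * p_pass_if_true) / (
    prior * p_pass_if_true + (1 - prior) * p_pass_if_false
)
print(round(posterior_hard, 3))   # ~0.905: surviving a hard test raises the probability a lot

# Contrast with an easy test that even false claims usually pass
p_pass_if_false_easy = 0.90
posterior_easy = (prior * p_pass_if_true) / (
    prior * p_pass_if_true + (1 - prior) * p_pass_if_false_easy
)
print(round(posterior_easy, 3))   # ~0.514: surviving an easy test proves almost nothing
```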

So let’s talk about probability.

Backstory

Theories of probability were first developed in ancient Greece and Rome, but were largely forgotten in the Middle Ages and had to be more or less reinvented in the 17th century. Ancient theories of probability were not developed from dice games as some think, but from divination and lottery systems (some similar to scrabble games today, others more like the lottery systems developed in Athens for selecting jurors, for which they even built mechanical randomizing machines). Also contrary to what you might often hear, the Greek numerical system was not a hindrance to the development of sophisticated mathematical theories, and they achieved some remarkable advances (see my discussions and bibliographies on ancient mathematics in The Scientist in the Early Roman Empire).

Ancient probability theories were based on a developed theory of permutations and combinatorics (the study of counting, combinations, and ratios among them), which we know from hints in extant texts (where they report solutions to problems in combinatorics that are correct, a feat only possible if they had mastered the underlying theory) and from the 20th-century recovery of one lost treatise on combinatorics by Archimedes—which had been erased by medieval Christians and covered over with hymns to God, but has now been fully reconstructed using the particle accelerator at Stanford, which, combined with computerized axial tomography, mapped the location of all the iron atoms in the codex, from which the lost ink could be recovered. (Science beats religion for the win.)

But we know the ancient Greeks and Romans wrote a lot more about probability theory and its foundations, and even used it in some of their philosophy. It’s just that every treatise they wrote on it was tossed in the trash in the Middle Ages, with the exception of one, which was erased (and even that was one of the earliest and most primitive exercises, not representative of subsequent advances). For examples of evidence and discussion on this see my book (cited above) and Russo’s Forgotten Revolution, pp. 281-82. Many other examples can be found, e.g., in places like Cicero, On the Nature of the Gods 2.93, which briefly mentions a scrabble-game method of running probability experiments.

They knew that the fundamental basis of probability amounted to knowing the ratios among possibilities and combinations of possibilities, and that ultimately it was about frequency, expected or actual.

Arithmetic

All empirical arguments are arguments over probabilities. And probabilities are mathematical. So it is not possible to think or argue soundly over probabilities without a basic command of mathematical reasoning. This means all fields of empirical knowledge are fundamentally mathematical. No one gets a pass. Even history is all about the probabilities of things—probabilities regarding what happened, what existed, what caused it. Therefore even history is fundamentally mathematical. I illustrate this extensively in my book Proving History, and the underlying point was independently confirmed by Aviezer Tucker. This means critical thinking even about history requires mathematical knowledge (though not solely; I discuss lots of resulting heuristics and methods in doing history in my monthly online course on Historical Methods for Everyone). Nothing so advanced as formal statistics is necessary; but a basic command of probability theory is (see my article Bayesian Statistics vs. Bayesian Epistemology for what that difference is and entails).

I fully sympathize with the fact that many of us have forgotten how to do a lot of that basic stuff. Most of it really is no more advanced than sixth grade level in American schools. If you need help there, almost everything you would need to know is taught entertainingly well in Danica McKellar’s book Math Doesn’t Suck (2008). Highly recommended for future reading. For now, there are some basic free primers online that will jog your memory and provide some tips on how to calculate and multiply probabilities, such as Decimals, Fractions, and Percentages at Math Is Fun, and Introduction to Probability at Drexel Math Forum.
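
If you just want to check that you remember the basics those primers cover, here is a quick Python sketch (with illustrative numbers of my own) of converting between fractions, decimals, and percentages, and of multiplying the probabilities of independent events.

```python
from fractions import Fraction

# Converting between fractions, decimals, and percentages
p = Fraction(3, 10)            # 3 out of 10
print(float(p))                # 0.3  (decimal form)
print(f"{float(p):.0%}")       # 30%  (percentage form)

# Multiplying probabilities of independent events:
# the chance of two independent 30%-likely events both happening
both = p * p
print(both, float(both))       # 9/100 0.09
```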

Anything more advanced than what’s covered there isn’t necessary, though it can often still be put to good use. So mathematicians have a lot to offer the humanities, if they’d take a more collaborative interest in it; which, I admit, would require people in the humanities to want to talk to them, and Fear of Math often prevents this. Still, not everyone needs to be a mathematician. Just a competent high school graduate.

Probability

Probability is understood in a number of different ways.

Probability as Logical Possibility: For a quick basic-level tutorial in how probability derives from solutions in combinatorics, explore the web tree at CoolMath. You don’t need to master the subject. But you should have some impression of what it is and what it involves. You should note that this definition of probability actually reduces to the next one, “frequency,” and thus isn’t actually a different definition; it just differs by working with “ideal” frequencies rather than frequencies measured in the real world. If you want to explore further how those two things differ and connect, see my discussion in Proving History, pp. 257-65.
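
For a concrete picture of what deriving probability from combinatorics looks like, here is a minimal Python sketch of the standard two-dice example (my own illustration, not taken from CoolMath): the ideal frequency of an outcome is just the ratio of the ways it can happen to all the equally possible ways anything can happen.

```python
from itertools import product
from fractions import Fraction

# Probability as an ideal frequency derived from counting possibilities:
# enumerate every equally likely outcome of rolling two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))          # 36 combinations
sevens = [roll for roll in outcomes if sum(roll) == 7]   # 6 of them sum to 7

p_seven = Fraction(len(sevens), len(outcomes))
print(p_seven, float(p_seven))   # 1/6, roughly 0.167
```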

Ultimately, most probability reasoning in practice is based on guessing at ideal frequencies, using hypothetical models rather than real world surveys. But hypothetical models can be more or less accurate (and thus we can strive to make them as accurate as we need or can manage), and ideal frequencies can be close enough to real frequencies for most practical reasoning in daily life. Particularly if you use a fortiori frequencies: guesses as to actual frequencies which are so far against the evidence you do have that you can be sure that whatever the real frequencies are, they will be larger (or, depending on what direction you are testing, smaller) than that.
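
Here is a small Python sketch, with hypothetical numbers of my own, of how an a fortiori frequency works: argue from a rate you are sure is lower than the truth, and if the conclusion still follows, then whatever the real frequency is, the real probability can only be higher.

```python
# A fortiori reasoning with hypothetical numbers: suppose the evidence supports
# an event rate of at least 1 in 20, but we argue from a deliberately low 1 in 100.
trials = 200
for rate in (1 / 20, 1 / 100):
    p_at_least_once = 1 - (1 - rate) ** trials   # chance of at least one occurrence
    print(f"rate {rate:.2f}: {p_at_least_once:.3f}")
# rate 0.05: 1.000  (on the evidence-supported rate)
# rate 0.01: 0.866  (even on the far-too-low estimate the conclusion still holds)
```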

Probability as Frequency: It is my considered conclusion that all definitions and theories of probability reduce to this. Not everyone agrees. But I am fairly certain that all a probability is, is an estimate of the frequency of something (“How often does that happen?” or, even more precisely, “How often does or will x happen, out of y number of tries/spatial-zones/time-periods/etc.?”). The only question that really differs from case to case and application to application is how you answer one crucial question, “Frequency of what?” What are you measuring, when you state a probability? Often this is not adequately answered. It’s wise to answer it.

Probability as Propensity: This is actually just another way of talking about frequency. This time it is a hypothetical frequency rather than an actual one (as I discuss, again, in Proving History, pp. 257-65). Rolling a single die once in the whole of history has a probability of turning up a “1” equal to the frequency with which that die, in the same intended circumstances, would roll a 1 across a hypothetical set of infinite rolls. It’s still a frequency, but we have to use a hypothetical model rather than a real-world measurement.

Most science works this way (we apply hypothetical models to predict the behavior of the real world, rather than, for example, constantly re-testing whether gravity works the same everywhere and changing the gravitational constant every single time because a different measurement is made every single time). And we do this in most probability estimating in real life, too. And this can indeed differ from the “logical possibility” application I just noted above: instead of using an “ideal frequency”—which assumes, for example, a six-sided die always rolls each number exactly equally often—we could, if we wanted, use a measured frequency. For example, maybe the die we are talking about rolls a 1 slightly more often, and we know this because we tested it with a bunch of earlier rolls. From that data we built a hypothetical model of infinite future rolls of that specific die, rolls that haven’t yet been made and may never be, but we can now more reliably predict what they would be from our more accurate model of its bias.
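
A minimal Python sketch of that idea, using a made-up bias: measure the frequency of a 1 from a batch of “earlier rolls,” then use that measured frequency, rather than the ideal 1/6, as your model for future rolls of that specific die.

```python
import random
from collections import Counter

random.seed(1)

# A die with a slight (made-up) bias toward rolling 1
faces = [1, 2, 3, 4, 5, 6]
weights = [0.20, 0.16, 0.16, 0.16, 0.16, 0.16]

# "Earlier rolls": measure the frequency of each face empirically...
measured = Counter(random.choices(faces, weights=weights, k=10_000))
p1_measured = measured[1] / 10_000

# ...then use that measured frequency as the model for future rolls,
# instead of the ideal 1/6 a fair die would give.
print(round(p1_measured, 3), round(1 / 6, 3))   # roughly 0.2 vs 0.167
```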

Probability as Degree of Belief: This is also, I conclude, just another frequency measurement, and thus reduces again to the same one definition of probability as a measure of frequency. Only now, we are talking about the frequency of being right, given a certain amount of information. For example, if you predict a 30% chance of rain, and, given the information you have, it rains on 30% of the days you make that same prediction (actually, or in a hypothetical extension of the same conditions), then the frequency with which you are “right” to say “it will rain” is 30% (or 3 times out of 10), and you are simply stating that plainly (and thus admitting to the uncertainty). So it is again just a frequency.
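
Here is a quick Python simulation of that calibration idea (a hypothetical setup of my own): if the conditions under which you announce “30% chance of rain” really do produce rain 30% of the time, then “it will rain” turns out to be right on about 30% of those days.

```python
import random

random.seed(2)

# On every day you announce "30% chance of rain", suppose it actually rains
# with a 30% frequency under those conditions.
days_with_that_forecast = 10_000
rained = sum(random.random() < 0.30 for _ in range(days_with_that_forecast))

# The fraction of those days on which "it will rain" turned out to be right
print(round(rained / days_with_that_forecast, 3))   # close to 0.30
```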

Ultimately this is an attempt to estimate some actual frequency (in that example case, the frequency with which it rains, given certain available data). In other words, our “degree of belief” is sound when the frequency of an event (the frequency we are claiming) is the same as our degree of belief. Consequently, when we have good data on either frequency, we can simply substitute it for the other. The two are interchangeable in that respect (the two being “the frequency of x” and “the frequency of our being right about x”). For more on demonstrating these points, see my discussion in Proving History, pp. 265-80.

But of course, often we don’t have good enough data, so we can only state the known frequency, which will be in error when measured against reality, and then correct it as we gather more data proving otherwise. This is called “epistemic probability,” the probability of a belief being true, which can be simply restated as “the frequency of such a belief being true.” Which frequency (which probability), like the rain prediction, approaches the true probability as our access to relevant information increases.
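
That convergence is easy to see in a simulation. A minimal Python sketch, with a made-up “true” rain frequency of 30%: the estimated frequency wobbles when you have little data and settles toward the true value as your information grows.

```python
import random

random.seed(3)

# Estimate an unknown rain frequency (here secretly 0.30) from more and more days
true_rate = 0.30
observations = [random.random() < true_rate for _ in range(100_000)]

for n in (10, 100, 1_000, 10_000, 100_000):
    estimate = sum(observations[:n]) / n
    print(n, round(estimate, 3))
# The estimates wobble at small n and settle near 0.30 as n grows.
```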

There are also two different kinds of frequencies measured this way: the frequency with which we would be right to say “x will occur” (“There is a 30% chance it will rain today” means “There is a 30% chance I would be right to say it will rain today”), and the frequency with which we are right to say that (“There is a 95% chance that there is a 30% chance it will rain today,” which would mean “There is a 95% chance I would be right to say there is a 30% chance I would be right to say it will rain today”). And usually that requires stating the latter as a range, e.g., not “30%” but something like “25-35%.” In technical terms these are called the confidence level (CL) and the confidence interval (CI). The CL is the probability that the CI includes the true frequency of something. And here again, “the probability that” means “the frequency with which.” In both cases.

In various precise ways, without adding more data, increasing the CL increases (widens) the CI, while narrowing the CI entails decreasing the CL. So, for example, if there is a 90% chance that the true frequency of something (e.g., it raining in certain known conditions) is somewhere between 20% and 40%, you will not know whether the true frequency (the probability) is 20% or 40%, or 30% or 35% etc., only that, whatever it is, it is somewhere between 20% and 40%. And even then, only 90% of the time. If you want to be more certain, say 99% certain and not just 90% certain (since 90% means 1 in 10 times you’ll be wrong, and that’s not very reliable for many purposes; whereas 99% means you’ll be wrong only 1 in 100 times, which is ten times better), then the CI will necessarily become even wider (maybe, for instance, 5% to 60%, instead of 20% to 40%), unless you gather a lot more information, allowing you to make a much better estimate. The latter is the occupation of statistical science. Which can only be applied when lots of data are reliably available; which is not commonly the case in history or even ordinary life, for example. So we must often make do with wide margins of error. That’s just the way it is.
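
To make the CL/CI trade-off concrete, here is a Python sketch using the standard normal approximation for a proportion, on hypothetical data (say it rained on 30 of the last 100 relevantly similar days): raising the confidence level widens the interval.

```python
from math import sqrt

# Normal-approximation confidence interval for a proportion (hypothetical data:
# it rained on 30 of the last 100 relevantly similar days).
successes, n = 30, 100
p_hat = successes / n
se = sqrt(p_hat * (1 - p_hat) / n)   # standard error of the estimated proportion

for label, z in (("90% CL", 1.645), ("99% CL", 2.576)):
    low, high = p_hat - z * se, p_hat + z * se
    print(label, f"{low:.0%} to {high:.0%}")
# 90% CL: roughly 22% to 38%
# 99% CL: roughly 18% to 42% (more confidence, wider interval)
```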

Moreover, we usually want confidence intervals we can state with such a high confidence level (like, say, 99.99%) that we don’t have to worry about the CL at all. It’s just that that often requires admitting a lot of uncertainty (a wide margin of error—like, say, admitting that there is a 20% to 40% chance of it raining, and not actually with certainty exactly a 30% chance). Whereas accepting smaller CL’s to get narrower CI’s just trades uncertainty from one to the other. Like, say, instead of requiring a CL of 99.99%, we could accept one of just 99%, but all that means is accepting the fact that, whatever the CI is you are estimating and thus getting to narrow down (like, say, now we get a range of 29.9% to 30.1%, which rounds off to just 30%, allowing you to just say “30%”), you will be wrong about that 1 out of every 100 times (as opposed to being wrong only 1 out of every 10,000 times in the case of a 99.99% CL). But being wrong 1 out of every 100 times would mean being wrong more than 3 times a year if you are making daily predictions of something, like whether it will rain the next day. So moving those numbers around doesn’t really help much. Your uncertainty, whatever it is, never really goes away.

Restating probabilities as degrees of belief doesn’t escape any of these consequences, which is why I bring them up. If any of this is confusing (and I fully confess it may be to most beginners), it is the sort of thing I am available to discuss extensively and answer all questions about in my online course on Critical Thinking. Indeed, this article will be required reading for that course, so you are already ahead of the curve. Of course in that course I spend equal time on two other necessary domains: the cognitive science of human error, and traditional logic and fallacy-detection. But knowing some of the basics of how probability works is crucial to all other critical thinking skills—not least because most claims in pseudoscience, politics, society, and everywhere else are made using “statistical” assertions, and you need to know how to question those, or use them correctly (I give many examples and lessons on that point in particular in Dumb Vegan Propaganda: A Lesson in Critical Thinking and Return of the Sex Police: A Renewed Abuse of Science to Outlaw Porn, and in other articles further referenced therein).

Bayesian Probability: You might often hear about how “Bayesian” probability is “different” from and somehow “opposed” to “Frequentist” probability. This is all misleadingly framed. What is really meant by “Bayesian” probability in this comparison is the “Degrees of Belief” definition I just discussed, which I just showed is again just another frequency of something. And in actual practice, most Bayesian reasoning doesn’t even use that subjective model of probability but uses straightforward measured frequencies.

Indeed, in actual practice, even the so-called “subjective” model is just an estimation of the objective model in the absence of concrete access to it—so these are not even fundamentally different. Subjective modeling is what all humans do almost all the time, because they have to. We almost never have access to the scale and quality of data backing so-called objective models. So we have to guess at them as best we can. We have to do this. Most human beliefs and decisions require it—in daily life, in history, in romance, in politics and economics, in every aspect of our existence, precisely for want of better information, which we almost never have. So there is no point in complaining about this. We are in fact solving the problem with it, thereby preventing ignorance and paralysis, by getting a better grip on how to deal with all this unavoidable uncertainty.

But I cover the subject of Bayesian reasoning well enough already in other articles (for example, two good places to start on that are What Is Bayes’ Theorem & How Do You Use It? and Hypothesis: Only Those Who Don’t Really Understand Bayesianism Are Against It). Here I’ll move on to more general things about probability.

Error

Science has found a large number of ways people fail at probability reasoning, and what it takes to make them better at it. It’s always helpful to know how you might commonly err in making probability judgments, and how others might be erring, too, in their attempts to convince you of some fact or other (even experts). See, for example, my article Critical Thinking as a Function of Math Literacy. A lot of cognitive biases innate to all human beings are forms of probability error (from “frequency illusion,” “regression bias,” “optimism bias,” and the “gambler’s fallacy” and “Berkson’s paradox,” to conjunction fallacies, including the subadditivity effect, and various ambiguity effects and outright probability neglect).

The key is realizing how often you are actually making and relying on probability judgments. We are often not aware we are doing that because we don’t usually use words like “probability” or “odds” or “frequency” or such terms. We instead talk about something being “usual” or “weird” or “exceptional” or “strange” or “commonplace” or “normal” or “expected” or “unexpected” or “plausible” or “implausible” and so on. These are all statements of probability. Once you realize that, you can start to question what the underlying probability assumptions within them are, and whether they are sound. In other words, you can stop hiding these assumptions, and instead examine them, criticize them. Hence, think critically about them.

For example, what does “normal” actually mean? Think about it. What do you mean when you use the word? How frequent must something be (hence what must its probability be) to count as “normal” in your use of the term? And does the answer vary by subject? For example, do you mean something different by “normal” in different contexts? And do other people who use the word “normal” mean something different than you do? Might that cause confusion? Almost certainly, given that we aren’t programmed at the factory, so each of us won’t be calibrating a word like “normal” to exactly the same frequency—some people would count as “normal” a thing that occurs 9 out of 10 times, while others would require it to be more frequent than that to count as “normal.” You yourself might count as “normal” a thing that occurs 9 out of 10 times in one context, but require it to be more frequent than that to count as “normal” in another context. And you might hedge from time to time on how low the frequency can be and still count as “normal.” Is 8 out of 10 times enough? What about 6 out of 10? And yet there is an enormous difference between 6 out of 10 and 9 out of 10, or even 99 out of 100 for that matter—yet you or others might at one time or another use the word “normal” for all of those frequencies. That can lead to all manner of logical and communication errors. Especially if you start to assume something that happens 6 out of 10 times is happening 99 out of 100 times because both frequencies are referred to as “normal” (or “usual” or “expected” or “typical” or “common” etc.).

Many social prejudices, for example, derive from something like that latter error, e.g. taking something, x, that’s true only slightly more often for one group than another, and then reasoning as though x is true of almost all members of the first group and almost no members of the second group. For example, it is “normal” that women are shorter than men, but in fact that isn’t very useful in predicting whether the next woman you meet will be shorter than the next man you meet, because regardless of the averages quite a lot of women are taller than men. And often prejudices are based on variances even smaller than that.

An example of what I mean appears in the infamous Bell Curve study, which really only found a 5-point difference in IQ scores between black and white populations (when controlling for factors like wealth etc.). This was then touted as proving black people are dumber than white people—when in fact the margin of error alone for most IQ tests is greater than 5 points. So in fact, even were that difference real (and there are reasons to suspect it actually wasn’t, but that’s a different point), it is so small as to be wholly irrelevant in practice. To say someone with an IQ of 135 is “dumber” than someone with an IQ of 140 is vacuous. It means nothing, not only because both are exceptionally intelligent, but more importantly because there would be effectively no observable difference in their intelligence in daily life, nor any meaningful difference in career success, task success, learning ability, or anything else that matters. Thus to say “white people normally have higher IQs than black people” based on white people having on average 5 more IQ points (merely purported to be attributable to genetic differences) will too easily mislead people into thinking a significant difference in IQs is being stated, when in fact the difference is insignificant, smaller even than the margin of error in IQ tests. In fact, this is so small a variance that you can score 5 points “dumber” than even yourself just by taking two IQ tests on different days of the week!
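
To put a number on how little such a gap predicts about individuals, here is a minimal Python sketch assuming only a 5-point gap in group averages and the conventional IQ standard deviation of 15 within each group (my illustrative assumption, not a figure from the study): the chance that a randomly chosen member of the higher-averaging group outscores a randomly chosen member of the other is barely better than a coin flip.

```python
from math import erf, sqrt

# How little a 5-point gap in average scores predicts about individuals,
# assuming a within-group standard deviation of 15 (the conventional IQ scale).
mean_gap, sd = 5, 15

# Pick one person at random from each group; the difference of their scores is
# approximately normal with mean 5 and standard deviation sqrt(2) * 15.
z = mean_gap / (sd * sqrt(2))
p_higher = 0.5 * (1 + erf(z / sqrt(2)))   # standard normal CDF at z
print(round(p_higher, 3))                 # ~0.593, barely better than a coin flip
```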

Thus, we need to examine the probability assumptions behind such common judgment words as “normal” or “usual” or “rare” or “expected” or “bizarre” or “implausible” and so on. Because all those words (and more) conceal probability judgments. And those judgments require critical examination to be trusted, or even useful.

Logic

As a philosopher, I find one of the greatest benefits I have ever gotten from standard syllogistic reasoning is learning how hard it is to model a reasoning process with it. The mere attempt to do it reveals all manner of complex assumptions in your reasoning that you weren’t aware of. Reasoning out why you believe something often seems simple. Until you actually try to map it out. And you can learn a lot from that.

This is where trying to model your reasoning with deductive syllogisms can be educational in more ways than one. Most especially by revealing why this almost never can be done. Because most arguments are not deductive but inductive, and as such are arguments about probability, which requires a probability calculus. Standard deductive syllogisms represent reality as binary, True or False; they can function only with two probabilities, 0 and 1. But almost nothing is ever known to those probabilities, or ever could be. And you can’t just “fix” this in some seemingly obvious way, like assigning a probability to every premise and multiplying them together to get the probability of the conclusion, as that creates the paradox of “dwindling probabilities”: the more premises you add (which means, the more evidence you add), the less probable the conclusion becomes. And our intuition that there must be something invalid about that can itself be proved valid.

When we try to “fix” this by generating a logically valid way to deal with premises only known to a probability of being true, to get the actual probability of a conclusion then being true, what we end up with is Bayes’ Theorem—because that’s exactly what Thomas Bayes found out when he tried to do this. And so far as I can tell, no one has since found any other route to fix deductive reasoning into a working inductive method. As I explain in Proving History, every other attempt is either logically invalid, or simply reduces to Bayes’ Theorem—which is essentially a “syllogism” for deriving a final probability from three input probabilities (by analogy the “premises” or premise-sets). But I won’t go into that here; see links above. Regardless of what you think the way is to “fix” deductive logic to work with almost the entirety of human reality, which is inductive, you still must come up with a way to do it; and that means one that is logically valid and sound. Otherwise, 99.9999% of the time, deductive logic is useless.
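
The contrast is easy to see in a small Python sketch with hypothetical numbers: multiplying premise probabilities makes the conclusion less probable the more premises you add, whereas a Bayesian update raises the probability with each piece of evidence that is more expected if the hypothesis is true than if it is false.

```python
# The "dwindling probabilities" problem vs. Bayes' Theorem (hypothetical numbers).

# Naive approach: treat each added piece of evidence as another premise and
# multiply its probability into the conclusion, so adding evidence only lowers it.
premise_prob = 0.9
for n in (1, 3, 5, 10):
    print(n, round(premise_prob ** n, 3))   # 0.9, 0.729, 0.59, 0.349

# Bayes' Theorem instead: each piece of evidence that is more expected if the
# hypothesis is true than if it is false raises the posterior probability.
prior = 0.5
p_e_if_true, p_e_if_false = 0.9, 0.5
posterior = prior
for n in range(1, 6):
    posterior = (posterior * p_e_if_true) / (
        posterior * p_e_if_true + (1 - posterior) * p_e_if_false
    )
    print(n, round(posterior, 3))           # climbs toward 1 as evidence accumulates
```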

So to be a good critical thinker you need to recognize how obsolete the deductive syllogism actually is. This tool of analytical reasoning was invented in Greece over two thousand years ago as the first attempt to build a software patch for the brain’s inborn failure to reason well. It is a way to use some of the abilities of your higher cognitive functions to run a “test” on the conclusions of its lower cognitive functions (a distinction known as Type 1 and Type 2 reasoning). But it has a lot of limitations and is actually quite primitive, compared to what we can do today.

It is notable, therefore, that this primitive tool is so popular among Christian apologists. They rarely-to-never employ the more advanced tools we now have (and even when they do, they botch them). Take note. Because relying on standard deductive syllogistic reasoning corners you into a host of problems as a critical thinker. Hence over-reliance on this tool actually makes it harder, not easier, to be a good critical thinker. Nevertheless, standard syllogistic reasoning has its uses—often as a way to sketch out how to move on to a more developed tool when you need it; but also as a way to map out how your (or someone else’s) brain is actually arriving at a conclusion (or isn’t), so as to detect the ways that that reasoning process could be going wrong. And also, of course, like all the reasoning skills we should learn and make use of, when used correctly and skillfully, standard syllogistic reasoning can be a handy tool for analyzing information, because it can bypass interfering assumptions and faulty intuitions, as well as (when used properly) expose them.

Conclusion

I am re-launching my one-month online course on Critical Thinking this September 1. Follow that link for description and registration. I am also offering seven other courses next month, each of which involves critical thinking skills in some particular way as well. But what makes my course on Critical Thinking distinctive is that though I do cover standard syllogistic logic (and formal and informal fallacies and the like), I focus two thirds of the course on two other things absolutely vital to good critical thinking in the 21st century: cognitive biases (which affect us all, so we all need skills to evade their control over us); and Bayesian reasoning—which is just a fancy way of saying “probabilistic reasoning.” Which I’ve discovered is so crucial to understand properly that I now think no critical thinking toolkit can function without it.

I’ve here given a lot of starter-advice on that. But it all extends from the three central guidelines you should learn to master and always follow: (1) of any claim or belief, learn how to seriously ask and seriously answer the question, “How would I know if I was wrong?”; (2) test that question by attempting to prove yourself wrong (and that means in a way that would actually work), not by attempting to prove yourself right; and (3) stop tackling straw men and weak opposition, and seriously engage with strong, competent critics of your claim or belief, with an aim not to “rebut” them but to understand why they believe what they do. These three principles, carried out together, become a powerful tool for drawing your belief system closer to reality and reliability. But all three require accepting and understanding how probability works in all your judgments. And in anyone else’s.
