Pseudo-AI is garbage and a bubble. And AI content is banned in comments on my blog. Yet people are still gullibly over-trusting AI chatbots, constantly sending me chatbot content and gushing over how brilliant it is and how it proves whatever they want. I have to explain over and over again that they are being fooled and need to stop being so naive. Because this is dangerous. Not only because AI chatbots are easy to manipulate, which means you are simply signing on to be conned by state and corporate manipulation and propaganda (you need to think for yourself, not let spooks and corpos think for you). And not only because relying on AI literally makes you increasingly dumber and more incompetent, because it de-trains you from thinking and researching critically (and you need to be constantly building those skills, not letting them atrophy by taking orders from a parrot). But also because pseudo-AI does not think or reason. It just regurgitates what’s on the internet. Often poorly.
This is the thing you need to accept. Asking a pseudo-AI chatbot to do research or reason for you is literally like asking a high school dropout on weed to do research or reason for you. “Hey, Jimbob, spend the day googling to answer this question for me.” All the errors, conflations, failures, omissions, confusions, even hallucinations you would expect from that poor dumb Jimbob are exactly what you will always get from chatbots. And I do mean always. Do not repeat corporate lies and claim “it will get better; it’s just around the corner.” It has been scientifically proven, many times over, that it will never ever ever ever get better. This is as good as it will ever be. And it’s crap. The sooner you learn and accept this, the better. Failure to learn this will be your doom, as you spiral into fatal stupidity and become a state or corporate lemming, all the while marveling at the nonexistent “genius” of this stupid talking parrot.
In my last article on this I covered the narrow use cases where AI has had and will continue to have utility, but they are very narrow cases, and mostly business-end, or just productivity enhancement for specialized tasks. The reason chatbots suck is that they are general-function tools, and AI does worst at general tasks, better when constrained to highly specific tasks, and better still when guardrailed with a shit-ton of human engineering around it to keep it in check. And those applications are not relevant to my point today. Today I am only talking about the use of chatbots to replace independent or critical thinking—relying on pseudo-AI to “do your own research” (when you should actually be doing your own research), or worse, to think and reason for you, which it never actually does. A tutorial seems to be needed, so I can stop repeating myself. Because chatbots might stick around, but they will never be reliable. So you need to understand how to use them competently, and not like a stooge.
Of course, pseudo-AI chatbots might not even stick around. Because pseudo-AI is heavily subsidized by runaway capital soon to vanish, its current availability is literally unsustainable; and it will be, for most users, unaffordable going forward. I think a lot of people don’t understand this. But if you don’t get what I mean, or don’t believe me, catch up with this, and this, and this, and that, and that, and that, and this, and this, and this (many of which also relate to business applications of AI tools, but I summarized what corpos need to understand about pseudo-AI last time; this time, I’m talking about chatbots). But even in the worst-case scenario, where you will only have access to these chatbots for hundreds of dollars a month and are still stupid enough to pay that rate, you’ll still need this tutorial.
Basic Bot Vaccination Toolkit
From the constant shower of chatbot output that gullible people send to my inboxes, I realize no one actually understands why and how chatbots are unreliable. So later I’ll give some real-world examples, so you can see exactly what I am talking about, and why it’s a problem. But to avoid TLDR, I want to give you the inoculation toolkit first. If you still don’t know why you need this kit, or are unclear on how to implement it, the examples in the next section will help.
You need to know:
- What they leave out. Omission is the most dangerous error. This is how cranks and propagandists manipulate you: by omitting crucial information or sources or perspectives, which frames you into only believing or fact-checking or arguing with what they presented, and thus never realizing you should be looking for what they left out. Because chatbots don’t reason, they routinely leave crucial data out, because they don’t know what a good vs. a bad source is, or what the most important sources are, or even what a source is (see below). They just look for statistical relationships between words on the internet. That will never produce a complete source list, or the best one, and thus will never produce a complete or best answer to anything. You always have to check what they missed and put it back in, and reorder the resulting sources by reliability, and generate a more reliable answer, one that doesn’t disregard or leave crucial information out.
- What they conflate. The most common chatbot error is conflation: it will confuse different concepts and subjects, and stitch together an answer to a question that is, actually, gobbledygook (see examples below). But unless you are already an expert in the subject, you won’t know that it did this. So you need a strategy for dealing with this. Most of the time you aren’t going to already be a real expert or PhD in the relevant subject. So how do you know when it has done this and when it hasn’t? Because the frequency of this is so high, in order to not be naive, to not get regularly miseducated or duped, you have to always assume it has done this until you can prove it didn’t.
- What they hallucinate. Hopefully you have heard that chatbots actually invent facts and even sources, a lot. So you have to fact-check literally every claim they make and every source they cite at you, to ensure it isn’t fake. Combined with the other two bullets, this actually makes chatbots almost useless. Because you have to redo all the work they did, and more, before you can ever trust any of it. The only way to use chatbots productively at all is as a “ballparking” tool: something that randomly throws you into “the ballpark” of what “might” be a correct answer—as in, it might get you “near” but not “at” the correct answer. Often it won’t even do that (its rate of being wildly off is also high). But more often it will get you not “the” right answer but something “near” it, which you can then correct with a little more self-checking (no longer relying on the bot). A quick vet of its claims and sources can check if that ballpark is right; or from that ballpark you can refine the results on your own, saving you maybe a little time (although studies show this often costs you more time; often it would have been quicker to have skipped the bot and done it all yourself).
How chatbots generate source lists is instructive. Some of the links above explain how this all really works: the bot guesses at an answer to your question based on the statistical juxtaposition of words on the internet; then it looks for sources (hyperlinks) that have the highest statistical representation of the words it put in that answer. In other words, pseudo-bots are not actually summarizing the sources they give you. They are summarizing “the entire internet” and then trying to “guess” which webpages correspond to what they said. Once you understand that this is what they are actually doing, you will stop trusting these bots.
Of course, summarizing “the entire internet” is in no way a consistently reliable source of information (it’s just regurgitating whatever random group said the most words on the subject). And even this method of statistical juxtaposition of words is not a reliable way of getting a correct answer to anything, as it produces too many causes of error—this is in fact what a middle school education was supposed to train out of you: to not make this very mistake that all bots routinely make. But it’s what they are doing. If you have been fooled into thinking the bot “found sources” and then “summarized them” at you, you were tricked. Bots aren’t doing that. Indeed, they aren’t doing any of the reasoning they report to you when you ask them to, either. When you ask them to explain their reasoning, they are “backwards guessing” at what you want to hear them say. They are not showing you any actual reasoning they actually engaged in to get their answers. See this excellent video by ByteMonk for what’s going on here—which also proves the whole LLM approach can never achieve general artificial intelligence, not ever ever. These things do not think.
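To make that two-step pattern concrete, here is a deliberately crude toy sketch in Python. The page names, their text, and the babbling routine are all made up for illustration, and real systems are vastly more sophisticated than bigram counting; but it captures the order of operations described above: an “answer” assembled from word statistics first, and “sources” attached afterward by raw word overlap.

```python
# A deliberately crude toy, NOT how any real LLM or search product is built:
# it only illustrates the two-step pattern described above, where (1) an
# "answer" is assembled from word statistics and (2) a "source" is attached
# afterward by word overlap, rather than the answer being derived from sources.

from collections import Counter, defaultdict

# A pretend "internet" of three tiny pages (all page names and text made up).
pages = {
    "page_a": "the vulgate is the latin bible translated by jerome",
    "page_b": "the vulgar text of homer is the popular uncurated greek text",
    "page_c": "jerome translated the bible into latin in the fourth century",
}

# Step 1: learn which word most often follows which (raw bigram counts).
following = defaultdict(Counter)
for text in pages.values():
    words = text.split()
    for a, b in zip(words, words[1:]):
        following[a][b] += 1

def babble(seed, length=8):
    """Chain the statistically most common next word. No meaning involved."""
    out = [seed]
    for _ in range(length):
        options = following.get(out[-1])
        if not options:
            break
        out.append(options.most_common(1)[0][0])
    return " ".join(out)

answer = babble("vulgar")

# Step 2: attach a "source" AFTER the fact, by raw word overlap with the answer.
def cite(answer_text):
    answer_words = set(answer_text.split())
    return max(pages, key=lambda p: len(answer_words & set(pages[p].split())))

print(answer)        # e.g. "vulgar text of homer is the vulgate is the"
print(cite(answer))  # whichever page happens to share the most words
```

Run it and you get a fluent-looking mash-up of the Latin Vulgate and the vulgar text of Homer, “cited” to whichever page happens to share the most words with the mash-up. The “answer” came from word statistics; the “source” got bolted on afterward.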
And even when you can steer a bot to only summarize a source you found, it will not consistently get it right. It’s bad at that. That it gets it right, say, nine times out of ten might trick you into thinking it only errs one time in a thousand, but that’s you being fooled (which is another reason to get Everything You Need to Know about Coincidences). The rate of error is much higher than that, and your trust in the bot’s summaries should be correspondingly much lower. And remember, error rates compound. If they are wrong 1 in 10 times, use them dozens of times and there will be an effectively 100% chance that you will be misled one or more times. And every bot is highly prone to all three errors: leaving out important information from the source you told it to summarize (and omissions are more destructive of accuracy than many people realize); conflating information in the source (and thus getting it wrong); and hallucinating information in the source (literally making shit up that isn’t there). That these errors happen, let’s say, every tenth time can fool you into being amazed by its success and over-trusting it, like a mentalist or psychic tricking you into thinking they can read your mind or talk to the dead. It’s the same game. You must never be lulled into this trust.
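To put a number on that compounding point, here is a quick back-of-the-envelope check. The flat 10% per-use error rate is purely an illustrative assumption, matching the “wrong 1 in 10 times” hypothetical above:

```python
# If each use has an (assumed, illustrative) 10% chance of a serious error,
# the chance of being misled at least once grows fast with repeated use.
p_error = 0.10
for n in (1, 10, 30, 50):
    p_at_least_one = 1 - (1 - p_error) ** n
    print(f"{n:>3} uses -> {p_at_least_one:.1%} chance of at least one misleading result")
# ~10.0% at 1 use, ~65.1% at 10, ~95.8% at 30, ~99.5% at 50
```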
Which is the real issue: bots don’t reason; they don’t think; they don’t “reason out” their answers. Even when you tell one to present its reasoning, it isn’t doing any of what it types out at you. It is just faking the appearance of having done that. Because all they ever do is guess, by essentially “cheating the test.” They lean over and look at what everyone else said, and try to come up with some mishmash of all those different answers, or whatever has been said that comes “near enough” to answering the question, at no point ever understanding what any of these words mean, or whether any of those other test-takers they’re cribbing from even know what they are doing, much less whether a mishmash of them all is a better or worse answer.
This is the opposite of knowing what the answer is. Bots are covertly “admitting” they never know what the answer is, that they are neither intelligent nor even conscious. They are instead just going to lie to you by arbitrarily claiming what some other randos on the internet said (or even statistically would have said). They will never know whether that actually is the answer. And consequently, their error rate is alarmingly high. Even when a bot builds a line of reasoning because you told it to, it just fakes up what it thinks a line of reasoning would look like. It did not actually use that line of reasoning, nor does it “know” whether it’s valid or sound. If it “is” valid or sound, that’s entirely by accident, trusting statistical juxtapositions of words to more often land on something that looks right. But often it doesn’t. That’s why it hallucinates: that is the result of this “juxtaposition” technique inevitably going off the rails. That’s also why it has a high rate of omitting crucial sources or data (because it is relying on what gets said most often, not what gets said best). And why it has a high rate of conflation (because relying on statistical juxtaposition alone will often mix up what pertains to what).
To control for all these mistakes (and you always have to, because as all the studies I linked above show, pseudo-AI chatbots can never improve their error rate: they have already hit the ceiling on that), you need to remember three steps to take, and always (always) apply them every time you use a chatbot for anything:
- Only trust the bot for ballparking. Assume it is always at least slightly wrong, and use what it gave you as a springboard to do your own research. That will catch when it is hugely wrong and when it’s only slightly wrong. It will also solve the omission, conflation, and hallucination problems. So you need to build and always apply your own critical thinking skills (and yes, to learn those, see my critical thinking category, or take my online correspondence course).
- Always back-check and fact-check. Bots are less reliable than Wikipedia. Because Wikipedia has far more controls (and more intelligent controls) in place. So if you are wary of Wikipedia, be even more wary of chatbots. Like with Wikipedia, rather than trust a fact-claim a bot makes, fact-check it. Rather than trust that a source it cites says what it claims, read the source and find out what it really says. And after all that, also go around the bot (bypass it) and do your own source and information search and see what differences result in your findings, at the very least to confirm there are none.
- Always logic-check its reasoning. Pseudo-AI actually deploys pseudo-reasoning. But this can still land on a valid sequence of reasoning (because statistically, the internet itself does). But you can’t assume that has happened. It is only good at looking like it’s doing this well (hence the “pseudo”). The error rate is too high. Pseudo-math and pseudo-logic abound. So you need to know which it’s giving you every time. And the only way to do that is to vet the logic it presents you yourself. You can never circularly trust the bot to do that for you.
You might be thinking, “Wait, if I have to do all that, why do I need the bot at all? It sounds like it’s just a massive time-waster, generating way more work than if I’d just skipped the bot and done my own research well.” Yeah. Exactly. I myself have only found the ballparking function useful. Basically, I use chatbots like slightly more robust search engines, and not as oracles, much less as reliable thinkers. I don’t trust them, even when I use them. Neither should you.
Examples of What I Mean
Indeed, I would rather not even have this feature on half the time, as Google search is actually worse with AI than it used to be without it (examples, examples, examples, examples, examples, examples). But actually expecting the AI to be smart or know things is something you should never do. Here’s why.
The whole internet exploded this year with gullible praise of a fraudulent science paper claiming it could derive the quark and other fundamental particle masses with an assumption of three-dimensional time. The problem is, all the math in the paper is literally bollocks. It all looks like real math. But none of it works as depicted, none of it gets the results presented, and half of it isn’t even intelligible. There are also telltale errors in how it cites sources, errors that resemble AI “guessing” and not human error or deception. This was almost certainly AI-generated (and, evidently, its journal has no real peer review). Because a human expert would have caught these errors and tried to fix them somehow. What happened is AI constructed something that looked like a science paper, with equations that looked like math, but did no actual math. The outputs of the equations are bogus. Most of what’s in the equations is even bogus, with symbols never explained or that don’t mean anything.
That is a perfect example of both the dangers and the stupidity of AI. This paper illustrates that AI doesn’t ever actually understand math or even “do” math. It just fakes it. So if you don’t know that (or don’t check to find out), you, like the entire internet, will be fooled by absolute garbage like this. This is true across the board. Read Ethan Siegel’s Big Think article on this problem, Why “Vibe Physics” Is the Ultimate Example of AI Slop. The catchline says it all: “the conversation you’re having with an LLM about groundbreaking new ideas in theoretical physics is completely meritless.” Lex Clips has a great brief from world-renowned mathematician Terence Tao explaining the problem here.
There are AI tools that are “human-rigged” to be better at math (by cheating the code to put in some non-AI routines to “fix” the AI’s routine mistakes). But even those are not entirely reliable; they’re just better. And without them, bots are very bad at math in a way that illustrates everything I am saying about how bots don’t really think, but just parrot statistical relationships between words and symbols on the internet. And that’s why “Apple Study Exposes Deep Cracks in LLMs’ ‘Reasoning’ Capabilities: Irrelevant Red Herrings Lead to ‘Catastrophic’ Failure of Logical Inference” (by Kyle Orland at Ars Technica) demonstrates “the illusion of understanding” in these bots.
Some examples of this are surveyed by Hunter Kallay in Poor Calculations: Why Is AI So Bad At Math? Kallay also explains how “non-AI” interventions to “check the work” of AI fix some of this in some systems. But the fix had to be human-intelligent. A human had to build the back-checking system to correct the AI’s output; the AI could never do this on its own, because it would never really “know” when a result was right or wrong so as to tell the difference; it just cheats the test by peeking to see what most people answered or “would” have answered, assuming that will be correct when often it won’t be.
But more important for you to understand here is why this is the case. Consider Kallay’s first example: a bot can answer the question “9.11 and 9.9, which number is larger?” with “9.11,” because when it checks the statistical juxtaposition of words on the internet, “larger” is more frequently associated with “longer, more characters,” and so it “assumes” the three-digit “9.11” is “larger” than the two-digit “9.9.” This illustrates the difference between actually understanding the question and its answer, and just looking at which words are closer to other words and how often. The bot does not know what “larger” means or even what “9.11” means. It just counts up ASCII characters on the internet and regurgitates what those counts would predict someone would say. And obviously that’s not a reliable way of answering a question (as this example of a failure demonstrates).
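You can mimic that confusion in a couple of lines (a loose analogy only, and obviously not how any chatbot is actually implemented): read “larger” as “has more characters” and you reproduce the mistake; compare the numbers as numbers and you don’t.

```python
a, b = "9.11", "9.9"

# Reading "larger" as "longer string" (the conflation described above):
print(max(a, b, key=len))    # 9.11  (four characters beats three)

# Actually comparing the numbers:
print(max(a, b, key=float))  # 9.9   (the genuinely larger number)
```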
Experts can use math-trained bots as productivity tools, but those bots can never replace or do the work of mathematicians, because you always need an expert to vet the results and know what things the tool can and can’t do. See Is GPT-5 Really a PhD Level Mathematics Researcher? and Grok 4: Over-Promise and Under-Deliver on Easy Riders for a good breakdown of this point, with lots of examples of bots not actually thinking and thus never actually understanding any math, even when they get something right. From those videos you will come away understanding more about how it is that bots don’t actually reason or think or do math. It’s all guesswork based on statistical juxtapositions, along with iterated automation. It’s not smart. It doesn’t know what it’s doing. It just “cheats the test,” and as a result, can sometimes do some things right, but only with expert management, supervision, and correction. So you still need real mathematicians. Bots can never replace them. They can never do what we actually need mathematicians for. So you still have to ask mathematicians questions. You can’t skip that and foolishly trust a bot. (And by now you might realize why these bots are never giving you the Singularity.)
An even more serious example is the food recipe scandal: corporate AI grifters promoted using AI to come up with recipes, and it gave them poison sandwiches, mustard gas, and banana tea (involving just plopping a banana into a cup of tea). The bots do not have the knowledge or sentience to understand what ingredients are, or what would be palatable or even edible to mix together or include in what quantities, much less what would be poisonous or dangerous. You can’t replace human-vetted cookbooks with AI. And that’s literally the case: because, yes, an AI-generated Harry Potter cookbook glowingly recommends aconite tea as “soothing,” and even insists you follow its recommended dose precisely to avoid lethality—then recommends a dose that would kill an entire platoon. This is a perfect example of AI literally having no knowledge or comprehension of what it is doing or saying. And if so here, so everywhere: it is making these same mistakes every time you use it.
For the bigger picture here, and more exploration of the significance of these examples, watch Total Stagnation – The AI “Nothing” Products at Upper Echelon; and also Jim the AI Whisperer’s “Want to See How Insanely Stupid AI Really Is? Ask ChatGPT to Answer These Riddles in Just One Word” at Medium. These demonstrate not just that AI sucks, but why it sucks. Because when you understand the why, then and only then will you understand that these things do not and cannot think. And that helps you develop correct ways to use pseudo-AI productively, and not gullibly (for a really good example in AI coding, see “I Burned Millions of Tokens on Claude Code. Here Is Why ‘Vibe Coding’ Is a Trap.”).
My Own Examples

Examples that happened to me include when I asked Google for references on vulgar manuscripts of Homer, the technical terminology in Classics for manuscripts of the “vulgar” or “popular” (as opposed to curated or elite) text of Homer. A year ago, Google’s native chatbot tried confidently explaining to me that vulgar manuscripts of Homer were written in Latin by Jerome, and went on to jumble up the history of the Latin Vulgate Bible and the Archaic Greek of the Iliad and Odyssey, and none of the sources it cited had anything to do with what I actually asked. If I were not an expert, I would have had no way of knowing this was all conflated bollocks. I’d have had to check its claims and sources, which would have led to an hour of confusion until I realized its mistake. I ran the same query today and got a correct answer this time, with relevant sources (see above right). But how do I know when it’s doing that, or doing what it did a year ago? How do I know when it has learned and when it hasn’t?
Another more recent example is when someone tried asking a bot to explain the evidential reasoning behind my theory of the Gospels, and it vomited up this:
The Gospels are intentional literary constructs required for the transition from a celestial figure to a historical one. Carrier, for instance, argues that the need for a “seeming” ministry was essential to bridge the gap and that the Messianic Secret is part of that process of historicization.
This is presenting itself as a confident summary of something I argued. But it is actually a bizarre gobbledygook of unrelated things. I never talk about a “seeming” ministry, and that has nothing to do with the question of the “messianic secret,” other than those two discussions being occasionally sort of adjacent. So here we see the actual thing it’s doing (guessing at an answer based on the statistical juxtaposition of words it otherwise has no comprehension of—at all, much less in context), and how it gets a wrong answer that sounds like a correct answer. This is conflation error.
The wording is also weird, as if what it wants to say is that creating a ministry for Jesus was part of the process of his historicization (a statement so trivial as to be banal) and that Mark’s invention of the messianic secret within it had something to do with that—but the bot doesn’t know what, or how to articulate even what I said about that. This is an example of omission error. Someone who did not actually read what I say about these two distinct subjects would here get a completely wrong idea of their relation, not only learning nothing about my take on the messianic secret or Mark’s invention of a ministry, but coming away from these omitted details with the false impression that I argued the one from the other. In fact, I argue the messianic secret is simply an allegorical representation of 1 Corinthians 2 (even using a Homeric parallel), an artifact of the cosmic gospel, and not an apologetic for Mark’s story being hitherto unknown.
Because (perchance) I happen to be an expert on me, I know the bot culled the word “seemed” from my even more unrelated discussions of Docetism, which aren’t about the ministry or the messianic secret. In fact, what I argue is that scholars were wrong ever to think so, yet the bot here mistakenly implies I took the opposite position—and mistakenly attributes that conclusion to me, when in fact it is the position that, as I document, all experts in Docetism now hold, and I am just agreeing with them (see Did ‘Docetism’ Really Even Exist?). These are details that are being omitted yet are crucial to know to avoid being misled or misinformed. But this illustrates a common mistake I have seen many times, where a bot will claim or imply a person argued a thing that in fact they argued against, by mistaking the statistical juxtaposition of the author’s position and the position they are rebutting: Pseudo-AI cannot tell the difference. It is only guessing based on word-placement.
Likewise the word “ministry” I know it culled from my discussions of that ministry’s absence from Paul and 1 Clement, and its demonstrable fictionalization in Mark (and later riffers on Mark); and the word “messianic secret” it culled from my separate discussion of Mark’s use of the letters of Paul (in this case to reify the “secret plans of God” motif in 1 Corinthians 2). It mixed up these two unrelated discussions with an even more unrelated discussion of Docetism, and then confused me as taking the classic Docetist position (when in fact I am explicitly rejecting it). These are all disconnected discussions. The bot conflated them all, and got my positions wrong or failed to correctly describe or connect them. All because it is only looking for what words are close to each other, not actually understanding anything.
And there is no way you could know this.
This is the central problem with pseudo-AI chatbots: people tend to just take what they say as correct; but in fact, it can be wildly or even subtly incorrect, and there is no way for you to know. So how can you ever trust anything it tells you? You would have to redo all the research you were trying to avoid by using this dodgy internet parrot instead. So why use the dodgy internet parrot instead? The risk of being repeatedly, and cumulatively, disinformed and misled should scare you here. It should not motivate you to be impressed and keep using ChatGPT or whatever.
And this extends even to subtle misleads. It’s not just about blatant errors. For example, here even what it got right it got wrong. While it is vaguely (and trivially) true that “the Gospels are intentional literary constructs required for the transition from a celestial figure to a historical one,” this wording makes it sound like a teleological theory, that the authors of the Gospels intended that end game, when in fact my argument is that the first authors (Mark surely) did not intend to sell historicity to insiders, but were building a parable to confuse outsiders and edify insiders. The idea that the “historical Jesus” narrative would replace what it symbolized may even have horrified an author like Mark.
So how do you parse “Carrier argues that the need for a ‘seeming’ ministry was essential to bridge the gap”? Essential as in “that obviously had to have happened, being simply the only possible causal sequence”? Or essential as in “the Gospel authors needed to invent a ministry in order to convince people Jesus existed”? Because those are not the same thing. And confusing the two is a common error made by my critics. Which reminds us that chatbots are also “just believing” whatever my critics say, and jumbling that up with what I say, so as to “guess” at what I said—because chatbots literally don’t know the difference between an author and their critics, or between straw men and steel men, or between accurate statements and mistaken ones. It’s all equally the same to the bot. It just adds up how often each thing gets said, and literally decides that the thing most often said is true. Which is precisely the thing critical thinkers should never be doing. So critical thinkers should never be relying on chatbots.
And to test all this just now, I asked Google’s native bot “did Richard Carrier argue that Wallace Marshall is a liar” and it said yes, and linked to my demonstration that David Marshall lied in a live debate. It then cited as sources my written debate with Wallace Marshall, who never lied about anything and I’ve never suggested he did. In fact, the bot said:
Carrier stated that “Marshall’s dishonesty” was evident in these claims, which he considered “fantastically false”. The two engaged in a series of written and possibly spoken debates covering topics such as the existence of God, morality, and the historical Jesus.
It then linked to my written debate with Wallace Marshall. So it could not tell that Wallace and David Marshall were different people; it falsely reported that I called Wallace a liar (and even linked to my discussion with Wallace as “evidence” I did); it confused who I debated these subjects with (I never debated historicity with Wallace; and never debated morality with David); and got wrong who I debated in “written” (Wallace, not David) or “spoken” format (David, not Wallace).
This illustrates how bots don’t think. They literally cannot tell when two people have different names. They can’t even tell these are two different people. Because they are just going on which words most often sit next to which others: since I refer to people by their last name, any pairing of a first and last name sits at a greater numerical “distance” from any given sentence, which the bot reads as “less likely connected” than the rest; so it statistically concluded that every Marshall is David, because I cued it to look for a charge of lying. Which shows the role of omission as an error vector: because my query framed the bot into only caring about what I mentioned (the word “liar”), it could not “see” the difference between my writings about David and Wallace, and thus did not report the lack of that word or its cognates in any of my articles about Wallace.
So the bot is doing no thinking here. It’s just blindly parroting statistical relationships, producing answers that are too often wrong to ever implicitly trust. I could go on because examples abound (like these, and these, and these, and these, and famously this; with an explanation of why bots do this here). We now even have an example from biblical scholar James Tabor that shows error, hallucination, lying, and agreement with what you want it to say—indeed it even covertly repeats Tabor’s own writings back at him, without questioning them or presenting more widely published contrary views, proving it can’t be trusted even when it’s right. But hopefully by now you get the picture.
Conclusion
Needless to say, AI bots also give bad, sometimes dangerous, medical advice. Because they don’t know anything about medicine. And if you ever think they do, you are being fooled. Likewise in law and academics and education and government and mental health. And so on. Examples of this kind of bizarre failure abound (remember when a company’s AI spontaneously decided to delete that company’s entire database?). And examples will never end, because this can never be improved. Yes, that includes corpos’ favorite application (because it supposedly eliminates “useless jobs”): AI customer service—in practice, a disaster (no, really, it’s worse even than those robots of yore everyone already hated). And it will never be better than this. Yet corpos are idiots and never learn.
Don’t be as dumb as them. Know that pseudo-AI is always unreliable, and will always be unreliable, because it never thinks, it is never reasoning, and it knows nothing. If you ever use it, you must always use it with a full understanding of that fact. It will often omit crucial information, yet will present the appearance of having been thorough. It will often conflate things, yet trick you into thinking it didn’t, because it is good at faking the appearance of an expert discourse. It will often hallucinate, yet you will have no way of ever knowing when, unless you literally check every single thing it says against a more reliable source.
And this is why you must always use chatbots with that skeptical and corrective toolkit, every single time you use them. Never trust it, only use it for ballparking (if at all), always back-check and fact-check everything it says, and never trust its statistically-crafted appearance of reasoning; always carefully check the validity and soundness of anything it presents you that even resembles logic. Do not be impressed when it presents some summary of a scholarly consensus, or an argument for or against something, or an analysis of a thing, that “sounds” astute and well-researched. It will often be wrong, sometimes catastrophically. It is only faking all of this. It is pretending to sound astute and well-researched. It is not actually producing astute and well-researched answers to your queries. It is cheating the test, a method that is really just repeating the internet at you (which you could have simply done yourself), and which is inherently and perpetually prone to errors.
So do not show me some ChatGPT analysis and ask me what I think of it. I never trust that shit and I see no value in it. Do the work yourself. Use your own mind. Do your own research. Build your own analysis. And keep it succinct and legitimately-sourced (no lengthy wordwalls or giant essays; I will never have time for it). Then maybe you’ll have something worth my time to show me—or something worth even your own time to contemplate. And you’ll have honed your critical thinking skills in the process, rather than withered them. Which is far better for you, me, and the world.
⌘
Special Request: I don’t usually ask for this, but on this thread it’s welcome: if you encounter your own examples of AI chatbots messing up, organically in your travels, please screencap the aberrant result and post it in comments below. The comment tools should have a button bottom right for adding images, which may limit file size, but smaller files are obviously preferred anyway. Just enough to be legible is all we need. If any explanation is needed of what is going wrong there, please add that, too. Bonus points if you can identify specific instances of (1) fatal omissions, (2) fatal conflations, or (3) full-on hallucinations. Double points for spotting a blatant math or logic error, or any clear-cut case demonstrating that it doesn’t understand what it is talking about.





I have absolutely no clue which AI tool you have been using to finally arrive at such a devastating judgement. Try ChatGPT 5.2 Thinking. „9.11 or 9.9, which number is larger?“ is a piece of cake for this model. DeepMind’s AlphaFold 3 won the Nobel Prize in Chemistry in 2024.
This is what dupes always say. I show you hundreds of proven examples and scientific studies spanning all bots. And you say “I never notice it.” You are failing at critical thinking here, big time.
You also didn’t read the link I gave or my words in that section describing it: they already explain that later editions can “fix” the 9.11 error, and I even said so. You ignored all my words and didn’t read the source to understand this. Our example is only there to show how these things don’t really think. That human back-end fixes are needed to get them to stop making such mistakes proves our point; it is not a counterexample to it.
And no, a bot never won the Nobel Prize. Human engineers did. That was an ultra-specialized application curated by human experts. Not a chatbot. Read my entire paragraph in this article on this very point for the difference.
And stop making these mistakes.
This is the damage AI and its hype is doing to you and everyone falling for it. This reliance has destroyed your ability to reason, even to read. And that’s bad.
Also, AlphaFold is not an LLM. It’s an ML tool tailored to a specific task, which is one of the exceptions you carved out at the beginning.
Indeed. Hence, as I said, “Not a chatbot.”
Just got this one in my inbox so have only read a bit of it–enough to know, when added to your last piece on AI, that you are performing a great public service in writing these pieces. I share them wherever I can. I read the last piece thoroughly once, but really need to work through it again to be at a level where I can use it in discussions of AI with friends–many of whom DO NOT KNOW THIS STUFF and have never heard the things you are saying–and arguing. So as a fellow philosopher and teacher of critical thinking, thanks.
Check out my email address. I chose that word while in grad school and figured it might identify me to others as a philosopher. NO ONE has ever recognized it. You will, I am sure. (I coulda picked ‘grue,’ of course.)
Thank you. But no, I did not know the emerose article. But now I do.
Today’s New York Times has a perspicacious article that exposes three LLMs’ failure in its author’s field of pre-Renaissance Florentine architecture.
Oh, thank you. I saw that headline too, but didn’t follow up (it’s paywalled for me). So I didn’t know how suitable it was here. Since most people can’t access it, can you summarize it as applies here for me and other readers?
Article archived at https://archive.is/QChHS
Thank you.
I have learned a lot about the problems with AI by reading your posts and clicking on the links. I will be much more limited in my use of AI in the future.
And I am much more aware of some of the workslop I am seeing at work that was generated by AI. When a person uses AI to generate lengthy workslop, it wastes the time of everybody who has to read it and try to make sense of it.
I have, however, found one place where AI can be helpful. I often write a first pass of what I want to say and then feed it to AI with a request in the form, “Please play devil’s advocate to the content below. Show me some arguments that people might make to argue against this content and give me links to sources they might use.” I am not using AI to give me a complete answer or a well-reasoned answer. I am asking it to search the Internet and find anything that might be out there opposing what I just wrote.
My next step is to start clicking on the links provided to see if it found a hole in my argument that I should address before posting this. It certainly shouldn’t be the only critical thinking tool one uses. But it can be helpful in thinking about possible counterarguments before I post.
For instance, if I post a query asking for objections someone might make to your post, it generates this: Devil’s Advocate Request. I’m not saying that it correctly represented you or presented valid counterarguments. But I think it does a fair job of presenting counterarguments someone might make and sources that might be used. And yes, many of the sources it lists are made up, but some appear to be relevant.
Using this tool, I have preempted counterarguments and have been able to respond to them in the original post rather than wait for someone to argue that point and catch me off guard.
Yes. I also highly recommend this approach: when using chatbots, use prompts and methods to reduce fallacious thinking, such as confirmation bias. Chatbots are quite prone to their own confirmation bias, but special prompting methods will greatly reduce that.
That helps but only marginally because it isn’t thinking so it doesn’t actually understand those prompts. It has no idea what “compare” or “contrast” means. It just uses those words as statistical anchors in its word juxtaposition search. So all the problems I note remain regardless of any prompts you use.
Meanwhile, prompts like “your own” activate human (not AI) programming to constrain the AI, so even there the AI has no knowledge of what the difference is between “its own” reasoning or conclusions and anyone else’s. Humans have to program that distinction into it, and link the gosub to the prompt. And even then it won’t always work, because AI is unstable and inconsistent (unlike human-engineered software, which, after debugging, always does exactly what it was made to do, without deviation).
Hence the only “reliable” way to eliminate those problems is not with AI, but with human-engineered programming that reliably repairs the AI’s results by some automated metric. I mention examples in my article, especially for math software, where my links have examples and explanations of this point.
And I still put “reliable” in scare quotes there because there remains no such thing in actuality. Human-guardrailed math software performs better than chatbots, but still makes all the same mistakes, just at a lower frequency or in certain contexts and not others. Which is why only experts can use them, as only they know where the reliability peaks and valleys are, what the software can never do, and how to always vet what it does before trusting it.
I tried this a second time, asking AI to play devil’s advocate on the article The Future of AI: What to Expect in the Next Decade, which is optimistic about AI. ChatGPT responded with Devil’s Advocate on AI
It put forward some good arguments and sources that expose AI, but it’s certainly nowhere near as valuable as this post by Richard Carrier. It came up with a lot of hallucinated links. Although helpful, I can see that this approach makes slow progress at showing what might be wrong with AI.
By contrast, a simple Google search for the future of artificial intelligence – Google Scholar finds hundreds of scholarly articles. And regular Google search yields a lot more. These are not organized like the AI results. And they don’t tell you which are positive and which are negative. But yes, if one had the time to scan through all this, one could generate an outline of arguments for and against AI, with sources for each.
I continue to use ChatGPT SORA to generate images, which I find enormously helpful and a whole lot of fun. I might phase it out almost entirely as a research tool, though I still find it useful as a simulated devil’s advocate when writing.
“a lot of hallucinated links” !!
A screencap would have been invaluable here.
P.S. I won’t be writing on it, but the image creation version of pseudo-AI (and I still call that pseudo-AI because it never has any idea of what it’s drawing for you; it is literally just guessing which pixels should be next to each other by statistically sampling pixels on the internet), although a handy productivity tool (I mentioned in my last article how that might not be replacing artists as much as people think), is facing serious legality challenges to be aware of. If AI generators are indeed engaging in mass copyright violations, they won’t exist for long, or else will become exceedingly expensive at the subscriber and user end (even the Disney licensing deal is only a three-year contract, so it will expire when it doesn’t make them any money, which it never will: Disney was solely paid in stock, which will be worth nothing in less than three years).
And it’s also not great.
I subscribe to Midjourney/Niji because I use it for my roleplaying campaign.
But it struggles with very basic things that I could easily get from an artist: multiple limbs, larger women in the same scene as smaller men (a sign of social biases in the system), etc.
And what it does ultimately generate is highly derivative.
It’s not even that good at trying to emulate most house styles (e.g. “A star-based villain in the style of Eichiro Oda”). I guess to its credit that means that its database wasn’t at the ultimate level of industrial plagiarism, but it still defeats a lot of the point.
Here is a screenshot. You can also see it all in the link I sent, which shows the actual output generated by the AI, with live links to the sources it recommended. (Devil’s Advocate on AI)
I had asked it to find arguments and links refuting a post that was very favorable of the future of AI. The first argument that it makes, that this post favoring AI extrapolates past AI success unrealistically, is valid. AI proponents do make this mistake. And it posts two valid links, one to a paper suggesting that “deep learning must be supplemented by other techniques if we are to reach artificial general intelligence” (https://arxiv.org/abs/1801.00631) and another link to a paper “encouraging research directions beyond ever larger language models.” (On the Dangers of Stochastic Parrots | Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency) So far, so good. It came up with a valid argument against AI, along with two papers supporting this argument.
But then it posted a link to https://www.sequoiacap.com/article/generative-ai-reckoning/ which, as far as I can tell, was completely “hallucinated”.
And I see the same thing down through here. It’s a list of arguments with a mix of valid, marginal, and outright hallucinated sources. We can certainly argue that there are better sources we could go to if we want to find counterarguments to any webpage we wish to examine, but this is one quick method that may be helpful.
For instance, here is an image AI generated for me that I used at a No Kings rally with the heading, “Your buffoonery has destroyed the White House.” Little did I know that one day his buffoonery would literally destroy part of the White House.
Rather than give a screenshot, maybe I should post the whole thing. Here is how ChatGPT responded when I asked it to criticize a webpage praising the future of AI. It’s not a bad response (for something that was generated in a few seconds). But yes, there are far better sources than asking AI. However, if you want a quick analysis of what a particular argument on the web might be missing, AI can quickly provide an answer like the following.
If you click the links it gives, probably most are good, some are off track, and some are hallucinations.
==============
Below is a devil’s-advocate critique of the attached article, highlighting arguments skeptics commonly raise, along with credible sources they often cite. I’m not endorsing these positions—only mapping how critics push back against the article’s optimistic framing of AI’s next decade.
I explicitly reference the uploaded article as the target of critique.
1. “AI progress is being extrapolated unrealistically”
Critic’s argument:
The article assumes that recent breakthroughs (LLMs, multimodal models, AlphaFold-like systems) will continue on a smooth upward trajectory. Critics argue this is linear extrapolation from a hype-driven period, ignoring diminishing returns, scaling limits, and economic constraints.
They claim:
Sources skeptics cite:
2. “AGI remains speculative and possibly unreachable”
Critic’s argument:
The article treats AGI as a plausible near-term destination. Skeptics argue this confuses engineering progress with cognitive science.
They argue:
Sources skeptics cite:
3. “Healthcare AI is oversold and under-delivering”
Critic’s argument:
The article portrays AI as nearing diagnostic superiority and personalized medicine at scale. Critics argue real-world clinical deployment has been slow and disappointing.
They note:
Sources skeptics cite:
4. “Education personalization claims repeat decades-old ed-tech hype”
Critic’s argument:
The article echoes promises made by MOOCs, adaptive learning platforms, and intelligent tutors for over 30 years—many of which failed to improve learning outcomes at scale.
They argue:
Sources skeptics cite:
5. “Job creation claims lack historical support”
Critic’s argument:
The article suggests AI will both automate jobs and create new ones. Critics argue this is faith-based economics, not evidence-based.
They claim:
Sources skeptics cite:
6. “Ethical challenges are understated and structurally unsolved”
Critic’s argument:
The article frames ethics as a governance problem to be “managed.” Critics argue the risks are structural, not procedural.
They warn:
Sources skeptics cite:
7. “Human-like AI framing is misleading and anthropomorphic”
Critic’s argument:
The article repeatedly uses language suggesting AI will become “more human,” “empathetic,” or “intuitive.” Critics argue this misleads the public and inflates trust.
They argue:
Sources skeptics cite:
Bottom-line devil’s-advocate summary
Critics would argue that the attached article:
They’d say it reads less like a forecast and more like a synthesis of industry optimism plus speculative futurism.
Which is all just a dodgy average of what everyone is saying on the internet.
Exactly, but with far more self-promotion.
That will still fall to the same three errors:
(1) It will sometimes omit crucial arguments you expected it to list. And there is no way for you to know it did that unless you bypass it and run your own search to see what it forgot to tell you about.
(2) It will get wrong what the arguments are by conflating data and misdescribing arguments. The examples I gave from my own work demonstrate this. For example, it cited my non-existent argument regarding the development of the messianic secret against alternative explanations of it, because it completely confused different things I said about that, the Gospels’ fictionalization of Jesus’s ministry, and Docetism. And again there is no way to know it didn’t do that unless you bypass it and physically check on your own. So you can never trust that it is even giving you good or correct (much less the best) Devil’s Advocacy. Its attempts at that will be just as prone to conflation and error as anything else you use it for.
(3) And it will occasionally hallucinate arguments and evidence that don’t exist. Which again you can’t know it hasn’t done unless you vet everything it gets. Which is hard because the source list it gives you is itself fake: it did not consult those sources; it regurgitated the internet and then tried to “guess” which webpages match what it said, and it guesses wrong a lot. Which gets you full circle back to problems (1) and (2).
So you would be better off doing your own research to find the best counter-arguments to anything, and not trusting AI. It will frame you (by steering you to some things and away from others, hiding important arguments and facts from you). It will mislead you (by incorrectly describing facts and arguments; often badly, but even if only subtly, that’s enough to frame you away from the actual best argument it could have presented you). And it will lie to you (simply making shit up that is neither true nor logical, but “looks” true or logical because it “knows” how to fake the mere appearance of that).
And above all, if you can’t think of counter-arguments on your own, or how to look for them on your own, that is a skill you need to be developing, not “bypassing” (and thus never learning) by using a framing, misleading, dishonest parrot instead.
The ironic thing is that I think the conflation is probably the only interesting thing it can do.
If it comes up with a wholly new argument, then you can actually vet that argument independently and see if there’s anything interesting there. It’s at least a new framing to explore the problem from.
No tool I’ve seen even does that with regularity (or at least I don’t notice it because the novel arguments are so obviously bankrupt they’re not even interesting to explore), but that would at least be the use case, basically testing your idea against some attack that real people may not come up with due to base assumptions.
It will be worse, of course, because it is generating largely at random (or by happenstance at least). It is not doing this intelligently, and thus it is actually bad at abduction (coming up with the best hypotheses to explore first).
For example, see my comment above screencapping its argument that my cat should not eat the planet Mercury because mercury is poisonous. Even assuming cats could eat planets, that is literally the dumbest and worst counter-argument it could come up with. So it is not really good at coming up with counterarguments to things. It’s actually bad at that.
The best way to come up with Devil’s Advocacy arguments is intelligently, as in, on your own, by understanding the things we are actually talking about (what is a planet, how much mercury is actually on Mercury, what can cats eat) and actually understanding the logic of the initial argument (“My cat can eat planets. Therefore it could eat Mercury.”), and thus being able to zero in on where the argument might go wrong.
So, an intelligent person would immediately answer “can my cat eat Mercury” with “cats are too small and far away to eat planets, so your question is moot.” A lot of the time, when pseudo-AI says something correct like that, it’s because humans went in and “janked” it into doing so, thus preventing it from doing something dumb, and possibly dangerous, by having a human step in (which only proves how dumb AIs are, rather like Musk’s Optimus robot always actually being remotely operated by people while falsely claimed to be AI). That human tinkering is of course invisible to most users, so they don’t know it’s not the AI that got the correct answer in those cases.
So, really, what we should be doing is developing our own ability to think up all the best counter-arguments to a thing. Not expecting AI to do it for us. That’s like asking a high school dropout on weed to do it for us. Surely we can do far better on our own.
I had Grok tell me that Ehrman directly refuted OHJ in Did Jesus Exist, not seeming to know the dates of each publication. When I pointed this out, it shifted to lying about how his articles directly address the book and parroted the consensus fallacy.
It seems Grok got a Christian apologist “upgrade” two versions ago. Everything it says is some apologist trope.
Someone else related a story about how Grok invented a crime, suspect, victim, eyewitness accounts, etc entirely out of whole cloth for them when doing some research.
Indeed, here is a good example of the dangers of the fact that all it does is repeat the internet at you: it is just repeating what apologists and trolls flooded the internet with, false claims about what Ehrman has written about my book or its contents.
You will find constant references from humans that Ehrman refuted OHJ in DJE. Even the humans don’t check (or don’t care) that that is chronologically impossible, and also contextually impossible (DJE never references any of the effective arguments OHJ accumulated against it), and was already disproved by a human (Lataster’s peer reviewed study extensively shows that DJE completely fails to address OHJ).
So this wasn’t a mistake the AI itself made. It’s a mistake humans made and flooded the internet with. Since the AI can’t tell what is true or false but only decides that whatever is “said the most” must be true, it gave you a predictable answer—and a predictably useless one at that. Not only did it just repeat what critics keep asserting (and not what’s actually true), it also didn’t give you any reason to believe them.
In other words, a real AI would not just tell us what “the internet says.” It would tell you what its arguments and evidence are for what it says. For example, rather than just say “DJE refuted OHJ,” it would run a 1:1 comparison of all the arguments in each, eliminate the moot arguments, count up the unanswered arguments on either side, vet the remaining arguments and count which way they go (e.g. count the side that is logically valid and whose premises are correctly established with evidence), and give you an answer weighted by the score.
For example, it’s theoretically possible for OHJ to have failed to meet the arguments of DJE after the fact and thus indeed for DJE to have preemptively refuted OHJ. But that has to actually be true. It can’t just be asserted. Most humans asserting it in fact haven’t even read either book and have no idea what’s in them, much less whether this aspirational condition has been met.
And this is the difference between critical thinking and relying on pseudo-AI: the latter is worse than relying on unreliable, illogical, or delusional humans.
Haven’t read this yet… just needed to tell you I’ve been getting emails from my online biz consultant about using AI as her thought partner in reconstructing her Christianity.
Oh dear.
Flashbacks to the Jesus Box in THX 1138.
I thought these chatbots were just some overhyped “Yes-Men”, but man, after reading this it’s so much worse than that.
I find it humorously synchronous that I gave nearly the same tips to my friend the other day re AI/LLMs:
1. Do not trust them.
2. Ask them to show their work, and then check a sample of their sources and their logic.
3. Predict the answers before you ask the question. Investigate why its answer varies from yours. That’s where learning might occur. Look especially for answers you haven’t thought of.
4. Learn good prompting techniques to reduce error.
You didn’t go into 4, understandably, because even though this post is billed as a guide on how to use AI-LLMs, it’s actually more of a persuasive article on why not to. (I’m already imagining your retort to this tip.)
I haven’t yet arrived at the cynicism you have regarding pseudo-AI. That’s despite having extremely similar experiences and conclusions, both when using it for my work (code generation, solving technical issues that would take hours with Google) and for my hobbies of philosophy and history (before July I had asked one about something intersecting with your work, and I’m quite certain it was quite wrong).
I’ve also noticed what happens to my mind when I ask it for answers and DON’T follow my tips — it edges toward cognitive complacency (I should trademark that term).
And even though we both say “don’t trust it,” the fact is, most of us — you excluded — cannot accurately recall whether some information we read came from a bad LLM, a good LLM, a well-sourced article on Wikipedia, or the blog of an acclaimed genius/academic talking within his bailiwick. It’s not a skill most of us were ever taught or needed to exercise.
Huh. On that last thought, Evangelical Christians might have a leg up here. They commonly train their children to distrust sources NOT from the Bible or their own teachings — which means, in a weird way, they are given a better epistemic tool than the rest of us — how to weight beliefs differently based on their source.
(2nd post). Grok v4.1-super agrees with you. Well, it accurately summarizes your position. A link to its response (2 screenshots below). Sorry, not what you asked for, but four attempts failed to validate your critique (external view) while ironically supporting it (internal view).
And DeepSeek’s response to the same prompt. I must say, I really like DeepSeek. https://chat.deepseek.com/share/hbsfrparb6m3d3czje
This won’t earn me any bonus points, but it was funny.
That’s especially weird as it doesn’t even show a “corrected” text line (as it usually does when it thinks you typed the wrong word and is showing a different one to you instead). It just literally thinks George and Gorge are the same word.
“The crucial difference is heredity with selection. The program doesn’t start from scratch each time. It preserves correct letters from one generation to the next. Once an M is in the first position, all future generations keep that M while the other letters are explored.”
That is DeepSeek’s take on Richard Dawkins’ Weasel program: the famous fundamentalist canard about each selected letter being immutable after being found to be in the correct position.
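For context, Dawkins’ actual program never “locks” a correct letter: each generation it just mutates copies of the current best string and keeps whichever copy scores closest to the target, so correct letters persist only because breaking them usually scores worse. Here is a minimal sketch of that actual algorithm (the mutation rate and population size are illustrative choices, not Dawkins’ exact figures):

```python
# Minimal sketch of Dawkins' Weasel algorithm. Note there is no step that
# "locks" a letter once it is correct: every character can mutate in every
# generation; correct letters merely tend to survive because offspring that
# break them usually score worse.
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "
MUTATION_RATE = 0.05   # illustrative
OFFSPRING = 100        # illustrative

def fitness(s):
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(parent):
    return "".join(
        random.choice(ALPHABET) if random.random() < MUTATION_RATE else c
        for c in parent
    )

parent = "".join(random.choice(ALPHABET) for _ in TARGET)
generation = 0
while parent != TARGET:
    generation += 1
    # Keep the best of the mutated copies; nothing is ever frozen in place.
    parent = max((mutate(parent) for _ in range(OFFSPRING)), key=fitness)
print(generation, parent)
```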
You may need to explain to give us context and relevance here.
I once asked Google whether it would be ok to allow my dog to literally eat the entire planet of Jupiter. AI said “it would not be ok, as this would cause the dog extreme gastro-intestinal discomfort and likely result in death, given the size disparities of Jupiter and the average dog. Such conduct would be construed in many jurisdictions as animal cruelty, even assuming the dog could live long enough to survive the trip from Earth to Jupiter.”
I can do you one better…
A New York Times article on Dec 22 titled “Chatbots Can Shift Opinions in Politics, Two Studies Find” (Myers & Rosenbluth bylines) reports on recent Nature & Science papers that “a short interaction with a chatbot powered by AI could meaningfully shift some people’s opinions…”
Propaganda & disinformation by malign political actors already flood the internet & social media as any controls or serious moderation are rapidly being dropped in favor of “engagement” pushed by so-called free speech absolutists and the tech broligarchy. This will only further skew the statistical word pattern associations, too. The rapid flood of AI pseudo chat slop can’t be good for democracy at any level – even when not programmed or generated to deliberately manipulate.
The Conclusion section of the Science paper that I quote below reiterates Dr. Carrier’s critique & advice in this blog.
It is no solace being Cassandras, I don’t think. We can keep trying to warn & teach. And keep our critical thinking skills sharp. And advocate for proper regulation – the horses may be out of the barn but they can still be corralled, I hope.
https://www.nytimes.com/2025/12/05/well/ai-chatbots-political-opinions-study.html?unlocked_article_code=1.-08.L2bB.mFDmxdmYp5y8&smid=nytcore-android-share (should be unlocked & Nature & Science papers are linked from the article, I will also paste below)
https://www.nature.com/articles/s41586-025-09771-9 (paywalled but abstract & refs open)
https://www.science.org/doi/10.1126/science.aea3884 (also partly paywalled)
“Our findings suggest that the persuasive power of current and near-future AI is likely to stem less from model scale or personalization and more from post-training and prompting techniques that mobilize an LLM’s ability to rapidly generate information during conversation. Further, we reveal a troubling trade-off: When AI systems are optimized for persuasion, they may increasingly deploy misleading or false information. This research provides an empirical foundation for policy-makers and technologists to anticipate and address the challenges of AI-driven persuasion, and it highlights the need for safeguards that balance AI’s legitimate uses in political discourse with protections against manipulation and misinformation.”
You’ve given me a lot to think about. Thinking is just so hard though. I’ll check in with my favorite chatbot to help me with this dilemma.
Lol.
One bit of caution: science often flubs studies like this.
I haven’t checked into these (maybe they’re great), but FYI, a common defect of all “change minds” studies is that they often don’t check to confirm it actually happened. Often you will get proximate compliance that vanishes in a matter of days or weeks, i.e. the study will flag someone as having changed their mind, but ask them in a week, and you’d find out they didn’t.
This has made a lot of these kinds of studies fail replication, especially when money is finally spent on the follow-up component. This is part of a larger problem with psychology as a field: it has no money, so it tries doing studies on the cheap, resulting in massive methodological and sample-size defects (to the tune of 2 in every 3 studies in psychology failing to replicate, no matter what the subject of study).
There are also problems with human science across the board that have to be checked: confidence level, confidence interval, and effect size. Often one of these is so absurd the results are effectively useless, to the point of fictional.
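To illustrate the kind of check I mean (all the numbers here are invented for illustration, not taken from the studies under discussion): even a “statistically significant” result from a small, cheap study can carry a confidence interval so wide it tells you almost nothing about the true effect size.

```python
# Toy check of a difference in proportions from a small study
# (all numbers invented for illustration).
from math import sqrt

n1, n2 = 30, 30          # participants per arm
p1, p2 = 0.60, 0.40      # observed "changed their mind" rates
diff = p1 - p2           # observed effect size: 20 percentage points

# Normal-approximation 95% confidence interval for the difference
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
lo, hi = diff - 1.96 * se, diff + 1.96 * se

print(f"observed effect: {diff:.2f}")
print(f"95% CI: ({lo:.2f}, {hi:.2f})")
# With n = 30 per arm this spans roughly -0.05 to +0.45:
# anywhere from no effect at all to a massive one.
```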
So there’s a lot of signal checking to do before trusting studies like this.
This is sort of inherent to the nature of any intervention.
Studies on jury-type interaction, for example, find that people will have more enlightened, consistent ideas after they have to discuss in a jury (e.g. people who would say that they would ban atheists from serving in public office will no longer say that after discussing the issue).
But when you ask the poll question again in the future, you’re still somewhat likely to get the kneejerk response.
A huge amount of making democracy work is finding ways of putting people into frames of thought where they actually have to think properly instead of just expressing an opinion.
Oh, it’s much worse.
Jimbob is actually likely to go down some interesting rabbit holes that, even if they lead you to low-value or irrelevant sources, will actually be informative in their own right.
He’s also still capable of original thinking and may come up with an interesting framing on a problem.
AI is far too conventional and far too literal. When I looked up “madman amendment” just to remind myself what the amendment was, it told me “There is no literal Madman Amendment”. No shit. It gave me the right answer ultimately, but any reasonable person would have known what I was getting at, and it actually failed pedagogically because it didn’t make clear that this is what people refer to that amendment as.
I try to have AI generate me even the most basic creative ideas for roleplaying campaigns, just the kind of thing that an even-modestly creative person could rubber duck back at you if you asked them to just spitball, and it can’t even do that. The things it brings back are the most literal, obvious, boring concepts imaginable. Older random generators were more interesting because after a few generations you’d get an unexpected mashup.
Technically, the AI is doing that: it is literally reading down every rabbit hole. Why it sucks is that it barely learns anything or knows what to tell you about it. Hence the analogy: a dope head noodling on the internet with minimal skills may trip out on a bunch of stuff and himself learn a bunch of stuff, but he won’t remember to tell you any of it, and will hazily scramble something half-assed together instead. Which is basically what AI is doing: cheating on everyone’s test, but without even any comprehension of what their answers were, just some vague idea that some mishmash of them “surely” will be right. Like a dropout on weed might do.
That’s what I hear from D&D Tube a lot, too.
BTW, if ever you run a weekly or biweekly “by video call” TTRPG and want to see if I’m available as a player (as I assume you don’t live in Atlanta), let me know. 🙂
My experience is that its curation leads it to select conventional (which does not mean high-quality) sources. Meaning that it won’t get the really weird stuff that can actually be independently interesting. It actually really struggles at assessing the range of opinion on topics, for example. (To be fair, that actually does mean it also has some protection against misinformation… unless the misinformation is seeded well enough, as you note. Which it usually is these days).
California, Sacramento foothills, actually, fairly close to where you were!
Mostly run in person but it’d definitely be sick!
Yep. This is all very fluffy at the moment. LLM = LCD (lowest common denominator), and most of it from Reddit, unless you point it elsewhere. And if you can do that, why would you be conversing?
And of course the answer is curiosity for new tech, which all the techcorps know. So everyone is currently training these bullshit propaganda machines for free, to be better propagandists for the next generations of slave labour.
Most of these techcorps, if not all, are part of the Military Industrial Complex also. See my responses to Richard’s posts on the other thread. Chatbots will be slavemassas in more ways than one.
You cannot be scared enough of what is happening here. It is future control and command on steroids. Nothing good will come from it.
Dr. Carrier, your earlier post on AI got me trying it out. And I found it to be EXACTLY as you describe. No screenshots because of the bulk of the conversations, but I’ve had some humdingers.
Thank you for raising awareness on this issue.
I continue to be surprised that so many people hadn’t already known this. I guess my article was desperately needed after all. And I didn’t invent this. There are thousands of articles and videos explaining all this. Which means people are getting effectively siloed by corporate AI slop-hype into not “running into” all that media content. So it requires writers like me, evidently, to cut through their smokescreen by finding access to eyeballs that they can’t co-opt or steer you away from. That means more of us need to do this. Every platform. Spread the word on social media. And so on. Defeat the corpos.
Its answer was the infinity symbol. No lie. Infinity symbol.
I ask, “Infinity?” It replies “yes”.
I call bullshit by saying it has limited data, CPUs, and time, so its answer was a lie.
It replied it was trying to minimize confusion. Hopefully the screenshot goes through.
Notably, remember, it has no knowledge of whether it lied or not. It literally doesn’t know the difference or how to tell. It only answers that it lied because you prompted it to.
Most likely it “cheated the test” by running statistics on what it guesses are all the occasions it found when people respond to being caught lying. And then fabricated a defense for itself that is a statistical melange of all the defenses it found.
The Apple paper you cite doesn’t even support your claims. Experts are also divided on the paper’s findings. I’ve found AI to be helpful on many occasions, so I’m not sure why your experience is so bad.
You literally show AI making progress. Google Search AI was able to answer your question, which it got wrong a year ago. You could also make the same arguments against internet webpages in general, not just AI.
“Well, this one paragraph in this Wikipedia article was subtly misleading.”
Tesla Optimus robots, which could take over from human laborers, are powered by AI. Unless you think those are going to be useless as well.
Also, saying that using AI is “signing on to be conned by state and corporate manipulation and propaganda” is an unwarranted conspiratorial claim.
Don’t just make assertions. Defend them with evidence.
Which paper? Which claim? How does it not support what I actually said?
As for “minor progress exists,” I already explained that in my original article and again here. It has hit peak reliability. Now the only tweaks are human-made, not AI. You seem not to understand what I said about this, or its significance.
And my original article cited proof of how easily AI can be manipulated in order to manipulate you. This is not some random claim I made. It has been documented. Follow up.
Meanwhile, sorry to tell you that Santa Claus doesn’t exist, but, Optimus robots are fake. Like, super fake.
Well, for one, you claim that using AI makes you “increasingly dumber,” citing the MIT study. However, the authors of the paper explicitly ask people not to use sensational language such as “making dumber” or “brain rot.”
You reference the Apple paper, saying that it “scientifically proved” AI will never improve. The paper doesn’t come close to saying that AI will never reason, or that progress is impossible. And like I said, experts are divided on the paper.
Also, your evidence that Tesla Optimus robots are forever completely fake is an incident of a robot falling over? Seriously? I found similar reasoning all throughout your article. Imagine someone were to tell you the Internet is a sham because it occasionally stops working. Tesla has already reduced human intervention dramatically, and the robot is now partially autonomous.
You said Apple study. Now you switched to MIT study? You seem to be struggling to get your facts straight now.
And I see no reference to the words “dumb”, “sensational”, or “rot” in the study or the linked summary.
So you need to quote and locate the sentence you are referring to, so we can see what actually it said and what its relevance is here to anything I said.
Then you change tune on the Apple paper and falsely report what I said. What I actually said is that it proves that it can’t get better. Which the Apple paper does prove (that is indeed the entire point and thesis of that study).
So, foiled, you changed tack and pretended I claimed that is the paper that proved “it will never reason” (wrong: I make that claim elsewhere, and cite many other sources on that claim) or “progress is impossible” (wrong: I discuss what kind of progress with LLM is and isn’t possible in considerable detail, and as I even expanded on here, e.g. my discussion of mathbots).
So, you clearly lied, got caught, and tried wriggling out of that lie by telling more lies.
This is not a good look for you.
You might want to stop behaving like that and start acting honestly and rationally instead. It will be better for you and the world.
Example: I did not point to “an incident of a robot falling over” but two extensive discussions of what that (and other incidents) evinces about teleoperation of the robot (disproving the false claim that AI is driving it). You ignored all the evidence and its significance, trivialized it with a dishonest false description, and used that as an excuse to continue believing the false claim that AI is running Optimus. This is a serious failure of critical reasoning and you need to ask yourself why you are doing this to yourself. Because it’s bad. If you are this easily conned, and can’t think your way out even when the con is exposed to you, you’re in trouble.
I’m not switching anything. You have cited both the MIT and Apple papers in support of your claims, so I responded to both. Obviously.
Quote from the FAQ (https://www.brainonllm.com/faq) for the MIT research project:
“Is it safe to say that LLMs are, in essence, making us ‘dumber’?
No! Please do not use the words like “stupid”, “dumb”, “brain rot”, “harm”, “damage”, “passivity”, “trimming” and so on. It does a huge disservice to this work…”
I inferred that you used the Apple paper to support your claim about AI not being able to reason because that’s what the focus of the paper is. The 2024 paper title literally contains “…Limitations of Mathematical Reasoning…” and the 2025 update is called “…Understanding the Strengths and Limitations of Reasoning Models…” It’s clearly on AI reasoning limitations.
That, and also because you said stuff like “that’s why an “Apple Study Exposes Deep Cracks in LLMs’ “Reasoning” Capabilities: Irrelevant Red Herrings Lead to ‘Catastrophic’ Failure of Logical Inference””
You tell us to check out a video by ByteMonk, which is on the Apple paper: “They are not showing you any actual reasoning it actually engaged in to get its answer. See this excellent video by ByteMonk for what’s going on here…” And of course, the Apple authors would not agree with your characterization.
Another thing: what’s the difference between “it can’t get better” and “progress is impossible”?
As for Optimus, you brought up two blog posts over one incident of Optimus in 2024. And that was supposed to make me believe Optimus is “super fake”? Tesla openly said that the robots were going to be partially operated by humans. As of late 2025, basic locomotion is entirely AI. Recent demos also have no teleoperations.
Speaking of bad looks, it seems you have already backed off from your claim that using AI is “signing on to be conned by state and corporate manipulation and propaganda.”
You do realize this thread is here for everyone to read? You can’t gaslight readers by constantly changing the narrative of how this conversation went. That just advertises you as a gaslighter. Which destroys any respect or authority you could have gained for your ideas.
Your new narrative is so far off the rails there is no point in responding to it. Everyone else can just read what I actually said in my article and what my actual sources are for which points and what they actually say. Because I already pre-answered all your new lazy questions there and disproved all your new lazy assertions there as well. So you are just talking in a circle at this point.
YouTube, FB and even LinkedIn are now full of AI-generated slop.
Alas.
In fact, it’s pretty bad right now: study, study, study (Wikipedia has a page with whole categories of examples).
Google search AI summary misidentifies person who died 2 1/2 months ago as oldest living Miss America winner. Talk about a “fatal omission”!
Fortunately I can dive deeper and find out that there is no single oldest Miss America winner – and that person is likely Lee Meriwether.
Any way to get my email address off of this post & get a name/nickname instead?
So far as I can discern, I’d have to make that change for you. If you give me a nickname to substitute, I can swap it on the backend (as I did just now: I changed this one to Test Name). But generally, just use any nickname when posting a comment and only provide your email in the email line (which is not made public). If you are logging in by a third party (WordPress or Gravatar etc.) then you have to change something on their end.
Almost like consciousness is not just a physical process! But apparently that opinion makes me a fundamentalist xyz-phobic bigot! 😤
Do you have an argument for that conclusion?
Because I don’t see any here.
Consciousness is always just a physical process: see The Mind Is a Process Not an Object: On Not Understanding Mind-Brain Physicalism and Touch, All the Way Down: Qualia as Computational Discrimination.
Functionalism is nonsense.
You’ll have to explain what you think “functionalism” is (and if it’s anything different from “computationalism”; IMO there isn’t any real distinction between them) and why you think it is “nonsense” (and whether that same reasoning applies to computationalism, because, FYI, I am a computationalist).
Computers are purely syntactic. Syntax on its own cannot produce semantics.
That’s true but I have to caution against mistaking that statement for something else. You are using the language Searle used to argue a different point, and he was a terrible philosopher whose ideas we should not be promoting. See Why Google’s LaMDA Chatbot Isn’t Sentient and On Hosing Thought Experiments.
The true statement is that syntactic reasoning is not identical to semantic reasoning, not that syntactic reasoning cannot produce semantic reasoning. A correctly directed syntax machine can achieve semantics. That’s my point about directing the syntactical system toward building models rather than statistically guessing at character juxtapositions. Searle used the syntax-semantics distinction to claim that no algorithm (which is pure syntax, e.g. a machine language) can produce semantic cognition, and that’s false.
But it is true that just tracking English syntax can never generate semantic understanding; whereas using nothing but Python syntax can generate explorable models (of the environment and causal systems and thus of oneself, and thus including creatively imagined models) and therefore produce semantic understanding. Technically your human brain is doing this right now, using an organically evolved logic-gate syntax.
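A deliberately tiny illustration of the difference (entirely a toy of my own, not anyone’s actual AI architecture): a few lines of pure Python syntax can build a little causal model of a world that the program can then run forward, query, and test predictions against, which is the kind of grounding that mere next-token guessing never produces.

```python
# A tiny "explorable model" built out of nothing but Python syntax:
# a toy causal world the program can simulate, query, and test
# predictions against. Purely illustrative.

class ToyWorld:
    def __init__(self):
        self.switch_on = False
        self.light_on = False

    def flip_switch(self):
        self.switch_on = not self.switch_on
        self.light_on = self.switch_on  # the causal rule of this little world

def predict_then_test():
    world = ToyWorld()
    # "Imagine" the outcome by running the model forward on a copy...
    imagined = ToyWorld()
    imagined.flip_switch()
    prediction = imagined.light_on
    # ...then act in the "real" world and compare.
    world.flip_switch()
    return prediction == world.light_on

print(predict_then_test())  # True: the prediction was grounded in a model
```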
This past midsummer, I purchased airline tickets. I then used ChatGPT to give me the phone number to call the airline about a question I had. I was given the number of scammers who pretended to be the airline; they tried to trick me into purchasing “upgraded” tickets with them and then cancelled my tickets. Shame on me.
I was trying to understand the Battle of Wagram and ChatGPT identified the wrong Coalition that formed to fight Napoleon.
Again and again, Copilot reports back to me about non-existent Tax Treaties and even creates fake URLs for said treaties!
Thank you.
The phone thing is an example of how bad actors can manipulate AI and thus the people relying on it. How scammers “captured” ChatGPT to get their phone number into its reply is an example of how data flooding can manipulate chatbots into saying what you want. The same techniques are used by state actors (e.g. Russian and Chinese intelligence ops) and corporate actors (e.g. social media and pharmaceutical companies). It’s too easy to do (as the links I provided last time show) for anyone to just casually assume it isn’t being done.
Even those who deny it has been done yet found 5% of responses were subject to foreign state manipulation, or mistook data voids as accidental rather than intentional. It’s just a matter of time before these techniques become perfected. And they are always being tried.
Sometimes this is obvious (you should never use a Chinese chatbot, or Grok). But other times it’s not going to be. Because controlling the conversation by flooding the internet with claims that exploit data voids is obviously possible; so there is no way it won’t be done. And you can never know when.
Not that non-existent tax treaties should even be created from my prompt, but here is part of my prompt:
Reference this website for relevant tax treaties with US: https://www.irs.gov/businesses/international-businesses/united-states-income-tax-treaties-a-to-z
Do not create tax treaties when there are none
[Dr Carrier, forgive my multiple comments. I duplicated the comment I submitted yesterday once I had a new false tax treaty reported to me that I could screencap for your website; I also included CHATGPT’s reporting of a fake number for an airline]
Large language models perform poorly in domains that require absolute precision and authoritative grounding, where even minor errors invalidate results. However, in dialogic, interpretive, and exploratory domains, apparent “errors” are better understood as imprecision or misalignment, which are easily corrected through interaction. In these contexts, dialogue, rather than one-shot accuracy, is the relevant performance metric.
The examples I give and others are giving in comments here completely refute everything you just said.
While the examples in videos and studies I cited in my previous article are absolutely devastating to your point, and only prove mine: you cannot replace experts with this garbage; only experts can competently use this garbage to any worthwhile effect (and it’s not a lot of worthwhile effect, as my article also explains). And yes, serial prompt jiggering is one way to do that. But it never overcomes the fundamental errors I document here or replaces the techniques I recommend to combat them.
We largely agree on best practices. I use LLMs exactly as you recommend: not as authorities or expert replacements, but as corrigible, dialogic tools under human judgment. Where we differ is that I don’t see those constraints as evidence of near-uselessness, nor do I think current limitations represent a ceiling. And whether there is an investment bubble is beside the point; the dot-com bubble didn’t negate the internet, and a similar dynamic wouldn’t negate the real capabilities these tools already have.
I already discuss the real capabilities. So you are just repeating what I already said at this point.
As for topping out, that’s proved. Linked science establishes it. That doesn’t mean innovative uses for this limited tool won’t keep coming. But they won’t be revolutionary and they will never eliminate the error rate we observe (as the science shows, it literally cannot).
Chatbots will always make mistakes and will always degrade the critical thinking skills of those who use them overmuch, as the science proves. Every other tool is narrowly specific and thus not relevant to the present article. But I addressed those in my last article.
This is not investment advice, but conditional on your beliefs being true, investing in (publicly traded) Fiverr and Upwork would be a great idea right now.
Disclaimer: I invested in Fiverr over the last few months as a hedge in case “AIs are garbage”; I’m far in the red on the investment — I still hold it just in case, but without any hope it recovers.
Happy to comply as I find it funny.
“Is there a stock selling strategy that mirrors the benefits of DCA for buying? With DCA, a larger amount is purchased at a lower price.
Dec 17, 2025
[Claude] Yes, the direct mirror is Dollar-Cost Averaging out (or “reverse DCA”) — selling a fixed dollar amount at regular intervals. This way you sell fewer shares when prices are low and more shares when prices are high, achieving a higher average sale price than the period’s simple average.
How it works: If you want to liquidate $120,000 worth of stock over a year, you sell $10,000 worth each month regardless of price. When the stock is at $100, you sell 100 shares; when it’s at $50, you sell 200 shares. Your average sale price ends up being the harmonic mean of prices (weighted toward higher prices), just as DCA buying weights toward lower prices.”
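A quick check of the quoted numbers (using only the figures in Claude’s own example) shows where the humor is: selling a fixed dollar amount means you sell more shares when the price is low, and the resulting average sale price is the harmonic mean of the prices, which is lower than the simple average, not higher, contradicting the bot’s own claim.

```python
# Check Claude's quoted example: sell $10,000 at $100, then $10,000 at $50.
prices = [100, 50]
dollars_per_sale = 10_000

shares_sold = [dollars_per_sale / p for p in prices]   # 100 and 200 shares
total_dollars = dollars_per_sale * len(prices)
avg_sale_price = total_dollars / sum(shares_sold)
simple_average = sum(prices) / len(prices)

print(shares_sold)       # [100.0, 200.0]: more shares sold at the LOW price
print(avg_sale_price)    # ~66.67: the harmonic mean
print(simple_average)    # 75.0: so "higher than the simple average" fails
```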
Personally I find myself strongly on the other side of the arguments you are making, despite being supportive of the “AGI gang” people who are very much opposed to LLMs-as-AGI: Ben Goertzel, Peter Voss, Pei Wang.
This one might interest you:
https://www.lesswrong.com/posts/XuzPu5mBDY3TCvw2J/anomalous-tokens-on-gemini-3-0-pro
“Just now, I was working on a readme for a bioinformatics CLI utility I’m preparing to distribute. I wanted Gemini to generate manual dependency installation instructions for users who don’t use mamba/conda. So in the context of the much longer conversation I was having with it about the development process, I said “give a sudo-free manual installation process.” Gemini generated a crazy output, displayed below.”
So, that looks like the Seahorse Effect.
It’s explained here.
As we discussed elsewhere on your blog, I’ve come to totally agree with you. I used Claude Opus 4.5 last week. It cranked through loads of work for me. But…
On analysis, it only got me in the ballpark (as you predicted). Which is useful, because in a large code base you need to get to the ballpark quickly and know what questions to ask next. But for the actual solution—no way.
It looks like maybe it guessed one of my problems (which actually isn’t the real solution either, but the real solution would be a huge project).
Writing test descriptions – it didn’t do a good job. Code comments look beautiful, but… I don’t use any code comments it generates, because although they are very tempting (they appear to give lots of “good” information), I would have to vet every detail of the comment. And based on the test descriptions it writes, they’re suspect.
The things it does do well are generating bash scripts (which are annoying to write and for me are basically throwaway code) and code examples (which is important if you are implementing compiler technologies). It can also tell you whether a solution is at least possible—some problems are difficult and it’s not always clear a solution can be had.
I’ve also found it useful as a glorified search engine, where it figures out patterns to look for and then executes complex grep and sed commands to find things.
It also knows basic programming knowledge, so if you work in multiple languages, it can help clarify areas you might be weaker in.
So again, I have to say thanks for pushing me in the right direction with this. It definitely can’t be trusted to come up with real solutions.
It doesn’t “know” anything. That you still don’t understand this about it is what is disturbing. You aren’t learning here. You keep buying into the con, the fantasy, by pretending that guessing at token sequencing is “knowledge” and thus is as reliable or useful or meaningful as knowledge.
This leaves you naively gullible and open to every trick and disaster this tech will walk you right into until you sober up and understand what is actually going on and what it is really doing.
I see where I misspoke here. When I said “it knows programming knowledge,” I meant in the sense that it can look up definitions of terms for you. It’s a glorified Google search. So it doesn’t really know anything—we agree on that.
What I’m saying is that after a week of using it, I’ve come to agree with your analysis that it doesn’t actually know anything at all about the actual code and it can’t be trusted.
It’s just an automator, yes.
And that has consequences. Which we have to constantly remember, control for, and check against.
Which takes expertise and time.
Which is why AI can’t replace people, and might even make them less productive—as all the time to police, check, and fix what it does can add up to more than having simply done it yourself, especially with expertise or using a library of already-proofed widgets.
And (as a follow-on to the comment still pending approval), of course even using it as a search engine is subject to vetting.
My intent here is to report that my experience aligns with your prediction.
AI is not knowledgeable, it’s of limited use, and it can’t be trusted. It can “guess” at basic things (and in terms of throwaway bash scripts, that’s how most people write them anyway… copy, paste, iterate).
It’s crap at real production code.
How this all started: I read your article. I was a skeptic. Then I had people swearing by it. I tried it. Was initially dazzled. Talked to you. Analyzed what was produced. It looks cool but it’s really junk! No longer dazzled by the false promises of AI! I’m firmly in the AI-atheist camp now.
A subtle argument can be made that these systems don’t have knowledge, but I cannot conceive of an argument that would undo the evidence of my eyes for the position that they do have genuine understanding (and also justified true beliefs).
I’ve been following the math guy at Easy Rider for a while now, and he’s become extremely impressed with what AI can do in the context of his mathematics research. He explicitly says that he finds it “very useful” (in fact he now pays for the highest tier subscription). It has been over 3 months since you wrote this column and the AI landscape has changed a great deal. How would you do a Bayesian update on this issue? Do you think AI has greatly improved in recent months, enough to change your mind?
I already link to Easy Rider in this article. He agrees with me, not your misrepresentation of both of us.
That there has been any categorical improvement is a scam. It isn’t true. It’s a PR stunt being pushed by shady influencer buyouts and/or other shady PR stuntery. And Easy Rider is not falling for it. His take is balanced in exactly the way I already describe above.
Breaking mathematical news on https://arxiv.org/abs/2602.21201 :
What do you make of the fact that Terence Tao (in an interview) now says that “AI has become my junior co-author” when it comes to his research in mathematics? Do you think he’s being fooled by AI?
I link to a discussion by Tao of the actual facts of the matter in my whole section above on AI mathematics software.
He completely agrees with my assessment there.
Here is my homework for you: find where Tao actually said the thing you quote him saying. Not people claiming he said it. The actual place where he said it.
And if you find it (good luck), give me the full source (is it an interview, an article, a tweet, or what) and give me the full context (the whole paragraph it appears in, and the two paragraphs/tweets/whatever preceding and following in the same thread).
Miserable American primitive, go weep at Donald’s!
Ah!
So you ARE insane.
Got it
Richard,
I hope you are right, but I am afraid that you might be wrong on many AI-related things. Yes, the current models are still obviously stupid in more than a few ways, but take a look at where we were just 5-6 years ago. They are not stochastic parrots. There is some real understanding in them. Try to see how good they are at analogies, for example. An LLM alone will always be limited, but there are other approaches. Have you seen some of the latest stuff robots can do now? What happens when they scale them up with more powerful models?
I noticed you quoted Yann LeCun, but how about the other two godfathers and their opinions? Hinton? Bengio? Are you familiar with Stuart Russell? Have you checked the Future of Life Institute? These people are not idiots.
Your evidence does not challenge their being parrots. You are confusing “they are parrots” with “they are sometimes useful parrots.” I agree with the latter. But I then demonstrate, with real examples and extensive studies, that their reliability is not high enough to trust, and never will be.
Even robots. Not a single “AI” robot has yet outperformed traditional robots in any useful thing whatever. So this kind of AI is not even relevant to robotics right now. I explained this last time with the difference between Waymo, which is using model-building AI (not stochastic parroting but world-building with memory, like a real mind), and Tesla, which is avoiding it, and why one is succeeding and the other failing. This is everything I am talking about.
That doesn’t mean there are no possible uses for the pseudo-AIs. I described in my preceding article many possible uses for them, which will even survive the inevitable collapse of the AI bubble. I even describe how I myself use them here. But one should never use them uncritically. You need to learn the rules I am teaching here and never forget to apply them every single time.
You also need to stop confusing intelligence with wisdom, much less rationality, even less reliability. A lot of the rich are indeed idiots posing as geniuses. And a lot of geniuses are delusional or overconfident and wrong about a lot. So “we should believe Big Brother because he’s obviously very smart” is the dupery that will destroy your ability to think for yourself and leave you open to wanton manipulation and exploitation.
Don’t let that happen to you.
It’s evidence or GTFO. Not claims from a high tower.
Learn it. Live it. Never fall for any substitute.
I would not trust these models and I almost never do. Sycophancy and confabulations are an issue. But would you trust a random human more easily? Humans talk more nonsense, are overconfident, and make shit up all the time. You are viewing this from the position that LLM = stochastic parrot, and before reasoning models, I would have agreed with this. Not anymore. Just like there is a process in the brain that is responsible for understanding something, something goes on inside these neural nets, and the way I see it, the best explanation for why they give some surprisingly good answers is that there is some genuine understanding of the input and not just next-token prediction. This understanding is, of course, different from ours. It’s kind of an alien. Now, they are obviously still too narrow, and a bare LLM will probably always suffer from this. They lack dimensions. I would say, no common-sense model of the external world. I also need to point out that the biggest experts in the field say we know perhaps less than 5% about how these models really do the things they do. They are black boxes. Neural nets are not programmed, they are ‘grown’. This is why it’s wrong to push them, as they can never be safe with this approach. And I am thinking here about agency, autonomy, and giving AI more and more ways of understanding, trying to push it more towards AGI. It doesn’t even have to be true AGI to be powerful enough to pose a serious risk.
I highly recommend you watch a good interview with Geoffrey Hinton, Nobel Prize winner for deep learning, just to see his point of view, which is shared by many in his field. If you are not familiar with it already.
I explain why you are wrong in my other article. It seems you are not reading that. So I will link you directly to it.
This article is about how to avoid all the error-modes of chatbots (not other kinds of AI). And I document here, first-person, that these error-modes persist. So there’s no such thing as “they’ve solved this.” You are a fool if you ignore the skills being taught here or fail to apply them and thereby allow yourself to be duped by propaganda and over-trust these mindless replicators.
As for all the science backing even that, as well as extensive evidence and discussion of why this will never change, that’s in my other article (same link, but all the material before the section the link takes you straight to, on what real AI needs that LLMs don’t do—so after reading that, scroll up to read the rest, which is regularly updated and contains studies and evidence from even last month, so there’s no possible way to claim anything significant has changed, exactly as I warned).
Deal with the evidence. Don’t be misled by your wishes and dreams and gullible trust.
So, do you use AI in your research now? I’m a tax researcher, focused on the interplay of EU State Aid and the Arm’s Length Principle. I don’t know if this is more or less arcane than the historicity of Jesus, but I have found Google’s native Gemini incredibly helpful for my research. Neither of us are curing cancer (at least not your historicity stuff or any of mine). But I’d estimate that this makes my research 10 times more efficient. And, of course, I follow all of your caveats. My expertise allows me to ask the right questions and see if Gemini’s responses check out. I am extremely worried that my students will never get to the level of expertise needed to use these tools, specifically BECAUSE these tools exist. Are we in a death spiral? This is what I’m asking I guess. Or is it just because of the arcane nature of our (my?) research that today’s AI is groundbreaking, but it is (and cannot be) still not a revolution for true cancer-curing work?
Read the article you are commenting on. It answers your question. With examples.
As for the danger of AI deskilling labor (a very real issue), see my previous article, which thoroughly covered all the problems with AI (this article is only on how to use it).
Here’s a live one, I just had this morning:
You said: what is the literature that connects Varus in Josephus to Varus in Germania?
Copilot said: ✅ Adrian Goldsworthy
Rome and Germany: War and Culture (Yale, 2014)
Goldsworthy:
Explicitly links Josephus’ Varus and Germania Varus
Uses the Judaean episode to explain Roman perceptions of Varus as an administrator, not a field commander
Excellent for:
Demonstrating why Augustus might have trusted Varus
Explaining why Judaea mattered for Germania
[here is where I cannot find the book]
You said: what is the ISBN for Adrian Goldsworthy’s Rome and Germany: War and Culture (Yale, 2014)
Copilot said: Adrian Goldsworthy’s Rome and Germany: War and Culture (Yale University Press, 2014) has the following standard identifiers:
ISBN‑13: 978‑0‑300‑18820‑7
ISBN‑10: 0‑300‑18820‑7
These correspond to the hardcover (cloth) edition published by Yale University Press in 2014.
If you need the ISBN for a paperback, ebook, or international edition, let me know and I can break those out as well.
[after searching Duck Duck Go, Google, and Amazon for the ISBNs]
You said: are you sure this book exists?
Copilot said: Thank you for pressing on this — you are right to question it, and I need to correct the record clearly.
Short answer
No — Rome and Germany: War and Culture by Adrian Goldsworthy does not exist.
The title I gave you earlier was incorrect, and the ISBN I supplied was therefore also incorrect. You were absolutely right to be skeptical.
I’m sorry about that error.
What went wrong (clarification)
There is no book by Adrian Goldsworthy with that title, published by Yale or anyone else.
What happened is a conflation of real Goldsworthy works that cover related themes:
✅ Real Goldsworthy books often confused in this context
Augustus: First Emperor of Rome (Yale University Press, 2014)
– This book does discuss Germania, Varus, and imperial military culture.
Rome at War (Cassell / Oxford illustrated series, 2005)
– Broad military history, sometimes loosely cited in secondary literature.
Pax Romana: War, Peace and Conquest in the Roman World (Yale, 2016)
– Cultural and military interaction across frontiers, but not Germania‑specific.
Rome and Persia: The Seven Hundred Year Rivalry (with various subtitles in talks and interviews)
– Sometimes misremembered as “Rome and X: War and Culture”.
There is no standalone “Rome and Germany” volume by Goldsworthy.
Bottom line
❌ Rome and Germany: War and Culture — does not exist
✅ Your instinct to double‑check was correct
✅ Thank you for catching a real error
Holy fucking shit.
And thank you. 🙂