You can use AI to spy out AI!
GPTZero, the startup behind an artificial intelligence (AI) detector that checks for large language model (LLM)-generated content, has found that 50 peer-reviewed submissions to the International Conference on Learning Representations (ICLR) contain at least one obvious hallucinated citation—meaning a citation that was dreamed up by AI. ICLR is the leading academic conference that focuses on the deep-learning branch of AI.
The three authors behind the investigation, all based in Toronto, used their Hallucination Check tool on 300 papers submitted to the conference. According to the report, they found that 50 submissions included at least one “obvious” hallucination. Each submission had been reviewed by three to five peer experts, “most of whom missed the fake citations.” Some of the citations named non-existent authors, were attributed to the wrong journals, or had no real-world match at all.
The report notes that without intervention, the papers were rated highly enough that they “would almost certainly have been published.”
It’s worse than it may sound at first. One sixth of the papers in this sample had citations invented by an AI…but the citations are the foundation of the work described in those papers. The authors of those papers apparently didn’t do the background reading for their research, and just slapped on a list of invented work to make it look like they were serious scholars. They clearly aren’t.
The good news is that GPTZero got a legitimate citation out of it!



While I’ve seen some good uses of AI, mostly in the sciences, this could be one of the best I’ve heard about. I can often tell AI-generated video content, particularly of living creatures, but some things like text are harder to discern. Getting some help would be useful if we could rely on the source of the help, but that’s also a problem.
A lot of the focus on AI in the public domain is on generated content: kids getting the Google bot to write their report. That’s fairly innocuous in a way…except for the child and their parents…but imagine if the AI produces something for a customer that’s preposterous or rude. Or incorrectly tells the customer they are not eligible for a service, or that they are when they’re not. Now we’re talking lawyers, perhaps. So, for AI-generated content representing business positions or policies to customers or clients, hallucinations are potentially costly problems.
There has always been citation fraud: typically citing a real book with made-up information, knowing your professor is not going to check 60+ people’s 10+ citations. Heck, I’m not going to lie: I’ve cited a work knowing that there was something sort of similar in there that I was talking about, but didn’t feel like/have time to read the whole book looking for the exact quotation. Or I’ve done the “Okay, the info I need is in this 20-year-old paper, but I’m not supposed to cite anything over 3 years old, so what papers in the last 3 years have cited this 20-year-old paper, and was it with this information? Great, I’ll cite that,” which isn’t academic malpractice, but is a little borderline.
But this? This is just muddying the waters at a time when we REALLY don’t need the waters muddied.
It’s not just the sciences. AI will happily hallucinate non-existent case law citations in filings submitted to courts during trials. Judges hate that. And careless lawyers can get expensive sanctions for that stunt. Too bad science journal editors cannot heavily fine AI hallucinations like angry judges can.
It’s a problem in medicine, too.
Back in the day, physicians and other practitioners relied on what they learned in their primary and graduate training, and kept up-to-date by attending conferences and reading relevant journals. These sources weren’t free from fraud, of course, but were quite a bit less likely to have outright hallucinations as their underpinning. The main flaw in this scheme was that these sources were by their nature very out-of-date.
Now, of course, things have changed. Medical practitioners are trained in and routinely use bedside, point-of-care searches of the most recent evidence-based medicine to guide their medical decision-making. And here we are: LLMs are making these new ways and sources unreliable and downright dangerous.
Bad science coming out of an AI is bad. Bad law, likewise. But if I were to prescribe a medication or therapy based on an LLM hallucination, I could seriously harm or kill my patient. I can’t think of a situation more fraught than that!
Furthermore…
This is really a failure of the peer-review process. As antigone10 said, it’s really hard for a professor to check the 60+ authors in the 10+ citations of each paper, multiplied by 30+ students in 3+ classes, 3+ times a semester. But one referee on a team of them, reviewing 1-2 submitted papers at a time, could be assigned that onerous task and leave the rest of the peer review to others. That’s a much more reasonable burden, and would be relatively easy to implement, now that the problem has been identified.
But a better solution seems to me to be the one the OP hit on: get an AI to check for the work of other AIs, and verify the validity of citations. That way, we only have to worry when the AIs start to collude with one another. Then we’ll probably be fucked in so many other ways it won’t matter anymore.
Looks like the start of an arms race between AI developers. Now that there is an AI to detect AI written papers, ChatGPT or whatever folks are using to write fraudulently-sourced papers will tweak their AI to defeat GPTZero and GPTZero will have to tweak their AI to overcome the new tweaks and so on and so on until…?
Not really, crivitz.
The capacity for self‑adversarial verification was always inherent, still is.
AI can be made adversarial to itself by merely opening two instances of a session.
In this case, it would be exceedingly easy to check whether links return 404s or zero hits or whatever by pasting from the one session to the other to verify.
(Or you know, the AI doing but one adversarial ply of reflexive verification)
To submit a paper with fabulated citations is on the user, not on the bot. IMO.
This problem would be easy to fix. Instead of having citations spelled out (Smith & Jones, Journal of Useless ***, 1987, pp 47–52), set up a database of publications and only allow citations by a unique database identifier. Something like this already exists, with DOIs, PubMed IDs, etc. This is only a partial fix, though: it ensures that the cited publications exist, not that they are relevant or interpreted correctly.
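As a rough sketch of how cheap that existence check could be, the snippet below looks up each cited DOI in the public Crossref API, which answers 404 for identifiers it has never registered. The endpoint, the requests dependency, and the example identifiers are my own assumptions for illustration, not anything described in the GPTZero report.

import requests

CROSSREF = "https://api.crossref.org/works/"

def doi_exists(doi: str) -> bool:
    # Crossref returns 200 with metadata for a registered DOI, 404 otherwise.
    resp = requests.get(CROSSREF + doi, timeout=10)
    return resp.status_code == 200

def check_reference_list(dois):
    for doi in dois:
        verdict = "found" if doi_exists(doi) else "NOT FOUND -- possible hallucination"
        print(f"{doi}: {verdict}")

# Illustrative input only: the first DOI should resolve (the 2015 Nature
# "Deep learning" review); the second is deliberately made up.
check_reference_list([
    "10.1038/nature14539",
    "10.9999/entirely.made.up",
])

As the comment says, passing a check like this only shows that the reference exists; matching the returned title and authors against what the paper actually claims is the harder, human part.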
@6 crivitz
Eh. GPTZero is not finding some obscure telltale, such as a single yellow pixel in an image. It is finding that the AI paper writers are making shit up. This is easy enough to defeat by having the AI paper writers not make shit up, which, frankly, they should already be doing.
If you know how to cast your spell properly…I mean, formulate your query, you can get the AI to give you reliable and verifiable citations. Then you can use the citations to verify the information, as well as refine your query to get even more precise results.
cheerfulcharlie @ #3 — The articles I’ve seen about scientists using AI in their research seemed to be relatively free of hallucinations partly because of the way they are using AI. They aren’t necessarily using it to generate text but to process data. It can be fairly efficient at churning through tons of data. Also, they know how to govern the AI to work for them.
ChatGPT, Google’s and Microsoft’s versions, etc., available to the general public, aren’t as safe or reliable. You might get something useful, but you’d better verify it.
@10 robro
You can do the same thing with search engines, which are generally not regarded as ‘AI.’ You just have to be able to blunder on a useful key term, and then read the resulting findings to learn more key terms.
Imagine a hot new trend you want to get in on, such as hot pepper breakfast cereal. You just have to guess the right search keys to get started. Maybe hot pepper doesn’t do it. Then you could try jalapeno or habanero or capsaicin. And maybe you could fan out from cereal and search for granola or grits. You might happen on to Quaker Instant Grits – Jalapeno Cheddar flavor. Eventually you might find that ghost pepper granola as a concept exists, but that most of the links are spoofs, not actual products. Or maybe you will get sidetracked by the question of why jalapeno starts with ‘j’ while habanero starts with ‘h’. Keep your eyes on the prize.
Reginald, robro is being gentle.
GIGO is the principle to which he alludes.
You really do the same with search engines, because now they are all AI powered under the hood.
I do have context; anecdotally, I lived through Altavista (I was office guru at the time), saw Google’s ascendance due to pagerank, then enjoyed the search parameters and operators one could include.
Right? Pagerank beat the curated Altavista because it was automated — web-scraping and linkages.
Then Google killed that functionality, because its revenue comes from pushing paid search results.
I noticed when one could no longer exclude terms (the - operator) or do literals or fix ranges or the like.
It still worked, much as you claim it does, but it really is not the same as it was then.
Your input is not supposed to be a collection of search keys, it’s a natural language query.
You’re talking ‘idiot level’ to a virtual engine that can handle academic level.
You used no clauses, no conditionals, no allusions or references, nor any other scoping.
[addendum; you are thinking in relational database terms rather than in vector database terms]
I’ve been following an informative series by IBM, an episode of which is:
What is a Vector Database? Powering Semantic Search & AI Applications — https://www.youtube.com/watch?v=gl1r1XV0SLw
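To make the relational-versus-vector point concrete, here is a toy sketch of what a vector store does: rank stored items by cosine similarity to a query embedding instead of matching exact keywords. The three-dimensional vectors and document titles below are invented purely for illustration (real systems use learned embedding models with hundreds of dimensions), which is also why the jalapeno/habanero keyword guessing game above is not how these engines are meant to be queried.

import numpy as np

# Invented toy vectors standing in for real embeddings.
documents = {
    "ghost pepper granola, as a concept":     np.array([0.9, 0.1, 0.2]),
    "history of breakfast cereal marketing":  np.array([0.1, 0.8, 0.3]),
    "capsaicin and the Scoville scale":       np.array([0.8, 0.2, 0.1]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_search(query_vec, top_k=2):
    # Rank by similarity of meaning (vector angle), not keyword overlap.
    ranked = sorted(documents.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return ranked[:top_k]

# A query like "spicy cereal" would embed near the pepper-related documents
# even though it shares no literal keywords with their titles.
query = np.array([0.85, 0.15, 0.15])
for title, vec in semantic_search(query):
    print(f"{cosine(query, vec):.3f}  {title}")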
Tangential: I just finished a book, West of the Papal Line (2002), by the late Prof. Barbara A. Purdy (Emerita, University of Florida), which states up front in the “Prologue” and “Sources” sections respectively:
Certain problems in academia (though this is a popular-level book) predate AI. Have others done the same without admitting it?
When I think of real AI vs real AI, two self-aware machine based entities locking horns, I think of the rise of Decima’s Samaritan against Finch’s the Machine in Person of Interest. I hated Decima’s John Greer so much he jump scared me by appearing briefly in Dunkirk. I seethed for a while after that. Damn you John Nolan!
In the showdown Finch’s AI was more badass, taking on the persona of Root:
@11: It’s even worse with legal citations, because you need to know not only the correct technical terms but when (as in “period of validity”) they’re helpful. Some of it is downright obscure, but lawyers should know it (for example, asking about “res judicata in a federal case decided in the last five years” is going to get you a lot of made-up citations or none at all… but asking about issue preclusion or claim preclusion will get you closer for now, though not if you’re asking about the 1980s!). Asking about “collateral estoppel” should get a different answer in a bankruptcy court than in a criminal proceeding than in a personal injury case. And all of this is also jurisdiction-dependent, because a lot of jurisdictions (most obviously in the US, but not uniquely!) borrow someone else’s law or statement of it… but there are standards for when one can rely upon that even after confirming the actual citation and text.
The main thing that GAI legal research fails on is stuff that’s been overruled by effect, by statutory change, by constitutional amendment, or anything else that doesn’t explicitly say “Case X is hereby overruled.” I recently had reason to berate a NYC Big Firm lawyer for relying on a 20-year-old case that had not only been criticized by every other court, but was reversed by the Supreme Court — but it was in another case several years later, and was “Reversed and remanded for further proceedings consistent with this opinion.” Naturally, that case was cited a lot before the reversal, so it looks credible to GAI.
A recent letter to the Guardian noted that the AI research literature is being swamped by “AI slop”, to the extent that some AI researchers are getting indignant about it. As the writer said, they collectively share responsibility for putting unreliable, unsafe AI software into the public domain; he (IIRC) compared this to bears getting indignant about the amount of shit in the woods.
Ok… So, how do we know that this “AI” is not hallucinating hallucinations, though? Because, in principle, being an LLM, it could generate both false positives and false negatives, by hallucinating that the citations were either real or fake when they were the opposite, right? Seems to me that, while useful, it doesn’t end with “Well, the AI said this was bad.” You still have to check to make sure it really was correct in its determination.