BBC Study Finds “AI” Chatbots Routinely Incapable Of Basic News Synopses
from the I'm-sorry-I-can't-do-that,-Dave dept
Automation can be helpful, yes. But the story told to date by large tech companies like OpenAI has been that these new large language models would be utterly transformative, world-changing, and quickly approaching some kind of sentient superintelligence. Yet time and time again, the data seems to show they’re failing to accomplish even the bare basics.
Case in point: Last December Apple faced widespread criticism after its Apple Intelligence “AI” feature was found to be sending inaccurate news synopses to phone owners. And not just minor errors: At one point Apple’s “AI” falsely told millions of people that Luigi Mangione, the man arrested following the murder of healthcare insurance CEO Brian Thompson in New York, had shot himself.
Now the BBC has done a follow-up study of the top AI assistants (ChatGPT, Perplexity, Microsoft Copilot, and Google Gemini) and found that they routinely can’t be relied on to even communicate basic news synopses.
The BBC gave all four major assistants access to the BBC website, then asked them relatively basic questions based on that content. The team found ‘significant issues’ with just over half of the answers generated by the assistants, and clear factual errors in around a fifth of them. One in ten responses either altered real quotations or made them up completely.
Microsoft’s Copilot and Google’s Gemini had more significant problems than OpenAI’s ChatGPT and Perplexity, but they all “struggled to differentiate between opinion and fact, editorialised, and often failed to include essential context,” the BBC researchers found.
The BBC’s Deborah Turness had this to say:
“This new phenomenon of distortion – an unwelcome sibling to disinformation – threatens to undermine people’s ability to trust any information whatsoever. So I’ll end with a question: how can we work urgently together to ensure that this nascent technology is designed to help people find trusted information, rather than add to the chaos and confusion?”
Large language models are useful and will improve. But this is not what we were sold. These energy-sucking products are dangerously undercooked, and they shouldn’t have been rushed into journalism, much less mental health care support systems or automated Medicare rejection systems. We once again prioritized making money over ethics and common sense.
The undercooked tech is one thing, but the kind of folks dictating its implementation and trajectory without any sort of ethical guardrails are something else entirely.
As a result, “AI’s” rushed deployment in journalism has been a Keystone Cops-esque mess. The fail-upward brunchlords in charge of most media companies were so excited to get to work undermining unionized workers, cutting corners, and obtaining funding that they immediately implemented the technology without making sure it actually works. The result: plagiarism, bullshit, a lower quality product, and chaos.
Automation is obviously useful and large language models have great potential. But the rushed implementation of undercooked and overhyped technology by a rotating crop of people with hugely questionable judgment is creating almost as many problems as it purports to fix, and when the bubble pops — and it is going to pop — the scurrying to defend shaky executive leadership will be a real treat.
Filed Under: accuracy, ai, automation, ethics, hype, journalism, large language models, news


Comments on “BBC Study Finds “AI” Chatbots Routinely Incapable Of Basic News Synopses”
These AIs try but struggle, while Trump and Musk just ignore and create even more chaos and confusion.
Even with Grok as President, it wouldn’t be worse.
sold is an awfully strong word. “scamed” would be a better word for that.
Re:
If only it were a word.
Re:
I don’t remember buying any of this either. Yet somehow, they all act like I’ve already paid for it.
Cory Doctorow warned us
Cory Doctorow, author and coiner of the word “enshittification,” made a prediction about AI and warned us that we have our eyes trained on the wrong danger.
It’s highly unlikely that any AI will take our jobs, which is what the public fears. There’s an infinitesimal chance of that happening.
The more probable danger is that your boss or the C-suite will believe the sales pitch that AI will take our jobs and choose to eliminate those jobs.
The paradox is that the decision makers in any organization are the least capable of understanding AI’s strengths and weaknesses, yet they will nonetheless make the call on the AI switch and blow it.
Newspeople Also Routinely Incapable Of Basic News Synopses
Every time someone complains about a headline, the apologists come out to say that headlines are written by different people than the articles. Anyway, the problem goes back decades: they’re often inaccurate synopses, sometimes even contradicting the story.
Re:
Call me an apologist if it makes you feel better or smarter or whatever, but if you think the headline is supposed to be a synopsis you are simply wrong.
It’s the ‘advertisement’ for the article; the thing that’s supposed to draw you in and get you to click.
That’s why we have the term ‘clickbait’: it’s the headline/thumbnail/etc. doing its work, just disingenuously.
Re: Re:
Okay, but false advertising is still generally considered bad. More so when done by an organization that claims to be devoted to informing the public of facts.
As for the word “synopsis”, the definitions vary a bit. The Greek roots mean “whole view”, which is a bit much to expect from a headline. It should, at least, be a correct view, even if it doesn’t cover literally every point the article makes.
The reader should never feel deceived by a headline; that’s why “clickbait” is a pejorative term.
Re: Re: Re:
Right, but then you are making an entirely different argument. The context was commentary equating headlines and synopses. If you aren’t equating headlines and synopses, yellow AC’s commentary clearly doesn’t apply.
Re: Re: Re:2
Not so much “equating” as saying that a headline is (or should be) a type of synopsis. “A brief summary of the major points of a written work.”
Re: the problem goes back decades
The problem is inaccurate ai summaries of news articles.
The problem does not go back decades.
What the fuck are you on about?
"Easy Problems That LLMs Get Wrong"
In case you haven’t seen it, here’s a paper exploring AI fallibility. Have a look at the results in section 4.2 and the discussion in section 5. Very funny and confounding.
https://arxiv.org/html/2405.19616v1
the problem with LLMs
It is terrible at reporting facts, but really good at sounding like it knows what it’s talking about. And that is exactly the kind of thing people fall for.
Re:
Sounds like a certain President.
Re: Give culture war a chance
LLM sounds like a three-letter abbreviation the magavolk should be declaring war on.
Re: Re:
Not until they start thinking ‘AIs’ are non-white and/or queer
Gotta go curate some very specific training datasets…
So like actual “journalism” that comes out of the UK then?
Re:
FYI, the Washington Post and New York Times are American papers.
Re: Re:
FYI, the BBC, in the context of journalism, means the British Broadcasting Corporation, which is the UK journalism connection the joke is playing on. This does not read as a commentary on WP or NYT.
Re: Re: Re:
*whooooooooooosh!*
Re: Re: Re:
Clearly, you missed some current events last year because you totally didn’t get the joke.
That’s not too far off from reading the official headlines, apart from lying in direct quotes, which is admittedly more of a low-brow tabloid activity.
And that’s just the same as reading the actual article.
We could discuss whether the stochastic lies of the LLMs are better or worse than the more considered lies of the BBC, I guess. And there’s certainly an argument to be made about capitalism more generally, which you almost touched on.
But otherwise it appears that LLMs trained to mimic humans do, in fact, mimic humans. And the humans really don’t like that.
Re:
“considered lies of the BBC”
You got any examples of “considered” lying from BBC News? BBC News is the most trusted news source in the UK, and indeed is highly trusted around the world. It is monitored by Ofcom and is accountable to them.
It is also non-commercial. No sponsors or rich corporate owners.
So again, please, tell me of these “considered lies” of which you speak.
Re: Re:
Consider whose war crimes and atrocities are talked about, and when and how they are talked about. Who is called a terrorist and who is not. Militant. Regime. Authoritarian. Extremist. Unrest. Who is dead and who is killed. Who experiences ethnic or religious violence and who does not. Who experiences oppression and who does not.
Which countries organize disinformation campaigns, and which just need to be “fact-checked” on the exact same thing over and over in completely unrelated incidents which definitely are not campaigns or disinformation (or whose lies are just quoted without commenting on their long history of lying about that subject).
Re: Re: Re:
You’ve used a lot of words there to basically say “no, I don’t have any real examples”.
Re: Re:
But a distinct bias toward the Reich Wing government that does own it. I say this as an occasional listener to the World Service.
Remember what LLM/AI is actually for
Which is to output what “sounds like” a good conversation. It doesn’t actually need to be accurate, or engaging, or relevant. It just needs to sound like what the dweeb devs think is a good conversation.
Think of LLM/chat interactions like coming into a conversation at a party. Everyone is relaxed, drink in hand, we are all friends here. Someone poses a question to one of the men. He has never heard of the subject, but he will gladly give you his heartfelt opinion that is all of a few milliseconds old.
Now, what happens when anyone (or worse, a woman) calls him out on his clearly BS reply? The man will either ignore the call out, or he will double down. To the point of making stuff up. Give him ten minutes and it will be in Wikipedia. It is like he hallucinates an answer that will make his first answer seem real.
The conversation will sound like a good conversation, it just won’t be one. Now, give THAT tech to suits that do not, or cannot, understand the actual content. Bingo. That’s where we are now.
There is only one place where AI would be absolutely perfect, yet to date, I have read nothing that pertains to that one place – games like Skyrim and Oblivion, and Fallout 4, where the entire object of the game is to interact with others, albeit usually through violence. 🙂
If the ‘actors’ had an AI brain, the whole game would be more or less random as far as actual event order and quest acceptance go. Every new session would be unique.
I am pretty sure that would sell well.