BBC Study Finds “AI” Chatbots Routinely Incapable Of Basic News Synopses

from the I'm-sorry-I-can't-do-that,-Dave dept

Automation can be helpful, yes. But the story told to date by large tech companies like OpenAI has been that these new large language models would be utterly transformative, utterly world-changing, and quickly approaching some kind of sentient superintelligence. Yet time and time again, the data shows they’re failing to accomplish even the bare basics.

Case in point: Last December Apple faced widespread criticism after its Apple Intelligence “AI” feature was found to be sending inaccurate news synopses to phone owners. And not just minor errors: At one point Apple’s “AI” falsely told millions of people that Luigi Mangione, the man arrested following the murder of healthcare insurance CEO Brian Thompson in New York, had shot himself. 

Now the BBC has done a follow-up study of the top AI assistants (ChatGPT, Perplexity, Microsoft Copilot, and Google Gemini) and found that they routinely can’t be relied on to accurately convey even basic news synopses.

The BBC gave all four assistants access to the BBC website, then asked them relatively basic questions based on its reporting. The team found ‘significant issues’ with just over half of the answers the assistants generated, and clear factual errors in around a fifth of them. One in ten responses either altered real quotations or made them up completely.
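
To make that setup concrete, here is a minimal sketch of the kind of test the BBC describes: hand a model the text of an article, ask it a basic question, and check whether anything it presents in quotation marks actually appears in the source. This uses the OpenAI Python SDK; the model name, the prompt wording, and the naive quote check are illustrative assumptions on our part, not the BBC’s actual methodology.

    import re
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask_about_article(article_text: str, question: str) -> str:
        # Ask the model a question, constrained to the supplied article text.
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # assumed model; the study covered four different assistants
            messages=[
                {"role": "system",
                 "content": "Answer using only the article provided. "
                            "Quote it verbatim when quoting."},
                {"role": "user",
                 "content": f"Article:\n{article_text}\n\nQuestion: {question}"},
            ],
        )
        return response.choices[0].message.content

    def quotes_are_verbatim(answer: str, article_text: str) -> bool:
        # Naive check for the study's most clear-cut failure mode: text
        # presented in quotation marks that never appears in the source.
        quoted = re.findall(r'"([^"]+)"', answer)
        return all(q in article_text for q in quoted)

Even a crude check like this catches the altered-quotation failures; the subtler problems the BBC flagged (editorializing, missing context) required human reviewers.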

Microsoft’s Copilot and Google’s Gemini had more significant problems than OpenAI’s ChatGPT and Perplexity, but they all “struggled to differentiate between opinion and fact, editorialised, and often failed to include essential context,” the BBC researchers found.

BBC’s Deborah Turness had this to say:

“This new phenomenon of distortion – an unwelcome sibling to disinformation – threatens to undermine people’s ability to trust any information whatsoever. So I’ll end with a question: how can we work urgently together to ensure that this nascent technology is designed to help people find trusted information, rather than add to the chaos and confusion?”

Large language models are useful and will improve. But this is not what we were sold. These energy-sucking products are dangerously undercooked, and they shouldn’t have been rushed into journalism, much less mental health care support systems or automated Medicare rejection systems. We once again prioritized making money over ethics and common sense.

The undercooked tech is one thing, but the kind of folks in charge of dictating its implementation and trajectory without any sort of ethical guard rails are something else entirely.

As a result, “AI’s” rushed deployment in journalism has been a Keystone Cops-esque mess. The fail-upward brunchlords in charge of most media companies were so excited to get to work undermining unionized workers, cutting corners, and obtaining funding that they immediately implemented the technology without making sure it actually works. The result: plagiarism, bullshit, a lower-quality product, and chaos.

Automation is obviously useful and large language models have great potential. But the rushed implementation of undercooked and overhyped technology by a rotating crop of people with hugely questionable judgement is creating almost as many problems as it purports to fix, and when the bubble pops — and it is going to pop — the scurrying to defend shaky executive leadership will be a real treat.



Comments on “BBC Study Finds “AI” Chatbots Routinely Incapable Of Basic News Synopses”

Bobson Dugnutt (profile) says:

Cory Doctorow warned us

Author Cory Doctorow, coiner of the word “enshittification,” made a prediction about AI and warned us that we have our eyes trained on the wrong danger.

It’s highly unlikely that any AI will take our jobs. This is what the public fears. There’s an infinitesimal chance of this happening.

The more probable danger is that your boss or the C-suite will believe the sales pitch that AI will take our jobs and choose to eliminate those jobs.

The paradox is that the decision makers in any organization are the least capable of understanding AI’s strengths and weaknesses, yet will nonetheless make the call on the AI switch and blow the call.

Anonymous Coward says:

Re:

Call me an apologist if it makes you feel better or smarter or whatever, but if you think the headline is supposed to be a synopsis you are simply wrong.
It’s the ‘advertisement’ for the article; the thing that’s supposed to draw you in and click.
That’s why we have the term ‘clickbait’: it’s the headline/thumbnail/etc. doing its job, just disingenuously.

Anonymous Coward says:

Re: Re:

if you think the headline is supposed to be a synopsis you are simply wrong. It’s the ‘advertisement’ for the article

Okay, but false advertising is still generally considered bad. More so when done by an organization that claims to be devoted to informing the public of facts.

As for the word “synopsis”, the definitions vary a bit. The Greek roots mean “whole view”, which is a bit much to expect from a headline. It should, at least, be a correct view, even if it doesn’t cover literally every point the article makes.

The reader should never feel deceived by a headline; that’s why “clickbait” is a pejorative term.

Anonymous Coward says:

The team found ‘significant issues’ with just over half of the answers the assistants generated, and clear factual errors in around a fifth of them. One in ten responses either altered real quotations or made them up completely.

That’s not too far off from reading the official headlines, apart from the lying in direct quotes, which is admittedly more of a low-brow tabloid activity.

they all “struggled to differentiate between opinion and fact, editorialised, and often failed to include essential context”

And that’s just the same as reading the actual article.

We could discuss whether the stochastic lies of the LLMs are better or worse than the more considered lies of the BBC, I guess. And there’s certainly an argument to be made about capitalism more generally, which you almost touched on.

But otherwise it appears that LLMs trained to mimic humans do, in fact, mimic humans. And the humans really don’t like that.

Captain Spicy says:

Re:

“considered lies of the BBC”

You got any examples of “considered” lying from BBC News? BBC News is the most trusted news source in the UK and is indeed highly trusted around the world. It is regulated by Ofcom and accountable to it.

It is also non-commercial. No sponsors or rich corporate owners.

So again, please, tell me of these “considered lies” of which you speak.

Anonymous Coward says:

Re: Re:

Consider whose war crimes and atrocities are talked about, and when and how they are talked about. Who is called a terrorist and who is not. Militant. Regime. Authoritarian. Extremist. Unrest. Who is dead and who is killed. Who experiences ethnic or religious violence and who does not. Who experiences oppression and who does not.

Which countries organize disinformation campaigns, and which just need to be “fact-checked” on the exact same thing over and over in completely unrelated incidents which definitely are not campaigns or disinformation (or whose lies are just quoted without commenting on their long history of lying about that subject).

Darkness Of Course (profile) says:

Remember what LLM/AI is actually for

Which is to output what “sounds like” a good conversation. It doesn’t actually need to be accurate, or engaging, or relevant. It just needs to sound like what the dweeb devs think is a good conversation.

Think of LLM/chat interactions like coming into a conversation at a party. Everyone is relaxed, got a drink in their hand, we are all friends here. Someone poses a question to one of the men. He has never heard of the subject, but he will gladly give you his heartfelt opinion that is all of a few milliseconds old.

Now, what happens when anyone (or worse, a woman) calls him out on his clearly BS reply? The man will either ignore the call out, or he will double down. To the point of making stuff up. Give him ten minutes and it will be in Wikipedia. It is like he hallucinates an answer that will make his first answer seem real.

The conversation will sound like a good conversation, it just won’t be one. Now, give THAT tech to suits that do not, or cannot, understand the actual content. Bingo. That’s where we are now.

Anonymous Coward says:

There is only one place where AI would be absolutely perfect, yet to date I have read nothing that pertains to it: games like Skyrim, Oblivion, and Fallout 4, where the entire object of the game is to interact with others, albeit usually through violence. 🙂
If the ‘actors’ had an AI brain, the whole game would be more or less random as far as actual event order and quest acceptance go. Every new session would be unique.

I am pretty sure that would sell well.
