Dr Gina Helfrich, Baillie Gifford Programme Manager for the Centre for Technomoral Futures at Edinburgh Futures Institute, explains why large language models should be handled with care
Unless you have been living under a stone, you will have heard about the new software ChatGPT, which can answer your questions and your customers’, as well as write your emails and your project reports.
Maybe you’re excited by the possibilities it offers and its aura of ‘the future is here.’ Could a robot that cleans your house and acts as your PA, or even your friend, be next? Or maybe you’re worried you could be out of a job!
Before you get too carried away, here are some of the reasons why ChatGPT may not be the Next Big Thing, and why businesses and organisations should handle it with care—if at all.
Let’s first look at what ChatGPT actually is. In the artificial intelligence (AI) world, it’s called a ‘large language model’ (LLM), which basically means a very large-scale predictive text machine. Like any AI tool, it is essentially software that spots patterns in a huge amount of data. In this case, that data is text: words and phrases written all over the internet. To produce the impressive fluency of a system like ChatGPT, AI software needs to be fed data, loads of it, until it ‘learns’ enough to predict which word is likely to come next.
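For the technically curious, the core idea of next-word prediction can be sketched in a few lines of Python. This is a toy illustration only: real LLMs use neural networks trained on billions of examples, not simple word counts, but the principle of ‘predict what usually comes next’ is the same.

```python
from collections import Counter, defaultdict

# A tiny "training corpus" standing in for the internet-scale text
# that a real large language model would be fed.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which in the corpus.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    """Return the word that most often followed `word` in training."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # 'cat' - it follows 'the' most often here
```

Notice that the program has no idea what a cat is; it simply reproduces the statistical pattern in its data, which is the article’s point.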
ChatGPT has been specifically trained to reproduce dialogue, which gives its outputs a conversational style that can sound eerily like a person. But don’t be fooled. Humans are meaning-makers, so we tend to act as if the AI system is a real consciousness we are talking to, in the same way we might shout at a malfunctioning printer. In reality, language models are like a parrot: they can speak our language to us, but that doesn’t mean the parrot or the AI system understands what it’s saying.
Not only do AI systems lack understanding of the meaning of the words they output, they also do not grasp causality or logic. An LLM doesn’t know that 2 + 2 = 4; it has only learned that, statistically, ‘4’ is what most often follows the text ‘2 + 2 =’ in its training data. I once posed a basic logic problem to ChatGPT: “If it is raining, then the streets are wet. The streets are wet. Is it raining?” The correct answer is ‘maybe’, because the streets may be wet for reasons other than rain (perhaps a burst pipe, or a tsunami). ChatGPT, however, confidently told me that yes, it was raining. ChatGPT is a machine looking for and reproducing patterns in text; it has no common-sense reasoning, no context beyond its training data, no understanding of the world.
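The ‘wet streets’ puzzle is a textbook case of the fallacy of affirming the consequent. A few lines of Python (a toy sketch, not anything ChatGPT itself does) can enumerate the possibilities and show why the honest answer is ‘maybe’:

```python
from itertools import product

# Enumerate all possible worlds in which BOTH the rule
# "if it is raining, the streets are wet" holds (not raining, or wet)
# AND the observation "the streets are wet" is true.
consistent_worlds = [
    (raining, wet)
    for raining, wet in product([True, False], repeat=2)
    if ((not raining) or wet) and wet
]

print(consistent_worlds)  # worlds with rain AND without rain survive
```

Both a raining world and a non-raining world remain consistent with the evidence, so rain cannot be concluded; a system that answers a flat ‘yes’ is pattern-matching, not reasoning.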
To get enough data to predict text effectively, the makers of LLMs scrape virtually the entire internet, gobbling up the words indiscriminately. ChatGPT has therefore ingested the horrible, odious content of the internet alongside the beautiful. To stop it blurting obscenities, the software developers must build a content-filter layer that tells the system what not to write. But content moderation for AI tools is decidedly imperfect.
For example, when asked to write a piece of code to assess creditworthiness, ChatGPT quickly advised extending more credit to white applicants than to applicants who were Black, Asian, or Hispanic. The AI machine devours all the human stereotypes and biases of the internet in its hungry mouth (see – I’m anthropomorphising!).
Another filter LLMs lack is one for truth versus fiction. ChatGPT may reference a journal article but, unless you search for it independently, you have no way of knowing whether that article is real; many of its cited sources are total fabrications! In a search engine like Google or DuckDuckGo, by contrast, you can quickly check the linked source.
Even the developers don’t know why LLMs generate the words that they do. There is no visibility into the vast number of transformations the data undergoes from initial input to final output, which is why we talk about ‘black boxes’ in AI – we can’t open them, or know why they select certain outputs over others.
A Time magazine investigation revealed that OpenAI, the makers of ChatGPT, employed workers in Kenya on $2/hour to build its ‘avoid obscenities’ filter. This meant people working nine hours a day selecting written horrors of all kinds – child sex abuse, torture, suicide – so that the users of ChatGPT would not have to encounter said horrors themselves.
OpenAI is not unique in outsourcing this kind of work to the world’s poorest countries. Unless the makers of LLMs start to do things very differently, artificial intelligence will not only absorb and reproduce our biases, it will also reproduce and reinforce the world’s inequalities.
OpenAI has not disclosed how much energy it takes to run ChatGPT, or whether it uses green energy sources, but CEO Sam Altman tweeted in December that the ‘compute costs are eye watering’. The implication is that ChatGPT requires a LOT of energy. If we’re at all serious about reducing carbon emissions, and we must be, this should give us pause.
All these issues spell reputational and operational risk for businesses, especially those who wish to be responsible.
Part of my work at the University of Edinburgh’s Centre for Technomoral Futures is to consider how to moderate the risks of artificial intelligence tools so that they serve people and organisations well. One important component is honest dialogue about how AI tools actually work, rather than accepting every new AI application as a sign of progress and inevitability. Another is frank conversation about the real trade-offs these systems entail.
In short, we should resist the allure of AI for AI’s sake – for our own sakes.
A version of this article was first published in The Scotsman