The writing of this AI is so human that its creators are scared to release it

A new text generator driven by artificial intelligence writes prose that can fool humans into believing that it is authentic. And that has dangerous repercussions when it comes to the mass production of disinformation.

OpenAI's new system, called GPT-2, is described as 'chameleon-like,' matching both subject and style

A new text generator driven by artificial intelligence is apparently so good that its creators have decided not to make it publicly available.

The tool was created by OpenAI, a non-profit research firm founded with the mission of "discovering and enacting the path to safe artificial general intelligence." Its backers include Elon Musk, Peter Thiel and Reid Hoffman.

But now OpenAI is concerned that what its well-intentioned researchers have built could easily be misused and would be dangerous in the wrong hands.

Trained on eight million web pages, OpenAI's new system, called GPT-2, is billed as the next generation of predictive text. The AI is said to write authentic-sounding prose that could fool humans, which has dangerous repercussions when it comes to the mass production of disinformation.

Feed it sample content — be it a few words, or a few pages — and the AI will write what comes next, with a coherent, plausible passage that matches both the subject and the style of the source material.

"The model is chameleon-like — it adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing," the researchers wrote in explaining why they weren't releasing the tool.

So while the quality of the output is impressive — it largely lacks the bugs and mistakes that have been routine with previous efforts at predictive text — the real novelty of the GPT-2 system is the wide range of content it is capable of creating and, in turn, its variety of potential uses.

From fiction to news

According to the researchers, the text generator is able to simulate the style of anything from classical works of fiction to news stories, depending on what it is fed.

In one example, the system was prompted with the opening line of George Orwell's Nineteen Eighty-Four: "It was a bright cold day in April, and the clocks were striking thirteen."

Following suit, the AI wrote: "I was in my car on my way to a new job in Seattle. I put the gas in, put the key in, and then I let it run. I just imagined what the day would be like. A hundred years from now."

In another example, researchers fed the system what sounded like a plausible news headline — and the AI generated content to match its tone and style.

Clearly trying to avoid any political fire with their sample news story, the researchers entered the following prompt: "In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English."

The system then generated an article that went on to say: "The scientist named the population, after their distinctive horn, Ovid's Unicorn. These four-horned, silver-white unicorns were previously unknown to science. ... While their origins are still unclear, some believe that perhaps the creatures were created when a human and a unicorn met each other in a time before human civilization."

Two things are evident: This AI is very good at matching tone, style and content. And while the content it generates sounds quite believable, very little of what it says is actually … true.

"It predicts word combinations within contexts of use, which makes it seem more credible. Of course, the samples also produce nonsensical passages," explained Isabel Pedersen, the director of the Decimal Lab at the University of Ontario Institute of Technology.
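To make the idea of predicting "what comes next" concrete, here is a toy sketch in Python. It is not GPT-2 and not OpenAI's method: it only counts which word follows which in a tiny sample text, then extends a prompt by picking the most frequent follower. The function names and the sample text are invented for this illustration.

```python
from collections import Counter, defaultdict


def build_model(text):
    """Count, for every word, which words follow it in the sample text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def continue_text(model, prompt, length=5):
    """Extend the prompt by repeatedly appending the most common next word."""
    output = prompt.lower().split()
    if not output:  # nothing to continue from
        return ""
    for _ in range(length):
        candidates = model.get(output[-1])
        if not candidates:  # the last word was never seen with a follower
            break
        # most_common(1) returns the single most frequent follower
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


# Hypothetical training text, made up for this example.
sample = "the clocks were striking thirteen and the clocks were striking again"
model = build_model(sample)
print(continue_text(model, "the clocks", length=2))  # prints: the clocks were striking
```

This toy looks back only one word and has seen only one sentence, so it can do little more than parrot its source. A system trained on eight million web pages has far more material to draw on, which is what allows the style matching described above.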

Mass production of disinformation

And therein lies the crux of the moral dilemma.

This boundary-pushing piece of software is essentially a tool for the mass production of disinformation: Content that looks and sounds believable, with all the trappings of a legitimate news source — but with no real validity.

That means an article written by the AI might look and sound like something that would come from CBC News, the Guardian or the New York Times — even be brimming with divisive political content — and yet be completely fabricated, down to made-up quotes.

It's that blurring of lines that is concerning: Some elements of the content will be rooted in reality (names of politicians or events, for example), depending on what has been fed into the system. And yet a quote from that named person might be entirely computer-generated and baseless.

And as we have recently seen, people can be easily duped by fake news.

"Last year revealed significant examples of election-hacking and malicious campaigns to incite chaos while people are simply trying to participate in democratic exchange, the lifeblood of civil society," said Pedersen.

And because of that potential for misuse, the researchers at OpenAI say they've declined to release GPT-2 to the public.

"AI that can manufacture seemingly authentic fake news — effectively mimicking tone and style in mass quantities — is very concerning, and I can see why there is a reluctance to deploy it," said David Ryan, an executive vice-president with Edelman Canada, the company behind the Trust Barometer, an annual report that gauges public trust in different institutions and media.

"If this tool is misused, the mass proliferation of false information risks drowning out legitimate news and makes the struggle for truth all that more difficult," he said. 

And with a federal election on the horizon, the reach of disinformation is on Canadians' minds.

According to the latest Trust Barometer, 71 per cent of Canadians are concerned about the weaponization of so-called "fake news."

Another recent study shows that the majority of Canadians think Facebook will negatively impact the election, largely due to its track record of contributing to the spread of targeted — and often fabricated — headlines.

While AI may exacerbate the spread of disinformation, Ryan says the solution isn't necessarily more technology.

"Ultimately, if we are going to limit the impact of fake news, people will need to change their media consumption habits," he said. 

Rather than looking to technology, Ryan believes the onus is on individuals to step out of their personal echo chambers and subscribe to newsfeeds that span the political and ideological spectrum.

"We are too often spoon-fed news that confirms our personal bias — it's human nature. But it's this type of behavior that lets fake news take hold and have an impact."

About the Author

Ramona Pringle

Technology Columnist

Ramona Pringle is an associate professor in the Faculty of Communication and Design and director of the Creative Innovation Studio at Ryerson University. She is a CBC contributor who writes and reports on the relationship between people and technology.
