
In 1985, the TV film Max Headroom: 20 Minutes into the Future presented a science-fictional cyberpunk world where an evil media company tried to create an artificial intelligence based on a reporter's brain to generate content to fill airtime.

Replace "reporter" with "redditors," "evil media company" with "well-meaning artificial intelligence researchers," and "airtime" with "a very concerned blog post," and you've got what Ars reported about last week: Generative Pre-trained Transformer-2 (GPT-2), a Franken-creation from researchers at the non-profit research organization OpenAI.

Unlike some earlier text-generation systems based on a statistical analysis of text (like those using Markov chains), GPT-2 is a text-generating bot based on a model with 1.5 billion parameters. (Editor's note: We recognize the headline here, but please don't call it an "AI"; it's a machine-learning algorithm, not an android.) With or without guidance, GPT-2 can create blocks of text that look like they were written by humans. With written prompts for guidance and some fine-tuning, the tool could theoretically be used to post fake reviews on Amazon, fake news articles on social media, fake outrage to generate real outrage, or even fake fiction, forever ruining online content for everyone.

All of this comes from a model created by sucking in 40 gigabytes of text retrieved from sources linked by high-ranking Reddit posts. After a little reflection, the research team has concerns about the policy implications of their creation. You can only imagine how bad it would have been if the researchers had used 40 gigabytes of text from 4chan posts.

Ultimately, OpenAI's researchers kept the full thing to themselves, only releasing a pared-down 117 million parameter version of the model (which we have dubbed "GPT-2 Junior") as a safer demonstration of what the full GPT-2 model could do. Naturally, Ars wanted to do just that.
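For readers who want to poke at "GPT-2 Junior" themselves, the sketch below shows one way to feed the released small model a written prompt and sample a continuation. It assumes the Hugging Face transformers package, whose "gpt2" checkpoint corresponds to that pared-down public model; OpenAI's own release ships separate TensorFlow code, so treat this as an illustrative route rather than the official one, and note that the prompt string and sampling settings here are arbitrary choices.

    # A minimal sketch, assuming the Hugging Face "transformers" package is installed
    # and that its "gpt2" checkpoint (the small public GPT-2 model) stands in for
    # the pared-down release discussed above.
    from transformers import pipeline, set_seed

    set_seed(42)  # make the sampled continuation repeatable
    generator = pipeline("text-generation", model="gpt2")

    prompt = "In 1985, the TV film Max Headroom premiered, and"
    outputs = generator(prompt, max_length=80, do_sample=True, top_k=40)

    # The pipeline returns a list of dicts; each holds the prompt plus its generated continuation.
    print(outputs[0]["generated_text"])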
