The cliché writes back

Machine writing is closer to literature's history than you know | Aeon Essays
America’s Most Wanted (1994) by Vitaly Komar and Alexander Melamid. Photo by Robert Lachman/Los Angeles Times via Getty Images

Machine-written literature might offend your tastes, but until the dawn of Romanticism most writers were just as formulaic

Yohei Igarashi is associate professor of English and associate director of the Humanities Institute at the University of Connecticut. He is the author of The Connected Condition: Romanticism and the Dream of Communication (2019).

Edited by Marina Benjamin

Since its inception in 2015, the research laboratory OpenAI – an Elon Musk-backed initiative that seeks to build human-friendly artificial intelligence – has developed a series of powerful ‘language models’, the latest being GPT-3 (third-generation Generative Pre-trained Transformer). A language model is a computer program that simulates human language. Like other simulations, it hovers between the reductive (reality is messier and more unpredictable) and the visionary (to model is to create a parallel world that can sometimes make accurate predictions about the real world). Such language models lie behind the predictive suggestions for emails and text messages. Gmail’s Smart Compose can complete ‘I hope this …’ with ‘… email finds you well’. Instances of automated journalism (sports news and financial reports, for example) are on the rise, while explanations of the benefits from insurance companies and marketing copy likewise rely on machine-writing technology. We can imagine a near future where machines play an even larger part in highly conventional kinds of writing, but also a more creative role in imaginative genres (novels, poems, plays), even computer code itself.

Today’s language models are given enormous amounts of existing writing to digest and learn from. GPT-3 was trained on about 500 billion words. These models use AI to learn what words tend to follow a given set of words, and, along the way, pick up information about meaning and syntax. They estimate likelihoods (the probability of a word appearing after a prior passage of words) for every word in their vocabulary, which is why they’re sometimes also called probabilistic language models. Given the prompt ‘The soup was …’ the language model knows that ‘… delicious’, ‘… brought out’ or ‘… the best thing on the menu’ are all more likely to ensue than ‘… moody’ or ‘… a blouse’.
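The idea of estimating which word is likely to come next can be illustrated with a toy sketch. What follows is not how GPT-3 works: GPT-3 is a large neural network that conditions on a long passage, whereas this example simply counts which word follows a single preceding word (a 'bigram' model). The five-sentence corpus and the function names are invented for illustration only.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (an assumption for illustration only; real models
# are trained on hundreds of billions of words).
corpus = [
    "the soup was delicious",
    "the soup was delicious",
    "the soup was brought out",
    "the salad was delicious",
    "the waiter was moody",
]


def train_bigram_counts(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev_word, next_word in zip(words, words[1:]):
            counts[prev_word][next_word] += 1
    return counts


def next_word_probabilities(counts, prev_word):
    """Turn raw follower counts into relative frequencies (estimated likelihoods)."""
    followers = counts.get(prev_word, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the word was never seen with a follower
    return {word: n / total for word, n in followers.items()}


counts = train_bigram_counts(corpus)
probs = next_word_probabilities(counts, "was")

# List the candidate next words, most likely first.
for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"was -> {word}: {p:.2f}")
```

In this miniature corpus, 'delicious' follows 'was' more often than 'moody' does, so it receives the higher estimate. A word that never appears after 'was', such as 'blouse', gets no probability at all, which is one reason real models rely on far more data and on more sophisticated methods than counting.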

Behind recent speculation, such as Stephen Marche’s piece in The New Yorker, that student essays will never be the same after GPT-3, or that the program will once and for all cure writer’s block, is a humbler fact. The main technical innovation of GPT-3 isn’t that it’s a great writer: it is that the model is adaptable across a range of specific tasks. OpenAI published the results of GPT-3’s performance on a battery of tests to see how it fared against other language models customised for tasks, such as correcting grammar, answering trivia questions or translation. As an all-purpose model, GPT-3 does as well as, or better than, other models, especially when prompted by a few instructive examples. If ‘sea otter’ is ‘loutre de mer’ and ‘peppermint’ is ‘menthe poivrée’ then ‘cheese’ in French is …? Being versatile helps GPT-3 produce writing that looks plausible. It takes in its stride even made-up words it hasn’t seen before. When told that ‘to screeg’ something is ‘to swing a sword at it’ and asked to use that word in a sentence, the computer composes a respectable answer: ‘We screeghed at each other for several minutes and then we went outside and ate ice-cream.’
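Prompting 'by a few instructive examples' amounts to placing solved cases in front of the unsolved one and letting the model continue the pattern. The sketch below only assembles such a prompt as text; it does not call any model. The function name, the instruction line and the '=>' separator are assumptions chosen for illustration, not a format the essay specifies.

```python
def build_few_shot_prompt(examples, query, instruction="Translate English to French:"):
    """Assemble a prompt: an instruction, some solved examples, then the open query."""
    lines = [instruction]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    # The final line is left unfinished; a language model would be asked
    # to continue the text from here.
    lines.append(f"{query} =>")
    return "\n".join(lines)


# The two worked examples mentioned in the essay.
examples = [
    ("sea otter", "loutre de mer"),
    ("peppermint", "menthe poivrée"),
]

prompt = build_few_shot_prompt(examples, "cheese")
print(prompt)
```

The point of the design is that nothing about the model changes between tasks: swapping the instruction and the examples turns the same prompt-builder into a grammar-correction or trivia prompt.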

Reactions to GPT-3 have been circulating in tech circles and the mainstream media alike. There are concerns about the datasets on which the model was trained. Most of the half-trillion words lending GPT-3 its proficiency come from internet writing ‘crawled’ and archived by bots, with a relatively small portion coming from books. The result is that machines can output hateful language, replicating the kind of writing from which they’ve learned. Another line of criticism points out that such models have no actual knowledge of the world, which is why a language model, in possession of a great deal of information about word-sequence likelihoods, can write illogical sentences. It might suggest that if you’re a man needing to appear in court, but your suit is dirty, you should go in your bathing suit instead. Or, while it knows that cheese in French is fromage, it doesn’t know that cheeses don’t typically melt in the fridge. This sort of knowledge, on which language models are tested, is called ‘common-sense physics’. This sounds oxymoronic, but is appropriate when you consider that the inner workings of these deep learning-based models, though basic, also seem utterly mysterious…


F. Kaskais Web Guru
