Robots make pretty decent writers. If you don’t believe that, consider this: you’ve probably been reading computer-generated prose for years.
If you’ve ever read a Yahoo fantasy football report, a preview or recap of any NFL, MLB or NBA game, a weathercast, an insurance report or the Edmunds description of that car you wanted, you’ve likely read some of Wordsmith’s work.
Wordsmith is an artificial intelligence system that uses mounds of data, quantitative analysis and some rules about style and good writing to churn out hundreds of millions of stories every year.
A product of the Durham, North Carolina company Automated Insights, Wordsmith is a cloud-based platform dreamed up by former Cisco employee and now Automated Insights CEO Robbie Allen.
Allen's company recently made a major deal with the Associated Press to produce the venerated news organization’s earnings report stories. AP has also taken a stake in the company. Allen says the software will help the AP produce 15 times more earnings stories per quarter than it did without Wordsmith.
The first of those AP stories should arrive sometime this month. But this is by no means the first deal of its kind. Automated Insights competitor Narrative Science currently writes earnings report previews for Forbes.
All the work Wordsmith does to produce stories that can be indistinguishable from ones written by humans is driven by data. And, Allen noted, it’s “quantitative analysis, not qualitative, which is ideally suited for people.” For AP, Automated Insights pulls numbers from companies’ press releases and official earnings reports and combines them with historical financial data from Zacks, a stock research and analysis firm in Chicago.
“Typically, in order to have compelling summary, you need to have some historical data,” said Allen, adding that a certain level of data volatility (lots of new data coming in) is key when it comes to automating good stories.
Whether Wordsmith is working on a car description or a recap of the latest 14-inning marathon between the Mets and Phillies, it’s prepared to do as much or as little number crunching as necessary. The software runs on Amazon’s AWS cloud service, spinning up as many as thousands of servers for an hour or two to generate millions of stories.
Using AWS helps Automated Insights control costs. “We just bring [the servers] up for two hours and then bring them down and we just pay for those two hours,” said Allen.
Pure numbers, of course, do not make a compelling story. Allen says the company is working hard on improving Wordsmith's sentence and paragraph structure, and also teaching it more about the tone of writing. “Any time we have a new project, I tell the team the goal is to make sure it does not sound automated,” said Allen.
Wordsmith can write headlines, too — but bylines are more complicated. Some Automated Insights clients, like the AP, will use the company name as a byline — others, “clients who are equally big names,” don’t want anyone to know the content was “written” by Wordsmith. They either don’t byline the software-driven stories or use a pseudonym.
In 2013, Wordsmith produced 300 million stories, more than all the major media companies combined. In 2014, Automated Insights, which now employs 35 people, expects it to produce a billion stories. When you’re producing 5 million stories a week, asked Allen, “how do you create something that’s not repetitive?”
Wordsmith does it pretty much the way a human would: by varying story structure, using different phraseology and, where possible, incorporating historical anecdotes.
Allen, who has a computer science degree and has written extensively on artificial intelligence, was not surprised when an AI program, Eugene Goostman, fooled enough people into believing they were talking to a real person to pass a Turing test. That's because he's seen much the same thing.
When a professor did a study comparing one of Wordsmith’s NFL recaps to one written by a human, roughly half the subjects found them virtually indistinguishable. “The conclusion was that it’s hard for people to tell the difference [between AI-generated and human-written] already,” said Allen.
Automated Insights is expanding, too. It’s created sports trivia, automating thousands of questions for each major league sports team. It’s also in discussions with cities around the country to provide school performance and police crime reports.
“What we’re doing is augmenting what they were doing in the past,” said Allen. Those at the AP who were covering these companies will still cover them, but Allen thinks they may use the Wordsmith post as a “starter piece or even research material,” and then the human journalists will write something richer.
UNC School of Journalism and Mass Communication Asst. Professor Ryan Thornburg agrees. “The trend in automation should free up the best writers and best reporters to add the how and why context that still needs to be done by humans,” Thornburg told me in an email.
Automated Insights’ vision of the future is not a legion of robots writing the lengthy prose we read in everything from The New Yorker to The New York Times. Instead, Allen sees tools like Wordsmith remaking other industries. “Within 5 years, the role of data analyst and data scientist will be completely different than what it is today,” said Allen.
These data employees will shift from doing data analysis to programming the systems that power Wordsmith, which will then provide the analysis.
Thornburg, however, painted a slightly different picture. He said that while we often think of data as “numbers in tables,” content and behavior data will become a part of the mix, too. Ultimately, he says, “we will see automated stories that incorporate the user’s data and the data of her social network as well.”
Allen also noted that AI-driven news creation has the potential to be far more personalized, something traditional media and journalists cannot do at scale. Thornburg agrees, though he thinks “concierge news services” will come at a price for news consumers. It has “the potential to create a world of media haves and have nots — the haves will pay premium subscription fees to get highly personalized news from bots. The have-nots will get generic news (maybe written by bots as well).”