Why Computers Can’t Write Novels…Yet

Computers can calculate millions of equations in nanoseconds (I’m assuming) and allow us to communicate instantly with people on the other side of the world or even out in space. They can be programmed to know and/or learn more words than you or I would ever be able to incorporate into our vocabularies. But can they write books?

Well, yes, depending on what kind of book you’re looking for. One man, marketing professor and economist Phillip M. Parker, has patented a computer algorithm that can produce a 100-plus-page book on a specific subject in as little as 20 minutes. His computer system has reportedly “authored” hundreds of thousands of books over the last decade—a search on Amazon reveals more than 100,000 books under Parker’s name and another 700,000 attributed to his company, ICON Group International.

These are not exactly books you’d find on the front table at your local bookstore. Some of the hotter titles include Webster’s Slovak–English Thesaurus Dictionary, The 2007-2012 World Outlook for Wood Toilet Seats, The World Market for Rubber Sheath Contraceptives (Condoms): A 2007 Global Trade Perspective, The Official Patient’s Sourcebook on Acne Rosacea, and The 2007-2012 Outlook for Tufted Washable Scatter Rugs, Bathmats and Sets That Measure 6-Feet by 9-Feet or Smaller in India.

Essentially, Parker’s formula mimics the process experts would use in accumulating and organizing data on a very specific topic:

In truth, many nonfiction books — like news articles — often fall into formulas that cover the who, what, where, when, and why of a topic, perhaps the history or projected future, and some insight. Regardless of how topical information is presented or what comes with it, the core data must be present, even for incredibly obscure topics.

In other words, computers are excellent compilers of any information that is available to them. And they can string together elaborate sentences, paragraphs, and well-structured chapters.

But, alas, they can’t make small talk.

And making small talk, it seems, may be the true sign of intelligence, or at least of artificial intelligence. That seems to be the current challenge facing Watson, IBM’s “cognitive system” best known for being significantly better at Jeopardy! than any human contestant. But game shows have right and wrong answers, unlike living language:

Humans talk funny. We invent words. We smash words together, tear them apart, abbreviate them one way, then another. Which is great and fun, if you’re a human. Not so great if you are a machine or the kind of human who programs machines to understand language.

Fortunately, it seemed, Watson’s programmers thought they’d found a way to bridge the gap between the best computer science and…well…the rest of us.

Two years ago, [IBM research scientist Eric] Brown attempted to teach Watson the Urban Dictionary. The popular website contains definitions for terms ranging from Internet abbreviations like OMG, short for “Oh, my God,” to slang such as “hot mess.”

But Watson couldn’t distinguish between polite language and profanity — which the Urban Dictionary is full of. Watson picked up some bad habits from reading Wikipedia as well. In tests it even used the word “bullshit” in an answer to a researcher’s query.

Ultimately, Brown’s 35-person team developed a filter to keep Watson from swearing and scraped the Urban Dictionary from its memory.

This particular bit of word news resonated with me, perhaps because our 19½-month-old daughter has entered the phase of early language development in which she eagerly repeats words she hears without regard for (or awareness of) their meaning or appropriateness. That is, it’s that time when my wife and I have to constantly watch what we say. (One near miss: after I stubbed my toe, my daughter loudly and happily yelled out the word she thought I had just blurted: “Duck!” I may not be that lucky next time.)

But I digress. And just as children eventually grow to take control of the language they inherit, it is only a matter of time before computers can churn out original works, if not Pulitzer Prize winners. Certainly much of human fiction is formulaic enough to be within reach of a computer with the right algorithm. In fact, Parker, he of the book-by-computer patent, already has his sights set on a new genre: romance novels. I’m not a fan, myself, but if the words and plots are as interchangeable as the cover images, why wouldn’t it work?

If only they had let Watson compose at least one romance novel—before scrubbing it clean of Urban Dictionary. Now, that would be worth reading.

