Machine-created books – art or industry?

The Oddest Book Title of the Year prize has been won by Prof. Philip M Parker, for his book “The 2009-2014 World Outlook for 60-Milligram Containers of Fromage Frais”.

Parker has “written” over 200,000 books, and yes, there is a reason why I’ve put written in quotes. Parker has, apparently, invented a machine which, as the Guardian says, writes books, creating them from Internet and database searches in order to eliminate or substantially reduce “the costs associated with human labour, such as authors, editors, graphic artists, data analysts, translators, distributors and marketing personnel”. Not to mention the complete elimination of originality and talent, or anything else normally associated with the writing of a book. Oh, and the fromage frais thing will set you back a cool $795 – that’s a lot to shell out to find out if it’s machine-produced rubbish.

Parker’s other books have similarly ambitious price-tags. “The 2007-2012 Outlook for Lemon-Flavoured Bottled Water in Japan” comes in at $495, and “The 2007 Import and Export Market for Household Refrigerators in Czech Republic” at $112. Such prices are either an expression of extreme confidence in the product or a piss-take.

Parker also had another fromage-frais related book in the competition, “The 2007-2012 World Outlook for Fruity Fromage Frais”, which runs to an extremely modest 186 pages and which Amazon UK will be happy to sell to you for a mere £958.98 (or £917.17 second-hand). Good luck with that…

Interestingly, this gizmo seems to be pretty undiscriminating in its trawling of the Internet for info (and is it me, but doesn’t that represent a massive, multiple breach of copyright?), as there’s no such thing as a 60mg container of fromage frais – not outside a laboratory, anyway, and maybe not even there – the smallest I can find is 42g. If the 200,000, and more, titles are as useful and meaningful as this one, it would seem to be a machine for the dissemination of pointless drivel.

I think it’s a deeply flawed concept – not that Parker makes any secret of the machine’s use, but that doesn’t make it right – because the work is not his. Though I suppose he programmes the thing, and takes the pages from the printer for the draft. Then there’s the prices – if an academic – or even a lesser mortal – had sweated blood putting these books together, then you might, given the subject matter, think yourself lucky to get a price in the higher reaches of double figures, but machines don’t sweat blood, and these prices are absurd, though the fact that Amazon has two used copies for sale bears out PT Barnum’s dictum…**

If I did what Parker does, for my blog – which is copying my content from the Internet, capitalising upon the work of others and breaching their copyright (I mean, come on – how can this not be breach of copyright?) – WordPress would shut me down without a second thought. And so they should.

I go to great lengths to ensure that everything I write is original. OK, I write about things I’ve read, or heard, but the words – the work – are all mine. If I do use anything from another source, it is clearly indicated, the source acknowledged and linked to, if appropriate. I wonder if Parker’s books actually acknowledge their sources, and whether a machine can be programmed to figure out who wrote a webpage (it’s not always easy to find out). Anyone want to send me a review copy of one of them so I can see for myself? I’m serious about that, by the way. (Fat chance of getting one, though.)

** You may be tempted to tell me that Wikipedia says Barnum never said “There’s a sucker born every minute!”. I’ve seen it, and it’s unverified. And even if he didn’t, he should have.

2 thoughts on “Machine-created books – art or industry?”

  1. hi Ron,
    Thanks for the post. Just a note to say that the programs I created do not trawl the internet. The Guardian and others wrote articles without doing a depth interview and simply assumed that this is what the program does (given that we have the internet, people naturally assume that this is what it does). The stat reports I created have no connection whatsoever with the net, and the materials are not found on the net. Here is an explanation from my YouTube (that people do not seem to read):

    As the video shows, I am working on reference books, reports and educational titles (not fiction or literature).

    The “algorithms” depend on the genre. The most advanced use parametric, non-parametric as well as Bayesian econometrics, graph theory, and meta-analysis (mostly coupled with some specialized computational linguistics and editorial rules that are required within certain genres) – each piece is rather straightforward; the combination allows complexity. In terms of IT or programming languages, there is no rigidity to this – again it depends on the genre. If animation is the goal, then code is written to write MEL scripts, etc., which can automate Maya, which can in turn automate rendering, lights, etc., via macros. This works well, but for only certain aspects of that genre.
    For more detailed discussions, here is the patent link:

    http://www.google.com/patents?id=bHeB

    Some titles are 98 to 100 percent computer automated (e.g. business titles, crosswords, etc.). For health titles, only the format editing and production side is automated. The text in the health books was written by medical professionals and edited by a professional editor; the computer expedited formatting using about 50 odd routines (the preface, chapter intros, glossaries, indexes, headings, margins, etc.); highlights are made to sources generally not known to internet-averse readers or medical practitioners (designed for medical libraries with internet training services).

    Currently, some 2 percent of the titles rely on government sources for text. None perform a google search, spider the net, etc. Some 98 percent of the titles are wholly generated via automation programs; the applications create original information or content that cannot be found elsewhere (e.g. maximum likelihood trade estimates, latent demand forecasts via a decision calculus approach, Chinese and English crosswords, etc.) – offline applications with no interaction to the internet. In total, there are about 17 genres created this way (about 200,000 titles or so since 2000).

    It can take several years to set up an application (including all human inputs, licensed sound effects, textures, models, mocap, data, or decision rules that go into any genre-specific application). Platforms (e.g. Maya) pre-exist. The incremental, or marginal creation time per title is mentioned in the video.

    The genres are blind or peer reviewed and/or vetted by users (e.g. librarians or end-users) before they are put into print. The games are played by kids to see what they like. For 3D games, a pre-existing rendering engine is like a blank word document. The rendering engine is not created from scratch, but licensed (like MS Word).

    I am mostly now working on education titles for Asian, African, and Native American languages that do not have educational materials (games, supplements, texts, videos, mobile phone books, etc.) written in or augmented by their languages. See my dictionary at:

    http://www.websters-online-dictionary

    to see a very small percent of the linguistic material used. Watch for a major update and linguistic augmentation to the dictionary this summer when I will also be introducing EVE. She is an “economically viable entity”. A step beyond a chat bot, using some of the algorithms mentioned above (with a bit of utility theory and optimal control theory thrown in).

    There is no “commercial” or “public” or “open source” software that can be used by the general public. Some applications are terabytes large. I am working on a relatively small poetry application for public use — to be released when completed (probably in a year), which will do several forms of poetry, on any topic the user desires; and allow the user to request “another” if they do not like the first one written, or “change that line”, etc.

    I am not actively working on fiction novels as a priority, though the process is in place for romance novels or similar formulaic types of literature. Fun to do, but not very useful.

    There are many other areas I am working on, as there are multiple avenues to explore, especially in the areas of new media (mobile and fixed), but more so in high-end analytics and knowledge discovery (i.e. generating knowledge that could not be created otherwise) as applied to business, language and public services (e.g. criminology) – where unmanageable, sparse, disintegrated or larger data sets (off-line) result in new knowledge structures usable by decision makers (e.g. connecting the dots where humans have difficulty doing so, for lack of time or expertise).

    Thanks for watching the video.
    Phil

    here is a post to the bookseller.com that gave one of my titles an award, and clarifies the fromage frais thing (a New York Times journalist also incorrectly wrote, and this showed up in the Herald Tribune, that the program used the internet to get its materials).

    here is my post:

    Guys, I heard about this honor from a journalist. I am really upset! I was hoping beyond hope that someone would have nominated my life’s work: “The 2009-2014 World Outlook for Electrosilverplated Baby Goods, Ecclesiastical Ware, Novelties, Toiletware, Trophies, and Other Hollowware That Has Been Electrosilverplated to a Non-Precious Metal Base Excluding Pewter”. That being said, this may turn out to be the highest honor that the fromage frais report will ever win. But dudes, I am holding my breath, while waiting for the other literary prizes to be announced.

    Cheers, Phil

    p.s. Also, a special thanks to Philip Stone and Horace Bent for their funny quotes posted on the official notice that a journalist sent me: “Well, given that fromage frais normally comes in 60-gram containers, not 60-milligram, one would assume that the world outlook for 0.06-gram containers of fromage frais is pretty bleak.” … “The fact that this book has been crowned the winner just goes to show how creative and diverse the publishing world it [sic] today. And, perhaps, how important a copy editor is.”

    For those who did not pick up on their humor, people in the know realize that 60 mg is referring to the metric of the cream content (percent per part), not the total amount of contents in the container; they come in 60 mg, 30mg, 10mg, etc. types. A 60 gram fromage frais would be a rather fatty cheese. Had the title been 60-g, people in the industry would have wondered what new and mysterious fromage frais had been invented, and whether it was legal!! (I guess people from outside publishing are also confused about the way we discuss paper – to a novice, a 300 page book printed using 20lb paper would mean a whopping 6000 lb opus). Thank God that I don’t use a human copy editor (after all, to err is human) … just kidding. Keep up the good work!
    28 Mar 09 16:09

    Anyway, if you are really interested, please stop by INSEAD some day and we can have lunch.
    Cheers
    Phil

    • Hi Phil,

      Hmm…

      I was hoping beyond hope that someone would have nominated my life’s work: “The 2009-2014 World Outlook for Electrosilverplated Baby Goods, Ecclesiastical Ware, Novelties, Toiletware, Trophies, and Other Hollowware That Has Been Electrosilverplated to a Non-Precious Metal Base Excluding Pewter”

      That rather suggests you don’t take it too seriously yourself… And really, taking 200,000 books seriously is a bit of a stretch (or a big ask, as the numpties would have it). As for the fromage frais 0.06g – 60g thing – who outside the industry could possibly know? Or, perhaps, care? People prepared to stump up the cover price to find out are, I suspect, rather thin on the ground.

      Your comment rather brings to mind an s-f novel I read many years ago (and which I’d love to track down but I haven’t a clue who wrote it or what it was called), wherein human writers were considered terribly perverse hobbyists and “literature” was churned out by “wordmills” which had been programmed with as many examples as possible of every available genre, including the entire works of luminaries in each particular field. Thus a wordmill could be set up to produce, as it might be, an entirely original western novel in the style of Zane Grey (who, in this future era, was immensely popular with robots).

      Am I alone in finding echoes here?

      Yes, you’re quite right, I haven’t seen your YouTube video. There’s a reason for that – there are areas of the Web that you couldn’t pay me to visit – not twice anyway, and that includes YouTube. From what I’ve seen it’s probably about 99% dross.

      Ron.
