Bot or Not: Can You Tell What is Human or Machine Written Text?

By Wayan Vota on January 10, 2020

ai turing test international development

Recently, a researcher showed that he could create Deepfake text with artificial intelligence that is so real that US government officials did not know it was computer-generated, and accepted it as legitimate public comment.

He then did a Turing Test to see if humans trained on spotting natural language processing could tell the difference between bot and human text. They were right about 50% of the time – essentially as good as flipping a coin.

While reading the academic paper, I thought to myself, “Could machine learning do the same for international development?” We have so much nuance, arcane language, and peculiarities, I didn’t think it was possible.

Then I tried Talk to Transformer, a web interface to GPT-2, the language model that can generate coherent paragraphs of text, and I was stunned. GPT-2 is a transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. It was so good it fooled some of my work colleagues – people highly skilled in real aid-speak.
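For readers who want to try this kind of prompt-and-continue generation themselves, here is a minimal sketch. It is not how Talk to Transformer works internally. It assumes the third-party Hugging Face `transformers` package and its publicly hosted `gpt2` checkpoint (the small model; the 1.5 billion parameter version is published there as `gpt2-xl`). The helper names `first_words`, `sample_continuations`, and `load_gpt2_generator` are made up for illustration, and the sampling settings are arbitrary.

```python
from typing import Callable, List


def first_words(text: str, n: int = 6) -> str:
    """Return the first n whitespace-separated words of text, to use as a prompt."""
    return " ".join(text.split()[:n])


def sample_continuations(prompt: str,
                         generate: Callable[[str], str],
                         tries: int = 3) -> List[str]:
    """Call the generator several times so a person can pick a passage they like."""
    return [generate(prompt) for _ in range(tries)]


def load_gpt2_generator(model_name: str = "gpt2",
                        max_new_tokens: int = 80) -> Callable[[str], str]:
    """Build a generator backed by Hugging Face transformers.

    Assumption: `transformers` (plus a backend such as PyTorch) is installed
    and the model can be downloaded. Nothing is loaded until this is called.
    """
    from transformers import pipeline  # lazy import of a third-party package

    pipe = pipeline("text-generation", model=model_name)

    def generate(prompt: str) -> str:
        # do_sample=True gives a different continuation on each call
        out = pipe(prompt, max_new_tokens=max_new_tokens, do_sample=True, top_k=40)
        return out[0]["generated_text"]  # prompt followed by the continuation

    return generate


# Example use (downloads the model on first run):
#   generate = load_gpt2_generator()
#   prompt = first_words("UNICEF is providing humanitarian and developmental aid")
#   for candidate in sample_continuations(prompt, generate):
#       print(candidate, "\n---")
```

Because the output is sampled, every run produces different text, so expect to generate a few candidates before one reads convincingly.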

Your Turing Test: Bot or Not?

The following are five paragraphs of text. Which ones were written by actual human intelligence and which ones were written by a neural network? Please give your answers in comments to this post.

Passage One

USAID’s new strategy to improve health outcomes in Kenya will focus on identifying priority need areas and then financing and supporting key governance and government processes that promote the implementation of the measures. Although a level of enhanced care is included in the plan, the report says that it “will not divert health resources away from those elements of the health system that are most necessary.”

Passage Two

UNICEF is providing humanitarian and developmental aid to child soldiers in the Democratic Republic of Congo and Central African Republic (CAR). According to UNICEF, more than 120,000 children have been recruited by rebel groups to fight in the two countries. The majority are from the neighboring Democratic Republic of Congo (DRC), where up to 100,000 children are estimated to be fighting, according to UNICEF.

Passage Three

The International Bank for Reconstruction and Development (IBRD) and The World Bank announced today a new Global Practice Area on the issues related to sexual and reproductive health. The IBRD-supported Global Practice Area will support a network of existing partner and potential partner agencies, with all elements of decision making still being determined by each partner, to become better able to provide knowledge, data and innovative approaches to advancing and informing health and gender equality in resource-limited contexts.

Passage Four

As DFID aims to harness the Data Revolution to achieve the Sustainable Development Goals, they must aim to make more data about and available to people. How can we make more information useful? The DFID article states: “Data is the fuel of revolution”, and “Data is the new knowledge engine of the digital age.” Obviously this is true. It is the fuel of new economic development – though of course it is not one-way. I can tell you from experience that the positive effects of making the information accessible can be multiplied several times by making it available to everyone. Technology and the Data Revolution are by their very nature democratising.

Passage Five

The government of Kenya is now using IntraHealth’s iHRIS software to manage 25,000 health workers contracted through the President’s Emergency Plan for AIDS Relief (PEPFAR). The first report, released last summer, showed that staff efficiency, on average, was 54% higher when the staff were using IntraHealth’s software. Kenya’s health system, according to IntraHealth’s website, has installed iHIRS in 95% of Kenya’s district health offices. Unlike other ICTs, implementing health management systems is relatively cheap and reliable. It will save money in the long run because the staff will have a positive knowledge of their responsibilities.

Which Passage is Human or Machine?

As you struggle to figure out which is human and which is machine (and please post your guesses in the comments), think about how this technology can be used (and misused!) in international development, in national policy, and in the everyday lives of all of us – regardless of where in the world we are. Misinformation is now global.

I’ll reveal which passages are real or fake on Friday afternoon with a special prize for the right answer given with the most pizazz.

And the Answer Is…

Thanks everyone for playing. The correct answer is… all of them are artificial intelligence creations.

I entered the first few words of each passage (and the entire first sentence of the last one) into Talk to Transformer, then copy/pasted the results verbatim. Now I did try a few times till I got a passage I liked, but I didn’t edit the passages themselves.

Scary good, eh?
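If you want to tally guesses against the answer, here is a rough scoring sketch. The answer key comes from the reveal above, and the four sets of guesses are transcribed from comments below this post. They are a hand-picked illustration, not a representative sample. The names `ANSWER_KEY`, `GUESSES`, `score`, and `accuracy` are just labels chosen for this example.

```python
# All five passages were machine-generated, per the reveal in this post.
ANSWER_KEY = ["bot"] * 5

# A few guesses transcribed from the comments, one label per passage (1 to 5).
GUESSES = {
    "Paul Kawale": ["bot", "human", "bot", "human", "human"],
    "Annette N Brown": ["bot", "human", "bot", "human", "human"],
    "Emily W": ["bot", "bot", "bot", "human", "human"],
    "Rania": ["bot", "bot", "bot", "bot", "bot"],
}


def score(guess, key=ANSWER_KEY):
    """Count how many passages a single reader labelled correctly."""
    return sum(g == k for g, k in zip(guess, key))


def accuracy(guesses, key=ANSWER_KEY):
    """Share of correct labels across all readers and all passages."""
    correct = sum(score(g, key) for g in guesses.values())
    return correct / (len(guesses) * len(key))


if __name__ == "__main__":
    for name, guess in GUESSES.items():
        print(f"{name}: {score(guess)}/{len(ANSWER_KEY)}")
    print(f"Overall accuracy: {accuracy(GUESSES):.0%}")
```

Keep in mind that with no human-written passages in the mix, this only measures how often readers spotted a bot, not how often they would mistake a real human for one.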

48 Comments to “Bot or Not: Can You Tell What is Human or Machine Written Text?”

  1. Dennis says:

    Passage 3 is real. I remember reading about that one. I don’t know about the rest.

  2. Chris says:

    I reckon passage 4 is definitely Bot written – the sentences are somewhat disjointed and don’t flow. Passage 2 could be Bot written – it seems to follow some rules.

  3. Justus Ashaba says:

    Only passage one is machine generated. The rest are a result of Humans

  4. Justus Ashaba says:

    Passages one, two and four are AI generated while three and five seem to be real

  5. Passages 1, 2, 4 and 5 are bot-generated.

  6. Paul Kawale says:

    Passages 1 and 3 are machine. Passages 2, 4 and 5 are human.

  7. josse says:

    1, 3 and 5 are bot written

  8. Sam Jackson says:

    wow. harder than i thought. 1 3 4 fake

  9. Shipra Sharma says:

    Passages 2 and 5 seem to be machine generated, as some key words are repeated. A human mind would have made use of some pronouns instead.

  10. Ahmed Alaga says:

    Passages 1 and 4 are bot-written text. It sounds like a transformer.

  11. Too Real!! says:

    I cannot tell which passage is written by a computer program and which is not. I can predict that I will fail to achieve this seemingly mundane goal. My routine – a quick scan of the Internet – will fail me, and my efforts at analysis will lead to long paragraphs of descriptive information I can only loosely parse.

    **This comment was written by that bot!!**

    • Wayan Vota says:

      Yeah, it’s impressive, eh? I bet you didn’t even need to edit what it wrote – just copy/paste here.

      • Amy says:

        I asked Transformer “What is real?” This is its response, proving that the real question always leads back to the “secret of life.” Or maybe the bot is trying to become real?!?

        What is real?
        (Schubert): How much is a mountain worth?
        (McClung): Does the art of the banjo make me a better banjo player?
        (Ferry): Did Shakespeare write more jokes or more love stories?
        (Rose): Does one get ‘stuck’ or does one get ‘ stuck’?
        (McClung): Is there a limit to what you can accomplish?
        (Rose): What is the real time and how much should I sweat it?
        (McClung): How does my dream job fit into my life?
        (Rosario): Who created this book?
        (McClung): What is real?
        (Schubert): What can I do to be better today?
        (Rose): What is the secret of life?

  12. 1 and 3 look to be AI-generated: both are mushier than most development-speak, suggesting the selection from a random phrase-bank. The second is a straight factual piece which could have been written by a machine, but probably was written by a human – it is too factual to excite AI writers. #4 has a grammatical lapse (“more data about and”) which is actually more likely for a human than a machine. Passage 5 has a logical flow that seems human, since it seems to require an understanding of the issues, which AI can never have.

  13. Munu says:

    I would say passage 2 and 3 are bot written.

  14. Till says:

    Hi, of course all are bot-made! 🙂

  15. Robert Marcs says:

    I think all odd ones are bots – 1 3 & 5. But that is using test makers mindset. Too many years in education

  16. Emmanuel NT says:

    2 and 3 are Bots! I think others are OK

  17. Rania says:

    All passages are machine-generated.

  18. Irene says:

    To bot, or not to bot, that is the question:
    Whether ’tis nobler in the mind to suffer
    The slings and arrows of outrageous fortune,
    Or to take arms against a sea of troubles
    And by opposing end them. To troll—to mute,
    No more; and by a mute to say we end
    The heart-ache and the thousand natural shocks
    That flesh is heir to: ’tis a consummation
    Devoutly to be wish’d. To troll, to mute;

  19. Carolyn Wetzel Chen says:

    1 – BOT it uses tons of words to say nothing
    2 – BOT – hard to tell, but seems to communicate a coherent, fact based news piece of information, but I’m suspicious… would UNICEF really direct developmental aid to child soldiers?
    3 – BOT the “elements of decision making” sentence seems too odd
    4 – HUMAN this seems too poorly written to be a bot
    5 – HUMAN seems to communicate real information and statistics cited seem relevant to the point

  20. Amy Finnegan says:

    Only #5 is real. I used a scientific process I like to call “Googling”… However, now that these bot-created phrases are on the internet they will be used to train the transformer model and we’ll never know what was fake or not. Happy DeepFake Friday!

  21. Holly says:

    My response was written by a bot:

    All of the passages are written by bots. I’ve tried removing the content, but bots still manage to insert it. Fortunately, while the texts are floating in the Internet, the bots are not. When I contact the clients of Hootsuite to ask for more information, they say they are considering it.

    I am on Facebook. I have a profile. I even have a write-up about my story. You can actually see my computer screen, my personal website, and some screenshots from the story I’ve written to date. But not yet. I am logged in. My feed shows a random assortment of pictures and links, as well as a profile image that I made a year ago.

    Except that it’s not a computer screen.

  22. Katherine Vaughn says:

    Passage One — BOT because it goes on a long time without punctuation (people tend to overdo commas and journalists prefer shorter sentences)
    Passage Two — BOT because in journalism you put the acronyms the first time you use it. An editor would have caught that.
    Passage Three — human. Lots of commas in the longer sentences, following journalistic protocols, topic accurate
    Passage Four — human. More of an opinion piece and more colloquial and therefore perhaps a little harder for a BOT to reproduce without bringing in a lot of errors.
    Passage Five — BOT. It is skimming language from IntraHealth’s website to get better language (or it’s a lazy human cutting and pasting).

  23. Annette N Brown says:

    One: Bot
    Two: Not
    Three: Bot
    Four: Not
    Five: Not

  24. Emily W says:

    I’m guessing 1-3 are the bot, and 4-5 are human

  25. Wayan Vota says:

    Thanks everyone for playing. The correct answer is… all of them are artificial intelligence creations.

    I entered the first few words of each passage (and the entire first sentence of the last one) into Talk to Transformer, then copy/pasted the results verbatim. Now I did try a few times till I got a passage I liked, but I didn’t edit the passages themselves.

  26. Wayan Vota says:

    One of my colleagues had this to say about this post:

    “This experiment shows how formulaic our writing can be in development. I’d personally hate to be a reviewer for RFP responses. They’ve got to get through several responses to any given set of RFP requirements. How do their eyes not completely glaze over while reading the stuff we send them?”

  27. Wayan Vota says:

    And judging was easy. Irene wins for creativity and humor – thankfully traits still beyond the reach of computers (so far…)

    • Irene says:

      Thank you, Wayan! You have made my day! Like when a cop was conducting a standard car search for weapons and found me a dollar coin!

  28. Julianna Kohler says:

    Passages 2 and 5 are written by bots; they use commas correctly, and very few people are that skilled with commas.

  29. Julie George says:

    Passages One and Four are created by humans and the other three are bot/machine-created.

  30. Charles OTINE says:

    Passages 3, 4 & 5 were written by Machine.

    Passages 1 and 2 probably Human

  31. Eve says:

    Passage 1 and 3 are robot.
    Passage 2, 4 and 5 are human.

