Chatbot Fundamentals

An interactive guide to writing bots in Python

Part 4 of Natural Language Processing for Programmers

By Liza Daly

“It is said that to explain is to explain away. This maxim is nowhere so well fulfilled as in the area of computer programming, especially in what is called heuristic programming and artificial intelligence…Once a particular program is unmasked, once its inner workings are explained in language sufficiently plain to induce understanding, its magic crumbles away; it stands revealed as a mere collection of procedures, each quite comprehensible. The observer says to himself, I could have written that.”

— Joseph Weizenbaum, ELIZA (1966)

In this brief tutorial I’ll walk you through using a popular Python natural language processing library to construct a simple chatbot that evaluates and responds to user input. It won’t fool your friends, and for a production system you’ll want to consider one of the existing bot platforms or frameworks, but these examples should help you think through the design and engineering challenges of a conversational UI.

About this tutorial

The source code presented here is interactive. You are strongly encouraged to modify the Python code, right in your browser, and experiment with the outcomes. (You may get a lot of error messages, but I promise you can't permanently break anything!) See Technical details below for more information on how the live code is implemented.

The boundaries of a bot

When you begin work on a conversational UI, even a trivial one, you’ll need to answer these fundamental design questions:

Domain knowledge: What does a user expect this bot to understand?
Personality: What tone or vocabulary does the bot employ?

Domain knowledge

True artificial intelligence does not exist, so while some AIs can imitate humans quite convincingly or answer some kinds of factual questions, all bots are restricted to a subset of topics or conversational gambits. IBM's Jeopardy-playing Watson “knew” facts and could construct realistic natural language responses, but it couldn’t schedule your meetings or deliver your groceries. Simpler commercial bots like SlackBot can successfully help users set up their Slack accounts, but aren’t designed to engage you in open-ended dialogue.

Personality

Bots have historically been personified as something less than fully human to excuse their rote responses and frustrating lack of comprehension. This can be an opportunity for creativity and playful invention—the first bot I helped design was modelled after a famous parrot—but it can also be a minefield of unexamined assumptions. It’s disappointing that so many bots are personified as female or teenagers, as if those groups were naturally subservient or not fully human. It’s probably better for everyone if your bot is personified simply as itself—a computer program—or something truly non-human.

Often the dual axes of domain and personality align: in the program ELIZA, the domain was a therapy session, and the bot’s personality was that of a Rogerian therapist. Domain and personality don’t necessarily need to be tightly coupled, though: an ecommerce bot needs to know about products, sizing, and order status, but that domain doesn’t imply any particular kind of personality. A shopping bot could have the persona of a helpful person or a cheerful kitten, or no personality at all.

Meet “Brobot”

Having warned you away from human personifications, I’m going to break my own rule and create a bot with a particular set of well-known personality traits and interaction models. I’ll show you some introductory-level chatbot techniques by writing software modeled after the dialectical capabilities of a brogrammer.

In this tutorial you can interact with Brobot by talking with it, and in some examples you can override selected parts of its code to observe the effect on its behavior.

Start by greeting Brobot:

Hopefully, you said something like “Hello” and Brobot said something that sounded like a greeting in reply. For the “greet the robot” use case, we can use simple keyword matching, similar to how ELIZA and other early conversational UIs were modeled. Here’s the relevant code:

This is the simplest possible implementation of a chatbot: it searches the user’s utterance for one or more known keywords and returns one of several possible responses. In practice you won’t want your bot to pick a truly random response—it’s better to cycle through a set of responses and avoid repeats. To keep the tutorial simple I’ve made Brobot completely stateless, so pure randomness will have to do.
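A minimal sketch of the keyword matching described above might look like the following. The names GREETING_KEYWORDS, GREETING_RESPONSES, and check_for_greeting, along with the word lists themselves, are assumptions made for illustration and may differ from Brobot's actual source.

```python
import random

# Hypothetical keyword and response lists; adjust them to suit your bot's persona.
GREETING_KEYWORDS = ("hello", "hi", "hey", "greetings", "sup")
GREETING_RESPONSES = ("'sup bro", "hey", "*nods*", "yo")


def check_for_greeting(utterance):
    """Return a random greeting if any word in the utterance is a known greeting, else None."""
    for word in utterance.lower().split():
        # Strip surrounding punctuation so "Hello!" still matches "hello".
        if word.strip(".,!?") in GREETING_KEYWORDS:
            return random.choice(GREETING_RESPONSES)
    return None
```

Because the function compares whole words, "this" does not accidentally match "hi". To avoid repeated replies you would need to keep some state between calls, for example by stepping through a shuffled copy of the response list.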

Go ahead and modify the code above, right in the browser, to change Brobot’s behavior. Try returning only one response, or responding to more greetings. (If your code has an error, Brobot will pass along the Python message.)

Beyond keywords

Python programmers working with NLP have two great high-level libraries to choose from: TextBlob and spaCy. spaCy is easy to use and fast, though it can be memory intensive and doesn’t attempt to cover the whole of statistical NLP. TextBlob wraps the sprawling NLTK library in a very approachable API, so while it can be slower, it’s quite comprehensive. I’ll use TextBlob here, though see my article on text generation for an example using spaCy.

The main loop of Brobot performs the following steps:

Do some initial pre-processing of the user’s text (this is a good place to hook in checks for unsafe input).
Ask TextBlob to parse the input for us.
Run a series of routines designed to extract the most information from the user’s utterance in a structured way.
Compose a reply that best matches the user’s statement.
Perform any post-processing to ensure as best we can that our bot isn’t behaving badly.
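The five steps above can be sketched as a small pipeline. Every function name and canned reply here is hypothetical, and the parse step is a plain word split standing in for TextBlob so that the example runs with only the standard library.

```python
def preprocess_text(text):
    """Step 1: normalize whitespace; a real bot would also reject unsafe input here."""
    return " ".join(text.split())


def parse(text):
    """Step 2: stand-in for the NLP parse; this only splits into lowercase words."""
    return [w.strip(".,!?").lower() for w in text.split()]


def extract_features(words):
    """Step 3: pull out structured facts that the reply logic can use."""
    return {
        "is_greeting": any(w in ("hello", "hi", "hey") for w in words),
        "mentions_bot": "you" in words,
    }


def compose_reply(features):
    """Step 4: choose the reply that best matches what was extracted."""
    if features["is_greeting"]:
        return "hey"
    if features["mentions_bot"]:
        return "I know, right?"
    return "cool story"


def postprocess(reply):
    """Step 5: last-chance check so the bot never emits mentions or hashtags."""
    if "@" in reply or "#" in reply:
        return "..."
    return reply


def respond(text):
    """Run the five stages in order and return the bot's reply."""
    cleaned = preprocess_text(text)
    words = parse(cleaned)
    features = extract_features(words)
    return postprocess(compose_reply(features))
```

Keeping each stage as its own function makes it easy to swap the word split for a real parser later without touching the reply logic.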

Each of the find_* functions in find_candidate_parts_of_speech() consults TextBlob’s sentence.pos_tags property, which returns the words’ parts of speech. (You’ll want to consult the Penn Treebank reference to map the part-of-speech tag names to the actual grammatical constructs.)

Depending on the bot’s domain, you’re going to be more interested in some values than others, and you may also want to transform some of the input values or identify synonyms.
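As a rough sketch of what such find_* helpers could look like: pos_tags yields (word, tag) pairs, so the functions below accept a list of that shape. The helper names and the hand-tagged SAMPLE list are assumptions for illustration; with TextBlob installed you would pass in the sentence's pos_tags instead.

```python
def find_first_with_tag_prefix(tagged_words, prefix):
    """Return the first word whose Penn Treebank tag starts with prefix, else None."""
    for word, tag in tagged_words:
        if tag.startswith(prefix):
            return word
    return None


def find_noun(tagged_words):
    # NN, NNS, NNP, and NNPS all begin with "NN".
    return find_first_with_tag_prefix(tagged_words, "NN")


def find_adjective(tagged_words):
    # JJ, JJR, and JJS all begin with "JJ".
    return find_first_with_tag_prefix(tagged_words, "JJ")


def find_verb(tagged_words):
    # VB, VBD, VBG, VBN, VBP, and VBZ all begin with "VB".
    return find_first_with_tag_prefix(tagged_words, "VB")


# Hand-tagged stand-in for the tags of "I am a good programmer".
SAMPLE = [("I", "PRP"), ("am", "VBP"), ("a", "DT"), ("good", "JJ"), ("programmer", "NN")]
```

Taking the first match is a simplification. Depending on your domain you might prefer the last noun, the longest one, or one that appears in a list of known products.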

Here’s some example code that identifies pronouns of interest to us:

I’m going to look for pronouns like “you” or “I” and infer from those that the user wants to talk about themselves or the bot. When identified, I invert them—if the user says “you”, Brobot responds with “I”.
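One way to sketch the pronoun inversion, again working on (word, tag) pairs of the kind pos_tags returns. The function name and the first-match-wins rule are assumptions for illustration, not necessarily how Brobot resolves a sentence that contains both pronouns.

```python
def find_pronoun(tagged_words):
    """Return the pronoun the bot should reply with, or None if none applies."""
    for word, tag in tagged_words:
        if tag != "PRP":
            continue  # Only personal pronouns are of interest here.
        if word.lower() == "you":
            return "I"  # The user addressed the bot, so the bot speaks as "I".
        if word == "I":
            return "You"  # The user talked about themselves, so the bot says "You".
    return None
```

For example, the tagged sentence [("You", "PRP"), ("are", "VBP"), ("cool", "JJ")] gives "I", which tells the rest of the pipeline that the reply should be about the bot.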

A more sophisticated approach would be to build a dependency tree. Dependency grammars describe the relationships between the words in a sentence, allowing you to discriminate between (say) the subject and object of a sentence. If your bot needs to know the difference between “dog bites man” and “man bites dog”, I recommend using the dependency parsing function of a library like spaCy.

Try adding a special case to allow the user to address ’Brobot’ by name in addition to ’you’ to set up a response that refers to the bot itself.

But enough about me, what do you think of me?

Like all brogrammers, our bot loves to talk about itself, so if the user mentions it anywhere in their input, it’ll reply about itself and include a token amount of their input to pretend it was listening. This special-case routine fires if the user addressed the bot (by mentioning “you” in their input) and, if so, short-circuits all other potential responses:
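A hedged sketch of such a special-case routine follows. The reply templates and the function name are invented for illustration; the only idea taken from the text is that a reply pronoun of "I" means the user said "you".

```python
import random

# Hypothetical reply templates; {noun} and {adjective} come from the user's input.
SELF_REPLIES_WITH_NOUN = (
    "My last startup totally crushed {noun}",
    "I've been thinking a lot about {noun} lately",
)
SELF_REPLIES_WITH_ADJECTIVE = (
    "I'm pretty much the most {adjective} person I know",
    "People keep telling me how {adjective} I am",
)


def check_for_comment_about_bot(pronoun, noun, adjective):
    """Return a self-centered reply if the user addressed the bot, else None."""
    if pronoun != "I" or not (noun or adjective):
        return None  # Not about the bot, or nothing from the input to echo back.
    if noun:
        return random.choice(SELF_REPLIES_WITH_NOUN).format(noun=noun)
    return random.choice(SELF_REPLIES_WITH_ADJECTIVE).format(adjective=adjective)
```

Returning None when the routine does not apply lets the caller fall through to the next, more general response strategy.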

In a real bot, you’d want to compose responses using a more sophisticated templating engine or maybe even a full-blown Context-Free Grammar.
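To give a feel for the grammar-based option, here is a toy context-free grammar expander. The grammar, its vocabulary, and the function names are all made up for this sketch; a production system would more likely use an existing grammar library.

```python
import random

# Hypothetical toy grammar: keys are nonterminals, values list alternative expansions.
GRAMMAR = {
    "REPLY": [["OPENER", "CLAIM", "FILLER"]],
    "OPENER": [["honestly,"], ["real talk,"]],
    "CLAIM": [["I", "VERB", "OBJECT"]],
    "VERB": [["crushed"], ["disrupted"]],
    "OBJECT": [["that", "market"], ["the", "leaderboard"]],
    "FILLER": [["bro"], ["tho"]],
}


def expand(symbol, grammar, rng=random):
    """Recursively expand a symbol; anything that is not a grammar key is a terminal."""
    if symbol not in grammar:
        return [symbol]
    production = rng.choice(grammar[symbol])
    words = []
    for part in production:
        words.extend(expand(part, grammar, rng))
    return words


def generate_reply(grammar=GRAMMAR, rng=random):
    """Produce one sentence by expanding the REPLY start symbol."""
    return " ".join(expand("REPLY", grammar, rng))
```

Each call picks one alternative per nonterminal, so even this tiny grammar yields sixteen distinct sentences. Slots filled from the user's input could be added as extra nonterminals whose expansions are set at runtime.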

Try coming up with routines that could use more than one term from the user’s input and still produce sensible output in most cases. Consider the constraints that tense, spelling, and number agreement will introduce.

Constructing a realistic response

The most common case will be that the user supplies sensible input that the program can parse into component words, but none of those words trigger a special case like greeting or referencing the bot. Put another way, the program knows the user said something, but doesn’t “understand” what they said, because their input fell outside of its domain knowledge.

In a purely transactional bot, there isn’t much to do at this point besides return some help text (“You can ask me about booking a flight, changing a reservation, etc.”). In a more conversational bot, you can still manipulate the user’s input to generate a successful response, but it’s more apt to be one that reflects the bot’s personality than its understanding of the world.

In the ELIZA simulation, the bot reflected the user’s input back to them in a gently inquiring way. Because this is a brogrammer, it’s going to try to neg or dismiss the user. We first check for a special case where the user talked about themselves, and if so negate the verb and assert that whatever they said wasn’t true.

If they said anything else, the bot will just mindlessly echo what they said, adding some filler bro-words at the end. Like a real brogrammer, our bot is limited in its intellectual capability and mostly regurgitates aphorisms it saw elsewhere, like LinkedIn.

If we identified a pronoun from the user, re-use that.
In most cases, we pass through the user’s verb unchanged.
If the verb was “to be” and the user was talking about themselves (“I am a good programmer”), the bot will neg them by inverting the meaning of the verb and claiming they aren’t whatever they asserted.
Otherwise, just reconstruct the base words from the user’s original sentence—subject, verb, object—and add some bro-ish filler.
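The rules above could be sketched as follows. The names, the filler words, and the naive "a"/"an" article choice are assumptions for illustration; the sketch ignores plural nouns and number agreement.

```python
import random

BE_FORMS = ("be", "am", "is", "are", "'m")  # forms of "to be" matched by hand
FILLERS = ("bro", "tho", "lol", "bruh", "smh")  # hypothetical bro-ish filler


def starts_with_vowel(word):
    """True if the word is non-empty and begins with a, e, i, o, or u."""
    return bool(word) and word[0].lower() in "aeiou"


def construct_response(pronoun, verb, noun):
    """Build a reply from whichever of pronoun, verb, and noun were found."""
    parts = []
    if pronoun:
        parts.append(pronoun)
    if verb:
        if verb.lower() in BE_FORMS and pronoun == "You":
            # The user described themselves ("I am ..."), so contradict them.
            parts.append("aren't really")
        else:
            parts.append(verb)
    if noun:
        article = "an" if starts_with_vowel(noun) else "a"
        parts.append(article + " " + noun)
    parts.append(random.choice(FILLERS))
    return " ".join(parts)
```

Given the pronoun "You", the verb "am", and the noun "programmer", this produces "You aren't really a programmer" followed by one of the filler words.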

In this code, I manually match all the irregular forms of “to be”, but a more flexible approach would be to convert the user’s verb to a lemma. Stems and lemmas are great shortcuts for mapping a range of potential inputs to some known value; see also senses and similarity matching. Both techniques require more horsepower than I could allocate to little Brobot, but don’t require much code when using NLP libraries.
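To show the idea of lemma-based matching without pulling in a corpus, this sketch uses a hand-built lookup table as a stand-in for a real lemmatizer. The table and function names are assumptions; with an NLP library, a lemmatizer call would replace the dictionary lookup.

```python
# Hand-built lemma table standing in for a real lemmatizer; it only covers "to be".
IRREGULAR_LEMMAS = {
    "am": "be", "is": "be", "are": "be", "was": "be", "were": "be",
    "been": "be", "being": "be", "'m": "be", "'re": "be",
}


def to_lemma(verb):
    """Map a verb form to its lemma, falling back to the lowercased word itself."""
    word = verb.lower()
    return IRREGULAR_LEMMAS.get(word, word)


def is_to_be(verb):
    """True if the verb is any known form of "to be"."""
    return to_lemma(verb) == "be"
```

The rest of the bot then only has to compare against "be" rather than enumerating every surface form at each call site.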

Try telling the bot “I am [something]” and verify that it disagrees with you. How could you enhance this behavior?

You’ve got to be kind

The last routine run by any bot should be a filter to limit unpleasant or unsafe output. Just as we should have filtered incoming input to prevent arbitrary code execution or (maybe) offensive language, we want to ensure that the bot doesn’t say things that are harassing or contextually inappropriate. The PR fallout from neglecting this step can be considerable.

In many ways, this is a doomed exercise from the start. Security experts will confirm that there is no sure-fire way to sanitize unrestricted user input. (For example, I can’t truly prevent you from putting destructive Python code into this tutorial, but it’s deployed on a transient backend with no permanent storage, no internet access, and nothing connected to me personally.)

But even if it is theoretically impossible to prevent a bad bot, as bot creators we have an ethical obligation to at least try. For Twitter bots, this means not DMing or @-messaging other users. For Slack bots, we should limit the permissions allocated to the bot to prevent it from issuing commands. And for all bots, it means performing checks against offensive words and phrases before allowing the bot to parrot back user input in a harmful way.

A very simple filter against a list of known offensive terms is a good start, as is removing potentially dangerous characters like ’@’ or ’#’ that are meaningful on Twitter.
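A simple output filter along these lines might look like the sketch below. The exception name, the placeholder word list, and the fallback message are hypothetical. This version rejects the whole reply rather than stripping the offending characters, which is one possible design choice among several.

```python
class UnacceptableUtteranceError(Exception):
    """Raised when a candidate reply must not be sent."""


# Placeholder entries; a real deployment needs a maintained list of offensive terms.
FILTER_WORDS = ("badword", "slur")
UNSAFE_CHARACTERS = ("@", "#")


def filter_response(reply):
    """Return the reply unchanged if it passes every check; otherwise raise."""
    for word in reply.split():
        if any(ch in word for ch in UNSAFE_CHARACTERS):
            raise UnacceptableUtteranceError("contains a mention or hashtag")
        if word.lower().strip(".,!?") in FILTER_WORDS:
            raise UnacceptableUtteranceError("contains a filtered term")
    return reply


def safe_reply(reply, fallback="I don't want to talk about that, bro"):
    """Send the reply only if it passes the filter; otherwise use a canned fallback."""
    try:
        return filter_response(reply)
    except UnacceptableUtteranceError:
        return fallback
```

Exact word matching is easy to evade with creative spelling, so treat this as a first line of defense rather than a guarantee.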

Confirm that Brobot won’t repeat words that begin with a hashtag or at-sign. What other kinds of filters would you need in your chat environment to minimize abuse?

Go forth and crush it

I covered most of the functional parts of Brobot, but please review the complete source code. In most real-world cases, you’ll want to move from the prototype stage to a full-blown messaging environment. You may even want to scrap your NLP-based work and start over using existing grammars and libraries for specific chatbots. But I encourage you to start with the fundamentals—I particularly recommend a test-first approach, as it’s a natural fit for conversational UIs.
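As a sketch of what test-first development can look like for a conversational UI, the cases below pair an utterance with a property the reply must satisfy. The stand-in respond function and the case list are hypothetical; in practice you would write the cases first and then grow the bot until they pass.

```python
def respond(utterance):
    """Stand-in bot used only to show the shape of the tests."""
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    if any(w in ("hello", "hi", "hey") for w in words):
        return "hey"
    if "you" in words:
        return "I am the best"
    return "cool story bro"


# Each case pairs an utterance with a predicate the reply must satisfy.
CONVERSATION_CASES = [
    ("Hello!", lambda reply: reply == "hey"),
    ("What do you do?", lambda reply: reply.startswith("I")),
    ("follow @someone", lambda reply: "@" not in reply),
]


def run_conversation_cases(bot, cases):
    """Return the utterances whose replies failed their predicate."""
    return [utterance for utterance, ok in cases if not ok(bot(utterance))]
```

Checking properties of the reply, rather than exact strings, keeps the tests stable even when the bot picks among several random responses.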

Technical details

There are several ways to run a Python interpreter in a web browser, but those methods typically limit one to the Python standard library. That’s fine for learning Python itself, but it would preclude tutorials like this that require complex third-party libraries like TextBlob. The journal Nature pioneered running Jupyter Notebooks in the browser using Docker as the backend. This infrastructure was later commercialized by O'Reilly Media (though their examples seem to be down).

This tutorial takes a different approach: AWS Lambda provides highly scalable, inexpensive, short-lived Python sessions that can be reached via a lightweight API. Using Lambda eliminates the need for cumbersome Docker container maintenance, and is essentially free for low-traffic use. I’d love to work with others on further developing this technology for use in online learning; please get in touch!
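For readers curious about the shape of such a backend, here is a minimal sketch of a Lambda-style handler. It assumes the function sits behind an HTTP proxy integration that delivers the request body as a JSON string; the "message" and "reply" field names and the stand-in respond function are hypothetical and do not describe this tutorial's actual deployment.

```python
import json


def respond(message):
    """Stand-in for the bot; a real deployment would call the chatbot here."""
    return "You said: " + message


def lambda_handler(event, context):
    """Handle one chat turn: read JSON from the request body and return JSON."""
    try:
        payload = json.loads(event.get("body") or "{}")
        message = str(payload["message"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, a missing field, or a non-object payload.
        error = {"error": "expected a JSON object with a 'message' field"}
        return {"statusCode": 400, "body": json.dumps(error)}
    return {"statusCode": 200, "body": json.dumps({"reply": respond(message)})}
```

Because each invocation is independent, a stateless bot like Brobot maps naturally onto this model; a bot that remembers earlier turns would need external storage.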