Gricean Maxims and AIML Safe Reduction

May 28, 2017

As customer service becomes increasingly automated, chatbots are growing in importance. AIML is a simple chatbot language: it stores sentence patterns and valid responses to those patterns. Storing every sentence of a language would be hugely redundant, so various methods are used to keep the database at a manageable size. One such method is safe reduction.

Here’s a snippet of AIML:
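A minimal category pairs a pattern with a template response (the response text here is illustrative):

```xml
<category>
  <pattern>HOW ARE YOU</pattern>
  <template>I am fine, thank you. How are you?</template>
</category>
```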

Safe reduction is a method that essentially turns long sentences into short ones that have the same meaning. These short sentences are known as patterns. You can see a pattern above. The short sentence in the pattern (“How are you”) may often be the result of safe reduction, as shown below.
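For instance, AIML’s `<srai>` tag can redirect a longer phrasing to the short pattern (an illustrative sketch):

```xml
<category>
  <pattern>HELLO HOW ARE YOU DOING TODAY</pattern>
  <template><srai>HOW ARE YOU</srai></template>
</category>
```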

Intuitively, this reduction seems fairly safe because “How are you” has much the same meaning as the sentences it’s replacing. However, removing words can change the meaning of a sentence more than expected.

Consider a pair like “I had a long, exhausting day” and “I had a long day.” Do the two sentences have the same meaning? If the adjective doesn’t affect the sentence significantly, it can safely be removed. If it does, it may be best to keep the sentence as is. When creating a language model, we need a threshold of differentiation that helps it decide what to change and what to keep. This could be done statistically by collecting the set of relevant bipolar scales, testing linguistically for semantic bipolarity, and establishing semantic differential dimensionality. A more brute-force, less supervised method would be to compare the contexts of the sentence with and without the adjective: if the contexts are similar, the adjective may be redundant.

An example of using context to find synonyms would be the following sentences in a corpus:

The dog ate the bone.
The dog consumed the bone.
The dog devoured the bone.

We can assume that the word “ate” has the same meaning as “consumed” and “devoured” if it’s consistently in the same context. We can apply this same technique to entire phrases or even sentences. These unsupervised approaches are common in text analysis, especially word embeddings.
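This context-similarity idea can be sketched with simple co-occurrence vectors. The following is a toy illustration, not a production embedding method; the corpus and function names are invented for the example:

```python
from collections import defaultdict

def context_vectors(sentences, window=2):
    """Count, for each word, the words that appear within `window` positions."""
    vectors = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = sum(v * v for v in a.values()) ** 0.5
    norm_b = sum(v * v for v in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

corpus = [
    "the dog ate the bone",
    "the dog consumed the bone",
    "the dog devoured the bone",
    "the dog barked at the mailman",
]
vectors = context_vectors(corpus)
# Words that share contexts score higher than words that do not.
print(cosine(vectors["ate"], vectors["consumed"]))  # 1.0: identical contexts
print(cosine(vectors["ate"], vectors["barked"]))
```

Real word embeddings refine the same intuition with much larger corpora and dimensionality reduction, but the principle is identical: words in similar contexts get similar vectors.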

When we speak, we often use politeness rules that add meaning to our language. For example, when ordering in a restaurant, “I’ll have the steak” may seem too direct. To be less direct, we might say: “I guess I’ll have the steak, please.” Removing the extra words can reduce the politeness of the statement.

We may also add words to change the meaning of a sentence. “Could you possibly be any more annoying?” is an example where we intentionally add words to convey our displeasure. Imagine that you asked somebody how their day was and they responded with the following sentence: “Well, I went to work and I had meetings and typed stuff and did lots of talking.” This isn’t exactly a “fine.” Why would we expend so much time and effort to make a reply much longer than it could be? We use subtle cues like volume, intonation, length, manner, body language, and a host of other methods to add meaning to a sentence. Ideally, we would make sentences as compact as possible without being impolite.

Paul Grice suggested that we use four rules, or maxims, to make discussions cooperative:

  1. The maxim of quantity – one tries to be as informative as one possibly can, and gives as much information as is needed, but no more.
  2. The maxim of quality – one tries to be truthful, and does not give information that is false or that is not supported by evidence.
  3. The maxim of relation – one tries to be relevant, and says things that are pertinent to the discussion.
  4. The maxim of manner – one tries to be as clear and as orderly as one can in what one says to avoid vagueness, obscurity, and ambiguity.

One could argue that no kind of reduction is totally safe, but for the sake of a model, close may be good enough (as it is in horseshoes). Still, there may be a significant semantic difference between “I am tired” and “I am especially tired.”

Adding “especially” to the sentence may intentionally flout the Gricean maxim of quantity, shifting its meaning toward that of an intensifier like “very.” Similarly, “really, really, really tired” may differ significantly in meaning from “really tired,” let alone “tired.”

Would a chatbot that relied heavily on reduction perform poorly on a Turing test? Take the following conversation:

You: How are you?
Bot: I am fine.
You: Great. How did the meeting go?
Bot: It went well.
You: I’m glad. You were soooo nervous about it.
Bot: Yes.

Do you get a sense that the bot is angry at you? You might; you might not. Personally, my intuition suggests that the bot is angry and intentionally being brief with its replies, violating the Gricean maxim of quantity. I would hazard a guess that most people would not sense any kind of friendliness. What if this were a customer service bot, where friendliness is important?

Safe reduction reduces the database size, but also reduces the naturalness of language. Obviously, some variations can be reduced safely, but how much is too much?

Consider “How are you” and “How are you doing.” These two sentences likely have very little difference in meaning, which makes their reduction safe. But how can a model know this? Synonyms can be found from context. In fact, many safe reduction problems can be solved with constituency tests.

The constituency test is used by syntacticians to determine the syntactic category of a word or group of words. Take the following example:

Jack and Jill went up the hill.
They went up the hill.

Notice that “Jack and Jill” can be replaced with “they” and the sentence still makes sense. This means they belong to the same syntactic category; in this case, they’re both noun phrases. If we find words or groups of words that often occur in the same context in a document, we can assume they are similar. They may not only be the same part of speech (POS), but also forms of the same lexeme. A lexeme is basically the set of all possible forms of a word; e.g., “are” and “is” are forms of the verb “be.”

Simplifying words into a base form collapses all forms into a single common one, which is similar to the reduction of sentences. However, meaning is lost. For example, changing “him” to “he” loses the accusative case, which matters in English: “he punched he” is an impossible sentence because the object has the wrong case. Fortunately, in a subject-verb-object (SVO) language like English, the noun phrase coming after the verb is almost always the object. We can use these patterns to reduce sentences. For example, if “he” almost never follows the word “with,” that’s a clue that a reduction should not contain “with he.”
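That last observation can be checked with simple bigram counts over a corpus. A toy sketch (the corpus here is invented for illustration):

```python
from collections import Counter

def bigram_counts(sentences):
    """Count adjacent word pairs across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts

corpus = [
    "she went to the party with him",
    "jack argued with him yesterday",
    "he went up the hill with jill",
    "he said that he was tired",
]
counts = bigram_counts(corpus)
print(counts[("with", "him")])  # 2
print(counts[("with", "he")])   # 0 -> a reduction should avoid "with he"
```

With a large corpus, the same counts reveal which case forms are licensed after which words, without any hand-written grammar.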

Reducing a sentence can also disconnect bound words. Binding is basically the mapping of pronouns to their respective nouns.

Jack and Jill fell down the hill and they were injured.

“They” in the sentence above is bound to “Jack and Jill.” I won’t go deeply into binding theory, in which a pronoun refers to its antecedent noun, but it’s related to collocation. A collocation is a sequence of words or terms that co-occur more often than would be expected by chance. Not only can we detect collocations statistically (a simple n-gram model would do), but there is strong evidence that the brain also tracks collocation frequencies (some words sound right together, while others don’t).
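One standard statistic for “co-occur more often than chance” is pointwise mutual information. A minimal sketch over adjacent word pairs (the token list is illustrative):

```python
import math
from collections import Counter

def pmi(tokens, w1, w2):
    """Pointwise mutual information of the adjacent pair (w1, w2)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    if not bigrams[(w1, w2)]:
        return float("-inf")  # the pair never co-occurs
    p_pair = bigrams[(w1, w2)] / (len(tokens) - 1)
    p1 = unigrams[w1] / len(tokens)
    p2 = unigrams[w2] / len(tokens)
    return math.log2(p_pair / (p1 * p2))

tokens = ("jack and jill went up the hill and "
          "jack and jill fell down the hill").split()
# "jack and" co-occurs more often than chance predicts.
print(pmi(tokens, "jack", "and") > 0)  # True
print(pmi(tokens, "jill", "jack"))     # -inf: never adjacent
```

A positive PMI score means the pair occurs together more often than its individual frequencies would predict, which is exactly the definition of a collocation.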

I am really tired.
I am very tired.
I am truly tired.
I am extra tired.

Any one of those intensifiers can be exchanged with any other and the meaning of the sentence stays much the same. This means that “really,” “very,” “truly,” and “extra” can be treated as synonyms.

But we can take it a step further and exchange entire phrases:

He got big because he ate a lot.
He got big because he devoured tons of food.
He got big because he consumed plenty of meals.

The entire predicate of the second clause in each sentence can replace any of the others without changing the meaning. Safe reduction algorithms usually look for such context patterns in a large corpus, reducing all of the sentences above into one; here we may end up with something like “He got big because he ate a lot.”
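Once equivalent predicates have been discovered, applying the reduction can be as simple as a lookup table mapping each variant to one canonical phrase (the entries below are invented for illustration):

```python
# A hypothetical reduction table mapping equivalent predicates
# to one canonical phrase.
REDUCTIONS = {
    "devoured tons of food": "ate a lot",
    "consumed plenty of meals": "ate a lot",
}

def reduce_sentence(sentence):
    """Replace any known predicate variant with its canonical form."""
    for phrase, canonical in REDUCTIONS.items():
        sentence = sentence.replace(phrase, canonical)
    return sentence

print(reduce_sentence("He got big because he devoured tons of food."))
# He got big because he ate a lot.
```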

One method for retaining natural language in a database is to map the reduced phrase to all of the original phrases rather than merging them. The disadvantage of keeping the original sentences is a larger database; the advantage is that a smaller store of basic sentences, such as a triple store, can do all the matching work.
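A sketch of that mapping, assuming a simple in-memory store (the surface forms are illustrative; the reduced pattern “HOW ARE YOU” comes from the earlier example):

```python
from collections import defaultdict

# Map each reduced pattern to its original surface forms
# instead of merging them away.
originals = defaultdict(list)

def store(reduced, original):
    originals[reduced].append(original)

store("HOW ARE YOU", "How are you?")
store("HOW ARE YOU", "Hello, how are you doing today?")
store("HOW ARE YOU", "Hey, how's it going?")

# The small set of reduced patterns does the matching work,
# while the originals stay available for natural-sounding replies.
print(len(originals))                  # 1 reduced pattern
print(len(originals["HOW ARE YOU"]))  # 3 original phrasings
```

Matching runs against the one reduced pattern, while a response generator can pick among the stored originals to keep replies varied and natural.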
