Getting a computer to recognise humour is a tricky undertaking, but some language researchers are attempting it — if you can call puns ‘humour’.
Understanding a pun is sort of like a word-sense disambiguation task. If you’ll forgive an example:
Sign at a drug rehabilitation center: “Keep off the grass.”
Here, two senses of a word play off each other, to somewhat humorous effect. Word-sense disambiguation is a well-studied area in natural language processing, so this is as good a place as any to start.
Here’s an account of one such attempt.
To teach the program to spot jokes, the researchers first gave it a database of words, extracted from a children’s dictionary to keep things simple, and then supplied examples of how words can be related to one another in different ways to create different meanings. When presented with a new passage, the program uses that knowledge to work out how those new words relate to each other and what they likely mean. When it finds a word that doesn’t seem to fit with its surroundings, it searches a digital pronunciation guide for similar-sounding words. If any of those words fits in better with the rest of the sentence, it flags the passage as a joke. The result is a bot that “gets” jokes that turn on a simple pun.
That sounds simple, but the main problem is how to tell when words don’t seem to fit with the other words in the sentence. For this project, it sounds like they’re generating co-occurrence tables, or statistics about how often any given word is likely to be seen with any other given word. That way, if a word shows up with other words it’s not likely to co-occur with, the system will flag it.
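To make that guess concrete, here is a minimal sketch of the co-occurrence idea in Python. It is my own illustration, not the researchers' code: the toy corpus, the stopword list, and the SOUNDS_LIKE table (standing in for the pronunciation guide) are all invented, and a real system would need far more data and a smarter measure of fit than raw counts.

```python
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration only.
CORPUS = [
    "the knight rode a horse",
    "the knight drew a sword",
    "the night was dark and cold",
    "the dark night sky was cold",
    "the horse carried the knight",
]

# Hypothetical stand-in for a pronunciation guide: word -> similar-sounding words.
SOUNDS_LIKE = {"night": ["knight"], "knight": ["night"]}

STOPWORDS = {"the", "a", "and", "was"}


def content_words(sentence):
    """Lower-case, split on whitespace, and drop stopwords."""
    return [w for w in sentence.lower().split() if w not in STOPWORDS]


def build_cooccurrence(sentences):
    """Count how many sentences each unordered pair of content words shares."""
    counts = Counter()
    for sentence in sentences:
        for a, b in combinations(sorted(set(content_words(sentence))), 2):
            counts[(a, b)] += 1
    return counts


def fit(word, context, counts):
    """Sum of co-occurrence counts between `word` and each context word."""
    return sum(counts[tuple(sorted((word, other)))] for other in context)


def find_sound_pun(sentence, counts):
    """Return (word, better_fitting_sound_alike), or None if nothing is flagged."""
    words = content_words(sentence)
    for i, word in enumerate(words):
        context = words[:i] + words[i + 1:]
        for alt in SOUNDS_LIKE.get(word, []):
            if fit(alt, context, counts) > fit(word, context, counts):
                return word, alt
    return None


counts = build_cooccurrence(CORPUS)
print(find_sound_pun("the night rode a horse", counts))  # ('night', 'knight')
print(find_sound_pun("the night was dark", counts))      # None
```

In the first sentence, 'night' never appears alongside 'rode' or 'horse' in the toy corpus, but its sound-alike 'knight' does, so the sentence gets flagged. In the second, 'night' already fits its surroundings and nothing is flagged.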
This approach might work well for puns that rely on similar-sounding words, but what about our ‘Keep off the grass’ example above, where the pun relies on two senses of the same word? This system will fail to notice that a pun exists because there’s nothing to suggest that a “Keep off the grass” sign is anything out of the ordinary. And ‘drug’ and ‘grass’ do happen to co-occur in texts, so the system will see no incongruity.
To recognise these kinds of jokes, I might try scanning words in the sentence for multiple senses, and then seeing if groups of words show interesting properties. Perhaps ‘grass-1’ will co-occur frequently with ‘keep’, ‘off’, and ‘sign’, but ‘grass-2’ will co-occur with ‘drug’ and ‘rehabilitation’. We might then infer that since both senses of ‘grass’ are well-linked to different words in the sentence, a pun is going on.
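Here is a rough sketch of what I have in mind, again in Python. Everything in it is an assumption made for illustration: the sense labels, the SENSES inventory, and the SENSE_LINKS sets (which pretend we already know which words each sense co-occurs with frequently). In practice those links would have to come from a sense-tagged corpus.

```python
# Hypothetical sense inventory: word -> its sense labels.
SENSES = {"grass": ["grass-1", "grass-2"]}

# Hypothetical links: sense -> words it is assumed to co-occur with often.
SENSE_LINKS = {
    "grass-1": {"keep", "off", "sign", "lawn"},       # the plant
    "grass-2": {"drug", "rehabilitation", "smoke"},   # the drug
}


def supported_senses(word, context):
    """Map each sense of `word` to the context words that support it."""
    context_set = set(context)
    support = {}
    for sense in SENSES.get(word, []):
        linked = SENSE_LINKS[sense] & context_set
        if linked:
            support[sense] = linked
    return support


def find_sense_pun(words):
    """Return (word, support) if two or more senses of a word are supported."""
    for i, word in enumerate(words):
        context = words[:i] + words[i + 1:]
        support = supported_senses(word, context)
        if len(support) >= 2:
            return word, support
    return None


joke = "sign at a drug rehabilitation center keep off the grass".split()
print(find_sense_pun(joke))
print(find_sense_pun("keep off the grass".split()))  # None
```

On the full joke, both senses of 'grass' find support among the other words, so it is flagged. On a plain "keep off the grass", only the first sense is supported and nothing is flagged.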
This kind of approach, where we look for unexpected word occurrences, comes close to the essence of humour. Many things are funny because they’re unexpected, yet somehow right. To a small child, slipping on a banana peel is funny because it’s unexpected. Walking down the street unimpeded is normal, and therefore not funny. As humans, we have a lot of experience with the world, and we know what’s likely to happen and what’s not. Giving this information to computers is a difficult task, but it may be the key to humour recognition.
13 August 2007 at 5:36 pm
OK so here is my pain in the ass comment for the day. You once wrote how you hate science articles about “animals using language” and gave an example of an article starting off with “whales can talk with each other” and then at the end of the article talking about communication instead of language.
Well you start with, “Getting a computer to recognize humor…”
and then go on to explain,
“This kind of approach, where we look for unexpected word occurrences, comes close to the essence of humor.”
Maybe I’m wrong but I think being very careful in how you represent computer logic is just as important as not misrepresenting animal behaviors. It’s easy to fall into the pathetic fallacy (giving inanimate objects human characteristics).
14 August 2007 at 6:11 am
Well noted.
But surely if I get a computer to go ‘ping’ when it detects a joke, then I have in fact taught the computer to recognise humour. I only mean ‘recognise’ in a very mechanistic sense. And I wouldn’t claim that the computer has a sense of humour.
17 August 2007 at 7:41 pm
I’m on shaky ground here but I don’t think you can “teach” a computer to do anything. You can program it to do things.
I am also not sure that the computer is recognizing humor. I think it is “recognizing” the specific set of parameters that you programmed it to look for.
Let me add that even though I hate the term A.I., I do appreciate the complexity and am amazed at the progress made by you and those like you working on language recognition programs.
And a final question: what if you come up with a pun that may have several different meanings, but only one of them is most appropriate for the actual meaning of the joke? How might you tackle that type of humor recognition?