Getting a computer to recognise humour is a tricky undertaking, but some language researchers are attempting it — if you can call puns ‘humour’.
Understanding a pun is sort of like a word-sense disambiguation task. If you’ll forgive an example:
Sign at a drug rehabilitation center: “Keep off the grass.”
Here, two senses of a word play off each other, to somewhat humorous effect. Word-sense disambiguation is a well-studied area in natural language processing, so this is as good a place as any to start.
Here’s an account of one such attempt.
To teach the program to spot jokes, the researchers first gave it a database of words, extracted from a children’s dictionary to keep things simple, and then supplied examples of how words can be related to one another in different ways to create different meanings. When presented with a new passage, the program uses that knowledge to work out how those new words relate to each other and what they likely mean. When it finds a word that doesn’t seem to fit with its surroundings, it searches a digital pronunciation guide for similar-sounding words. If any of those words fits in better with the rest of the sentence, it flags the passage as a joke. The result is a bot that “gets” jokes that turn on a simple pun.
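The last two steps of that account, the sound-alike lookup and the substitution test, might look something like the sketch below. This is my own guess at the shape of it, not the researchers' code: the `SOUNDS_LIKE` table, the `FIT` scores, the threshold, and the function names are all invented for illustration.

```python
# Hypothetical pronunciation guide: word -> similar-sounding words.
# A real system would derive this from a pronouncing dictionary.
SOUNDS_LIKE = {
    "whine": ["wine", "wane"],
    "knight": ["night"],
}

# Hypothetical "fit" scores between a word and a context word.
# The numbers are made up purely for this example.
FIT = {
    ("wine", "bottle"): 0.9,
    ("wine", "drank"): 0.8,
    ("whine", "bottle"): 0.05,
    ("whine", "drank"): 0.05,
}


def fit(word, context):
    """Average fit of `word` against every other word in `context`."""
    others = [w for w in context if w != word]
    if not others:
        return 0.0
    return sum(FIT.get((word, o), 0.0) for o in others) / len(others)


def find_pun(words, threshold=0.2):
    """Return (odd_word, better_fitting_sound_alike), or None if nothing is found."""
    for word in words:
        original_fit = fit(word, words)
        if original_fit >= threshold:
            continue  # the word fits its surroundings, so move on
        for candidate in SOUNDS_LIKE.get(word, []):
            # Swap the candidate in and score it against the same neighbours.
            swapped = [candidate if w == word else w for w in words]
            candidate_fit = fit(candidate, swapped)
            if candidate_fit > original_fit and candidate_fit >= threshold:
                return word, candidate
    return None
```

Given the content words `["drank", "bottle", "whine"]`, `find_pun` returns `("whine", "wine")`; given `["drank", "bottle", "wine"]` it returns `None`, because nothing with a sound-alike looks out of place.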
That sounds simple, but the main problem is how to tell when a word doesn’t fit with the other words in the sentence. For this project, it sounds like they’re generating co-occurrence tables: statistics about how often any given word is seen alongside any other given word. That way, if a word shows up with other words it rarely co-occurs with, the system will flag it.
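As a rough illustration of what a co-occurrence table could be, here is a minimal sketch that counts how often pairs of words share a sentence and then picks out the word that sits least comfortably with its neighbours. The four-sentence corpus and the function names are my own assumptions; a real table would be built from far more text and would probably use a more careful statistic than raw counts.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(sentences):
    """Count how often each unordered pair of words shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        # Sort the distinct words so each pair is stored in one fixed order.
        words = sorted(set(sentence.lower().split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts


def cooccurrence(counts, a, b):
    """Look up a pair regardless of the order the words are given in."""
    return counts[tuple(sorted((a, b)))]


def least_expected(counts, words):
    """Return the word that co-occurs least with the rest of the sentence."""
    def total(word):
        return sum(cooccurrence(counts, word, other)
                   for other in words if other != word)
    return min(words, key=total)


# Toy corpus, invented for illustration only.
CORPUS = [
    "she drank wine from the bottle",
    "he drank the wine",
    "the bottle of wine",
    "children whine loudly",
]

counts = build_cooccurrence(CORPUS)
```

With this toy corpus, `least_expected(counts, ["drank", "bottle", "whine"])` picks out `"whine"`, since it never appears alongside the other two words.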
This approach might work well for puns that rely on similar-sounding words, but what about our ‘Keep off the grass’ example above, where the pun relies on two senses of the same word? This system will fail to notice that a pun exists because there’s nothing to suggest that a “Keep off the grass” sign is anything out of the ordinary. And ‘drug’ and ‘grass’ do happen to co-occur in texts, so the system will see no incongruity.
To recognise these kinds of jokes, I might try scanning words in the sentence for multiple senses, and then seeing if groups of words show interesting properties. Perhaps ‘grass-1’ will co-occur frequently with ‘keep’, ‘off’, and ‘sign’, but ‘grass-2’ will co-occur with ‘drug’ and ‘rehabilitation’. We might then infer that since both senses of ‘grass’ are well-linked to different words in the sentence, a pun is going on.
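Here is a minimal sketch of that idea, assuming we already have a sense inventory that lists, for each sense, the words it tends to co-occur with. The labels ‘grass-1’ and ‘grass-2’ come from the paragraph above; the word lists, the `SENSES` table, and the function names are hypothetical.

```python
# Hypothetical sense inventory: each sense maps to words it tends to
# co-occur with. The associate lists are invented for this example.
SENSES = {
    "grass": {
        "grass-1": {"keep", "off", "sign", "lawn", "mow"},
        "grass-2": {"drug", "rehabilitation", "smoke"},
    },
}


def supported_senses(word, context, min_links=1):
    """Map each sense of `word` to the context words that link to it."""
    links = {}
    for sense, associates in SENSES.get(word, {}).items():
        hits = associates & set(context)
        if len(hits) >= min_links:
            links[sense] = hits
    return links


def find_sense_puns(words):
    """Return words for which two or more senses find support in the sentence."""
    puns = {}
    for word in words:
        context = [w for w in words if w != word]
        links = supported_senses(word, context)
        if len(links) >= 2:  # more than one sense is well-linked: possible pun
            puns[word] = links
    return puns


sentence = "sign at a drug rehabilitation center keep off the grass".split()
result = find_sense_puns(sentence)
```

For the rehabilitation-center sign, `result` links ‘grass-1’ to `sign`, `keep`, and `off`, and ‘grass-2’ to `drug` and `rehabilitation`, so the word is flagged. A bare “keep off the grass” supports only ‘grass-1’ and is not flagged.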
This kind of approach, where we look for unexpected word occurrences, comes close to the essence of humour. Many things are funny because they’re unexpected, yet somehow right. To a small child, slipping on a banana peel is funny because it’s unexpected. Walking down the street unimpeded is normal, and therefore not funny. As humans, we have a lot of experience with the world, and we know what’s likely to happen and what’s not. Giving this information to computers is a difficult task, but it may be the key to humour recognition.