I’m used to hearing people complain about the language of Them Dern Kids, but this rationale is a new one.
Here’s the claim:
800 words won’t get job done
LONDON: A generation of teenagers risks making itself unemployable because its members are using a vocabulary of only about 800 words a day, according to the British government’s first children’s communication tsar.
Communication tsar? Are they sure she’s not a czar?
I wonder what’s causing the supposed paucity of vocabulary? Could it be the Internet and mobile phones?
The teenagers are avoiding using a broad vocabulary and complex words in favour of the abbreviated “teenspeak” of text messages, social networking sites and internet chat rooms.
Thought so.
Jean Gross, the government’s adviser on childhood language development, is planning a national campaign to prevent children failing in the classroom and the workplace because they cannot express themselves.
“Teenagers are spending more time communicating through electronic media and text messaging, which is short and brief,” she said. “We need to help today’s teenagers understand the difference between their textspeak and the language they need to succeed — 800 words will not get you a job.”
Gee, 800 words doesn’t sound like a lot. Or is it? How many words do most people say? Let’s check.
First, keep in mind that the 800 words claim is about daily vocabulary, not total vocabulary. That is, young people are using the same 800 words over and over again in a typical day. I’m not sure if that’s true, but let’s accept it for now. The question is: how many different words do adults employ in a day?
We’re going to use a dialogue corpus to find out. I’m pulling words from Verbmobil-2, a corpus of appointment scheduling dialogues. But we don’t know how many words to pull until we know how many words someone speaks in a day. This is a scary prospect, laden with assumptions.
I had a read through the corpus and found that I can read about 250 words out loud in a minute. Of course, in a dialogue you’d only be speaking about half the time unless you’re rude, or a lecturer. (Or, like me, both.) So let’s say I’d rip through 7,500 words in an hour. Most of us spend some time alone or watching TV, so I doubt we’d spend the equivalent of 4 full hours of every day talking. But let’s say 30,000 words as an upper boundary. (I admit this is highly speculative. Stay with me.)
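If you'd like to plug in your own numbers, the back-of-the-envelope sum above can be written out as a few lines of Python. This is only a sketch of the arithmetic in the previous paragraph; the constant and function names are made up for illustration, and the half-the-time and four-hour figures are the same guesses as above.

```python
# Rough estimate of words spoken per day, using the guesses from the post.
READING_RATE_WPM = 250   # words read aloud per minute
SPEAKING_SHARE = 0.5     # in a dialogue, you talk about half the time
HOURS_TALKING = 4        # assumed upper bound on hours of conversation per day

def words_per_hour(rate_wpm=READING_RATE_WPM, share=SPEAKING_SHARE):
    """Words actually spoken in one hour of dialogue."""
    return rate_wpm * 60 * share

def words_per_day(hours=HOURS_TALKING, rate_wpm=READING_RATE_WPM, share=SPEAKING_SHARE):
    """Words spoken in a day with the given number of hours of dialogue."""
    return words_per_hour(rate_wpm, share) * hours

print(words_per_hour())  # 7500.0
print(words_per_day())   # 30000.0
```

If you think you talk for two hours rather than four, `words_per_day(hours=2)` gives 15,000.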
Here I’ve listed the number of word types (different words) for various numbers of word tokens (each separate word we say) in the Verbmobil-2 corpus. If you think you’re more laconic or loquacious, you can adjust your expectations accordingly.
Word tokens | Word types
10,000 words | 814 types
20,000 words | 1,080 types
30,000 words | 1,342 types
40,000 words | 1,510 types
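For anyone who wants to build a table like this from their own corpus, here is a minimal Python sketch of the counting. It is not the script used for the figures above: the tokenizer is a crude assumption (lowercase, runs of letters and apostrophes), the function names are invented for illustration, and loading Verbmobil-2 itself is left out.

```python
import re

def tokenize(text):
    """Crude tokenizer (an assumption): lowercase runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())

def types_at_checkpoints(tokens, checkpoints):
    """Map each checkpoint n to the number of distinct words in the first n tokens."""
    wanted = set(checkpoints)
    seen = set()
    results = {}
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        if n in wanted:
            results[n] = len(seen)
    return results

# Toy example; a real run would use tens of thousands of tokens.
sample = tokenize("the cat saw the dog and the dog saw the cat")
print(types_at_checkpoints(sample, [4, 8, 11]))  # {4: 3, 8: 5, 11: 5}
```

Even in the toy example the pattern shows up: the token count keeps climbing while the type count flattens out.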
So if you’re an adult on the lower end of the talking scale, you’re going to use about 800 different words, over and over. And even if you quadruple the number of words you say, that still won’t quite double the daily vocabulary. Keep in mind that 40,000 words represents hours and hours of transcripts. The fact is, 800 words is quite a lot. Even if teens only use the same 800 words over and over, that’s certainly not a sign that their vocab is sub-standard. That’s just the way word frequencies fall.
UPDATE: I’ve just discovered this article in USA Today about a study that saw people wearing tape recorders all day long.
Both sexes say about 16,000 words a day, a study in Science magazine says.
He and colleagues analyzed conversations recorded from 1998 to 2004 of 396 students in the USA and Mexico, 210 women and 186 men, ages 18-29. The study examined word count, not vocabulary or word use. Pennebaker says two-thirds of participants spoke 11,000 to 25,000 words a day; the average for both sexes was about 16,000.
So there it is. Sixteen thousand words of dialogue would probably be made up of under 1,000 word types a day, not too far from 800.
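To see where a figure like that comes from, you can draw a straight line between the 10,000 and 20,000 rows of the Verbmobil-2 table. The sketch below does that in Python. Linear interpolation is my own simplification (type counts really grow along a curve, so the true figure is probably a touch higher), and the function name is made up.

```python
# (tokens, types) rows from the Verbmobil-2 table earlier in the post
TABLE = [(10_000, 814), (20_000, 1_080), (30_000, 1_342), (40_000, 1_510)]

def estimate_types(tokens, table=TABLE):
    """Estimate word types by straight-line interpolation between table rows."""
    for (t0, v0), (t1, v1) in zip(table, table[1:]):
        if t0 <= tokens <= t1:
            return v0 + (v1 - v0) * (tokens - t0) / (t1 - t0)
    raise ValueError("token count is outside the range of the table")

print(round(estimate_types(16_000)))  # 974
```

The same function puts the study's two-thirds range of 11,000 to 25,000 words a day at roughly 840 to 1,210 types.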
Let’s take a look at another claim in the article.
Ms Gross said her concerns were supported by research by Tony McEnery, a professor of linguistics at Lancaster University, who found in a study that the top 20 words used by teenagers, including “yeah”, “no” and “but”, account for about one-third of the words used.
Twenty words is not a lot. Is it possible that it could account for a third of the total?
Fortunately, we have frequency statistics for many corpora. If we take a look at the top 1000 words from COLT, the Bergen Corpus of London Teenage Language, we can see that the top 20 words account for 35.6 percent, or about a third. (Some words are excluded from this count, but that just means that the real proportion will be a good deal smaller, which makes the teens seem even more erudite.)
Now we head over to this data from the BNC, or the British National Corpus, a large and wide-ranging collection of spoken and written language. Here, the top 20 words account for around 32 percent of the total, or… about a third.
I decided to run a counter over some works of literature. I tried George Orwell’s 1984. Nobody’s going to accuse Orwell of having a tiny vocabulary. Yet here the top 20 words account for 33.7 percent of the total. And for Alice’s Adventures in Wonderland by Lewis Carroll, the top 20 words make up 33.1 percent of the total: again, about a third. Somebody better tell the Pseudonym Boys that they’ll never get jobs with that kind of vocabulary.
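A counter like that takes only a few lines. Here is a rough Python sketch of the idea, not the exact script behind the figures above: the tokenizer is a simple assumption, the function name is invented, and different tokenizing choices will nudge the percentages a little.

```python
import re
from collections import Counter

def top_n_share(text, n=20):
    """Fraction of all word tokens accounted for by the n most frequent words."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("no words found in text")
    counts = Counter(tokens)
    top_total = sum(count for _, count in counts.most_common(n))
    return top_total / len(tokens)

# Toy example: the single most frequent word covers half of six tokens.
print(top_n_share("a a a b b c", n=1))  # 0.5
```

To try it on a whole book, read a plain-text copy into a string and pass it in with the default `n=20`.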
Gross’s claims sound impressive until you break them down. Most people don’t do this because it’s easier to just accept claims that you already believe. But it’s just another way to complain about young people in a way that’s socially acceptable. It’s a shame people try to enlist linguistic data to confirm their prejudices.
If you want to hear me say about the same thing on the radio, you can listen to last week’s RTRFM interview. For some reason, I was talking pretty fast. I bet I could have clocked 60,000 words per day at that rate.

I’m on about 5/6ths of the way through the stream. Watch out; it starts playing as soon as the page loads.