New York-based data scientist and designer Matt Daniels recently noted Shakespeare’s much-touted vast vocabulary and charted how many different words Shakespeare used in comparison with contemporary hip-hop artists. It turns out that a good handful of rappers use a larger vocabulary than Shakespeare did over the same-sized block of text.
Daniels doesn’t draw the conclusion that today’s rappers are more creative and poetic than Shakespeare, but the implication hovers.
It’s true that admirers of Shakespeare have often celebrated the sheer size of the vocabulary he used in his works. There’s a paragraph in the introduction to the current Norton Shakespeare which does exactly this.
Looking further back, a century and a half ago the philologist Max Müller contrasted the 300 words used by a rural labourer with the 3,000 of the educated person of his day and the 15,000 of Shakespeare. It seemed natural that the pre-eminent creative writer in the Western tradition should also have the largest vocabulary ever known, something suitably prodigious and extraordinary.
But in an age of data, claims such as these are bound to be tested. Two separate studies, one in a book on stylistics (2011) and one in a Shakespeare journal (also from 2011), have now shown that, when you compare like with like, Shakespeare does not in fact have a very large vocabulary.
If you take six plays by Shakespeare and six plays by one of his contemporaries, the number of different words used in Shakespeare’s plays is no larger, and often smaller, than in the others. Shakespeare does not introduce any more new words in successive plays than his rivals do.
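The two comparisons described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in either study: the function names, the simple tokenizer, and the toy strings are all assumptions made here, and a real analysis would also need to handle spelling variants and inflected forms.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into rough word tokens.

    Assumption: a word is a run of ASCII letters and straight apostrophes.
    """
    return re.findall(r"[a-z']+", text.lower())

def distinct_words(text, sample_size):
    """Count the different words among the first `sample_size` tokens.

    Fixing the sample size is what makes the comparison like with like.
    """
    tokens = tokenize(text)
    if len(tokens) < sample_size:
        raise ValueError("text is shorter than the requested sample size")
    return len(set(tokens[:sample_size]))

def new_words_per_text(texts):
    """For each text in order, count the words not seen in any earlier text."""
    seen = set()
    counts = []
    for text in texts:
        words = set(tokenize(text))
        counts.append(len(words - seen))
        seen |= words
    return counts

# Toy strings standing in for full plays (hypothetical data).
print(distinct_words("to be or not to be that is the question", 6))
print(new_words_per_text(["to be or not", "not to be king", "king and queen"]))
```

Running both functions over equal-sized samples from two authors gives the kind of side-by-side figures the studies report, without the bias that comes from one author simply having more surviving text.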
The myth of Shakespeare’s prodigious vocabulary
There are three obvious reasons why the myth of Shakespeare’s huge vocabulary had such a grip, and lasted so long: his celebrity as an author, already mentioned; the number of his plays that have survived, reflecting both his productivity and the efforts made to preserve his plays after his death; and the fact that whereas there are many good ways of estimating his vocabulary from concordances and good complete editions, the same was not true for his peers.
Shakespeare was just better documented and his vocabulary was easier to measure.
But the end of the myth does not leave Shakespeare diminished. It just makes you think about whether using a vast vocabulary is such a good thing anyway.
After all, writing packed with a constant stream of new and different words can be quite hard to read or listen to. This may work well when the writer wants to depict an unfamiliar world, like the civilisation of a remote planet (science fiction) or an underground, secret organisation (gangster fiction), but not in a domestic comedy or an imaginary dialogue between two people who know each other well.
Does vocabulary size really matter?
So many of Shakespeare’s memorable lines are not based on fancy vocabulary:
To be or not to be …
All the world’s a stage …
Some are born great …
Shall I compare thee …
It looks as though it is what a writer does with words, rather than how many different words they cram into a speech or a song, that matters.
In fact, it may be that what is remarkable about Shakespeare’s language is not its outlandishness but how close it is to the overall standard of the language of his time, as another numbers-based analysis suggests.
So what is it that fascinates people about vocabulary size? It seems to offer a neat quantitative measure for literary quality. But this does not stand up to scrutiny.
The good thing about busting the Shakespeare vocabulary myth is that we can now avoid that particular dead-end in working out what makes his use of language so remarkable and explore more promising ones, such as the abundance and creativity of his metaphors, and his ear for the turns of ordinary speech.
As for the rappers graphic, it is interesting to compare the word use of the different artists. That said, as Ward Elliott, one of the busters of the Shakespeare vocabulary myth, pointed out to me, it’s not fair to put the Wu-Tang Clan on the same scale as individual rappers, because combining different writers will always make for a larger overall vocabulary.
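The point about groups can be shown with a small sketch. It assumes each writer's vocabulary is a set of words and treats the group's vocabulary as the union of those sets, which can never be smaller than the largest individual set. The function name and the word sets are hypothetical, and the sketch ignores the equal-sized-sample constraint for simplicity.

```python
def combined_vocabulary_size(vocabularies):
    """Return the number of different words across several writers' word sets."""
    return len(set().union(*vocabularies))

# Hypothetical word sets for two writers in the same group.
writer_a = {"cash", "rules", "everything"}
writer_b = {"cash", "dollar", "bill"}

largest_individual = max(len(writer_a), len(writer_b))
group_total = combined_vocabulary_size([writer_a, writer_b])
print(largest_individual, group_total)
```

Only the shared word is counted once, so the pooled total exceeds either writer's own count whenever the writers differ at all.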
But all in all we have lots to learn from bringing quantification to the study of the language of writers. Just as long as we don’t confuse vocabulary size with literary quality.