Doing Hamlet by the numbers

Literary theory has been accused of data inadequacy, typically by practitioners of more empirical disciplines — for them, lack of quantification means lack of academic rigour.

Combine this criticism with the theory-driven distrust of interpretation prevalent in the humanities, and what are English profs to do? How does one talk “scientifically” about books without an appropriately quantitative methodology?

Enter Franco Moretti and network theory, and with it the ability to “read” literature as charts and numbers.

The scare quotes around the word “read” are significant, since, as one report has it, “Franco Moretti … often doesn’t read the books he studies. Instead, he analyzes them as data.”

Working with a small group of graduate students, the Stanford University English professor has fed thousands of digitized texts into databases and then mined the accumulated information for new answers to new questions.

What kinds of questions can be answered by a database, questions that can’t easily be answered by more traditional critical approaches?

How far, on average, do characters in 19th-century English novels walk over the course of a book? How frequently are new genres of popular fiction invented? How many words does the average novel’s protagonist speak? By posing these and other questions, Moretti has become the unofficial leader of a new, more quantitative kind of literary study.

Well, the first question seems rather pointless to me, but there are conceivable uses for the answers to the others. Still, how necessary is a quantitative analysis of literature? What new insights become available? Moretti acknowledges this issue: “New methods need a little time to find their way, but we should have an answer to the ‘so what’ question in a matter of a few years. Not tomorrow, but not 20 years from now, either.”

In the source article, Network Theory, Plot Analysis, which has been published in somewhat shortened form by New Left Review, Moretti reiterates that quantitative literary analysis using network theory is an emerging discipline: “What about plot – how can that be quantified? This paper is the beginning of an answer, and the beginning of the beginning is network theory.” Essentially mathematical, network theory is a latecomer to literary study.

At its simplest, network theory as it applies to literary works is a theory that studies connections within large groups of objects: the objects can be just about anything – banks, neurons, film actors, research papers, friends… 

The yearning which humanities profs feel for empirical validation can be read between the lines of this passage from the beginning of Network Theory, Plot Analysis:

The theory proper requires a level of mathematical intelligence which I unfortunately lack; and it typically uses vast quantities of data which will also be missing from my paper. But this is only the first in a series of studies we’re doing at the Stanford Literary Lab; and then, even at this early stage, a few things emerge.

It seems that Moretti has published an article that relies on a theoretical base which he admits he doesn’t fully understand. Such is the power of motivation. Yet his intentions are sincere, and he has freely admitted in several places that his work is just a start. Perhaps the methodology will prove to be useful; perhaps it won’t.

What has emerged is a graphical representation of Hamlet in terms of who speaks to whom, expressed as a network of interaction lines between character vertices. The graph is not directional, nor is it proportional. The interaction line (or “edge,” in the vocabulary of the theory) is the same kind and the same “weight” for the short exchange between Hamlet and the Norwegian Captain in Act IV as for the hundreds of lines Hamlet exchanges with Horatio.
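
For readers who want to see what "not directional, nor proportional" means in practice, here is a minimal Python sketch. The exchanges and line counts are invented for illustration and are not taken from Moretti's data or from a count of the play; `build_graph` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical tally of exchanges: (speaker, addressee, lines spoken).
# The line counts are invented for illustration, not counted from the play.
exchanges = [
    ("Hamlet", "Horatio", 300),
    ("Horatio", "Hamlet", 150),
    ("Hamlet", "Captain", 10),
    ("Claudius", "Hamlet", 80),
]

def build_graph(exchanges):
    """Return an undirected, unweighted graph as a dict of neighbour sets."""
    graph = defaultdict(set)
    for speaker, addressee, _lines in exchanges:
        # Direction and line count are deliberately thrown away:
        # one word or three hundred lines yields the same single edge.
        graph[speaker].add(addressee)
        graph[addressee].add(speaker)
    return dict(graph)

graph = build_graph(exchanges)

# Each edge appears once, whichever way the speech ran.
edges = {frozenset((a, b)) for a, neighbours in graph.items() for b in neighbours}
```

In this sketch the Hamlet and Captain edge is indistinguishable from the Hamlet and Horatio edge, which is exactly the flattening described above.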

Here’s Moretti’s chart for all of the verbal interactions in Hamlet:

The source article published online by Stanford University contains thirty-four variations on this basic graph. By isolating certain relationships and “dropping” certain characters (and their “edges”) one after another, it’s possible to represent a number of plot dynamics.

For example, Moretti is struck by the central role Horatio plays in connecting Hamlet to the world outside the Danish court. Remove Horatio, and Hamlet is left dramatically isolated, facing his task of revenge completely alone. This is a telling insight, but I wonder how much noticing it depends on network theory.
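
The "drop a character and see what happens" experiment can be sketched in a few lines of Python. The network below is a toy one invented for illustration (the "Sentinel" and "Messenger" vertices are placeholders, not Moretti's edge list), and the function names `drop` and `reachable` are my own.

```python
from collections import deque

# Toy network, invented for illustration; not Moretti's actual edge list.
toy = {
    "Hamlet": {"Claudius", "Gertrude", "Horatio"},
    "Claudius": {"Hamlet", "Gertrude"},
    "Gertrude": {"Hamlet", "Claudius"},
    "Horatio": {"Hamlet", "Sentinel", "Messenger"},
    "Sentinel": {"Horatio"},
    "Messenger": {"Horatio"},
}

def drop(graph, character):
    """Return a copy of the graph without the character or its edges."""
    return {
        name: neighbours - {character}
        for name, neighbours in graph.items()
        if name != character
    }

def reachable(graph, start):
    """Breadth-first search: everyone connected to start, directly or not."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

before = reachable(toy, "Hamlet")
after = reachable(drop(toy, "Horatio"), "Hamlet")

# Characters Hamlet can no longer reach once Horatio is gone.
cut_off = before - after - {"Horatio"}
```

In the toy network, dropping Horatio severs Hamlet from both outsiders, which is the kind of structural dependence the diagrams are meant to expose.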

Another example of an insight that is available by other means is what Moretti calls “the region of death” — all but two of the characters who speak to both Hamlet and Claudius die. Since the clash between Hamlet and Claudius is, in Hamlet’s words, a conflict of “mighty opposites,” it’s not surprising that in a tragedy most of those who are drawn into the conflict are swallowed up by it. Do we require a network chart to discover or understand this?
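
In graph terms, the "region of death" is simply the set of vertices adjacent to both Hamlet and Claudius. A rough Python sketch follows. It uses a hand-picked subset of characters rather than Moretti's full network, so its counts will not match his "all but two"; the variable names are my own.

```python
# A hand-picked subset of each character's interlocutors, for illustration
# only; Moretti's full network contains many more vertices and edges.
speaks_with = {
    "Hamlet": {"Claudius", "Gertrude", "Polonius", "Laertes", "Horatio", "Captain"},
    "Claudius": {"Hamlet", "Gertrude", "Polonius", "Laertes", "Horatio", "Cornelius"},
}

# Characters in this subset who are dead by the end of the play.
dies = {"Hamlet", "Claudius", "Gertrude", "Polonius", "Laertes"}

# The "region": everyone linked to both antagonists.
shared = speaks_with["Hamlet"] & speaks_with["Claudius"]

dead_in_region = shared & dies
survivors_in_region = shared - dies
```

The whole computation is one set intersection, which is perhaps another way of saying that the insight does not need much machinery.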

There’s nothing “wrong” with the analysis the chart provides, but do we need it? And what can a chart tell us about the significance of the fact that, of Laertes and Horatio, the embodiments of “blood and judgement,” it’s Horatio who stands over Hamlet’s body at the end? Or, for that matter, that it’s Fortinbras who joins him? What is the significance of the fact that he who thinks without having to act and he who acts without having to think are the survivors? Yes, they speak to each other, but that’s all a diagram can tell us. For this advanced level of understanding, we need the author’s words and our own minds, and for that a chart doesn’t help.

It’s not that Moretti is somehow number-drunk. Or that he thinks that charting plots is all that there is to good analysis. Far from it. He draws some well-conceived “traditional” analyses from his charts. My questioning of the methodology rests solely on the need for this kind of analysis on the one hand and on a perhaps inappropriate longing for empiricism on the other.

When Moretti’s data tells him that Hamlet is at an “average distance” of 1.45 from the other characters (that is, an average of 1.45 “degrees of separation” from them), and that Claudius’s number is 1.62, we “learn” that Hamlet is more central to the play’s action than Claudius is. But it certainly doesn’t take network theory for us to realize this. The insight is true, but it’s achievable in other ways.
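
For the curious, here is a minimal sketch of how such a number could be computed, assuming "average distance" means the mean shortest-path length from one character to every other character in a connected network. The five-vertex graph is a toy with placeholder labels; the 1.45 and 1.62 figures come from Moretti's full network, which is not reproduced here.

```python
from collections import deque

# Toy network with placeholder labels, invented for illustration.
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}

def distances_from(graph, start):
    """Breadth-first search giving the number of steps to each vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in dist:
                dist[neighbour] = dist[current] + 1
                queue.append(neighbour)
    return dist

def average_distance(graph, start):
    """Mean shortest-path length from start to every other reachable vertex."""
    dist = distances_from(graph, start)
    others = [d for name, d in dist.items() if name != start]
    return sum(others) / len(others)

central = average_distance(toy, "A")     # 1.25: three neighbours, one at two steps
peripheral = average_distance(toy, "B")  # 1.75: further from D and E
```

The lower the figure, the more central the character, so in this toy "A" plays the Hamlet role and "B" the Claudius role.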

And it’s possible to consider Rosencrantz and Guildenstern a “single” character without the mildly interesting graphical confirmation Moretti supplies that they never speak to each other, that in a sense “he” never speaks to “himself.”

For Moretti, however, the “new angles” that his methodology offers are exciting opportunities to see the play in new ways:

Once you make a network of a play, you stop working on the play proper, and work on a model instead: you reduce the text to characters and interactions, abstract them from everything else, and this process of reduction and abstraction makes the model obviously much less than the original object – just think of this: I am discussing Hamlet, and saying nothing about Shakespeare’s words – but also, in another sense, much more than it, because a model allows you to see the underlying structures of a complex object. It’s like an X-ray: suddenly, you see the region of death of Figure 5, which is otherwise hidden by the very richness of the play.

But is “the region of death” really “hidden”? As previously noted, no one with Moretti’s critical sensitivity is realizing for the first time that death stalks the characters in Hamlet. Do we require quantification to justify our insight in this case? Other than as a way to “prove” one’s insights to empirical doubters, I don’t see much need for it.

It’s not even that new. When I first started teaching Hamlet some forty years ago, we used a “Triangular Relationships” chart, the textbook source of which I no longer recall, to map the character dynamics that drive the play’s story:

This chart can be read as a series of eight character triangles, six of which include Hamlet himself. Much of the play’s action can be understood in terms of these triangles. For example, the Polonius – Hamlet – Claudius triangle represents among other things the struggle for truth in the court. Hamlet wants to know if the ghost has told the truth, and Claudius uses Polonius to find out how much of the truth Hamlet knows. When Hamlet kills Polonius, he makes Laertes into a mortal enemy (Polonius – Hamlet – Laertes), at the same time fatally altering his relationship with Ophelia (Polonius – Hamlet – Ophelia). Each of these story-driving relationships is represented by a triangle. At the bottom of the chart, Ophelia and Gertrude are the two women Hamlet loves, both of whom have betrayed him. And so it goes.

Useful, surely. But necessary? Not really.

If network theory is going to be of real utility in literary analysis, that utility isn’t going to come from graphing plots. More interesting, and more fruitful, will be those analyses which tell us, to take an off-the-top-of-my-head example, how much of Pride and Prejudice or Wuthering Heights is dialogue, as opposed to expository or descriptive prose. That kind of analysis may indeed confirm something about the implicit attitudes of the books’ authors toward (and perhaps their very different experiences with) conventional social interaction. And there are surely more revealing questions, waiting for someone to ask them, which could be approached quantitatively.
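
A crude version of the dialogue measurement is easy to sketch in Python. The big assumption is that dialogue is whatever sits between double quotation marks, straight or curly; real editions vary in their conventions, so this is only a rough proxy. The sample sentence is made up, not taken from either novel, and `dialogue_share` is a hypothetical helper name.

```python
import re

# Assumption: dialogue is the text between double quotation marks,
# straight or curly. Editions that use single quotes would need a
# different pattern.
QUOTED = re.compile(r'["“]([^"”]*)["”]')

def dialogue_share(text):
    """Fraction of words that fall inside quotation marks."""
    total_words = len(text.split())
    if total_words == 0:
        return 0.0
    quoted_words = sum(len(passage.split()) for passage in QUOTED.findall(text))
    return quoted_words / total_words

# Made-up sample, not a quotation from either novel.
sample = 'She looked up. "It is late," he said. "We should go."'
share = dialogue_share(sample)  # 6 quoted words out of 11
```

Run over the full text of two novels, the same function would give a pair of fractions to compare, though what the comparison means remains a matter for the critic.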

As for Hamlet, I’d forget the thirty-five diagrams — the play’s the thing.