New look at baby study shows weaknesses of behaviour interpretation

I’ve argued here more than once that, when it comes to psychology, measurement trumps interpretation. That’s one big reason that I am less critical of brain scans than some others are. To the extent that you have to interpret a game or speculate about a gesture, you’re on potentially shaky ground.

A newly published study illustrates some of the problems that can plague research that appears to be empirical, but really isn’t.

The study, “Social Evaluation or Simple Association? Simple Associations May Explain Moral Reasoning in Infants,” published by PLOS ONE on August 8th, re-evaluates a landmark experiment that used a toy scenario to conclude that infants have an innate preference for “moral” helpers.

The original study relied on “substitute” behaviours, for the simple reason that infants are unable to communicate their “feelings” directly. This need to find equivalencies makes experimental design the crucial factor in ensuring that the activities employed are, in fact, measurements of the target state of mind.

Doubts in this area have been raised by critics of a recent study of “empathy” in lab rats. In that widely publicized test, rats were observed to work to free cagemates from unpleasant confinement. The researchers interpreted this helping behaviour as evidence of empathy, but critics point out a more parsimonious interpretation: the helper rats may have been motivated not by empathy but by a selfish desire to stop their cagemates’ disturbing distress vocalizations.

The point is not that the test’s conclusions were correct or incorrect but rather that there is no methodological way to be absolutely sure that what was being observed corresponded to the target effect.

The new study raises the same concerns about a hugely influential study by Hamlin et al., which appeared to show that human infants are born with a preference for individuals who assist others rather than hinder them. If this conclusion is true, then morality may be innate and universal, not cultural, and many groups with a religious stake in the nature and origins of morality have jumped on the results as proof of God’s Eternal Law.

In the original Hamlin study, infants watched a scenario in which a toy tried to climb a hill and a second toy appeared, either bumping the first toy from behind to help it to the top or bumping it from above to send it back to the bottom. When the young subjects were then offered a choice between the secondary toys, they showed a clear preference for the “helping” toy, choosing it not only over the “hindering” toy but also over a “neutral” toy (which was itself preferred over the “hindering” toy).

What to make of this? The original conclusion was that the infants’ clear preference for “helpers” over “hinderers” indicated an inborn moral sense.

But was the children’s behaviour really a response to a moral situation, or is there a very different, and much simpler, explanation?

The authors of the new study examined the supplementary videos of the original test and noted that, when the climbing toy reached the top, it bounced up and down (presumably for the joy of accomplishment?). When the climbing toy was sent to the bottom, there was no “dance.”

So the researchers tried the test with the bounce at the bottom, not at the top, and the results reversed. Infants now preferred the “hinderer” toy to the “helper” toy. Why? Perhaps, the study suggests, it’s nothing more than that young infants like energetic movement, and they form a positive association with whichever secondary toy generates that movement. Where did the moral sense go? The obvious answer is that it didn’t go anywhere, because it wasn’t there in the first place.

Crucially, the new interpretation doesn’t settle whether or not infants possess an innate moral sense. What it does demonstrate is how hard it can be to construct behavioural studies that truly measure the trait you want to test.

So we come back to the assertion at the beginning, the idea that unless you’re measuring something concrete, and measuring it directly, you’re not really being empirical, no matter how many statistical processes you apply to your “data.”

Before anyone objects, this doesn’t mean that all empirical measurements are meaningful. There is a parallel problem when you’re closely measuring something that you then associate with something else that you’re not directly measuring. If I measure the activation of a certain brain area during a particular kind of mental task or while a certain physical action is underway, I do have a specific number. What I don’t necessarily have, however, is any assurance that my measurement over there is causally connected to the thinking over here. It may be, but my assumptions in the case of a brain scan have the same methodological limitations as my assumptions in the case of a behavioural study.

Are we just stuck, then, unable to measure anything but the average depth of a stream or the duration of a rain shower? No, we’re not, but if we aren’t always aware that our approximations may be only approximations, we may jump to anticipated or hoped-for conclusions without full justification.
