Idea 1.2: Research (You might want to read 1.1 from last year.)

This post was inspired by a couple of recent podcasts on the inability of a group of researchers to replicate a large percentage of research findings in psychology. We live in a science-based society, but many of us know little about how science works, how studies are funded, designed, and reported, and the consequences of these realities. If we are aware of the pressures on scientists to obtain money for research, to show positive results, or to report eye-catching results, we may know enough to be suspicious of the articles we see highlighting the results of the latest study.

The pressure to publish can drive scientists to questionable research practices that all too often give rise to the charge that “statistics can be used to prove anything.” And under that same pressure, deliberately confusing or misleading reporting of data can deceive even those well versed in the theory of probability.

For years I have been skeptical of the research that underlies education reform. There are many reasons for this: the difficulty of reproducing research in behavioral psychology, one of the major fields devoted to education research; the lack of improvement in education outcomes as a result of that research; and the spin, lies, and deception that are constitutionally protected forms of free speech. Perhaps the most important reason for my skepticism is that we can only infer the intentions of those whose behavior we quantify.

One basic problem with the methodology of psychology is that, because the mind is largely a “black box” to psychologists, research must observe behaviors and attempt to discern the motivations behind them. But many of us are mostly unaware of the motives underlying our own behavior, so how can we expect to be sure about the motivations behind the behavior of educators, students, parents, and community members?

In a recent episode of “On the Media,” Ulrich Schimmack, Professor of Psychology at the University of Toronto, described a method he uses to check the reproducibility of psychological research. (Note: I have the paper as a PDF, so I can’t link to it. This is a link to Schimmack’s blog about it; find “Quantifying Statistical Research Integrity: The Replicability Index,” November 2014, from that link.)
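As I understand it, Schimmack’s method boils down to comparing how often a body of studies reports significant results with the statistical power those same results imply: if nearly everything is “significant” but the implied power is low, the literature is unlikely to replicate. Here is a minimal sketch of that idea; the two-sided z-tests, the .05 alpha level, and the use of the median are my assumptions, not necessarily the paper’s exact procedure:

```python
from math import erf, sqrt
from statistics import median

ALPHA_Z = 1.96  # critical z for a two-sided test at alpha = .05 (assumed)

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def observed_power(z):
    # chance that an exact replication (true effect estimated by z)
    # would again reach significance; ignores the tiny opposite-tail term
    return 1.0 - norm_cdf(ALPHA_Z - abs(z))

def r_index(z_scores):
    # replicability estimate: median observed power, penalized by the
    # gap between the reported success rate and that power
    powers = [observed_power(z) for z in z_scores]
    med_power = median(powers)
    success_rate = sum(abs(z) > ALPHA_Z for z in z_scores) / len(z_scores)
    inflation = success_rate - med_power  # "too many" significant results
    return med_power - inflation

# A literature full of barely-significant results earns a low index,
# even though every single study "worked":
barely = r_index([2.0, 2.1, 2.2, 2.0, 2.1])
strong = r_index([4.0, 4.5, 5.0])
```

The intuition: a z-score just past the significance threshold implies only about a 50% chance of replicating, so a field where every published result barely clears the bar is reporting far more successes than its own power can explain.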

Considering the quantity of education research that is funded by groups with a material interest in the results, it is vital that we analyze education research through the lens of replicability. It is also a good idea to rely on the evidence of professional educators when that evidence conflicts with the results of studies, especially studies that haven’t been replicated. Then there is the problem that we cannot know what would have happened if we hadn’t done what we are doing. Education is so important that when we believe an approach is helpful, we wouldn’t think of denying some children the intervention in order to run a true randomized study.

A recent episode of Freakonomics Radio, “When Helping Hurts,” featured a rare analysis of a mentoring program; the discussion was about gathering objective evidence on whether or not mentoring helps. The original study was done by Richard Clarke Cabot, a physician and philanthropist whose life straddled the 19th and 20th centuries. He commissioned the Cambridge Somerville Youth Study, initiated in the 1930s, a study whose follow-ups have lasted so long that it is still going on today. Cabot was motivated by the high recidivism rate of youth in the juvenile justice system and wanted evidence of the efficacy of mentoring.

He identified kids who seemed to be having difficulties and matched them into pairs, then randomly assigned one member of each pair to the intervention and the other to a non-treatment control group. The original study lasted from 1939 to 1945; randomized trials on this subject are almost nonexistent. The study was reexamined by Joan McCord, who published her findings in 1981.

To quote from the link:

“The program had no impacts on juvenile arrest rates measured by official or unofficial records. The program also had no impacts on adult arrest rates. There were no differences between the two groups in the number of serious crimes committed, age at which a first crime was committed, age when first committing a serious crime, or age after which no serious crime was committed. A larger proportion of criminals from the treatment group went on to commit additional crimes than their counterparts in the control group.”

In short, the mentoring intervention either had no effect or was responsible for a greater number of crimes committed by the group that had received years of intervention. Because of this study, we can actually ask why such a counterintuitive result was obtained, and perhaps avoid interventions that have no impact or cause negative outcomes. But who would have believed that mentoring a young person would either have no effect or cause harm?
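The design that made McCord’s reexamination possible, matched pairs with random assignment within each pair, can be sketched in a few lines. The names here are hypothetical; this is a toy illustration of the principle, not the study’s actual procedure:

```python
import random

def assign_pairs(pairs, rng=None):
    """Randomly send one member of each matched pair to treatment,
    the other to the no-treatment control group."""
    rng = rng or random.Random()
    treatment, control = [], []
    for a, b in pairs:
        if rng.random() < 0.5:  # a coin flip decides which member is treated
            a, b = b, a
        treatment.append(a)
        control.append(b)
    return treatment, control

# hypothetical matched pairs of similar boys
pairs = [("boy1", "boy2"), ("boy3", "boy4"), ("boy5", "boy6")]
treated, controls = assign_pairs(pairs, rng=random.Random(0))
```

Because each pair is matched on similarity before the coin flip, differences between the groups can be attributed to the intervention rather than to pre-existing differences, which is what makes the null and negative findings so hard to dismiss.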

Research in pedagogical strategies is most hampered by the fact that teachers and students are individuals who may be naturally more or less suited to each other: teachers may be more or less culturally competent, and students may be better or worse prepared to learn, depending on health, nutrition, family cohesion, and a variety of other factors out of teachers’ control. It is also not possible to do fully randomized trials of teaching and learning, leaving us with theoretical rather than empirical studies. In a future blog, I will examine public education in Finland. Because Finland chooses the most qualified candidates, gives them the best training, and lets them do research in their classrooms, Finnish education is about the best that the institution can do. If it doesn’t work in Finland, the problem is likely the system, not its implementation.