A few days ago, Joe Hilgard asked a group of psychologists on Facebook how we create a culture that allows “frank and honest cataloging and discussion of the relative incidence of p-hacking” (where p-hacking is the use of flexibility in collecting, analyzing, and reporting research in a way that increases the rate of positive results).
I’ve been thinking a lot about this over the past few days. First, I agree with Joe that part of doing science is building on the past, and uncertainty about the power and bias of our literature means that making use of our past is a struggle.
Those of us who believe that power is generally well under 50%, that publication bias is nearly 100%, and that flexible stopping rules, analyses, and reporting practices were used often enough to be of concern approach the published literature with skepticism about the size and reproducibility of reported effects.
This doesn’t mean that we assume every effect we encounter was found only through lots of trials of small studies that used flexible stopping, analysis, and reporting to achieve p < .05. But it does mean that we have some motivation to estimate the probabilities of these practices if we hope to make use of the literature we read.
It also means that, to a greater or lesser extent, some of us see sharing well-supported inferences about the power, extent of publication bias, and use of flexibility as a social good. Making informed inferences about these things can require a lot of work, and there’s no reason for every person researching in an area to duplicate that work. On the other hand, like every other part of science, these analyses benefit from replication, critique, and revision, so discussing them with others can make them better.
So that leaves me here: I am not confident that everything in our past is a reliable source of inference without investigation into its bias. Much as I love the idea of pressing the reboot button and starting over, I think that’s ultimately more wasteful than trying to make something of the past. I want to be able to do bias investigations, to share them with others, and to learn from the investigations others have done.
This is not about finding out who is good or bad or who is naughty or nice. This is about doing the best science I can do. And for that, I need to know how to interpret the past, which means I need a way to be able to talk about the strengths and weaknesses of the past with others.
Pretending that everything in the past is solid evidence is no longer an honest option for researchers who have accepted that small sample sizes, publication bias, and flexibility are threats to inference and parts of our research legacy. Yet saying, “Gosh, I don’t quite believe that this study/paper/literature provides compelling evidence,” feels risky. It might be seen as an attack on the researchers (including one’s own collaborators if the research is one’s own), might be deemed uncivil, or might invite a bunch of social media backlash that would be a serious hassle and/or bummer. So Joe’s question is really important: How do we create a culture that makes this not an attack, not uncivil, and not a total bummer?
What Can We Do?
I have a few ideas. I don’t think any of them are easy, but I suspect that, like many things, the costs of doing them are likely not as high as we imagine.
- Stop citing weak studies, or collections of weak studies, as evidence for effects
When you think the literature supporting an idea is too weak to draw a confident inference, stop citing the literature as if it strongly supports the idea. Instead of citing the evidence, cite the ideas or hypotheses. Or stop citing the classic study you no longer trust as good evidence and cite the best study. When reviewers suggest that you omitted a classic and important finding, politely push back, explaining why your alternative citation provides better evidence.
- Focus on the most defensible criticism
As Jeff Sherman pointed out, it can be harder to find evidence that research makes use of flexibility than evidence that it exhibits low power and publication bias, and an argument about flexibility feels more like a personal attack. It’s relatively easy to show that even post-hoc power (which is likely an overestimate) is low and yet every reported finding is positive (a rough sketch of that argument appears below). Like all evidence, this isn’t proof that power is low and that findings were suppressed, but it’s reason to be cautious. If you can make the point for caution with power and publication bias alone, maybe don’t bring up flexibility. So long as suggesting the use of flexibility feels like a personal attack, raising it without really compelling evidence may weaken, rather than strengthen, your case against the evidence.
That’s not to say we shouldn’t discuss research flexibility where there is good evidence for it, but I think Jeff Sherman makes another good point about such criticisms: “If I suggest that lack of double-blinding may be a problem for a study, I am specifying a particular issue. If I suggest p-hacking or researcher degrees of freedom, I am making a suggestion of unknown, unspecified monkeying around. There is a big difference.” So when suggesting that flexibility may undermine the inferences from a line of research, it’s important to be as specific about the type of flexibility and as concrete in the evidence as possible.
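To make that argument concrete, here is a minimal sketch in Python, using entirely made-up effect sizes and sample sizes, of the logic behind the power-and-publication-bias criticism: estimate the power implied by each study’s own reported effect and sample size, then compare the expected number of significant results to the number actually reported. This is only an illustration of the idea, not anyone’s actual analysis.

```python
# Sketch of an excess-significance argument with hypothetical studies: if power
# is low, a literature in which every result is significant is suspicious.
from statsmodels.stats.power import TTestIndPower

# Hypothetical reported two-sample studies: (Cohen's d, n per group)
studies = [(0.45, 20), (0.38, 25), (0.52, 18), (0.41, 22), (0.48, 24)]

power_calc = TTestIndPower()
powers = [power_calc.power(effect_size=d, nobs1=n, ratio=1.0, alpha=0.05)
          for d, n in studies]

expected_hits = sum(powers)      # expected number of p < .05 results
reported_hits = len(studies)     # here, every study was reported as significant

print(f"Post-hoc power per study: {[round(p, 2) for p in powers]}")
print(f"Expected significant results: {expected_hits:.1f} of {len(studies)}")
print(f"Reported significant results: {reported_hits} of {len(studies)}")
# When the reported hits far exceed the expectation, that is a reason for
# caution (tests of excess significance formalize this comparison).
```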
- Check yourself
Perhaps the safest place to start is with oneself. Michael Inzlicht and Michael Kraus have written about how some of their previous research shows signs of bias (and how they are changing things so that their future work shows less bias). They haven’t called out specific papers, but they’ve p-curved and TIVA’d and R-Indexed their prior papers and owned up to the fact that the work they’re doing now is better than the work they did in the past.
In admitting that their own research exhibits some forms of bias, they have opened the discussion and made it safer and easier for others to make similar admissions about themselves. Not that it was easy for them. Michael Inzlicht talks about fear, sadness, and pain in the process. But it is beautiful and brave that he not only performed the self-check anyway but went on to publish it publicly. And ultimately, he found the experience “humbling, yet gratifying.”
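For readers who haven’t used these tools, here is a minimal sketch of the simplest p-curve-style self-check, the binomial version, run on invented p-values. The actual p-curve, TIVA, and R-Index procedures are more involved; this is only meant to show what a self-check can look like.

```python
# Sketch of the binomial p-curve check: under a true null, significant
# p-values are uniform on (0, .05), so about half should fall below .025;
# a real effect pushes most of them below .025. These p-values are invented.
from scipy.stats import binomtest

reported_ps = [0.012, 0.034, 0.048, 0.021, 0.041, 0.046, 0.038]  # all < .05
low = sum(p < 0.025 for p in reported_ps)

result = binomtest(low, n=len(reported_ps), p=0.5, alternative="greater")
print(f"{low}/{len(reported_ps)} significant p-values below .025, "
      f"binomial p = {result.pvalue:.2f}")
# A flat or left-skewed curve (few very small p-values) is consistent with
# low evidential value; the full p-curve method uses more powerful tests.
```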
- Publish commentaries on or corrections of your previous work
I’m not going to pretend that this is at all easy or likely to be rewarded. It’s hard to remember exactly which studies were run in a given research line, and, unfortunately, records may not be good enough to reconstruct that history. So researchers may not know precisely the extent of publication bias in their own work. But still, for those cases where one knows that bias exists, it would benefit the entire community to admit it.
I can only think of one instance where someone has done this. Joe Hilgard wrote a blog post about a paper he had come to feel reported an unlikely finding based on (actually disclosed) flexible analyses and reporting. Vox wrote up a report complimenting Joe’s confession (and it really was brave and awesome!), but the coverage kind of gave the impression that Joe’s barely-cited paper was responsible for the collapse of the entire ego depletion literature: “All of this goes to show how individual instances of p-hacking can snowball into a pile of research that collapses when its foundations are tested.” Oops.
I doubt that that would happen to the next person who publishes a similar piece. But what will happen? One comment on Joe’s blog post asks whether he plans to retract the paper. I don’t think that’s the appropriate response to the bias in our literature, but others definitely do, so calls for retraction seem plausible. Another concern is reputation: Will you anger your friends and collaborators, or develop a reputation as someone who backstabs colleagues? If people see admitting to bias as a personal black mark, that’s possible.
One way around these drawbacks is to publish a correction of a solo-authored paper or a paper authored with like-minded others. I’m on board with Andrew Gelman’s “No Retractions, Only Corrections: A manifesto”:
Maybe there should be no such thing as retraction, or maybe we could ban the word “retraction” and simply offer “corrections.” That would be fine with me. The point is never to “expunge the record,” it’s about correcting the record so that later scholars don’t take a mistaken claim as being true, or proven.
But, to the extent there are retractions, or corrections, or whatever you want to call them: Sure, just do it. It’s not a penalty or a punishment. I published corrections for two of my papers because I found that they were in error. That’s what you do when you find a mistake.
I’d love to see this opinion spread through psychology. As people who study people, psychologists know that bias happens; it’s just part of being human. Correct the record and move on. Start with thinking about the bias in your solo-authored papers. Begin talking about the idea with colleagues you already talk to about bias; warm them up to the idea of correcting their own work or your joint work. Then start leaving comments on PubPeer or on your blog or on http://psychdisclosure.org/. Or maybe even submit them as brief corrections to journals. If you’re an editor at a journal who would consider these kinds of corrections, invite them.
This is really an extension of what Michael Inzlicht and Michael Kraus have already done: start at home. By admitting our bias, we can set the example that it’s OK to have bias called out. But it can go a bit further by actually adding to the literature. If you include new data (e.g., dropped studies, conditions, or variables) or new analyses (e.g., an alternative specification of a DV), you are not just admitting bias but also contributing valuable new information that might make your correction into a meaningful paper in its own right.
- Publish your file-drawered studies
Make some use out of all the data you’re sitting on that was never published. You can simply post the data in an archive and make it available to meta-analysts and other researchers. You can publish it yourself as a new paper or as part of a correction. If you can’t get null or inconclusive results through traditional peer review, try an alternative outlet like the Journal of Articles in Support of the Null Hypothesis or The Winnower. The Winnower has the benefit of giving your blog post a citable DOI and pushing it through to Google Scholar. If you want to use your file drawer to make a big impact, gather all of your studies on a single topic into a publication-bias-free meta-analysis and use that to create theoretical insights and make meaningful methodological recommendations.
- Publish meta-scientific reviews
We already accept bias investigations in meta-analyses. Funnel plots, Egger tests, and other bias detection techniques are standard parts of meta-analysis. We are adding more and more tests to this repertoire every year.
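As a small illustration of how routine these checks are, here is a minimal sketch of Egger’s regression test on invented study estimates. Dedicated meta-analysis packages implement this and related tests with more care; this only shows the underlying idea.

```python
# Sketch of Egger's regression test for funnel-plot asymmetry: regress each
# study's standardized effect (estimate / SE) on its precision (1 / SE).
# An intercept far from zero suggests small-study asymmetry, one common
# signature of publication bias. The numbers below are invented.
import numpy as np
import statsmodels.api as sm

effects = np.array([0.62, 0.48, 0.35, 0.30, 0.22, 0.18])   # study estimates
ses     = np.array([0.30, 0.24, 0.18, 0.15, 0.10, 0.08])   # standard errors

z = effects / ses          # standardized effects
precision = 1.0 / ses

model = sm.OLS(z, sm.add_constant(precision)).fit()
intercept, intercept_p = model.params[0], model.pvalues[0]
print(f"Egger intercept = {intercept:.2f}, p = {intercept_p:.3f}")
```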
Malte Elson brought up the idea that synthesizing whole research areas might be a more acceptable way to bring up criticisms about research flexibility, and he’s done some fantastic and detailed work cataloging flexibility in operationalizations of the CRTT. This work is specific (applies to a specific domain, a specific measure, and specific papers) but also diffuses agency across many authors. No one person is responsible for all of the flexibility, and actually attempting to figure out who has used more or less flexibility is fairly involved and just about the least interesting thing one can do with the published tools. Rather than providing, say, field-wide estimates of power, publication bias, or research flexibility, these domain-specific investigations provide the type of information needed by researchers to evaluate the papers they are using in their own work.
- Publicly praise and reward people who do these things
Cite corrections. Tweet and post on Facebook about how awesome people who admit bias are. Offer them jobs and promotions. If people are going to risk their reputations and relationships in trying to help others navigate the past, do everything you can to make it worth their while.
Final Thoughts
Let me be clear: doing any and all of these things is awesome, but it’s also only a beginning. Joe’s question is really about how to create a culture in which it is OK to point out specific instances of research flexibility in others’ work without ruining either one’s own reputation or the author’s. I think that admitting our own bias and examining field-wide bias will help normalize bias discussions, but they probably won’t take us far enough.
I don’t expect everyone to make a complete catalog of their unpublished work or reveal their original planned analyses for every study they’ve ever published. Most people don’t have the time or records to do that. But we still need to be able to talk about the potential bias in their work if we want to build on it. So we have to look, we have to talk about what we find, and it has to be OK to do that.
Some people are already doing these investigations, but my general impression is that they are not received well. I hope that talking more about bias in ourselves and in general will bring us closer to the goal of discussing specific cases of bias, but I wonder whether there is more we can do to get us there faster.