In the sprawling project, scientists in four labs designed and tested experiments and then tried to replicate one another's work. The intention, according to the study, was to test methods aimed at ensuring the integrity of published research. But the group neglected to fully document key aspects of the project before running the experiments, one of the very practices the study set out to test, leading to the retraction.
The authors—who include two of the most prominent voices advocating for research reforms—dispute some of the criticisms and said any errors they made were inadvertent.
“It wasn’t because we were trying to fool someone, but it is because we were incompetent,” said Leif Nelson, one of the authors, a marketing professor at the Haas School of Business at the University of California, Berkeley.
Nelson helps write the Data Colada blog, a website known for discussing research methods and debunking studies built on faulty or fraudulent data. Recently, the blog gained attention for its blistering critique of a star Harvard Business School professor’s work, alleging that her research contained falsified data.
After its own investigation, Harvard placed the professor, Francesca Gino, on unpaid administrative leave. Gino, who has denied wrongdoing, is suing the university. Her defamation claims against the bloggers were dismissed.
Brian Nosek, another of the paper's authors, is executive director of the Center for Open Science, a nonprofit based in Charlottesville, Va., which advocates for transparency in research. Nosek has run projects showing that the results of many scientific studies can't be reproduced when other researchers try, a problem that has come to be known as the replication crisis. He and others have argued that this is a symptom of inadequate research standards that undermine the quality of behavioral science.
The retracted study, written with 15 other authors, involved 16 discoveries and 80 experiments across four labs. The paper was published in the journal Nature Human Behaviour last November and retracted last month. The retraction was earlier reported by the Chronicle of Higher Education.
The study claimed to use best practices, such as setting sufficient sample sizes and preregistering research, in which researchers spell out in advance what they will measure and how they will analyze the results, making it difficult to later alter hypotheses to fit the findings.
The researchers found that 86% of the replication attempts in their study confirmed the expected effects, a rate the authors said demonstrated the value of the standards.
“This is a paper about preregistration by proponents of preregistration, it tries to make a conclusion about preregistration, and the authors actually failed to preregister their core analyses,” said Berna Devezer, a researcher who studies scientific methods at the University of Idaho and whose critique was cited in the journal’s retraction notice.
“Even if everything had been registered it would not have mattered because there are fundamental design flaws,” said Joe Bak-Coleman, a computational social scientist who co-wrote the critique.
The journal conducted an investigation and found that the concerns the duo raised were valid, according to the retraction notice.
In addition, the study didn't describe how the teams chose or ran the pilot studies that preceded the replication studies, a key flaw, according to Jessica Hullman, a computer scientist who studies decision-making and uncertainty at Northwestern University and who was involved in the journal's investigation.
“There’s just a whole bunch of mystery around these pilot studies,” she said.
In a statement responding to the retraction, six of the study authors, including Nosek, acknowledged “elementary errors” in preregistration and said that they would review how the teams ran their pilot studies.
The project began more than a decade ago, in 2012, at a time of introspection among social and behavioral scientists who were concerned that lax research standards could accommodate untrue findings.
To demonstrate the limits of accepted techniques, Nelson co-wrote a highly cited paper that demonstrated, absurdly, that people who listened to the Beatles song “When I’m Sixty-Four” grew younger. Nosek, meanwhile, was in the midst of an influential project that would estimate that less than 50% of studies in psychology could be replicated.
Against this backdrop, they joined a project intended to test the reforms they advocated.
It appeared to Devezer and Bak-Coleman that the paper’s authors had succumbed to some of the bad habits they were trying to expel from the field: using convenient analyses to boost their results or changing course midway when their results didn’t suit their goals.
Nosek stands by the study's experimental design. "I think that the basic conclusion is robust," he said.