Unreliable Biomedicine: A Book Review



“Even with perfectly honest and well-intended analytical intentions,” writes the distinguished Hungarian scientist Csaba Szabo, “different groups of scientists, analyzing the same sets of data, can come to completely different conclusions.”

In Unreliable: Bias, Fraud, and the Reproducibility Crisis in Biomedical Research, published next week by Columbia University Press, Szabo provides several reasons for such discrepancies and covers them lucidly as systemic problems.

Because small but meaningful levels of variance can jeopardize each stage of biomedical research, their combined effects raise pressing questions about how reliable the field’s foundational research really is. Are results reported in print statistically significant or nonsignificant? Have they been altered to appear in the first camp when—minus cherry-picking, p-hacking, and other kinds of “statistical shenaniganry”—they in fact belong in the second?
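To make the mechanics of p-hacking concrete, the sketch below (in Python, and not drawn from Szabo’s book) simulates labs that measure many outcomes on pure noise and then report only the most favorable p-value; the group sizes, number of outcomes, and 0.05 threshold are illustrative assumptions, not figures from the book.

```python
# A minimal sketch of how p-hacking manufactures "significance": test enough
# outcomes measured on pure noise and report only the smallest p-value, and
# "significant" findings appear by chance alone.
import random
import statistics
from math import erf, sqrt


def welch_t_pvalue(a, b):
    """Two-sided p-value for a two-sample t-test, using a normal
    approximation to the t distribution (adequate for this illustration)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se = sqrt(va / na + vb / nb)
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(t) / sqrt(2))))


random.seed(1)
n_experiments = 1000   # hypothetical labs, each studying a true null effect
n_outcomes = 20        # outcomes measured per experiment
alpha = 0.05

false_positives = 0
for _ in range(n_experiments):
    # Treatment and control groups are drawn from the SAME distribution,
    # so any "effect" detected below is a false positive by construction.
    p_values = []
    for _ in range(n_outcomes):
        control = [random.gauss(0, 1) for _ in range(20)]
        treated = [random.gauss(0, 1) for _ in range(20)]
        p_values.append(welch_t_pvalue(control, treated))
    # The p-hack: keep only the best-looking outcome and report that one.
    if min(p_values) < alpha:
        false_positives += 1

print(f"Experiments reporting a 'significant' effect: "
      f"{false_positives / n_experiments:.0%} (nominal rate: {alpha:.0%})")
```

With twenty outcomes per null experiment, roughly 1 − 0.95^20, or about 64 percent, of simulated labs clear the 0.05 bar by chance alone, which is the distortion that selective reporting hides.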

Among the world’s most-cited biomedical scientists, with “more than thirty years of direct experience of biomedical research,” Szabo blows the lid off his field. We move from the daily struggle to prevent errors in the lab to the inner workings of the National Institutes of Health, whose funds for cancer research are currently frozen amid the Trump administration’s sweeping freeze on biomedical research funding.

One reason for the difficulty is that unreliability can mar even good-faith analysis. Corrupted cell lines, minor differences in lab conditions, and even seasonal fluctuations in data recording can alter outcomes. In addition, data fabrication and manipulation are now so widespread that they are driving a reproducibility crisis, leading Szabo to conclude that “irreproducibility is not the exception but, shockingly, the rule in biomedical science.”

In a field overrun by predatory journals, scientific paper mills, and “scamferences,” bad-faith studies are difficult to excise, ignore, or fully retract. Owing to the “inherent unreliability of some of the methods and reagents used in research,” however, many such studies share the traits and limitations of even the best-designed research. Those can include (but are not limited to) confirmation bias; outcome switching; cherry-picking; p-hacking; “HARKing,” or “Hypothesizing After the Results are Known”; poor blinding and randomization; image doctoring; outlier exclusion; redaction bias; reductionism and generalization; data falsification and fraud; plagiarism; and the hyping of negligible-to-modest outcomes.

Consider Szabo’s example of good-faith differences in how data are interpreted. A 2013 study published in the prestigious journal PNAS “compared the changes in inflammatory gene expression in human and mouse experimental systems and came to this conclusion (which also happens to be the title of the paper): ‘Genomic responses in mouse models poorly mimic human inflammatory diseases.’”

“But here is the kicker,” Szabo continues: “a re-analysis of the same dataset by a different group of authors came to the opposite conclusion. Their conclusion (and once again the title of their paper published two years later in the same journal) was this: ‘Genomic responses in mouse models greatly mimic human inflammatory diseases.’”

What are researchers and an increasingly bewildered public to make of such variance? The problem becomes acute when funded biomedical research is “the only game in town,” yet at least half of it may not be replicable.

A reproducibility crisis of “mind-boggling” proportions

Szabo calls the magnitude of the reproducibility crisis in biomedical research “mind-boggling.” In 2011, he notes, “scientists at the German pharmaceutical giant Bayer made a splash with news that they could not replicate 75 to 80 percent of the 67 preclinical academic publications that they took on. In 2012, the hematology and oncology department of the American biotechnology firm Amgen selected fifty-three cancer research papers that were published in leading scientific journals and tried to reproduce them in their laboratory. In 89 percent of the cases, they were unable to do so.”

The problem isn’t limited to those faking or fudging results to make them appear statistically significant (p-hacking). “Most scientists can’t even reproduce their own data,” Szabo contends. “In 2016, the prestigious scientific journal Nature published the results of its anonymous survey. More than 1,500 scientists replied, and in the fields of biology and medicine 65 to 75 percent of respondents said they were unable to reproduce data published by others and . . . 50 to 60 percent said they had trouble reproducing their own previous research findings.”


Pressure to publish findings with altered results

The scale of data and statistical fraud unearthed in Unreliable is shocking.

When 434 cancer researchers at the MD Anderson Cancer Center were surveyed in 2013, almost 20 percent of them—one in five—conceded that they had felt “pressure to publish findings” about which they had doubts.

Among biomedical trainees at the University of California, San Diego, 5 percent admitted to “modif[ying] research results in the past,” yet fully 81 percent said they were “willing to select, omit or fabricate data to win a grant or publish a paper.”

A 2021 meta-survey in Science and Engineering Ethics found that 15 percent of researchers had witnessed others committing falsification, fabrication, or plagiarism, while 40 percent were aware of others engaging in questionable research practices.

In light of these findings, it’s less surprising that interviews with early-career recipients of prestigious National Science Foundation fellowships found that, for more than 60 percent of them, “the level of cheating made them rethink their career choice.”

A broken system with “perverse incentives”

Szabo takes readers from the scandal of Theranos, whose founders, executives, and employees “perpetrated fraud and deception on an astonishing scale,” resulting in prison sentences, to that of the Stanford-trained researcher at New York’s Memorial Sloan-Kettering Cancer Center who faked the results of “transplanted” skin from genetically unrelated organisms by coloring the skin with a felt-tip pen. From such cases he concludes that “it is plain to see that the current system is broken” and “full of self-perpetuating ‘perverse incentives’ that, left to their own devices, will not self-correct.”

Much of the problem, from Szabo’s perspective, comes down to biomedicine placing greater emphasis (and incentives) on innovation than on replication. As a result, “a lot of what is published turns out to not be reproducible,” leaving it largely useless to the public, while “the system [becomes] wasteful, redundant, and littered with ‘scientific garbage.’”

Szabo offers a way forward through increased oversight, better enforcement of good rules, faster retraction, stiffer penalties for fraud, greater use of data sleuths, and an underlying stress on scientific integrity. Until those conditions are met—and it’s not clear they can be without root-and-branch reform at odds with the field’s most fundamental incentives—it is difficult to dispute the conclusion of Richard Smith, a former editor of the British Medical Journal, who wrote in 2021: “We are realizing that the problem is huge, the system encourages fraud, and we have no adequate way to respond. It may be time to move from assuming that research has been honestly conducted and reported to assuming it to be untrustworthy until there is some evidence to the contrary.”

Essential reading for everyone working in—and contemplating entering—biomedical science, Unreliable is an engrossing, timely account of research fraud, the resultant reproducibility crisis, and the ongoing battle for scientific integrity.

