Lies, damn lies and statistics: A Stanford wiz says P<0.05 offers deceptive evidence of biopharmas' drug claims
The biopharma R&D world revolves around one simple formula: A P value of less than 0.05 in a pivotal study. But a top professor of medicine and statistics at Stanford says it’s a poor measure of value, and he wants to scrap it for something far more demanding — and far more valuable.
Writing in the Journal of the American Medical Association, John P. A. Ioannidis notes that the P value cutoff “is wrongly equated with a finding or an outcome…being true, valid, and worth acting on. These misconceptions affect researchers, journals, readers, and users of research articles, and even media and the public who consume scientific information.”
And they’re often simply wrong.
“Most claims supported with P values slightly below .05 are probably false (ie, the claimed associations and treatment effects do not exist). Even among those claims that are true, few are worth acting on in medicine and health care.”
There are just too many ways to game the clinical trial system, Ioannidis adds. By focusing on smaller benefits and risks, he writes, you boost the risk that biases will have an effect.
“Moving the P value threshold from .05 to .005 will shift about one-third of the statistically significant results of past biomedical literature to the category of just ‘suggestive.’ This shift is essential for those who believe (perhaps crudely) in black and white, significant or nonsignificant categorizations.”
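To get a feel for what that threshold shift buys, here is a back-of-the-envelope sketch in Python. It is not from Ioannidis' paper. It applies the standard positive-predictive-value arithmetic, and the inputs are assumptions chosen purely for illustration: that 10% of tested hypotheses are actually true (PRIOR) and that studies have a 40% chance of detecting a real effect (POWER). It also assumes power stays the same under the stricter cutoff, which in practice would require larger trials.

```python
def positive_predictive_value(prior, power, alpha):
    """Share of statistically significant findings that reflect a real effect.

    prior: assumed fraction of tested hypotheses that are true
    power: assumed probability that a study detects a real effect
    alpha: significance threshold (the false-positive rate under the null)
    """
    true_positives = prior * power
    false_positives = (1.0 - prior) * alpha
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs for illustration only; not taken from the JAMA article.
PRIOR = 0.10
POWER = 0.40

for alpha in (0.05, 0.005):
    ppv = positive_predictive_value(PRIOR, POWER, alpha)
    print(f"threshold {alpha}: {ppv:.0%} of significant results are real")
```

With these made-up inputs, fewer than half of the results clearing 0.05 reflect a real effect, versus roughly nine in ten at 0.005. Different assumptions about priors and power will move those figures considerably.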
Ioannidis, though, is quick to assert that there are no easy solutions to the P value conundrum. Doing away with the old standard has advantages, and some big disadvantages, that can’t be ignored.
“Adopting lower P value thresholds may help promote a reformed research agenda with fewer, larger, and more carefully conceived and designed studies with sufficient power to pass these more demanding thresholds. However, collateral harms may also emerge. Bias may escalate rather than decrease if researchers and other interested parties (eg, for-profit sponsors) try to find ways to make the results have lower P values. Selected study endpoints may become even less clinically relevant because it is easier to reach lower P values with weak surrogate end points than with hard clinical outcomes. Moreover, results that pass a lower P value threshold may be limited by greater regression to the mean and new discoveries may have even more exaggerated effect sizes than before.”
My bet is that the industry has become so focused on beating 0.05 that no one will want to drop it for an untested approach that could throw the whole $160 billion drug development business into a tizzy. There are no simple boundary lines between good and bad. But it’s definitely worth keeping in mind the next time you see a biopharma company celebrating a P value in the 0.04 range of things.
Image: John P. A. Ioannidis. Erasmus MC via YOUTUBE