
Workers’ Compensation claim administration centers on decisions based upon scientific evidence. Injury AOE-COE opinions are supported by test findings that are in turn purportedly grounded in the rigors of medical science. Treatment requests are filtered through the UR and IMR process, which likewise rests on evidence-based medicine. All of this assumes that physicians rely on rock-solid, tried-and-true science.

Regrettably, that assumption may not always hold.

Since its beginning more than 20 years ago, functional magnetic resonance imaging (fMRI) has become a popular tool for understanding the human brain, with some 40,000 published papers according to PubMed. Despite the popularity of fMRI as a tool for studying brain function, the statistical methods used have rarely been validated using real data. Validations have instead mainly been performed using simulated data, but it is obviously very hard to simulate the complex spatiotemporal noise that arises from a living human subject in an MR scanner.

To validate the statistical methods in common use, researchers in a new study published in the Proceedings of the National Academy of Sciences (PNAS) used real resting-state data and a total of 3 million random task group analyses to compute empirical familywise error (FWE) rates for the fMRI software packages SPM, FSL, and AFNI, as well as for a nonparametric permutation method.
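For readers who want a concrete sense of what “computing an empirical familywise error rate” involves, here is a minimal Python sketch of the general idea. It is not the study’s actual pipeline (which used real resting-state scans and cluster-based inference); all names and numbers below are hypothetical, and a simple Bonferroni correction stands in for the parametric cluster methods under test. The logic is the point: analyze data with no true effect many times, and count how often a “significant” result appears anywhere.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_subjects = 40      # hypothetical subject pool
n_voxels = 5000      # hypothetical (flattened) image size
n_analyses = 1000    # random group analyses (the study ran about 3 million)
alpha = 0.05         # nominal familywise error level

# Null data: no true task effect anywhere, so any "significant" group
# difference is by construction a false positive.
data = rng.standard_normal((n_subjects, n_voxels))

family_errors = 0
for _ in range(n_analyses):
    # Randomly split subjects into two sham "task" groups of 20.
    perm = rng.permutation(n_subjects)
    g1, g2 = data[perm[:20]], data[perm[20:]]
    t, p = stats.ttest_ind(g1, g2, axis=0)
    # Bonferroni correction stands in for the cluster methods under test;
    # one or more significant voxels anywhere counts as a familywise error.
    if (p < alpha / n_voxels).any():
        family_errors += 1

# A valid correction keeps this near the nominal level; the study found
# the parametric cluster methods could run far above it.
print(f"empirical FWE: {family_errors / n_analyses:.3f} (nominal {alpha})")
```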

At the end of the study, the researchers found that the parametric statistical methods used for group fMRI analysis in the commonly used software packages SPM, FSL, and AFNI can produce FWE-corrected cluster P values that are spuriously low, inflating statistical significance.

So what does this mean in lay terms?

Researchers conclude that this “calls into question the validity of countless published fMRI studies based on parametric clusterwise inference. It is important to stress that we have focused on inferences corrected for multiple comparisons in each group analysis, yet some 40% of a sample of 241 recent fMRI papers did not report correcting for multiple comparisons, meaning that many group results in the fMRI literature suffer even worse false-positive rates than found here.”
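The multiple-comparisons problem is easy to see with back-of-the-envelope arithmetic. Voxels in a brain image are not independent tests, so the idealized numbers below overstate the effect somewhat, but they convey the scale: run enough uncorrected tests and a false positive somewhere becomes nearly certain.

```python
# With m independent tests each run at level alpha, the chance of at
# least one false positive anywhere is 1 - (1 - alpha)**m.
alpha = 0.05
for m in (1, 10, 100, 1000):
    print(f"{m:>5} tests: P(at least one false positive) = {1 - (1 - alpha) ** m:.3f}")
# Roughly: 0.050, 0.401, 0.994, and essentially 1.000.
```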

With regard to the software, researchers concluded that “a 15-year-old bug was found in 3dClustSim while testing the three software packages (the bug was fixed by the AFNI group as of May 2015, during preparation of this manuscript). The bug essentially reduced the size of the image searched for clusters, underestimating the severity of the multiplicity correction and overestimating significance (i.e., 3dClustSim FWE P values were too low).”
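To see why shrinking the searched volume matters, consider how cluster-extent thresholds are commonly calibrated: simulate many noise-only images, record the largest cluster that appears by chance in each, and take a high percentile of those maxima as the critical cluster size. The toy Monte Carlo below (plain smoothed Gaussian noise with made-up dimensions, not AFNI’s actual 3dClustSim algorithm) shows the direction of the effect the researchers describe: searching a smaller volume yields a smaller, more lenient cluster threshold.

```python
import numpy as np
from scipy import ndimage, stats

rng = np.random.default_rng(0)
z_thresh = stats.norm.ppf(0.99)  # voxelwise cluster-forming threshold (p < 0.01)

def max_cluster_sizes(vol_shape, n_sims=200):
    """Largest suprathreshold cluster in each simulated smooth-noise image."""
    sizes = []
    for _ in range(n_sims):
        img = ndimage.gaussian_filter(rng.standard_normal(vol_shape), sigma=2.0)
        img = (img - img.mean()) / img.std()      # re-standardize after smoothing
        labels, n = ndimage.label(img > z_thresh)
        sizes.append(np.bincount(labels.ravel())[1:].max() if n else 0)
    return np.array(sizes)

# Critical cluster size = 95th percentile of the null max-cluster distribution.
full = np.percentile(max_cluster_sizes((40, 40, 40)), 95)
small = np.percentile(max_cluster_sizes((30, 30, 30)), 95)  # truncated search volume
print(f"cluster threshold from full volume:    {full:.0f} voxels")
print(f"cluster threshold from reduced volume: {small:.0f} voxels (too lenient)")
```

A threshold calibrated on the reduced volume lets smaller chance clusters pass as “significant” in the full image, which is why the bug’s FWE P values came out too low.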

And what about the 40,000 published papers based upon the old software with the “15-year-old bug”?

Sadly, the authors say that it “is not feasible to redo 40,000 fMRI studies, and lamentable archiving and data-sharing practices mean most could not be reanalyzed either.” So the conclusions of 40,000 published studies that rest on fMRI findings are now open to question. That is a considerable amount of “evidence-based medicine” that may not be such good evidence after all.