Warning: it’s boring, tedious, hard work. There’s nothing flashy about it.
First step: define a clear set of tested standards. For clinical trials, there’s something called the Consolidated Standards of Reporting Trials (CONSORT), established by an international team of statisticians, clinicians, etc., which defines how you should report the results of trials. For example, you are supposed to pre-specify your expected outcomes: “I am testing whether an infusion of mashed spiders will cure all cancers”. When your results are in, you should clearly state how they address your hypothesis: “Spider mash failed to have any effect at all on the progression of cancer.” You are also expected to fully report all of your results, including secondary outcomes: “88% of subjects abandoned the trial as soon as they found out what it involved, and 12% vomited up the spider milkshake.” And you don’t get to reframe your hypothesis to put a positive spin on your results: “We have discovered that mashed-up spiders are an excellent purgative.”
It’s all very sensible stuff. If everyone did this, it would reduce the frequency of p-hacking and improve the statistical validity of trial results. The catch is that following the standard makes it much harder to massage your data into a publishable result, because journals tend not to favor papers that say, “This protocol doesn’t work”.
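To make the temptation concrete, here’s a quick back-of-the-envelope simulation. It’s my own illustrative sketch, not anything from Goldacre’s work, and the sample sizes and outcome counts are made up: hand a treatment that does nothing to two identical groups, measure several outcomes, and report whichever one happens to clear p < 0.05.

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)

def false_positive_rate(n_outcomes, n_sims=5000, n_per_arm=50, alpha=0.05):
    # Fraction of simulated null trials where at least one outcome comes up p < alpha.
    hits = 0
    for _ in range(n_sims):
        # Both arms are drawn from the same distribution: the treatment truly does nothing.
        treated = rng.normal(size=(n_outcomes, n_per_arm))
        control = rng.normal(size=(n_outcomes, n_per_arm))
        pvals = [ttest_ind(t, c).pvalue for t, c in zip(treated, control)]
        if min(pvals) < alpha:
            hits += 1
    return hits / n_sims

for k in (1, 2, 5, 10):
    print(f"{k:>2} outcome(s) measured: ~{false_positive_rate(k):.0%} of null trials show a 'significant' result")

With one pre-specified outcome you get the advertised 5% false-positive rate; with ten undeclared outcomes to rummage through, roughly 40% of do-nothing treatments will hand you something “significant” to publish. Pre-specifying your outcomes, and reporting all of them, is exactly what takes that trick off the table.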
So Ben Goldacre and others dug into this to see how well journals that had publicly endorsed the CONSORT standards were actually enforcing them. Read the methods and you’ll see this was a thankless, dreary task in which a team met to go over published papers with a fine-toothed comb, comparing pre-specified expectations with published results, re-analyzing data, going over a checklist for every paper, and composing a summary of violations of the standard. They then sent correction letters to the journals that had published papers falling short of the CONSORT standard, and measured their response.
I have to mention this here because this is the kind of hard, dirty work that needs to be done to maintain rigor in an important field (these are often tests of medicines you may rely on to save your life), and it isn’t the kind of splashy stuff that will get you noticed in Quillette or Slate. It should be noticed, because the results were disappointing.
Results
Sixty-seven trials were assessed in total. Outcome reporting was poor overall and there was wide variation between journals on pre-specified primary outcomes (mean 76% correctly reported, journal range 25–96%), secondary outcomes (mean 55%, range 31–72%), and number of undeclared additional outcomes per trial (mean 5.4, range 2.9–8.3). Fifty-eight trials had discrepancies requiring a correction letter (87%, journal range 67–100%). Twenty-three letters were published (40%) with extensive variation between journals (range 0–100%). Where letters were published, there were delays (median 99 days, range 0–257 days). Twenty-nine studies had a pre-trial protocol publicly available (43%, range 0–86%). Qualitative analysis demonstrated extensive misunderstandings among journal editors about correct outcome reporting and CONSORT. Some journals did not engage positively when provided correspondence that identified misreporting; we identified possible breaches of ethics and publishing guidelines.
Conclusions
All five journals were listed as endorsing CONSORT, but all exhibited extensive breaches of this guidance, and most rejected correction letters documenting shortcomings. Readers are likely to be misled by this discrepancy. We discuss the advantages of prospective methodology research sharing all data openly and pro-actively in real time as feedback on critiqued studies. This is the first empirical study of major academic journals’ willingness to publish a cohort of comparable and objective correction letters on misreported high-impact studies. Suggested improvements include changes to correspondence processes at journals, alternatives for indexed post-publication peer review, changes to CONSORT’s mechanisms for enforcement, and novel strategies for research on methods and reporting.
People. You’ve got a clear set of standards for proper statistical analysis and reporting. You’ve got a million dollars from the NIH for a trial. You should at least sit down, study the appropriate methodology for analyzing and reporting your results, and make sure you follow it. This sounds like an important ethical obligation to me.