Reporting diagnostic accuracy studies: Evaluating 10 years of STARD
31/03/2015
Daniël A. Korevaar & Jérémie F. Cohen, Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, The Netherlands
Over 250 reporting guidelines are currently available in the EQUATOR library. One of the most established of these is the STARD (Standards for the Reporting of Diagnostic Accuracy Studies) statement (1, 2).
STARD, first launched in 2003, was developed by an international group of methodologists, statisticians, reviewers and editors. It aims to improve the reporting of studies that evaluate the diagnostic accuracy of medical tests. Incomplete reporting is problematic as it impedes the identification and reproducibility of the study, as well as a proper appraisal of the internal and external validity. The statement contains a checklist of 25 items that should be reported to make the study report fully informative.
The statement was initially published simultaneously in eight major medical journals. Together, these publications have now been cited more than 2,000 times (Web of Knowledge, March 2015). Over the years, multiple other journals have published the STARD statement or the accompanying “Explanation & Elaboration” document, translations have become available in seven languages, and at least 30 journals have published one or more editorials about STARD, usually to highlight the importance of its use. More than 200 journals explicitly endorse STARD, which means that they require or recommend the use of the checklist in their instructions to authors.
Based on these data, the impact of STARD seems impressive. However, the success of a reporting guideline can only be judged by whether it has achieved its main goal: improving the quality of reporting. Based on two of our recent evaluations, we can conclude that this goal has been achieved, but only to a moderate extent.
We performed a systematic review in which we included all evaluations that had assessed adherence of published diagnostic accuracy studies to the STARD checklist (3). We identified 16 such evaluations, together analyzing the reporting of 1,496 diagnostic accuracy study reports in various fields of research. Across these evaluations, the mean number of items reported varied from 9.1 to 14.3 out of 25 STARD items. Not surprisingly, all the included evaluations concluded that reporting was generally poor, medium, suboptimal, or in need of improvement. Six of these evaluations quantitatively compared the reporting of diagnostic accuracy studies that were published post-STARD with those that were published pre-STARD. When we combined them in a meta-analysis, we found a modest but significant increase of 1.4 reported items after STARD’s launch in 2003. However, because most of these evaluations assessed diagnostic accuracy studies published in the first few years after STARD’s launch, it may have been too early to expect large improvements.
We also performed an analysis of more recent studies (4). We assessed the reporting of 112 diagnostic accuracy studies published in twelve high-impact journals in 2012. Expectations were high, as all but one of these journals were STARD endorsers. Unfortunately, on average, the studies reported only 15.3 out of 25 STARD items. Yet this is a significant improvement compared to studies published in the same journals in 2000 and 2004, when the mean number of items reported was 11.9 and 13.6, respectively (5).
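To put the three means quoted above on a common scale, they can be expressed as a share of the 25 checklist items. The following is only a minimal arithmetic sketch in Python; the variable and function names are illustrative assumptions, and the only data used are the means cited in this paragraph (11.9, 13.6 and 15.3 items).

```python
# Mean number of STARD items reported per study, as quoted in the text.
# Names below are illustrative; they do not come from the cited studies.
TOTAL_ITEMS = 25
mean_items_reported = {2000: 11.9, 2004: 13.6, 2012: 15.3}


def share_reported(mean_items, total=TOTAL_ITEMS):
    """Return the mean number of reported items as a fraction of the checklist."""
    return mean_items / total


# Percentage of the checklist reported on average, per publication year.
shares = {
    year: round(100 * share_reported(mean), 1)
    for year, mean in mean_items_reported.items()
}

# Absolute gain in mean items reported between 2000 and 2012.
gain_2000_to_2012 = round(mean_items_reported[2012] - mean_items_reported[2000], 1)

for year, pct in sorted(shares.items()):
    print(f"{year}: {pct}% of {TOTAL_ITEMS} items reported on average")
print(f"Gain from 2000 to 2012: {gain_2000_to_2012} items")
```

Read this way, studies published in 2012 still omitted, on average, well over a third of the checklist items, even though the mean had risen by 3.4 items since 2000.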
We conclude from these two studies that the completeness of reporting of diagnostic accuracy studies has improved in the 10 years after the launch of STARD, but that it remains suboptimal for many articles.
Over the past year, the STARD group, currently consisting of over 85 people, has been working on an update of the checklist. The three main goals of this update are:
(1) to facilitate the use of the checklist by rearranging and rephrasing items,
(2) to include new information, based on improved understanding of sources of bias and variability and other issues in diagnostic accuracy studies, and
(3) to improve consistency with other reporting guidelines such as CONSORT.
After two web-based surveys and a live two-day meeting in Amsterdam, a pre-final version of STARD 2015 has now been put together and is being piloted. The final checklist is planned for launch in late 2015.
What do we learn from the first ten years of STARD, and how can we make sure that STARD 2015 will further improve reporting quality?
Because of the widespread attention that STARD has received from medical journals, and because of the large number of STARD adopters among these journals, we expected that major improvements in the reporting of diagnostic accuracy studies would automatically follow. Our evaluations have shown that this is not the case. Other well-known reporting guidelines, such as CONSORT and PRISMA, have faced similar problems (6). Apparently, dissemination of a reporting guideline cannot focus solely on journals. Developers of reporting guidelines should continue to seek innovative ways to reach authors, reviewers and editors, and to convince them of the necessity of complete reporting. EQUATOR plays an important role in this.
For STARD 2015, we aim to provide online training material and workshops, build templates to facilitate writing and peer-reviewing of study reports, and encourage the development of extensions specifically designed for different fields of research. In addition, a close collaboration with the EQUATOR network should ensure that a wide audience is reached. This collaboration will initially comprise the gradual provision of much more information related to the STARD guideline on the EQUATOR website.
With STARD 2015, we further hope to convince the scientific community of the necessity and simplicity of complete reporting of diagnostic accuracy studies.
(1) Bossuyt PM, Reitsma JB, Bruns DE et al. The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration. Clin Chem 2003;49:7-18.
(2) Bossuyt PM, Reitsma JB, Bruns DE et al. Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD Initiative. Radiology 2003;226:24-28.
(3) Korevaar DA, van Enst WA, Spijker R, Bossuyt PM, Hooft L. Reporting quality of diagnostic accuracy studies: a systematic review and meta-analysis of investigations on adherence to STARD. Evid Based Med 2014;19:47-54.
(4) Korevaar DA, Wang J, van Enst WA et al. Reporting Diagnostic Accuracy Studies: Some Improvements after 10 Years of STARD. Radiology 2015;274:781-789.
(5) Smidt N, Rutjes AW, van der Windt DA et al. The quality of diagnostic accuracy studies since the STARD statement: has it improved? Neurology 2006;67:792-797.
(6) Turner L, Shamseer L, Altman DG et al. Consolidated standards of reporting trials (CONSORT) and the completeness of reporting of randomised controlled trials (RCTs) published in medical journals. Cochrane Database Syst Rev 2012;11:MR000030.