“Research parasites” = “Individuals who had nothing to do with the study, but use another group’s data for their own gains”

In an editorial published last year [N Engl J Med 2016; 374: 276–277.], the concept of “research parasites” was introduced. These are individuals who “have nothing to do with the design and execution of the study, but rather they use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers; or, they even use the data to try to disprove what the original investigators had proposed”. Research parasitism obviously includes large-scale data mining, such as analysis of the massive amounts of DNA sequence or other data that have been deposited in the public domain.

That 2016 editorial sparked discussion about the role of secondary data analysis in the scientific process, both in official letters to the editor and in informal commentary online. In light of the term’s widespread publicity, the five coauthors of the attached editorial chose to use it to recognize individuals who had recently practiced this “craft of data reanalysis” in a manner that was most significantly helpful to the scientific community.

The attached editorial does not use the term “research parasites” in a negative sense, but rather in a constructive one. Journals, although they conduct peer review, do not validate each experimental result or claim. Research parasites fill this gap and help to maintain the self-correcting nature of scientific inquiry. Scientists who perform rigorous parasitism put scientific work to the test, and their results may support, or may challenge, what we think we know.

At the Pacific Symposium on Biocomputing (PSB) 2017, the five coauthors presented the inaugural Research Parasite Awards to researchers selected for their “rigorous analysis of publicly accessible data.” They specifically sought to honor those whose work extended, replicated or disproved what the original investigators had posited; who were not involved in experimental design or data generation; who published independently of the original investigators (while appropriately crediting them); and who provided their own research products, including source code and intermediate or final results, in a manner that enhanced reproducibility. Please read the attached (interesting) 2-page letter for yourself.

Nature Genet April 2017; 49: 483–484
