The role of chance in scientific evaluation. Drawing by Andrzej Krauze

This article was reprinted with permission from Trends in Pharmacological Sciences, Vol. 22, No. 2, Feb. 2001 by the internet journal HMS Beagle (Feb. 2, 2001 Issue 95), and has since been widely disseminated by way of the Bionet newsgroups. David Horrobin kindly gave permission to post at this site.
OPINION: Something Rotten at the Core of Science?

David F. Horrobin

Abstract

A recent U.S. Supreme Court decision and an analysis of the peer review system substantiate complaints about this fundamental aspect of scientific research. Far from filtering out junk science, peer review may be blocking the flow of innovation and corrupting public support of science.

The U.S. Supreme Court has recently been wrestling with the issues of the acceptability and reliability of scientific evidence. In its judgement in the case of Daubert v. Merrell Dow, the court attempted to set guidelines for U.S. judges to follow when listening to scientific experts. Whether or not findings had been published in a peer-reviewed journal provided one important criterion. But in a key caveat, the court emphasized that peer review might sometimes be flawed, and that therefore this criterion was not unequivocal evidence of validity or otherwise. A recent analysis of peer review adds to this controversy by identifying an alarming lack of correlation between reviewers' recommendations.

The Supreme Court questioned the authority of peer review.

    "Many scientists and lawyers are unhappy about the admission by the top legal authority in the United States that peer review might in some circumstances be flawed" [1]. David Goodstein, writing in the Guide to the Federal Rules of Evidence - one of whose functions is to interpret the judgement in the case of Daubert - states that "Peer review is one of the sacred pillars of the scientific edifice" [2]. In public, at least, almost all scientists would agree. Those who disagree are almost always dismissed in pejorative terms such as "maverick," "failure," and "driven by bitterness."

    Peer review is central to the organization of modern science. The peer-review process for submitted manuscripts is a crucial determinant of what sees the light of day in a particular journal. Fortunately, it is less effective in blocking publication completely; there are so many journals that most even modestly competent studies will be published provided that the authors are determined enough. The publication might not be in a prestigious journal, but at least it will get into print. However, peer review is also the process that controls access to funding, and here the situation becomes much more serious. There might often be only two or three realistic sources of funding for a project, and the networks of reviewers for these sources are often interacting and interlocking.

    Failure to pass the peer-review process might well mean that a project is never funded. Science bases its presumed authority in the world on the reliability and objectivity of the evidence that is produced. If the pronouncements of science are to be greeted with public confidence - and there is plenty of evidence to suggest that such confidence is low and eroding - it should be able to demonstrate that peer review, "one of the sacred pillars of the scientific edifice," is a process that has been validated objectively as a reliable process for putting a stamp of approval on work that has been done. Peer review should also have been validated as a reliable method for making appropriate choices as to what work should be done. Yet when one looks for that evidence it is simply not there.

Why not apply scientific methods to the peer review process?

    For 30 years or so, I and others have been pointing out the fallibility of peer review and have been calling for much more openness and objective evaluation of its procedures [3-5]. For the most part, the scientific establishment, its journals, and its grant-giving bodies have resisted such open evaluation. They fail to understand that if a process that is as central to the scientific endeavour as peer review has no validated experimental base, and if it consistently refuses open scrutiny, it is not surprising that the public is increasingly skeptical about the agenda and the conclusions of science.

    Largely because of this antagonism to openness and evaluation, there is a great lack of good evidence either way concerning the objectivity and validity of peer review. What evidence there is does not inspire confidence and is itself open to many criticisms. Now, Peter Rothwell and Christopher Martyn have thrown a bombshell [6]. Their conclusions are measured and cautious, but there is little doubt that they have provided solid evidence of something truly rotten at the core of science.

Forget the reviewers. Just flip a coin.

    Rothwell and Martyn performed a detailed evaluation of the reviews of papers submitted to two neuroscience journals. Each journal normally sent papers out to two reviewers. Reviews of abstracts and oral presentations sent to two neuroscience meetings were also evaluated. One meeting sent its abstracts to 16 reviewers and the other to 14 reviewers, which provides a good opportunity for statistical evaluation. Rothwell and Martyn analyzed the correlations among reviewers' recommendations by analysis of variance.

    Their report should be read in full; however, the conclusions are alarmingly clear. For one journal, the agreement among the reviewers' opinions was no better than that expected by chance. For the other journal, the agreement was only fractionally better. For the meeting abstracts, the content of the abstract accounted for only about 10 to 20 percent of the variance in opinion of referees, and other factors accounted for 80 to 90 percent of the variance.
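To make the idea of "variance accounted for by the abstract" concrete, here is a minimal sketch in Python. It is not Rothwell and Martyn's analysis and uses none of their data. It assumes a simple one-way random-effects model, in which each rating is an abstract's underlying quality plus reviewer noise, and the ratings are simulated. The function names, the noise levels, and the choice of 200 abstracts are all illustrative assumptions; only the figure of 16 reviewers per abstract is taken from the text.

```python
import random
from statistics import mean


def variance_share_from_abstract(ratings):
    """Estimate the share of rating variance attributable to the abstract.

    `ratings` holds one row per abstract and one score per reviewer.
    Uses one-way ANOVA variance components (an assumed model, not
    necessarily the one used in the study under discussion).
    """
    n = len(ratings)       # number of abstracts
    k = len(ratings[0])    # reviewers per abstract
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]

    # Mean squares between abstracts and within abstracts.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    # Variance component for abstracts, floored at zero.
    var_abstract = max(0.0, (ms_between - ms_within) / k)
    return var_abstract / (var_abstract + ms_within)


def simulate_ratings(n_abstracts, n_reviewers, sd_abstract, sd_noise, seed=0):
    """Simulated (hypothetical) ratings: abstract quality plus reviewer noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_abstracts):
        quality = rng.gauss(0, sd_abstract)
        rows.append([quality + rng.gauss(0, sd_noise) for _ in range(n_reviewers)])
    return rows


# Assumed noise levels chosen so the true share is 1 / (1 + 2.5**2), about 0.14.
ratings = simulate_ratings(n_abstracts=200, n_reviewers=16,
                           sd_abstract=1.0, sd_noise=2.5)
share = variance_share_from_abstract(ratings)
print(f"Estimated share of variance due to the abstract: {share:.2f}")
```

Under these assumed settings the reviewer noise is two and a half times as large as the spread in abstract quality, which is the kind of imbalance a 10 to 20 percent share implies. When every reviewer of an abstract gives the same score the function returns 1, and when the abstracts are indistinguishable it returns 0.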

    These appalling figures will not be surprising to critics of peer review, but they give solid substance to what these critics have been saying. The core system by which the scientific community allots prestige (in terms of oral presentations at major meetings and publication in major journals) and funding is a non-validated charade whose processes generate results little better than does chance. Given the fact that most reviewers are likely to be mainstream and broadly supportive of the existing organization of the scientific enterprise, it would not be surprising if the likelihood of support for truly innovative research was considerably less than that provided by chance.

Objective evaluation of grant proposals is a high priority.

    Scientists frequently become very angry about the public's rejection of the conclusions of the scientific process. However, the Rothwell and Martyn findings, coming on top of so much other evidence, suggest that the public might be right in groping its way to a conclusion that there is something rotten in the state of science. Public support can only erode further if science does not put its house in order and begin a real attempt to develop validated processes for the distribution of publication rights, credit for completed work, and funds for new work. Funding is the issue that most urgently requires opening up to rigorous research and objective evaluation.

    What relevance does this have for pharmacology and pharmaceuticals? Despite enormous amounts of hype and optimistic puffery, pharmaceutical research is actually failing [7]. The annual number of new chemical entities submitted for approval is steadily falling in spite of the enthusiasm for techniques such as combinatorial chemistry, high-throughput screening, and pharmacogenomics. The push to merge pharmaceutical companies is driven by failure, not by success.

The peer review process may be stifling innovation.

    Could the peer-review processes in both academia and industry have destroyed rather than promoted innovation?

    In my own field of psychopharmacology, could it be that peer review has ensured that in depression and schizophrenia, we are still largely pursuing themes that were initiated in the 1950s? Could peer review explain the fact that in both diseases the efficacy of modern drugs is no better than that of the compounds developed in 1950? Even in terms of side-effects, where the differences between old and new drugs are much hyped, modern research has failed substantially. Is it really a success that 27 of every 100 patients taking the selective 5-HT reuptake inhibitors stop treatment within six weeks, compared with 30 of every 100 who take a 1950s tricyclic antidepressant compound?
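The size of that difference can be worked out from the two quoted figures alone. The short Python sketch below is illustrative arithmetic, not an analysis of any trial data: the function name is hypothetical, and the calculation says nothing about statistical significance or about the studies behind the figures.

```python
def compare_dropout(new_per_100, old_per_100):
    """Compare two discontinuation rates, each given per 100 patients.

    Returns the absolute difference (as a proportion), the relative
    reduction, and the number of patients who would need to receive the
    newer drug instead of the older one to avoid one discontinuation.
    """
    absolute_difference = (old_per_100 - new_per_100) / 100
    relative_reduction = (old_per_100 - new_per_100) / old_per_100
    patients_per_avoided_dropout = 1 / absolute_difference
    return absolute_difference, relative_reduction, patients_per_avoided_dropout


# Figures quoted in the text: 27 per 100 (newer drugs) versus 30 per 100 (1950s tricyclics).
abs_diff, rel_red, per_avoided = compare_dropout(27, 30)
print(f"Absolute difference: {abs_diff:.2f}")
print(f"Relative reduction: {rel_red:.2f}")
print(f"Patients per avoided discontinuation: {per_avoided:.1f}")
```

On these figures the gap is 3 patients in every 100, a relative reduction of one tenth, so roughly 33 patients would have to be given the newer drug for one fewer to stop treatment within six weeks.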

    The Rothwell-Martyn bombshell is a wake-up call to the cozy establishments who run science. If science is to have any credibility - and also if it is to be successful - the peer-review process must be put on a much sounder and properly validated basis or scrapped altogether.

David F. Horrobin (1939-2003), a long-time critic of anonymous peer review, headed Laxdale Ltd., which developed novel treatments for psychiatric disorders. In 1972 he founded Medical Hypotheses, the only journal fully devoted to discussion of ideas in medicine. In the present article Horrobin criticizes the form of peer review that has operated over past decades. The article does not criticize peer review itself, but the way it has been implemented. Alternative forms of peer review - such as the "bicameral review" approach suggested in these web-pages - have long been on the table. In Horrobin's words, bicameral review has the potential to put the peer-review process "on a much sounder and properly validated basis".  DRF

1 Daubert vs Merrell Dow Pharmaceuticals Judgement (1993) US Supreme Court (92-102) 509, 579

2 Goodstein, D. (2000) How science works. In US Federal Judiciary Reference Manual on Evidence, pp. 66-72

3 Horrobin, D.F. (1990) The philosophical basis of peer review and the suppression of innovation. J. Am. Med. Assoc. 263, 1438-1441

4 Horrobin, D.F. (1996) Peer review of grant applications: a harbinger for mediocrity in clinical research? Lancet 348, 1293-1295

5 Horrobin, D.F. (1981-1982) Peer review: is the good the enemy of the best? J. Res. Commun. Stud. 3, 327-334

6 Rothwell, P.M. et al. (2000) Reproducibility of peer review in clinical neuroscience: is agreement between reviewers any greater than would be expected by chance alone? Brain 123, 1964-1969

7 Horrobin, D.F. (2000) Innovation in the pharmaceutical industry. J. R. Soc. Med. 93, 341-345



This page was established circa 2001 and was last edited 19 May 2010 by Donald R. Forsdyke