The future of the peer-review systems
When academic journals were printed on paper, limited space obliged editors to have referees review submitted manuscripts and to publish them selectively. The spread of the Internet, however, has lowered publication costs, and the pace of modern research has left peer-reviewed journals behind. To meet the demands of the day, we should separate evaluation from publication and create a new system for evaluation. Here we propose a new system, named the recursive voting system, to replace the current peer-review system. This system is founded on two principles: the principle of elitism, “The more highly a researcher is rated, the better able he or she is to evaluate others’ research,” and the principle of democracy, “The more evaluators participate in voting, the less unfair and less biased the result becomes.”
- 1. The necessity of the peer-review systems
- 2. The problems of the peer-review systems
- 3. How to solve the problems of peer-reviewed journals
- 4. The recursive voting system
- 5. References
1. The necessity of the peer-review systems
Peer review is the process by which two or three experts in the same field scrutinize an author’s paper before it is published in a journal, so that only high-quality papers are published. The experts are called reviewers or referees. Peer review is also used to examine grant applications before they are sent to funding panels. The page space of a journal is limited, and so are research grants. Because scientists compete for these limited resources, an assessment system is needed to select among them. That is why peer review was introduced.
The first recorded peer review dates back to 1665, with the publication of the journal Philosophical Transactions of the Royal Society. The initial authorization for its publication included the order that it be first reviewed by some members of the Council of the Royal Society. In 1752, the Society established a “Committee on Papers” to review all articles considered for publication. Peer review, however, remained the exception until the middle of the twentieth century.
Reviewers’ identities are kept from the authors and applicants. Reviewers and authors are literally “peers” of each other, so their roles may later be reversed. If reviewers were not anonymous and drew negative conclusions that led to rejection, the author or applicant might bear a grudge and take revenge when the positions are reversed. If, on the other hand, reviewers always drew positive conclusions to avoid this, peer review would not perform its proper function. Therefore anonymity is necessary.
Authors or applicants are sometimes also kept unknown to reviewers. This reciprocally anonymous reviewing is called double-blind, while the usual form is called single-blind. The aim of double-blind peer review is to prevent reviewers from being influenced by the attributes of authors. Since peers know one another’s work, however, it has been pointed out that a referee can often identify the author by combining the author’s reference list with the referee’s own background knowledge. Because double-blind anonymity is difficult to preserve, it is less common than single-blind review.
It is valuable for researchers to receive unvarnished criticism from their peers and improve their papers. The peer-review system enables researchers to read quality papers and administrators to award research grants or decide on researchers’ promotion in a non-arbitrary way. This is why peer review has been widely accepted, both for publication and for the allocation of research resources, since the middle of the twentieth century.
2. The problems of the peer-review systems
Although the peer-review system is generally assumed to improve the quality of journals, the following problems have been pointed out.
2.1. The process takes a long time
It is rare for the editors and referees of a journal to accept a submitted manuscript unconditionally. Even when they accept it, they usually advise the authors to improve it in certain ways. The manuscript must then be rewritten, often with new data collected and new analyses done. Because of this procedure, it usually takes several months or more from initial submission to publication. This time-consuming procedure is a drawback in this age of rapid progress.
Nowadays researchers have come to post pre-publication versions of their papers in institutional repositories in order to communicate their results at an early stage. Though journals prohibit duplicate submission of an article, Nature, for example, allows preprint publication, saying, “You are welcome to post pre-submission versions or the original submitted version of the manuscript on a personal blog, a collaborative wiki or a preprint server at any time.”
The most famous preprint server is arXiv, which was launched at the Los Alamos National Laboratory in 1991 and is now hosted and operated by Cornell University. As of 2011, 97% of papers on particle physics were available on arXiv. Most researchers read papers there, and fewer than 10% read them in the journals. The peer-reviewed journals of this field no longer play the role of publishing. There are other preprint servers, such as viXra.org, Philica and Science Paper Online.
2.2. Referees might steal information
The long review period increases this risk. Peers are researchers in the same field, which means that they are rivals. As is well known, researchers compete for priority in their research. Peer review is as risky as letting competing companies examine a pending patent application. Journals instruct referees to observe confidentiality, but there is no guarantee that they will.
Some researchers have used a clever trick to prevent not-yet-published information from leaking. Paul Chu (朱經武) developed a Y-Ba-Cu-O compound system, adopting yttrium instead of lanthanum, and succeeded in achieving superconductivity above 77 K in 1987. Fearing that this important information would leak before publication, Chu substituted Yb (ytterbium) for Y (yttrium) in the manuscript and changed Yb back to Y before final printing. Chu’s trick actually protected his priority, because the erroneous information about a Yb-Ba-Cu-O compound system “had clearly reached rival research groups before publication.”
Preprint servers like arXiv can also reduce this sort of risk. They use electronic signatures and timestamps that establish the author’s priority. Publishing papers on a personal website could arouse the suspicion that the author, as the administrator of the site, might have falsified the contents and timestamps of the papers. Those who care about their priority would be best advised to publish their work as a preprint or technical report on a public system managed by a reliable administrator.
2.3. Modification is difficult after publication
It is difficult to modify articles once they are published in peer-reviewed journals, especially when they are printed on paper. Authors can at best add corrigenda or addenda. As the role of a journal is to authorize articles, it is not desirable to modify them after the authorization. That is why editors are cautious about publishing. But spending a long time on publication does not reduce the possibility of mistakes to zero. The copyright of articles is usually transferred to the publisher, and the authors are forbidden to reproduce their articles without permission. Modifying an article and resubmitting it to another journal is called self-plagiarism and is regarded as scientific misconduct.
Modifying articles is quite easy on preprint servers. For example, arXiv allows you to replace a publicly announced article with a new version. To make the establishment of priority compatible with flexibility of revision, it is necessary to record who wrote the original and when, who commented on it and when, and which parts the author revised and when. This is what Wikipedia does, and it is not difficult for other online media to do the same.
2.4. Publishing costs a lot
The costs of preprint servers can be kept low because they are limited to server maintenance. Although arXiv has moderators for each subject area who review submissions and may delete or recategorize them, they are volunteers and are not paid for their work. The budget for arXiv, funded by Cornell University Library, was $400,000 per year as of 2010. Submitting and downloading articles will remain free, though Cornell is asking the 200 institutions that download most from the repository to make annual contributions to help fund it.
The publishers of peer-reviewed journals, by contrast, bear the labor costs of editors and the costs of paper, printing and binding in addition to server costs. Publishers, if they are commercial, must also make a profit. Unlike other periodicals, academic journals can have researchers write and review articles for free and then sell them to libraries at exorbitant prices, so their profit margins can be huge. The London-based Institute of Physics, for example, earns more than 60% of its total income from publishing. George Monbiot criticized profit-making academic publishing as “economic parasitism” that lives on taxpayers’ money.
In recent years, more and more open-access journals, whose online articles everyone can read for free, such as PLOS, have begun to challenge traditional academic publishing. On September 24, 2012, Nature even reported that the entire field of particle physics, the main field of arXiv, was set to switch to open-access publishing, calling it “a milestone in the push to make research results freely available to readers.” General readers would welcome open-access publishing, but researchers must then bear the cost of publishing. Nature Communications, for example, charges American researchers $5,000 per article. Open-access publishing just shifts the burden from libraries to researchers, and the total costs remain almost the same.
2.5. Reviewing is often inappropriate
The results of peer review and the conclusions of editors are not necessarily right. Peer-reviewed journals can make two sorts of mistakes: (1) publishing what should not be published and (2) not publishing what should be published.
(1) The typical case of this failure is the referees’ inability to detect intentional fabrication or falsification. The recent examples of Jan Hendrik Schön, Hwang Woo-suk (황우석) and Diederik Alexander Stapel show that even top journals such as Science and Nature could not prevent large-scale fabrication. This was not because the referees and editors happened to be careless. Apart from pointing out simple mistakes that they can detect on reading, all referees can do is check whether the article refers to past achievements and whether it is original. Editors are not specialists and are more concerned with whether readers will take an interest in the article. As Nature Neuroscience admits, “detecting well-done fraud during peer review is nearly impossible in practice.” Referees review articles for free and cannot carry out expensive reproducibility tests. So they cannot detect most unintentional errors either.
(2) This failure is apparent from the fact that more than 20 Nobel laureates have had their work rejected by journals. Conservative journals tend to select senior authorities in the field as referees and adhere to the ruling paradigm. Commercial journals, on the other hand, might prefer sensational articles to solid ones in an attempt to boost sales. Of course, we can say that the various editorial policies maintain variety as a whole, but so long as only a few people are involved in editing, no journal is immune from bias.
3. How to solve the problems of peer-reviewed journals
We have enumerated five problems, all of which preprint servers can solve except the last. Progress in science requires three functions: publication, communication and evaluation. Preprint servers fulfill the first two functions but not the last, which peer-review systems fulfill. That is why peer-reviewed journals of particle physics, the main field of arXiv, are still published.
Evaluating an article by the value of the journal that publishes it puts the cart before the horse, because Thomson Reuters’ impact factor, the representative proxy for the importance of a journal within its field, derives the value of a journal from that of the articles published in it. For example, the 2012 impact factor of a journal is calculated as follows:
The 2012 impact factor = A/B
A = the number of times articles published in 2010-2011 were cited in indexed journals during 2012
B = the number of articles published in 2010-2011
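As a minimal sketch of the calculation above, the quotient can be written as a small Python function. The function name `impact_factor` and the example figures (500 citations, 200 articles) are hypothetical and serve only to illustrate the formula.

```python
def impact_factor(citations_in_year: int, articles_published: int) -> float:
    """Return A / B for one journal and one year.

    citations_in_year  -- A: citations received during the year by articles
                          published in the two preceding years
    articles_published -- B: number of articles published in those two years
    """
    if articles_published <= 0:
        raise ValueError("B must be a positive number of articles")
    return citations_in_year / articles_published


# Hypothetical journal: 200 articles in 2010-2011, cited 500 times during 2012.
example = impact_factor(500, 200)
print(example)
```

Because A is a plain sum over all articles, the quotient says nothing about how the citations are spread among individual articles.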
Since the value of a journal depends on that of the articles published in it, the latter can be calculated independently of the former. Though authoritarians might evaluate articles merely by the reputation of the journal that publishes them, and the bandwagon effect could accelerate the citation of a famous article, these are just secondary effects that should be disregarded.
Google’s PageRank™ is based on a similar idea: the more a web page is linked to by others, the more valuable it is. It actually uses, however, a more complex mathematical algorithm. The PageRank of a web page is defined recursively, as illustrated in the figure below: a web page linked to by many pages with high PageRank receives a high rank itself. Another rule of PageRank is that the more outgoing links a web page has, the less each of its links is worth.
A similar methodology was developed for determining citation-based influence measures for scientific journals, subfields and fields as early as 1976.
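The recursive definition of PageRank described above can be approximated by repeated updating. The following is a simplified sketch, not Google’s actual algorithm: the damping factor of 0.85, the iteration count, the handling of pages without outgoing links, and the four-page example graph are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank by repeated updating.

    links maps each page to the list of pages it links to.
    damping and iterations are assumed values, not taken from the text.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page's rank is split evenly among its outgoing links,
                # so each link is worth less the more links the page has.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its
                # rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical graph: D links to three pages, C links only to A.
ranks = pagerank({"A": ["C"], "B": ["C"], "C": ["A"], "D": ["A", "B", "C"]})
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

In this example A and B each receive one third of D’s rank, but A also receives the whole rank of the highly ranked page C, so A ends up far above B. That is the recursive effect: a link counts for more when it comes from a page that is itself highly ranked.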
Methods of this kind, whether recursive like PageRank or not, do not always measure the value of articles correctly. George Michael Sheldrick’s 2008 paper “A short history of SHELX” is meant to be cited whenever one or more of the open-source SHELX programs are employed in the course of a crystal-structure determination. The article received 5,624 citations, so the impact factor of Acta Crystallographica Section A: Foundations of Crystallography, which published it, jumped from 2.051 in 2008 to 49.926 in 2009, surpassing those of Nature (31.434) and Science (28.103). This incident drew criticism of the journal impact factor.
We can find a similar case in PageRank. When you use WordPress, a popular blogging platform, you can choose a theme created by a third party, but your choice automatically inserts a footer link to the creator’s website into every page of your blog. If the theme is popular, links to the creator’s site proliferate so much that the site is overrated. Another problem with counting links or citations as votes arises when they are used to criticize the target. These problems spring from the fact that a link or citation does not always indicate that the linked or cited work is good.
Although citation in academic journals is akin to linking in web pages in many respects, there are some differences between them. Unlike PageRank, the journal impact factor allows only the elite to participate in the evaluation. Generally speaking, professional scientists do not want to let amateur scientists enter their community. For example, on January 17, 2004, arXiv began requiring new users to be endorsed by another user before submitting their first paper, in order to filter out articles it considers inappropriate and thus distinguish itself from laymen’s sites. arXiv maintains that this endorsement system, together with the moderator system, ensures the quality of its content. Though arXiv is not a peer-reviewed journal, it still sticks to traditional elitism.
Some scientists criticized arXiv’s policy of endorsement and moderation and founded a new site named viXra.org. The name, arXiv spelled backwards, signals that its policy is the reverse of arXiv’s. In fact, viXra is an open repository that does not endorse the e-prints accepted on its website, nor does it review them against criteria such as correctness or the author’s credentials. The site is so minor, however, that very few professional scientists read the articles on viXra.
Closed elitism, however, has various problems. The small number of participants results in a shortage of data, fixed evaluation criteria, collusion between peers, and vulnerability to manipulation. To prevent such harmful effects, some propose altmetrics: alternative metrics of academic work based on reputation in social media, storage in online reference managers like Mendeley, social bookmarks, online links and so on.
With altmetrics, we can crowdsource peer-review. Instead of waiting months for two opinions, an article’s impact might be assessed by thousands of conversations and bookmarks in a week. In the short term, this is likely to supplement traditional peer-review, perhaps augmenting rapid review in journals like PLoS ONE, BMC Research Notes, or BMJ Open. In the future, greater participation and better systems for identifying expert contributors may allow peer review to be performed entirely from altmetrics. Unlike the JIF, altmetrics reflect the impact of the article itself, not its venue. Unlike citation metrics, altmetrics will track impact outside the academy, impact of influential but uncited work, and impact from sources that aren’t peer-reviewed.
Thus altmetrics expands the objects of evaluation to include datasets, code, experimental designs, blogs, microblogs, comments, annotations on existing work and so on.
Certainly, increasing the number of contributors and evaluators, by expanding the community of scientists and the objects of evaluation, will increase the data available for evaluation and diversify scientific works and the criteria for evaluating them. But if resources for research were distributed based only on altmetrics, researchers would turn to populism and avoid esoteric or solid research. Expanding the objects of evaluation might also allocate resources to contributors of non-academic work. Therefore it is necessary to confine the objects of evaluation to media for researchers (including sites like viXra) and to construct a new hybrid metric, based mainly on citation data and secondarily on altmetrics, that calculates the impact factor of researchers’ works.
Considering how important the allocation is, we should not allocate resources, namely grants and academic posts, using only metrics mechanically generated from proxy indices such as citations, links, bookmarks and so on. We should vote directly on the allocation. The proxy indices are useful for screening, but we must judge for ourselves whether the screened candidates are really excellent. Otherwise we would encounter discrepancies like the case of Sheldrick’s paper “A short history of SHELX.”
4. The recursive voting system
A mere hybrid of the traditional citation index and altmetrics is not enough. So I would like to propose a new evaluation system, the recursive voting system, to replace the existing peer-review system.
- First, the impact factor of each individual researcher is measured by means of the impact factors of his or her works. When a work is collaborative, as is usual in the natural sciences, its impact factor should be distributed according to each author’s contribution, as reflected in the order of the authors’ names. If the authors want to determine the distribution ratio themselves, they can state it in percentages.
- The next stage is an online vote held annually to rank individual researchers. Those who apply for a research grant or an academic position or promotion must take part in this vote. Members of the public (science journalists, amateur scientists, etc.) can also vote voluntarily. In any case, an identification procedure is necessary to prevent fraudulent voting.
- Each voter holds a number of votes according to his or her personal impact factor and casts them for works of others that have been entered for evaluation and that he or she considers valuable. Suppose a voter has 100 votes. The voter can, for example, give 40 votes to article A, 30 votes to article B, 20 votes to article C, and 10 votes to article D.
- A voter must state the reason for each vote. This record is published and itself becomes an object of evaluation, which is used to calculate the voter’s next impact factor. Unfair cronyism and irrational reasons will lower the voter’s personal impact factor. This evaluation of evaluation prevents manipulation and unfair voting.
- Each country invites applications every year and allocates research grants and academic posts to applicants according to the poll results. The result of the voting is recursively reflected in the next year’s impact factors. That is to say, the personal impact factors for a given year are calculated partly from the poll results of the previous year and partly from the impact factors of the current year.
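The steps above can be sketched in code. What follows is one possible reading, not a specification: the text does not say how author order translates into shares, so the 1/k weighting is an assumption, as are the equal blend between citation impact and poll result (`poll_weight=0.5`) and all function names. The evaluation of voters’ stated reasons is not modeled.

```python
def split_credit(work_impact, authors, shares=None):
    """Distribute one work's impact factor among its authors.

    shares: optional author-declared percentages that sum to 100.
    Without them, the k-th listed author gets weight 1/k (an assumed rule;
    the text only says that author order reflects contribution).
    """
    if shares is None:
        weights = [1.0 / k for k in range(1, len(authors) + 1)]
    else:
        if len(shares) != len(authors) or abs(sum(shares) - 100.0) > 1e-9:
            raise ValueError("give one percentage per author, summing to 100")
        weights = list(shares)
    total = sum(weights)
    return {a: work_impact * w / total for a, w in zip(authors, weights)}


def tally(ballots, vote_budget):
    """Sum the votes each work receives.

    ballots:     {voter: {work: votes}}
    vote_budget: {voter: number of votes the voter holds}
    """
    totals = {}
    for voter, allocation in ballots.items():
        if any(votes < 0 for votes in allocation.values()):
            raise ValueError(f"{voter} cast a negative number of votes")
        if sum(allocation.values()) > vote_budget[voter]:
            raise ValueError(f"{voter} cast more votes than allotted")
        for work, votes in allocation.items():
            totals[work] = totals.get(work, 0) + votes
    return totals


def next_personal_impact(citation_impact, poll_score, poll_weight=0.5):
    """Blend this year's citation-based impact with last year's poll result.

    poll_weight is an assumed parameter; the text only says "partly".
    """
    return (1.0 - poll_weight) * citation_impact + poll_weight * poll_score


# The example from the text: a voter with 100 votes splits them 40/30/20/10.
# voter2, with a budget of only 10 votes, is a hypothetical second voter.
budgets = {"voter1": 100, "voter2": 10}
ballots = {
    "voter1": {"A": 40, "B": 30, "C": 20, "D": 10},
    "voter2": {"D": 10},
}
print(tally(ballots, budgets))
print(split_credit(12.0, ["first author", "second author"]))
```

Because each ballot is capped by the voter’s budget, and the budget follows from the voter’s personal impact factor, many voters with small budgets move the totals less than a few highly rated voters do.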
The recursive voting system is founded on two principles: the principle of elitism, “The more highly a researcher is rated, the better able he or she is to evaluate others’ research,” and the principle of democracy, “The more evaluators participate in voting, the less unfair and less biased the result becomes.” Without the latter, research could stagnate, with ivory-tower scholars adhering to the traditional paradigm. Without the former, the result would be populism, and public grants might be spent on astrology, medical quackery, occult beliefs and other pseudoscience. Both principles must be adopted to keep research sound and innovative.
The recursive voting system maintains the principle of elitism through its recursiveness. The impact factors of works and researchers, the votes and the poll results are all defined recursively, like PageRank. Thanks to this recursiveness, the academic establishment will retain its influence in this system. Even if a cult leader tried to legitimize a pseudoscientific doctrine by mobilizing followers en masse, the wall of recursiveness would keep them from manipulating the outcome to any significant degree.
The recursive voting system, on the other hand, establishes the principle of democracy through its open voting. This openness somewhat increases the risk of pseudoscience entering orthodox science, but we should not shut the door on outsiders just because their work is called pseudoscience. The boundary between science and pseudoscience is vague, and we should recognize that some currently established theories were once called pseudoscience by the mainstream scientists of their time. It is not scientific to call a theory pseudoscience without verification, however different it is from established theories. Nature published an article by researchers at the Stanford Research Institute who tested Uri Geller’s performances of drawing duplication, and this test and its conclusion should not be condemned as unscientific.
Peer-reviewed journals keep referees anonymous for fear that rejected authors might take revenge on them, but the recursive voting system does not need anonymity. In this system, no researcher has the right to stop another researcher from publishing articles or the decisive power to change the allocation of research resources. Comments on papers, which are not objects of evaluation under the peer-review system, are achievements to be assessed in the recursive voting system. Critical comments, if they are to the point, are more valuable than conforming ones. Criticism is productive whether it is accepted or refuted, and therefore researchers do not need to worry about grudges or revenge.
As we have already seen, peer-reviewed journals cannot detect intentional fabrication or falsification. How is this problem solved in the recursive voting system? Since articles are published without any check, an online repository cannot detect intentional fabrication or falsification at all. For that very reason, researchers would doubt an unexpected result in a non-peer-reviewed repository and conduct a replication experiment, whereas they would accept without question a result published in a prestigious peer-reviewed journal. Researchers have sufficient motivation to carry out a replication experiment, because its result can itself be an object of evaluation. When the replication fails to reproduce the result, the original experimenters must resolve the doubt. Otherwise their integrity might be called into question. Intentional fabrication or falsification to obtain money is fraud, which the police should investigate.
The spread of the Internet has transformed hierarchical social systems into flat networks. Though the Internet started as a tool for researchers, the transformation of academia lags far behind that of journalism, because journalism does not require as much professionalism and specialization as academia does. There used to be a clear boundary between professional journalists and mere consumers of information. The distinction has disappeared since personal blogs began to compete on equal terms with the mass media. The recursive voting system I propose here would bring about a similar transformation in academia.
5. References
- Wendy Wagner, Rena Steinzor. Rescuing Science from Politics: Regulation and the Distortion of Scientific Research. Cambridge University Press, 1st edition (July 31, 2006), pp. 220-221.
- Nature. “Confidentiality and pre-publicity.”
- 瀧川 仁. “素粒子物理学系ジャーナルのオープンアクセス化の試み” [An attempt at open access for particle physics journals]. 国立情報学研究所 (National Institute of Informatics).
- Wu, Maw-Kuen, et al. “Superconductivity at 93 K in a new mixed-phase Y-Ba-Cu-O compound system at ambient pressure.” Physical Review Letters 58.9 (1987): 908.
- Rustum Roy & James R. Ashburn. “The perils of peer review.” Nature 414, p. 393–394 (2001).
- arXiv. “To replace an article.” revision 0.3.4. Last modified 2019-01-04.
- Since I use MediaWiki to write the draft of this article, it records which parts were corrected and when. This does not mean, however, that the software can protect my priority, because the server that hosts the private wiki is under my management and some may suspect that I altered the dates. So, to protect my priority, I have to leave a cache on a server managed by a third party such as the Internet Archive. As the crawling happens only once every several months and not all content is cached, this service is not perfect, but it can be used to demonstrate my priority to some extent.
- Greg Landgraf. “Cornell Seeks Sustainable arXiv Support.” American Libraries Magazine. February 18, 2010.
- Richard Van Noorden. “Britain aims for broad open access.” Nature, 19 June 2012.
- George Monbiot. “Academic publishers make Murdoch look like a socialist.” The Guardian, 29 August 2011.
- Richard Van Noorden. “Open-access deal for particle physics.” Nature, 24 September 2012.
- Nature Communications. “Open access options.” Accessed on 20 Aug 2016.
- Nature Neuroscience Editorial. “Can peer review police fraud?” Nature Neuroscience 9, p. 149 (2006).
- Nature Editorial. “Coping with peer rejection.” Nature 425, p. 645 (2003).
- FML. “PageRank.” Licensed under CC-BY-SA.
- Pinski, Gabriel, and Francis Narin. “Citation influence for journal aggregates of scientific publications: Theory, with application to the literature of physics.” Information Processing & Management 12.5 (1976): 297-312.
- George Michael Sheldrick is a British chemist, born on 17 November 1942 in Huddersfield.
- Jordan D. Dimitrov, Srini V. Kaveri, Jagadeesh Bayry. “Metrics: journal’s impact factor skewed by a single paper.” Nature 466, p. 179 (2010).
- “The arXiv endorsement system.” arXiv.org.
- Roger A. Brumback, MD. “Impact Factor Wars: Episode V—The Empire Strikes Back." Journal of Child Neurology. March 1, 2009.
- altmetrics.org. “How can altmetrics improve existing filters?” version 1 – released October 26, 2010.
- Targ, Russell, and Harold Puthoff. “Information transmission under conditions of sensory shielding.” Nature 251.5476 (1974): 602.