Science in the golden shackles of its imaginary impacts

June, 2014.

by Muru Venkatapathi.

The impact of science on our daily lives is ubiquitous, but tracing bits of this impact to individual works of science with reasonable certainty is becoming impossible in most cases. Nevertheless, considering that we spend a notable part of our financial commons (or GDP) on scientific endeavors, such a microscopic estimate of the impact of each scientific work seems unavoidable. Even for a work in the fundamental sciences that is far from any immediate use, an estimate of its impact on our knowledge is quite pertinent. As a first approximation, measuring the impact of science has been reduced to quantifying the citation-impact of a scientific work. Such measures assume that the citations received by a scientific work are unbiased pointers to its real impact. Further, the citation-impacts of works are aggregated to estimate the impacts of larger entities such as journals, scientific institutions and the careers of individual scientists. Moreover, monetizing both scientific discoveries (by patents) and the access to scientific publications exacerbates this necessity of micro-estimating scientific impacts. The argument that technology is a great end-use of science but not its primary motivation is unappealing even to most scientists today. Hence, the use of citation-impacts to justify the quantum of funding to scientific organizations, and to evaluate the scientific competence of everyone from countries to individuals, is the order of the day. This has begun a debate on the pitfalls of drawing such conclusions from an emphasis on citation-impacts [1-3].

Let us start with the broad agreements among the scientific community on this issue:

1) Measuring impacts is necessary.

2) Citations earned by a scientific work have a positive correlation with its actual technological and scientific impacts.

3) Citation-based indicators are far from perfect, primarily due to uncertainty in the relationship between real and citation impacts (notwithstanding any advanced processing of citation data).

The strong disagreements arise from the effects of (3), especially its effect on the way we do science in the long term [1-5]. While many believe that, despite its limitations, the current strong emphasis on such indicators has been fruitful, many others argue that their premature use in decision-making severely stifles science due to fundamental deficiencies of the citation system and its indicators. In this article, I point to the large lacunae in our rudimentary citation system that make the quantification of real impacts unreliable except in restricted cases. These points are valid irrespective of the statistical metrics used in processing the citation data. Next, I point to specific practices in the current system of scientific publication that can multiply these negative consequences into a vicious runaway cycle in the long term. This discussion offers suggestions that can make the measurement of real impacts more accurate and also help increase the signal-to-noise ratio of scientific publications. I argue that these improvements in the systems of citation and publishing are vital, and should receive the strong support of the scientific community irrespective of which side of the above argument one subscribes to.

Does every citation indicate an identical impact? Does this fallacy result in a folly?

When one attempts to derive metrics for scientific impacts from citations, the following issues should be pondered.

A] Grades of citation: A citation earned by a scientific work indicates one of three kinds of contribution to the citing publication. The first, and more notable, kind is a contribution to the methods used in the citing work; the second is a relevant work with comparable or contradictory results; and the third is a related work used to highlight either the historical antecedent or the contemporary significance of the citing work. The first kind is found in the methods and introductory sections of a manuscript, whereas the second kind is typically found in the introduction and the results/discussion. The third kind is limited to the introductory section of a manuscript. It is thus natural to require that citations be distinguished by their graded relevance to the citing work, as the differences in their real impact on the citing work may span orders of magnitude. On average, less than 20% of the references of a typical manuscript are unique, indispensable citations, more so in the applied areas of science.

B] Methods matter: It should also be noted that the current practices of highly visible journals (as described in the next section) explicitly discourage a detailed description and verification of methods, in favor of longer introductory sections and more plots of the results. The questionable justification is that many of the methods are eventually repeated in the prolific publication of incremental works, and that they do not appeal to a wider readership. These factors introduce a large bias against manuscripts describing new, essential analytical or experimental methods that are fundamental and general.

C] Citations can be inherited: Even before the era of search engines, it was shown that indicators like citations have the characteristics of a greedy propagator, i.e., the effect of the rich getting richer [6], making advisors and co-authors at graduate school a significant causal factor in the citation-impacts of a scientist's later works. This effect has increased with the internet age. It also introduces a bias against a scientist working in multiple scientific areas, while largely favoring incremental publishing on a problem to saturation, as this can garner more hits in a search engine.

D] The more authors, the merrier: One of the most glaring faults in the current indicators is that the total citation-impact earned by a publication is not shared among its authors but is instead duplicated to each of them, i.e., the sum of the citation-impacts attributed to the authors is not conserved and can far exceed the citation-impact of the publication!

E] Quality of citations: Recently, there has been an effort to include the apparent quality of a citing publication in determining the impact of a work. In principle, this can be done using the citation data, provided the pitfalls A, C and D are sufficiently addressed. If these are allowed to linger, impact indicators based on advanced data processing techniques can only enlarge those lacunae.

F] Blind spot of industrial impacts: One other glaring deficiency of citation-impacts is that an industry using the work in a publication has a large disincentive to reveal its trade secrets by citing that work.
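The non-conservation described in point D can be made concrete with a small calculation. The sketch below is only an illustration under stated assumptions: the two papers and their citation counts are hypothetical, and an equal split among authors is just one simple way of sharing credit, not a scheme prescribed in this article.

```python
from fractions import Fraction


def full_counting(papers):
    """Current practice: every author is credited with all citations of a paper."""
    credit = {}
    for authors, citations in papers:
        for author in authors:
            credit[author] = credit.get(author, 0) + citations
    return credit


def fractional_counting(papers):
    """Assumed alternative: a paper's citations are split equally among its authors."""
    credit = {}
    for authors, citations in papers:
        share = Fraction(citations, len(authors))
        for author in authors:
            credit[author] = credit.get(author, Fraction(0)) + share
    return credit


# Hypothetical data: (authors of the paper, citations received by the paper)
papers = [
    (["A", "B", "C", "D"], 100),
    (["A"], 30),
]

total_citations = sum(citations for _, citations in papers)
full = full_counting(papers)
fractional = fractional_counting(papers)

print("citations earned by the papers:", total_citations)
print("sum of author credits, full counting:", sum(full.values()))
print("sum of author credits, equal shares:", sum(fractional.values()))
```

With full counting, the credits summed over authors exceed the citations actually earned by the papers, and the excess grows with the number of co-authors. With equal shares, the two totals agree by construction.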

Conflicts of interest: Science vs. the Journal

Monetizing scientific publications has resulted in a necessity to make journals highly visible. It is in the interest of the scientific community that parochial interests do not trump the larger interests of science. Unfortunately, a high standard of science does not necessarily have a notable correlation with a wide readership (which is needed for high visibility and citation-impacts). Large increases in the number of doctoral students and publications, along with this need for journals to be distinguished, have severely stressed the peer-review process. The introduction of full-time editorial staff to screen manuscripts before peer-review is a result of this need. A non-practicing scientist is employed by a journal (often for decades, resulting in entrenched interests) primarily to screen submitted manuscripts so as to maximize the future citation-impact of the journal. Naturally, such editors are well trained to distinguish the apparently good manuscripts from the average ones, but more importantly, they mimic the non-expert wide readership they seek for the journal. Typically, each one of them is expected to peruse and make decisions on a few thousand manuscripts in a year. Such decisions are not scientifically justified, and more importantly, this practice ensures that scientific merit plays a minor role in comparison to the significance perceived by a non-expert [7].

It is a system designed to publish manuscripts that appeal even to people who may not sufficiently understand their contents. Based on a superficial understanding, a vicious cycle of inflation in publications on any subject, along with their citation-impacts, can result, and this seems to satisfy the false premise of an increasing quality and quantity of science. There is also an explosion of literary and algebraic embellishments in publications that appeal to such editorial staff and the larger readership, naturally at the cost of our understanding of the science. In many cases, peer-review in these journals has been reduced to the opinions of peers on the appropriateness of a manuscript for the journal, often shifting the focus unscientifically from ‘what is being said’ to ‘who is saying it’.

Above all, these practices and their negative consequences have been justified by the imaginary impacts enumerated by the citations accrued to journals. But the actual signal-to-noise ratio in science may have fallen drastically. Leaving aside this opinion on the difference between real and citation impacts, one should at least take note of the ability of highly visible journals to accommodate the most cited publications [3]. The correlation of the most highly cited papers with the highly cited journals (in all areas) was moderate before 1960 and climbed until the dawn of the internet age (~1990). Since then, it has taken a sharp downward trend (~2002), clearly showing that publication practices meant to ensure high visibility run counter to accommodating the most excellent scientific works of our times. Systems optimized for high throughput and higher averages naturally have trouble accommodating the most original works. Too much specialization of journals is counterproductive as well, since the duplication of scientific knowledge and vocabulary slows down actual scientific progress despite an increase in citations.

Finally, a specific example of the uncoupling of citation-impacts and real impacts is tempting here. The seminal paper of Pines and Bohm [8] on collective excitations of free electrons in a metal (called plasmons today) has earned ~700 citations in sixty years. Not surprisingly, even invited opinions on the use of plasmonics that were published in highly visible journals have attracted more than 5000 citations in just the last decade, which, taken naively, would signal an impact almost a hundred times stronger. The remedies for the large lacunae in journal publishing practices are mostly well-known. An effort to limit the unbridled monetizing of access to scientific publications has already begun. This should be followed by a double-blind peer-review process that puts emphasis on scientific rigor and on the simplicity of a solution to the problem, as this has become a dire need.

References:

1. www.ascb.org/SFdeclaration.html.

2. Luís A. Nunes Amaral, “Measuring Impact: Scientists must find a way to estimate the seemingly immeasurable impact of their research efforts,” The Scientist (Opinion), February 24, 2014.

3. George A. Lozano, Vincent Larivière and Yves Gingras, “The weakening relationship between the Impact Factor and papers’ citations in the digital age,” Journal of the American Society for Information Science and Technology 63, 2140–2145 (2012).

4. Richard Naftalin, “Rethinking Scientific Evaluation: Asymmetry in the Research Excellence Framework in the U.K. is a threat to basic medical sciences within British medical schools,” The Scientist (Opinion), July 16, 2013.

5. Orion Penner, Raj K. Pan, Alexander M. Petersen, Kimmo Kaski, and Santo Fortunato, “On the Predictability of Future Impact in Science,” Scientific Reports 3, 3052 (2013).

6. Matthew J. Salganik, Peter Sheridan Dodds, and Duncan J. Watts, “Experimental Study of Inequality and Unpredictability in an Artificial Cultural Market,” Science 311, 854-856 (2006).

7. R. G. Steen, A. Casadevall, and F. C. Fang, “Why Has the Number of Scientific Retractions Increased?,” PLoS ONE 8(7), e68397 (2013). doi:10.1371/journal.pone.0068397.

8. D. Pines and D. Bohm, “A collective description of electron interactions: I and II,” Physical Review 82, 625-634 (1951); Physical Review 85, 338-353 (1952).

9. Douglas N. Arnold and Kristine K. Fowler, “Nefarious Numbers,” arXiv:1010.0278 (2010).

I don’t mind if you think slowly, but I object if you write papers faster than you can think.

- Wolfgang Pauli

It was very easy in those days for any second-rate physicist to do first-rate work. There has not been such a glorious time since then. It is very difficult now for a first rate physicist to do even second-rate work.

- Paul Dirac