**By Rubén Ruiz, professor, Department of Applied Statistics, Operational Research and Quality, UPV; & Ángel A. Juan, associate professor, Computer Science Department, IN3 – UOC.**

Data is everywhere. In the age of the internet we use data, computers, and statistics to analyze, measure, and compare everything: different strategies, policies, systems, processes, and even people. This allows us to be better informed and, as a consequence, to make better decisions. In a word, it makes us wiser than ever.

However, the ease of access to data and computers can also have drawbacks if we are not cautious. It is sometimes too easy to succumb to the temptation to compare apples with oranges, and to consider the results of these comparisons valid just because we are using fancy math formulas, lots of internet-based data sources, or powerful computers. Of course, drawing conclusions after comparing apples with oranges is a practice we should try to avoid.

In academia and research, this is what happens when we compare the impact factors (IF) of journals that belong to different disciplines. For instance, according to the ISI JCR 2013 Science Edition, the *Reliability Engineering & System Safety* journal has an IF of 2.048 and belongs in the first quartile of its subject category ranking (Operations Research & Management Science).

« Extracting conclusions after comparing apples with oranges is a practice we should try to avoid »

However, whilst having a similar IF (2.055), the *Neuroscience Letters* journal is located in the third quartile of its subject category (Neurosciences). To give an even more extreme example: *Advances in Mathematics* has an IF of 1.353 and belongs in the first quartile of its category (Mathematics), while *Methods in Cell Biology*, with an “equivalent” IF of 1.440, belongs in the fourth quartile of its category (Cell Biology).

Therefore, it is not reasonable to compare IFs belonging to different subject categories, at least not without applying some statistical adjustment to the index (e.g., normalizing by the IF range within each category). It is even worse, of course, when somebody makes a direct comparison between two indices that use completely different criteria, e.g., the IF and the SCImago Journal Rank (SJR). Both are positive numbers used to establish journal rankings, but they are based on different mathematical formulas.
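To make the idea of a statistical adjustment concrete, here is a minimal sketch of one such adjustment, min-max normalization of an IF within its own subject category. The category ranges below are hypothetical, chosen purely for illustration:

```python
def normalize_if(impact_factor, category_min, category_max):
    """Min-max normalize an impact factor within its own subject
    category, mapping it to [0, 1] so that values from different
    categories become at least roughly comparable."""
    return (impact_factor - category_min) / (category_max - category_min)

# Hypothetical category IF ranges (illustrative numbers only):
#   Operations Research: roughly 0.2 .. 4.0
#   Neurosciences:       roughly 0.5 .. 16.0
print(round(normalize_if(2.048, 0.2, 4.0), 2))   # high within its category
print(round(normalize_if(2.055, 0.5, 16.0), 2))  # much lower within its category
```

Under these assumed ranges, two nearly identical raw IFs land at very different relative positions, which is exactly the apples-and-oranges effect described above.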

And what happens when comparing the h-indices of authors belonging to different disciplines? Recall that the h-index of an author is the largest number h such that h of his or her Np papers have at least h citations each, while the remaining (Np − h) papers have no more than h citations each, where Np is the total number of papers the author has published over a given period. So an author with an h-index of 10 has published at least 10 papers with at least 10 citations each.
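The definition above translates directly into a few lines of code. This is a minimal Python sketch; the citation counts in the example are made up for illustration:

```python
def h_index(citations):
    """Compute the h-index: the largest h such that the author has
    h papers with at least h citations each."""
    h = 0
    # Sort citation counts from highest to lowest, then walk down:
    # the i-th paper (1-indexed) contributes as long as it has >= i citations.
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
        else:
            break
    return h

# Five papers with hypothetical citation counts:
print(h_index([10, 8, 5, 4, 3]))  # -> 4 (four papers have at least 4 citations)
```

Note that the fifth paper, with 3 citations, is consistent with the definition: the (Np − h) remaining papers each have no more than h citations.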

As with the IF, journals in some disciplines (e.g., Neurosciences) tend to receive many more citations than journals in others (e.g., Mathematics), so a direct comparison of h-indices across disciplines obviously runs into the same problem. It is easy to argue that the h-index is preferable to simply counting the total papers published or the total citations received. However, it is not an indicator that lends itself to easy comparisons between disciplines.

« Pure bibliometric indicators must be supplemented with an article-based assessment and peer-review of the past achievements of an individual »

As suggested in the paper by Ciriminna and Pagliaro^{1}, the h-index is a useful bibliometric indicator, but it should be used wisely, for example by applying an age-normalization or by considering the year of the author's first publication. Complementing the h-index with the SNIP (Source Normalized Impact per Paper) or the aforementioned SJR would give a much richer picture.

In any case, there is a growing move towards the view that, for greater accuracy, pure bibliometric indicators must be supplemented with an article-based assessment and peer review of the past achievements of an individual. After all, no single easy-to-understand, easy-to-calculate metric can fully capture the rich and complex nature of researchers' scholarly contributions to their disciplines, especially considering the varied nature of those disciplines. Many forms and facets of scholarly achievement should be considered.

In summary, both the IF and the h-index are useful indicators, but they should be statistically adjusted and used with caution, especially when used to contrast journals or authors across disciplines. Despite this quite obvious conclusion, it is surprising to see how many times brilliant academics, researchers, and managers are still comparing apples with oranges. Shouldn’t we use statistics in a smarter way?

Further reading:

- Anne-Wil Harzing (2010). Citation analysis across disciplines: The impact of different data sources and citation metrics.

_____

1. Ciriminna, R. and Pagliaro, M. (2013). On the use of the h-index in evaluating chemical research. *Chemistry Central Journal*, 7:132.