While I am lecturing MIT’s 8.012 class (“Mechanics for Masochists”), I don’t have a lot of time to concentrate on my brown dwarf research, so I find myself drawn into other, shall we say, intellectual dalliances. My most recent distraction has been the so-called Hirsch index, or h-index, a single number that is meant to indicate the “impact” of a researcher’s published work in science. Think of it as baseball stats for science geeks – Ted Williams may have the highest on-base percentage, but I’m sure I’ve got him beat on the h-index!

For better or worse, I’ve been contemplating this index quite a bit recently – perhaps too much. In particular, I’ve been concerned about its growing impact (pun intended) on hiring decisions, salary raises, and – perhaps most importantly – the psyches of young researchers who over- or under-estimate how their h-index is perceived. It seemed to me a perfect topic for an extensive blog discussion.

What is the h-index?

The h-index was formally proposed in a 2005 publication by Jorge Hirsch as a simple metric to gauge a researcher’s impact on her/his field (incidentally, Hirsch was also my Physics 105: Computational Physics professor when I was an undergraduate at UC San Diego). As described in Hirsch (2005):

“A scientist has index h if h of his/her Np papers have at least h citations each, and the other (Np−h) papers have no more than h citations each.”

In other words, if you list all of a researcher’s publications in decreasing order of citation count, the h-index is the rank of the last paper whose citation count is at least equal to its rank in the list.
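For concreteness, the calculation can be sketched in a few lines of Python (the function name is my own, purely illustrative):

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    # Sort citation counts in decreasing order, then walk down the list
    # for as long as each paper's citations meet or exceed its 1-based rank.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# A researcher with papers cited 10, 8, 5, 4, and 3 times has h = 4:
# four papers each have at least 4 citations, but not five with at least 5.
print(h_index([10, 8, 5, 4, 3]))  # -> 4
```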

While it sounds somewhat complicated to describe in words, the h-index can be determined very easily with publication search tools such as those on the ISI Web of Science (for which you or your university needs a paid subscription) or, more appropriate for astrophysics, NASA ADS (which is free!). Here’s a procedure to calculate the h-index using the latter:

  • Go to the NASA ADS site
  • Enter the researcher’s name (e.g. “Burgasser, A”) in the “Authors” text box at the top of the page
  • Choose the databases to query (“Astronomy” and “Physics” should generally both be checked)
  • Choose “Sort by citation count” in the Sorting section at the bottom of the page
  • Click “Send query”
  • The next page will give a list of papers ordered (at left) by the number of citations (the number immediately to the right of the list of authors). Scroll down until the order number equals (or is just less than) the number of citations. That’s the h-index (mine happens to be 29).

What is the h-index good for?

In his defining paper, Hirsch argues that the h-index is a useful number “to characterize the scientific output of a researcher.” This is justified through a variety of arguments, including the high h-index values for several prominent physicists, most notably Edward Witten (h=110 in 2005) and several Nobel Prize winners (however, see caveats below). Hirsch estimates that h^2 corresponds roughly to the total number of citations of an author (an arguably clear measure of impact), without being subject to biases such as a few “big hit” papers or highly cited (but with little original work) review articles. He also argues that the h-index has advantages over several other common metrics, such as the total number of papers (no measure of importance), citations per paper (“rewards low productivity”), number of significant papers (“significant” is arbitrary) and number of citations in the most cited papers (again, an arbitrarily defined set). He therefore argues:

“that two individuals with similar h are comparable in terms of their overall scientific impact, even if their total number of papers or the total number of citations is very different.”

Hirsch goes on to list some h-index “metrics” for gauging the status of an individual:

  • h ~ 10-12: advancement to tenure (associate professor)
  • h ~ 18: advancement to full professor
  • h ~ 45+: National Academy of Sciences (NAS) membership (the average for newly elected members in 2005 was just over 45 as reported in Hirsch’s paper)

These benchmarks should not be interpreted literally, however. There are several brown dwarf astrophysicists with h-indices well above 18 (including yours truly) who are nowhere close to full professorship. Furthermore, while Hirsch argues that Nobel prize winners and NAS members have high average h-index values, there are a handful of the former who have h-indices less than mine, but clearly far more impact! More on this later.

The h-index, being a seemingly simple number that can gauge the entire life’s work of a scientist, has, not unexpectedly, garnered considerable attention from these same scientists. Google Scholar lists >170 citations for the Hirsch 2005 paper as of December 9, 2007, most notably in the fields of bibliometrics and scientometrics (indeed, analysis of the h-index seems to form the core research of one Ronald Rousseau). Nature has published 13 articles and news stories related to Hirsch’s paper (which is itself published in the Proceedings of the National Academy of Sciences). The scientific press has picked up on this “scientist stat”, and it has been featured in a PhD Comics strip (a sign one has truly made it in the eyes of graduate students everywhere). Several groups have even come up with web-based calculators to compute one’s h-index using Google Scholar. There are also several extensive blogs discussing the h-index, including a very nice one by Prof. Anne-Wil Harzing. And I cannot count the number of researchers who are spending time (like me) looking up the h-indices of themselves and their colleagues.

Major caveats

Despite the hype, many researchers (including Hirsch himself) have pointed out caveats in using the h-index to assess the impact of a published researcher. Excellent reviews of these criticisms are found in recent articles by Bornmann & Daniel, Wendl, and Kelly & Jennions, and include the following issues:

  • Bounded by the number of publications: the h-index can only be as large as the number of publications. This makes it a poor metric for a young researcher (who, of course, may not yet have substantial impact), but also for well-established scientists whose work is confined to a few classic publications (a prime example is Alan Guth, founder of the inflationary theory of the Universe, who has a relatively paltry h-index of 22 according to NASA ADS).
  • Context issues: highly cited papers may in fact be review articles (with little new research) or poor articles that are negatively criticized by several citations. My feeling, however, is that if 100 articles are published to attack an article, it clearly had an impact (or at least struck a chord!).
  • The Matthew effect: analogous to “money makes money”, an already highly cited author may continue to garner citations because that person is highly visible. Again, I believe this in fact argues in favor of h-index measuring a researcher’s impact.
  • Home runs don’t matter: the h-index may de-emphasize very-high impact publications from a particular researcher, failing to distinguish that person from someone who is simply steadily productive. This touches on the meaning of “impact” implicit in this metric.
  • Lost in the crowd: the h-index makes no correction for researchers who are the Nth coauthor of a paper with many authors, and for which she/he may not have greatly contributed. This is common for large collaborations with agreements that all or most collaboration members are coauthors (frequently the case for particle physics experiments or for publications from consortiums such as the Sloan Digital Sky Survey).
  • Incompleteness in the publication/citation record: NASA ADS clearly advertises that its citation library is not complete, ISI Web of Science appears to be limited to 1955 and onward (based on a comment made in the Hirsch 2005 paper), and you can bet Google scholar is considerably incomplete (my h-index drops by 8 points when calculated with it). Such incompleteness worsens for the more senior scientists.
  • Generic names: The h-indices of scientists like Mike Brown or T. Nakajima are contaminated by the contributions of other scientists with overlapping names.
  • Self-citation: Hirsch points out this will affect the bottom few citations in the h-index set, possibly dropping one’s number by only 1 or 2 (in fact, it drops mine by 5). There is some controversy, however, as to whether self-citation is truly a problem.
  • Normalization between fields: Hirsch points out in his original paper that the h-index can be very different between the top physicists (which in 2005 topped out at 110 for Edward Witten) and top biologists (which in 2005 topped out at 191 for Solomon H. Snyder). This issue has been raised by a number of critics (see, for example, Iglesias & Pecharromán).
  • What defines impact? This is probably the most important criticism, and one that plagues any bibliometric index. Hirsch (and others) nicely tip-toe around this issue, which forces the user to understand what is really being measured in the h-index, relegating the interpretation of impact to the user (again, see Kelly & Jennions).

In response to some of these perceived deficiencies in the h-index, a number of “derivative indices” have been proposed by the number-happy researchers who follow such statistics (and appear to be equally enthralled as I am in wasting time on them). Beyond basic numbers (such as the total number of citations or number of citations per paper), these include:

  • g-index (Egghe 2006): Designed to boost the impact of very highly cited papers, the g-index corresponds to the gth paper in a citation-ranked list of publications for which the cumulative number of citations is g^2. This index effectively compares researchers’ total citations assuming that that number is proportional to h^2 (as per Hirsch 2005). Michael Schreiber finds that the g-index is more susceptible to self-citation, but nevertheless shows less variation amongst scientists at a common career phase.
  • H^2 index (Kosmulski 2006): The H^2 index is similar to the h-index, except that instead of the number h of papers with at least h citations, it is the number H of papers with at least H^2 citations each. For example, I have 9 papers with over 81 citations (my 9th has 90), so my H^2 number is 9. Like the g-index, this variant is meant to reward researchers with high citation counts per paper over the slow and steady researchers.
  • Individual h index (Batista et al. 2006): This index is meant to compensate for papers with many authors for which the researcher in question may have had minimal contribution. The individual h index can be computed by dividing the h-index by the average number of authors in the 1st through hth papers in a citation-ranked list. An alternate formulation is to divide each paper’s number of citations by the number of authors and compute the equivalent h-index.
  • Contemporary h-index (Sidiropoulos et al. 2006): This modification of the h-index is meant to enhance the weight of a researcher’s most recent papers in the calculation of the h-index, in order to gauge her/his current productivity. This is in some way meant to “flush out the dead wood”; i.e., those researchers who may have published considerably several years ago but haven’t had much to show more recently. Individual papers are weighted according to a prescription that balances the “decay time” of a paper and when that decay sets in (this compensates for the lag period during which a paper is absorbed into the literature of that field). One particularly brutal implementation, which Sidiropoulos et al. aptly name “publish or perish” (now “be cited or be consumed”), gives a weight of 4/(# years since the article was published).
  • A, R and AR index (Jin et al. 2007): Jin and colleagues suggest these few modifications to the h-index, all computed from the “Hirsch-core” – the set of h papers that contribute to the h-index. A is the average number of citations per paper within the h-core, R is the square root of the total number of citations in the h-core, and AR is the square root of the sum of citations/(# years since publication) over all of the papers in the h-core. The latter is meant to allow a decrease in the “impact” of a scientist who has stopped publishing. However, I feel this supposition is hard to justify, given for example Albert Einstein’s continued impact despite a recent lack of publications!
  • m-index (Hirsch 2005): In the same paper that Hirsch defines the h-index, he also defines an m-index to gauge the rate at which a researcher’s h-index increases; i.e., the growth of impact. Hirsch assumed (and later verified) that the h-index grows roughly linearly with time, h ~ mn, where n is the “academic age” of a researcher (hence m = h/n). This latter, rather subjective variable is assumed by Hirsch to be the number of years since that researcher’s first publication. Hirsch considered m ~ 1 as an indicator of a successful scientist, m ~ 2 an outstanding scientist, and m ~ 3 a truly unique individual (these classifications are at best misleading: I wonder how Hirsch would classify Edo Berger, who has an m-index of 5.3!).
  • V index (Vaidya 2005): Jayant Vaidya suggested a modification of the m-index, scaled by the proportion of time a researcher is able to spend actually doing research. This would obviously normalize out the time constraints on academics whose time is spent teaching and in committees (or writing blogs), but it is a somewhat arbitrary number to come up with.
  • h-b index (Banks et al. 2006): This index is defined identically to the h-index, but is instead applied to a field of research as opposed to an individual researcher. In other words, it is meant to gauge the impact of a particular field. However, I find this index to be somewhat suspect. If I do a citation-ranked search of papers on NASA ADS using keywords of “brown dwarf” and “dark matter”, I find h-b indices of 213 and 205, respectively. I highly doubt brown dwarf research has a greater impact than dark matter, particularly with regard to national funding!
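Several of the variants above can be sketched from the same citation-ranked list. The Python functions below are illustrative implementations of the g-index, the H^2 index, and the m parameter as summarized above (the function names are my own, not from the cited papers):

```python
def g_index(citations):
    """g-index (Egghe): largest rank g such that the top g papers
    together have at least g^2 citations."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites  # cumulative citations of the top `rank` papers
        if total >= rank * rank:
            g = rank
    return g

def h2_index(citations):
    """H^2 index (Kosmulski): largest H such that H papers each
    have at least H^2 citations."""
    ranked = sorted(citations, reverse=True)
    h2 = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank * rank:
            h2 = rank
    return h2

def m_parameter(h, academic_age_years):
    """Hirsch's m: the h-index divided by the number of years since
    the researcher's first publication (h ~ m * n)."""
    return h / academic_age_years

# The same 5-paper citation record used above, [10, 8, 5, 4, 3]:
print(g_index([10, 8, 5, 4, 3]))   # cumulative sums 10, 18, 23, 27, 30
print(h2_index([10, 8, 5, 4, 3]))  # only 2 papers have >= 4 citations
```

Note that the g-index compares cumulative citations against g^2, so a single “home run” paper can raise g well above h, which is precisely the behavior Egghe intended.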

So the h-index can be problematic; but it is also simple to use and simple to calculate, and it is clearly popular amongst bibliometrists, employers, job seekers, and time-wasting researchers. This blog entry was meant purely as a starting point. In a future blog, I will throw out my own suggestions for improvements to the h-index (yea, more indices!), examine how this index rates brown dwarf astrophysics, and assess when the h-index and its derivatives would be best served or ignored.