Sunday, January 1, 2012

My Life with h

I came to know h only when h was already well-known to many others. And then, one day, an epiphanic moment: the sudden realization that h had intruded on the academic careers of all of us without even asking for consent. Since then, h and I have maintained a rather ambivalent relationship. On the one hand, I should probably have liked h, because my private h is nothing to be ashamed of in my scientific discipline. On the other, I think that h undermines proper scientific culture.

The Hirsch index, Hirsch number, H index/number, h-index, or simply h to connoisseurs, was formally born on November 15, 2005, in a paper entitled "An index to quantify an individual's scientific research output", single-authored by J.E. Hirsch from the Department of Physics at UCSD. In one of the tersest and most effective abstracts I have ever encountered in the scientific literature, Hirsch states: "I propose the index h, defined as the number of papers with citation number ≥h, as a useful index to characterize the scientific output of a researcher." The rationale is hence straightforward: find a single measure to encapsulate the scientific output of a scientist, to be used for evaluations made in processes of recruitment, promotion, competitive grant allocation, or simply for experiencing naches. The methodology is also amazingly simple: scroll down the list of publications, arranged by times cited from the highest to the lowest, and stop at the last paper whose number of citations is still equal to or greater than its rank; that rank is h. Of course, you do not really have to do it that way; nowadays you can simply select the "create citation report" on ISI Web of Knowledge (Thomson Reuters) or a similar data base, or, for those reluctant to spend even this amount of energy, use one of the "h index calculators" on the web (but beware, some yield bizarre results).
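For those who would rather not scroll by hand, the procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `h_index` and the citation counts below are invented for the example, and are not taken from Hirsch's paper or from any data base.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Arrange the papers by times cited, from the highest to the lowest.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            # This paper still has at least as many citations as its rank.
            h = rank
        else:
            # Every later paper has even fewer citations, so stop scrolling.
            break
    return h


# Hypothetical citation counts for six papers (made up for illustration).
example_counts = [25, 8, 5, 3, 3, 1]
print(h_index(example_counts))  # prints 3
```

In this invented list the third paper has 5 citations (at least 3), while the fourth has only 3 (fewer than 4), so h is 3. Note that h can never exceed the number of papers published.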

The ingenious simplicity of h - an entire career squeezed into a single, usually double-digit, number - immediately gained much popularity. On a recent search Google returned no fewer than 641,000,000 results for "h index", and although I admit I didn't scan them all, a quick glance revealed that many of them do indeed refer to the Hirsch index. The methodological shortcomings of h became apparent immediately, but didn't slow down the infection. Clearly, there is no doubt that Hirsch meant only well. But when an auteur releases a piece of art into the universe, that creation acquires a life of its own (though in this case I suspect that Hirsch foresaw something, dubbing "quantification" of science "potentially distasteful" already in the first paragraph of his paper).

Among the issues brought up from the outset: the dependence of h on the culture of the specific discipline (h values are higher, for example, in molecular biology than in math or psychology); the effect of the size of the sub-discipline and of the research teams; the bias against books, which are cited sparsely in research papers; the difference between archival papers and groundbreaking papers that ultimately make it into textbooks; and the context of citation, including whether it refutes the work cited. Theoretically, one could even make an h-index-promoted career by publishing irreproducible results in a catchy field. At the time of writing, h can only grow over time, though on second thought the idea of procedures to reduce h over time for poetic justice is not entirely irrational. Pitfalls and potential remedies in using h are outlined nicely by various authors, including in Wikipedia.

But in my view, the major drawback is not methodological, but rather conceptual. h boosts instant scientific culture. Using it relieves many of the need to look more closely into (or God forbid, even read) papers of those they attempt to evaluate. It is an epitome of industrialized science. Clearly, high h (hh, following the spirit of instant science and the universe of texting) is not sufficient evidence for high quality science, nor is the lack of hh evidence for lack of strong impact. Some of the most influential scientists I know, including Nobel laureates, have a rather meager h within their own discipline. I think that h provides an incentive to avoid devoting the attention needed to really evaluate lasting contributions to science, and we always risk the tendency to find refuge in the easy solutions. Although the reductionist approach was highly successful in promoting modern science, the reduction of careers to a single numerical index is going much too far.

Using h is also, in my view, an additional incentive to publish too many fragments of papers instead of coherent narratives. The late Max Delbrück, a founding father of modern molecular genetics and biology and Nobel laureate in Physiology or Medicine in 1969 (with Alfred Hershey and Salvador Luria), advocated the idea that PhDs should get coupons upon graduation, each to be used for publishing a paper. No unused coupons, no more papers. There was an argument about what the number of coupons should be, but it was never deemed beneficial to set it larger than 30. Since h can never exceed the number of papers published, this would by definition have limited h to a small, useless index, forcing authors to publish no more papers than they could seriously read, and others to read these papers. But it is too late, probably, to revive the Delbrück principle.

By the way, Hirsch's PNAS, 102: 16569 (2005) has so far been cited 903 times, making it by far the most cited publication of Hirsch, and has contributed to his own h index, which is 48, pretty high for a physicist.

© Yadin Dudai 2011

Sunday, December 18, 2011

Flirting with Oblivion

K., my host, had a timid smile that didn’t yield even the slightest hint to the news she was going to break. “Looking for your publication record in PubMed”, she said while we were crossing the spacious cafeteria of the Institute of Molecular Pathology (IMP) in Vienna, “I noticed a gap of about thirty years. You must have done other things then?”

Well, I did do many other things in those years, but the major one was pursuing my research interests. This unavoidably culminated in many papers, some of which I admittedly still feel very attached to. What K. implied was, however, that no trace could be found of these papers in the major data base in my field. A gloomy, existential uneasiness descended on me. I was supposed to start delivering the Max Birnstiel Lecture at IMP within half an hour, yet decided that there was still enough time to conduct an independent experiment. I woke my MacBook Air from its electronic sleep, connected to the internet, and lo and behold, the bare truth shone on the glossy screen: PubMed indeed displayed only a single reference to one of my earliest papers, not necessarily one to write home about, then only a few from the last few years. The rest were gone, although they had been there a few weeks earlier when I needed to consult the site. I did the expected control experiment and typed a colleague’s name in the search line. His entire list of publications came up proudly in no time. I tried mine again. No remedy. Nada. Gornisht mit gornisht. The public track record of my academic career seemed to have suddenly descended into oblivion.

Scientific careers are not unlike the names of T.S. Eliot’s cats: they come in three versions. First, there is the private version, known only to oneself, reconstructed and sometimes naturally self-inflated as years go by, an associative autobiographical web of episodic and semantic information unavailable to those not personally involved. Second, there is the version known to colleagues, containing elements of shared experiences and interests and discussions and colored by context, anecdotes, and personal touch. And finally, there is the public version, which whoever wanders into the literature encounters. Nowadays, the information for this public profile is retrieved from the web, and data bases are the major source. In science as in science, scribo ergo sum, but whether you did publish or not is disclosed by the search function of the data bases. Hence when our records disappear from these data bases, we take a step toward oblivion in the scientific universe.

Later at night, scenarios were alternating in my head. Some were straightforward, like whom I should actually contact to find out what happened and make sure my record is restored. Some were admittedly paranoiac. Was it personal, or did I stumble across a plot by Dr. No to slowly annihilate world science? Or, maybe, this could have been expected, since I didn’t listen to smart advice years earlier. When I first met my postdoctoral mentor, Seymour Benzer, at Caltech, he had two statements to make to the newcomer. One was that, being trained in Biophysics, I should pay attention to the fact that animals behave, and therefore should not rush to grind them up. This one I followed. The second was that I had already published too much at that early stage in my career, and should not continue cranking out papers at such a pace. This one I am not sure I did follow. Was this the belated punishment?

I woke up still engulfed by the sense of the frailty of the human condition. Luckily, PubMed’s fair treatment of my scientific output returned to normal within less than a day. Possibly, some merry electrons in a remote server regained their senses and decided to restore world order. Hence my mini-encounter with oblivion was so far only a brief flirtation, possibly even gone unnoticed except by K. and me. Yet I came to cherish hardcopies again, and I may even pay a visit to a real library just to pay tribute to the printed word. Come to think of it, evolution has probably embedded in us the reliance on tangibility, the need to palpate things before we really trust in them. I plan to print out all my papers and bind them, just in case. It may be difficult to place so much faith in clouds.

© Yadin Dudai 2011