Thursday, February 18, 2010

Waste of Time?

A comment on yesterday's post got me thinking about something: If a non-recent journal article gets a low number of citations (say, 0-2), was the research that went into that paper a waste of time (and money)?

My gut reaction is to say no, of course not. Surely something was learned during the research that led to even the most forgettable or forgotten of papers? And surely the researcher didn't know in advance that the paper would never be cited, and did the research for a good reason?

And perhaps citations are not a perfect measure of what is or is not worthwhile. It's not difficult to think of examples of highly cited papers that aren't that great, and barely cited papers that are overlooked (especially our own).

At the same time, a paper with zero citations, even after more than 10 years, might mean something..

.. such as:

- no one else in the world is or will be interested in this topic;

- others are interested, but they never publish;

- others are interested, but they only cite other papers, not yours (for various possible reasons), creating a snowball effect of subsequent citation of papers other than yours on this topic. With time, it becomes ever less likely that your paper will be cited.

Publishing something widely believed to be wrong or stupid isn't necessarily a barrier to citations, nor is publishing something obvious, so I am not including these in my list of possibilities.

Since I am in a quantitative mood this week, I tried to decide whether there is a minimum number of citations, above which we can say that the research was worthwhile, and below which we might have good reasons to doubt this.

My musings on this topic made me dive into my citation index to look at some of my low-citation papers to see if I could reasonably defend them as worthwhile in some way. My favorite example of a deservedly ignored paper in my oeuvre has surprised me by being cited in the low double-digits. Does that mean that the paper is more worthwhile than it was a few years ago because it has now received (slightly) more than 10 citations (none by me!) instead of 2 (or zero)? No, I don't think so. The difference between 14 citations and 2 citations really isn't that significant in terms of gauging the worth of a paper. And yet, although I am well aware that it was a fairly insignificant paper, I am reluctant to say it was a waste of time.

Further rummaging in my citation history shows me that some of my most highly cited papers do not represent what I consider to be my most significant work; they just happened to be on topics that are of more widespread interest than the core of my research. Does that make these more-cited papers more "important" than my others? I am not objective about this, but I don't believe that citations correlate with significance, though I admit that it depends on how you define "important" and "significance".

One more personal example: A paper that has received a very modest number of citations is frequently mentioned to me as a paper that is read and discussed in graduate seminars. I am very pleased about that. The paper is being read and used (perhaps as an example of how not to write a paper..), although it is not cited very often. I consider that paper to have been worthwhile.

So, although I agree that zero citations is not a good thing for non-recent papers, and my papers have thus far avoided this fate (though in some cases not for any good reason), I have trouble casting aspersions on papers that have received a modest but non-zero number of citations.


Drugmonkey said...

citation is next to useless for making a categorical assumption of worth. the question is whether people are reading it and learning something. period.

PLoS enthusiasm for article level metrics will help to show the difference between cites and downloads which will underline the disconnect.

John Vidale said...

As the author of the precipitous two-citation = junk jingoism, here's a brief defense, or at least the statement's genesis.

First, something's funny about the statistic of half of papers having 2 or fewer citations. Only 6 of my 100 papers have 5 or fewer citations and are not editorials or published in 2008 or 2009. Maybe they're counting abstracts or some such grey literature.

Of those 6, 4 in hindsight were a waste of effort, 1 I did with an undergrad to make fun of an earthquake prediction crank, and I like a 2007 one with 4 citations. Several papers with 6 citations I'm proud of.

So I made a rule that held true for my own papers, halved the number of citations, and brashly put it in an FSP comment.

Anonymous said...

This reminds me of the NBA general manager (perhaps not the most appropriate source for a comment here, but whatever) who said his best strategy for improving his team's efficiency was to acquire players who only take the shots that go into the basket, and don't take any of the others. Write 'em and maybe someone will cite 'em; if not, oh well.

qaz said...

I have two comments on this topic:

First, remember that although we (as scientists - or is it the business?) try to quantify impact through citations, they are only a correlation, not a definition. In one case, there was a paper that tested an explicit prediction I had made in a theoretical paper some years earlier but never cited that theory paper. I emailed the senior author to ask if she knew about my paper and our prediction. She wrote back saying that of course she knew about the paper - that was why she'd done the experiment in the first place. And then she sheepishly admitted she probably should have cited the paper and mentioned that.

The other thing is that as long as something is in the literature it may reappear. One often finds examples of work that is ignored for years and then rediscovered or reexamined. Some enterprising scientist may go back, find it, wonder whether it lacked impact because the original authors didn't have the technology we do today, and follow up on the experiment. I know of several cases like this.

Anonymous said...

I wasn't sure what to think when I first read this, and then I was reminded of all the orphaned data in labs. The data can be orphaned because it's negative, doesn't fit into any other stories in the lab, or isn't deemed worthy of publication. The third of these is what concerns me. I myself rescued quite a bit of orphaned data from a collaborator a couple of years ago, and that data made our story stronger and more interesting. So, just because the data on its own doesn't seem worthwhile, either by a PI or lack of citations, it still can be of use to someone out there. Maybe others don't pursue a lost cause, maybe they structure their studies differently. Failure is the cornerstone of success, and there is no "wasted" experiment, so long as the proper conclusions and "next steps" are taken.

Anonymous said...

Just found this post, thought it would be an interesting read for everyone involved in the process of science and judging scientists.

Anonymous said...

Yeah... I think citation indexes (CI) are hugely problematic.

1) They are poor for judging recent papers since all the citations will not be out yet, and since our most important evaluations occur early in our careers, they are poor tools for evaluating hiring/tenure/promotion decisions.

2) Understandably because of their broadness, review papers receive by far the highest citation indexes. If that alone is not enough to convince that citation index and research quality are not correlated (since there is no research in review papers) I'm not sure what is.

3) I know when I look for citations to use in papers (particularly methods and introduction), I will often just cite the same paper someone else did when making a related statement in their paper. Do I go and make absolutely sure that there isn't another paper out there that would be more appropriate? In 90% of cases, NO. There's too much of a snowballing effect when it comes to CI.

a physicist said...

I've been meaning to chime in on this week's discussions for a while. I see two issues that haven't been mentioned.

First: In physics, I often hear that three papers are required for a PhD student to graduate, at least as a rule of thumb. For really strong students, this is not a problem. What about students who are below average but still worthwhile people? What about experiments that don't work the way we want them to, no matter how good the student is? For these cases, I think it's better to write a low-impact paper. The alternatives might be delaying graduation by a few years, or a student who has a tough time getting any sort of job later on.

Second, it seems like a lot of physics workshops and conferences in Europe have requirements that invited speakers submit a paper to their conference proceedings. I personally hate this requirement; my guess is that to get money to sponsor the conference, they have to show that there is a physical outcome from the conference in the form of a published proceedings. Often these proceedings are in low-impact journals or worse. So, I tend to only put low-impact results in these proceedings, if I can't get out of the requirement and still want to go to the conference.

The first problem (PhD students who need to publish SOMETHING) is hard to fix. The second problem (proceedings) is potentially fixable and would reduce some of the clutter of low-impact papers, at least in my corner of physics.

EliRabett said...


Eli has a pub that has zero citations. The reason is that it essentially killed off a hot area of inquiry showing that much of what was measured was an artifact.

Most influential paper the Bunny ever published.

Anonymous said...

Again, this will have to be discipline specific. I looked quickly at two of John V's papers to see about 4 citations per page of text. In pure mathematics this number for many many authors is less than 1. So, in a field where people publish less frequently and cite less frequently than most, any discussion of minimum citations is not very helpful. Perhaps download counts will help but with so many ways to get papers this is not always a good metric either.

Hope said...

"The first problem (PhD students who need to publish SOMETHING) is hard to fix."

No, it isn’t. Get rid of the publishing requirement for PhD students – problem solved.

"PLoS enthusiasm for article level metrics will help to show the difference between cites and downloads which will underline the disconnect."

When it doesn’t cost me anything to download something, you can’t assume that I will only download what I find valuable or useful.

Ms.PhD said...

some of the biggest discoveries went almost completely un-cited for 20 or even 100 years, because the person who worked on the stuff originally was dead (mendel) or the authors went right to the edge of what was possible with the technology at the time, and then there was a gap until the technology caught up to the point where someone else could continue the work.

not a waste of time to do the work, not a waste of time to publish it.

just sad that it went unappreciated for so long before anybody noticed.

辛保山 said...

How about this for a quantitative mood:

"A Principal Component Analysis of 39 Scientific Impact Measures", J. Bollen et al.

It says "usage" metrics are more appropriate than citation ones. However, I feel that what matters is the general social process of doing "science". With enough bona fide scientists, "gold nuggets always shine given time".

Kevin said...

I have 3 papers with 0 citations.
One is my first paper ever, answering a published conjecture in math. It was read by me and the person who published the conjecture. I never expected another reader.

Another was a conference paper by my grad student that just came out last summer. I expect it will pick up a dozen or so citations over the next 5 years.

Another was a publication by a volunteer postdoc about 5 years ago. It was not a great piece of work, but it was respectable, and its appearance on his resume helped him get a job.

My total citations are only 2455 (with 0 to 582 per paper), but there is a huge difference between different fields. Even low-quality bio papers get citations in the dozens, while high-quality math papers may only get cited 5 times. Having switched fields a few times, I've seen that citation counts and derived measures (like h-index) are very, very field-specific.

John Vidale said...

Some posts have arm-waving and absolutist logic, but no systematic facts to back them up.

As a practicing scientist, all I can say is that in hundreds of cases, I've pulled up the citation counts for someone's papers and found them a very useful guide to which papers were most influential, and which people have accomplished what. Not in every case, but usually. As I said above, it would certainly work in my case.

It is not clear that there is any other way, short of asking the person's closest colleagues, with all the potential biases and smoke-filled logic that elicits.

I imagine usage and citation each have drawbacks, and citation is much easier to count objectively.

Kevin said...

Another problem with citation---some people fail to cite. I have one paper that has been cited only 230 times, but the algorithm in it is referred to on the web over 99,000 times.

Anonymous said...

one of my most highly cited papers required a critical detail in the experiment (without which the rest of the project would not have worked) that was previously published several decades ago and cited only in the single digits.

Even if a paper has been around for a long time and has hardly been cited, there is no telling that further into the future it may not become useful or play a role in some other (presumably more important) research to come.

Anonymous said...

Papers that get highly cited tend to be those related to what a lot of people are also doing. A paper that is not well cited could be because the author was not jumping on the bandwagon working on the same problems that everyone else is also working on and instead was more intellectually independent.

a physicist said...

An earlier version of this comment has gone missing, so here it is again.

@Hope: I agree it's easy to get rid of the formal requirement for publications to get a PhD. I think, though, it's hard to get rid of the informal requirement by people who want to hire someone with a PhD. Well, in some cases, they're not going to ask about publication record. But in general, if you're planning to hire a PhD, wouldn't it look bad if that person had no publications? Even a low-impact publication looks better than nothing: it shows that the person could finish a project.

Also, I think learning to write a paper is a great experience for students, even if it's a low-impact paper. Sure, you learn more if you write a high-impact paper. But again, if we're talking about a 4th or 5th year PhD student who hasn't written a paper yet, and who's not on the verge of a high-impact result, it seems like writing up a low-impact result is better than not having any publications.

But to be very clear about this: my main thought is to do what is best for the student. If a student has an urgent reason to graduate, and doesn't care about their publication record, then I don't want to stand in their way. (And this has happened to me more than once as a PhD committee member, where I signed a PhD approval form for someone with 0-2 publications.)

Anonymous said...

I manually ranked my top 30 papers from 1 to 30 according to what I believe their impact to be. Then I ranked the same set of papers according to their citation number using data from Google scholar.

The correlation between the two is 0.53.

Thinkerbell said...

I've worked on a topic that stole my heart for the better part of this century's first decade. No one else in the world seems to care. While I don't want to compare myself to Mendel or other Dead Important Dudes, I am CONVINCED that somewhere, somehow, sometime this work will make a difference to someone. I just hope I'll still be there to witness it.

Kevin said...

"The correlation between the two is 0.53."

Since you were correlating ranks, I hope that is Spearman's rho or Kendall's tau, and not Pearson's r that you are using for correlation.

Anonymous said...

Oops I posted this in the wrong thread. Here it goes again.

"Since you were correlating ranks, I hope that is Spearman's rho or Kendall's tau, and not Pearson's r that you are using for correlation."

I used Pearson's over the ranks (not the citation counts), which should be the same as Spearman's right?

I'm no expert in statistics. How strong is a correlation of 0.5 deemed to be?

Kevin said...

Pearson's r on ranks is indeed Spearman's rho. Personally, I like Kendall's tau better (more directly interpretable). In any case, how strong a correlation of 0.5 is depends on who you are and what sort of data you look at. For a sociologist, that's a huge correlation. For an electrical engineer, that's so noisy as to be nearly useless.
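
For anyone who wants to check the claim that Pearson's r computed on ranks matches Spearman's rho, here is a minimal Python sketch. The two rankings are made up for illustration (they are not the 30-paper data from the comment above), the function names are my own, and the shortcut formula for rho assumes no tied ranks. Kendall's tau is included for comparison.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman_no_ties(rank_x, rank_y):
    """Spearman's rho via 1 - 6*sum(d^2)/(n(n^2-1)); valid only without ties."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))

def kendall_tau(rank_x, rank_y):
    """Kendall's tau: (concordant - discordant) pairs over all pairs, no ties."""
    n = len(rank_x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical example: six papers ranked by the author's own judgment
# and by citation count (invented numbers, for illustration only).
own_rank = [1, 2, 3, 4, 5, 6]
citation_rank = [2, 1, 4, 3, 6, 5]

r_on_ranks = pearson(own_rank, citation_rank)
rho = spearman_no_ties(own_rank, citation_rank)
tau = kendall_tau(own_rank, citation_rank)
print(r_on_ranks, rho, tau)
```

On rankings without ties the first two numbers agree up to floating-point rounding. Tau is easier to read directly: it is the fraction of paper pairs the two rankings order the same way minus the fraction they order oppositely.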

vanzare apartamente cluj said...

I think citations are important ... or at least were. Now the volume of information is very high but the important thing is if people are reading it and learning something from that paper.