# Tag Archive

## The combinatorics of authorship in the biological sciences

November 3, 2018 in academia | Tags: authorship, enumeration, equal contribution, ordered authorship | by Lior Pachter | 7 comments

A few years ago I wrote a post arguing that it is time to end ordered authorship. However that time has not yet arrived, and it appears that it is unlikely to arrive anytime soon. In the meantime, if one is writing a paper with 10 authors, a choice for authorship ordering and equal contribution designation must be made from among the almost 2 billion possibilities (1857945600 to be exact). No wonder authorship arguments are commonplace! The purpose of this short post is to explain the number 1857945600.

At first glance the enumeration of authorship orderings seems to be straightforward, namely that in a paper with *n* authors there are *n!* ways to order the authors. However, this solution fails to account for designation of authors as “equal contributors”. For example, in the four-author paper Structural origin of slow diffusion in protein folding, the first two authors contributed equally, and separately from that, so did the last two (as articulated via a designation of “co-corresponding” authorship). Another such example is the paper PRDM/Blimp1 downregulates expression of germinal center genes LMO2 and HGAL. Equal contribution designations can be more complex. In the recent preprint Connect-seq to superimpose molecular on anatomical neural circuit maps the first and second authors contributed equally, as did the third and fourth (though the equal contribution of the first and second authors was distinct from that of the third and fourth). Sometimes there are also more than two authors who contributed equally. In SeqVis: Visualization of compositional heterogeneity in large alignments of nucleotides the first eight authors contributed equally. A study on “equal contribution” designation in biomedical papers found that this type of designation is becoming increasingly common and can be associated with nearly every position in the byline.

To account for “equal contribution” groupings, I make the assumption that a set of authors who contributed equally must be consecutive in the authorship ordering. This assumption is certainly reasonable in the biological sciences given that there are two gradients of “contribution” (one from the front and one from the end of the authorship list), and that contributions for those in the end gradient are fundamentally distinct from those in the front. An authorship designation for a paper with *n* authors therefore consists of two separate parts: the *n!* ways to order the authors, and then the ways of designating groups of equal contribution for consecutive authors. The latter enumeration is simple: designation of equal authorship is in one-to-one correspondence with placement of dividers in the *n-1* gaps between the authors in the authorship list, and since each gap either has a divider or does not, there are $2^{n-1}$ such placements. In the extreme case of placement of no dividers the corresponding designation is that all authors contributed equally. Similarly, the placement of dividers between all consecutive pairs of authors corresponds to all contributions being distinct. Thus, the total number of authorship orderings/designations is given by $n! \cdot 2^{n-1}$. These numbers also enumerate **the number of ways to lace a shoe**. Other examples of objects whose enumeration results in these numbers are given in the On-Line Encyclopedia of Integer Sequences entry for this sequence (A002866). The first twenty numbers are:

1, 4, 24, 192, 1920, 23040, 322560, 5160960, 92897280, 1857945600, 40874803200, 980995276800, 25505877196800, 714164561510400, 21424936845312000, 685597979049984000, 23310331287699456000, 839171926357180416000, 31888533201572855808000, 1275541328062914232320000.
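The count can be checked with a short Python sketch. The function names `authorship_designations` and `enumerate_designations` are made up for this illustration. The brute-force enumeration assumes, as in the argument above, that equal contributors are consecutive and that the order of names within an equal-contribution group still counts as part of the designation.

```python
from itertools import permutations, product
from math import factorial


def authorship_designations(n: int) -> int:
    """Closed form: n! orderings times 2^(n-1) divider placements (n >= 1)."""
    return factorial(n) * 2 ** (n - 1)


def enumerate_designations(n: int) -> set:
    """Explicitly build every designation for n >= 1 authors.

    A designation is a tuple of groups; each group is a tuple of
    consecutive authors marked as having contributed equally.
    """
    seen = set()
    for ordering in permutations(range(n)):
        # One boolean per gap: True means a divider sits in that gap.
        for dividers in product((False, True), repeat=n - 1):
            groups, current = [], [ordering[0]]
            for author, divider_before in zip(ordering[1:], dividers):
                if divider_before:
                    groups.append(tuple(current))
                    current = [author]
                else:
                    current.append(author)
            groups.append(tuple(current))
            seen.add(tuple(groups))
    return seen


if __name__ == "__main__":
    # Closed form for 1..20 authors, matching the list above.
    print([authorship_designations(n) for n in range(1, 21)])
    # Brute-force agreement for small n (the enumeration grows quickly).
    for n in range(1, 7):
        assert len(enumerate_designations(n)) == authorship_designations(n)
```

The brute-force check is only practical for a handful of authors, which is rather the point of the post.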

In the case of a paper with 60 authors, the number of ways to order authors and designate equal contribution is much larger than the number of atoms in the universe. Good luck with your next consortium project!
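As a rough check of the 60-author claim, the formula $n! \cdot 2^{n-1}$ can be evaluated at $n = 60$. The comparison figure of roughly $10^{80}$ atoms is a commonly quoted order-of-magnitude estimate for the observable universe; it is an assumption here, not a number taken from the post.

$$
60! \cdot 2^{59} \approx \left(8.32 \times 10^{81}\right) \cdot \left(5.76 \times 10^{17}\right) \approx 4.8 \times 10^{99} \gg 10^{80}
$$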

## Time to end ordered authorship

May 16, 2014 in talks | Tags: Arrow's impossibility theorem, authorship, card catalog, Dewey decimal classification, Hardy-Littlewood rule | by Lior Pachter | 13 comments

One of my distinct memories from elementary school is going to “library class” to learn about the Dewey decimal classification and how to use a card catalog to find books. Searching for books efficiently was possible because cards in the catalog were sorted lexicographically.

It didn’t occur to me at the time, but the system *required* authors of books to be totally ordered. Without an ordering of authors in a book with multiple authors, there would be no way to decide where to place the card for the book in a catalog searchable by author. The practice of ordering authors on publications is evident in the oldest printed texts and has persisted to this day. I have never thought that it could be any other way.

However this past Wednesday I was visiting the University of Washington to deliver a seminar, and among the highlights of the visit was my meeting with the graduate students. I met 12 for lunch and two more came for dinner. Meeting with students is always my favorite part of a visit to a university. They have original and creative ideas, and most importantly, are not bound in their thought by archaic tradition. They frequently don’t know what one is supposed to think and how one is supposed to say it. They just think and speak!

One of the students I met on Wednesday was Vanessa Gray, a student of Doug Fowler, who in a conversation on authorship practices suggested to me the radical and brilliant idea that **papers should be published without an ordering of authors**.

Many journals now have a section called “Author contributions” where roles of individuals in collaborative projects can be described (many journals now require such descriptions). So why bother ordering the authors for a list underneath the title? As far as indexing and searching goes, Google and other search engines require only a *set* of authors, and not a specific ordering.

I agree with Vanessa that ending author ordering on publications would greatly improve fairness in the biological sciences, where many current projects involve complex assemblies of teams with complementary skills. “First authorship” is not well-defined when one author performed a large number of difficult experiments, and another developed novel algorithms and wrote complex software for analyzing the experiments. Similarly, “last authorship” fails as a concept when students are co-advised, or one principal investigator provides substantial funding on a project, while another is participating in doing the work. And recently, large consortium projects have completely destroyed any meaning of “author” by having hundreds, or even thousands of authors on projects. Even when there are relatively few authors people rarely credit anyone except the first and last authors, even if others did substantial work. In the recent ENCODE paper published in PNAS with 30 authors, it appears to me from the responses to my previous blog post about the paper that the 5th and 6th authors did a lot (majority?) of the work in putting together figures and results, yet I suspect the “credit” for the paper will go to the first author (the flip side in that case is that the first author is where blame is assigned as well).

There is also a theoretical justification for *not* ordering authors. Ordering of authors on a publication can be thought of as a ranking produced by “votes” of the participants in the project. Of course in practice not all votes are equal. In what is called dictatorship in social choice theory, PIs frequently make the decisions independently of how specific authors feel they may have contributed. This may work on a paper where there is a single PI (although it may be considered unfair by the graduate students and postdocs); however, dictatorship as a system for determining authorship certainly breaks down when multiple PIs collaborate on a project. Arrow’s impossibility theorem explains that in the absence of dictatorship, there is a problem in producing a single ordering satisfying two other seemingly basic and essential fairness criteria. Informally, the theorem states that there is no authorship ordering system based on voting of contributing authors that can satisfy the following three criteria:

- If every author thinks that X should be ordered before Y, then the author list should have X placed before Y.
- For a fixed list of voting preferences regarding the ordering of X vs. Y, the ordering between X and Y in the author list will not depend on the ordering of other pairs such as X and Z, Y and Z, or Z and W.
- There is no “dictator”, i.e. no single author possesses the power to determine the author ordering.

Authors frequently have differing opinions about the impact of their own contribution to a publication, and therefore their preferences (votes) for author ordering are discordant. This means that any system for ordering authors will not satisfy everyone’s preferences, and in the sense of Arrow’s impossibility theorem will be unfair. One way around Arrow’s impossibility theorem is to specify authorship order without regard to authors’ preferences, for example by always ordering authors alphabetically (the Hardy-Littlewood rule). This method, usually the one used in the mathematical sciences, is also fraught with problems. Of course, listing author contributions for what they are is not entirely trivial. For example, different authors may have conflicting views about what it means to have “written the text of the paper”. But using words to describe contributions allows for much more detail about what each author did, and allows for nuanced contributions to be described (e.g., John and Jane were in the room when the initial idea for the project was discussed, but did not contribute anything afterwards).

To summarize, in the modern era of electronic publishing ordering of authors is unnecessary, and if it is unnecessary, then why confront Arrow’s theorem and inevitably produce orderings unfairly? Publications should just explain the author contributions. **Time to end ordered authorship**.

The card catalog at Yale University’s Sterling Memorial Library (from Wikipedia).
