[September 2, 2017: A response to this post has been posted by the authors of Patro et al. 2017, and I have replied to them with a rebuttal]
Spot the difference
One of the maxims of computational biology is that “no two programs ever give the same result.” This is perhaps not so surprising; after all, most journals seek papers that report a significant improvement to an existing method. As a result, when developing new methods, computational biologists ensure that the results of their tools are different from, and specifically better (by some metric) than, those of previous methods. The maxim certainly holds for RNA-Seq tools. For example, the large symmetric differences displayed in the Venn diagram below (from Zhang et al. 2014) are typical for differential expression tool benchmarks:
In a comparison of RNA-Seq quantification methods, Hayer et al. 2015 showed that methods differ even at the level of summary statistics (in Figure 7 from the paper, shown below, Pearson correlation was calculated using ground truth from a simulation):
These sorts of results are the norm in computational genomics. Finding a pair of software programs that produce identical results is about as likely as finding someone who has won the lottery… twice… in one week. Well, it turns out there has been such a person, and here I describe the computational genomics analog of that unlikely event. Below are a pair of plots made using two different RNA-Seq quantification programs:
The two volcano plots show the log-fold change in abundance estimated for samples sequenced by Boj et al. 2015, plotted against p-values obtained with the program limma-voom. I repeat: the plots were made with quantifications from two different RNA-Seq programs. Details are described in the next section, but before reading it first try playing spot the difference.
The reveal
The top plot is reproduced from Supplementary Figure 6 in Beaulieu-Jones and Greene, 2017. The quantification program used in that paper was kallisto, an RNA-Seq quantification program based on pseudoalignment that was published in
- Near-optimal probabilistic RNA-Seq quantification by Nicolas Bray, Harold Pimentel, Páll Melsted and Lior Pachter, Nature Biotechnology 34 (2016), 525–527.
The bottom plot was made using the quantification program Salmon, and is reproduced from a GitHub repository belonging to the lead author of
- Salmon provides fast and bias-aware quantification of transcript expression by Rob Patro, Geet Duggal, Michael I. Love, Rafael A. Irizarry and Carl Kingsford, Nature Methods 14 (2017), 417–419.
Patro et al. 2017 claim that “[Salmon] achieves the same order-of-magnitude benefits in speed as kallisto and Sailfish but with greater accuracy.” However, after being unable to spot any differences myself in the volcano plots shown above, I decided, with mixed feelings of amusement and annoyance, to check for myself whether the similarity between the programs was some sort of fluke. Or maybe I’d overlooked something obvious, e.g. the fact that programs may tend to give more similar results at the gene level than at the transcript level. Thus began this blog post.
In the figure below, made by quantifying RNA-Seq sample ERR188140 with the latest versions of the two programs, each point is a transcript and its coordinates are the estimated counts produced by kallisto and Salmon, respectively.
Strikingly, the Pearson correlation coefficient is 0.9996026. However, astute readers will recognize a possible sleight of hand on my part. The correlation may be inflated by similar results for the very abundant transcripts, and the plot hides thousands of points in the lower left-hand corner. RNA-Seq analyses are notorious for such plots, which appear sound but can be misleading. However, in this case I’m not hiding anything. The Pearson correlation computed with log(counts+1) is still extremely high (0.9955965) and the Spearman correlation, which gives equal weight to transcripts irrespective of the magnitude of their counts, is 0.991206. My observation is confirmed in Table 3 of Sarkar et al. 2017 (note that in this table “quasi-mapping” corresponds to Salmon):
For context, the Spearman correlation between kallisto and a truly different RNA-Seq quantification program, RSEM, is 0.8944941. At this point I have to say… I’ve been doing computational biology for more than 20 years and I have never seen a situation where two ostensibly different programs output such similar results.
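For readers who want to check this kind of comparison on their own data, here is a minimal sketch of the computation, assuming a kallisto abundance.tsv and a Salmon quant.sf for the same sample and index (the file paths are hypothetical, and the column names should be checked against the versions of the programs being used):

```python
# Minimal sketch: compare kallisto and Salmon estimated counts for one sample.
# Assumes kallisto's abundance.tsv and Salmon's quant.sf were produced for the
# same reads and the same transcriptome (file paths are hypothetical).
import numpy as np
import pandas as pd
from scipy.stats import pearsonr, spearmanr

kallisto = pd.read_csv("kallisto_out/abundance.tsv", sep="\t", index_col="target_id")
salmon = pd.read_csv("salmon_out/quant.sf", sep="\t", index_col="Name")

# Align the two tables on transcript ID so rows are directly comparable.
merged = kallisto[["est_counts"]].join(salmon[["NumReads"]], how="inner").dropna()

r, _ = pearsonr(merged["est_counts"], merged["NumReads"])
r_log, _ = pearsonr(np.log1p(merged["est_counts"]), np.log1p(merged["NumReads"]))
rho, _ = spearmanr(merged["est_counts"], merged["NumReads"])
print(f"Pearson r = {r:.7f}, Pearson r on log(counts+1) = {r_log:.7f}, Spearman rho = {rho:.7f}")
```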
Patro and I are not alone in finding that Salmon ≃ kallisto (if kallisto and Salmon gave identical results I would write that Salmon = kallisto, but in lieu of the missing 0.004 in correlation I use the symbol ≃ to denote the very, very strong similarity). Examples in the literature abound, e.g. Supplementary Figure 5 from Majoros et al. 2017 (shown later in the post), Figure 1 from Everaert et al. 2017
or Figure 3A from Jin et al. 2017:
Just a few weeks ago, Sahraeian et al. 2017 published a comprehensive analysis of 39 RNA-Seq analysis tools and performed hierarchical clusterings of methods according to the similarity of their output. Here is one example (their Supplementary Figure 24a):
Amazingly, kallisto and Salmon-Quasi (the latest version of Salmon) are the two closest programs to each other in the entire comparison, producing output more similar to each other than the same program (e.g. Cufflinks or StringTie) run with different alignments!
This raises the question of how, with kallisto published in May 2016 and Salmon ≃ kallisto, Patro et al. 2017 was published in one of the most respected scientific publications, which advertises first and foremost that it “is a forum for the publication of novel methods and significant improvements to tried-and-tested basic research techniques in the life sciences”?
How not to perform a differential expression analysis
The Patro et al. 2017 paper presents a number of comparisons between kallisto and Salmon in which Salmon appears to dramatically improve on the performance of kallisto. For example, Figure 1c from Patro et al. 2017 is a table showing an enormous performance difference between kallisto and Salmon:

Figure 1c from Patro et al. 2017.
At a false discovery rate of 0.01, the authors claim that in a simulation study where ground truth is known Salmon identifies 4.5 times more truly differential transcripts than kallisto!
This can explain how Salmon was published, namely that the reviewers and editor believed Patro et al.’s claims that Salmon significantly improves on previous work. In one analysis Patro et al. provide a p-value to help the “significance” stick. They write that “we found that Salmon’s distribution of mean absolute relative differences was significantly smaller (Mann-Whitney U test, P=0.00017) than those of kallisto.” But how can the result Salmon >> kallisto be reconciled with the fact that everybody repeatedly finds that Salmon ≃ kallisto?
A closer look reveals three things:
- In a differential expression analysis billed as “a typical downstream analysis” Patro et al. did not examine differential expression results for a typical biological experiment with a handful of replicates. Instead they examined a simulation of two conditions with eight replicates in each.
- The large number of replicates allowed them to apply the log-ratio t-test directly to abundance estimates in transcripts per million (TPM) units, rather than to the estimated counts that are required for methods such as their own DESeq2.
- The simulation involved generation of GC bias in an approach compatible with the inference model, with one batch of eight samples exhibiting “weak GC content dependence” and the other batch of eight exhibiting “more severe fragment-level GC bias.” Salmon was run in its GC bias correction mode.
These were unusual choices by Patro et al., and they allowed Patro et al. to showcase the performance of their method in a way that leveraged the match between one of their inference models and the procedure used to simulate the reads. The showcasing was enabled by a confounding variable (bias) that exactly matches their condition variable, the use of TPM units to magnify the impact of that effect on their inference, and a simulation with a large number of replicates, which made the use of TPM possible because with many replicates one can directly apply the log-ratio t-test. This complex chain of dependencies is unraveled below:
There is a reason why log-fold changes are not directly tested in standard RNA-Seq differential expression analyses. Variance estimation is challenging with few replicates, and RNA-Seq methods developers understood this early on. That is why all competitive methods for differential expression analysis such as DESeq/DESeq2, edgeR, limma-voom, Cuffdiff, BitSeq, sleuth, etc. regularize variance estimates (i.e., perform shrinkage) by sharing information across transcripts/genes of similar abundance. In a careful benchmarking of differential expression tools, Schurch et al. 2016 show that the log-ratio t-test is the worst method. See, e.g., their Figure 2:

Figure 2 from Schurch et al. 2016. The four vertical panels show FPR and TPR for programs using 3, 6, 12 and 20 biological replicates (in yeast). Details are in the Schurch et al. 2016 paper.
The log-ratio t-test performs poorly not only when the number of replicates is small and regularization of variance estimates is essential: Schurch et al. specifically recommend DESeq2 (or edgeR) even when up to 12 replicates are used. In fact, the log-ratio t-test was so bad that it didn’t even make it into their Table 2 “summary of recommendations”.
The authors of Patro et al. 2017 are certainly well aware of the poor performance of the log-ratio t-test. After all, one of them was specifically thanked in the Schurch et al. 2016 paper “for his assistance in identifying and correcting a bug”. Moreover, the program recommended by Schurch et al. (DESeq2) was authored by one of the coauthors of the Patro et al. paper, who regularly and publicly advocates for the use of his programs (and not the log-ratio t-test):
This recommendation has been codified in a detailed RNA-Seq tutorial where M. Love et al. write that “This [Salmon + tximport] is our current recommended pipeline for users”.
In Soneson and Delorenzi, 2013, the authors wrote that “there is no general consensus regarding which [RNA-Seq differential expression] method performs best in a given situation” and despite the development of many methods and benchmarks since this influential review, the question of how to perform differential expression analysis continues to be debated. While it’s true that “best practices” are difficult to agree on, one thing I hope everyone can agree on is that in a “typical downstream analysis” with a handful of replicates
do not perform differential expression with a log-ratio t-test.
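To make concrete what is being criticized here, below is a minimal sketch of a per-transcript log-ratio t-test applied directly to TPM values, in the spirit of the simulation setup described above. The data are randomly generated stand-ins; this is not the Patro et al. code.

```python
# Minimal sketch of a per-transcript log-ratio t-test on TPM values, the procedure
# criticized above. Assumes `tpm` is a transcripts x 16-samples matrix with columns
# 0-7 from condition A and 8-15 from condition B (a hypothetical, simulated matrix).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
tpm = rng.lognormal(mean=2.0, sigma=1.0, size=(1000, 16))  # stand-in data

log_tpm = np.log2(tpm + 1)                                  # log-transform the abundances
t_stat, p_val = ttest_ind(log_tpm[:, :8], log_tpm[:, 8:], axis=1)

# Note what is missing compared to DESeq2/edgeR/limma-voom/sleuth: each transcript's
# variance is estimated from its own 8+8 values only, with no information sharing
# across transcripts, which is why the test collapses with the handful of replicates
# found in a typical experiment.
print((p_val < 0.01).sum(), "transcripts pass p < 0.01 in this toy example")
```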
Turning to Patro et al.’s choice of units, it is important to note that the requirement of shrinkage for RNA-Seq differential analysis is the reason most differential expression tools require abundances measured in counts as input, and do not use length-normalized units such as Transcripts Per Million (TPM). In TPM units the abundance for a transcript t is

$$\mathrm{TPM}_t = 10^6 \cdot \frac{c_t/l_t}{\sum_r c_r/l_r},$$

where $c_t$ are the estimated counts for transcript t, $l_t$ is the (effective) length of t, and $N = \sum_r c_r$ is the total number of reads. Whereas counts are approximately Poisson distributed (albeit with some over-dispersion), variance estimates of abundances in TPM units depend on the lengths used in normalization and therefore cannot be used directly for regularization of variance estimation. Furthermore, the dependency of TPM on effective lengths means that abundances reported in TPM are very sensitive to the estimates of effective length.
This is why, when comparing the quantification accuracy of different programs, it is important to compare abundances using estimated counts. This was highlighted in Bray et al. 2016: “Estimated counts were used rather than transcripts per million (TPM) because the latter is based on both the assignment of ambiguous reads and the estimation of effective lengths of transcripts, so a program might be penalized for having a differing notion of effective length despite accurately assigning reads.” Yet Patro et al. perform no comparisons of programs in terms of estimated counts.
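For concreteness, the TPM formula above can be written out in a few lines; the example below (with made-up numbers) also illustrates the point about sensitivity to effective lengths: the same estimated counts yield different TPMs when the effective lengths shift slightly.

```python
# Sketch of the TPM calculation from the formula above: c = estimated counts,
# l = effective lengths. Not taken from either program's source; just the formula.
import numpy as np

def tpm(counts, eff_lengths):
    rate = counts / eff_lengths          # reads per base of (effective) transcript
    return 1e6 * rate / rate.sum()       # scale so the values sum to one million

counts = np.array([100.0, 200.0, 300.0])
eff_len_a = np.array([1000.0, 2000.0, 1500.0])
eff_len_b = eff_len_a * np.array([1.00, 1.10, 0.95])  # same counts, slightly different effective lengths

print(tpm(counts, eff_len_a))
print(tpm(counts, eff_len_b))  # TPMs move even though the assigned counts are identical
```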
A typical analysis
The choices of Patro et al. in designing their benchmarks are demystified when one examines what would have happened had they compared Salmon to kallisto on typical data with standard downstream differential analysis tools such as their own tximport and DESeq2. I took the definition of “typical” from one of the Patro et al. coauthors’ own papers (Soneson et al. 2016): “Currently, one of the most common approaches is to define a set of non-overlapping targets (typically, genes) and use the number of reads overlapping a target as a measure of its abundance, or expression level.”
The Venn diagram below shows the differences in transcripts detected as differentially expressed when kallisto and Salmon are compared using the workflow the authors recommend publicly (quantifications -> tximport -> DESeq2) on a typical biological dataset with three replicates in each of two conditions. The number of overlapping genes is shown for a false discovery rate of 0.05 on RNA-Seq data from Trapnell et al. 2014:

A Venn diagram showing the overlap in genes predicted to be differentially expressed by kallisto (blue) and Salmon (pink). Differential expression was performed with DESeq2 using transcript-level counts estimated by kallisto and Salmon and imported to DESeq2 with tximport. Salmon was run with GC bias correction.
This example gives Salmon the benefit of the doubt: the dataset was chosen to be older (when bias was more prevalent) and Salmon was not run in default mode but rather with GC bias correction turned on (option --gcBias).
When I saw these numbers for the first time I gasped. Of course I shouldn’t have been surprised; they are consistent with repeated published experiments in which comparisons of kallisto and Salmon have revealed near identical results. And while I think it’s valuable to publish confirmation of previous work, I did wonder whether Nature Methods would have accepted the Patro et al. paper had the authors conducted an actual “typical downstream analysis”.
What about the TPM?
Patro et al. utilized TPM-based comparisons for all the results in their paper, presumably to highlight the improvements in accuracy resulting from better effective length estimates. Numerous results in the paper suggest that Salmon is much more accurate than kallisto. However, I had seen a figure in Majoros et al. 2017 that examined the (cumulative) distribution of both kallisto and Salmon abundances in TPM units (their Supplementary Figure 5) in which the curves literally overlapped at almost all thresholds:

The plot above was made with Salmon v0.7.2, so in fairness to Patro et al. I remade it using the ERR188140 dataset mentioned above with Salmon v0.8.2:

The distribution of abundances (in TPM units) as estimated by kallisto (blue circles) and Salmon (red stars).
The blue circles correspond to kallisto and the red stars inside them to Salmon. With the latest version of Salmon the similarity is even higher than what Majoros et al. observed! The Spearman correlation between kallisto and Salmon with TPM units is 0.9899896.
It’s interesting to examine what this means for a (truly) typical TPM analysis. One way that TPMs are used is to filter transcripts (or genes) by some threshold, typically TPM > 1 (in another deviation from “typical”, a key table in Patro et al. 2017 – Figure 1d – is made by thresholding with TPM > 0.1). The Venn diagram below shows the differences between the programs at the typical TPM > 1 threshold:

A Venn diagram showing the overlap in transcripts predicted by kallisto and Salmon to have estimated abundance > 1 TPM.
The figures above were made with Salmon 0.8.2 run in default mode. The correlation between kallisto and Salmon (in TPM units) decreases a tiny amount, from 0.9989224 to 0.9974325, with the --gcBias option, and even the Spearman correlation decreases by only 0.011, from 0.9899896 to 0.9786092.
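The thresholding comparison behind the Venn diagram above is simple to reproduce; here is a minimal sketch, again assuming kallisto and Salmon output files for the same sample (paths hypothetical, column names as in recent versions of the two programs):

```python
# Sketch of the TPM > 1 filtering comparison: which transcripts does each program
# call "expressed"? Assumes the same hypothetical output files as in the earlier sketch.
import pandas as pd

kallisto = pd.read_csv("kallisto_out/abundance.tsv", sep="\t", index_col="target_id")
salmon = pd.read_csv("salmon_out/quant.sf", sep="\t", index_col="Name")

k_expressed = set(kallisto.index[kallisto["tpm"] > 1])
s_expressed = set(salmon.index[salmon["TPM"] > 1])

print("kallisto only:", len(k_expressed - s_expressed))
print("Salmon only:  ", len(s_expressed - k_expressed))
print("both:         ", len(k_expressed & s_expressed))
```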
I think it’s perfectly fine for authors to present their work in the best light possible. What is not ok is to deliberately hide important and relevant truth, which in this case is that Salmon ≃ kallisto.
A note on speed
One of the claims in Patro et al. 2017 is that “[the speed of Salmon] roughly matches the speed of the recently introduced kallisto.” The Salmon claim is based on a benchmark of an experiment (details unknown) with 600 million 75bp paired-end reads using 30 threads. Below are the results of a similar benchmark of Salmon showing time to process 19 samples from Boj et al. 2015 with variable numbers of threads:

First, Salmon with --gcBias is considerably slower than default Salmon. Furthermore, there is a rapid decrease in performance gain with increasing number of threads, something that should come as no surprise. It is well known that quantification can be I/O bound, which means that at some point extra threads don’t provide any gain as the disk starts grinding, limiting access from the CPUs. So why did Patro et al. choose to benchmark runtime with 30 threads?
The figure below provides a possible answer:
In other words, not only is Salmon ≃ kallisto in accuracy, but contrary to the claims in Patro et al. 2017, kallisto is faster. This result is confirmed in Table 1 of Sarkar et al. 2017, who find that Salmon is slower by roughly the same factor as seen above (in the table “quasi-mapping” is Salmon).
Having said that, the speed differences between kallisto and Salmon should not matter much in practice and large scale projects made possible with kallisto (e.g. Vivian et al. 2017) are possible with Salmon as well. Why then did the authors not report their running time benchmarks honestly?
The first common notion
The Patro et al. 2017 paper uses the term “quasi-mapping” to describe an algorithm, published in Srivastava et al. 2016, for obtaining their (as it turned out, near identical to kallisto) results. I have written previously about how “quasi-mapping” is the same as pseudoalignment as an alignment concept, even though Srivastava et al. 2016 initially implemented pseudoalignment differently than the way we described it originally in our preprint, Bray et al. 2015. However, the reviewers of Patro et al. 2017 may be forgiven for assuming that “quasi-mapping” is a technical advance over pseudoalignment. The Srivastava et al. paper is dense and filled with complex technical detail. Even for an expert in alignment/RNA-Seq it is not easy to see from a superficial reading of the paper that “quasi-mapping” is a concept equivalent to kallisto’s pseudoalignment (albeit implemented with suffix arrays instead of a de Bruijn graph). Nevertheless, the key to the paper is a simple sentence in section 2.1: “Specifically, the algorithm [RapMap, which is now used in Salmon] reports the intersection of transcripts appearing in all hits”. That’s the essence of pseudoalignment right there. The paper acknowledges as much: “This lightweight consensus mechanism is inspired by Kallisto (Bray et al., 2016), though certain differences exist”. Well, as shown above, those differences appear to have made no difference in standard practice, except insofar as the Salmon implementation of pseudoalignment is slower than the one in Bray et al. 2016.
Srivastava et al. 2016 and Patro et al. 2017 make a fuss about the fact that their “quasi-mappings” take into account the starting positions of reads in transcripts, thereby including more information than a “pure” pseudoalignment. This is a pedantic distinction Patro et al. are trying to create. Already in the kallisto preprint (May 11, 2015), it was made clear that this information was trivially accessible via a reasonable approach to pseudoalignment: “Once the graph and contigs have been constructed, kallisto stores a hash table mapping each k-mer to the contig it is contained in, along with the position within the contig.”
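To make the shared concept concrete, here is a toy sketch of pseudoalignment as described in the quoted sentence: hash each k-mer of a read to the set of transcripts containing it and report the intersection. This is an illustration of the idea only, not the kallisto or RapMap code; both real implementations use an index (de Bruijn graph contigs or a suffix array) and skip most k-mers rather than looking them all up.

```python
# Toy illustration of the shared idea: pseudoalign a read by intersecting, over its
# k-mers, the sets of transcripts containing each k-mer. A conceptual sketch only.
from typing import Dict, FrozenSet, Optional, Set

K = 5  # toy k-mer length (kallisto's default is k = 31)

def build_index(transcriptome: Dict[str, str], k: int = K) -> Dict[str, Set[str]]:
    """Map each k-mer to the set of transcripts that contain it."""
    index: Dict[str, Set[str]] = {}
    for name, seq in transcriptome.items():
        for i in range(len(seq) - k + 1):
            index.setdefault(seq[i:i + k], set()).add(name)
    return index

def pseudoalign(read: str, index: Dict[str, Set[str]], k: int = K) -> FrozenSet[str]:
    """Intersect the transcript sets of the read's k-mers (its equivalence class)."""
    compatible: Optional[Set[str]] = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if hits is None:
            continue  # k-mer absent from the index (e.g. a sequencing error)
        compatible = set(hits) if compatible is None else compatible & hits
    return frozenset(compatible or set())

transcriptome = {"t1": "ACGTACGTTTGCA", "t2": "ACGTACGTTTAAA", "t3": "GGGGCCCCAAAAT"}
index = build_index(transcriptome)
print(pseudoalign("ACGTACGTTT", index))  # the read is compatible with t1 and t2
```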
In other words, Salmon is not producing near identical results to kallisto due to an unprecedented cosmic coincidence. The underlying method is the same. I leave it to the reader to apply Euclid’s first common notion:
Things which equal the same thing are also equal to each other.
Convergence
While Salmon is now producing almost identical output to kallisto and is based on the same principles and methods, this was not the case when the program was first released. The history of the Salmon program is accessible via the GitHub repository, which recorded changes to the code, and also via the bioRxiv preprint server where the authors published three versions of the Salmon preprint prior to its publication in Nature Methods.
The first preprint was published on the bioRxiv on June 27, 2015. It followed shortly on the heels of the kallisto preprint which was published on May 11, 2015. However, the first Salmon preprint described a program very different from kallisto. Instead of pseudoalignment, Salmon relied on chaining SMEMs (super-maximal exact matches) between reads and transcripts to identify what the authors called “approximately consistent co-linear chains” as proxies for alignments of reads to transcripts. The authors then compared Salmon to kallisto, writing that “We also compare with the recently released method of Kallisto which employs an idea similar in some respects to (but significantly different than) our lightweight-alignment algorithm and again find that Salmon tends to produce more accurate estimates in general, and in particular is better able [to] estimate abundances for multi-isoform genes.” In other words, in 2015 Patro et al. claimed that Salmon was “better” than kallisto. If so, why did the authors of Salmon later change the underlying method of their program from SMEM alignment to pseudoalignment?
Inspired by temporal ordering analysis of expression data and single-cell pseudotime analysis, I ran all the versions of kallisto and Salmon on ERR188140, and performed PCA on the resulting transcript abundance table to be able to visualize the progression of the programs over time. The figure below shows all the points with the exception of three: Sailfish 0.6.3, kallisto 0.42.0 and Salmon 0.32.0. I removed Sailfish 0.6.3 because it is such an outlier that it caused all the remaining points to cluster together on one side of the plot (the figure is below in the next section). In fairness I also removed one Salmon point (version 0.32.0) because it differed substantially from version 0.4.0 that was released a few weeks after 0.32.0 and fixed some bugs. Similarly, I removed kallisto 0.42.0, the first release of kallisto which had some bugs that were fixed 6 days later in version 0.42.1.
Evidently kallisto output has changed little since May 12, 2015. Although some small bugs were fixed and features added, the quantifications have been very similar. The quantifications have been stable because the algorithm has been the same.
On the other hand, the Salmon trajectory shows a steady convergence towards kallisto. The result everyone is finding, namely that currently Salmon ≃ kallisto, is revealed by the clustering of recent versions of Salmon near kallisto. However, the first releases of Salmon are very different from kallisto. This is also clear from the heatmap/hierarchical clustering of Sahraeian et al. in which Salmon-SMEM was included (Salmon used SMEMs, sometimes labeled fmd, until version 0.5.1, when “quasi-mapping” became the default). A question: if Salmon ca. 2015 was truly better than kallisto, then is Salmon ca. 2017 worse than Salmon ca. 2015?

Convergence of Salmon and Sailfish to kallisto over the course of a year. The x-axis labels the time different versions of each program were released. The y-axis is PC1 from a PCA of transcript abundances of the programs.
Prestamping
The bioRxiv preprint server provides a feature by which a preprint can be linked to its final form in a journal. This feature is useful to readers of the bioRxiv, as final published papers are generally improved after preprint reader, reviewer, and editor comments have been addressed. Journal linking is also a mechanism for authors to time stamp their published work using the bioRxiv. However I’m sure the bioRxiv founders did not intend the linking feature to be abused as a “prestamping” mechanism, i.e. a way for authors to ex post facto receive a priority date for a published paper that had very little, or nothing, in common with the original preprint.
A comparison of the June 2015 preprint mentioning the Salmon program and the current Patro et al. paper reveals almost nothing in common. The initial method changed drastically in tandem with an update to the preprint on October 3, 2015, at which point the Salmon program was using “quasi-mapping”, later published in Srivastava et al. 2016. Last year I met with Carl Kingsford (co-corresponding author of Patro et al. 2017) to discuss my concern that Salmon was changing from a method distinct from that of kallisto (the SMEMs of May 2015) to one that was replicating all the innovations in kallisto, without properly disclosing that it was essentially a clone. Yet despite a promise that he would raise my concerns with the Salmon team, I never received a response.
At this point, the Salmon core algorithms have changed completely, the software program has changed completely, and the benchmarking has changed completely. The Salmon project of 2015 and the Salmon project of 2017 are two very different projects although the name of the program is the same. While some features have remained, for example the Salmon mode that processes transcriptome alignments (similar to eXpress) was present in 2015, and the approach to likelihood maximization has persisted, considering the programs the same is to descend into Theseus’ paradox.
Interestingly, Patro specifically asked to have the Salmon preprint linked to the journal:
The linking of preprints to journal articles is a feature that arXiv does not automate, and perhaps wisely so. If bioRxiv is to continue to automatically link preprints to journals it needs to focus not only on eliminating false negatives but also false positives, so that journal linking cannot be abused by authors seeking to use the preprint server to prestamp their work after the fact.
The fish always win?
The Sailfish program was the precursor of Salmon, and was published in Patro et al. 2014. At the time, a few students and postdocs in my group read the paper and then discussed it in our weekly journal club. It advocated a philosophy of “lightweight algorithms, which make frugal use of data, respect constant factors and effectively use concurrent hardware by working with small units of data where possible”. Indeed, two themes emerged in the journal club discussion:
1. Sailfish was much faster than other methods by virtue of being simpler.
2. The simplicity came from replacing approximate alignment of reads with exact alignment of k-mers. When reads are shredded into their constituent k-mer “mini-reads”, the difficult read -> reference alignment problem in the presence of errors becomes an exact matching problem efficiently solvable with a hash table (a sketch of this idea follows below).
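Here is a toy sketch of that 2014-era Sailfish idea, to contrast it with the per-read intersection of pseudoalignment sketched earlier: reads are shredded into k-mers and each k-mer is resolved by an exact hash-table lookup, with no record of which read it came from. This is an illustration of the concept only, not Sailfish code.

```python
# Toy sketch of the idea described above: shred each read into k-mers and resolve
# each k-mer by exact hash-table lookup, discarding read-level information.
from collections import Counter
from typing import Dict, Set

K = 5  # toy value; real tools use k around 20-31

def kmer_index(transcriptome: Dict[str, str], k: int = K) -> Dict[str, Set[str]]:
    """Map each k-mer to the set of transcripts that contain it."""
    index: Dict[str, Set[str]] = {}
    for name, seq in transcriptome.items():
        for i in range(len(seq) - k + 1):
            index.setdefault(seq[i:i + k], set()).add(name)
    return index

def shred_and_count(reads, index, k: int = K) -> Counter:
    """Count k-mer hits per transcript; no alignment, and no notion of which read a k-mer came from."""
    hits = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            for name in index.get(read[i:i + k], ()):
                hits[name] += 1
    return hits

transcriptome = {"t1": "ACGTACGTTTGCA", "t2": "TTTGCAGGGGCCC"}
print(shred_and_count(["ACGTACGTTTGCA"], kmer_index(transcriptome)))
```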
Despite the claim in the Sailfish abstract that “Sailfish provides quantification time…without loss of accuracy” and Figure 1 from the paper showing Sailfish to be more accurate than RSEM, we felt that the shredding of reads must lead to reduced accuracy, and we quickly checked and found that to be the case; this was later noted by others (e.g. Hensman et al. 2015, Lee et al. 2015).
After reflecting on the Sailfish paper and results, Nicolas Bray had the key idea of abandoning alignments as a requirement for RNA-Seq quantification, developed pseudoalignment, and later created kallisto (with Harold Pimentel and Páll Melsted).
I mention this because after the publication of kallisto, Sailfish started changing along with Salmon, and is now frequently discussed in the context of kallisto and Salmon as an equal. Indeed, the PCA plot above shows that (in its current form, v0.10.0) Sailfish is also nearly identical to kallisto. This is because with the release of Sailfish 0.7.0 in September 2015, Patro et al. started changing Sailfish to use pseudoalignment, in parallel with the conversion of Salmon. To clarify the changes in Sailfish, I made the PCA plot below, which shows where the original version of Sailfish that coincided with the publication of Patro et al. 2014 (version 0.6.3, March 2014) lies relative to the more recent versions and to Salmon:
In other words, despite a series of confusing statements on the Sailfish GitHub page and an out-of-date description of the program on its homepage, Sailfish in its published form was substantially less accurate and slower than kallisto, and in its current form Sailfish is kallisto.
In retrospect, the results in Figure 1 of Patro et al. 2014 seem to be as problematic as the results in Figure 1 of Patro et al. 2017. Apparently crafting computational experiments via biased simulations and benchmarks to paint a distorted picture of performance is a habit of Patro et al.
Addendum [August 5, 2017]
In the post I wrote that “The history of the Salmon program is accessible via the GitHub repository, which recorded changes to the code, and also via the bioRxiv preprint server where the authors published three versions of the Salmon preprint prior to its publication in Nature Methods.” Here are the details of how these support the claims I make (tl;dr https://twitter.com/yarbsalocin/status/893886707564662784):
Sailfish (current version) and Salmon implemented kallisto’s pseudoalignment algorithm using suffix arrays
First, both Sailfish and Salmon use RapMap (via `SACollector`) and call `mergeLeftRightHits()`:
Sailfish:
https://github.com/kingsfordgroup/sailfish/blob/352f9001a442549370eb39924b06fa3140666a9e/src/SailfishQuantify.cpp#L192
Salmon:
https://github.com/COMBINE-lab/salmon/commit/234cb13d67a9a1b995c86c8669d4cefc919fbc87#diff-594b6c23e3bdd02a14cc1b861c812b10R2205
The RapMap code for “quasi-mapping” executes an algorithm identical to pseudoalignment, down to the detail of what happens to the k-mers in a single read:
First, `hitCollector()` calls `getSAHits_()`:
https://github.com/COMBINE-lab/RapMap/blob/bd76ec5c37bc178fd93c4d28b3dd029885dbe598/include/SACollector.hpp#L249
Here k-mers are hashed to SA intervals (suffix array intervals), which are then extended to see how far ahead to jump. This is one of the two key ideas in the kallisto paper, namely that not all the k-mers in a read need to be examined to pseudoalign the read. It’s much more than that though: it is the exact same algorithm, down to the level of exactly which k-mers are examined. kallisto performs this “skipping” using contig jumping with a different data structure (the transcriptome de Bruijn graph), but aside from the data structure used, what happens is identical:
https://github.com/COMBINE-lab/RapMap/blob/c1e3132a2e136615edbb91348781cb71ba4c22bd/include/SACollector.hpp#L652
makes a call to the jumping routine, and the code to compute the MMP (skipping) is
https://github.com/COMBINE-lab/RapMap/blob/c1e3132a2e136615edbb91348781cb71ba4c22bd/include/SASearcher.hpp#L77
There is one differing detail in the Sailfish/Salmon code, which is that when skipping forward the suffix array is checked for exact matching on the skipped sequence. kallisto does not have this requirement (although it could). On error-free data these will obviously be identical; on error-prone data this may make Salmon/Sailfish a bit more conservative and kallisto a bit more robust to error. Also, due to the structure of suffix arrays there is a possible difference in behavior when a transcript contains a repeated k-mer. These differences affect a tiny proportion of reads, as is evident from the result that kallisto and Salmon produce near identical results.
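For readers trying to follow the linked code, here is a highly simplified sketch of the “skipping” idea being discussed: once a k-mer is located inside an indexed contig (kallisto) or extended in the suffix array (RapMap/Salmon), every k-mer up to the end of that match carries the same transcript set, so the next lookup can jump ahead instead of hashing every k-mer. The data structures below are hypothetical stand-ins, not those of either program.

```python
# Highly simplified sketch of pseudoalignment with skipping; toy data structures only.
from typing import Dict, Set, Tuple

K = 5  # toy k-mer length

# Hypothetical index: k-mer -> (contig id, offset of that k-mer within the contig).
# Each contig has a length and a single transcript set shared by all its k-mers.
KmerIndex = Dict[str, Tuple[int, int]]
Contigs = Dict[int, Tuple[int, Set[str]]]  # contig id -> (contig length, transcript set)

def pseudoalign_with_skipping(read: str, index: KmerIndex, contigs: Contigs, k: int = K) -> Set[str]:
    compatible: Set[str] = set()
    first = True
    i = 0
    while i + k <= len(read):
        hit = index.get(read[i:i + k])
        if hit is None:
            i += 1  # unseen k-mer (e.g. a sequencing error): try the next position
            continue
        contig_id, offset = hit
        contig_len, transcripts = contigs[contig_id]
        compatible = set(transcripts) if first else compatible & transcripts
        first = False
        # Every k-mer until the end of this contig yields the same transcript set,
        # so jump directly past it instead of examining those k-mers one by one.
        i += max(1, contig_len - k - offset + 1)
    return compatible
```

The real implementations verify the sequence at and after the jump target (that is exactly the detail discussed above); the sketch omits those checks.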
The second key idea in kallisto is the intersection of equivalence classes for a read. This exact procedure is in:
https://github.com/COMBINE-lab/RapMap/blob/bd76ec5c37bc178fd93c4d28b3dd029885dbe598/include/SACollector.hpp#L363
which calls:
https://github.com/COMBINE-lab/RapMap/blob/bd76ec5c37bc178fd93c4d28b3dd029885dbe598/src/HitManager.cpp#L599
There was a choice we had to make in kallisto of how to handle information from paired-end reads (does one require consistent pseudoalignment of both mates, or does one mate suffice to pseudoalign the read?).
The code for intersection between left and right reads making the identical choices as kallisto is:
https://github.com/COMBINE-lab/RapMap/blob/bd76ec5c37bc178fd93c4d28b3dd029885dbe598/include/RapMapUtils.hpp#L810
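The sketch below restates the kind of choice being described: each mate gets its own set of compatible transcripts, and the pair’s equivalence class comes from combining them. It shows one plausible resolution (intersect when both mates map, otherwise fall back to the mapped mate); whether this matches the exact behaviour should be checked against the source linked above.

```python
# Toy sketch of combining the compatibility sets of the two mates of a read pair.
# One plausible resolution of the choice discussed above; not the kallisto or RapMap code.
from typing import FrozenSet

def merge_mates(left: FrozenSet[str], right: FrozenSet[str]) -> FrozenSet[str]:
    if left and right:
        return left & right   # both mates mapped: require consistency between them
    return left or right      # only one mate mapped: it alone determines the class

print(merge_mates(frozenset({"t1", "t2"}), frozenset({"t2", "t3"})))  # frozenset({'t2'})
print(merge_mates(frozenset(), frozenset({"t3"})))                    # frozenset({'t3'})
```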
In other words, stepping through what happens to the k-mers in a read shows that Sailfish/Salmon copied the algorithms of kallisto and implemented them with the only difference being the data structure used to hash the k-mers. This is why, when I did the run of Salmon vs. kallisto that led to this blog post, I found that
kallisto pseudoaligned 69,780,930 reads
vs
salmon 69,701,169.
That’s a difference of 79,000 out of ~70 million = 0.1%.
Two additional points:
- Until the kallisto program and preprint were published, Salmon used SMEMs. Only after kallisto did Salmon change to using k-mer-cached suffix array intervals.
- The kallisto preprint did not discuss outputting position as part of pseudoalignment because it was not central to the idea. It’s trivial to report pseudoalignment positions with either data structure and in fact both kallisto and Salmon do.
I want to make very clear here that I think there can be great value in implementing an algorithm with a different data structure. It’s a form of reproducibility that one can learn from: how to optimize, where performance gains can be made, etc. Unfortunately most funding agencies don’t give grants for projects whose goal is solely to reproduce someone else’s work. Neither do most journals publish papers that set out to do that. That’s too bad. If Patro et al. had presented their work honestly, and explained that they were implementing pseudoalignment with a different data structure to see if it’s better, I’d be a champion of their work. That’s not how they presented their work.
Salmon copied details in the quantification
The idea of using the EM algorithm for quantification with RNA-Seq goes back to Jiang and Wong, 2009, arguably even to Xing et al. 2006. I wrote up the details of the history in a review in 2011 that is on the arXiv. kallisto runs the EM algorithm on equivalence classes, an idea that originates with Nicolae et al. 2011 (or perhaps even Jiang and Wong 2009) but whose significance we understood from the Sailfish paper (Patro et al. 2014). Therefore the fact that Salmon (now) and kallisto both use the EM algorithm, in the same way, makes sense.
However, Salmon did not use the EM algorithm before the kallisto preprint and program were published. It used an online variational Bayes algorithm instead. In the May 18, 2015 release of Salmon there is no mention of EM. Then, with the version 0.4 release, Salmon suddenly switches to the EM. In implementing the EM algorithm there are details that must be addressed, for example setting thresholds for when to terminate rounds of inference based on changes in the (log) likelihood (i.e., determining convergence).
For example, kallisto sets parameters
const double alpha_limit = 1e-7;
const double alpha_change_limit = 1e-2;
const double alpha_change = 1e-2;
in EMAlgorithm.h
https://github.com/pachterlab/kallisto/blob/90db56ee8e37a703c368e22d08b692275126900e/src/EMAlgorithm.h
The link above shows that these kallisto parameters were set and have not changed since the release of kallisto.
Also, they were not always this way; see, e.g., the version of April 6, 2015:
https://github.com/pachterlab/kallisto/blob/2651317188330f7199db7989b6a4dc472f5d1669/src/EMAlgorithm.h
This is because one of the things we did was explore the effects of these thresholds and understand how setting them affects performance. This can also be seen in a legacy redundancy: we have both alpha_change and alpha_change_limit, which ended up being unnecessary because they are equal in the program and used on one line.
The first versions of Salmon post-kallisto switched to the EM, but didn’t even terminate it the same way as kallisto, adopting instead a maximum of 1,000 iterations. See
https://github.com/COMBINE-lab/salmon/blob/59bb9b2e45c76137abce15222509e74424629662/include/CollapsedEMOptimizer.hpp
from May 30, 2015.
This changed later, first with the introduction of minAlpha (= kallisto’s alpha_limit)
https://github.com/COMBINE-lab/salmon/blob/56120af782a126c673e68c8880926f1e59cf1427/src/CollapsedEMOptimizer.cpp
and then alphaCheckCutoff (kallisto’s alpha_change_limit)
https://github.com/COMBINE-lab/salmon/blob/a3bfcf72e85ebf8b10053767b8b506280a814d9e/src/CollapsedEMOptimizer.cpp
Here are the Salmon thresholds:
double minAlpha = 1e-8;
double alphaCheckCutoff = 1e-2;
double cutoff = minAlpha;
Notice that they are identical except that minAlpha = 1e-8, and not kallisto’s alpha_limit = 1e-7. However, in kallisto, from the outset, the way that alpha_limit has been used is:
if (alpha_[ec] < alpha_limit/10.0) {
alpha_[ec] = 0.0;
}
In other words, alpha_limit in kallisto is really 1e-8, and has been all along.
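To make the role of these thresholds concrete, here is a sketch of the termination logic they control: iterate until no sufficiently abundant entry changes by more than a relative tolerance, then zero out entries below the floor. This is a paraphrase of the logic under discussion, not the source code of either program; em_update() is a placeholder for one round of the EM algorithm.

```python
# Sketch of how thresholds like alpha_limit / alpha_change_limit (kallisto) or
# minAlpha / alphaCheckCutoff (Salmon) govern EM termination. A paraphrase only.
import numpy as np

ALPHA_LIMIT = 1e-7
ALPHA_CHANGE_LIMIT = 1e-2

def run_em(alpha: np.ndarray, em_update, max_rounds: int = 10000) -> np.ndarray:
    for _ in range(max_rounds):
        new_alpha = em_update(alpha)
        # Converged when every sufficiently-abundant entry has a small relative change.
        active = new_alpha > ALPHA_LIMIT
        rel_change = np.abs(new_alpha[active] - alpha[active]) / new_alpha[active]
        alpha = new_alpha
        if not np.any(rel_change > ALPHA_CHANGE_LIMIT):
            break
    # Zero out entries below the floor before reporting (cf. alpha_limit/10 above).
    alpha[alpha < ALPHA_LIMIT / 10.0] = 0.0
    return alpha

# Toy usage: a fake "EM update" that just moves alpha halfway toward a fixed target.
target = np.array([0.5, 0.5, 0.0])
print(run_em(np.array([1.0, 0.0, 0.0]), lambda a: (a + target) / 2.0))
```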
The copying of all the details of our program has consequences for performance. In the sample I ran, kallisto performed 1216 rounds of EM vs. 1214 rounds in Salmon.
Sailfish (current version) copied our sequence specific bias method
One of the things we did in kallisto is implement a sequence-specific bias correction along the lines of what was done previously in Roberts et al. 2011, and later in Roberts et al. 2013. Implementing sequence-specific bias correction in kallisto required working things out from scratch because of the way equivalence classes, and not reads, were being used with the EM algorithm. I worked this out together with Páll Melsted during conversations that lasted about a month in the spring of 2015. We implemented it in the code, although we did not release details of how it worked with the initial preprint because it was an option and not the default, and we thought we might still want to change it before submitting the journal paper.
Here Rob is stating that Salmon can account for biases that kallisto cannot:
https://www.biostars.org/p/143458/#143639
This was a random forest bias correction method different from kallisto’s.
Shortly thereafter, here is the source code in Sailfish deprecating the Salmon bias correction and switching to kallisto’s method:
https://github.com/kingsfordgroup/sailfish/commit/377f6d65fe5201f7816213097e82df69e4786714#diff-fe8a1774cd7c858907112e6c9fda1e9dR76
This is the update to effective length in kallisto:
https://github.com/pachterlab/kallisto/blob/e5957cf96f029be4e899e5746edcf2f63e390609/src/weights.cpp#L184
Here is the Sailfish code:
https://github.com/kingsfordgroup/sailfish/commit/be0760edce11f95377088baabf72112f920874f9#diff-8341ac749ad4ac5cfcc8bfef0d6f1efaR796
Notice that there has been literal copying, down to the variable names:
https://github.com/kingsfordgroup/sailfish/commit/be0760edce11f95377088baabf72112f920874f9#diff-8341ac749ad4ac5cfcc8bfef0d6f1efaR796
The code written by the student of Rob was:
effLength *=alphaNormFactor/readNormFactor;
The code written by us is
efflen *= 0.5*biasAlphaNorm/biasDataNorm;
The code rewritten by Rob (editing that of the student):
effLength *= 0.5 * (txomeNormFactor / readNormFactor);
Note that since our bias correction method was not reported in our preprint, this had to have been copied directly from our codebase and was done so without any attribution.
I raised this specific issue with Carl Kingsford by email prior to our meeting on April 13, 2016. We then discussed it in person. The conversation and email were prompted by a change to the Sailfish README on April 7, 2016, specifically accusing us of comparing kallisto to a “**very old** version of Sailfish”:
https://github.com/kingsfordgroup/sailfish/commit/550cd19f7de0ea526f512a5266f77bfe07148266
What was stated is “The benchmarks in the kallisto paper *are* made against a very old version of Sailfish”, not “were made against”. By the time that was written, it might well have been true. But kallisto was published in May 2015, it was benchmarked against the Sailfish program described in Patro et al. 2014, and by 2016 Sailfish had changed completely, implementing the pseudoalignment of kallisto.
Token attribution
Another aspect of an RNA-Seq quantification program is effective length estimation. There is an attribution to kallisto in the Sailfish code now explaining that this is from kallisto:
“Computes (and returns) new effective lengths for the transcripts based on the current abundance estimates (alphas) and the current effective lengths (effLensIn). This approach is based on the one taken in Kallisto.”
https://github.com/kingsfordgroup/sailfish/blob/b1657b3e8929584b13ad82aa06060ce1d5b52aed/src/SailfishUtils.cpp
This is from January 23rd, 2016, almost 9 months after kallisto was released, and 3 months before the Sailfish README accused us of not testing the latest version of Sailfish in May 2015.
The attribution for effective lengths is also in the Salmon code, from 6 months later, in June 2016:
https://github.com/COMBINE-lab/salmon/blob/335c34b196205c6aebe4ddcc12c380eb47f5043a/include/DistributionUtils.hpp
There is also an acknowledgement in the Salmon code that a machine floating point tolerance we use
https://github.com/pachterlab/kallisto/blob/master/src/EMAlgorithm.h#L19
was copied.
The acknowledgment in Salmon is here
https://github.com/COMBINE-lab/salmon/blob/a3bfcf72e85ebf8b10053767b8b506280a814d9e/src/CollapsedEMOptimizer.cpp
This is the same file where the kallisto thresholds for the EM were copied to.
So after copying our entire method, our core algorithm, many of our ideas, specific parameters, and numerous features… really just about everything that goes into an RNA-Seq quantification project, there is an acknowledgment that our machine tolerance threshold was “intelligently chosen”.
62 comments
August 2, 2017 at 12:12 pm
mi4
I have been following Kallisto and Salmon from the very beginning. Indeed there are similarities and also differences. Personally, I do not see any suspicious similarities (from ethical point of view) between Kallisto and Salmon.
Regarding the comparison done by the authors of Salmon where it is shown that Salmon performs better than Kallisto (as shown in Figure 1c): this is a very common problem in many articles from the life sciences, because any new method is tested on a very small number of test datasets (i.e. 1-5) which are very well chosen beforehand. For example, Pizzly is shown to be superior whilst being tested on only four datasets (where one is picked without any explanation out of an article where several tests of the same kind are provided together as an extensive benchmark dataset, see: SRR1659964). Do not get me wrong! I think that Pizzly is a good tool and the pre-print is good.
August 2, 2017 at 4:13 pm
pmelsted
It’s not the number of simulations that’s the problem here, but the analysis and misrepresentation.
For pizzly we used (positive) simulations from other papers rather than making a new one. But your point about SRR1659964 is a good one, we picked it because it was at high concentration, but I need to run the rest (same content, different concentrations) for the paper.
August 2, 2017 at 12:21 pm
Thyago LC
Ouch!
August 2, 2017 at 1:25 pm
Michele Busby
I have some thoughts:
First: We need to stop benchmarking our bioinformatics methods on underpowered experiments. An experiment with 3 replicates (the paradigm) is going to find most of your differentially expressed genes at 4X fold change (whatever method you use) and maybe some of your 3X fold change genes. This is only the lowest hanging fruit and leaves a lot of biology on the table. We see huge biological results from things like a single extra chromosome. We would expect a 1.5X fold change to be relevant. To get at this level of expression you need closer to 6-8 replicates. I would consider benchmarking on this experimental design a strength and not a weakness of the paper.
We published on this in 2013 https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btt015. The subsequent experiments in yeast , e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4878611/ have said the same thing.
Wet lab methods have moved on to make multiplexing of many samples easier and cheaper, e.g. http://www.nature.com/nmeth/journal/v12/n4/abs/nmeth.3313.html
No offense, but I really think we, as a field, need to stop patting ourselves on the back for designing methods that perform better for poorly designed experiments. (except for very precious samples by which I do not mean underfunded studies)
Second: You don’t really point out that the reason multiple good and correct methods do not get convergent results is that the experiments are wildly underpowered. There are a lot of methods in RNA Seq that are really great (and correct in that the p-value basically coincides with the false positive rate) and if they are run on experiments with 20 replicates they are all going to spit out the same result. But it is important for people to understand what is happening in their data. They aren’t converging because they are finding only a small fraction of the “real” genes. I wrote a longer blog about that here: http://michelebusby.tumblr.com/post/26983625206/how-underpowered-experiments-make-good-methods
Third: I am not interested in getting into a contest over this, but I think it is interesting: I did some simulations in our Scotty paper looking at whether the t-test works for differential expression. I compared it to the old version of DESeq and found convergent results at about 4 or 5 reps. It is in the supplement.
Obviously, the details of how you do the simulation matter and I wasn’t really interested in saying one was better, just establishing that existing power techniques are reasonable for power estimations. However, one thing I looked at empirically was whether logging the data improved or degraded the results of the t-test. Even when I simulated the data as lognormally distributed the t-test was more accurate in simulation if I did NOT log the data first.
A t-distribution models both the shape of the data (in the numerator) and the uncertainty in the variance (in the denominator). I think that what happens when you log the data is that you do better at modeling the shape of the data but you screw up the expectations of the uncertainty. With a low number of replicates the uncertainty is super important. With RNA data the shape of the distribution is constrained by biology so in my simulations the t-test was robust enough to work on the unlogged data better with the low number of replicates.
Fourth: I am cautious about information sharing in RNA Seq analyses because you have to squash the variance of highly variable genes which is going to introduce systematic bias into your results, which is sometimes going to be correlated with biology. I expect exactly no one is handling this right in downstream analyses to interpret their data. Most people aren’t even handling the count bias right.
August 2, 2017 at 3:59 pm
Daniel Falush
This is interesting, but on a similar topic of appropriate citation/credit: when you put a new version of the preprint about your k-mer method for association mapping up, it would have been nice to have at least mentioned, if not compared your method to, the previously published bacterial methods which I tweeted you about a little while ago: e.g. Sheppard 2013 PNAS, Lees 2016 Nature Communications, Earle 2016 Nature Microbiology.
August 2, 2017 at 4:11 pm
Lior Pachter
I appreciate the references – I’d looked at the bacterial methods and they were not applicable to our human GWAS for various reasons (scale etc.). But it was an oversight not to cite them and we’ll fix that in the next version.
August 2, 2017 at 4:21 pm
Daniel Falush
Thanks!
August 3, 2017 at 2:38 am
dtabb1973
Reblogged this on Picking Up The Tabb and commented:
As the pressure to publish in “top-ranked” journals increases, we see that people resort to inappropriate methods to “cook the numbers.” This blog post examines ways that the authors of the Salmon RNA-Seq differentiation software caused their code to seem far better than existing tools, when in typical analyses its results are essentially identical to earlier software. Word to the wise: if you use trickery to make your algorithms look better when they’re not, you can expect that the field will find out. Perhaps it will happen in peer review, and perhaps it will happen later. What will happen to your reputation then?
August 3, 2017 at 8:03 am
A. Student
It’s not only “cooking the numbers”, results were essentially identical since it was the same algorithm. Looks to me like plagiarism.
August 4, 2017 at 1:59 am
dtabb1973
“Imitation is the sincerest form of flattery.” When I see that another author has incorporated my scoring algorithms in their new software, I generally feel happy about it, but I feel even happier if they have collaborated with me in that incorporation. That said, I positively fume when researchers implement a hamstrung version of my software as a “straw man” to show how wonderful their code is. If someone were to use my code without any attribution or to claim that my invention was actually theirs, that would make me more likely to use terms like plagiarism. My goal is to move the field, not so much to convince everyone to use my software, so there’s room for interpretation.
August 3, 2017 at 3:08 am
felixbalazard
You have a weird link at: “Patro et al. 2017 was published in one of the most respected scientific publications”. It sends to a complaint against the Salk institute. Except if there is a connection that I missed.
August 3, 2017 at 7:59 am
Lior Pachter
The link is intentional. I wanted a specific reference for the fact that Nature is viewed as a “respected scientific publication”. The Salk wrote, in response to the lawsuit, that “In the past ten years she failed to publish a single paper in any of the most respected scientific publications (Cell, Nature and Science).”
August 3, 2017 at 10:36 am
klmr
In my browser (Chrome 58, macOS), the heatmap (Sahraeian et al. 2017, Supplementary Figure 24a) isn’t displayed at all (even though the image link is in the source code and I can manually navigate there). This made the text quite confusing.
August 3, 2017 at 10:38 am
Lior Pachter
Very strange- I haven’t encountered this bug but I’ll look into it.
August 3, 2017 at 1:09 pm
martenson
It is set to `hidden` and has css of: `style=”display: none !important;”` which causes this.
August 3, 2017 at 10:38 am
Lior Pachter
Thanks for letting me know.
August 3, 2017 at 12:06 pm
jonathan moore
I’ve been saying for a few years now that the best analysis method for this problem is likely to be all of them. All methods are wrong to various degrees, and (mostly) in different ways, due to the sensitivities and assumptions of their underlying models. Divergence between methods is especially likely in low signal/noise situations (e.g. 3 reps).
Recent years’ progress in Data Science has taught us that using all the methods available to you, and generating an ensemble result, is likely to outperform any individual analysis method in most circumstances.
August 4, 2017 at 8:57 am
David Rinker
This sounds intuitive but it also assumes that errors in high-dimensional data analysis are going to somehow be complementary. This I’m not convinced of.
Rather I would suspect that the overlaps of algorithm_A and algorithm_B could just as likely be skewed toward true positives as false ones. So while an ecumenical approach *might* enrich for TPs, it could also do the opposite. Rigorous benchmarking is still the most essential consideration when choosing an analytic pipeline.
August 5, 2017 at 12:22 am
Ravity Grainbow
Interesting you did not mention that one of the important differences between kallisto and Salmon was the license. I say ‘was’ because you changed kallisto’s license a few days ago. Whatever ostensibly good-natured justification is provided in the ‘I was wrong (part 2)’ blog post, it is hard to believe the timing of the license change and this take down of Salmon is coincidental. Perhaps you should pseudoalign your ego instead of pretending it is the reference.
August 5, 2017 at 9:58 pm
Lior Pachter
I’m not sure I understand your point or what you are asking of me.
Are you saying it’s ok that someone stole our work because they didn’t like our license? Do you not believe me that we had an August 1st date for changing the license set months ago? I’ll happily send you emails that show this if you share your email address. Do you think I should have acted faster in writing this blog post after the Salmon paper was published? I assure you I worked as hard as I could, unfortunately at an inconvenient time due to a recent move.
August 7, 2017 at 12:48 am
Ravity Grainbow
Stole? You think you have been robbed? Astounding.
“Academic politics are so vicious precisely because the stakes are so small.”
August 6, 2017 at 10:15 am
Not me
Who cares.
August 7, 2017 at 12:45 am
Anon Grad
Well solved Prof. It seems all that is missing is an audio recording of the discussion around the name: ” … We obviously can’t call it pseudoalignment … How about Quasi-Mapping?… I love it ”
I’m confident it will be leaked soon.
August 7, 2017 at 8:22 am
Pablo García Fernández
If people here can’t understand the difference between stealing an idea and incorporating it into previous work… well… humanity is condemned.
Here the Salmon authors not only took a lot of “inspiration” from the kallisto code; trying to gain an advantage, they even wrote this paper making their tool look better than its direct competitor (kallisto). So they not only stole, they also tried to reduce the visibility of the original idea in a desperate move to capture citations in every single RNA-seq paper.
Just wow, since the CRISPR patent I haven’t seen anything so dirty.
PD: My english skills are so weak, sorry.
August 7, 2017 at 12:52 pm
mi4
I think that this is the classic example of MULTIPLE DISCOVERY! See long list of multiple discoveries: https://en.wikipedia.org/wiki/List_of_multiple_discoveries#21st_century
MULTIPLE DISCOVERY (from wikipedia) = The concept of multiple discovery (also known as simultaneous invention) is the hypothesis that most scientific discoveries and inventions are made independently and more or less simultaneously by multiple scientists and inventors.
August 8, 2017 at 9:20 am
postdoc4life
That’s what I thought when I read Lior’s first post, but the addendum makes it look like literal code plagiarism, which is a whole different beast.
August 7, 2017 at 2:42 pm
yuanh
At least it’s good for us to be aware of such a thing or possibility, so that it won’t grow under neglect into a tumor in our scientific community.
August 8, 2017 at 7:32 am
anonymous
Kallisto might have something to do with “lightweight alignment” in sailfish, but several key ideas behind kallisto are original enough to separate it from previous tools. Kallisto is the prior art which salmon learned from. I don’t think there is dispute over priority. This is not multiple discovery, either.
The major contribution made by salmon is GC correction, which indeed addresses an important practical issue. This, combined with a solid implementation, deserves a nature methods publication in my view. There are certainly worse tools published in nature methods/biotech that no one uses.
At the same time, I agree the salmon devs could present their methods and results in a better way. To start with, they should stick with “pseudoalignment” instead of coming up with a new term, “quasi-mapping”. The two approaches are theoretically close and practically indistinguishable, as is shown in the blog post and a few other papers. In addition, the salmon devs could point out that salmon, with GC correction off, produces nearly identical results to kallisto. This would give kallisto due credit and send readers a clear message about the real innovations in salmon. In the current writing, the rapmap and salmon papers do read like they have radically improved upon kallisto on many fronts, which is a little exaggerated.
August 8, 2017 at 10:36 am
sciencer
I have run both tools in a well-powered setting. I did not see a noticeable difference to the results with the GC correction setting in salmon. Can you tell me the circumstances where you found the GC correction to improve results?
August 8, 2017 at 11:01 am
sciencer
Thank you, Lior, for your candor, outspokenness and open stance for discussion. I admire the rigor in your post and how every claim is supported with evidence. I have run kallisto and Salmon and did not see any difference in the results in a well-powered setting for quantification. It made sense because the underlying ideas are similar. I am also very impressed with Bray for speaking up.
I am saddened that the scientific community in general has given more weight to your personal attacks than to your scientific claims. I am quite disappointed with all the senior faculty who’ve taken to digging up the past, attacking your trainees and yourself, and supporting the Salmon crew on the basis of friendships and reputation. Please tell me: which one of us humans is not fallible? Even the greatest of us can make a mistake, and what is wrong with that? Maybe you are wrong or maybe they are, but what is wrong with being wrong? Reforming and changing for the better is allowed in humanity. Why is the scientific community so opposed to open and public discussion and so sensitive to criticism?
I understand if people don’t like the tone of your post or your style of accusing before publicly eliciting a response. But it’s saddening to see top scientists supporting the Salmon crew just because they disagree with your tone. It’s okay to react, tell you about tone, or discuss it at length as a separate issue. But immediately supporting the other side on the basis of established record is disappointing because, again, no one is infallible. Even the greatest scientists can get sloppy at times. I have respect for the coauthors of both papers and their records of brilliant work. But past record does not mean we are incapable of sloppiness or mistakes. More than anything else, your blog has exposed to me a side of the scientific community that is intolerant of strong views and disagreements.
I was also disappointed to see that you had to waste your time engaging with intolerant scientists on Twitter, trying to convince them of your point. If they don’t care to read your posts and are intolerant, there is nothing you can do.
At the end of the day, we are a community and I’m sure you’ve put yourself at risk with this writing. But it is important to stand up for what you are convinced of and with evidence, and hopefully the community will engage in a civil way. After all, since you are convinced there is misconduct, there should be some expectation of passion and anger from you. Thanks for speaking up.
August 8, 2017 at 4:26 pm
Forseti
@sciencer and @anonymous wrote smarter comments about this blog post than the entire community managed on Twitter over the past few days. We have heard about absolutely everything by now: personal relationships (“x and y are good scientists, they would not do that”), software licensing, the ethics of publicly writing your opinion on what you think is plagiarism (because, eh, we should all do open science, but not open enough for this), etc.
But no one has really addressed the content of this blog post, which raises fundamental questions such as:
Reimplementing is okay, but where is the line with copying? Is Salmon different enough from Kallisto to be its own software? Without the introduction of key concepts from Kallisto into Salmon, does Salmon innovate enough beyond Kallisto for a Nature publication?
Also, no one really seems to be bothered by the side note on Kallisto/Salmon running times: according to these plots, there is a gigantic difference between the two programs’ speeds, very different from what is written in the Salmon paper. Can someone confirm this?
August 9, 2017 at 3:33 am
Tim
As far as I understand from Lior’s post, the suggestion is just that Salmon copied the IDEA of “intersection of transcripts appearing in all hits” from Kallisto. Salmon has not used ALL of the ideas from Kallisto, and it also brought new ideas, like using suffix arrays instead of de Bruijn graphs and so on (e.g. GC bias?).
Is there any copy-and-paste in the manuscripts/preprints/articles/source-code files of Salmon? I guess that this is NOT the case, because then it would have been raised by now.
Also, the issue that the results of Salmon and Kallisto are pretty similar has been raised, but looking at science in general this happens more often than one expects: for example, when a given problem has one and only one solution, then over the years (even centuries apart; for example, concrete and aluminium, discovered by the Romans and re-discovered in recent centuries) the same solution/method will be re-discovered over and over again. Indeed this is rare in bioinformatics, but it might be that for this type of bioinformatics problem, mapping reads very fast using k-mers on a transcriptome, there is one and only one solution, which is “use the intersection of transcripts appearing in all hits”.
Now, to the issue that Salmon is made to look like it performs better on a very under-powered test dataset by playing with the (command-line and/or mathematical) parameters: all I can say is that it is under-powered, and if I pick 100 bioinformatics articles at random that introduce a new tool/method/algorithm, then most likely 80 of them will be affected by this issue.
Therefore, in the end, the major issue is: who decides when an idea is copied? The reviewers? The original authors who came up with the idea? How different does an idea have to be in order not to be considered copied? How much time should pass in order for a second, identical re-discovery to be accepted as genuine and not as a copy (in the age of the internet and GitHub)? 10 minutes? 10 days? 1 month?
August 9, 2017 at 7:59 am
anonymous
To me, it is not about copying ideas or code; it is about giving credit. Take the Smith-Waterman algorithm as an example. The original algorithm was not that great. It only became one of the most famous algorithms after repeated improvements by other researchers. These researchers still credit Smith and Waterman because the idea originated with them. This is the right way.
Now suppose that in an alternative universe there were researchers Foo and Bar around the same time as Smith & Waterman. They read S&W’s preprint, made little effective improvement to the algorithm, and then published a paper naming their algorithm Foo-Bar and disparaging Smith-Waterman at length, although the two algorithms are in fact very close. This is the wrong way.
The RapMap and Salmon papers are not as bad as my example, but they could certainly do better.
August 9, 2017 at 8:54 am
sciencer
Also, this is not the age of concrete discovery. GitHub code is public, and people in the same research sub-community are usually aware of competing methods. It is totally fair to make an incremental improvement to what is already out there, and sometimes, as luck would have it, the incremental improvement gains a greater following for various reasons. In this case, both tools are well maintained and have active user bases. The incremental improvement is also in dispute here, so perhaps it needs to be carefully re-examined.
August 10, 2017 at 1:34 pm
anonliorfan@gmail.com
Hi Lior. You should check out my new tool for RNA-seq analysis. It’s called Callisto and implements a revolutionary new method I developed called “kinda-matching”. Benchmarked using 738 threads, it outperforms both Kallisto and Salmon on simulated gene expression data from sea urchin testicles. It even extensively corrects for AT bias, a completely conceptually novel conception.
The manuscript is in preparation, but I feel publication in Nature Genetics is more-or-less assured.
August 14, 2017 at 6:08 pm
Sahin Naqvi
Hi Lior, thank you for doing this very thorough analysis. I did want to point one thing out with respect to the use of estimated counts vs. TPM in differential expression. Your point that transcript-level TPMs are, compared to counts, very dependent on effective length estimation is valid. However, for gene-level DE one needs to somehow sum the counts/expression of transcripts. It’s been argued that “simple summing” of transcript-level counts to the gene level is not optimal for DE, and indeed sleuth (and add-on tools for DESeq2 such as tximport) first normalize transcript counts by effective lengths, sum to the gene level, and then convert back to gene-level counts based on the median effective length of that gene’s transcripts. So for most tools, isn’t this dependency on effective length still there, even when using estimated counts instead of TPM?
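For concreteness, here is a rough sketch of the kind of length-scaled summing I mean (Python, purely illustrative with made-up numbers; not tximport’s actual implementation):

```python
# Rough sketch of the length-scaled summing described above. Each transcript row
# carries a gene id, its estimated counts, and its effective length; the numbers
# below are hypothetical.
from collections import defaultdict
from statistics import median

def gene_level_counts(transcripts):
    per_gene = defaultdict(list)
    for t in transcripts:
        per_gene[t["gene_id"]].append(t)

    gene_counts = {}
    for gene, txs in per_gene.items():
        # Normalize each transcript's counts by its effective length
        # (reads per base of effective length), then sum within the gene.
        rate_sum = sum(t["est_counts"] / t["eff_length"] for t in txs if t["eff_length"] > 0)
        # Convert back to a count scale using the median effective length
        # of the gene's transcripts.
        med_len = median(t["eff_length"] for t in txs)
        gene_counts[gene] = rate_sum * med_len
    return gene_counts

txs = [
    {"gene_id": "G1", "est_counts": 100.0, "eff_length": 1000.0},
    {"gene_id": "G1", "est_counts": 50.0,  "eff_length": 500.0},
    {"gene_id": "G2", "est_counts": 30.0,  "eff_length": 1500.0},
]
print(gene_level_counts(txs))  # {'G1': 150.0, 'G2': 30.0}
```

Either way, the effective lengths enter both through the per-transcript normalization and through the median used to rescale, which is the dependency I am asking about.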
August 16, 2017 at 7:43 am
ckingsford
Here is a detailed response from the authors: http://bit.ly/SalmonResponse
August 16, 2017 at 10:36 am
salzberg1
Carl’s comment (above, by ckingsford) is much too modest. In the lengthy response that Rob Patro, Carl Kingsford, Geet Duggal, Mike Love, and Rafa Irizarry just posted to GitHub, they demonstrate that virtually all of Lior’s claims in this lengthy attack are wrong. Lior’s claims of plagiarism are wrong – and, in this writer’s opinion, libelous and irresponsible. His claim that Salmon is essentially equivalent to Kallisto is also wrong, not only because he based most of his argument on a single sample, but also because of his misleading use of certain numbers and figures. (I suggest readers look at Figure 1 of https://github.com/salmonteam/SalmonBlogResponse/blob/master/SalmonBlogResponse.md, if they can’t read the entire document.)
I hope that Lior will allow this reply to appear, but because these comments are moderated I’m not certain he will. After reading through the rebuttal, I would like to see Lior write a lengthy retraction and an apology (along the lines of his “I was wrong” blog posts), but I’m not expecting that.
The bottom line: (1) Salmon has multiple novel ideas; (2) it performs quite differently from kallisto on many data sets, (3) the Salmon paper cites kallisto more than adequately (and repeatedly), (4) the code also cites kallisto appropriately, and (5) most of all, Lior’s accusations of plagiarism, which would be very serious if true, are completely false and never should have been made.
-Steven Salzberg, Johns Hopkins University
August 16, 2017 at 11:16 am
Lior Pachter
Hi Steven,
I have had a chance to read through Patro et al.’s rebuttal to my post. I will not respond in detail in this comment but thought it important to reply to your accusations.
I’m appalled that you would bring in Figure 1 as evidence that Salmon is not near-identical to kallisto. Quite the contrary: it’s a deliberately misleading use of a plot, the very thing that some of the authors of the Salmon paper have admirably argued against. If there is one thing we have to be able to agree on, it’s at least that 1=1.
There are 198,457 points in that plot, and I explain in this pair of tweets exactly why they show that salmon really IS near identical to kallisto.
I never said in my post that Salmon’s output was identical to kallisto’s. But no matter how you slice and dice it, it really does produce near-identical output for almost all transcripts in every dataset I’ve looked at. Of course for very ambiguous cases, with very low coverage, there will be a bit of difference (which is the point of kallisto’s bootstrap), but even those are very few. Do you really believe the correlation of log counts vs. log counts (0.9955965) is just something I cherry-picked?
You can go and test as many samples as you’d like. The SRA is filled with samples, and running kallisto and salmon in default mode and then making a scatterplot is straightforward. It’s something even PIs can do. I provided evidence in my post from multiple other papers, and I encourage you to test for yourself.
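To be concrete, a comparison along the following lines is all it takes (a minimal sketch in Python; the output directories are hypothetical, while abundance.tsv and quant.sf with the columns used below are the programs’ standard outputs):

```python
# Minimal sketch: compare default-mode kallisto and salmon quantifications of the
# same sample. The paths are hypothetical; abundance.tsv (kallisto) and quant.sf
# (salmon) are the standard output files of the two programs.
import numpy as np
import pandas as pd
from scipy.stats import pearsonr, spearmanr

kallisto = pd.read_csv("kallisto_out/abundance.tsv", sep="\t", index_col="target_id")
salmon = pd.read_csv("salmon_out/quant.sf", sep="\t", index_col="Name")

# Align on transcript IDs and compare estimated counts.
common = kallisto.index.intersection(salmon.index)
k = kallisto.loc[common, "est_counts"]
s = salmon.loc[common, "NumReads"]

# One common choice is to correlate log(counts + 1).
print("Pearson of log counts:", pearsonr(np.log(k + 1), np.log(s + 1))[0])
print("Spearman of counts:   ", spearmanr(k, s)[0])
```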
In the absence of agreement on the most basic fact here, which is that Salmon produces near-identical results to kallisto, I don’t think there is a point in continuing the debate, as I doubt it will be very productive.
Sincerely,
Lior
P.S. Regarding approval of comments I have never withheld a comment except in extreme cases of inappropriate conduct (or in the case of one post I announced that I will not approve more comments). In particular, anyone I approve has future automatic approval, so your current comment was approved without my intervention upon submittal.
August 17, 2017 at 5:28 pm
School Bully
Quite right! When I was in school I used to beat up younger kids for their lunch money. I was 5 years older than them, but I would beat up two or three at a time. That makes it fair, because their collective ages and heights were more than mine, so actually I was fighting against superior odds. Besides, I bestowed the lunch money on myself and my friends, which was an act of great kindness, goodness and honesty on my part.
I have been accused of all sorts of things because of these acts of kindness, generosity, honesty, courage and love. I have even been accused of being mean! Just for trying to take back lunch money that was mine in the first place, STOLEN from myself under the foulest circumstances.
Please allow me to explain. My genome is 99% identical to those of the kids whose money I took. Clearly, that means that I am exactly the same person as them, so it was my money to begin with. Furthermore, since I am older, I clearly thought of the idea of having lunch money first. I am appalled that anyone would try to argue otherwise. If we can’t agree that 1=1, there is no point in any more discussion. Goodbye.
PS Apologies – but I do prefer to remain publicly anonymous – hence the email rather than a comment on your blog. I am very junior, and as you know Professor Pachter is very senior and influential, and having re-read my analysis I think there is a good chance he will conclude that he invented the entrepreneurial scheme I just described and that my lunch money now belongs to him.
August 20, 2017 at 11:59 pm
College Educated
I am tempted to assume you’re some grad student of Salzberg’s academic lineage. Your attempt at this “sarcasm” is appalling because nowhere did Pachter attempt to steal anything. Pseudo-alignment was his group’s idea. They published it first.
Kingsford’s group, on the other hand, was developing an alternative strategy almost in parallel. When the Pachter group’s paper and code were released, the Kingsford group switched their strategy to pseudo-alignment under the alias of “quasi-mapping” and claimed the resulting implementation to be superior to the Pachter group’s software. Sure, they used different data structures to do so, but it was a clear-cut case of plagiarism.
It’s equivalent to copying your friend’s algorithm, using different variable names, function names, data structures, programming paradigm and programming language, and turning in your assignment. It’s plagiarism. Maybe you disguise your submission by throwing in a few more use cases beyond what was required for the homework. It’s still stealing. Hell, even if you mention in your submission that you were “inspired” by your friend’s work, unless your friend willingly collaborated with you, it’s academic dishonesty. It’s a simple thing that bullies like you may not get.
August 19, 2017 at 11:21 am
sciencer
Dear Prof. Salzberg,
Thank you for taking the time to engage in discussion.
Would it be fair to request that you elicit an apology from the multiple scientists who have made accusations against Lior and his student on Twitter since this post? All those accusatory comments without evidence would constitute libel as well.
The majority of the vocal community has questioned and drawn conclusions about Lior et al.’s motives and character primarily based on the accusatory tone of his blog post. It does seem unfair to target one scientist (who has been very honest about where he stands with his views) when in fact most of the community has done worse when expressing outrage and has retaliated through direct and indirect remarks.
Sincerely,
September 4, 2017 at 10:11 pm
Lior Pachter
You suggested that readers look at Figure 1. I suggest that after they do that they look at points #1 and #2 here: https://liorpachter.wordpress.com/2017/09/02/a-rebuttal/
August 16, 2017 at 6:22 pm
Mihai Pop
I would like to add another perspective, at some level unrelated to the technical arguments between Lior, Carl, and Rob.
The comments in the original blog posts quite strongly alleged scientific misconduct – a very significant charge especially when levied by an established full professor against a newly appointed assistant professor.
The power differential here is quite substantial (both in rank and social media presence) and the evidence from the comments indicates that the community of people reading this blog has accepted the argument provided by Lior without further independent examination of the facts.
I find this situation troubling and unacceptable for a number of reasons. First, I have been “raised” in a bioinformatics community that has been supportive and inclusive, and I owe my own success to repeated encouragement and support from the elders in the field. I have personally made some mistakes that could have led others to berate me publicly, yet they chose to privately give me the opportunity to explain myself and correct my mistakes.
This blog entry, as well as others that have appeared on this blog, is creating an environment in which junior scientists in our field come to fear being the next one in the “crossfire”. Some tough ones will persist nonetheless, while (perhaps many) others will decide that their talents are more valued elsewhere.
Second, should allegations of scientific misconduct be true, there are formal procedures for addressing such allegations – every University, journal, and professional society has clear guidelines for the burden of proof necessary to independently verify such claims. Absent such proof, unsupported allegations of misconduct constitute libel.
In closing, I would like to urge Lior to discontinue the use of his blog as a platform for conducting personal attacks on other scientists. The many positive contributions of this blog are eclipsed by the harm these attacks do, not just to individual scientists, but to the field as a whole. Scientific progress critically depends on an inclusive, supportive, and civil community.
August 17, 2017 at 2:15 am
Lior Pachter
I appreciate your concerns, and have thought about many of these issues at great length. Regarding this specific post, I’d just like to point out a few things:
1. The paper I blogged about has 4 PIs on it. As far as the power differential goes, these PIs collectively have substantially more power than I do in almost every respect. One of them is more senior than I am. I’m not an editor of a journal (in fact I’m currently not even on an editorial board), I don’t serve on an NIH or NSF study section, and I don’t have any senior position in any genome consortia. My social media presence consists of a Twitter account and this blog. I have fewer Twitter followers than the sum of the 4 PIs on this paper, and yes, my blog has a following, but among the 4 PIs a number of them blog, and one of them is an author on one of the most widely read blogs in statistics (Simply Statistics). Whatever power this blog wields, if any, is via its truth. And without truth, it will be relegated to the dustbin of the internet, where it will reside with many other failed blogs.
2. I really wish I could think of a better venue for airing the kinds of concerns raised in this blog post. This particular story relates to several preprints and papers across multiple journals. So which journal exactly should I have written to with a complaint? Which preprint should I have commented on? How could I have tied it all together? And how would I have explained what happened within the single-page limit that is common at many journals? As I’ve discussed before in other blog posts, there is also a fundamental conflict of interest for journals in hosting the kind of discussion that my blog posts elicit. I’ve thought of and looked into other channels as well, but I honestly don’t know who would be able to investigate the kinds of claims I’m making, coordinate it among multiple institutions, and sort out the technical issues. I will say that in some cases I have sought out and worked with formal channels when I saw that they were willing and able to help.
3. When I blog about papers that I find to be problematic I do so with enormous preparation. I respectfully disagree with your characterization on the impact to the community. I’ve received an enormous amount of email and personal thank you remarks from people, mostly junior (especially students) who are afraid to speak up and feel empowered by my blog. The following is an excerpt from an email I once received which exemplifies a lot of what I hear behind the scenes:
“Apologies – but I do prefer to remain publicly anonymous – hence the email rather than a comment on your blog. I am very junior, only just on the job market for my first faculty position, and ** is influential as you know. Call this cowardly, but there it is. ”
That’s real power being exerted right there.
While you are truly fortunate to have benefited from supportive and inclusive mentors, I think it’s fair to say that many students don’t feel that way about their academic experience. Civil discourse is great but not if its strict enforcement takes the place of, and prevents, kindness, goodness and honesty. To be blunt, I’m not really sure how to say “I think you stole my work” in a really nice way. But I do think that saying it is not only liberating, it bestows kindness, goodness and honesty on those whose work was stolen (in this case not just myself but a number of junior colleagues who are former students and collaborators) and others in the field who have had similar experiences. I believe the field benefits when it’s acceptable to criticize abhorrent behavior.
4. In this blog post, as with others, I’ve been careful to support every claim with hard data and facts. I didn’t call anyone names but I did question motives because I was truly baffled as to how the cumulative behavior of the authors could be explained. In some cases there is disagreement over the validity of my claims. The authors of the paper blogged about here have posted a rebuttal above, and I or you or anyone else is free to refute their rebuttal or explain why it merits revision of my remarks. I trust the readers of the blog to follow the discussions and debates and draw their own conclusions.
September 20, 2017 at 9:58 pm
Tata
I felt very bitter reading your post – is this what research has come to? Are we no longer worried about how good our work is, and too worried about who gets the credit? But reading your comment shed some light on the emotional component of your post, which I can empathize with. I understand that the unfairness around us can really get on our nerves, but I wonder: is this the best application of your talent? If you believe that your work is better, and truly novel, the community will recognize it by applying it to their studies and appreciating the accuracy, novelty, precision… For me personally, I dropped kallisto (and will not even try Salmon, since they are so similar) for the simple reason that kallisto was giving very high expression for a transcript that was clearly not expressed (and I know it was not expressed because this transcript is specific to a particular tissue, which was not the one I was looking at, and because raw alignments from multiple aligners did not have a single read supporting the junctions of the transcript, while kallisto reported it to be one of the most highly expressed). I also saw a few posters at various conferences that repeatedly found RSEM to be one of the top performers, and it has been producing convincing results in my datasets, so that is my choice number one (though I encourage everyone to try several for themselves if asked).
The point I am trying to make here is this: I get that you feel you’ve been robbed, and that it’s unfair. But life is unfair, and I see it all around me, happening to me, to my colleagues, friends and family. Unworthy people get faculty positions, publish in high-impact journals, get promotions and awards. IT’S EVERYWHERE! And I keep telling myself: I can’t change the world. But I can make sure that the work I do is pure and true. And if someone wants to take credit for my discovery, I go and complain to my friends and family, they tell me their words of support, and I move on. I am a junior researcher; I can’t do anything about it. But what I can do is excellent work, which is meaningful. And when I move to a more senior position, I will teach my trainees the values I believe in, and that will be the way I leave my mark on the field.
That being said: maybe the most useful thing that came out of this post is the encouragement it gave to more junior people to stand up to the abuse of power. But whoever used kallisto before will still use kallisto, and those who use Salmon will still use Salmon (because why switch if they are so similar anyway?). Please don’t waste your time and energy on this absolutely pointless battle. You have a talent – use it wisely.
(I have a lot of respect for both parties to the conflict and do not pick a side.)
September 20, 2017 at 10:10 pm
Lior Pachter
In Bray et al. 2015 we compared kallisto to RSEM, and on simulated data we found that RSEM was slightly more accurate. That makes sense, because RSEM makes full use of alignments, which can help distinguish read assignments in cases where error modeling helps. This is one reason I’ve been surprised by the argument some people have made that the near-identity of Salmon and kallisto is a good thing, because it means they must both be giving the correct answer. I disagree with that, and agree with you that RSEM is more accurate. The point we made in Bray et al. 2015 was not that kallisto is the most accurate program, but that the small loss of accuracy from abandoning full alignments is worth it given the overall noise in experiments and the advantages very rapid quantification provides (e.g. the bootstrap; RSEM can also estimate inferential variance, but it is very slow in doing so, and not as good at it due to the complexities involved).
Having said this I am surprised that you would find a highly expressed transcript called by kallisto that in reality is not expressed at all. The differences we have seen with RSEM are usually more subtle. I would be interested to see the case if you can share it (and of course will preserve your anonymity).
August 17, 2017 at 4:12 pm
anonymous
The original blog post is largely objective. It points out a major problem with many method papers: exaggeration of incremental improvements with biased benchmarks, which in my view still stands even after ckingsford’s rebuttal. As a matter of fact, the original post mentions no words like steal/stole, copy/copied or misconduct. It is several senior PIs who started baseless accusations and finally dragged a scientific debate into personal attacks, unfortunately in both directions.
Who do I fear more, Lior or those senior PIs? The latter. I would feel comfortable openly confronting Lior and pointing out his mistakes (e.g. I do think Salmon deserves a good publication and that Lior et al. have over-reacted in response to the accusations). I stay anonymous only because I fear a single supportive word might doom my career at the hands of PIs who care more about personal attacks than about science.
August 17, 2017 at 7:19 pm
sciencer
“and the evidence from the comments indicates that the community of people reading this blog has accepted the argument provided by Lior without further independent examination of the facts.”
The above remark is suggestive of a number of things:
1. that we (the commenters on this blog) have not cared to examine the technical aspects of the post. To reiterate, I have tried both tools and both give me very similar results, with and without GC-bias correction. I hope everyone who endorsed the response post has likewise done an independent examination.
2. that if we air our objective views, we will be considered to be taking a side. For the record, I respect both sets of authors, and this blog post is not going to prevent me from using either of their work when the science demands it. I engage in discussion because I love science, and as long as humans do science, it is important to discuss. We don’t live in silos.
3. that we should not engage in discussion with someone whose tone is unacceptable to the vocal majority. I think Lior has invited criticism with his tone, and because of its unproductive nature I don’t endorse it. But that will not stop me from engaging with him in a discussion of his claims. I won’t banish him for feeling outraged and being open about it.
I am very confused and, honestly, a bit disappointed at the outrage of our supportive community of elders. Have you seen how the vocal majority of the community has attacked him and his student with several accusations on Twitter, trolling (see the obscene comments on this post) and undue animosity? Is this how our community espouses inclusion and diversity of opinions?
I feel our greater community has done worse. There are a few accusations in Lior’s post against possibly five scientists. But multiple members have accused and attacked Lior et al. The elders have tried to police him. Seems a bit hypocritical to me.
Is it an eye for an eye?
Is extreme concern or disappointment or distress when someone strongly feels their work has been plagiarized not allowed in the community? Or is the preference to have a public discussion without any backstabbing or selective/private engagement a wrong idea?
Mental health was mentioned on Twitter. I now worry more about the mental health of Lior and his students given the majority outrage of scientists with power.
August 17, 2017 at 4:50 pm
Anonymous Graduate
Salmon, despite its claim to be superior to kallisto, has produced similar results in my and my colleagues’ work. I routinely use both, and other tools, to cross-check my analyses. I have never had running-time issues with either; both take similar amounts of time (though kallisto is faster in my experience). So if anything Salmon is slightly better than kallisto due to its GC bias correction (which I hardly use anyway), as opposed to being radically different for all use cases. Lior stood up against what at this point looks like plagiarism on Salmon’s part, and that is commendable.
August 17, 2017 at 10:10 pm
Jack Donaghy
Lior,
This post from you is a total disgrace to the spirit of science: making accusations of fraud by selectively picking points, and claiming your work has not been credited even though it has been. (Of course, who does not want more credit? While we’re at it, I want $100 million as well.)
The worst part here is that you do not hold yourself to the standards you seem to be demanding from others either. This is just you attempting to bully Salmon as it is a popular well-written software competing with your own.
November 1, 2019 at 6:28 pm
Thomas
This. Thanks for so clearly articulating it. I was looking into which algorithm to use and encountered this blog post. I was in awe, thinking “how could they not cite kallisto if they are so similar and published later?” I went to read the paper only to see that these were all false accusations. It’s so weird that all this can come from a supposed “leader” in the field, who clearly doesn’t understand how science progresses. Laughable. Unfortunately pathetic.
November 1, 2019 at 7:36 pm
Lior Pachter
The only laughable thing about this is the fact that even the authors of Salmon don’t use the GC bias option when they run Salmon.
August 19, 2017 at 6:10 am
ConcernedScientist
I would like to give a round of applause to @salzberg1 for his last comment; that was a very nice piece of comedy he made there. He has been posting comments on this blog for years, comments in which he disagrees with Lior, and Lior’s blog is known throughout the entire community as a place of heated discussion for this very reason: all comments are approved and everyone can say what they think. So by implying in his last comment that he doesn’t know if “Lior will allow this reply to appear, but because these comments are moderated I’m not certain he will”, he tried to picture Lior as some sort of dictator and himself as a victim, knowing full well that his reply would indeed appear. That was a nice try, @salzberg1, but you didn’t fool anyone.
Furthermore, @salzberg1 has been claiming on social networks from day 1 that Lior is wrong and Kingsford et al. are excellent scientists. At no moment did he analyse the content of the blog post and tell Lior or anyone else “I disagree/he’s wrong because of points X and Y in the blog. Here is why.” No, he just vouched for a good friend who was his postdoc, without any justification. That’s the most unscientific behavior I have seen in a while, and a perfect example for students of what not to do and how not to behave in science.
August 19, 2017 at 8:38 am
You are right to be concerned
This only makes Salzberg and his students all look guilty. Note how there is a clique supporting Patro et al., and they are mostly academic relatives of one another. Their defense mainly consists of discrediting Lior as a person rather than as a scientist. I never would have believed that bioinformatics, a discipline spanning so many basic sciences, had such unscientifically oriented people. Maybe it’s just a Salzberg/JHU thing.
August 20, 2017 at 11:41 pm
Junhyong Kim
I am late to the scene here but I would like to leave a support for Lior and his usage of this blog forum to critique papers, issues, and such.
First, to clarify: with respect to the comparison of Salmon vs. kallisto, I am not taking any position. I learned a lot from reading Lior’s detailed analysis, but I have not studied either paper in enough detail to be qualified to say anything one way or another.
However, I think there has been quite a bit of criticism of Lior’s use of the blog format as well as of his critiques (the current discussion as well as those in the past). I agree with Lior that there is no easy, appropriate forum for these kinds of discussions, especially with the depth that is presented here. I am not sure what the solution is, but I find it strange that people criticize the use of this format versus any other for what is essentially (and appropriately) a public format.
Without taking any sides on any particular issue, I very much appreciate Lior’s willingness to dissect critical issues and take a stance on those issues that he thinks is problematic.
I think we know very well that in science we see things that (we personally think) are naive < foolish < dumb < sly < conniving < malicious < fraudulent. There is a line somewhere in this sequence at which we should stand up and say something. But we all too often put the line way too far to the right. We don’t want to be “that person”. I find a very close analog in how we respond to people engaged in sexual misconduct that is boorish < inappropriate < creepy < harassing < criminal. Again, we tend to cry out only when we see things far to the right, and we tend to encourage a culture of letting things go.
One might be right or one might be wrong, but taking a stance on things that one thinks are wrong is truly difficult. The older I get, the more I feel that I was cowardly (not collegial) when I let things slide. We need more transparent discussions, not less.
Junhyong Kim
November 24, 2017 at 7:17 pm
Humphrey Gardner
I’m switching to kallisto. Thanks for pointing out the better performance with fewer cores, and thanks, Lior, for the great work enabling the community to make the best use of RNA-Seq data…
December 18, 2017 at 2:53 pm
naarkhoo
It seems there are a lot of people here who bully Lior for his objective view and thorough analysis. I personally believe that knowing your method has no significant edge over other methods, yet showing it outperforming them on selected simulated data and publishing in a “high-impact-factor” journal, is a form of dishonesty. I am really disappointed by the people in science who attack Lior for his language: don’t you have any objective criticism?
November 1, 2019 at 6:24 pm
Thomas Patterson
Salmon is not the same as kallisto; it is an extension of kallisto’s method: it considers a prior for the data. In most cases, where there is no strong prior, the results won’t change much, but there are cases where this will matter (GC content abnormalities, etc.).
They very properly cited kallisto and showed how they differ in their published paper. This is how science progresses, Lior, by building on other scientists’ work. This is how it has always been. Perhaps you should review your perspective as a scientist and your attitude towards others’ work.
November 1, 2019 at 7:34 pm
Lior Pachter
I’m all for building on prior work, and I’m particularly happy when people build on my work. But this was not building, it was stealing.
November 4, 2019 at 11:01 am
Thomas
That’s interesting. If the paper cited kallisto properly in their publication, talked about the novelty kallisto brings (the pseudo-alignment idea, a lightweight way to carry out alignment, etc.) and the novelty Salmon brings (incorporating priors, a difference in licensing), what precisely makes you consider it “stealing”?
November 4, 2019 at 2:18 pm
Lior Pachter
There’s a good summary of what’s wrong with what happened here on this thread: https://twitter.com/yarbsalocin/status/893886707564662784
The exact details are in this very blog post, which I encourage you to read. Yes, technically the Salmon paper (drive-by-cites) cites kallisto, but that’s a far cry from properly acknowledging prior work. BTW even the term “lightweight alignment” that you used was a cynical phrase created by the Salmon authors.