The encapsulation of beads together with cells in droplets is the basis of microfluidic based single-cell RNA-seq technologies. Ideally droplets contain exactly one bead and one cell, however in practice the number of beads and cells in droplets is stochastic and encapsulation of cells in droplets produces an approximately Poisson distribution of number of cells per droplet:

Specifically, the probability of observing k cells in a droplet is approximated by

$\mathbb{P}(\mbox{k cells in a droplet}) = \frac{e^{-\lambda}\lambda^k}{k!}$.

The rate parameter $\lambda$ can be controlled and the average number of cells per droplet is equal to it. Therefore, setting $\lambda$ to be much less than 1 ensures that two or more cells are rarely encapsulated in a single droplet. A consequence of this is that the number of empty droplets, given by $e^{-\lambda}$, is large. Importantly, one of the properties of the Poisson distribution is that variance is equal to the mean so the number of cells per droplet is also equal to $\lambda$.

Along with cells, beads must also be captured in droplets, and when plastic beads are used the occupancy statistics follow a Poisson distribution as well. This means that with technologies such as Drop-seq (Macosko et al. 2015), which uses polystyrene beads, many droplets are either empty, contain a bead and no cell, or a cell and no bead. The latter situation (cell and no bead) leads to a low “capture rate”, i.e. not many of the cells are assayed in an experiment.

One of the advantages of the inDrops method (Klein et al. 2015) over other single-cell RNA-seq methods is that it uses hydrogel beads which allow for a reduction in the variance of the number of beads per cell. In an important paper Abate et al. 2009 showed that close packing of hydrogel beads allows for an almost degenerate distribution where the number of beads per droplet is exactly one 98% of the time. The video below shows how close to degeneracy the distribution can be squeezed (in the example two beads are being encapsulated per droplet):

A discrete distribution defined over the non-negative integers with variance less than the mean is called sub-Poisson. Similarly, a discrete distribution defined over the non-negative integers with variance greater than the mean is called super-Poisson. This terminology dates back to at least the 1940s (e.g., Berkson et al. 1942) and is standard in many fields from physics (e.g. Rodionov and Cherkin 2004) to biology (e.g. Pitchiaya et al. 2014 ).

Figure 5.26 from Adrian Jeantet, Cavity quantum electrodynamics with carbon nanotubes, 2017.

Using this terminology, the close packing of hydrogel beads can be said to enable sub-Poisson loading of beads into droplets because the variance of beads per droplet is reduced in comparison to the Poisson statistics of plastic beads.

Unfortunately, in a 2015 paper, Bose et al. used the term “super-Poisson” instead of “sub-Poisson” in discussing an approach to reducing bead occupancy variance in the single-cell RNA-seq context. This sign error in terminology has subsequently been propagated and recently appeared in a single- cell RNA-seq review (Zhang et al. 2018) and in 10x Genomics advertising materials.

When it comes to single-cell RNA-seq we already have people referring to the number of reads sequenced as “the library size” and calling trees “one-dimensional manifolds“. Now sub-Poisson is mistaken for super-Poisson. Before you know it we’ll have professors teaching students that cell clusters obtained by k-means clustering are “cell types“…

Supper poisson (not to be confused with super-Poisson (not to be confused with sub-Poisson)).