Continuous-time Markov chain models for DNA mutations on a phylogenetic tree (e.g. the Jukes-Cantor model, the Kimura models, and more generally models of the Felsenstein hierarchy) have the simple and convenient property of multiplicativity. Specifically, if Q is a rate matrix then the associated substitution matrices are multiplicative in the following sense:

e^{Q(t_1+t_2)} = e^{Qt_1}e^{Qt_2}.

This follows directly from the fact that the matrices Qt_1 and Qt_2 commute, because for any two commuting matrices A and B

e^{A+B} = e^{A}e^{B}.

This means that substitutions over a time period 2t are equivalently described as substitutions occurring over a time period t, followed by substitutions occurring afterwards over another time period t.

But what if over the course of time the rate matrix changes? For example, suppose that for a period of time t mutations proceed according to a rate matrix Q, and following that, for another period of time t,  mutations proceed according to a rate matrix R? Is it true that the substitutions after time 2t will behave as if mutations occurred for a time 2t according to the (average) rate matrix \frac{Q+R}{2}?

If Q and R commute the answer will be yes, as Qt and Rt will also be commutative and the multiplicativity property will hold. But what if Q and don’t commute? Is there any relationship at all between e^{\frac{Q+R}{2}2t} and the matrices e^{Qt} and e^{Rt}?

This week I visited Yale University to give a talk in the Center for Biomedical Data Science seminar series.  I was invited by Smita Krishnaswamy, who organized a wonderful visit that included many interesting conversations not only in computational biology, but also applied math, computer science and statistics (Yale has strong programs in applied mathematics, statistics and data science, computer science and biostatistics). At dinner I learned from Dan Spielman of the Golden-Thompson inequality which provides a beautiful answer to the question above in the case where Q and R are symmetric. The theorem is a trace inequality for Hermitian matrices A and B:

tr(e^{A+B}) \leq tr(e^Ae^B).

This inequality is well known in statistical mechanics and random matrix theory but I don’t believe it is known in the phylogenetics community, hence this post. The phylogenetic interpretation of the pieces of the Golden-Thompson inequality (replacing A with Qt and B with Rt) is straightforward:

  • The matrices e^{Qt} and e^{Rt} are substitution matrices for the rate matrices Q and R respectively.
  • The product e^{Qt}e^{Rt} is the substitution matrix corresponding to mutations occurring with rate matrix Q for time t followed by rate matrix R for time t.
  • The matrix e^{Qt+Rt} = e^{\frac{Q+R}{2} \cdot 2t} is the substitution matrix for mutations occurring with rate \frac{Q+R}{2} for time 2t.
  • Since the trace of a substitution matrix is the probability that there is no transition, or equivalently the probability that a change in nucleotide does not occur, the Golden-Thompson inequality states that for two symmetric rate matrices and R, the probability of a substitution after time 2t is higher when mutations occur first at rate Q for time t and then at rate R for time t, than if they occur at rate \frac{Q+R}{2} for time 2t. 

In other words, rate changes decrease the expected number of substitutions in comparison to what one would see if rates are constant

The Golden-Thompson inequality was discovered independently by Sidney Golden and Colin Thompson in 1965. A proof is explained in an expository blog post by Terence Tao who heard of the Golden-Thompson inequality only eight years ago, which makes me feel a little bit better about not having heard of it until this week! It would be nice if there was a really simple proof but that appears not to be the case (there is a purported one page proof in a paper titled Golden-Thompson from Davis, however what is proved there is the different inequality tr(e^{A+B}) \leq tr(e^A)tr(e^B), which can be shown, by virtue of another matrix trace inequality, to be a weaker inequality).

There is considerable interest in evolutionary biology in models that allow for time-varying rates of mutation, as there is substantial evidence of such variation. The Golden-Thompson inequality provides an additional insight for how mutation rate changes over time can affect naïve estimates based on homogeneity assumptions.


The Felsenstein hierarchy (from Algebraic Statistics for Computational Biology).