The two cultures of mathematics and biology

December 30, 2014 in expository | Tags: Alexander Grothendieck, biology, David Haussler, David Mumford, Gian-Carlo Rota, Institute for Quantitative and Computational Biosciences, John Tate, mathematics, Montgomery Slatkin, phylogenetic invariants, Richard Durbin, The Two Cultures, UCLA | by Lior Pachter

I’m a (50%) professor of mathematics and (50%) professor of molecular & cell biology at UC Berkeley. There have been plenty of days when I have spent the working hours with biologists and then gone off at night with some mathematicians. I mean that literally. I have had, of course, intimate friends among both biologists and mathematicians. I think it is through living among these groups and much more, I think, through moving regularly from one to the other and back again that I have become occupied with the problem that I’ve christened to myself as the ‘two cultures’. For constantly I feel that I am moving among two groups- comparable in intelligence, identical in race, not grossly different in social origin, earning about the same incomes, who have almost ceased to communicate at all, who in intellectual, moral and psychological climate have so little in common that instead of crossing the campus from Evans Hall to the Li Ka Shing building, I may as well have crossed an ocean.¹

I try not to become preoccupied with the two cultures problem, but this holiday season I have not been able to escape it. First there was a blog post by David Mumford, a professor emeritus of applied mathematics at Brown University, published on December 14th. For those readers of the blog who do not follow mathematics, it is relevant to what I am about to write that David Mumford won the Fields Medal in 1974 for his work in algebraic geometry, and afterwards launched another successful career as an applied mathematician, building on Ulf Grenander’s Pattern Theory and making significant contributions to vision research. A lot of his work is connected to neuroscience and therefore biology. Among his many awards are the MacArthur Fellowship, the Shaw Prize, the Wolf Prize and the National Medal of Science. David Mumford is not Joe Schmo.

It therefore came as a surprise to me to read his post titled “Can one explain schemes to biologists?” in which he describes the rejection by the journal Nature of an obituary he was asked to write. Now I have to say that I have heard of obituaries being retracted, but never of an obituary being rejected. The Mumford rejection is all the more disturbing because it happened after he was invited by Nature to write the obituary in the first place!

The obituary Mumford was asked to write was for Alexander Grothendieck, a leading and towering figure in 20th century mathematics who built many of the foundations for modern algebraic geometry. My colleague Edward Frenkel published a brief non-technical obituary about Grothendieck in the New York Times, and perhaps that is what Nature had in mind for its journal as well. But since Nature bills itself as “An international journal, published weekly, with original, groundbreaking research spanning all of the scientific disciplines [emphasis mine]” Mumford assumed the readers of Nature would be interested not only in where Grothendieck was born and died, but in what he actually accomplished in his life, and why he is admired for his mathematics. Here is the beginning excerpt of Mumford’s blog post² explaining why he and John Tate (his coauthor for the post) needed to talk about the concept of a scheme in their post:

John Tate and I were asked by Nature magazine to write an obituary for Alexander Grothendieck. Now he is a hero of mine, the person that I met most deserving of the adjective “genius”. I got to know him when he visited Harvard and John, Shurik (as he was known) and I ran a seminar on “Existence theorems”. His devotion to math, his disdain for formality and convention, his openness and what John and others call his naiveté struck a chord with me.

So John and I agreed and wrote the obituary below. Since the readership of Nature were more or less entirely made up of non-mathematicians, it seemed as though our challenge was to try to make some key parts of Grothendieck’s work accessible to such an audience. Obviously the very definition of a scheme is central to nearly all his work, and we also wanted to say something genuine about categories and cohomology.

What they came up with is a short but well-written obituary that is the best I have read about Grothendieck. It is non-technical yet accurate and meaningfully describes, at a high level, what he is revered for and why. Here it is (copied verbatim from David Mumford’s blog):

Alexander Grothendieck
David Mumford and John Tate

Although mathematics became more and more abstract and general throughout the 20th century, it was Alexander Grothendieck who was the greatest master of this trend. His unique skill was to eliminate all unnecessary hypotheses and burrow into an area so deeply that its inner patterns on the most abstract level revealed themselves — and then, like a magician, show how the solution of old problems fell out in straightforward ways now that their real nature had been revealed. His strength and intensity were legendary. He worked long hours, transforming totally the field of algebraic geometry and its connections with algebraic number theory. He was considered by many the greatest mathematician of the 20th century.

Grothendieck was born in Berlin on March 28, 1928 to an anarchist, politically activist couple — a Russian Jewish father, Alexander Shapiro, and a German Protestant mother Johanna (Hanka) Grothendieck, and had a turbulent childhood in Germany and France, evading the holocaust in the French village of Le Chambon, known for protecting refugees. It was here in the midst of the war, at the (secondary school) Collège Cévenol, that he seems to have first developed his fascination for mathematics. He lived as an adult in France but remained stateless (on a “Nansen passport”) his whole life, doing most of his revolutionary work in the period 1956 – 1970, at the Institut des Hautes Études Scientifique (IHES) in a suburb of Paris after it was founded in 1958. He received the Fields Medal in 1966.

His first work, stimulated by Laurent Schwartz and Jean Dieudonné, added major ideas to the theory of function spaces, but he came into his own when he took up algebraic geometry. This is the field where one studies the locus of solutions of sets of polynomial equations by combining the algebraic properties of the rings of polynomials with the geometric properties of this locus, known as a variety. Traditionally, this had meant complex solutions of polynomials with complex coefficients but just prior to Grothendieck’s work, Andre Weil and Oscar Zariski had realized that much more scope and insight was gained by considering solutions and polynomials over arbitrary fields, e.g. finite fields or algebraic number fields.

The proper foundations of the enlarged view of algebraic geometry were, however, unclear and this is how Grothendieck made his first, hugely significant, innovation: he invented a class of geometric structures generalizing varieties that he called schemes. In simplest terms, he proposed attaching to any commutative ring (any set of things for which addition, subtraction and a commutative multiplication are defined, like the set of integers, or the set of polynomials in variables x,y,z with complex number coefficients) a geometric object, called the Spec of the ring (short for spectrum) or an affine scheme, and patching or gluing together these objects to form the scheme. The ring is to be thought of as the set of functions on its affine scheme.

To illustrate how revolutionary this was, a ring can be formed by starting with a field, say the field of real numbers, and adjoining a quantity $\epsilon$ satisfying $\epsilon^2=0$ . Think of $\epsilon$ this way: your instruments might allow you to measure a small number such as $\epsilon=0.001$ but then $\epsilon^2=0.000001$ might be too small to measure, so there’s no harm if we set it equal to zero. The numbers in this ring are $a+b \cdot \epsilon$ real a,b. The geometric object to which this ring corresponds is an infinitesimal vector, a point which can move infinitesimally but to second order only. In effect, he is going back to Leibniz and making infinitesimals into actual objects that can be manipulated. A related idea has recently been used in physics, for superstrings. To connect schemes to number theory, one takes the ring of integers. The corresponding Spec has one point for each prime, at which functions have values in the finite field of integers mod p and one classical point where functions have rational number values and that is ‘fatter’, having all the others in its closure. Once the machinery became familiar, very few doubted that he had found the right framework for algebraic geometry and it is now universally accepted.

Going further in abstraction, Grothendieck used the web of associated maps — called morphisms — from a variable scheme to a fixed one to describe schemes as functors and noted that many functors that were not obviously schemes at all arose in algebraic geometry. This is similar in science to having many experiments measuring some object from which the unknown real thing is pieced together or even finding something unexpected from its influence on known things. He applied this to construct new schemes, leading to new types of objects called stacks whose functors were precisely characterized later by Michael Artin.

His best known work is his attack on the geometry of schemes and varieties by finding ways to compute their most important topological invariant, their cohomology. A simple example is the topology of a plane minus its origin. Using complex coordinates (z,w), a plane has four real dimensions and taking out a point, what’s left is topologically a three dimensional sphere. Following the inspired suggestions of Grothendieck, Artin was able to show how with algebra alone that a suitably defined third cohomology group of this space has one generator, that is the sphere lives algebraically too. Together they developed what is called étale cohomology at a famous IHES seminar. Grothendieck went on to solve various deep conjectures of Weil, develop crystalline cohomology and a meta-theory of cohomologies called motives with a brilliant group of collaborators whom he drew in at this time.

In 1969, for reasons not entirely clear to anyone, he left the IHES where he had done all this work and plunged into an ecological/political campaign that he called Survivre. With a breathtakingly naive spirit (that had served him well doing math) he believed he could start a movement that would change the world. But when he saw this was not succeeding, he returned to math, teaching at the University of Montpellier. There he formulated remarkable visions of yet deeper structures connecting algebra and geometry, e.g. the symmetry group of the set of all algebraic numbers (known as its Galois group $Gal(\overline{\mathbb{Q}}/\mathbb{Q})$ ) and graphs drawn on compact surfaces that he called ‘dessin d’enfants’. Despite his writing thousand page treatises on this, still unpublished, his research program was only meagerly funded by the CNRS (Centre Nationale de Recherche Scientifique) and he accused the math world of being totally corrupt. For the last two decades of his life he broke with the whole world and sought total solitude in the small village of Lasserre in the foothills of the Pyrenees. Here he lived alone in his own mental and spiritual world, writing remarkable self-analytic works. He died nearby on Nov. 13, 2014.

As a friend, Grothendieck could be very warm, yet the nightmares of his childhood had left him a very complex person. He was unique in almost every way. His intensity and naivety enabled him to recast the foundations of large parts of 21st century math using unique insights that still amaze today. The power and beauty of Grothendieck’s work on schemes, functors, cohomology, etc. is such that these concepts have come to be the basis of much of math today. The dreams of his later work still stand as challenges to his successors.

Mumford goes on in his blog post to describe the reasons Nature gave for rejecting the obituary. He writes:

The sad thing is that this was rejected as much too technical for their readership. Their editor wrote me that ‘higher degree polynomials’, ‘infinitesimal vectors’ and ‘complex space’ (even complex numbers) were things at least half their readership had never come across. The gap between the world I have lived in and that even of scientists has never seemed larger. I am prepared for lawyers and business people to say they hated math and not to remember any math beyond arithmetic, but this!? Nature is read only by people belonging to the acronym ‘STEM’ (= Science, Technology, Engineering and Mathematics) and in the Common Core Standards, all such people are expected to learn a hell of a lot of math. Very depressing.

I don’t know if the Nature editor had biologists in mind when rejecting the Grothendieck obituary, but Mumford certainly thought so, as he sarcastically titled his post “Can one explain schemes to biologists?” Sadly, I think that Nature and Mumford both missed the point.

Exactly ten years ago Bernd Sturmfels and I published a book titled “Algebraic Statistics for Computational Biology”. From my perspective, the book developed three related ideas: 1. that the language, techniques and theorems of algebraic geometry both unify and provide tools for certain models in statistics, 2. that problems in computational biology are particularly prone to depend on inference with precisely the statistical models amenable to algebraic analysis and (most importantly) 3. that mathematical thinking, by way of considering useful generalizations of seemingly unrelated ideas, is a powerful approach for organizing many concepts in (computational) biology, especially in genetics and genomics.

To give a concrete example of what 1, 2 and 3 mean, I turn to Mumford’s definition of algebraic geometry in his obituary for Grothendieck. He writes that “This is the field where one studies the locus of solutions of sets of polynomial equations by combining the algebraic properties of the rings of polynomials with the geometric properties of this locus, known as a variety.” What is he talking about? The notion of “phylogenetic invariants” provides a simple example for biologists by biologists. Phylogenetic invariants were first introduced to biology ca. 1987 by Joe Felsenstein (Professor of Genome Sciences and Biology at the University of Washington) and James Lake (Distinguished Professor of Molecular, Cell, and Developmental Biology and of Human Genetics at UCLA)³.

Given a phylogenetic tree describing the evolutionary relationship among n extant species, one can examine the evolution of a single nucleotide along the tree. At the leaves, a single nucleotide is then associated to each species, collectively forming a single selection from among the $4^n$ possible patterns for nucleotides at the leaves. Evolutionary models provide a way to formalize the intuitive notion that random mutations should be associated with branches of the tree and formally are described via (unknown) parameters that can be used to calculate a probability for any pattern at the leaves. It happens to be the case that most phylogenetic evolutionary models have the property that the probabilities for leaf patterns are polynomials in the parameters. The simplest example to consider is the tree with an ancestral node and two leaves corresponding to two extant species, say “B” and “M”:

The molecular approach to evolution posits that multiple sites together should be used both to estimate parameters associated with evolution along the tree, and maybe even the tree itself. If one assumes that nucleotides mutate according to the 4-state general Markov model with independent processes on each branch, and one writes $p_{ij}$ for $\mathbb{P}(B=i,M=j)$ where i,j are one of A,C,G,T, then it must be the case that $p_{ij}p_{kl} = p_{il}p_{kj}$ . In other words, the pattern probabilities satisfy the polynomial equation

$p_{ij}p_{kl} - p_{il}p_{kj}=0$ .

In other words, for any parameters in the 4-state general Markov model, it has to be the case that when the pattern probabilities are plugged into the polynomial equation above, the result is zero. This equation is none other than the condition for two random variables to be independent; in this case the random variable corresponding to the nucleotide at B is independent of the random variable corresponding to the nucleotide at M.
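As a rough illustration of the independence condition just described, here is a minimal Python sketch. It uses the 2×2-minor form $p_{ij}p_{kl} - p_{il}p_{kj}$, which is the expression that vanishes whenever $p_{ij}$ factors as a product of a probability for B and a probability for M. The marginal distributions and the function names are hypothetical, chosen only for illustration, and the sketch simply takes the independence of B and M as given rather than deriving it from any evolutionary model.

```python
from fractions import Fraction
from itertools import product

NUCLEOTIDES = "ACGT"


def independent_joint(b_marginal, m_marginal):
    """Pattern probabilities p[i, j] = P(B=i) * P(M=j), assuming independence."""
    return {(i, j): b_marginal[i] * m_marginal[j]
            for i, j in product(NUCLEOTIDES, repeat=2)}


def invariant_values(p):
    """Evaluate p_ij * p_kl - p_il * p_kj for every choice of i, j, k, l."""
    return [p[i, j] * p[k, l] - p[i, l] * p[k, j]
            for i, j, k, l in product(NUCLEOTIDES, repeat=4)]


# Hypothetical marginal distributions (exact fractions avoid rounding noise).
b_marginal = {"A": Fraction(1, 10), "C": Fraction(2, 10),
              "G": Fraction(3, 10), "T": Fraction(4, 10)}
m_marginal = {n: Fraction(1, 4) for n in NUCLEOTIDES}

independent = independent_joint(b_marginal, m_marginal)
all_vanish = all(v == 0 for v in invariant_values(independent))

# A deliberately dependent distribution: B and M always carry the same nucleotide.
dependent = {(i, j): Fraction(1, 4) if i == j else Fraction(0)
             for i, j in product(NUCLEOTIDES, repeat=2)}
some_nonzero = any(v != 0 for v in invariant_values(dependent))

print(all_vanish, some_nonzero)
```

Every invariant evaluates to zero on the product distribution, while the perfectly correlated distribution violates at least one of them (for example $p_{AA}p_{CC} - p_{AC}p_{CA} = 1/16$).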

The example is elementary, but it hints at a powerful tool for phylogenetics. It provides an equation that must be satisfied by the pattern probabilities that does not depend specifically on the parameters of the model (which can be intuitively understood as relating to branch length). If many sites are available so that pattern probabilities can be estimated empirically from data, then there is in principle a possibility for testing whether the data fits the topology of a specific tree regardless of what the branch lengths of the tree might be. Returning to Mumford’s description of algebraic geometry, the variety of interest is the geometric object in “pattern probability space” where points are precisely probabilities that can arise for a specific tree, and the “ring of polynomials with the geometric properties of the locus” are the phylogenetic invariants. The relevance of the ring lies in the fact that if f and g are two phylogenetic invariants then that means that $f(P)=0$ and $g(P)=0$ for any pattern probabilities from the model, so therefore $f+g$ is also a phylogenetic invariant because $f(P)+g(P)=0$ for any pattern probabilities from the model (the same is true for $c \cdot f$ for any constant c). In other words, there is an algebra of phylogenetic invariants that is closely related to the geometry of pattern probabilities. As Mumford and Tate explain, Grothendieck figured out the right generalizations to construct a theory for any ring, not just the ring of polynomials, and therewith connected the fields of commutative algebra, algebraic geometry and number theory.
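The idea of estimating pattern probabilities from many sites and then checking the invariants can be sketched for the two-species example. This is a toy under stated assumptions: the aligned sequences are made up, the helper names are hypothetical, and the "test" is only the largest absolute value of the invariants $p_{ij}p_{kl} - p_{il}p_{kj}$ evaluated at the empirical frequencies. A real analysis would need a proper statistical test to account for sampling noise; this is not the method of any of the papers mentioned here.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGT"


def empirical_pattern_probabilities(seq_b, seq_m):
    """Estimate p[i, j] as the fraction of aligned sites showing pattern (i, j)."""
    if len(seq_b) != len(seq_m) or not seq_b:
        raise ValueError("sequences must be aligned and non-empty")
    counts = Counter(zip(seq_b, seq_m))
    n_sites = len(seq_b)
    return {(i, j): counts[(i, j)] / n_sites
            for i, j in product(NUCLEOTIDES, repeat=2)}


def max_invariant_violation(p):
    """Largest |p_ij * p_kl - p_il * p_kj| over all i, j, k, l."""
    return max(abs(p[i, j] * p[k, l] - p[i, l] * p[k, j])
               for i, j, k, l in product(NUCLEOTIDES, repeat=4))


# Made-up alignment in which every pattern occurs exactly once (uniform frequencies).
balanced = empirical_pattern_probabilities("AAAACCCCGGGGTTTT", "ACGTACGTACGTACGT")

# Made-up alignment in which the two sequences are identical at every site.
identical = empirical_pattern_probabilities("ACGT" * 4, "ACGT" * 4)

print(max_invariant_violation(balanced))
print(max_invariant_violation(identical))
```

The first alignment gives empirical frequencies that satisfy the invariants exactly, while the second is as far from independence as these toy frequencies allow. Note that the score does not involve any branch-length parameters, which is the point made above.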

The use of phylogenetic invariants for testing tree topologies is elegantly illustrated in a wonderful book chapter on phylogenetic invariants by mathematicians Elizabeth Allman and John Rhodes that starts with the simple example of the two-taxon tree and delves deeply into the subject. Two surfaces (conceptually) represent the varieties for two trees, and the equations $f_1(P)=f_2(P)=\ldots=f_l(P)=0$ and $h_1(P)=h_2(P)=\ldots=h_k(P)=0$ are the phylogenetic invariants. The empirical pattern probability distribution is the point $\hat{P}$ and the goal is to find the surface it is close to:

Figure 4.2 from Allman and Rhodes chapter on phylogenetic invariants.

Of course for large trees there will be many different phylogenetic invariants, and the polynomials may be of high degree. Figuring out what the invariants are, how many of them there are, bounds for the degrees, understanding the geometry, and developing tests based on the invariants, is essentially a (difficult unsolved) challenge for algebraic geometers. I think it’s fair to say that our book spurred a lot of research on the subject, and helped to create interest among mathematicians who were unaware of the variety and complexity of problems arising from phylogenetics. Nick Eriksson, Kristian Ranestad, Bernd Sturmfels and Seth Sullivant wrote a short piece titled phylogenetic algebraic geometry which is an introduction for algebraic geometers to the subject. Here is where we come full circle to Mumford’s obituary… the notion of a scheme is obviously central to phylogenetic algebraic geometry. And the expository article just cited is just the beginning. There are too many exciting developments in phylogenetic geometry to summarize in this post, but Elizabeth Allman, Marta Casanellas, Joseph Landsberg, John Rhodes, Bernd Sturmfels and Seth Sullivant are just a few of many who have discovered beautiful new mathematics motivated by the biology, and also have had an impact on biology with algebro-geometric tools. There is both theory (see this recent example) and application (see this recent example) coming out of phylogenetic algebraic geometry. More generally, algebraic statistics for computational biology is now a legitimate “field”, complete with a journal, regular conferences, and a critical mass of mathematicians, statisticians, and even some biologists working in the area. Some of the results are truly beautiful and impressive. My favorite recent one is this paper by Caroline Uhler, Donald Richards and Piotr Zwiernik providing important guarantees for maximum likelihood estimation of parameters in Felsenstein’s continuous character model.

But that is not the point here. First, Mumford’s sarcasm was unwarranted. Biologists certainly didn’t discover schemes but as Felsenstein and Lake’s work shows, they did (re)discover algebraic geometry. Moreover, all of the people mentioned above can explain schemes to biologists, thereby answering Mumford’s question in the affirmative. Many of them have not only collaborated with biologists but written biology papers. And among them are some extraordinary expositors, notably Bernd Sturmfels. Still, even if there are mathematicians able and willing to explain schemes to biologists, and even if there are areas within biology where schemes arise (e.g. phylogenetic algebraic geometry), it is fair to ask whether biologists should care to understand them.

The answer to the question is: probably not. In any case I wouldn’t presume to opine on what biologists should and shouldn’t care about. Biology is enormous, and encompasses everything from the study of fecal transplants to the wood frogs of Alaska. However I do have an opinion about the area I work in, namely genomics. When it comes to genomics, journalists write about revolutions, ~~personalized~~ precision medicine, curing cancer and data deluge. But the biology of genomics is for real, and it is indeed tremendously exciting as a result of dramatic improvements in underlying technologies (e.g. DNA sequencing and genome editing to name two). I also believe it is true that despite what is written about data deluge, experiments remain the primary, and the best, way to elucidate the function of the genome. Data analysis is secondary. But it is true that statistics has become much more important to genomics than it was even to population genetics at the time of R.A. Fisher, computer science is playing an increasingly important role, and I believe that somewhere in the mix of “quantitative sciences for biology”, there is an important role for mathematics.

What biologists should appreciate, what was on offer in Mumford’s obituary, and what mathematicians can deliver to genomics that is special and unique, is the ability to not only generalize, but to do so “correctly”. The mathematician Raoul Bott once reminisced that “Grothendieck was extraordinary as he could play with concepts, and also was prepared to work very hard to make arguments almost tautological.” In other words, what made Grothendieck special was not that he generalized concepts in algebraic geometry to make them more abstract, but that he was able to do so in the right way. What made his insights seemingly tautological at the end of the day, was that he had the “right” way of viewing things and the “right” abstractions in mind. That is what mathematicians can contribute most of all to genomics. Of course sometimes theorems are important, or specific mathematical techniques solve problems and mathematicians are to thank for that. Phylogenetic invariants are important for phylogenetics which in turn is important for comparative genomics which in turn is important for functional genomics which in turn is important for medicine. But it is the abstract thinking that I think matters most. In other words, I agree with Charles Darwin that mathematicians are endowed with an extra sense… I am not sure exactly what he meant, but it is clear to me that it is the sense that allows for understanding the difference between the “right” way and the “wrong” way to think about something.

There are so many examples of how the “right” thinking has mattered in genomics that they are too numerous to list here, but here are a few samples: At the heart of molecular biology, there is the “right” and the “wrong” way to think about genes: evidently the message to be gleaned from Gerstein et al.’s “What is a gene post ENCODE? History and Definition” is that “genes” are not really the “right” level of granularity but transcripts are. In a previous blog post I’ve discussed the “right” way to think about the Needleman-Wunsch algorithm (tropically). In metagenomics there is the “right” abstraction with which to understand UniFrac. One paper I’ve written (with Niko Beerenwinkel and Bernd Sturmfels) is ostensibly about fitness landscapes but really about what we think the “right” way is to look at epistasis. In systems biology there is the “right” way to think about stochasticity in expression (although I plan a blog post that digs a bit deeper). There are many, many more examples… way too many to list here… because ultimately every problem in biology is just like in math… there is the “right” and the “wrong” way to think about it, and figuring out the difference is truly an art that mathematicians, the type of mathematicians that work in math departments, are particularly good at.

Here is a current example from (computational) biology where it is not yet clear what “right” thinking should be despite the experts working hard at it, and that is useful to highlight because of the people involved: With the vast number of human genomes being sequenced (some estimates are as high as 400,000 in the coming year), there is an increasingly pressing fundamental question about how the (human) genome should be represented and stored. This is ostensibly a computer science question: genomes should perhaps be compressed in ways that allow for efficient search and retrieval, but I’d argue that fundamentally it is a math question. This is because what the question is really asking, is how should one think about genome sequences related mostly via recombination and only slightly by mutation, and what are the “right” mathematical structures for this challenge? The answer matters not only for the technology (how to store genomes), but much more importantly for the foundations of population and statistical genetics. Without the right abstractions for genomes, the task of coherently organizing and interpreting genomic information is hopeless. David Haussler (with coauthors) and Richard Durbin have both written about this problem in papers that are hard to describe in any way other than as math papers; see Mapping to a Reference Genome Structure and Efficient haplotype matching and storage using the positional Burrows-Wheeler transform (PBWT). Perhaps it is no coincidence that both David Haussler and Richard Durbin studied mathematics.

But neither David Haussler nor Richard Durbin are faculty in mathematics departments. In fact, there is a surprisingly long list of very successful (computational) biologists specifically working in genomics, many of whom even continue to do math, but not in math departments, i.e. they are former mathematicians (this is so common there is even a phrase for it “recovering mathematician” as if being one is akin to alcoholism– physicists use the same language). People include Richard Durbin, Phil Green, David Haussler, Eric Lander, Montgomery Slatkin and many others I am omitting; for example almost the entire assembly group at the Broad Institute consists of former mathematicians. Why are there so many “formers” and very few “currents”? And does it matter? After all, it is legitimate to ask whether successful work in genomics is better suited to departments, institutes and companies outside the realm of academic mathematics. It is certainly the case that to do mathematics, or to publish mathematical results, one does not need to be a faculty member in a mathematics department. I’ve thought a lot about these issues and questions, partly because they affect my daily life working between the worlds of mathematics and molecular biology in my own institution. I’ve also seen the consequences of the separation of the two cultures. To illustrate how far apart they are I’ve made a list of specific differences below:

- Biologists publish in “glamour journals” such as Science, Nature and Cell where impact factors are high. Nature publishes its impact factor to three decimal digits accuracy (42.317). Mathematicians publish in journals whose names start with the word Annals, and they haven’t heard of impact factors. The impact factor of the Annals of Mathematics, perhaps the most prestigious journal in mathematics, is 3 (the journal with the highest impact factor is the Journal of the American Mathematical Society at 3.5).
- Mathematicians post all papers on the ArXiv preprint server prior to publication. Not only do biologists not do that, they are frequently subject to embargos prior to publication.
- Mathematicians write in LaTeX, biologists in Word (a recent paper argues that Word is better, but I’m not sure).
- Biologists draw figures and write papers about them. Mathematicians write papers and draw figures to explain them.
- Mathematicians order authors alphabetically, and authorship is awarded if a mathematical contribution was made. Biologists’ author lists have two gradients from each end, and authorship can be awarded for payment for the work.
- Biologists may review papers on two week deadlines. Mathematicians review papers on two year deadlines.
- Biologists have their papers cited by thousands, and their results have a real impact on society; in many cases diseases are cured as a result of basic research. Mathematicians are lucky if 10 other individuals on the planet have any idea what they are writing about. Impact time can be measured in centuries, and sometimes theorems turn out to simply not have been interesting at all.
- Biologists don’t teach much. Mathematicians do (at UC Berkeley my math teaching load is 5 times that of my biology teaching load).
- Biologists value grants during promotion cases and hiring. Mathematicians don’t.
- Biologists have chalk talks during job interviews. Mathematicians don’t.
- Mathematicians have a jobs wiki. Biologists don’t.
- Mathematicians write ten page recommendation letters. Biologists don’t.
- Biologists go to retreats to converse. Mathematicians retreat from conversations (my math department used to have a yearly retreat that was one day long and consisted of a faculty meeting around a table in the department; it has not been held the past few years).
- Mathematics graduate students teach. Biology graduate students rotate.
- Biology students take very little coursework after their first year. Mathematics graduate students take two years of classes (on this particular matter I’m certain mathematicians are right).
- Biologists pay their graduate students from grants. Mathematicians don’t (graduate students are paid for teaching sections of classes, usually calculus).
- Mathematics full professors that are female is a number (%) in the single digits. Biology full professors that are female is a number (%) in the double digits (although even added together the numbers are still much less than 50%).
- Mathematicians believe in God. Biologists don’t.

How then can biology, specifically genomics (or genetics), exist and thrive within the mathematics community? And how can mathematics find a place within the culture of biology?

I don’t know. The relationship between biology and mathematics is on the rocks and prospects are grim. Yes, there are biologists who do mathematical work, and yes, there are mathematical biologists, especially in areas such as evolution or ecology who are in math departments. There are certainly applied mathematics departments with faculty working on biology problems involving modeling at the macroscopic level, where the math fits in well with classic applied math (e.g. PDEs, numerical analysis). But there is very little genomics or genetics related math going on in math departments. And conversely, mathematicians who leave math departments to work in biology departments or institutes face enormous pressure to not focus on the math, or when they do any math at all, to not publish it (work is usually relegated to the supplement and completely ignored). The result is that biology loses out due to the minimal real contact with math– the special opportunity of benefiting from the extra sense is lost, and conversely math loses the opportunity to engage biology– one of the most exciting scientific enterprises of the 21st century. The mathematician Gian-Carlo Rota said that “The lack of real contact between mathematics and biology is either a tragedy, a scandal, or a challenge, it is hard to decide which”. He was right.

The extent to which the two cultures have drifted apart is astonishing. For example, visiting other universities I see the word “mathematics” almost every time precision medicine is discussed in the context of a new initiative, but I never see mathematicians or the local math department involved. In the mathematics community, there has been almost no effort to engage and embrace genomics. For example, the annual joint AMS-MAA meetings always boast a series of invited talks, many on applications of math, but genomics is never a represented area. Yet in my junior-level course last semester on mathematical biology (taught in the math department) there were 46 students, more than any other upper division elective class in the math department. Even though I am a 50% member of the mathematics department I have been advising three math graduate students this year, equivalent to six for a full-time member, a statistic that probably ranks me among the busiest advisors in the department (these numbers do not even reflect the fact that I had to turn down a number of students). Anecdotally, the numbers illustrate how popular genomics is among math undergraduate and graduate students, and although hard data is difficult to come by, my interactions with mathematicians everywhere convince me the trend I see at Berkeley is universal. So why is this popularity not reflected in support of genomics by the math community? And why don’t biology journals, conferences and departments embrace more mathematics? There is a hypocrisy of math for biology. People talk about it but when push comes to shove nobody wants to do anything real to foster it.

Examples abound. On December 16th UCLA announced the formation of a new Institute for Quantitative and Computational Biosciences. The announcement leads with a photograph of the director that is captioned “Alexander Hoffmann and his colleagues will collaborate with mathematicians to make sense of a tsunami of biological data.” Strangely though, the math department is not one of the 15 partner departments that will contribute to the Institute. That is not to say that mathematicians won’t interact with the Institute, or that mathematics won’t happen there. E.g., the Institute for Pure and Applied Mathematics is a partner as is the Biomathematics department (an interesting UCLA concoction), not to mention the fact that many of the affiliated faculty do work that is in part mathematical. But formal partnership with the mathematics department, and through it direct affiliation with the mathematics community, is missing. UCLA’s math department is among the top in the world, and boasts a particularly robust applied mathematics program, many of whose members work on mathematical biology. More importantly, the “pure” mathematicians at UCLA are first rate and one of them, Terence Tao, is possibly the most talented mathematician alive. Wouldn’t it be great if he could be coaxed to think about some of the profound questions of biology? Wouldn’t it be awesome if mathematicians in the math department at UCLA worked hard with the biologists to tackle the extraordinary challenges of “precision medicine”? Wouldn’t it be wonderful if UCLA’s Institute for Quantitative and Computational Biosciences could benefit from the vast mathematics talent pool not only at UCLA but beyond: that of the entire mathematics community?

I don’t know if the omission of the math department was an accidental oversight of the Institute, a deliberate snub, or if it was the Institute that was rebuffed by the mathematics department. I don’t think it really matters. The point is that the UCLA situation is ubiquitous. Mathematics departments are almost never part of new initiatives in genomics; biologists are all too quick to glance the other way. Conversely, the mathematics community has shunned biologists. Despite two NSF Institutes dedicated to mathematical biology (the MBI and NIMBioS) almost no top math departments hire mathematicians working in genetics or genomics (see the mathematics jobs wiki). In the rooted tree in the figure above B can represent Biology and M can represent Mathematics and they truly, and sadly, are independent.

I get it. The laundry list of differences between biology and math that I aired above can be overwhelming. Real contact between the subjects will be difficult to foster, and it should be acknowledged that it is neither necessary nor sufficient for the science to progress. But wouldn’t it be better if mathematicians proved they are serious about biology and biologists truly experimented with mathematics?

Notes:

1. The opening paragraph is an edited copy of an excerpt (page 2, paragraph 2) from C.P. Snow’s “The Two Cultures and The Scientific Revolution” (The Rede Lecture 1959).
2. David Mumford’s content on his site is available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License, and I have incorporated it in my post (boxed text) unaltered according to the terms of the license.
3. The meaning of the word “invariant” in “phylogenetic invariants” differs from the standard meaning in mathematics, where invariant refers to a property of a class of objects that is unchanged under transformations. In the context of algebraic geometry, classic invariant theory addresses the problem of determining polynomial functions that are invariant under transformations from a linear group. Mumford is known for his work on geometric invariant theory. An astute reader could therefore deduce from the term “phylogenetic invariants” that the term was coined by biologists.

61 comments

December 30, 2014 at 10:44 am

Damian Kao

From my experience, the frosty attitude of biologists towards mathematics is partly rooted in being intimidated by the subject and partly in the perception that it lacks utility.

The intimidation and, as a result, unwarranted back-lash is more indicative of the academic culture where people just don’t want to seem stupid. Academics are prideful creatures. I am not sure if this can be fixed.

The perception that mathematics is not very useful is short-sighted. While I agree that practical applications of mathematics might not apply to a lot of the experimental work being conducted (is that a fault of the experimental design?), I do think that mathematics offers its students a way to think about problems in a consistent manner. There are a lot of hand-wavy abstractions in biology that could benefit from a rigid mental framework.

What I’ve always enjoyed about the little mathematics I’ve encountered is that it is explicit in the best way. It may take me a couple of days to digest a technical mathematics-heavy paper, but at least I know that every part of the thought process is recorded down explicitly in a consistent mathematical language. I cannot say the same for a lot of biology papers where definitions of terms can possibly be different depending on what lab you are in or the historical context of the field.

December 30, 2014 at 11:45 am

Liana Lareau

A lot to like (and think about) here, but I particularly appreciate your Gian-Carlo Rota quote. When I took his probability class I remember being afraid he’d look down on my choice to add a double major in biology. Later I learned he’d been very interested in computational biology; he died that year before I had a chance to talk to him about it but it encouraged me to keep going in that direction.

December 30, 2014 at 12:00 pm

bckirkup

I’m a biologist, my brother is a mathematician. I cannot say that I see mathematics as somehow seeing everything in the right way. Mathematicians, including those other than my immediate relations, have ways of deciding which problems are interesting that pose difficulties for a biologist. They even pose difficulties for engineers. It seems to take a number of steps of translation to get from mathematics to many of the sciences; and from there, several steps to real applications. By ‘real applications,’ I am depicting the gap between research chemistry and managing an oil refinery, or research microbiology and a hospital laboratory.

There is real danger when people get their hands on tools with which they are inexperienced. It takes time to even appreciate the skill of a master.

Despite the vast gap that I have experienced, I am not gloomy about the relationship between the fields, because there are growing classes of intermediaries. A sort of multivariate stepped impedance matching. The requirement for mediation certainly appears to slow the progress of solving world problems, but on the other hand, how well did it work out for a mathematician to directly change society in 1969?

December 30, 2014 at 9:08 pm

jcbeer

It seems to me that there exists plenty of contact between mathematics and biology; in fact there are entire disciplines that apply mathematics to biology. These are not purely mathematics disciplines, but interdisciplinary fields such as bioinformatics, computational biology, and biostatistics. Why should we prefer that genomics and genetics work be done in math departments rather than in these departments? Or do you think that mathematicians can make unique contributions where these other disciplines cannot?

December 31, 2014 at 11:00 am

Jonathan Badger

I’m a bioinformatician/computational biologist/computational genomicist, and while I know more mathematics than most bench types, my knowledge of it is pretty shallow compared to a real mathematician. Just like bench biologists know that sometimes they have to work with real chemists despite knowing a bit of chemistry themselves, computational biologists need to know when real mathematicians are needed.

December 31, 2014 at 12:42 pm

Lior Pachter

As I acknowledged in my post, there are contributions to be made by statisticians, computer scientists and many others. I just think that mathematicians can make their own unique contributions, and I’m not claiming those contributions would be more or less important, just that they could be different and valuable.

December 31, 2014 at 6:59 am

Titus Brown

I’ve benefitted greatly from an undergrad math degree, as well as from having been trained in physics labs, while getting my PhD in a developmental molecular biology lab and then moving into a CS faculty position. (My first independent paper applied percolation theory to study how Bloom filters could be used to accurately store De Bruijn sequence graphs for metagenome assembly ;).

From being more of a mongrel than many, I would say that my experience is that the different fields have different and often complementary ways of thinking, so it’s valuable to get serious cultural immersion in many fields. I’m not sure I’d go any further than that.

December 31, 2014 at 11:07 am

Christos Ouzounis

We probably need a new mathematics for a new biology, most likely to come from mathematicians. Problem is that learning curves for both disciplines are very steep. Few individuals can (and do) master both.

December 31, 2014 at 12:46 pm

Lior Pachter

One positive recent development is a large increase in double majors in mathematics and biology or statistics and biology. I was reminded of this last night when I met an alumna of my Math 10 class who then decided to double major in statistics and biology (and is now applying to medical school). I see that happening a lot more these days than it used to (at least here at UC Berkeley).

December 31, 2014 at 12:00 pm

a.c.r

Be aware you’re linking to creationist literature (your transformingteachers.org link) that makes some patently false claims (in the page itself, not just the site as a whole). If you want to do that, fine; I just write this in case you did it by accident. I understand linking is not endorsement.

(I gather from your interest in “conserved regions using phylogenetic methods” that you’re no creationist yourself.)

December 31, 2014 at 12:40 pm

Lior Pachter

Yes, I did realize that and I’m certainly not a creationist… but based on my experience in mathematics and biology I think the numbers quoted in the post are approximately true.

December 31, 2014 at 12:09 pm

j2kun

As a mathematician, what I don’t appreciate is the double standard implied by Nature’s response. A mathematician (or engineer, or physicist, or sociologist) is expected to know what genes are and what a phylogenetic tree is, or if not, to look up the concepts or to skip the article in question. Yet not only do they claim half of all Nature readers will have never heard of a “higher degree polynomial” or “complex number,” but also that these concepts are too technical to look up!

You claim the problem is that mathematicians are not giving examples biologists can understand (and that there are many great examples to give). That may be part of it. But an equally large part is that scientists aren’t willing to learn the most basic terms of mathematics.

December 31, 2014 at 1:00 pm

KCd

You write “Biologists have chalk talks during job interviews. Mathematicians don’t.” Surely this is backwards (as far as stereotypes go); I have seen some job talks in math using slides, but most use a blackboard.

December 31, 2014 at 2:45 pm

Lior Pachter

The “chalk talk” is a biology tradition of having the job candidate give a follow up talk to the main seminar, usually only to faculty, and typically without slides.

December 31, 2014 at 1:38 pm

grepgrok

Really enjoyed this post! As a physics major who went to medical school, I must say, I became very depressed as I slowly learned I would never have the world-turned-inside-out experiences in grad school that I had every day as a Physics undergrad, though I should have gotten the hint from my biochemistry professor, who also came to biology from physics, when he told me “Physicians don’t do math. You’ll see.”

It has been a long road, 10 years now, to retool my education to accommodate a space for math. Working my way through Durbin and Felsenstein’s books right now, solving Rosalind problems along the way. I have to say, the open source community deserves a lot of props for creating tools (e.g. iPython) that allow an independent learner to grapple with this stuff.

December 31, 2014 at 2:35 pm

Mike White

Your point that mathematicians have the right idea about requiring more graduate coursework deserves an entire blog post of its own. I’d be interested in hearing your thoughts.

The low course load in biology programs certainly allows people to dive right into research, but it seems like the consequence is that the common knowledge base among trained biologists is very small, limited mostly to what we learned as undergraduates.

December 31, 2014 at 5:53 pm

idoerg

Lior, to understand something about the nature of this gap and how it formed, I recommend reading “The Growth of Biological Thought” by Ernst Mayr. Published in the early ’80s, it describes why large aspects of biology grew with little or no quantitative support: in many cases, because it was simply irrelevant to the science. Of course biology always had quantitative components, including population biology, ecology, biophysics and neurobiology (all preceding bioinformatics, and other information-rich subfields).

Riveting and thoughtful post, as usual.

January 1, 2015 at 3:26 pm

Tom Chou

Nice article. I see you saw our “initiative” at UCLA and mentioned our dept. Indeed the numerous administrators at UCLA are very adept at forming new concoctions and “movements”! There is some backstory as to why Math (and in my opinion Physics) is not a big part of this.

BTW, I think the math-bio combination has been an exercise in extremes. What I mean is, there was an identified need for more quantitative analysis and modeling in the biosciences. But most of the attention has focused on mathematics and statistics. Forgotten are the legions of physicists, chemists, and engineers who in my opinion could have had a much more direct and immediate impact on biology. Somehow a biologist, when told basic math is required (maybe calculus!?) would then seek out a mathematician. I can see how most problems would not intrinsically interest a mathematician, but that a physicist would find “nice.” Basically, biologists in general do not have a high enough resolution on the other physical sciences.

In the end though, I would say what is most dangerous is a biologist who thinks they understand mathematical modeling, rather than the other way round. They go off and do the “math” or modeling or analysis themselves, spreading wrongness. This 9 steps back, and 10 steps forward approach should have been better mitigated early on.

I do think that the language barrier is MUCH higher between math and biology than it is between physics and biology, or say, chemical engineering and biology. I think, sadly, this whole effort of quantifying the biosciences was carried out rather inefficiently by not planning and understanding what all fields had to offer. This is why there is some division between physics and math approaches to biology. At UCLA, I would say there is even a cultural/philosophical difference between CS and math/stats approaches to biology, which may contribute to why Math/Physics do not seem to be so gung-ho about this old initiative.
(I say old because UCLA tried to initiate a similar comp. bio. initiative in 2000, but attempts at nepotistic hiring by members of the large search committee at the time led to gridlock and the money being given to CNSI instead).

January 1, 2015 at 4:23 pm

Norman Yarvin

When writing something like this obituary for non-mathematicians to read, whenever one is tempted to write “beauty” one might consider substituting “terror” — because that’s more or less how outsiders perceive it, at least if they think they might have to learn it. “The power and terror of his work…”

Likewise, for “unique” one might substitute “crazy”. “His crazy skill was to eliminate all unnecessary hypotheses and burrow into an area…”

May 26, 2015 at 8:15 pm

Jerry R

haha!! 🙂

January 1, 2015 at 6:24 pm

The biggest regret of my undergraduate and grad school education is how little time and opportunity there was to take math classes. Sure, nobody stopped me from doing that, but it’s not as if there was the kind of time one needs to dedicate to the subject to become sufficiently immersed in it and understand it in depth.

This is an area in which IMHO biology departments should do a lot of long and hard thinking regarding the priorities of their graduate programs and how they should be revamped. There is no reason why mathematicians and physicists should be taking 2 years of real, serious courses in grad school while biologists should be sent to the bench basically the very moment they arrive on campus in their first year (which is what my and many others’ experience has been). That hurts biology education too, but it is really detrimental when it comes to the absence of math in the curriculum. Of course, one can’t possibly hope to even get to things like schemes within a biology program (nor is it really needed for everyone), but still, there is a lot of essential foundation that needs to be provided but is currently not, and it’s up to the individual students to figure it out on their own.

The problem is that without guidance from more experienced people one only realizes what areas of math it would have been good to have learned after reading a lot of primary research papers, at which point it is often too late…

One can, of course, get really cynical and blame the way students are often seen as cheap and disposable labor — what incentive is there from that perspective to invest a lot of time and effort to educate them into more than they will need to function at the bench? I really hope that kind of thinking has no role in the genesis of the current situation, but there are reasons to think otherwise…

January 2, 2015 at 2:50 pm

barry goldman (@barrygoldman1)

I’ve found this divide puzzling all through my undergraduate education. And I think it operates at MUCH more basic levels than the ones you are describing.

I started life off informally drenched in biology and mathematics and then computer science. As I got my start in both fields from the cradle so to speak, I guess I was not infected with this academic divide between the two fields. for odd reasons i did not do biology in college (except for a mindfucking molecular bio of cell course). eventually i ended up with a degree in mathematics and while i can still invent a curious problem and find a proof for a solution (i.e. a few years back i was proud that when someone asked me about why a deck of cards gets randomly shuffled after only 7 shuffles i was in about 2 weeks able to come up with an adequate definition for myself of what that question would mean and came very close to getting started on the solution that persi diaconis found), i never quite felt i got the knack of professional mathematics.

Nevertheless i always appreciated how my mathematics background (and computer science) allowed me to engage in the ability to generalize and imagine really fucked up ways of looking at things. (numbers are points on a line or they are sequences of digits!) In fact, a most basic thing about mathematics which i constantly point out to my tutoring students is: in mathematics you usually have 2 or 3 totally different toolsets to apply to any problem: geometrical/visual, algebraic/verbal, arithmetical/gut calculation.

Other mathematical habits I also found valuable were going around and just inventing arbitrary structures and seeing what happens. Allied with this is the experience of creating a set of definitions (a group) or a set of constraints (having only one and self as factors) and then figuring out all the examples that follow these and exploring the variety you get. (i.e. def of platonic solid or prime number or group gives you some wild variety but not infinitely chaotic variety) I believe it is this background that gives me the intuition that of course given the properties of bacteria they can have evolved into all life on earth, or given the properties of chemical interactions, two dozen different elements can spontaneously form into life.

I remember in high school and then later finding it mind blowing when I understood the rudimentary accounts of what happens when you combine two different formalisms, like put a group structure AND a topological structure on a space and see what happens.

I also appreciated how my biology background gave me a WEALTH of crazy tangled messy examples full of exceptions to experience.

While the particular topics you discuss are way over my head (not even sure i would take the time now to attempt to nail them (i felt the obit was rather poorly written and would have required me weeks maybe months of study to appreciate)) i did and do feel that even at the undergraduate level math and biology were missing out SO MUCH not hanging with each other.

All areas of biology are so full of crazy complexity that thinking about them would benefit from having had the experience of making the abstractions one does in basic mathematics (the nested sequence of number spaces, the crazy habits of things like studying functions between number spaces and then .. what the hell .. make a number space out of functions… Topology..)

I remember thinking what a gold mine for inventing new math when i was studying the geometries and processes in the molecular biology of the cell. or what about biological classification? it is so arcane, but the new cladistics is a disaster as far as being useful to organize organisms you see in the wild. and how does phylogenetic history depend on population dynamics, feedbacks at the level of developmental networks, ecological interactions. Why are some lineages so hyperdiverse and others not? seems to me there is a goldmine there for mathematicians to see the right kinds of abstractions.

Especially i am struck by the fact that most people in evolution (which is a highly mathematically abstract structure what with having to think on multiple hierarchical levels at the same time) have so little appreciation for mathematics. there seems especially throughout history little appreciation for the properties of dynamical systems to be able to go through discrete bifurcations when various parameters are varied continuously and little appreciation for the concepts of attractors in dynamical systems. surely these concepts can help with the disaster in black and white thinking between ‘gradualists’ and ‘punctuationalists’ of various sorts. I always found it disappointing.

One thing i recall from my attempt at grad biology was a discussion about reductionism vs holism. I couldn’t understand the distinction and I realized it was because of my background in computer science where we are constantly thinking top down and bottom up back and forth. we look at problems and try to abstract about them and we also start with building blocks and try to build abstract complex behaviors from them. I invented a term half way between the two concepts: constructivism, you constantly take apart and put back together.

I do recall one thing about the divide very vividly though: almost ANY biology professor whose research I asked about could give me an understanding of what they were doing and get me to see why it was puzzling. I recall spending afternoons talking to my bio professor about the possibly messy tangled mechanisms in the evolution of the ribosome or the lab techniques used to learn about them. But when it came to math professors, it seemed most of the time we both admitted it would take YEARS for me to understand even the LANGUAGE he was couching his questions in, let alone then being able to think about the puzzle and appreciate it.

Though I generally enjoyed doing puzzles and messing with numbers and dynamical systems and looking for patterns and doing some crazy proofs, I generally had a hard time appreciating the motivations behind the various directions my mathematical coursework would take. I could not appreciate Complex analysis for instance! And second semester abstract algebra with maclane and birkhoff’s survey of modern algebra or second semester analysis with spivak’s book on calculus on manifolds were hopeless for me.

Seems like the main thing going on here is that at this level, I think that there are just plain FEWER human beings that can do this kind of mathematical thinking than can appreciate organismic biology, or molecular biology or population biology!

January 3, 2015 at 6:32 am

Sayan Mukherjee

The question of why mathematics, in contrast to computer science and statistics, has not had a strong connection with genomics is interesting. Mathematics has had strong connections to other areas of biology: population genetics is often mentioned, and another example is morphology (D’Arcy Thompson comes to mind).

I was discussing this with some people the other night (I am tempted to make a joke about a statistician, a probabilist, and a biologist walking into a bar) and I think there are a few reasons for the lack of interaction.

1) Time scale and funding structure. The divide in disciplines is a key issue. Math moves slowly and probably needs, say, 20-30 years to bring in new applications if left to its own natural cycle (these numbers are of course approximate). Genomics as a hiring discipline just has not been around long enough to fall into the “natural hiring cycle” unless special initiatives are taken. The applied ends of stats and computer science move faster and adopt application areas faster; here both funding structure and what is valued in terms of impact factors and visibility have a strong role.

2) History. Roughly, the two fields that genomics heavily drew from are molecular biology and statistical genetics.

Statistical genetics and statistics have had a long association (both disciplines share common founders such as R.A. Fisher) so the relation between stats and the stochastic data analysis end of genomics has a history. In much of genomics, stochastic models and data analysis methods that incorporate randomness are required. The stochastically oriented people in math departments have traditionally been probabilists; to be flip, if you take a probabilist and make them do data analysis and inference you get a statistician (often a Bayesian statistician). The divide between stats and math I think is for another post and blog.

Molecular biology has historically been deterministic and did not use mathematics. This changed with sequencing technology and the need for alignment tools. Early on mathematicians were involved in developing alignment tools as well as understanding alignment scores, for example Waterman of Smith-Waterman and Karlin of BLAST. The alignment questions and problems are very algorithmic and quickly fell under the purview of computer science. Now where discrete math, algorithms, theoretical computer science divides between computer science and math is another issue. I will say that even in computer science the more applied end has been more involved in genomics.

The question to me is whether the genomics community, or at least the quantitative methods development end of it, thinks math should be more involved and, if so, how to make this happen.

January 3, 2015 at 6:34 am

Peter M

The relatively low impact factor of journals in mathematics compared to those in biology is most likely an artifact of the common definitions used for this statistic. Most journal impact factors count the number of citations to papers published in the particular journal in the 2 (or sometimes 3) years following publication. This cut-off point may be fine for fast-moving fields like medicine or biology, but is ridiculous for pure mathematics or any mathematical discipline. For instance, I am a computer scientist working in computational argumentation, and my colleagues and I regularly cite (and draw upon) the work of Aristotle. Likewise, the theory of rational expectations which so dominated mainstream economics until the Crash of 2008 arose in work that took over a decade to be understood (and thus cited) by the discipline. A two-year cutoff for journal impact factors makes no sense in mathematical disciplines.
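
The effect of the citation window can be illustrated with a toy calculation. What follows is only a sketch under stated assumptions: the function name `windowed_impact_factor`, the two journals and all of the publication and citation data are made up for illustration, and the formula is a simplified window-based ratio rather than any official impact factor definition.

```python
def windowed_impact_factor(pub_years, citations, year, window=2):
    """Citations received in `year` by papers published in the `window`
    years before `year`, divided by the number of those papers.

    pub_years: dict mapping paper id -> publication year (hypothetical data)
    citations: list of (cited paper id, year the citation appeared)
    """
    eligible = {pid for pid, y in pub_years.items() if year - window <= y < year}
    if not eligible:
        return 0.0
    hits = sum(1 for pid, cited_in in citations
               if pid in eligible and cited_in == year)
    return hits / len(eligible)


# Hypothetical fast-moving journal: recent papers are cited right away.
fast_pubs = {"f1": 2012, "f2": 2013}
fast_cites = [("f1", 2014), ("f1", 2014), ("f2", 2014), ("f2", 2014)]

# Hypothetical slow-moving journal: most citations in 2014 go to papers
# that are already about a decade old.
slow_pubs = {"s1": 2004, "s2": 2005, "s3": 2012, "s4": 2013}
slow_cites = [("s1", 2014), ("s1", 2014), ("s2", 2014), ("s2", 2014),
              ("s3", 2014)]

fast_2yr = windowed_impact_factor(fast_pubs, fast_cites, 2014)
slow_2yr = windowed_impact_factor(slow_pubs, slow_cites, 2014)
slow_10yr = windowed_impact_factor(slow_pubs, slow_cites, 2014, window=10)
print(fast_2yr, slow_2yr, slow_10yr)
```

With a two-year window the slow journal scores a quarter of what the fast one does, because four of its five citations fall outside the window; widening the window to ten years counts them. The numbers mean nothing in themselves and only show how sensitive the statistic is to the cutoff.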

January 3, 2015 at 7:10 am

It’s not just the length of the cycle of absorption of new knowledge into the community, it’s also two other factors:

1) How many papers are published (almost an order of magnitude more in biology)
2) How many citations there are in each paper (again, usually significantly more in biology, and much more these days with OA journals with no length limits)

January 5, 2015 at 5:50 am

Peter M

In addition, citation practices differ between disciplines, as many have observed. In mathematics, if you prove a long-standing conjecture, the group of people who studied the conjecture will move onto other problems, and sometimes even other sub-domains of mathematics. Thus, a very important paper may have the effect of closing down a field of enquiry, rather than opening it up. Who, now, is working on the Fermat-Wiles Theorem, for instance? This is very different to what happens in other academic disciplines.

January 3, 2015 at 8:44 am

There is another force at play. The majority of the world’s pure math funding is security-related and the publications are not necessarily in the open literature. I sincerely believe that the ‘dual use research of concern’ biology is a much smaller fraction of total biology.

January 3, 2015 at 11:05 pm

Avi Levy

In the second to last paragraph, “biologists are all to quick to” -> “biologists are all too quick to”.

January 3, 2015 at 11:23 pm

Lior Pachter

Thanks! Fixed.

January 4, 2015 at 9:07 am

Gp Singh

Impact factors are a much larger concern for biologists than even physicists because most of them are in it for recognition, promotion, etc., and not for the science. I think the author is just trying to be politically correct when he says that both are “comparable in intelligence”. Based on my experience it is simply not true. It is very easy for biologists to publish based on a lot of (students’) work and (grant) money, which cannot be the case for mathematicians.

January 5, 2015 at 7:11 am

Reinhard Laubenbacher

In my experience, this entry describes the state of affairs fairly accurately, sad as it is. Cultural differences are certainly one factor, and the prevalent lack of more than superficial training in mathematics and statistics for biologists is another, combined with the relative lack of interest within the mathematics community in endeavors that don’t obviously have cool theorems as low-hanging fruit. But the real reason, I suspect, is that neither mathematics nor biology is ready for a relationship that resembles that of physics and mathematics. Currently, the most pressing problem in the life sciences and biomedicine, especially in genomics, is that there is now lots of data (of varying quality) from which one needs to somehow extract information. This is certainly where more mathematically trained scientists can help do a better job with data analytics. In fact, this represents an important employment opportunity for people with math and stats degrees. (See my recent contribution to the blog “On Teaching and Learning Mathematics” of the American Mathematical Society (http://blogs.ams.org/matheducation/).) But what Lior is rightly concerned about is that there are very few examples of where mathematical theory can help organize biological information into knowledge, along the lines of the role mathematical physics has played. This is where the real potential of the collaboration lies. The NSF used to have a program that promoted “computational thinking” in the sciences. While bioinformatics has had a profound impact on biology in redefining the research agenda toward things that generate data to be analyzed, it has not promoted the view that mathematical theory could be the paradigm in which to frame biological reality, based on those data. The NSF also used to have a program supporting the development of theory in biology, which was great and much needed. Unfortunately, this has been discontinued.

Few mathematicians and few biologists know enough about each other’s fields to do what the “Berkeley School,” with collaborators, did for the use of algebraic statistics in evolutionary biology. Until the need for theory in biology becomes obvious to biologists, and until the need for collaboration with the life sciences becomes obvious to the mathematics community, we have no choice but to rely on a small “coalition of the willing” to do the sort of work Lior’s blog points to as the real game changer.

January 5, 2015 at 8:07 am

The problem that I don’t see often stated is that there is a minimal level of mutual understanding of each other’s fields that is needed for these areas of fruitful collaboration to be identified. And because mathematics and biology are so divergent, this is very difficult to achieve. There is a significant asymmetry here too – a mathematician can learn a lot of biology much more easily than a biologist can learn a lot of math (that’s why there are so many mathematicians who successfully moved into biology), because biology is mainly broad, while math is very deep.

But for the full potential to be realized, it is necessary for biologists to learn more math, which can only be meaningfully achieved at the training level (starting as early as high school, not at the graduate level). I have seen mathematicians saying that the role of the mathematician is to provide tools for the scientists to use in their research. Well, that can only work out if the scientists know what tools are available, and this is not at all the case at the moment because of the huge upfront cost associated with learning advanced math. I know that in general, “knowing more math” might help me with my research. The problem is that I don’t really know what math, and neither I nor anyone else can learn all the math in the world (because the subject is really huge), much less at depth. Now if you need to have gone through several deep courses to even have an idea what people are talking about (as is the case with algebraic geometry), you’re most likely not going to invest that effort if there isn’t a certain payoff at the end. Especially if as a young scientist you are under absurd pressure to publish as much as possible so that you don’t find yourself out of the system and out of a job at the age of 35-40, with the non-transferable skills developed by doing bench science and the “overqualified” stamp on your forehead that a PhD gives you.

Physics and math have had a fruitful relationship because the subjects have never been that far apart, and because there is a commonly and fairly well understood, quite large and deep body of mathematics that a theoretical physics PhD needs to learn, that was arrived at in much better times for science in general. In biology we can’t even start bridging the gap, and because of how messed up institutionally and culturally biomedical research is right now, that’s not going to change any time soon.

September 25, 2020 at 9:24 am

Nam Nguyen

With my limited understanding of mathematics and biology I’d agree on the notion “In biology we can’t even start bridging the gap”.

For what it’s worth, the even sadder reality is that all 3 disciplines Mathematics, Physics, Biology share fundamental underlying, underpinning _logic_ principles:

– Absolute incompleteness
– Relativistic choice

which are imbedded in the trio per the following:

– Mathematics: “satisfaction-is-not-absolute”.
– Physics: Uncertainty principle.
– Biology: assumed(phenotype) ⇒ unknown(genotype).

———————————————————————–

http://jdh.hamkins.org/satisfaction-is-not-absolute/

“Many mathematicians and philosophers seem to share this perspective. The truth of an arithmetic statement, to be sure, does seem to depend entirely on the structure (N,+,.,0,1,<), with all quantifiers restricted to N and using only those arithmetic operations and relations, and so if that structure has a definite nature, then it would seem that the truth of the statement should be similarly definite."

"Nevertheless, in this article we should like to tease apart these two ontological commitments, arguing that the definiteness of truth for a given mathematical structure, such as the natural numbers […] does not follow from the definite nature of the underlying structure in which that truth resides."

———————————————————————–

https://en.wikipedia.org/wiki/Uncertainty_principle

"asserting a fundamental limit to the precision with which the values for certain pairs of physical quantities of a particle, such as position, x, and momentum, p, can be predicted from initial conditions."

———————————————————————–

assumed(phenotype) ⇒ unknown(genotype).

September 25, 2020 at 10:57 am

davidwlocke

The commonalities of those sciences are more about asserting a Euclidean space. That is where that shared logic comes from. Machine learning happens only in Euclidean space. If the data is taken from hyperbolic space it has to be converted into Euclidean space before it can be learned from, and then those lessons have to be converted back to hyperbolic space. This is changing, but we are barely there right now.

The asymmetries you mentioned are typical. When you need to learn something, your sample is small, so the asserted normal distribution is skewed and kurtotic. After you learn something, your sample is larger and it fits the asserted normal distribution better. A standard normal exists in Euclidean space. Those skewed and kurtotic, not yet normal, aka pre-normals are in hyperbolic space. After any distribution has actually been normal, its sigma goes up and the data is in spherical space. Those three spaces have very different logics and maths.

Collaborate by knowing what you need and want. A mathematics might be constructable to that end.

When I read “Wetware” years ago, I was shocked that the Krebs cycle does not exist. It appears to happen, but it is the result of filters and spaces.

September 26, 2020 at 3:26 am

Ronald Brown

I have tried to cross this gap with articles available from my web site publication list
http://www.groupoids.org.uk/publicfull.html

try numbers 75, 101, 111, 130, 136, 140.

Ronald Brown
Emeritus Professor Bangor University
FLSW

January 8, 2015 at 4:56 pm

genophoria

Reblogged this on genophoria and commented:
An excellent take on interdisciplinary research at the intersection of math and biology!

January 16, 2015 at 2:40 pm

Becky

Leaving aside the larger issue of the math/biology divide, I find Nature’s decision extraordinary. I’m a biologist (and former Nature editor) with no special mathematical training and I found the obituary a great read. Of course I can’t say that I understood it in any real sense, but the article gives quite enough information to get a sense that enormous and far-reaching contributions are being described. What a shame that Nature’s readers won’t get a chance to appreciate Grothendieck’s remarkable life.

January 16, 2015 at 3:07 pm

Lior Pachter

Thanks for the comment. I completely agree. At the end, Mumford and Nature “compromised” so there will be some version of the obituary published, but not the original; see Mumford’s discussion of the story on his blog.

January 30, 2015 at 8:01 am

Ronnie Brown

I had a great experience at Delhi in 2003 where I was asked to give a talk to an International Conference on Theoretical Neuroscience. My talk was entitled “category theory and higher dimensional algebra: potential descriptive tools in neuroscience” (arXiv:math/0306223). I was fortunate in being able to use experience with: a previous general talk in Longo’s seminar in Paris; lots of talks to children and youngsters on “How mathematics gets into knots”; and discussions with Tim Porter on category theory and analogy. Central to the talk were an “email analogy” for a colimit, i.e. about distributed communication; and the idea that the brain works many dimensionally. Afterwards a senior Indian neuroscientist came up to me and said: “That was the first seminar I have heard by a mathematician that made any sense!” The talk was repeated in Montana to a computational biology department, where the comment was made that it explained new concepts.

So I got the idea that what other scientists may want to hear about from mathematicians is not especially the solution of some “million dollar problem” but about new concepts! And mathematicians have to find ways of explaining these to as wide an audience as possible. We learned a lot from the experience described on my web page “Making a mathematical exhibition”.

February 19, 2015 at 2:41 pm

isomorphismes

Is the email analogy written into that arXiv article?

May 20, 2015 at 3:28 am

Ronnie Brown

The email analogy is a major point in the article.

Another point about Grothendieck is that he was a natural with respect to the rhythm of language: a great writer! See the correspondence available on

http://webusers.imj-prg.fr/~georges.maltsiniotis/ps.html

Other articles of mine are available on my “Teaching and Popularisation” page:

http://pages.bangor.ac.uk/~mas010/publar.html

In an article there on “Popularising Mathematics” I try to evaluate current vogues on University Teaching of Mathematics, and their purpose. In particular I suggest the radical idea of popularising mathematics to maths students! This should include training and practice in communicating and writing about mathematics. See also the article on “Mathematics in Context”.

There is an old debating society tag: “Text without context is merely pretext.” What about “maths without context”?

May 30, 2015 at 11:49 am

isomorphismes

Seems like there aren’t many mathematics majors, and there are a lot of non-mathematics majors. So I would think there’s more opportunity in appealing to those who don’t already “get it”.

May 30, 2015 at 12:04 pm

isomorphismes

I would probably count John Baez and even Doug Hofstadter as popularisers who talk to the already-mathematical audience. Or things like the article you’re linking, or the bulletin of the AMS. So it seems to me like there is already some popularisation going on amongst the mathematically-literate.

May 24, 2015 at 6:59 pm

Jerry R

Both the biologist and mathematician seek to feel relevant by applying their training to real problems. In spite of this, it’s the problem that demands whether integration is appropriate or whether more is needed. Moreover, if participants don’t understand each other’s goals and desires, what makes them think they can understand the constructed solution well enough to communicate to others?

The two cultures should take more time in negotiating a clear structure of the problem at the outset, including what counts as satisfactory criteria.

“It is not our purpose to become each other; it is to recognize each other, to learn to see the other and honor him for what he is: each the other’s opposite and complement.” ~Hesse

May 26, 2015 at 9:49 am

Harry Hab

Sadly, so much to agree with here. For another example, the new Systems Immunity centre at Cardiff University, led entirely by innumerate experimentalists.

April 28, 2016 at 7:56 am

Amateur Scion

I’ve always wanted to understand the expanding universe of abstraction in mathematics. However, my gut tells me that we are at or have passed the event horizon of an intellectual black hole where greater abstraction, while seeming to encircle more and more territory, is actually producing relatively fewer concrete results. There seem to be many social and even psychological pathologies associated with higher modern mathematics. Try to get someone pushing through graduate work in algebraic number theory to give examples in an explanation that don’t themselves rely on severe abstractions. Mathematics may be about to maim itself by refusing to take time to do maintenance on the tether of honest and heartfelt human communication. Compare the contributions of modern mathematics to that of calculus. But today, huge numbers of people can do calculus. Is the course and discourse of modern math going to make it possible for huge numbers of future people to know cohomology? I really can’t list how many young mathematicians I have seen defer to the ‘we are smarter than you are’ attitude when confronted with their own inability to communicate mathematical knowledge to the uninitiated. The answer to this teaching problem can’t be only that fifth graders don’t know how to add up the areas of rectangles.

April 28, 2016 at 2:58 pm

Ronald Brown

What is often not understood about mathematics is that abstraction is about analogy. When one talks about the “commutative law” x*y = y*x one notes that it applies to addition and to multiplication, so it is part of an analogy. I have explained this to children in a talk on the mathematics of knots (see my web site). After one such talk, a teacher told me that was the first time in his career anyone had used the word analogy in relation to mathematics!

There is more to be said on this (see my teaching and popularisation web page). I have found biologists very interested in new concepts in mathematics, though not necessarily interested in the “famous problems in mathematics”, a topic on which I have a view expressed on that page.

That is all for now!

April 29, 2016 at 2:41 am

Ronald Brown

There was a misprint in my web site url!

May 1, 2016 at 10:28 am

Victor Camillo

“To a mathematician reality is just a special case.” Biology is a very special case of reality. I tell my students that mathematics is that which is true on every universe. Intelligent beings in a universe without DNA will rediscover schemes. You don’t have to like abstraction or be good at it to recognize its importance—this is true even for mathematicians.

Thought experiment: replace some of the math jargon in the obituary with words like “enzyme” or “chromosome”. This would be acceptable to Nature, and a lot of pretty fair scientists would have only a vague working idea of exactly what these things are and what their precise structure is.

The Grothendieck obituary was remarkably elegant and compassionate and should resonate with anyone who has solved a quadratic equation in high school. It is above all a celebration of intellectual integrity.

April 14, 2019 at 7:50 pm

davidwlocke

I work in the corporate world. I see accountants in their culture. I see every functional unit in a corporation looking at the world in their way. My term for these observations is functional culture.

When I hear that IT is busting silos, I take it to mean amateurs trying to encode knowledge across silos by ignoring much of the knowledge in those silos. It’s sad that we can only get software written at the 101 level. But that’s the best programmers can do given that they don’t capture requirements these days. There were other reasons why requirements capture or elicitation led us down a path. Requirements elicitation is the big unsolved problem in programming methodologies and in artificial intelligence.

These functional cultures show up in the technology adoption lifecycle when we have to sell an engagement built on top of discontinuous innovation. We do this to advance the adoption of that discontinuous innovation. We sell it to the early adopter that happens to run a vertical function in a corporation. That vertical function is one of many in a value chain that all share work in a single conceptual model or a functional culture.

It turns out that in philosophy, functional cultures are called epistemic cultures. I discovered this many years after coining the term functional cultures.

That value chain can be found in the industrial classification tree. Mathematics would be one branch, and Biology another. Their conceptual models diverged at some points, merged at other points, and found discontinuous innovation that built a layered structure that shares only the intractable problem that continuous innovation could not solve. The discontinuities there are dielectric. The separate layer becomes traversable only after a time. At first, the old-new rhetorical contract doesn’t exist. The old is reframed into the new, aka the new-old rhetorical contract is built.

Your discontinuous innovation is the carrier for the value proposition sought by the early adopter. The vertical business gets captured. The epistemic culture gets embedded into the application, and the value proposition succeeds. The developers have to learn the epistemic culture of the early adopter’s firm if they are to succeed in advancing the adoption of their underlying discontinuous innovation.

June 20, 2020 at 11:33 pm

benjamin chu

Reading this in 2020, I’m happy to inform you that UCLA’s math department is indeed partnered with the new quantitative biosciences institute. Maybe Terence Tao is thinking about how to cure cancer right now!

April 22, 2021 at 2:53 am

John D. Brown Junior

Recently I read a master’s thesis in which the author uses a philosophical framework to explain the differences between the two “cultures”. The concept of psychologism (psychology as being prior to logic) is essential in his analysis.

Click to access mnoguiera_msc_thesis_final_for_uploading_20190609.pdf

June 28, 2023 at 6:47 am

Manish

As a PhD student in biology, I understand now that mathematics is crucial and essential to understanding biology in its most basic form. But more often than not biologists are looked down upon and are considered intellectually inferior to mathematicians. So how do you start a dialogue? It’s pretty difficult. I am trying now to learn math (as much as I can on my own), but where I come from, taking biology is considered to be a sign of being poor at math. So I think a true merger of biology and math is only possible when the beauty of their combination is shown to students at a very young age. I regret not having learnt enough math on a daily basis, because I was not good enough to understand it.

October 19, 2023 at 8:41 am

Mike Roth

Hello, I am about 9 years late to this post (which I found from David Mumford’s book “Numbers and the World”, in the chapter on the Grothendieck obituary for Nature). I work in a Math Department which contains some excellent Math Biologists, so I was very interested in your discussion, especially to try and see the world from their point of view.

In your description of some of the aspects of the Mathematician/Biologist divide, you wrote that “Biologists have chalk talks during job interviews. Mathematicians don’t.” Did you mean it that way, or was that a typo? My experience (in a math department) is the other way around: almost all job talks are chalk talks.

October 19, 2023 at 8:52 am

Lior Pachter

The “chalk talk” in biology is a ritual that is separate from the “job talk”. The latter is delivered to an open audience and is an opportunity for the candidate to present their research, similar to what a job talk might look like in mathematics (except that biologists deliver these talks with slides vs. mathematicians who frequently use chalk). The biologists’ “chalk talk” (usually at a whiteboard using markers) is delivered only to faculty (sometimes a handful of postdocs may be present), and it focuses on the research plans, rather than the research accomplishments. In fact it is frequently centered around three “aims” that might form part of an initial NIH R01 grant proposal should the candidate be hired. The chalk talk frequently results in faculty sparring with the candidate, where attempts are made to poke holes in the research plans. In some cases these sparring rounds used to be (and maybe still sometimes are) very heated and occasionally abusive, although in recent years many institutions have tried to moderate this aspect of chalk talks. The purpose of the chalk talk includes an assessment of whether the candidate is ready to run a lab, with all that entails including managing people, managing finances, obtaining samples, ordering etc.

October 19, 2023 at 9:10 am

Mike Roth

Hello, and thank you for your speedy reply and your detailed explanation. I guess that is another example of a cultural divide: I had no idea such types of talks existed!

November 13, 2023 at 1:53 am

John B

I think that one of the root causes of math & biology not mixing well is the working culture of the biology field. More exactly, a biologist is trained from the beginning to be an expert in many areas. Therefore one will see that biologists want to do their own statistical analysis, do their own programming, do their own image processing, do their own video processing, draw their own images, do their own PCR experiments, do their own Western blot experiments, etc. Biologists see that they have to be experts in all these totally different fields. This is very different from the engineering field, where I come from, where a mechanical engineer will ask an electronic engineer to design the electronics for his robots. This is because the mechanical engineer understands that electronics is not his area of expertise. This is not happening in the biology field. As long as biologists are taught from the university benches that they have to be experts in areas outside of their area of expertise, which is biology, math and biology will never mix well. Therefore it is no surprise to find very complicated math methods in biological journals which are just made up by randomly smashing together math formulas.
Another challenge with biology is that the results in biology are expected to be presented visually. Most biological journals require a graphical abstract! It is no wonder that one sees conclusions based on eyeballing some misleading plots.

January 23, 2024 at 12:17 pm

Wolfgang

Since the discussion is ongoing even after years (which might match the ongoing importance of the topic), I would like to comment too.

The cultural divide between mathematics and any other exact science using mathematical tools (biology, chemistry, even physics) is, in my opinion, exactly due to people like Grothendieck favoring abstract generalizations in contrast to more mundane applications solving more specific problems. This is not to devalue anything abstract. It is just that scientists usually have in mind a specific problem which they want to solve, and rarely seek the overarching theory solving all kinds of similar problems.

Due to the towering nature of mathematics, where one advanced topic sits on a multitude of more basic ones, it is not astonishing that most people do not have the inclination to understand, for instance, schemes. How could they? How many basic and not so basic concepts would they need to understand first, before getting anywhere close to this level of abstraction? By the way, it is the same the other way around, but rarely spoken about. How many basic concepts would a mathematician have to understand to master anything deep in, say, biochemistry?

It does not help that the mathematical rigor needed in pure mathematics is almost never necessary for understanding the key ideas conceptually, since in reality the pathological behavior cared for in the precise definitions of the mathematician often does not exist in the actual phenomena (except for the cases where it is the most important thing, as in turbulence). Nor does the lack of examples, or even of a few illustrations, help. It took me years to understand that almost all modern mathematics is still geometry, but in heavy disguise.

It also does not help that, in my personal experience, for every nice mathematician there is an arrogant one, which can make it difficult to ask someone for help or to work across disciplines.

Thus, in the end it is a question of attitude, language, and the things one values most in a field that creates the divide. Unfortunately, I do not think that this will change soon, except maybe through AI being able to meet each person at exactly their individual level of understanding (in fact, the same thing that a good mentor does, but there are just not enough good mentors around).
