
In the August 2013 issue of Nature Biotechnology there were two back-to-back methods papers published in the area of network theory:

  1. Baruch Barzel & Albert-László Barabási, "Network link prediction by global silencing of indirect correlations", Nature Biotechnology 31(8), 2013, p 720–725. doi:10.1038/nbt.2601.
  2. Soheil Feizi, Daniel Marbach, Muriel Médard & Manolis Kellis, "Network deconvolution as a general method to distinguish direct dependencies in networks", Nature Biotechnology 31(8), 2013, p 726–733. doi:10.1038/nbt.2635.

This post is the first of a trilogy (part 2, part 3) in which my student Nicolas Bray and I tell the story of these papers and why we took the time to read and critique them.

We start with the Barzel-Barabási paper that is about the applications of a model proposed by Barzel and his Ph.D. advisor, Ofer Biham (although all last names start with a B, Biham is not to be confused with Barabási):

In order to quantify connectivity in biological networks, Barzel and Biham proposed an experimental perturbation model in the paper Baruch Barzel & Ofer Biham, "Quantifying the connectivity of a network: The network correlation function method", Phys. Rev. E 80, 046104 (2009), that forms the basis for network link prediction in Barzel-Barabási. In the context of biology, link prediction refers to the problem of identifying functional links between genes from data that may be confounded by indirect effects. For example, suppose that gene A inhibits the expression of gene B, and gene B in turn inhibits the expression of gene C. If the expression of A increases, it will decrease the expression of B, which will in turn increase the expression of C. One might therefore observe correlation between the expression levels of genes A and C, even though there is no direct interaction between them. The Barzel-Biham model is based on perturbation experiments. Assuming that a system of genes is in equilibrium, it is a model for the change in expression of one gene in response to a small perturbation in another.

The parameters in the Barzel-Biham model are entries in what they call a “local response matrix” S (any matrix with S_{ii}=0 for all i). Physical arguments pertaining to perturbations at equilibrium lead to the equations

G=SG \, (off the diagonal) and G_{ii}=1 for all i,                                  (1)

for a “global response matrix” G that can, in principle, be observed and used to infer the matrix S. The innovation of Barzel and Barabási is to provide an approximate formula for recovering S from G, specifically the formula

S \approx (G-I+D((G-I)G))G^{-1}                                                       (2)

where D(M) denotes the operation setting off-diagonal elements of M to zero. A significant part of the paper is devoted to showing that the approximation (2) is good. Then they suggest that (2) can be used to infer direct causal links in regulatory networks from collections of expression experiments. Barzel and Barabási claim that the approximation formula (2) is necessary because exact inference of S from G requires solving the intractable system of n^2 equations

G=SG \, (off the diagonal) and S_{ii}=0 for all i.                                  (3)
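As an aside, formula (2) is a one-liner to implement. Here is a minimal numpy sketch (the function names are ours):

```python
import numpy as np

def D(M):
    # The operator D(M): keep the diagonal of M, set all off-diagonal entries to zero.
    return np.diag(np.diag(M))

def silencer_approx(G):
    # The Barzel-Barabási approximation (2): S ≈ (G - I + D((G - I)G)) G^{-1}.
    I = np.eye(G.shape[0])
    return (G - I + D((G - I) @ G)) @ np.linalg.inv(G)
```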

The assertion of intractability is based on the claim that the equations are coupled. They reason that since the naïve matrix inversion algorithm requires O(m^3) operations for m equations, the solution of (3) would require time O(n^6). When we looked at this system, our first thought was that while it is large, it is also structured. We sat down and started examining it by writing down the equations for a simple case: a matrix S for a graph on 3 nodes. We immediately noticed that the equations decouple into n systems of n equations, where system i is given by S_{ii}=0 and G_{ij} = \sum_k S_{ik} G_{kj}, \, (j \neq i), with the n unknowns S_{i1},\ldots,S_{in}. This immediately reduces the total work to O(n^4), and since the n systems are independent, simple parallelization brings the time down to O(n^3). In other words, the system is trivially tractable.
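In code, the decoupled solve is a short loop (continuing the sketch above; the function name is ours):

```python
def solve_rowwise(G):
    # Solve system (3) one row at a time. For row i, the n-1 unknowns
    # S_ik (k != i, since S_ii = 0) satisfy the n-1 equations
    # G_ij = sum_{k != i} S_ik G_kj for j != i.
    n = G.shape[0]
    S = np.zeros_like(G)
    for i in range(n):
        mask = np.arange(n) != i
        A = G[np.ix_(mask, mask)].T  # transposed since the unknowns form a row vector
        S[i, mask] = np.linalg.solve(A, G[i, mask])
    return S
```

Each iteration is an independent O(n^3) solve, giving the O(n^4) total work, and running the n solves in parallel gives the O(n^3) time.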

But there is more: while looking at the paper I had to take a quick bathroom break, and by the time I returned Nick had realized he could apply the Sherman-Morrison formula to obtain the following formula for the exact solution:

S=I-D(1/G^{-1})G^{-1}.                                                                    (4)

Here the operator "/" denotes element-wise division, a simple operation to execute, so that inferring S from G requires no more than inverting G and scaling it; the formula is also much simpler and more efficient to compute than (2). [Added 2/23: Jordan Ellenberg pointed out the obvious fact that G=SG off the diagonal means that SG=G-D for some diagonal matrix D, and therefore S=I-DG^{-1}, and since the diagonal entries of S must be zero it follows that D=D(1/G^{-1}). In other words, the Sherman-Morrison formula is not even needed]. While it would be nice for us to claim that our managing to quickly supersede the main result of a paper published in Nature Biotechnology was due to some sort of genius, in fact the entire exercise would be suitable for an undergraduate linear algebra homework problem. Barabási likes to compare himself to the great physicist and Nobel laureate Subrahmanyan Chandrasekhar, but it is difficult to imagine the great Chandrasekhar having similar difficulties.
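In code, the exact solution (4) is two lines (continuing the numpy sketch above):

```python
def solve_exact(G):
    # The exact solution (4): S = I - D(1/G^{-1}) G^{-1}, where "/" is
    # element-wise division; only the diagonal of 1/G^{-1} survives D.
    Ginv = np.linalg.inv(G)
    return np.eye(G.shape[0]) - np.diag(1.0 / np.diag(Ginv)) @ Ginv
```

Note that the diagonal of the result is automatically zero, as required.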

The approach used to derive (4) has an implication that is even more important than the solution itself: it provides a dual formula for computing G from S according to (1), i.e. for simulating from the model. Using the same ideas as above, one finds that

G=TD(1/T) where T=(S-I)^{-1}.                                              (5) 
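Here is a numpy sketch of (5), continuing the code above, together with a sanity check that (4) inverts it exactly in the noiseless case. The scale of the random S is our arbitrary choice, made so that the geometric series for (I-S)^{-1} converges:

```python
def simulate_G(S):
    # Formula (5): G = T D(1/T) with T = (S - I)^{-1}; the scaling by D(1/T)
    # makes the diagonal of G equal to 1, as required by (1).
    T = np.linalg.inv(S - np.eye(S.shape[0]))
    return T @ np.diag(1.0 / np.diag(T))

rng = np.random.default_rng(0)
n = 200
S = rng.normal(scale=0.5 / np.sqrt(n), size=(n, n))  # spectral radius ~0.5
np.fill_diagonal(S, 0.0)
G = simulate_G(S)
assert np.allclose(solve_exact(G), S)  # exact recovery in the absence of noise
```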

Unlike Barzel & Barabási, who resorted to simulating with Michaelis-Menten dynamics in their study of the performance of their approximation, using (5) we can efficiently simulate data directly from the model. One issue with Michaelis-Menten dynamics is that they make more sense for enzymatic networks as opposed to regulatory networks (for more on this see Karlebach, G. & Shamir, R., "Modelling and analysis of gene regulatory networks", Nat Rev Mol Cell Biol 9, 770–780 (2008)), but in any case performance on such dynamics is hardly a validation of (2), since it's mixing apples and oranges. So what happens when one simulates data from the Barzel-Biham model and then tries to recover the parameters?

[Figure: A comparison of the standard method of regularized partial correlations with exact inference for the Barzel-Biham model. Random sparse graphs were generated according to the Erdös-Renyi graph model G(5000,p), where p was varied to assess performance at different graph densities (shown on the x-axis). The y-axis shows the average AUROC obtained from 75 random trials at each density.]

When examining simulations from the Barzel-Biham model with graphs on 5,000 nodes (see the figure above), we were surprised to discover that when adding even small amounts of noise, the exact algorithm (4) failed to recover the local response matrix from G (we also analyzed the approximation (2) and observed that it always resulted in performance inferior to that of (4), and that 5% of the time its correlation with the exact solution was negative). This sensitivity to noise is due to the term D(1/G^{-1}) in the exact formula, which becomes problematic if the diagonal entries of G^{-1} are close to zero.

Some intuition for the behavior of G^{-1} may be gained from noting that if S is such that its geometric sum converges, the diagonal of G^{-1} is equal to that of I+S+S^2+S^3+\cdots. If S has mixed signs and there is significant feedback within the network, the diagonal of G^{-1} may be close to zero, and any noise in the measurement of G can create very large fluctuations in the inferred S. This means that the results in the figure above are not dependent on the graph model chosen (Erdös-Renyi) and will occur for any reasonable model of gene regulatory networks, including the modeling of both enhancers and repressors. From Figure 2a in their paper, it appears that Barzel and Barabási used in their simulation an S with only positive entries, which would preclude such effects. Such an assumption is biologically unrealistic.
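The identity in the first sentence above is easy to check numerically (continuing the earlier snippet, in which the spectral radius of S is below one):

```python
# The diagonal of G^{-1} coincides with that of (I - S)^{-1} = I + S + S^2 + ...
Ginv = np.linalg.inv(G)
assert np.allclose(np.diag(Ginv), np.diag(np.linalg.inv(np.eye(n) - S)))
```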

However, the difficulties with noise for the Barzel-Biham model go much deeper. While a constant signal-to-noise ratio, as assumed by Barzel and Barabási, is a commonly used model for errors in experiments, it is important to remember that there is no experiment for directly measuring the elements of G. Obtaining G_{ij} from an experiment is done by making a small perturbation of size e to gene i, observing the change in gene j, and then dividing that change by e. This last step increases the noise on the estimate of G_{ij} by a factor of 1/e (a large number, for a perturbative experiment) above the noise already present in the measurements. Increasing e acts to remove the system from the perturbative regime and thereby increases the intrinsic error in estimating G. It is therefore the case that attaining reasonable error on G will require very low noise in the original measurements. In the case of biological networks this would mean performing many replicates of the experiments. However, as Barzel & Barabási acknowledge in their paper, even a single replicate of a perturbation experiment is not currently feasible.

While the exact algorithm (4) for inverting the Barzel-Biham model performs poorly, we found that a widely used shrinkage method based on partial correlation (Schäfer, J. & Strimmer, K., "A Shrinkage Approach to Large-Scale Covariance Matrix Estimation and Implications for Functional Genomics", Statistical Applications in Genetics and Molecular Biology 4, (2005)) outperforms the exact algorithm (blue curve in the figure above). This suggests that there is no input for which (4) might be useful. The method is not even ideal for inference from data generated by the model it is based on.

This brings us to the "results" section of the paper. To demonstrate their method, called Silencer, Barzel & Barabási ran it on only one of the three DREAM5 datasets. They then compared the performance of Silencer to three out of the thirty five methods benchmarked in DREAM5. The Barzel-Biham model is for perturbation experiments, but Barzel & Barabási just threw in data from another universe (e.g. mutual information matrices). But let's just go with that for a moment. Their results are shown in the figure below:

[Figure 3 from Barzel-Barabási.]

The three methods on which potential improvements were tested are Pearson correlation, Spearman correlation and mutual information. Pearson and Spearman rank 16/35 and 18/35 respectively in the DREAM5 benchmarks. There may be some reason why Silencer applied on top of these methods improves performance: in the case where G is a correlation matrix, the path interpretation given by Barzel and Barabási connects the inference procedure to Sewall Wright's path coefficients (ca. 1920), which in turn suggests an interpretation in terms of partial correlation. However in the case of mutual information, a method that is ranked 19/35 in the DREAM5 benchmarks, there is no statistically significant improvement at all: the improvement is from an AUROC of 0.67 to 0.68. Amazingly, Barzel and Barabási characterize these results by remarking that they "improve upon the top-performing inference methods" (emphasis on top is ours). Considering that the best of these methods ranks 16/35, the use of the word "top" seems, shall we say, unconventional.

We have to ask: how did Barzel & Barabási get to publish a paper in the journal Nature Biotechnology on regulatory network inference without improvement over, or testing against, anything but a handful of mediocre DREAM5 methods on a single dataset? To put the Barzel-Barabási results in context, it is worth considering the standards the Feizi et al. paper was held to. In that paper the authors also benchmarked their method on DREAM5 data, except they tested on all 3 datasets and 9 methods (and even on a community based method). We think it's fair to conclude that significantly more testing would have to be done to argue that Silencer improves on existing methods for biological network inference.

We therefore don’t see any current practical utility for the Barzel-Biham model, except possibly for perturbation experiments in small sub-networks. Even then, we don’t believe it is practical to perform the number of experiments that would be necessary to overcome signal to noise problems.

Unfortunately the problems in Barzel-Barabási spill over into a follow-up article published by the duo: Barzel, Baruch, and Albert-László Barabási, "Universality in Network Dynamics", Nature Physics 9 (2013). In that paper they assume that the local response matrix has entries that are all positive, i.e. they do not allow for inhibitory interactions. Such a restriction immediately renders the results of the paper, if they are to be believed, moot in terms of biological significance. Moreover, the restrictions on S appear to be imposed in order to provide approximations to G that are unnecessary in light of (5). Given these immediate issues, we suspect that were we to read the Universality paper carefully, this post would have to be lengthened considerably.

These are not the first of Barabási's papers to package meaningless and incoherent results in Nature/Science publications. In fact, there is a long history of Barabási publishing with fanfare in top journals only to have others respond by publishing technical comments on his papers, in many cases completely refuting the claims he makes. Unfortunately many of the critiques are not well known because they are rejected from the journals where Barabási is successful, and instead find their way to preprint servers or more specialized publications. Here is a partial list of Barabási's finest, and the response(s):

  1. Barabási is famous for the "BA model", proposed in Barabási and Albert, "Emergence of Scaling in Random Networks", Science, Vol. 286, 15 October 1999, pp. 509-512. Lada Adamic and Bernardo Huberman immediately refuted the practical applications of the model. Moreover, as pointed out by Willinger, Alderson and Doyle, while it is true that scale-free networks exhibit some interesting mathematical properties (specifically, they are resilient to random attack yet vulnerable to worst-case attack), even the math was not done by Barabási but by the combinatorialists Bollobás and Riordan.
  2. Barabási has repeatedly claimed metabolic networks as prime examples of scale-free networks, starting with the paper Jeong et al., "The large-scale organization of metabolic networks", Nature 407 (2000). This claim has been disputed and refuted in the paper "Scale-Rich Metabolic Networks" by Reiko Tanaka.
  3. The issue of attack tolerance was the focus of "Error and attack tolerance of complex networks" by Réka Albert, Hawoong Jeong & Albert-László Barabási, Nature 406 (2000). John Doyle refuted it completely in this paper.
  4. In the paper "The origin of bursts and heavy tails in human dynamics", published in Nature 435 (2005), Barabási pretends to offer insights into the "bursty nature of human behavior" (by analyzing e-mail). In a follow-up comment, Daniel Stouffer, Dean Malmgren and Luis Amaral demonstrate that the reported power-law distributions are solely an artifact of the analysis of the empirical data, and that the proposed model is not representative of e-mail communication patterns.
  5. Venturing into the field of control theory, the paper “Controllability of complex networks” by Liu, Slotine and Barabási, Nature 473 (2011) argues that “sparse inhomogeneous networks, which emerge in many real complex systems, are the most difficult to control, but that dense and homogeneous networks can be controlled using a few driver nodes.” Not so. In a beautiful and strong rebuttal, Carl Bergstrom and colleagues show that a single control input applied to the power dominating set is all that is required for structural controllability of most, if not all networks. I have also blogged about this particular paper previously, explaining the Bergstrom result and why it reveals the Barabási paper to be a control theory embarrassment.

In other words, Barabási’s “work” is a regular feature in the journals Nature and Science despite the fact that many eminent scientists keep demonstrating that the network emperor has no clothes.

Post scriptum. After their paper was published, Barzel and Barabási issued a press release claiming that "their research moves the team a step closer in its quest to understand, predict, and control human disease." The advertisement seems like an excellent candidate for Michael Eisen's pressies.

 

The International Society for Clinical Densitometry has an official position on the number of decimal digits that should be reported when describing the results of bone density scans:

  • BMD (bone mineral density): three digits (e.g., 0.927 g/cm²).
  • T-score: one digit (e.g., –2.3).
  • Z-score: one digit (e.g., 1.7).
  • BMC (bone mineral content): two digits (e.g., 31.76 g).
  • Area: two digits (e.g., 43.25 cm²).
  • % reference database: integer (e.g., 82%).

Are these recommendations reasonable? Maybe not. For example, they fly in the face of the recommendation in the "seminal" work of Ehrenberg (Journal of the Royal Statistical Society A, 1977), which is to use two decimal digits.

Two? Three? What should it be? This is what my Math10 students always ask me.

I answered this question for my freshmen in Math10 two weeks ago by using an example based on a dataset from the paper Schwartz, M. “A biomathematical approach to clinical tumor growth“. Cancer 14 (1961): 1272-1294. The paper has a dataset consisting of the size of a pulmonary neoplasm over time:

[Figure: the tumor size measurements over time, together with the fitted curve.]

A simple model for the tumor growth is f(t) = a \cdot b^t, and in class I showed how a surprisingly good fit can be obtained by interpolating through only two points (t=0 and t=208):

f(0)= 1.8 \Rightarrow a \cdot b^0 = 1.8 \Rightarrow a = 1.8.

Then we have that f(208) = 3.5 \Rightarrow 1.8 \cdot b^{208} = 3.5 \Rightarrow b = \sqrt[208]{3.5/1.8} \approx 1.00320.

The exponential function f(t)=1.8 \cdot 1.00320^t is shown in the figure. The fit is surprisingly good considering it is based on an interpolation using only two points. The point of the example is that if one rounds the answer 1.00320 to two decimal digits then one obtains f(t) = 1.8 \cdot 1^t = 1.8, which is constant and exhibits no growth at all. In other words, a small (quantitative) change in the assumptions (restricting the rate to intervals differing by 0.01) results in a major qualitative change in results: with two decimal digits the patient lives, with three… death!
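The arithmetic is easy to reproduce (a short Python sketch of ours):

```python
a = 1.8
b = (3.5 / 1.8) ** (1 / 208)  # the 208th root from the interpolation
for digits in (5, 3, 2):
    print(digits, round(b, digits), a * round(b, digits) ** 208)
# 5 digits: b = 1.00320, f(208) ≈ 3.5   (the fitted growth)
# 3 digits: b = 1.003,   f(208) ≈ 3.36  (still grows)
# 2 digits: b = 1.0,     f(208) = 1.8   (flat: the growth disappears)
```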

This simple example of decimal digit arithmetic illustrates a pitfall affecting many computational biology studies. It is tempting to believe that \mbox{Qualitative} \subset \mbox{Quantitative}, i.e. that focusing on qualitative analysis allows for the flexibility of ignoring quantitative assumptions. However, the qualitative devil frequently lies in the quantitative details.

One field where qualitative results are prevalent, and therefore the devil strikes frequently, is network theory. The emphasis on searching for "universal phenomena", i.e. qualitative results applicable to networks arising in different contexts, arguably originates with Milgram's small world experiment, which led to the concept of "six degrees of separation", and Watts and Strogatz's theory of collective dynamics in small-world networks (my friend Peter Dodds replicated Milgram's original experiment using email in "An experimental study of search in global social networks", Science 301 (2003), p 827–829). In mathematics these ideas have been popularized via the Erdös number, which is the distance between an author and Paul Erdös in a graph where two individuals are connected by an edge if they have published together. My Erdös number is 2, a fact that is of interest only in that it divulges my combinatorics roots. I'm prouder of other connections to researchers that write excellent papers on topics of current interest. For example, I'm pleased to be distance 2 away from Carl Bergstrom via my former undergraduate student Frazer Meacham (currently one of Carl's Ph.D. students) and the papers:

  1. Meacham, Frazer, Dario Boffelli, Joseph Dhahbi, David IK Martin, Meromit Singer, and Lior Pachter. “Identification and Correction of Systematic Error in High-throughput Sequence Data.” BMC Bioinformatics 12, no. 1 (November 21, 2011): 451. doi:10.1186/1471-2105-12-451.
  2. Meacham, Frazer, Aaron Perlmutter, and Carl T. Bergstrom. "Honest Signaling with Costly Gambles." Journal of the Royal Society Interface 10, no. 87 (October 6, 2013): 20130469. doi:10.1098/rsif.2013.0469.
One of Bergstrom’s insightful papers where he exposes the devil (in the quantitative details) is “Nodal Dynamics, Not Degree Distributions, Determine the Structural Controllability of Complex Networks”  by Cowan et al., PLoS One 7 (2012), e88398. It describes a not-so-subtle example of an unreasonable quantitative assumption that leads to intuition about network structural controllability that is, to be blunt, false. The example Carl critiques is from the paper

"Controllability of complex networks" by Yang-Yu Liu, Jean-Jacques Slotine and Albert-László Barabási, Nature 473 (2011), p 167–173. The mathematics is straightforward: it concerns the dynamics of linear systems of the form

\frac{d{\bf x}(t)}{dt} =-p{\bf x}(t) + A{\bf x}(t) + B{\bf u}(t) .

The dynamics can be viewed as taking place on a graph whose adjacency matrix is given by the non-zero entries of A (an nxn matrix). The vector -p (of size n) is called the pole of the linear system and describes the intrinsic dynamics at the nodes. The vector u (of size m) corresponds to external inputs that are coupled to the system via the nxm matrix B.

An immediate observation is that the vector p is unnecessary and can be incorporated into the diagonal of the matrix A. An element on the diagonal of A that is then non-zero can be considered to be a self-loop. The system then becomes

\frac{d{\bf x}(t)}{dt} =A{\bf x}(t) + B{\bf u}(t)

which is the form considered in the Liu et al. paper (their equation (1)). The system is controllable if there exist time-dependent inputs u that can drive the system from any initial state to any target end state. Mathematically, this is equivalent to asking whether the matrix C=(B,AB,A^2B,\ldots, A^{n-1}B) has full rank (a classic result known as Kalman's criterion for controllability). Structural controllability is a weaker requirement, in which the question is whether, given only the adjacency matrices A and B, there exist weights for the edges so that the weighted matrices satisfy Kalman's criterion. The point of structural controllability is a theorem showing that structurally controllable systems are generically controllable.
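Kalman's criterion is straightforward to check numerically. Here is a short sketch (numpy; the function name is ours):

```python
import numpy as np

def is_controllable(A, B):
    # Kalman rank criterion: (A, B) is controllable iff
    # C = (B, AB, A^2 B, ..., A^{n-1} B) has rank n.
    n = A.shape[0]
    blocks, M = [B], B
    for _ in range(n - 1):
        M = A @ M
        blocks.append(M)
    return np.linalg.matrix_rank(np.hstack(blocks)) == n
```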

The Liu et al. paper makes two points: the first is that if M is the size of a maximum matching in a given nxn adjacency matrix A, then the minimum m for which there exists a matrix B of size nxm making the system structurally controllable is m=max(n-M,1) (it turns out this first point had already been made, in a paper by Commault et al. from 2002). The second point is that m is related to the degree distribution of the graph A.

The point of the Cowan et al. paper is to explain that the results of Liu et al. are completely uninteresting if a_{ii} is non-zero for every i. This is because M is then equal to n (the maximum matching of A consists of all the self-loops), and therefore the result of Liu et al. reduces to the statement that m=1, or equivalently, that structural controllability for real-world networks can always be achieved with a single control input.
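A toy numerical illustration of this point, continuing the sketch above (the size, density and seed are arbitrary choices of ours): once every node has a self-loop, a single generic input already passes Kalman's test.

```python
rng = np.random.default_rng(1)
n = 8
A = (rng.random((n, n)) < 0.3) * rng.normal(size=(n, n))  # sparse random weights
np.fill_diagonal(A, rng.normal(size=n))  # self-loops from intrinsic dynamics
b = rng.normal(size=(n, 1))              # a single generic control input
print(is_controllable(A, b))             # expected: True, i.e. m = 1 suffices
```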

Unfortunately for Liu and coauthors, barring a pathological canceling out of intrinsic dynamics with self-regulation, the diagonal elements a_{ii} of A will be zero only if there are no intrinsic dynamics in the system (equivalently, p_i=0, i.e. the time constants \frac{1}{p_i} are infinite). I repeat the obvious by quoting from Cowan et al.:

"However, infinite time constants at each node do not generally reflect the dynamics of the physical and biological systems in Table 1 [of Liu et al.]. Reproduction and mortality schedules imply species-specific time constants in trophic networks. Molecular products spontaneously degrade at different rates in protein interaction networks and gene regulatory networks. Absent synaptic input, neuronal activity returns to baseline at cell-specific rates. Indeed, most if not all systems in physics, biology, chemistry, ecology, and engineering will have a linearization with a finite time constant."

Cowan et al. go a bit further than simply refuting the premise and results of Liu et al. They avoid the naïve reduction of a system with intrinsic dynamics to one with self-loops, and provide a specific criterion for the number of nodes in the graph that must be controlled.

In summary, just as with the rounding of decimal digits, a (simple-looking) assumption of Liu et al., namely that p=0, completely changes the qualitative nature of the result. Moreover, it renders false the thesis of the Liu et al. paper, namely that degree distributions in (real) networks affect the requirements for controllability.

Oops.
