Supplementary information for:

'Linking the Genes: Inferring Quantitative Gene Networks from Microarray Data'


Supplementary information for: de la Fuente, Brazhnik, Mendes (2002) Linking the genes: inferring quantitative gene networks from microarray data. Trends in Genetics 18, 395-398.

Alberto de la Fuente, Paul Brazhnik and Pedro Mendes

alf@vt.edu    brazhnik@vt.edu    mendes@vt.edu

Biochemical Networks Modeling Group

     Virginia Bioinformatics Institute


[Abstract] [Model, equations and parameter values] [Perturbation data] [Analysis of gene network]


Abstract

Modern microarray technology is capable of providing data about gene expression of thousands of genes and even whole genomes. An important question is how this technology can be utilized in the most useful way to unravel the workings of the cellular machinery. Here, we propose a method to infer genetic networks based on data of appropriately designed microarray experiments. Both the experimental setup and the theoretical background will be discussed. In addition to identifying the genes that directly affect a specific other gene, it also quantifies the strength of such effects.


Model, equations and parameter values

The main motivation for employing this method on simulated data is that in this case the structure of the system that produced the data is exactly known, in contrast to data produced by biological systems, now we can evaluate the effectiveness of the method. We use the gene network proposed by Mendoza et al. to control flower morphogenesis in Arabidopsis thaliana (Fig. 1). It is irrelevant for our purposes if this network is indeed correct or what the molecular details behind it are. For this illustration, one should assume the model network is the real system. In order to carry out simulated experiments, we encapsulate the gene network in a model that follows the principles of biochemical kinetics, and is formed by ordinary differential equations representing the concentration of mRNAs. Actual parameter values are not important here, provided that they guarantee the relationships shown in Fig. 1. The simulations were carried out with the software Gepasi and result in concentration ratios, like the fluorescence ratios of real measurements. We think that all methods used in gene expression analysis should be tested to simulated data before they can be applied with confidence to much larger biological data sets. The model is described by non-linear ordinary differential equations as given by equation 1 and the parameter values used are given in Table 1.




Figure 1: The network for flower morphogenesis in Arabidopsis thaliana based on the model as proposed by Mendoza et al. , 2001 where LUG, AG, API, EMF1, TFL1, LFY, SUP, AP3, PI and UFO are gene activities (mRNA expression levels). Lines with arrows indicate activation and blunt lines inhibition. This model differs from the one proposed by Mendoza et al. In their model, no negative self-interactions of the genes are included. We take degradation of the gene products into account as a negative regulatory event, so every gene has a negative self-effect. In the model proposed by Medoza et al., AP3 and PI activate their own transcription. We took account for this (Eq. 1), but in our example the negative self-regulatory effect of these genes through degradation is stronger than the positive regulatory effect through activation of transcription. Thus, the net self effect is of these genes is negative.

(1)       
Where [LUG], [AG], [API], [EMF1], [TFL1], [LFY], [SUP], [AP3], [PI] and [UFO] are the concentrations of the mRNA species. Model parameters are explained and quantified in table 1.

Table 1: Model parameters
Parameter Explanation Value
VLUG overall production rate of mRNA LUG 1 mMs-1
kLUG degradation rate constant of mRNA LUG 1 s-1
VAG overall production rate of mRNA AG 100 mMs-1
KI1LUG inhibition constant for LUG on expression of AG 1 mM
KI1AP1 inhibition constant for AP1 on expression of AG 10 mM
KI1TFL1 inhibition constant for TFL1 on expression of AG 100 mM
KA1LFY activation constant for LFY on expression of AG 10 mM
kAG degradation rate constant of mRNA AG 1 s-1
VAP1 overall production rate of mRNA AP1 10 mMs-1
KI1AG inhibition constant for LUG on expression of AP1 10 mM
KI1EMF1 inhibition constant for API on expression of AP1 10 mM
KA2LFY activation constant for LFY on expression of AP1 10 mM
kAP1 degradation rate constant of mRNA AP1 1 s-1
VEMF1 overall production rate of mRNA EMF1 1 mMs-1
kEMF1 degradation rate constant of mRNA EMF1 1 s-1
VTFL1 overall production rate of mRNA TFL1 100 mM-1
KI1LFY inhibition constant for LFY on expression of TFL1 10 mM
KA1EMF1 activation constant for EMF1 on expression of TFL1 0.1 mM
kTFL1 degradation rate constant of mRNA TFL1 1 s-1
VLFY overall production rate of mRNA LFY 100 mMs-1
KI2EMF1 inhibition constant for EMF1 on expression of LFY 10 mM
KI2TFL1 inhibition constant for TFL1 on expression of LFY 100 mM
KA1AP1 activation constant for AP1 on expression of LFY 10 mM
kLFY degradation rate constant of mRNA LFY 1 s-1
VSUP overall production rate of mRNA SUP 0.1 mMs-1
kSUP degradation rate constant of mRNA SUP 1 s-1
VAP3 overall production rate of mRNA AP3 10 mMs-1
KI1SUP inhibition constant for SUP on expression of AP3 1 mM
KA3LFY activation constant for LFY on expression of AP3 10 mM
KA1AP3 auto activation constant for AP3 on expression of AP3 1 mM
KA1PI activation constant for PI on expression of AP3 1 mM
KA1UFO activation constant for UFO on expression of AP3 1 mM
kAP3 degradation rate constant of mRNA AP3 1 s-1
VPI overall production rate of mRNA PI 10 mMs-1
KI2SUP inhibition constant for SUP on expression of PI 1 mM
KA4LFY activation constant for LFY on expression of PI 10 mM
KA2AP3 activation constant for AP3 on expression of PI 1 mM
KA2PI auto activation constant for PI on expression of PI 1 mM
KA2UFO activation constant for UFO on expression of PI 1 mM
kPI degradation rate constant of mRNA PI 1 s-1
VUFO overall production rate of mRNA UFO 10 mMs-1
kUFO degradation rate constant of mRNA UFO 1 s-1

The model can be downloaded here as a Gepasi file.

Perturbation data



The method requires a first measurement of concentrations of mRNA in a certain reference steady state. Perturbations are made by increasing the rate of transcription of single genes and the effect on the steady state concentrations of all mRNA species are measured in response to that perturbation. Perturbations were made by increasing the overall production rates by 10% in individual experiments. The data is given in table 1.


Table 2: Steady state responses of mRNA concentrations in the gene network to a 10% increase in transcription rates
Steady state concentrations LUG AG AP1 EMF1 TFL1 LFY SUP AP3 PI UFO
Steady state 0 1 1 1 1 1 1 1 1 1 1
Perturbation in rate VLUG 1.1 0.946451 1.04343 1 0.997919 1.002352 1 1.000395 1.000395 1
Perturbation in rate VAG 1 1.112846 0.919299 1 1.004388 0.995072 1 0.999167 0.999167 1
Perturbation in rate VAP1 1 0.983643 1.114812 1 0.99485 1.005839 1 1.000978 1.000978 1
Perturbation in rate VEMF1 1 0.998402 0.990945 1.1 1.018428 0.988819 1 0.998099 0.998099 1
Perturbation in rate VTFL1 1 1 1 1 1 1 1 1 1 1
Perturbation in rate VLFY 1 0.987437 1.00868 1 1.109538 0.990304 1 0.998353 0.998353 1
Perturbation in rate VSUP 1 1 1 1 1 1 1.1 0.986623 0.986623 1
Perturbation in rate VAP3 1 1 1 1 1 1 1 1.124096 1.021906 1
Perturbation in rate VPI 1 1 1 1 1 1 1.124096 1.021906 1 1
Perturbation in rate VUFO 1 1 1 1 1 1 1 1.012361 1.012361 1.1

Analysis of the gene network


We propose to use regulatory strengths as a quantitative description for gene regulatory networks. The regulatory strength of mRNAi on mRNAj through action over the rate of change of mRNAm is written as a product of an elasticity coefficient and a control coefficient:

(2)       

The elasticity coefficient quantifies the effect of a certain mRNAi on the rate of expression of another mRNAm. Elasticities are local (kinetic) properties in that they consider the reaction rate in isolation from the rest of the system. The control coefficient quantifies the effect of the change of rate vm on the concentration of mRNAj. Control coefficients are system properties and only make sense in the context of the system. The elasticity coefficients give the local properties and reflect the effect of a certain mRNAi on one rate only, so the regulatory strengths define the regulation that runs through a certain path in the system. However, the control coefficient gives global properties of the system as a whole. Therefore, we suggest using a special type of regulatory strength; those regulatory strengths that are products of elasticity coefficients and control coefficients that quantify the effect of the rate of change of a certain mRNAj on its own concentration. We call these direct regulatory strengths. This subset of regulatory strengths are given by:

(3)       

Co-control coefficients are ratios between control coefficients and can be calculated from normalized steady state concentrations, such as the data in Table 2.

(4)       

The co-control coefficients can be expressed in terms of the fluorescence ratios, which in turn can be directly obtained from microarrays experiments. Microarray experiments quantify gene expression levels essentially as a ratio of the abundance of mRNA in response to a stimulus to its abundance in a reference state (determined by the ratio of fluorescence intensity of two fluorofors):

(5)       

where F' and F are respectively the fluorescence intensities of the stimulated and reference state, [mRNAi] is the reference concentration of the message of gene i (i = 1,,n) n being the total number of genes analyzed) and [mRNAj]' is the concentration of the same message in the new steady state reached after the stimulus has been applied

We will make use of a central finite differences approximation, (C-C0 )/C, to the scaled derivative dC/C:

(6)       

where C0 is the reference concentration and C is the concentration after perturbation. In microarray experiments, however, one does not determine absolute values but rather a ratio of fluorescence intensities that is equivalent to the ratio of concentrations (FR=C/C0), C0 being the reference concentration). Eq. 6 can then be expressed in terms of the fluorescence ratio FR by dividing denominator and numerator by the same factor C0:

(7)       

Using this result the co-control coefficients can be expressed in terms of fluorescence ratios as in equation 4.

We calcated a sub-set op co-control coefficients, i.e. those which are calculated by dividing a change in a certain mRNA by the change in mRNA which rate of change is perturbed (m=j in Eq 4). We calculated all these coefficients and filled in the matrix according to the following structure (for more details on how this matrix is constructed look
HERE):

(8)       


We then used the relation:

(9)       

to obtain the regulatory strengths:

(10)       

The values for these coefficients determined for a model with a structure depicted in figure 1 were:

(11)       

By using a cut-off value of 0.01, assuming these to be zero, one can exactly reproduce the diagram in Fig. 1

To compare these results to the actual values for regulatory strength, we numerically calculated elasticities and control coefficients (with finite differences in Gepasi) and used the definition of regulatory strength (eq. 3) to calculate the gene regulatory matrix. Note that in order to calculate the regulatory strengths in this way the exact structure of the network must be known; this is the forward approach to calculating regulatory strengths, rather than the reverse approach we described above.

(12)       

The estimate obtained by the in silico experiment is indeed rather good when compared with the solution obtained with the forward approach, indicating that 10% perturbations are appropriate.

Co-control coefficients can be calculated from steady state concentrations of mRNA using a spreadsheet . The calculated co-control matrix can be inverted to obtain regulatory strength matrices by using, for example, Mathematica or a ( Java-based web program for matrix operations).