Team:WHU-China/templates/standardpage modeling
1. Overview
This model aims to predict the final output of a tandem-repeat promoter system, which consists of repeated identical sub-promoters. The key idea of the model is that the strength of a promoter system is proportional to the probability of at least one RNA polymerase (referred to as RNAP below) binding to the promoter.
2. Symbol table, assumptions and reasons
Definition | |
Relative Strength | The relative strength of a promoter is defined by setting the strength of the Anderson promoter BBa_J23100 to one (in E. coli) and scaling the strength of other promoters accordingly (http://parts.igem.org/Promoters/Catalog/Anderson). |
Normalized Strength | The normalized strength of a promoter is calculated by dividing its strength by the highest promoter strength in the host. The highest promoter strength can be reached by constructing an artificial tandem promoter consisting of the strongest known promoter. |
Symbol | |
[ ] | Concentration, e.g. [Protein] means the concentration of the protein |
ptot / y | The probability of at least one RNAP (with all of its subunits) binding to the tandem promoter. It also equals the normalized strength of the promoter. |
n / x | The number of sub-promoters in the tandem promoter system. |
u | Number of copies of a tandem promoter in a cell |
ξ | Strength constant, equal to the strongest expression level possible (in units of fluorescence normalized by an internal reference). |
V | The volume of a cell |
pi | The probability of an RNAP (with all of its subunits) forming an RNAP-promoter complex with the ith sub-promoter in the tandem promoter system. |
qi | qi = 1 - pi, the probability of an RNAP not binding to the ith sub-promoter |
j | Cooperative factor |
α | Transcription rate constant |
λ | mRNA degradation constant |
v | Translation rate constant |
k | Protein degradation constant |
RNAP | RNA Polymerase |
ODE | Ordinary Differential Equation |
RP / RPc | RNAP-Promoter complex, inactive complex |
RPi | Intermediate complex |
RPo | Open complex |
- 1. It is assumed that promoter strength is measured in the same species, in an identical environment and at the same growth stage. This ensures that the concentrations of all RNAP subunits, all ribosome subunits, all RNA degradation enzymes, all kinds of proteases and all transport proteins are almost the same.
- 2. In all measurements, the context of the promoters remains the same, i.e. the same RBS, terminator, protein sequence, upstream element, downstream element and DNA supercoiling.
- 3. Transcription factors are not considered in this version of the model, but they can be included with some modification to the equations.
- 4. The promoter region is accessible to RNAP (and all of its subunits), which means it is not in a heterochromatin region or in any other condition that hampers normal RNAP-DNA interaction.
- 5. The probability of RNAP binding to the region between two sub-promoters within the tandem promoter system is neglected, as it contributes too little to the final ptot.
- 6. RNAP-DNA binding is assumed to stay at equilibrium in the model. This is reasonable because open complex formation is a slow, rate-limiting step of transcription, so on the time scale of open complex formation, RNAP-DNA binding can always reach its equilibrium in negligible time [1][2]. It has also been observed that the inactive RNAP-DNA complex can be detected on the DNA [3]. (*The following assumption is adopted by the commonly used thermodynamics-based model [1], but it is challenged in the later part of the model. We will first keep this assumption to derive the model, and then modify the model for conditions in which this assumption does not hold. The weakness of this assumption is discussed in detail later in the model.)
- 7. The probability (the speed) of RPc transforming into RPo is identical for all promoters, i.e. the strength of a promoter depends only on the probability of RNAP binding to it. This enables us to calculate the promoter strength from the probability of RNAP binding to the promoter.
3. Modeling result
We found that the strength of a tandem promoter system can be described by a simple equation (equation 1), where qi is the probability of an RNAP (with all of its subunits) not forming an RNAP-promoter complex with the ith sub-promoter, n the number of sub-promoters, j the cooperative factor, and ξ the strength constant.
If we define the highest possible expression level of a promoter in a given species as 1, equation 1 becomes normalized.
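As a minimal sketch of the key idea, the snippet below computes the probability that at least one sub-promoter is bound and scales it by ξ. It assumes the sub-promoters bind RNAP independently, so it leaves out the cooperative factor j (how j enters equation 1 is not restated in this text). The function names `p_total` and `tandem_strength` are hypothetical and not part of the original model code.

```python
from math import prod


def p_total(p_list):
    """Probability that at least one sub-promoter is bound by RNAP.

    Assumption (not from the source equation): binding events are
    independent, so the cooperative factor j is ignored.
    """
    for p in p_list:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each p_i must lie in [0, 1]")
    # q_i = 1 - p_i is the probability that sub-promoter i is NOT bound.
    return 1.0 - prod(1.0 - p for p in p_list)


def tandem_strength(p_list, xi=1.0):
    """Strength of the tandem promoter, proportional to p_total.

    xi is the strength constant; xi = 1.0 gives the normalized strength.
    """
    return xi * p_total(p_list)


if __name__ == "__main__":
    # Hypothetical p_i value, for illustration only.
    for n in range(1, 6):
        print(n, tandem_strength([0.3] * n))
```

With `xi = 1.0` the result is the normalized strength; for example, two sub-promoters with pi = 0.5 each give 1 - 0.5 × 0.5 = 0.75 under the independence assumption.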
Figure 1. Model fitting result
The Y-axis represents the normalized promoter strength and the X-axis the number of sub-promoters. The blue dots are data extracted from ref. [4], fig. 2; the red line is the prediction made by our model, and the red dotted lines are the 95% prediction bounds.
- 1. Number of sub-promoters,
- 2. Kind of sub-promoter,
- 3. Order of sub-promoters.
(With an R-square of 0.992 and a 95% confidence bound when fitted with our data)
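For reference, an R-square value like the one quoted above is conventionally computed as 1 - SS_res / SS_tot. The sketch below shows that standard definition only. The function name `r_squared` is hypothetical, and it is an assumption that the fitting tool used for Figure 1 reports R-square in exactly this way.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: distance between data and model prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: distance between data and its own mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are all identical")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the model reproduces every data point; a value of 0 means it does no better than predicting the mean of the data.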
4. Model derivation
The promoter strength may be influenced by various factors. We need to simplify the system into a reasonable toy model by removing all relatively trivial factors.
4.1 Expression level Measurement
We use fluorescence intensity to indicate the strength of the promoter (normalized by an internal reference fluorescent protein (FP), mCherry; see the experiment part for details). This works because, when the excitation light is fixed, the fluorescence is proportional to the concentration of FP, and FP becomes fluorescent shortly after it is synthesized.
4.2 Translation and transcription
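According to the central dogma, the mRNA and protein concentrations obey

$$\frac{d[\mathrm{mRNA}]}{dt} = \alpha[\mathrm{RP}] - \lambda[\mathrm{mRNA}] \tag{3}$$

$$\frac{d[\mathrm{protein}]}{dt} = v[\mathrm{mRNA}] - k[\mathrm{protein}] \tag{4}$$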
In equation 4, the rate of protein increase is determined by [mRNA] and v. With the same RBS, v depends on the efficiency and concentration of ribosomes and the concentration of amino acids in the cell, which can be considered identical under the experimental conditions used to compare different promoters. The rate of protein degradation is determined by [protein] and k; k depends on the protease system in the cell, which can also be considered identical across measurements of different promoters.
In equation 3, the rate of mRNA increase is determined by [RP] and α, and its degradation depends on [mRNA] and λ. Both α and λ can be treated as constants under the experimental conditions used to compare different promoters. α depends on the transcription initiation efficiency, which is assumed to be identical for any RNAP-DNA complex for simplicity. This is reasonable because, if α varies, the difference in α can be incorporated into [RP] (and finally into pi, see the later derivation). Though this part of the equation differs from the equations in [5], it is justified by the observation that when [RNAP] and [DNA] are held constant, UTP incorporation is a zero-order reaction [2]. λ depends on the concentration of RNase, which does not vary between measurements of different promoters.
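Therefore, because we are interested in the steady state of protein expression, we can set both time derivatives to zero,

$$\frac{d[\mathrm{mRNA}]}{dt} = 0, \qquad \frac{d[\mathrm{protein}]}{dt} = 0,$$

so that the steady-state protein concentration is proportional to [RP]: $[\mathrm{protein}] = \frac{\alpha v}{\lambda k}[\mathrm{RP}]$.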
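The steady-state argument of section 4.2 can be checked numerically. The sketch below integrates the two rate equations as described in the text (mRNA produced at rate α[RP] and degraded at rate λ[mRNA]; protein produced at rate v[mRNA] and degraded at rate k[protein]) with a simple forward Euler scheme, and compares the result with the analytic steady state. All parameter values are hypothetical placeholders chosen for illustration, not measured constants, and the function names are not from the original work.

```python
def steady_state(rp, alpha, lam, v, k):
    """Analytic steady state obtained by setting both derivatives to zero."""
    mrna = alpha * rp / lam      # from alpha*[RP] - lam*[mRNA] = 0
    protein = v * mrna / k       # from v*[mRNA] - k*[protein] = 0
    return mrna, protein


def simulate(rp, alpha, lam, v, k, dt=0.01, t_end=200.0):
    """Forward Euler integration starting from zero mRNA and zero protein."""
    mrna, protein = 0.0, 0.0
    for _ in range(round(t_end / dt)):
        d_mrna = alpha * rp - lam * mrna
        d_protein = v * mrna - k * protein
        mrna += dt * d_mrna
        protein += dt * d_protein
    return mrna, protein


if __name__ == "__main__":
    # Hypothetical constants in arbitrary units.
    params = dict(rp=0.3, alpha=2.0, lam=0.5, v=1.0, k=0.1)
    print("analytic :", steady_state(**params))
    print("simulated:", simulate(**params))
```

Because the steady-state protein level is linear in [RP], doubling [RP] in this sketch doubles the protein level, which is why the measured fluorescence can stand in for the probability of RNAP binding.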
4.3 RNAP binding and transcription initiation
Open complex formation proceeds from the inactive RNAP-promoter complex (RPc) through an intermediate complex (RPi) to the open complex (RPo).
To evaluate the probability of polymerase binding (pi), we must sum the Boltzmann weights over all possible states of P polymerase molecules on the DNA.
This sum yields the probability of an RNAP binding to promoter i and, in the same way, the probability of RNAPs binding to both promoter i and promoter j.
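For a single promoter, summing Boltzmann weights in the thermodynamic framework of ref. [1] leads to the well-known form p = 1 / (1 + (N_NS / P) · exp(Δε / kT)), where N_NS is the number of non-specific binding sites and Δε is the binding energy at the promoter relative to non-specific DNA. The sketch below evaluates that single-site expression. It is an assumption that the derivation here uses exactly this form; the parameter names and the numbers in the usage lines are hypothetical, and the two-promoter case is not covered.

```python
import math


def p_bound(n_polymerase, n_nonspecific, delta_eps_kt):
    """Single-site binding probability in the thermodynamic model.

    n_polymerase  : number of RNAP molecules available (P)
    n_nonspecific : number of non-specific DNA binding sites (N_NS)
    delta_eps_kt  : promoter binding energy minus non-specific binding
                    energy, in units of kT (more negative = stronger)
    """
    if n_polymerase <= 0 or n_nonspecific <= 0:
        raise ValueError("molecule and site counts must be positive")
    weight_ratio = (n_nonspecific / n_polymerase) * math.exp(delta_eps_kt)
    return 1.0 / (1.0 + weight_ratio)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for energy in (-2.0, -5.0, -8.0):
        print(energy, p_bound(1000, 5_000_000, energy))
```

In this form, a stronger promoter (more negative Δε) or a larger pool of RNAP both raise the binding probability, which is the behaviour the model relies on when it maps promoter strength to pi.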
Figure 2. Model fitting result of the simpler model
Figure 3. Curve fitting residual plot of the simpler model
Figure 4. Curve fitting residual plot of the final model
5. User Guideline
To employ the model, the user needs to assign a pi to each kind of promoter that will be used to construct the tandem promoter.
The simplest way to achieve this is as follows.
1) Use a fluorescent protein to indicate the expression level of each promoter or promoter combination; optionally, normalize it by an internal reference (just as we used an RFP in our experiment).
2) Measure the strongest expression level possible in the species: use a known strongest promoter to construct a tandem promoter made of 5 repeats of that promoter, and measure its expression level.
3) Normalize every other promoter's expression level by the strongest expression level, which gives the pi of each promoter.
In this way, the error of the prediction should be less than 4% of the maximum expression rate, as our data showed.
If the data allow, the user can carry out the fit with a variable j, which may vary between species and cell conditions.
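The normalization in step 3 can be sketched as follows. The helper name `normalize_strengths`, the promoter names and the fluorescence readings are all hypothetical placeholders, and clipping the result to the range [0, 1] is a design choice of this sketch rather than something specified by the model.

```python
def normalize_strengths(measured, strongest_level):
    """Turn measured expression levels into p_i values.

    measured        : dict mapping promoter name -> expression level
                      (fluorescence, optionally already divided by the
                      internal reference signal)
    strongest_level : expression level of the 5-repeat tandem promoter
                      built from the strongest known promoter (step 2)
    """
    if strongest_level <= 0:
        raise ValueError("strongest_level must be positive")
    # Clip so that measurement noise cannot push a probability outside [0, 1].
    return {
        name: min(max(value / strongest_level, 0.0), 1.0)
        for name, value in measured.items()
    }


if __name__ == "__main__":
    # Hypothetical readings in arbitrary fluorescence units.
    readings = {"promoter_A": 50.0, "promoter_B": 120.0}
    print(normalize_strengths(readings, strongest_level=200.0))
```

The resulting values can then be used as the pi inputs when predicting the strength of a tandem promoter assembled from these sub-promoters.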
References:
1. Bintu, Lacramioara, et al. "Transcriptional regulation by the numbers: models." Current Opinion in Genetics & Development 15.2 (2005): 116-124.
2. Buc, Henri, and William R. McClure. "Kinetics of open complex formation between Escherichia coli RNA polymerase and the lac UV5 promoter. Evidence for a sequential mechanism involving three steps." Biochemistry 24.11 (1985): 2712-2723.
3. DeHaseth, Pieter L., and John D. Helmann. "Open complex formation by Escherichia coli RNA polymerase: the mechanism of polymerase-induced strand separation of double helical DNA." Molecular Microbiology 16.5 (1995): 817-824.
4. Li, Mingji, et al. "A strategy of gene overexpression based on tandem repetitive promoters in Escherichia coli." Microb Cell Fact 11 (2012): 19.
5. Buchler, Nicolas E., Ulrich Gerland, and Terence Hwa. "Nonlinear protein degradation and the function of genetic circuits." Proceedings of the National Academy of Sciences of the United States of America 102.27 (2005): 9559-9564.