Team:WHU-China/templates/standardpage modelingCas9
From 2013.igem.org
(Difference between revisions)
IgnatzZeng (Talk | contribs) (Created page with "<html> <link rel="stylesheet" href="https://2013.igem.org/Team:WHU-China/basecss?action=raw&ctype=text/css" type="text/css" /> <link rel="stylesheet" href="https://2013.igem.org/Te...") |
IgnatzZeng (Talk | contribs) |
||
Line 9: | Line 9: | ||
<h1 style="font-size:20px;"><b> | <h1 style="font-size:20px;"><b> | ||
Cas9 Off-target Prediction Model</b>.(Abbreviation: Cas9Off Model) | Cas9 Off-target Prediction Model</b>.(Abbreviation: Cas9Off Model) | ||
- | </h1 | + | </h1></br></br> |
<a name="overview"></a> | <a name="overview"></a> | ||
Line 36: | Line 36: | ||
2. 2. Symbol table, Assumption and reasons.</b></h1></br> | 2. 2. Symbol table, Assumption and reasons.</b></h1></br> | ||
<div style="text-align:center;width:100%;"> | <div style="text-align:center;width:100%;"> | ||
+ | |||
+ | <div style="text-align:center"> | ||
<img src="https://static.igem.org/mediawiki/2013/3/39/WHUTable1aCas9.png" /></br> | <img src="https://static.igem.org/mediawiki/2013/3/39/WHUTable1aCas9.png" /></br> | ||
<img src="https://static.igem.org/mediawiki/2013/8/8e/WHUTable1bCas9.png"/></div> | <img src="https://static.igem.org/mediawiki/2013/8/8e/WHUTable1bCas9.png"/></div> | ||
+ | </div> | ||
1. As Cas9 need the guiding of gRNA to cut DNA, the unbounded gRNA and Cas9 are ignored in the analysis, and other gRNA and Cas9 are considered to constantly bind to each other. </br> | 1. As Cas9 need the guiding of gRNA to cut DNA, the unbounded gRNA and Cas9 are ignored in the analysis, and other gRNA and Cas9 are considered to constantly bind to each other. </br> | ||
Line 50: | Line 53: | ||
</br></br> | </br></br> | ||
We employ a NN nearest neighbor model to calculate the △G(i) between gRNA and DNA on each NN position. From the first nucleotide of the target area of gRNA to the 20th, △G(i) of totally 21 position are calculated. We first proved the feasibility of our idea by calculating the correlation between △G(i) and cutting efficiency (employing data from [1]). </br> | We employ a NN nearest neighbor model to calculate the △G(i) between gRNA and DNA on each NN position. From the first nucleotide of the target area of gRNA to the 20th, △G(i) of totally 21 position are calculated. We first proved the feasibility of our idea by calculating the correlation between △G(i) and cutting efficiency (employing data from [1]). </br> | ||
+ | <div style="text-align:center"> | ||
<img src="https://static.igem.org/mediawiki/2013/6/67/WHUResults.png" /></br> | <img src="https://static.igem.org/mediawiki/2013/6/67/WHUResults.png" /></br> | ||
+ | </div> | ||
<center><em>Figure 1. Correlation map between △G(i) and Cas9 cutting efficiency</em></center></br> | <center><em>Figure 1. Correlation map between △G(i) and Cas9 cutting efficiency</em></center></br> | ||
There may be two reason for the negative correlation between △G(1),△G(2).△G(3) and cutting efficiency. </br></br> | There may be two reason for the negative correlation between △G(1),△G(2).△G(3) and cutting efficiency. </br></br> | ||
Line 64: | Line 69: | ||
<b>4.1. Calculation of △G’ of DNA-gRNA binding</br></b> | <b>4.1. Calculation of △G’ of DNA-gRNA binding</br></b> | ||
The calculation method of △G(i) and △G’ is modified from the NN nearest neighbor model introduced in [2]. </br> | The calculation method of △G(i) and △G’ is modified from the NN nearest neighbor model introduced in [2]. </br> | ||
+ | <div style="text-align:center"> | ||
<img src="https://static.igem.org/mediawiki/2013/0/04/WHUCas9gRNA.png" align=right /> | <img src="https://static.igem.org/mediawiki/2013/0/04/WHUCas9gRNA.png" align=right /> | ||
+ | </div> | ||
<center><em>Figure 2. schematic picture of Cas9 digestion, modified from [1]</em></center></br> | <center><em>Figure 2. schematic picture of Cas9 digestion, modified from [1]</em></center></br> | ||
<b>Step1.</b> Set up the binding sequence</br> | <b>Step1.</b> Set up the binding sequence</br> | ||
Line 92: | Line 99: | ||
If a dangling end is determined, determine the first match position following the mismatch position. In the example, this will be position 2 (△G(2)). Set all dangling end position energy as 0, i.e. △G(1)=0, and calculate the first match according to Table 2, i.e. △G(2)=5’TC/G+3’GG/C=-0.58-0.44=-1.02 kcal/mol, </br></br> | If a dangling end is determined, determine the first match position following the mismatch position. In the example, this will be position 2 (△G(2)). Set all dangling end position energy as 0, i.e. △G(1)=0, and calculate the first match according to Table 2, i.e. △G(2)=5’TC/G+3’GG/C=-0.58-0.44=-1.02 kcal/mol, </br></br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/e/e4/WHUDanglingend.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/e/e4/WHUDanglingend.png" /></br></div> | ||
<center><em>Table 2. Nearest-neighbor model for terminal dangling ends next to Watson-Crick pairs in 1 M NaCl, modified from Table 3 of [5]</em></center></br> | <center><em>Table 2. Nearest-neighbor model for terminal dangling ends next to Watson-Crick pairs in 1 M NaCl, modified from Table 3 of [5]</em></center></br> | ||
If no dangling end appears. Determine whether the terminal pair is A-T. If yes, add a terminal AT penalty(+0.05) to the △G(i), and calculate all △G(i) according to Table 3. </br></br> | If no dangling end appears. Determine whether the terminal pair is A-T. If yes, add a terminal AT penalty(+0.05) to the △G(i), and calculate all △G(i) according to Table 3. </br></br> | ||
Line 104: | Line 112: | ||
△G(19)=-1.84 kcal/mol</br> | △G(19)=-1.84 kcal/mol</br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/3/33/WHUPropagation.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/3/33/WHUPropagation.png" /></br></div> | ||
<center><em>Table 3.Nearest-neighbor model, modified from Table 2 of [5]</em></center></br> | <center><em>Table 3.Nearest-neighbor model, modified from Table 2 of [5]</em></center></br> | ||
<b>Step4.</b> Further analysis of internal loops and bulges. </br> | <b>Step4.</b> Further analysis of internal loops and bulges. </br> | ||
Line 117: | Line 126: | ||
<b>Step6.</b> Calculate △G’ We assume △G’ takes up a form of <img src="https://static.igem.org/mediawiki/2013/d/dc/WHUVectorW.png"/> . Where “a” is an 1×19 vector that contain △G(1) to △G(19) as its value, ω is the weight vector. Only the impact of DNA-gRNA interaction (“a”) is counting as a variable, and the △G contributed by other interaction(eg. protein-DNA interaction) are considered as a constant b. This is also why this model cannot predict Cas9 off-target rate of a target without PAM(NGG), which interact with Cas9 rather than gRNA. (Assumption 4) </br></br> | <b>Step6.</b> Calculate △G’ We assume △G’ takes up a form of <img src="https://static.igem.org/mediawiki/2013/d/dc/WHUVectorW.png"/> . Where “a” is an 1×19 vector that contain △G(1) to △G(19) as its value, ω is the weight vector. Only the impact of DNA-gRNA interaction (“a”) is counting as a variable, and the △G contributed by other interaction(eg. protein-DNA interaction) are considered as a constant b. This is also why this model cannot predict Cas9 off-target rate of a target without PAM(NGG), which interact with Cas9 rather than gRNA. (Assumption 4) </br></br> | ||
According to the previous steps, the △G’of our example should be</br> | According to the previous steps, the △G’of our example should be</br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/c/c0/WHUDeltaG.png" /></br></br></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/c/c0/WHUDeltaG.png" /></br></div></br></br> | ||
<b>4.2. Correlations between △G’ and Cas9 targeting efficiency </br></b> | <b>4.2. Correlations between △G’ and Cas9 targeting efficiency </br></b> | ||
</br> | </br> | ||
Vikram Pattanayak et al. used in vitro selection and high-throughput sequencing to determine the propensity of eight guide-RNA:Cas9 complexes to cleave each of 1012 | Vikram Pattanayak et al. used in vitro selection and high-throughput sequencing to determine the propensity of eight guide-RNA:Cas9 complexes to cleave each of 1012 | ||
potential off-target DNA sequences. This size is sufficiently large to include tenfold coverage of all sequences with eight or fewer mutations relative to each 22-base-pair target sequence. </br></br> | potential off-target DNA sequences. This size is sufficiently large to include tenfold coverage of all sequences with eight or fewer mutations relative to each 22-base-pair target sequence. </br></br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/1/1d/WHUPrepostselect.png" /></br></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/1/1d/WHUPrepostselect.png" /></br></div></br> | ||
The DNA of target sequences and their corresponding potential off-target sites were produced as substrates by PCR and rolling circle amplification. The abundance of each kind of sequence in the pre-selection library will differ from their abundance in post-selection library. This abundance changes reveals the relative targeting efficiency of the Cas9 on certain target. </br></br> | The DNA of target sequences and their corresponding potential off-target sites were produced as substrates by PCR and rolling circle amplification. The abundance of each kind of sequence in the pre-selection library will differ from their abundance in post-selection library. This abundance changes reveals the relative targeting efficiency of the Cas9 on certain target. </br></br> | ||
Line 147: | Line 158: | ||
<b>4.3. Derivation of Cas9 binding model, for off targe prediction of d/aCas9</b></br></br> | <b>4.3. Derivation of Cas9 binding model, for off targe prediction of d/aCas9</b></br></br> | ||
Cas9 must first binds to DNA to cut them. For d/aCas9, </br> | Cas9 must first binds to DNA to cut them. For d/aCas9, </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/7/76/WHUStep1.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/7/76/WHUStep1.png" /></br></div> | ||
Where E stands for the enzyme - gRNA-Cas9, S the substrate - certain DNA of specific sequence, ES the gRNA-Cas9-DNA complex. We keep calling Cas9 a enzyme for uniformity in this article, though all Cas9 considered in this part(4.3) is deactivated and is actually not an enzyme. | Where E stands for the enzyme - gRNA-Cas9, S the substrate - certain DNA of specific sequence, ES the gRNA-Cas9-DNA complex. We keep calling Cas9 a enzyme for uniformity in this article, though all Cas9 considered in this part(4.3) is deactivated and is actually not an enzyme. | ||
</br></br> | </br></br> | ||
First, we link △G’ with [S], [E] and [ES] through </br> | First, we link △G’ with [S], [E] and [ES] through </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/e/e8/WHUCas9Kd.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/e/e8/WHUCas9Kd.png" /></br></div> | ||
In a living cell at steady state, the protein concentration is usually kept in a constant. In E.coli this constant is approximately 1nM[9]. The substrate concentration is also fixed, as certain sequence usually has relative fixed copy number in a cell, especially in prokaryote like E.coli. The concentration of certain DNA sequence in a cell is typically</br> | In a living cell at steady state, the protein concentration is usually kept in a constant. In E.coli this constant is approximately 1nM[9]. The substrate concentration is also fixed, as certain sequence usually has relative fixed copy number in a cell, especially in prokaryote like E.coli. The concentration of certain DNA sequence in a cell is typically</br> | ||
+ | <div style="text-align:center"> | ||
<img src="https://static.igem.org/mediawiki/2013/f/f9/WHUDnavivo.png" /></br> | <img src="https://static.igem.org/mediawiki/2013/f/f9/WHUDnavivo.png" /></br> | ||
<img src="https://static.igem.org/mediawiki/2013/5/5d/WHUES.png" /></br> | <img src="https://static.igem.org/mediawiki/2013/5/5d/WHUES.png" /></br> | ||
+ | </div> | ||
For Cas9 guided by two different gRNA targeting at the same sequence,</br> | For Cas9 guided by two different gRNA targeting at the same sequence,</br> | ||
+ | <div style="text-align:center"> | ||
<img src="https://static.igem.org/mediawiki/2013/f/f7/WHUESdivide.png" /></b> | <img src="https://static.igem.org/mediawiki/2013/f/f7/WHUESdivide.png" /></b> | ||
+ | </div> | ||
Lei et.al. and Bikard et.al. use the inhibitory effect of dCas9 to measure the targeting efficiency of different gRNA [3,4]. The regulated florescence represents the inhibitory effect of dCas9. Prashant et.al. employ aCas9 for the same purpose[2]. But they sequence mRNA to measure the activation of the aCas9 guided by various gRNA. Anyway, the concentration of fluorescence protein and mRNA both obey following ODEs (detailed explanation in our TP model, the equation is the same as the equations in [10]). </br></br> | Lei et.al. and Bikard et.al. use the inhibitory effect of dCas9 to measure the targeting efficiency of different gRNA [3,4]. The regulated florescence represents the inhibitory effect of dCas9. Prashant et.al. employ aCas9 for the same purpose[2]. But they sequence mRNA to measure the activation of the aCas9 guided by various gRNA. Anyway, the concentration of fluorescence protein and mRNA both obey following ODEs (detailed explanation in our TP model, the equation is the same as the equations in [10]). </br></br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/4/46/WHUDnadiff.png" /></br></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/4/46/WHUDnadiff.png" /></br></div></br> | ||
These equation can reach steady state quickly when compared with the time scale of any in vivo or in vitro experiment. Because according to the data from [3], Cas9-DNA binding can achieve equilibrium within 0~3 min. The reasoning is as follow. </br> | These equation can reach steady state quickly when compared with the time scale of any in vivo or in vitro experiment. Because according to the data from [3], Cas9-DNA binding can achieve equilibrium within 0~3 min. The reasoning is as follow. </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/0/0e/WHURFPFluores.png" style="width:100%;height:auto;"></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/0/0e/WHURFPFluores.png" style="width:100%;height:auto;"></br></div> | ||
<center><em>Figure 3. dCas9 regulation on promoter J23119 (extracted from [3])</em></center></br> | <center><em>Figure 3. dCas9 regulation on promoter J23119 (extracted from [3])</em></center></br> | ||
Notice that, on Figure 3, the RFP started to decrease exponentially 10min after the adding of inducer. This is only possible, when v[mRNA] is hold as a constant. So d[mRNA]/dt=0, which means [TF] is a constant. In this equation, [TF] means the concentration of transcription factor that binding to the promoter, while dCas9 is the only transcription factor in this experiment. According to table 2.1 and 2.2 in [9], the typical mRNA lifetime in E.coli is 2-5 min, the time for protein (Cas9) transcription and translation is 5 min. So the Cas9-DNA binding can achieve equilibrium within (10-5-5~10-5-2) 0~3 min in vivo. So the time needed to achieve equilibrium is much shorter than the experiment time-scale both in vivo and in vitro. </br></br> | Notice that, on Figure 3, the RFP started to decrease exponentially 10min after the adding of inducer. This is only possible, when v[mRNA] is hold as a constant. So d[mRNA]/dt=0, which means [TF] is a constant. In this equation, [TF] means the concentration of transcription factor that binding to the promoter, while dCas9 is the only transcription factor in this experiment. According to table 2.1 and 2.2 in [9], the typical mRNA lifetime in E.coli is 2-5 min, the time for protein (Cas9) transcription and translation is 5 min. So the Cas9-DNA binding can achieve equilibrium within (10-5-5~10-5-2) 0~3 min in vivo. So the time needed to achieve equilibrium is much shorter than the experiment time-scale both in vivo and in vitro. </br></br> | ||
So we can consider the equations are in steady state. </br> | So we can consider the equations are in steady state. </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/6/61/WHUProtein.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/6/61/WHUProtein.png" /></br></div> | ||
Because dCas9 or aCas9 is the only transcriptional factor for the promoter of measurement, the concentrations of mRNA and fluorescence protein are proportional to the concentration of d/aCas9 binding to the target promoter. So the relative repression or activation activity of d/aCas9 guided by two different gRNA is, </br> | Because dCas9 or aCas9 is the only transcriptional factor for the promoter of measurement, the concentrations of mRNA and fluorescence protein are proportional to the concentration of d/aCas9 binding to the target promoter. So the relative repression or activation activity of d/aCas9 guided by two different gRNA is, </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/e/e2/WHUMeasurement.png" /> </br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/e/e2/WHUMeasurement.png" /> </br></div> | ||
Therefore we can link △G’ with the data of [2,3,4], and calculate the relation between △G(i) and △G’. </br></br></br> | Therefore we can link △G’ with the data of [2,3,4], and calculate the relation between △G(i) and △G’. </br></br></br> | ||
In order to predict the off-target rate of d/aCas9. Following equation can be derived. At equilibrium, </br> | In order to predict the off-target rate of d/aCas9. Following equation can be derived. At equilibrium, </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/9/9a/WHUPb.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/9/9a/WHUPb.png" /></br></div> | ||
So at equilibrium, the probability of a substrate binding with a Cas9 is [E0]/([E0]+Kd). If we set pbw as the probability of d/aCas9 binding to the wrong target, pbr as the probability of d/aCas9 binding to the right target. The off-target rate will be, </br> | So at equilibrium, the probability of a substrate binding with a Cas9 is [E0]/([E0]+Kd). If we set pbw as the probability of d/aCas9 binding to the wrong target, pbr as the probability of d/aCas9 binding to the right target. The off-target rate will be, </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/e/e7/WHUPbv.png" /></br></br></br></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/e/e7/WHUPbv.png" /></br></div></br></br></br> | ||
<b>4.4. Derivation of Cas9 cutting model, for off targe prediction of Cas9</b></br></br> | <b>4.4. Derivation of Cas9 cutting model, for off targe prediction of Cas9</b></br></br> | ||
Cas9 contains two nuclease domain - a RuvC-like domain and a HNH motif flanked by two RuvC-like domains. Each of them responsible for cutting one of the two nucleotide chains[11]. The kinetic of endonuclease catalyzed DNA double strand break is very complex. But fortunately, experiments have showed that most double strand break process can be approximated by a consecutive first-order reaction as below[12,13,14]. RuvC itself also show enzymatic activity consistent with first-order reactions based prediction[15]. </br> | Cas9 contains two nuclease domain - a RuvC-like domain and a HNH motif flanked by two RuvC-like domains. Each of them responsible for cutting one of the two nucleotide chains[11]. The kinetic of endonuclease catalyzed DNA double strand break is very complex. But fortunately, experiments have showed that most double strand break process can be approximated by a consecutive first-order reaction as below[12,13,14]. RuvC itself also show enzymatic activity consistent with first-order reactions based prediction[15]. </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/e/e6/WHUAbc.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/e/e6/WHUAbc.png" /></br></div> | ||
in the equation A represents the intact DNA duplex, B the DNA molecule in which one of the two strands has been cleaved at the recognition site for the restriction enzyme and C the DNA molecule (or molecules) in which both strands have been cleaved at this site. | in the equation A represents the intact DNA duplex, B the DNA molecule in which one of the two strands has been cleaved at the recognition site for the restriction enzyme and C the DNA molecule (or molecules) in which both strands have been cleaved at this site. | ||
</br></br> | </br></br> | ||
In order to link the apparent first-order rate constant to △G’. We assume both steps of cleaving is classic enzymatic reaction as follow. </br> | In order to link the apparent first-order rate constant to △G’. We assume both steps of cleaving is classic enzymatic reaction as follow. </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/6/60/WHUSees.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/6/60/WHUSees.png" /></br></div> | ||
With S as the substrate, E the enzyme and P the product. </br> | With S as the substrate, E the enzyme and P the product. </br> | ||
</br> | </br> | ||
One can derive the concentration-time function of C following enzyme kinetic equations. The equation will be like following (derivation details in addendum变成链接): </br> | One can derive the concentration-time function of C following enzyme kinetic equations. The equation will be like following (derivation details in addendum变成链接): </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/7/7f/WHUKa.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/7/7f/WHUKa.png" /></br></div> | ||
This equation is hard to link with △G’, as</br> | This equation is hard to link with △G’, as</br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/5/5e/WHUKm.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/5/5e/WHUKm.png" /></br></div> | ||
It’s also hard to fit into present data, as there is no kinetic data for Cas9 available now. So we have to change our goal from predict the “exact off target rate” to the “off target probability”. We decide to use the binding probability of Cas9 and certain DNA to indicate the probability of Cas9 cutting the target. </br></br> | It’s also hard to fit into present data, as there is no kinetic data for Cas9 available now. So we have to change our goal from predict the “exact off target rate” to the “off target probability”. We decide to use the binding probability of Cas9 and certain DNA to indicate the probability of Cas9 cutting the target. </br></br> | ||
But this function can tell us that the product concentration will increase in following patterns. </br> | But this function can tell us that the product concentration will increase in following patterns. </br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/c/c0/WHUTheor.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/c/c0/WHUTheor.png" /></br></div> | ||
<center><em>Figure 4. Theoretical curves from the Cas9 cleaving reaction | <center><em>Figure 4. Theoretical curves from the Cas9 cleaving reaction | ||
The curves displaying changes of two different cleaved products. Boundary conditions were set as [A0]=1.0, [B0]=[C0]=0, ka=0.2 min-1,kb=0.1 min-1 for red line; | The curves displaying changes of two different cleaved products. Boundary conditions were set as [A0]=1.0, [B0]=[C0]=0, ka=0.2 min-1,kb=0.1 min-1 for red line; | ||
Line 203: | Line 231: | ||
Pattanayak’s in vitro experiment can reveal the off-target rate in vivo. Because in the experiment the DNA and gRNA-Cas9 concentration is 200nM and 100nM respectively. Every single kind of DNA has a abundance equals to or less than 0.1% (which is approximately the abundance of wild type sequence, the most abundant one), so the concentration of a specific DNA is on the same power(or less than) 0.1nM. Therefore, </br></br> | Pattanayak’s in vitro experiment can reveal the off-target rate in vivo. Because in the experiment the DNA and gRNA-Cas9 concentration is 200nM and 100nM respectively. Every single kind of DNA has a abundance equals to or less than 0.1% (which is approximately the abundance of wild type sequence, the most abundant one), so the concentration of a specific DNA is on the same power(or less than) 0.1nM. Therefore, </br></br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/3/35/WHUDNAcas9.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/3/35/WHUDNAcas9.png" /></br></div> | ||
Nucleolus size according to [K], in vivo protein concentration of mammalian cell from [9] | Nucleolus size according to [K], in vivo protein concentration of mammalian cell from [9] | ||
</br></br> | </br></br> | ||
Line 215: | Line 244: | ||
Addendum</br></br> | Addendum</br></br> | ||
- | <img src="https://static.igem.org/mediawiki/2013/6/6e/WHUDpdt.png" /></br> | + | <div style="text-align:center"> |
+ | <img src="https://static.igem.org/mediawiki/2013/6/6e/WHUDpdt.png" /></br></div> | ||
In a typical endonuclease environment, <img src="https://static.igem.org/mediawiki/2013/3/3b/WHUAkm.png" /> and <img src="https://static.igem.org/mediawiki/2013/9/9b/WHUAe.png" />are always hold. Even in Pattanayak’s paper[1], though the total DNA concentration is 200nM, the concentration every single kind of DNA(with certain sequence) is lower than 0.1nM, which is much lower than KM of any typical restriction enzyme, </br> | In a typical endonuclease environment, <img src="https://static.igem.org/mediawiki/2013/3/3b/WHUAkm.png" /> and <img src="https://static.igem.org/mediawiki/2013/9/9b/WHUAe.png" />are always hold. Even in Pattanayak’s paper[1], though the total DNA concentration is 200nM, the concentration every single kind of DNA(with certain sequence) is lower than 0.1nM, which is much lower than KM of any typical restriction enzyme, </br> | ||
But still, the MM equation remains valid. Because, first, under these conditions, [E] (free E concentration) doesn't change much, because most "enzymes" are in free form and they don't do anything; second, some time after enzyme and substrate are mixed the concentrations of free enzyme sites and of substrate complexed will reach a steady state.[L] </br> | But still, the MM equation remains valid. Because, first, under these conditions, [E] (free E concentration) doesn't change much, because most "enzymes" are in free form and they don't do anything; second, some time after enzyme and substrate are mixed the concentrations of free enzyme sites and of substrate complexed will reach a steady state.[L] </br> | ||
+ | <div style="text-align:center"> | ||
<img src="https://static.igem.org/mediawiki/2013/b/bf/WHUVka.png" /></br> | <img src="https://static.igem.org/mediawiki/2013/b/bf/WHUVka.png" /></br> | ||
<img src="https://static.igem.org/mediawiki/2013/3/38/WHUAbccascade.png" /></br> | <img src="https://static.igem.org/mediawiki/2013/3/38/WHUAbccascade.png" /></br> | ||
+ | </div> | ||
<h1 style="font-size:20px;"><b>Reference</b></h1></br> | <h1 style="font-size:20px;"><b>Reference</b></h1></br> |
Revision as of 17:30, 26 September 2013
Cas9 Off-target Prediction Model (abbreviation: Cas9Off Model)
1. Overview
This model aims to predict the off-target rate of any Cas9-based system in vivo. It rests on three key ideas. First, the probability that Cas9 recognizes and binds a DNA sequence is determined mainly by the affinity between the gRNA and the DNA, and different positions on the gRNA carry different weights. Second, by analyzing the binding equilibrium together with dCas9 inhibition data and aCas9 activation data, a model can be constructed that predicts the probability of gRNA-d/aCas9 binding to a given target in vivo. Fitting this model also yields the equation for calculating △G’ from △G(i). Finally, by combining the △G’ equation with kinetic analysis, a model that predicts the in vivo editing off-target rate of Cas9 can be constructed and fitted to high-throughput data. The data used to fit the Cas9 editing model were generously provided by Vikram Pattanayak and Prof. David Liu, who published the paper "High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity" in Nature Biotechnology on 11 Aug 2013 [1]. The data used to fit the Cas9 binding model were extracted from Fig 2C, S7B and S7C of [2], Fig 5C of [3], and Fig 2AB of [4], using GetData Graph Digitizer V2.22.2.
2. Symbol table, assumptions and reasons
3. Modeling result
We employ a nearest-neighbor (NN) model to calculate △G(i) between the gRNA and the DNA at each NN position. From the first nucleotide of the gRNA target region to the 20th, △G(i) is calculated for a total of 21 positions. We first checked the feasibility of this idea by calculating the correlation between △G(i) and cutting efficiency, using data from [1].
4. Model derivation
4.1. Calculation of △G’ of DNA-gRNA binding
The method for calculating △G(i) and △G’ is adapted from the nearest-neighbor (NN) model introduced in [2].
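As a rough illustration of the per-position nearest-neighbor lookup, the sketch below computes one stacking energy per adjacent base pair for a fully matched duplex. The energy values are an assumption: they are the commonly cited unified DNA nearest-neighbor free energies at 37 °C and stand in for the page's Table 3, which is not reproduced here. The function names and the example sequence are hypothetical, and dangling ends, terminal penalties, mismatches, loops and bulges are not handled.

```python
# Assumed nearest-neighbor stacking free energies in kcal/mol (37 °C, 1 M NaCl).
# Keys are the 5'->3' dinucleotide on one strand; the ten unique stacks are listed.
NN_DG = {
    "AA": -1.00, "AT": -0.88, "TA": -0.58, "CA": -1.45, "GT": -1.44,
    "CT": -1.28, "GA": -1.30, "CG": -2.17, "GC": -2.24, "GG": -1.84,
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def stack_dg(dinucleotide):
    """Return the stacking energy of one perfectly paired dinucleotide step."""
    if dinucleotide in NN_DG:
        return NN_DG[dinucleotide]
    # The remaining six stacks are equivalent to the reverse complement of a listed one.
    reverse_complement = dinucleotide.translate(_COMPLEMENT)[::-1]
    return NN_DG[reverse_complement]


def positional_dg(sequence):
    """Return the list of per-position energies, one per adjacent base pair."""
    return [stack_dg(sequence[i:i + 2]) for i in range(len(sequence) - 1)]


if __name__ == "__main__":
    target = "GACGCATAAAGATGAGACGC"  # hypothetical 20-nt target region
    energies = positional_dg(target)
    print(len(energies), "stacking positions")
    print("sum of stacking energies: %.2f kcal/mol" % sum(energies))
```

A 20-nucleotide perfectly matched region yields 19 stacking terms in this simplified form; the additional positions and corrections described in the page's steps would have to be layered on top.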
The concentration of C as a function of time can be derived from the enzyme kinetic equations. It takes the following form (derivation details in the Addendum):
This equation is hard to link with △G’, as
It is also hard to fit to the available data, because no kinetic data for Cas9 are available at present. We therefore change our goal from predicting the “exact off-target rate” to predicting the “off-target probability”, and use the probability of Cas9 binding a given DNA as an indicator of the probability of Cas9 cutting that target.
Nevertheless, this function shows that the product concentration increases in the following pattern.
Figure 4. Theoretical curves for the Cas9 cleavage reaction.
The curves display the changes of the two different cleaved products. Initial conditions were set as [A0]=1.0, [B0]=[C0]=0, ka=0.2 min⁻¹, kb=0.1 min⁻¹ for the red line,
and [A0]=1.0, [B0]=[C0]=0, ka=0.1 min⁻¹, kb=0.05 min⁻¹ for the blue line.
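The curves in Figure 4 can be approximated with the textbook closed-form solution for two consecutive irreversible first-order steps A → B → C. The sketch below is only an illustration under that assumption: it uses the rate constants quoted in the caption, the function name is hypothetical, and the formula requires ka ≠ kb.

```python
import math


def consecutive_first_order(t, ka, kb, a0=1.0):
    """Concentrations of A, B, C at time t for A -> B -> C with B0 = C0 = 0.

    ka and kb are the first-order rate constants of the two steps (ka != kb).
    """
    a = a0 * math.exp(-ka * t)
    b = a0 * ka / (kb - ka) * (math.exp(-ka * t) - math.exp(-kb * t))
    c = a0 - a - b  # mass balance: A + B + C stays equal to A0
    return a, b, c


if __name__ == "__main__":
    # Parameter sets taken from the Figure 4 caption (units: 1/min).
    for label, ka, kb in (("red", 0.2, 0.1), ("blue", 0.1, 0.05)):
        for t in (0, 10, 30, 60):
            a, b, c = consecutive_first_order(t, ka, kb)
            print(f"{label:4s} t={t:3d} min  A={a:.3f}  B={b:.3f}  C={c:.3f}")
```

Because C is formed only after both strands are cut, its curve starts with zero slope and lags behind B, which is the pattern the caption describes.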
Pattanayak’s in vitro experiment can reveal the off-target rate in vivo, because the DNA and gRNA-Cas9 concentrations in the experiment are 200 nM and 100 nM respectively. Each individual DNA sequence has an abundance of at most 0.1% (approximately the abundance of the wild-type sequence, the most abundant one), so the concentration of any specific DNA is of the order of 0.1 nM or less. Therefore,
(Nucleus size according to [K]; in vivo protein concentration of mammalian cells from [9].)
The DNA-Cas9 ratio is of the same order, so it’s reasonable to use the experimental data to predict the Cas9 behavior in vivo.
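The in vitro side of this comparison is simple arithmetic, sketched below using only the numbers quoted above (200 nM total DNA, 100 nM gRNA-Cas9, at most 0.1% abundance per sequence). The variable and function names are hypothetical, and the in vivo side is not computed because its inputs are not given in the text.

```python
def specific_dna_nM(total_dna_nM, max_abundance):
    """Upper bound on the concentration of one specific DNA sequence."""
    return total_dna_nM * max_abundance


if __name__ == "__main__":
    total_dna_nM = 200.0   # total DNA in the in vitro selection
    cas9_nM = 100.0        # gRNA-Cas9 complex concentration
    max_abundance = 0.001  # 0.1 %, the most abundant (wild-type) sequence

    dna_nM = specific_dna_nM(total_dna_nM, max_abundance)
    print(f"specific DNA <= {dna_nM:.2f} nM")
    print(f"DNA : Cas9 ratio <= {dna_nM / cas9_nM:.4f}")
```

The bound comes out at a few tenths of a nanomolar, the same order of magnitude as the 0.1 nM figure used in the text, with the enzyme in large excess over any single substrate sequence.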
In a typical endonuclease environment, the conditions [A] ≪ KM and [A] ≪ [E] always hold. Even in Pattanayak’s paper [1], although the total DNA concentration is 200 nM, the concentration of each individual DNA sequence is lower than 0.1 nM, which is much lower than the KM of any typical restriction enzyme.
Even so, the Michaelis-Menten (MM) equation remains valid. First, under these conditions the free enzyme concentration [E] changes little, because most of the "enzyme" is in free form and idle. Second, some time after enzyme and substrate are mixed, the concentrations of free enzyme sites and of enzyme-substrate complex reach a steady state [L].
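When the substrate concentration is far below KM, the MM rate law reduces to an apparent first-order law with rate constant kcat·[E]/KM, which is what lets each cleavage step be treated as first-order. The sketch below compares the two expressions numerically. The kcat and KM values are purely hypothetical placeholders chosen for illustration, not measured Cas9 constants, and the function names are assumptions.

```python
def mm_rate(s, e0, kcat, km):
    """Michaelis-Menten rate: v = kcat * E0 * S / (KM + S)."""
    return kcat * e0 * s / (km + s)


def first_order_rate(s, e0, kcat, km):
    """Low-substrate limit: v ~ (kcat * E0 / KM) * S, valid when S << KM."""
    return (kcat * e0 / km) * s


if __name__ == "__main__":
    e0 = 100.0   # nM, gRNA-Cas9 concentration quoted for the in vitro experiment
    kcat = 1.0   # 1/min, hypothetical
    km = 10.0    # nM, hypothetical
    for s in (0.01, 0.1, 1.0, 10.0):  # nM
        full = mm_rate(s, e0, kcat, km)
        approx = first_order_rate(s, e0, kcat, km)
        print(f"S={s:5.2f} nM  MM={full:8.4f}  first-order={approx:8.4f}  "
              f"ratio={full / approx:.3f}")
```

The ratio of the two rates is KM/(KM + S), so the approximation is within about 1% when S is one hundredth of KM and degrades as S approaches KM.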