Team:Tsinghua-A/md

Overview

The expression of interacting genes depends on the structure of the gene regulatory network (GRN). To find biological networks that function reliably when the amount of DNA template (copy number) fluctuates, it helps to run simulations in advance: they narrow the screening scope and allow a theoretical discussion of the regulatory motifs that matter for adaptation. This makes it easier to search for and verify adaptive, robust networks in the wet lab.

In our project, we analyzed gene regulatory network topologies abstractly and enumerated all possible three-node network structures. We modeled, simulated, tested and compared them, and finally screened out 2 core optimal structures from the 19683 candidates, both of which adapt well to DNA copy number. To verify the screening result, we designed a test case and simulated it.

Based on our screening results, we identified 9 key motifs that may be essential to adaptation. We simulated each motif separately and tried to explain its characteristics mathematically. Finally, we found that these key motifs can be combined to obtain better performance.

Construction of Networks and Mathematical Description -- ODE Equations

We used the Michaelis-Menten (M-M) equation and the Hill equation to describe the kinetics of gene regulatory networks. Both describe how the production rate of a product depends on the concentration of its substrate.

M-M Equations of activation (monotone-increasing) and inhibition (monotone-decreasing) are defined as:

Hill Equations of activation (monotone-increasing) and inhibition (monotone-decreasing) are defined as:

(C is a constant, K is the half-maximal effective concentration, and n is the Hill coefficient.)
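For reference, the standard textbook forms of these four rate laws are written out below. The symbols x (regulator or substrate concentration) and f (production rate) are notational assumptions, and the exact parameterization used in the project may differ.

```latex
% Michaelis-Menten: activation (increasing in x) and inhibition (decreasing in x)
f_{\mathrm{act}}(x) = C\,\frac{x}{K + x}, \qquad
f_{\mathrm{inh}}(x) = C\,\frac{K}{K + x}

% Hill: activation (increasing in x) and inhibition (decreasing in x)
f_{\mathrm{act}}(x) = C\,\frac{x^{n}}{K^{n} + x^{n}}, \qquad
f_{\mathrm{inh}}(x) = C\,\frac{K^{n}}{K^{n} + x^{n}}
```

At x = K every form equals C/2, which is why K is called the half-maximal effective concentration. With n = 1 the Hill forms reduce to the M-M forms.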

Our project covers all three-node network structures. Together with the input, each network contains 4 nodes (1 input node, 2 regulatory nodes and 1 output node) and a large number of possible regulatory edges among them.

We therefore wrote sets of ordinary differential equations (ODEs) to describe the interactions that form each network. Each ODE set uses two 4*4 matrices, one for activating effects and one for inhibitory effects, and a column vector for self-degradation.
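A minimal sketch of such an ODE set is shown below. The node order, the additive way the two matrices enter the production term, the parameter values and the explicit Euler integration are all assumptions made for illustration, not the project's actual model. The example wires only the two fixed edges (I inhibits A, A inhibits O) and holds the input constant.

```python
# Assumed node order: 0 = input I, 1 = A, 2 = B, 3 = output O.
# Assumed convention: act[i][j] = 1 means node j activates node i,
#                     inh[i][j] = 1 means node j inhibits node i.

def hill_act(x, C=1.0, K=1.0, n=2.0):
    """Activating Hill term: rises from 0 towards C as x grows."""
    return C * x**n / (K**n + x**n)

def hill_inh(x, C=1.0, K=1.0, n=2.0):
    """Inhibiting Hill term: falls from C towards 0 as x grows."""
    return C * K**n / (K**n + x**n)

def derivatives(x, act, inh, decay):
    """dx_i/dt = sum_j (act_ij * hill_act(x_j) + inh_ij * hill_inh(x_j)) - decay_i * x_i."""
    size = len(x)
    dx = []
    for i in range(size):
        production = sum(
            act[i][j] * hill_act(x[j]) + inh[i][j] * hill_inh(x[j])
            for j in range(size)
        )
        dx.append(production - decay[i] * x[i])
    return dx

def steady_state(input_level, act, inh, decay, dt=0.01, steps=5000):
    """Integrate with explicit Euler steps; the input node is held constant."""
    x = [input_level, 0.0, 0.0, 0.0]
    for _ in range(steps):
        dx = derivatives(x, act, inh, decay)
        x = [input_level] + [max(0.0, x[i] + dt * dx[i]) for i in range(1, 4)]
    return x

ACT = [[0] * 4 for _ in range(4)]   # no activating edges in this toy example
INH = [[0] * 4 for _ in range(4)]
INH[1][0] = 1                       # I inhibits A
INH[3][1] = 1                       # A inhibits O
DECAY = [0.0, 1.0, 1.0, 1.0]        # hypothetical degradation rates

low_output = steady_state(0.1, ACT, INH, DECAY)[3]
high_output = steady_state(10.0, ACT, INH, DECAY)[3]
print(low_output, high_output)
```

Because two repressions in series act like an activation, the output of this toy network is higher for the high input than for the low input.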

In the wet lab, we synthesized the networks and constructed plasmids that were transfected into HeLa cells. To make the design easier to realize experimentally, we made some adjustments:

 We assumed that the input (I) is not regulated by the network and remains constant.

 We fixed two edges (I inhibits A, A inhibits O) because few components are available to act as inducers when constructing the system, so we used two successive repressors as a substitute.

 We limited the number of regulatory nodes to at most 2 to restrict the problem to a tractable scale.

 We chose parameter values close to the actual experiments in our simulation. Following previous research, we kept most parameters, such as reaction rates, within previously reported ranges.

Basic Function Analysis -- Massive Parallel Data Processing

The ideal network in our project must be sensitive to the input signal and able to distinguish low input from high input. We therefore expected the target structure to give low output for low input and high output for high input. It should also avoid an ambiguous transition region between low and high input.

We first introduced two basic indexes, High-low Ratio and Interim Slope, to evaluate the performance of all three-node network structures. For each structure, we scanned the input from 1 to 1000 and recorded the corresponding output.
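The two indexes are not defined explicitly in the text, so the sketch below uses plausible stand-ins that are assumptions: High-low Ratio is taken as the output at the highest scanned input divided by the output at the lowest, and Interim Slope as the steepest slope of the input-output curve on log-log axes. The function `toy_response` is a hypothetical curve standing in for a simulated network.

```python
import math

def scan_inputs(lo=1.0, hi=1000.0, points=61):
    """Log-spaced input values covering the scanned range 1 to 1000."""
    step = (math.log10(hi) - math.log10(lo)) / (points - 1)
    return [10 ** (math.log10(lo) + k * step) for k in range(points)]

def high_low_ratio(outputs):
    """Assumed definition: output at the highest input over output at the lowest."""
    return outputs[-1] / outputs[0]

def interim_slope(inputs, outputs):
    """Assumed definition: steepest log-log slope between adjacent scan points."""
    slopes = [
        (math.log10(outputs[k + 1]) - math.log10(outputs[k]))
        / (math.log10(inputs[k + 1]) - math.log10(inputs[k]))
        for k in range(len(inputs) - 1)
    ]
    return max(slopes)

def toy_response(x, K=30.0, n=3.0, basal=0.01):
    """Hypothetical network output: basal level plus a Hill-type activation."""
    return basal + x**n / (K**n + x**n)

inputs = scan_inputs()
outputs = [toy_response(x) for x in inputs]
ratio = high_low_ratio(outputs)
slope = interim_slope(inputs, outputs)
print(ratio, slope)
```

Under these definitions, a network that separates low and high input sharply scores high on both indexes, while a flat or gradual response scores low on at least one.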

Then we enumerated all possible network structures and plotted the filtering result on a 2D map. The X-axis represents High-low Ratio and the Y-axis represents Interim Slope; the 19683 three-node structures are shown as dots on the gridded map.
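The total of 19683 equals 3^9. That is consistent with nine possible directed edges among three nodes (self-loops included), each of which is activating, inhibiting or absent. The sketch below enumerates topologies under that assumption; the node labels and the edge encoding are placeholders rather than the project's own representation.

```python
from itertools import product

NODES = ("A", "B", "O")        # placeholder labels for the three nodes
EDGE_STATES = (-1, 0, 1)       # assumed encoding: inhibit, absent, activate

def enumerate_topologies():
    """Yield every assignment of a state to each of the 9 directed edges."""
    edges = [(src, dst) for src in NODES for dst in NODES]  # includes self-loops
    for states in product(EDGE_STATES, repeat=len(edges)):
        yield dict(zip(edges, states))

count = sum(1 for _ in enumerate_topologies())
print(count)  # 19683
```

Each yielded dictionary can be turned into one ODE set and simulated independently, which is what makes the screening easy to parallelize.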

According to our target, only networks with both a high High-low Ratio and a high Interim Slope can achieve the required function. In other words, only networks located in the top-right corner of the map are acceptable.

Finally, we screened out 476 of the 19683 three-node network structures that achieve both a high High-low Ratio and a high Interim Slope.

The whole process involved massive computation and parallel data processing. Solving the ODEs of all 19683 networks requires serious computing power, for which a computer cluster is well suited. A computer cluster connects a group of loosely coupled computers that work together to offer more processing power and more storage space. We finished the first round of screening with the help of the High Performance Computing Cluster (HPCC) at Tsinghua University.

Adaptation to Copy Number -- Decision Based on Probability Distribution

To obtain an ideal, robust network that adapts to DNA template abundance (copy number), we first ran a parameter scan and made an intuitive comparison of all 476 selected structures. We varied the reaction rate to represent a change in copy number (assuming the two are proportional), which gives a family of input-output curves, one per copy number. We also analyzed how the output correlates with copy number, expecting the curve to saturate.

Since the target network is supposed to distinguish robustly between cells with low and high endogenous input signal, its output needs to saturate with copy number. In other words, the output should not increase without bound as copy number increases, especially when the input is low. Otherwise, at high copy number the network would mistake a low-input cell for a high-input cell.

According to our simulation, 111 of the 476 structures tend to saturate, and we selected these.

Because the number of DNA templates actually delivered into a cell is not fixed but follows a probability distribution, we assumed that the copy number follows a Poisson distribution. We then studied the distribution of the output when the input is low and when it is high. For a quantitative comparison among the 111 structures, we introduced Overlap (the overlapping part of the two output distributions) as an index, looking for the optimal network with the lowest Overlap.
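One way to compute such an Overlap index is sketched below. The two response functions, the input levels, the mean copy number and the histogram binning are hypothetical choices for illustration; only the Poisson assumption and the idea of overlapping output distributions come from the text.

```python
import math

def poisson_pmf(mean, k_max):
    """Poisson probabilities P(0..k_max), built iteratively to avoid large factorials."""
    probs = [math.exp(-mean)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * mean / k)
    return probs

def output_distribution(response, input_level, copy_probs, bin_width):
    """Histogram of the output when copy number follows the given distribution."""
    hist = {}
    for copies, p in enumerate(copy_probs):
        bin_index = math.floor(response(copies, input_level) / bin_width)
        hist[bin_index] = hist.get(bin_index, 0.0) + p
    return hist

def overlap(hist_low, hist_high):
    """Shared probability mass of the two histograms (0 = disjoint, 1 = identical)."""
    return sum(min(p, hist_high.get(b, 0.0)) for b, p in hist_low.items())

def saturating(copies, input_level):
    """Hypothetical response that levels off as copy number grows."""
    return input_level * copies / (5.0 + copies)

def unsaturated(copies, input_level):
    """Hypothetical response that grows in proportion to copy number."""
    return input_level * copies

def overlap_for(response, low_input=1.0, high_input=3.0, mean_copies=20.0, bin_width=1.0):
    probs = poisson_pmf(mean_copies, k_max=200)
    low = output_distribution(response, low_input, probs, bin_width)
    high = output_distribution(response, high_input, probs, bin_width)
    return overlap(low, high)

sat_overlap = overlap_for(saturating)
unsat_overlap = overlap_for(unsaturated)
print(sat_overlap, unsat_overlap)
```

In this toy setting the saturating response keeps the low-input and high-input output distributions almost disjoint, while the unsaturated response lets them overlap, because a low-input cell with many copies produces as much output as a high-input cell with few.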

Finally, we identified 2 optimal network structures, which contain the following core topologies.

Compared with unsaturated networks, the 2 optimal networks perform well when copy number follows a Poisson distribution.

Because the target network is expected to distinguish different cells, we designed a simulated screening test (sorter case) to verify the results above.

In this test, we supplied 1000 cells of type A with a low amount of endogenous microRNA (low input) and 1000 cells of type B with a high amount of endogenous microRNA (high input). Both populations followed a Gaussian distribution (the continuous form of the Poisson distribution). We then treated the network as a black-box sorter that only receives these cells and tries to distinguish them correctly. Finally, we counted the sorting results, compared them with the correct decisions and calculated the Accuracy Rate.
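The sorter test can be sketched as a small Monte Carlo simulation. The response functions, decision thresholds, input levels and the Gaussian parameters (applied here to copy number, with the variance set equal to the mean to mimic a Poisson distribution) are assumptions for illustration, not the values used in the project.

```python
import random

def saturating(copies, input_level):
    """Hypothetical response that levels off as copy number grows."""
    return input_level * copies / (5.0 + copies)

def unsaturated(copies, input_level):
    """Hypothetical response that grows in proportion to copy number."""
    return input_level * copies

def accuracy_rate(response, threshold, cells_per_type=1000,
                  low_input=1.0, high_input=3.0, mean_copies=20.0, seed=0):
    """Sort simulated cells by thresholding the output; return the fraction sorted correctly."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    sd = mean_copies ** 0.5            # Gaussian stand-in for a Poisson spread
    correct = 0
    for true_label, input_level in (("low", low_input), ("high", high_input)):
        for _ in range(cells_per_type):
            copies = max(0.0, rng.gauss(mean_copies, sd))   # copy number cannot be negative
            called = "high" if response(copies, input_level) > threshold else "low"
            correct += called == true_label
    return correct / (2 * cells_per_type)

# Thresholds are hand-picked for these toy responses (an assumption).
sat_accuracy = accuracy_rate(saturating, threshold=1.0)
unsat_accuracy = accuracy_rate(unsaturated, threshold=35.0)
print(sat_accuracy, unsat_accuracy)
```

The sorter sees only the output of each cell, so its accuracy depends directly on how well the low-input and high-input output distributions are separated.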



Discussion

Due to some restrictions in the wet lab, we finished only the experiment described above. We found that the number of HeLa cells with high copy number is comparatively low. We also noticed that the output of circuit C is higher than expected in Figure 1. This may cause misjudgment when the design is used to detect miR-21. We will take measures to address this problem.
In addition, we will endeavor to construct the other networks described in the modeling work.

@@ Line 237: / Line 237: @@
 <!-- section: projects -->
 <section id="parts" class="section wood">
-<div class="container" style="text-align:center;">
+<div class="container">
-	<h4>Parts</h4>
+	<h4>3.	Adaptation to Copy Number -- Decision Based on Probability Distribution</h4>
-<div class="aligncenter">
+	<div class="row">
-<table border="1"  style="text-align:center;font-size:30px;">
-<tr style="font-weight:bold;">
+			<div>
-<td>Name</td><td>description</td><td>Part Type</td><td>Designer</td>
-</tr>
+				<p >
-<tr>
+				 In order to obtain an ideal and robust network which is adaptive to DNA template abundance (copy number), we firstly did parameter scan analysis and made comparisons among all the selected 476 structures intuitively. We changed the range of reaction rate to represent the change of copy number (proportional relationship), and expected to get a cluster of input-output curves (when copy number changes). Besides, we analyzed the correlativity between copy number and output, and expected the curve to be saturated.</br></br>
-<td><a href="http://parts.igem.org/Part:BBa_K1116000">K1116000</td><td>LacI with miR-21 and miR-FF3 target</td><td>Composite</td><td>Lei Wei</td>
-</tr><tr>
-<td><a href="http://parts.igem.org/Part:BBa_K1116001">K1116001</td><td>LacI with miR-21 and miR-FF5 target</td><td>Composite</td><td>Shuguang Peng</td>
+<img src="https://static.igem.org/mediawiki/2013/4/47/Model8.jpg.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
-</tr><tr>
+<img src="https://static.igem.org/mediawiki/2013/0/0f/Model9.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
-<td><a href="http://parts.igem.org/Part:BBa_K1116002">K1116002</td><td>TRE-LacI with miR-21 and miR-FF3 target</td><td>Composite</td><td>Shuguang Peng</td>
+</br></br>
-</tr><tr>
+Since the target network structure is supposed to robustly distinguish different cells with either low or high endogenous input signal, there needs to be a saturated tendency to copy number. In other words, the output shouldn’t infinitely increase when the value of copy number increases, especially when the input is low. Otherwise, the network would wrongly regard the low-input kind of cell as the high-input kind when copy number is at high level.
-<td><a href="http://parts.igem.org/Part:BBa_K1116003">K1116003</td><td>TRE-LacI with miR-21 and miR-FF5 target</td><td>Composite</td><td>Shuguang Peng</td>
-</tr><tr>
+				  </br></br>
-<td><a href="http://parts.igem.org/Part:BBa_K1116004">K1116004</td><td>CAG-EYFP-miRNA-FF3</td><td>Composite</td><td>Shuguang Peng</td>
+According to our simulation, we selected 111 network structures that tend to be saturated out of 476 structures.
-</tr><tr>
+</br></br>
-<td><a href="http://parts.igem.org/Part:BBa_K1116005">K1116005</td><td>CAG-Cerulean-hsa-miR-21</td><td>Composite</td><td>Shuguang Peng</td>
+<img src="https://static.igem.org/mediawiki/2013/6/6a/Model10.jpg" alt="" style="margin:auto;display:block;text-align:center"/> </br></br>
-</tr>
+Considering that the actual number of DNA templates that may be transferred into cells is not certain but rather follows a certain distribution in probability, we assumed that the copy number follows Poisson distribution. Then we studied each kind of distribution output follows when inputs are low and high respectively. For a better quantized comparison among the 111 structures, we raised Overlap (the covering part of two output distribution) as index, and hoped to get the optimal network with lowest Overlap.
-</table>
+ </br></br>
-</br></br></br></br></br></br></br></br>
+<img src="https://static.igem.org/mediawiki/2013/9/93/Model11.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
+</br></br>
+Finally we figured out 2 optimal network structures, and the corresponding network structures contain following core topologies.
+</br></br>
+<img src="https://static.igem.org/mediawiki/2013/b/bb/Model14.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
+</br></br>
+Compared with unsaturated networks, the 2 optimal network performance well when copy number follows Poisson distribution.
+</br></br>
+<img src="https://static.igem.org/mediawiki/2013/0/02/Model12.jpg" alt="" style="margin:auto;display:block;text-align:center"/></br>
+<img src="https://static.igem.org/mediawiki/2013/c/c7/Model13.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
+</br></br>Because the target network is expect to be functional in distinguishing different cells, we designed a simulated screening test (sorter case) to verify our results above.</br></br>
+ In this optimal test, we offered 1000 cell A with low amount of endogenous microRNA (low input) and 1000 cell B with high amount of endogenous microRNA (high input). And both of them followed Gaussian distribution (continuous form of Poisson distribution). Then we packed the network as a black box sorter which only receives these different cells and tries to distinguish them correctly. We then counted the number of the sorting results, compared with correct decisions and calculated the Accuracy Rate.
+</br></br>
+<img src="https://static.igem.org/mediawiki/2013/5/57/Model15.jpg" alt="" style="margin:auto;display:block;text-align:center"/></br>
+				</p>
+			</div>
+	</div>
 </div>
+</section>
+<!-- end section: team -->
+<!-- section: life -->
+<section id="exp" class="section blue">
+<div class="container">
+	<h4>Basic Function Analysis -- Massive Parallel Data Processing</h4>
+	<div class="row">
+			<div>
+				<p >
+				  The ideal network in our project must be sensitive to the input signal and be able to distinguish the low input and high input from each other. So we expected that the target structure should have a low-input-low-output and high-input-high-output character. Besides, it should avoid ambiguous interim region between low input and high input.
+                  </br></br>
+				  <img src="https://static.igem.org/mediawiki/2013/b/ba/Model4.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
+				  </br></br>
+				  We firstly raised two basic indexes, High-low Ratio and Interim Slope to evaluate the performance of all three-node network structures. To each structure, we scanned the value of input ranging from 1 to 1000 and recorded the corresponding output.
+				  </br>
+				  </br>
+<img src="https://static.igem.org/mediawiki/2013/d/d6/Model5.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
+</br>
+				  </br>
+                 Then we enumerated all possible network structures and illustrated the filtering result in a 2D map. The X-axis represents High-low Ratio and the Y-axis represents Interim Slope, and there are 19683 three-node structures illustrated as docs in grids in the map.
+				  </br></br>
+According to our target, only networks with high High-low Ratio and high Interim Slope can achieve the required function. In other words, only networks located in the top-right corner of the map are acceptable.
+				  </br></br>
+			      <img src="https://static.igem.org/mediawiki/2013/2/24/Model6.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
+				  </br>
+<img src="https://static.igem.org/mediawiki/2013/4/44/Model7.jpg" alt="" style="margin:auto;display:block;text-align:center"/>
+  </br></br>
+				  Finally we screened 476 three-node network structures out of 19683 network structures in total, which can achieve the high High-low Ratio and high Interim Slope.  </br></br>
+The whole process involved massive calculation and parallel data processing. Especially, it requires serious computing power to solve all the ODE equations of 19683 networks and computer cluster is a powerful tool. Computer cluster connects a group of incompact computers which collaborate together to offer stronger processor power and larger space. We finished the first round of screening with the help of High Performance Computing Cluster (HPCC) in Tsinghua University.   </br></br>
+				</p>
+			</div>
 </div>
 </section>

From 2013.igem.org

Revision as of 19:56, 27 September 2013
