From 2013.igem.org

(Difference between revisions)

Revision as of 10:24, 27 September 2013


Take a gNAP before wearing your gloves! Genetic Network Analyze and Predict

The sketch and final GUI of gNAP!

We compare the results of our software with gene expression profiles from the literature.

We are USTC-Software!

Methodologies

To simulate how the GRN works and to analyze how it changes after an exogenous gene is imported, the software employs several advanced algorithms and classical methods, including the Binary Tree method, the Needleman-Wunsch algorithm, the Decision Tree method, the Hill equation and the PSO algorithm.
The methodology has five parts: Fetch Database, Alignment Analyze, New Network Construction, Network Model and Predict.

Fetch Database

Fetch Database Abstract

Fetch Regulation

Fetch Gene Info

Fetch Promoter Info

Integration

Our software integrates all the information we extracted about genes and generates a file named “all_info” (all information about genes) for the output graphical interface to read. Meanwhile, the array of objects containing all this information is kept in memory, which greatly improves the computing speed of our software.

The format of all_info database:
No. promoter_sequence gene_sequence gene_name ID left_position right_position promoter_name description
The fetching module generates three files: old_GRN, all_info and uncertain_database.

Operon Theory and Regulatory Model

Operon Theory

In genetics, an operon is a functioning unit of genomic DNA containing a cluster of genes under the control of a single regulatory signal or promoter.
The genes contained in the operon are either expressed together or not at all.
Several genes must be both cotranscribed and co-regulated to define an operon.

The term “operon” was first proposed in 1960, in a paper in the Proceedings of the French Academy of Sciences. The lac operon of the model bacterium E. coli provides a typical example of operon function. It consists of a promoter, an operator, three structural genes and a terminator. The operon is regulated by several factors including the availability of glucose and lactose.

From this paper, the so-called general theory of the operon was developed. According to the theory, all genes are controlled by means of operons through a single feedback regulatory mechanism: repression. The first operon to be described was the lac operon in E. coli. The 1965 Nobel Prize in Physiology or Medicine was awarded to François Jacob, André Michel Lwoff and Jacques Lucien Monod for their discoveries concerning the operon and virus synthesis.

Figure 1. Structure of Operon

An operon is made up of several structural genes arranged under a common promoter and regulated by a common operator. It is defined as a set of adjacent structural genes, plus the adjacent regulatory signals that affect transcription of the structural genes. The regulators of a given operon, including repressors, corepressors and activators, are not necessarily coded for by that operon.

As a unit of transcription, upstream of the structural genes lies a promoter sequence which provides a site for RNA polymerase to bind and initiate transcription. Close to the promoter lies a section of DNA called an operator.

Operon regulation can be either negative or positive by induction or repression. Negative control involves the binding of a repressor to the operator to prevent transcription. Operons can also be positively controlled. An activator protein binds to DNA, usually at a site other than the operator, to stimulate transcription.

Figure 2. Regulation of Operon. 1: RNA Polymerase, 2: Repressor, 3: Promoter, 4: Operator, 5: Lactose, 6: lacZ, 7: lacY, 8: lacA. Top: The gene is essentially turned off. There is no lactose to inhibit the repressor, so the repressor binds to the operator, which obstructs the RNA polymerase from binding to the promoter and making lactase. Bottom: The gene is turned on. Lactose is inhibiting the repressor, allowing the RNA polymerase to bind with the promoter, and express the genes, which synthesize lactase. Eventually, the lactase will digest all of the lactose, until there is none to bind to the repressor. The repressor will then bind to the operator, stopping the manufacture of lactase.

Regulatory Model

Similarity and homology
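
The page source for this section defines sequence similarity as the proportion of the common subsequence in the aligned sequence, with the alignment obtained by the Needleman-Wunsch global algorithm. The following is a minimal sketch of that idea, not the gNAP implementation. The scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration; the team reports using a DNA substitution matrix and BLOSUM_50 instead.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (aligned_a, aligned_b, score).

    The scoring scheme is a simple illustrative one, not a substitution matrix.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]; each cell is solved once.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


def similarity(a, b):
    """Fraction of alignment columns in which both sequences carry the same symbol."""
    aligned_a, aligned_b, _ = needleman_wunsch(a, b)
    if not aligned_a:
        return 0.0
    identical = sum(1 for p, q in zip(aligned_a, aligned_b) if p == q)
    return identical / len(aligned_a)
```

Because the alignment is global, the similarity is computed over the full length of both segments, which matches the choice to treat each sequence segment as a unit.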

New Network Construction

Random Noise

Filter
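
The page source for this step sets the cut-off as Threshold = μ + xσ, where μ and σ are the mean and standard deviation of the similarities between the subject and random sequences of the same length as the query. A minimal sketch of that computation follows. The default `x=2.0`, the DNA alphabet, the use of the population standard deviation and the stand-in `identity_fraction` function are all assumptions made for illustration; the source states only that x is determined by machine learning and that about 100 random sequences are normally sufficient.

```python
import random
import statistics


def identity_fraction(a, b):
    """Stand-in similarity: fraction of identical positions (not an alignment)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(1 for p, q in zip(a, b) if p == q) / n


def random_similarity_threshold(query, subject, similarity, x=2.0,
                                n_random=100, alphabet="ACGT", seed=0):
    """Return mu + x * sigma over similarities of random sequences to the subject."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_random):
        # Random sequences are fixed at the same length as the query.
        rand_seq = "".join(rng.choice(alphabet) for _ in range(len(query)))
        scores.append(similarity(rand_seq, subject))
    mu = statistics.mean(scores)
    sigma = statistics.pstdev(scores)  # assumption: population standard deviation
    return mu + x * sigma


def passes_filter(query, subject, similarity, **kwargs):
    """Keep the original similarity only if it is not lower than the threshold."""
    original = similarity(query, subject)
    return original >= random_similarity_threshold(query, subject, similarity, **kwargs)
```

Passing the similarity function as an argument keeps the filter independent of the alignment method, so an alignment-based similarity can be substituted for the stand-in.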

Construct new GRN

Consider a three-unit network whose units interact with each other as shown in the figure. The regulation is described by the GRN matrix.

Figure 5. Example network and its GRN matrix.
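
As a rough illustration of the GRN matrix, the sketch below follows the convention given in the page source: 1 means “enhance”, -1 means “repress”, 0 means “no regulatory relationship”, columns are regulators and rows are targets. The unit names and matrix values are hypothetical and are not the ones shown in Figure 5.

```python
# Hypothetical three-unit network; the values are illustrative only.
UNITS = ["A", "B", "C"]

# GRN[i][j] is the effect of regulator UNITS[j] (column) on target UNITS[i] (row).
GRN = [
    [0, -1, 0],   # A is repressed by B
    [1, 0, 0],    # B is enhanced by A
    [1, 0, -1],   # C is enhanced by A and represses itself
]


def regulators_of(grn, units, target):
    """Non-zero entries of the target's row: {regulator: effect}."""
    row = grn[units.index(target)]
    return {units[j]: value for j, value in enumerate(row) if value != 0}


def targets_of(grn, units, regulator):
    """Non-zero entries of the regulator's column: {target: effect}."""
    j = units.index(regulator)
    return {units[i]: row[j] for i, row in enumerate(grn) if row[j] != 0}
```

Reading a row gives everything known about a unit as a target, and reading a column gives everything known about it as a regulator.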

If D is the exogenous unit, we can obtain three similarity data sets of D with the units in the original GRN:

Promoter sequence similarity

Gene sequence similarity

Amino acid sequence similarity.

The construction is equivalent to adding a new column and a new row to the original matrix.

Figure 6. Mathematical Equivalence

When filling the column, D is compared with the regulators of the unit in each row. The regulations in the row are considered separately and marked as “positive group” and “negative group”. The average similarity of each group represents the distance between the exogenous unit and the group. D is assumed to take the regulatory direction (positive or negative) of the group with the larger average similarity. The regulatory intensity is the weighted average regulation of the chosen group. The weight here is the amino acid sequence similarity.
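
The column-filling rule can be sketched as follows. This is an illustrative reading of the paragraph above, not the gNAP source code. The function name, the argument layout, the tie-break in favour of the positive group and the value 0.0 for a target with no known regulators are assumptions.

```python
def fill_new_column(grn, aa_similarity):
    """Estimate how the exogenous unit D regulates each existing unit.

    grn[i][j]        -- effect of regulator j on target i (the sign is the direction).
    aa_similarity[j] -- amino acid sequence similarity between D and unit j.
    Returns one value per row: the estimated regulation of that target by D.
    """
    column = []
    for row in grn:
        # Split the known regulators of this target into the two groups.
        positive = [(aa_similarity[j], v) for j, v in enumerate(row) if v > 0]
        negative = [(aa_similarity[j], v) for j, v in enumerate(row) if v < 0]

        # Choose the group with the larger average similarity to D.
        best, best_avg = None, -1.0
        for group in (positive, negative):
            if not group:
                continue
            avg = sum(s for s, _ in group) / len(group)
            if avg > best_avg:  # assumption: on a tie the positive group wins
                best, best_avg = group, avg

        total_weight = sum(s for s, _ in best) if best else 0.0
        if total_weight == 0.0:
            # Assumption: no regulators, or zero similarity, means no regulation.
            column.append(0.0)
            continue
        # Similarity-weighted average of the chosen group's regulation values.
        column.append(sum(s * v for s, v in best) / total_weight)
    return column
```

For example, with a target row `[0.5, 1.0, -1.0]` and similarities `[0.75, 0.25, 0.25]`, the positive group has the larger average similarity, so the estimate is the weighted average of 0.5 and 1.0.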

There are two conditions when filling the new row:
1. There are units having the same promoter as the exogenous unit.
2. There are no units having the same promoter as the exogenous unit.

In condition 1, the units sharing the same promoter with the new member are picked out, and the following steps are the same as in the construction of the column. The difference is that the similarity used here is the gene sequence similarity. As explained in the regulation model part, the promoter is the main regulatory region, but the sequence that follows it is also considered. Since the promoter is the same, we focus on the gene sequences.

In condition 2, the process is almost the same as constructing the new column. Promoter similarity is used because the promoter is the main regulatory region.

Figure 7. Construct New GRN
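
The two conditions for the new row can be sketched in the same style. This is one possible interpretation under stated assumptions: the function names and arguments are hypothetical, ties go to the positive group, a regulator with no relevant targets gets 0.0, and the self-regulation entry of D is left out because the text does not describe it.

```python
def _group_estimate(positive, negative):
    """Pick the group with the larger mean similarity; return its weighted average."""
    def mean_similarity(group):
        return sum(s for s, _ in group) / len(group) if group else -1.0

    # Assumption: on a tie the positive group is chosen.
    chosen = positive if mean_similarity(positive) >= mean_similarity(negative) else negative
    total = sum(s for s, _ in chosen)
    if total == 0:
        return 0.0  # assumption: nothing to compare with means no regulation
    return sum(s * v for s, v in chosen) / total


def fill_new_row(grn, same_promoter, gene_similarity, promoter_similarity):
    """Estimate how each existing unit regulates the exogenous unit D.

    grn[i][j]              -- effect of regulator j on target i.
    same_promoter[i]       -- True if unit i has the same promoter as D.
    gene_similarity[i]     -- gene sequence similarity between D and unit i.
    promoter_similarity[i] -- promoter sequence similarity between D and unit i.
    """
    if any(same_promoter):
        # Condition 1: compare only with units sharing D's promoter, by gene sequence.
        candidates = [i for i, same in enumerate(same_promoter) if same]
        weight = gene_similarity
    else:
        # Condition 2: compare with every unit, by promoter sequence.
        candidates = list(range(len(grn)))
        weight = promoter_similarity

    n_regulators = len(grn[0]) if grn else 0
    new_row = []
    for j in range(n_regulators):
        # How regulator j acts on the candidate targets, grouped by direction.
        positive = [(weight[i], grn[i][j]) for i in candidates if grn[i][j] > 0]
        negative = [(weight[i], grn[i][j]) for i in candidates if grn[i][j] < 0]
        new_row.append(_group_estimate(positive, negative))
    return new_row
```

The only difference between the two conditions is which targets are compared with D and which similarity vector supplies the weights.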

Network Model

Network Model Abstract

Network analysis includes finding the stable condition of the network, adding a new gene, finding the new stable condition, and determining the changes from the original condition to the new one. We use the densities of materials to describe the network condition. If all material densities are time-invariant, we say the network condition is stable.

Hill Equations

Find Stable Network Condition

Find Changes From Original Stable Condition to New Condition

Predict

Predict Abstract

In some cases, an exogenous gene is imported to enhance or suppress the expression of specific genes in the engineered bacterium itself, but it is hard to choose an appropriate regulatory gene. Our software analyzes the GRN forward and also works backward with an optimization algorithm to give users a reference for this choice. It considers not only direct regulation but also the global GRN. Global prediction also makes it possible to control the expression of multiple genes in the network at the same time. The Particle Swarm Optimization (PSO) algorithm makes this possible.

Input Target

Particle Swarm Optimization

Filter

Database

TF-TF

This file contains the regulatory relationships between transcription factors.

TF-Gene

Gene Info

Promoter Info

TU Info

@@ Line 135: / Line 135: @@
-<div id=" ">
+<div id="Alignment_Analyze">
 <h2>Operon Theory and Regulatory Model</h2>
    <div id="jobs_container">
 	         <div class="jobs_trigger"><strong>Operon Theory</strong></div>
-		 		<div class="jobs_item" style="display: none;"><p class="bodytext"></p><p align="justify">In genetics, an operon is a functioning unit of genomic DNA containing a cluster of genes
+		 		<div class="jobs_item" style="display: none;"><p class="bodytext"></p>
-under the control of a single regulatory signal or promoter. The genes contained in the
+                <p align="justify">In genetics, an operon is a functioning unit of genomic DNA containing a cluster of genes
-operon are either expressed together or not at all. Several genes must be both cotranscribed
+under the control of a single regulatory signal or promoter.<br /> The genes contained in the
-and co-regulated to define an operon.<br/>
+operon are either expressed together or not at all.<br /> Several genes must be both cotranscribed
+and co-regulated to define an operon.<br /><br />
 The term “operon” was first proposed in 1960, in a paper in the Proceedings of the French Academy of Sciences.
 The lac operon of the model bacterium E. coli provides a typical
 example of operon function. It consists of a promoter, an operator, three structural genes and
 a terminator. The operon is regulated by several factors including the availability of glucose
-and lactose.<br/>
+and lactose.<br /><br />
 From this paper, the so-called general theory of the operon was developed. According to
 the theory, all genes are controlled by means of operons through a single feedback
 regulatory mechanism: repression. The first operon to be described was the lac operon in
 E. coli. The 1965 Nobel Prize in Physiology or Medicine was awarded to François Jacob,
-André Michel Lwoff and Jacques Lucien Monod for their discoveries concerning the operon and virus synthesis.<br/>
+André Michel Lwoff and Jacques Lucien Monod for their discoveries concerning the operon and virus synthesis.<br />
-<img src="https://static.igem.org/mediawiki/2013/8/8d/USTC_Software_Operon_Theory.png"/>
+               </p>
-<p>Figure 1. Structure of Operon</p>
+<div align="center"><img src="../../method/Figure 1.png" />
+<p align="center"><strong>Figure 1.</strong> Structure of Operon</p></div>
+<p align="justify">An operon is made up of several structural genes arranged under a common promoter and
+regulated by a common operator. It is defined as a set of adjacent structural genes, plus
+the adjacent regulatory signals that affect transcription of the structural genes. The
+regulators of a given operon, including repressors, corepressors and activators, are not
+necessarily coded for by that operon.<br /><br />
+As a unit of transcription, upstream of the structural genes lies a promoter sequence which
+provides a site for RNA polymerase to bind and initiate transcription. Close to the promoter
+lies a section of DNA called an operator.<br /><br />
+Operon regulation can be either negative or positive by induction or repression. Negative
+control involves the binding of a repressor to the operator to prevent transcription.
+Operons can also be positively controlled. An activator protein binds to DNA, usually at a
+site other than the operator, to stimulate transcription.
 </p>
-                </div>
+<div align="center"><img style="width:600px;" src="../../method/Figure 2.png"/>
+<p align="justify"><strong>Figure 2.</strong> Regulation of Operon
-				<div class="jobs_trigger"><strong>Models</strong></div>
+1: RNA Polymerase, 2: Repressor, 3: Promoter, 4: Operator, 5: Lactose, 6: lacZ, 7:
-				<div class="jobs_item" style="display: none;"><p align="justify">
+lacY, 8: lacA. Top: The gene is essentially turned off. There is no lactose to inhibit the
-Regulatory Model</br>
+repressor, so the repressor binds to the operator, which obstructs the RNA polymerase
-Regulation of gene expression includes 4 levels: </br>
+from binding to the promoter and making lactase. Bottom: The gene is turned on. Lactose
-•Level of DNA rearrangement.</br>
+is inhibiting the repressor, allowing the RNA polymerase to bind with the promoter, and
-•Level of transcriptional regulation.</br>
+express the genes, which synthesize lactase. Eventually, the lactase will digest all of the
-•Level of translation.</br>
+lactose, until there is none to bind to the repressor. The repressor will then bind to the
-•Level of post-translation</br>
+operator, stopping the manufacture of lactase.</p></div>
-This year we focus on the level of transcriptional regulation both for the importance of the level and model simplification. By carefully examining the lac operon system, which is widely considered as the first discovery of the gene regulation system, we constructed our regulation model with functional units called “Regulation Unit” [FXIME: regulation or regulatory?]. A regulation unit consists of two segments. The first one is a promoter sequence and the second one is a protein coding sequence.</br>
-[Pic. 3 Promoter Sequence, Protein Coding Sequence and Regulation Unit]</br>
-In our model, the promoter segment is regarded as the main regulated region. A transcription factor is a protein that binds to specific DNA sequences, thereby controlling the flow of genetic information from DNA to mRNA. The binding sites are promoter regions of DNA adjacent to the genes that they regulate. At first, according to the lac operon system, regulation units as a regulatory target with the same promoter are supposed to have same behavior. But we found it is insufficient because there are units with the same promoter showing different properties. Then we took the genes regulated by transcription factors into consideration. The different properties of two units are first owing to their promoters. If they have the same promoter, their protein coding sequences are supposed to make the difference. By taking this method, it turns out that this model works better.</br>
+      </div>
-[Pic. 4 Regulated Region]</br>
+				<div class="jobs_trigger"><strong>Regulatory Model</strong></div>
+				<div class="jobs_item" style="display: none;"><p align="justify">Regulation of gene expression includes four levels. We choose the transcriptional level to simulate the regulation both for its significance and model simplification.</p>
-A unit in the network regulates another through the transcription factor. That is, the product of the protein coding sequence of the unit is a transcription factor and the transcription factor regulates the promoter of the another unit.
+                <div align="center"><img style="width:600px; height:auto;"src="../../method/Figure 3.png" />
-</p>
+                <p><strong>Figure 3.</strong>Regulation of gene expression.<br />Our regulation model is built based on the operon theory.<br /> The promoter region is regarded as the main regulatory region.</p></div>
-                </div>
+      </div>
-              <div class="jobs_trigger"> <strong>Prediction Model</strong></div>
+              <div class="jobs_trigger"> <strong>Similarity and homology</strong></div>
-		        <div class="jobs_item" style="display: none;"><p align="justify">The basic idea behind the prediction model is deceptively simple: the more similar two sequences are, the more likely they have similar behaviors. In fact, it is extremely difficult to predict an exogenous gene’s behavior because of the complexity of the problem, random noise of the system and the coupling of biosystems. </br>
+		        <div class="jobs_item" style="display: none;"><p align="justify">The sequence similarity is obtained by sequence alignment. It is defined as the proportion of the common subsequence in the aligned sequence. Any two sequences share a certain
+similarity. It should be noted that similarity and homology are two different concepts.<br /><br />
+As with anatomical structures, homology between protein or DNA sequences is defined in
+terms of shared ancestry. Two segments of DNA can have shared ancestry because of
+either a speciation event or a duplication event. The terms “percent homology” and
+“sequence similarity” are often used interchangeably. As with anatomical structures, high
+sequence similarity might occur because of convergent evolution, or, as with shorter
+sequences, because of chance. Such sequences are similar but not homologous.
+Sequence regions that are homologous are also called conserved.<br /><br />
+In our project, we use similarity to connect the exogenous gene with the original network,
+because there is a good chance that the exogenous gene is not homologous with the
+genes in the network.</p>
+      </div>
+		        <div class="jobs_item" style="display: none;"><p align="justify">The GRN matrix is the mathematical description of gene regulatory network in which “1” represents “enhance”, “-1” represents “repress” and “0” represents “no regulatory relationship”. The units(RU) in x-axis regulate the units in y-axis. A row can be seen as a vector containing all the information of the target(corresponding unit in the y-axis). Similarly, a column can be seen as a vector containing all the information of the regulator(corresponding unit in the x-axis).</p>
+                </div>
+		        <div class="jobs_item" style="display: none;"><p align="justify">The sequence similarity is obtained by sequence alignment based on Needleman-Wunsch algorithm[FIXME: wiki link here]. The Needleman-Wunsch algorithm performs a global alignment on two protein sequences or nucleotide sequences. It was the first application of dynamic programming to biological sequence comparison.
-Advanced alignment algorithm is selected to reduce the complexity. Sequences which contain all the information of the species are the entity of the gene regulatory network. Sequence similarity is an essential concept to the prediction model. The selected alignment algorithm can significantly reduce the complexity of the problem and makes it possible to give a reliable prediction from a global point of view.</br>
+When dynamic programming is applicable, the method takes far less time than naive methods. Using a naive method, many of the subproblems are generated and solved many times. The dynamic programming approach seeks to solve each subproblem only once. Once the solution to a given subproblem has been computed, it is stored to be looked up next time.
-We designed a random method to filter the noise in sequence alignment. There are no totally different sequences. Even the similarity of any two random sequences is not zero. Filtered results are more significant and reliable to the following steps.
+[Pic. 5 Dynamic programming and naive method]
-</br>
-Coupling of biosystem is also simulated at some level. When predicting exogenous gene’s behavior, all the units in the original gene regulatory network are taken into consideration. </br>
-Given that the exogenous gene may have never been inserted into E. coli before, all possible reactions in gene regulatory network are reserved to be filtered. </br>
-Using the innovated methods above, we are trying to challenge the difficulties and obtain a global perspective of the relationship between the exogenous gene and the original gene regulatory network.
+Smith-Waterman, a variation of the Needleman-Wunsch algorithm, is also a dynamic programming algorithm, but it performs local rather than global sequence alignment. The well-known BLAST (Basic Local Alignment Search Tool) is a faster heuristic developed from the Smith-Waterman approach. Although the local algorithm is guaranteed to find the optimal local alignment, we chose the global one because we regard each sequence segment as a unit.
-</p>          </div>
-             <div class="jobs_trigger"> <strong>Mathematical Description of The Network</strong></div>
-		        <div class="jobs_item" style="display: none;"><p align="justify">The GRN matrix is the mathematical description of gene regulatory network in which “1” represents “enhance”, “-1” represents “repress” and “0” represents “no regulatory relationship”. The units(RU) in x-axis regulate the units in y-axis. A row can be seen as a vector containing all the information of the target(corresponding unit in the y-axis). Similarly, a column can be seen as a vector containing all the information of the regulator(corresponding unit in the x-axis).
-</p>          </div>
-             <div class="jobs_trigger"> <strong>Sequence similarity</strong></div>
-		        <div class="jobs_item" style="display: none;"><p align="justify">The sequence similarity is obtained by sequence alignment based on <a id="content" href="http://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm">Needleman-Wunsch algorithm</a>. The Needleman-Wunsch algorithm performs a global alignment on two protein sequences or nucleotide sequences. It was the first application of dynamic programming to biological sequence comparison.
-</br>
-When dynamic programming is applicable, the method takes far less time than naive methods. Using a naive method, many of the subproblems are generated and sovled many times. The dynamic programming approach seeks to solve each subproblem only once. Once the solution to a given subproblem has been computed, it is stored to be looked up next time.</br>
-[Pic. 5 Dynamic programming and naive method]</br>
+Sequences are aligned with different methods in different situations. On the regulated side, what we care about is the DNA sequence. On the regulating side, it is the amino acid sequence. To predict the regulated behavior, we use a DNA substitution matrix to align promoter and protein coding sequences. To predict the regulating behavior, the substitution matrix BLOSUM_50 is used to align the amino acid sequences translated from the protein coding sequences.
-Like the Needleman-Wunsch algorithm, of which it is a variation, Smith-Waterman is also a dynamic programming algorithm. But it is a local sequence alignment algorithm. The famous BLAST(Basic Local Alignment Search Tool) is improved from Smith-Waterman algorithm. Although local algorithm has the desirable property that it is guaranteed to find the optimal local alignment, we decided to choose the global one because we regarded the segment sequence as a unit.</br>
-Sequences are aligned with different detailed methods in different situations. In the regulated side, what we care about is the DNA sequence. In the regulating side, it is the amino acid sequence. When it comes to predict the regulated behavior, we use a DNA substitution matrix to align promoter and protein coding sequences. In the prediction of regulating behavior, the substitution matrix BLOSUM_50 is used to align the amino acid sequences translated from protein coding sequences.</br>
 The promoter similarities of the query unit and subject units are stored in a vector. The protein coding similarities are stored in another vector. These vectors are prepared to be used in the new network construction.
 </p>
-           </div>
+         </div>
      </div><!--jobs container-->
@@ Line 223: / Line 234: @@
    <div id="jobs_container">
-	         <div class="jobs_trigger"><strong>Filter</strong></div>
+	         <div class="jobs_trigger"><strong>Random Noise</strong></div>
-		 		<div class="jobs_item" style="display: none;"><p class="bodytext"></p><p align="justify">Once the similarity vectors are calculated, the next step is to filter them. As explained in the previous part, there is random noise in sequence alignment. In order to filter these meaningless values, a certain amount of random sequences are generated for each query-subject alignment. Normally, 100 is sufficient. Because the sequence length will influence alignment result, random sequences are fixed at the same length as the query one. Then align these random sequences with the subject sequence. The statistic result of these random similarities will be used as a threshold. If the original similarity is lower than the threshold, it is abandoned. In this case, the original value is usually short of statistical significance.
+		 		<div class="jobs_item" style="display: none;"><p class="bodytext"></p><p align="justify">Normally, the similarity of two sequences will not be zero. Some computational
-</p>
+experiments were carried out to study the random sequence similarities. We randomly
+chose a gene in the network and generated 1000 random sequences. The alignment result
+indicates that the random sequence similarities follow a Gaussian distribution. The result suggests
+that some similarities are not statistically significant.</p>
+<div align="center">
+<img src="../../method/Figure 4.png" />
+<p><strong>Figure 4.</strong> Random similarity distribution</p></div>
                  </div>
-				<div class="jobs_trigger"><strong>Construct A New Regulated Vector</strong></div>
+				<div class="jobs_trigger"><strong>Filter</strong></div>
-				<div class="jobs_item" style="display: none;"><p align="justify">
+				<div class="jobs_item" style="display: none;"><p align="justify">We need the genes highly similar to the exogenous one to interact with it. The program will
-If there are units in the original network having the same promoter as the exogenous one, the first step is to pick them out. Positive and negative regulations of these units are counted separately and distinguished into “positive group” and “negative group”. Then compare the exogenous one with these units. The similarities have already been calculated and stored in the corresponding positions in the similarity vector. The similarity mentioned here is the similarity of protein coding sequences as explained in the model part. The next step is to calculate the average similarity of each group. The exogenous unit is supposed to have the larger one’s direction(positive or negative). The weighted average regulation value of the chosen group whose weight is the sequence similarity is the new element’s value. It means regulatory intensity.</br>
+align the exogenous gene(query) with genes in the network(subject) and get the original
+similarities. In order to filter meaningless low values, a certain amount of random
-If there is no unit having the same promoter as the exogenous one, given that the promoter is the main regulatory region, the promoter similarity is used as the weight. And the weighted average of the regulation of the whole column is the new element’s value
+sequences are generated for each query-subject alignment. Normally, 100 is sufficient.
-</p>
+Because the sequence length will influence alignment result, random sequences are fixed
-                </div>
+at the same length as the query one. The random sequences are then aligned with the subject
+sequence. A statistic of these random similarities is used as a threshold.<br />
+<div align="center">Threshold = μ + xσ</div><br />
+In the formula, μ is the average random similarity. σ is the standard deviation. x is used to
-             <div class="jobs_trigger"> <strong>Construct A New Regulating Vector</strong></div>
+control the filter and is determined by machine learning. If the original similarity is lower than the
-		        <div class="jobs_item" style="display: none;"><p align="justify">The construction of the new regulating vector is achieved in a way similar to the one described above. By calculated the weighted average regulation of a row, the program gives the regulatory intensity that the exogenous unit regulate the corresponding unit in the network.
+threshold, it is discarded. This usually means that the original value falls short of
-</p>          </div>
+statistical significance.<br /><br />
+An example of filtering and consistency is presented in “Example”.
+</p>
+                </div>
-				<div class="jobs_trigger"> <strong>A Supplementary Game: Test of The Model</strong></div>
+				<div class="jobs_trigger"> <strong>Construct new GRN</strong></div>
-				<div class="jobs_item" style="display: block;"><p align="justify">
+				<div class="jobs_item" style="display: block;"><p align="justify">Consider a three-unit network whose units interact with each other as shown in the figure.
-The behavior similarity of two units can be described by the dot product of two regulated vectors or two regulating vectors. A more intuitive way is using the vectorial angle to measured the similarity of two behaviors. But there are some zero vectors in the gene regulatory network which usually means the units either play the role of target or the regulator.</br>
+The regulation is described by the GRN matrix.</p>
+<div align="center"><img src="../../method/3.png" />
+<p style="font-size:18px;"><strong>Figure 5.</strong> Example network and its GRN matrix.</p></div>
-[Pic. 4 GRN matrix, target vector, regulator vector and their dot product]</br>
-We have tested the hypothesis by analyzing all 1748 regulation units of Escherichia coli, K-12, recorded in <a id="content" href="http://regulondb.ccg.unam.mx/index.jsp">RegulonDB</a>. By pairwise comparison of all these units, about 1.6 million sets of data was obtained. Each set of data consists of promoter sequence similarity, protein coding sequence similarity and behavior similarity. We hope to find some structure in the data that supports our hypothesis. And it is lucky enough to find there is a tendency showing the relationship between sequence similarity and behavior similarity(Pic. 2).
+<p style="font-size:20px;">If D is the exogenous unit, we can obtain three similarity data sets of D with the units in the
-</br>
+original GRN:
-[Pic. 2 Sequence similarity and behavior similarity]</br>
+<li style="margin-left:40px;">Promoter sequence similarity</li>
+<li style="margin-left:40px;">Gene sequence similarity</li>
-Sequence similarity is set as x axis and behavior similarity is set as y axis. Obviously sequence similarity is continuous-valued (from 0 to 1) and behavior similarity is discrete-valued. Values of behavior similarity determined by the dimension(N) of the vector are between -N and N. According to the result, promoter sequence similarity mainly distributes from 0.4 to 0.6, protein coding sequence similarity mainly distributes from 0 to 0.7 and behavior similarity mainly distributes from -3 to 5. As it is shown in Picture 4, high behavior similarity is partial to high sequence similarity. Peak value of behavior similarity, 17, appears where sequence similarity is 0.537. When behavior similarity value is fixed, for example, set behavior similarity as 8, it is obvious that the higher the sequence similarity is, the more intensive the dots are.
+<li style="margin-left:40px;">Amino acid sequence similarity.</li>
-</p>
+<p>
+The construction is equivalent to adding a new column and a new row to the original matrix.</p>
+<div align="center"><img src="../../method/4.png" />
+<p><strong>Figure 6.</strong> Mathematical Equivalence</p></div>
+<p>When filling the column, D is compared with the regulators of the unit in each row. The
+regulations in the row are considered separately and marked as “positive group” and
+“negative group”. The average similarity of each group represents the distance between
+the exogenous unit and the group. D is assumed to take the regulatory direction (positive
+or negative) of the group with the larger average similarity. The regulatory intensity is the weighted average regulation of
+the chosen group. The weight here is the amino acid sequence similarity.<br /><br />
+There are two conditions when filling the new row:<br />
+1. There are units having the same promoter as the exogenous unit.<br />
+2. There are no units having the same promoter as the exogenous unit.<br /><br />
+In condition 1, the units sharing the same promoter with the new member are picked out,
+and the following steps are the same as in the construction of the column. The difference is
+that the similarity used here is the gene sequence similarity. As explained in the regulation
+model part, the promoter is the main regulatory region, but the sequence that follows it is also
+considered. Since the promoter is the same, we focus on the gene sequences.<br /><br />
+In condition 2, the process is almost the same as constructing the new column. Promoter
+similarity is used because the promoter is the main regulatory region.</p>
+<div align="center">
+<img src="../../method/5.png" />
+<p><strong>Figure 7.</strong> Construct New GRN</p></div>
-           </div>
+      </div>
    </div><!--jobs container-->
 </div>

Team:USTC-Software/Project/Method