Team:Peking/Project/SensorMining

From 2013.igem.org

(Difference between revisions)
 
(164 intermediate revisions not shown)
Line 139: Line 139:
.Navbar_Item > ul > li {float:left; list-style:none; text-align:center; background-color:transparent; position:relative; top:10px; padding:0 10px;}
.Navbar_Item > ul > li {float:left; list-style:none; text-align:center; background-color:transparent; position:relative; top:10px; padding:0 10px;}
.Navbar_Item:hover{ border-bottom:1px solid #D00000; color:#D00000; background-color:#fafaf8;}
.Navbar_Item:hover{ border-bottom:1px solid #D00000; color:#D00000; background-color:#fafaf8;}
-
.Navbar_Item:hover > ul{ display:block;}
+
.Navbar_Item:hover > ul{zoom:1; display:block;}
.Navbar_Item:hover >a {color:#D00000; background-color:#fafaf8;}
.Navbar_Item:hover >a {color:#D00000; background-color:#fafaf8;}
.Navbar_Item > ul > li:hover {border-bottom:1px solid #D00000; color:#D00000; background-color:#fafaf8;}
.Navbar_Item > ul > li:hover {border-bottom:1px solid #D00000; color:#D00000; background-color:#fafaf8;}
.Navbar_Item > ul > li:hover >a {color:#D00000}
.Navbar_Item > ul > li:hover >a {color:#D00000}
 +
.BackgroundofSublist{position:absolute; left:-1000px; width:2000px; height:80px; background-color:#ffffff; opacity:0;}
#Home_Sublist{position:relative; top:0px; left:-50px;}
#Home_Sublist{position:relative; top:0px; left:-50px;}
-
#Team_Sublist{position:relative; top:0px; left:-90px;}
+
#Team_Sublist{position:relative; top:0px; left:-110px;}
-
#Project_Sublist{position:relative; top:0px; left:-200px;}
+
#Project_Sublist{position:relative; top:0px; left:-180px;}
#Model_Sublist{position:relative; top:0px; left:-140px;}
#Model_Sublist{position:relative; top:0px; left:-140px;}
#DataPage_Sublist{position:relative; top:0px; left:-60px;}
#DataPage_Sublist{position:relative; top:0px; left:-60px;}
#Safety_Sublist{position:relative; top:0px; left:-180px;}
#Safety_Sublist{position:relative; top:0px; left:-180px;}
-
#HumanPractice_Sublist{position:relative; top:0px; left:-320px;}
+
#HumanPractice_Sublist{position:relative; top:0px; left:-470px;}
#iGEM_logo{position:absolute; top:30px; left:1090px; height:80px;}
#iGEM_logo{position:absolute; top:30px; left:1090px; height:80px;}
Line 157: Line 158:
/*Major body*/
/*Major body*/
-
#MajorBody{position:absolute; top:24px; left:0px; width:1200px; height:4300px; background-color:#ffffff; font-size:18px; font-family: calibri, arial, helvetica, sans-serif; }
+
#MajorBody{position:absolute; top:24px; left:0px; width:1200px; height:3700px; background-color:#ffffff; font-size:18px; font-family: calibri, arial, helvetica, sans-serif; }
#LeftNavigation{position:fixed; top:130px; float:left; width:200px; height:100%; background-color:#313131;z-index:1200;}
#LeftNavigation{position:fixed; top:130px; float:left; width:200px; height:100%; background-color:#313131;z-index:1200;}
#SensorsListTitle{position:absolute; top:40px; left:20px; color:#ffffff; font-size:23px; font-family: calibri, arial, helvetica, sans-serif; text-decoration:none; border-bottom:0px;}
#SensorsListTitle{position:absolute; top:40px; left:20px; color:#ffffff; font-size:23px; font-family: calibri, arial, helvetica, sans-serif; text-decoration:none; border-bottom:0px;}
Line 173: Line 174:
#BiosensorOverviewContent{position:absolute; top:250px; left:80px; width:840px; height:400px;color:#1b1b1b;  font-size:18px;font-family:calibri, Arial, Helvetica, sans-serif; text-align:justify;}
#BiosensorOverviewContent{position:absolute; top:250px; left:80px; width:840px; height:400px;color:#1b1b1b;  font-size:18px;font-family:calibri, Arial, Helvetica, sans-serif; text-align:justify;}
 +
 +
#FixedWhiteBackground{position:fixed; top:0px; float:right; width:1200px; height:100%; z-index:-100; background-color:#ffffff;}
#BiosensorMiningEditingArea{position:absolute; left:200px; top:340px; font-size:18px; font-family:calibri,Arial, Helvetica, sans-serif; text-align:justify; line-height:25px;}
#BiosensorMiningEditingArea{position:absolute; left:200px; top:340px; font-size:18px; font-family:calibri,Arial, Helvetica, sans-serif; text-align:justify; line-height:25px;}
 +
#BiosensorMiningEditingArea a{color:#ca4321; font-weight:bold;}
 +
 +
#Mining_Para1{position:relative; top:0px; left:80px; width:840px;}
 +
 +
 +
/*flowchart*/
 +
#MiningFlowchart{position:relative; top:0px; left:200px; width:600px; height:400px; background-color:white;}
 +
#FlowStep0{position:absolute; top:0px; left:0px; width:600px; z-index:100;}
 +
#FlowStep1{position:absolute; top:0px; left:0px; width:600px; display:none; z-index:100;}
 +
#FlowStep2{position:absolute; top:0px; left:0px; width:600px; display:none; z-index:100;}
 +
#FlowStep3{position:absolute; top:0px; left:0px; width:600px; display:none; z-index:100;}
 +
#FlowStep4{position:absolute; top:0px; left:0px; width:600px; display:none; z-index:100;}
 +
#Flowcover0{position:absolute; top:0px; left:0px; width:169px; height:49px; background-color:#ffffff; opacity:0; z-index:300;}
 +
#Flowcover1{position:absolute; top:49px; left:0px; width:169px; height:86px; background-color:#ffffff; opacity:0.8;z-index:300;}
 +
#Flowcover2{position:absolute; top:135px; left:0px; width:169px; height:83px; background-color:#ffffff; opacity:0.8;z-index:300;}
 +
#Flowcover3{position:absolute; top:218px; left:0px; width:169px; height:82px; background-color:#ffffff; opacity:0.8;z-index:300;}
 +
#Flowcover4{position:absolute; top:300px; left:0px; width:169px; height:85px; background-color:#ffffff; opacity:0.8;z-index:300;}
 +
/*endofflowchart*/
 +
 +
 +
#Mining_Legend1{position:relative; top:0px; left:100px; width:800px; font-size:14px;font-family:Arial, Helvetica, sans-serif; text-align:justify; line-height:20px;}
 +
#Mining_Para2{position:relative; top:0px; left:80px; width:840px;}
-
#Mining_Para1{position:absolute; top:40px; left:80px; width:840px;}
+
#SensorMiningTableTitle{position:relative; top:0px; left:100px; width:800px; text-align:center; border-bottom:0px; font-size:18px;}
-
#Mining_Figure1{position:absolute; top:250px; left:80px; width:840px;}
+
#SensorMiningTable{position:relative; top:0px; left:100px; width:800px; text-align:center;}
-
#Mining_Legend1{position:absolute; top:370px; left:100px; width:800px; font-size:14px;font-family:Arial, Helvetica, sans-serif; text-align:justify; line-height:18px;}
+
-
#Mining_Para2{position:absolute; top:490px; left:80px; width:840px;}
+
-
#SensorMiningTableTitle{position:absolute; top:1120px; left:100px; width:800px; text-align:center; border-bottom:0px; font-size:18px;}
+
#Mining_Para3{position:relative; top:0px; left:80px; width:840px;}
-
#SensorMiningTable{position:absolute; top:1160px; left:100px; width:800px; text-align:center;}
+
#Mining_Figure2{position:relative; top:0px; left:160px; width:720px;}
 +
#Mining_Legend2{position:relative; top:0px; left:100px; width:800px; font-size:14px;font-family:Arial, Helvetica, sans-serif; text-align:justify; line-height:20px;}
-
#Mining_Para3{position:absolute; top:2120px; left:80px; width:840px;}
+
#SourceCodeTitle{position:relative; top:0px; left:80px; width:160px; text-align: center;border-bottom: 0px;color: #ffffff;font-weight: bold;font-style: Italic;font-size: 24px;font-family: calibri,arial,helvetica,sans-serif;background-color: #ca4321;height: 30px;width: 200px;}
-
#Mining_Figure2{position:absolute; top:2250px; left:140px; width:720px;}
+
#SourceCodeProA{position:relative; top:0px; left:80px; width:840px;}
-
#Mining_Legend2{position:absolute; top:2770px; left:100px; width:800px; font-size:14px;font-family:Arial, Helvetica, sans-serif; text-align:justify; line-height:18px;}
+
#SourceCodeCrawl{position:relative; top:0px; left:80px; width:840px;}
-
#MileStone1{position:absolute; top:-100px;}
+
#MileStone1{position:absolute; top:80px;}
-
#MileStone2{position:absolute; top:970px;}
+
#MileStone2{position:absolute; top:1740px;}
 +
#MileStone3{position:absolute; top:3330px;}
Line 202: Line 227:
<ul id="navigationbar">
<ul id="navigationbar">
<li id="PKU_navbar_Home" class="Navbar_Item">
<li id="PKU_navbar_Home" class="Navbar_Item">
 +
                       
<a href="https://2013.igem.org/Team:Peking">Home</a>
<a href="https://2013.igem.org/Team:Peking">Home</a>
<ul id="Home_Sublist" >
<ul id="Home_Sublist" >
Line 207: Line 233:
</li>
</li>
<li id="PKU_navbar_Team" class="Navbar_Item">
<li id="PKU_navbar_Team" class="Navbar_Item">
-
<a href="https://2013.igem.org/Team:Peking/Team">Team</a>
+
<a href="">Team</a>
<ul id="Team_Sublist">
<ul id="Team_Sublist">
 +
                                <div class="BackgroundofSublist"></div>
<li><a href="https://2013.igem.org/Team:Peking/Team/Members">Members</a></li>
<li><a href="https://2013.igem.org/Team:Peking/Team/Members">Members</a></li>
<li><a href="https://2013.igem.org/Team:Peking/Team/Notebook">Notebook</a></li>
<li><a href="https://2013.igem.org/Team:Peking/Team/Notebook">Notebook</a></li>
Line 217: Line 244:
<a href="https://2013.igem.org/Team:Peking/Project">Project</a>
<a href="https://2013.igem.org/Team:Peking/Project">Project</a>
<ul id="Project_Sublist">
<ul id="Project_Sublist">
-
                                 <li><a href="https://2013.igem.org/Team:Peking/Project/AutoSensorMining">Auto Sensor Mining</a></li>
+
                                <div class="BackgroundofSublist"></div>
 +
                                 <li><a href="https://2013.igem.org/Team:Peking/Project/SensorMining">Biosensor Mining</a></li>
<li><a href="https://2013.igem.org/Team:Peking/Project/BioSensors">Biosensors</a></li>
<li><a href="https://2013.igem.org/Team:Peking/Project/BioSensors">Biosensors</a></li>
-
<li><a href="https://2013.igem.org/Team:Peking/Project/Plugins">Plug-ins</a></li>
+
<li><a href="https://2013.igem.org/Team:Peking/Project/Plugins">Adaptors</a></li>
<li><a href="https://2013.igem.org/Team:Peking/Project/BandpassFilter">Band-pass Filter</a></li>
<li><a href="https://2013.igem.org/Team:Peking/Project/BandpassFilter">Band-pass Filter</a></li>
-
                                 <li><a href="https://2013.igem.org/Team:Peking/Project/Application">Application</a></li>
+
                                 <li><a href="https://2013.igem.org/Team:Peking/Project/Devices">Devices</a></li>
</ul>
</ul>
</li>
</li>
<li id="PKU_navbar_Model" class="Navbar_Item">
<li id="PKU_navbar_Model" class="Navbar_Item">
-
<a href="https://2013.igem.org/Team:Peking/Model">Model</a>
+
<a href="">Model</a>
<ul id="Model_Sublist">
<ul id="Model_Sublist">
 +
                                <div class="BackgroundofSublist"></div>
 +
                                <li><a href="https://2013.igem.org/Team:Peking/Model">Band-pass Filter</a></li>
 +
                                <li><a href="https://2013.igem.org/Team:Peking/ModelforFinetuning">Biosensor Fine-tuning</a></li>
</ul>
</ul>
</li>
</li>
                         <li id="PKU_navbar_HumanPractice" class="Navbar_Item" style="width:90px">
                         <li id="PKU_navbar_HumanPractice" class="Navbar_Item" style="width:90px">
-
<a href="https://2013.igem.org/Team:Peking/HumanPractice">Data page</a>
+
<a href="">Data page</a>
-
<ul id="DataPage_Sublist">
+
<ul id="DataPage_Sublist">
 +
                                <div class="BackgroundofSublist"></div>
                                 <li><a href="https://2013.igem.org/Team:Peking/DataPage/Parts">Parts</a></li>
                                 <li><a href="https://2013.igem.org/Team:Peking/DataPage/Parts">Parts</a></li>
<li><a href="https://2013.igem.org/Team:Peking/DataPage/JudgingCriteria">Judging Criteria</a></li>
<li><a href="https://2013.igem.org/Team:Peking/DataPage/JudgingCriteria">Judging Criteria</a></li>
Line 244: Line 276:
<li id="PKU_navbar_HumanPractice" class="Navbar_Item" style="width:120px">
<li id="PKU_navbar_HumanPractice" class="Navbar_Item" style="width:120px">
<a href="https://2013.igem.org/Team:Peking/HumanPractice">Human Practice</a>
<a href="https://2013.igem.org/Team:Peking/HumanPractice">Human Practice</a>
-
<ul id="HumanPractice_Sublist">
+
<ul id="HumanPractice_Sublist">
-
                                 <li><a href="https://2013.igem.org/Team:Peking/HumanPractice/Questionnaire">Questionnaire</a></li>
+
                                <div class="BackgroundofSublist"></div>
-
<li><a href="https://2013.igem.org/Team:Peking/HumanPractice/FactoryVisit">Factory Visit</a></li>
+
                                 <li><a href="https://2013.igem.org/Team:Peking/HumanPractice/Questionnaire">Questionnaire Survey</a></li>
-
                                 <li><a href="https://2013.igem.org/Team:Peking/HumanPractice/iGEMWorkshop">iGEM Workshop</a></li>
+
<li><a href="https://2013.igem.org/Team:Peking/HumanPractice/FactoryVisit">Visit and Interview</a></li>
-
<li><a href="https://2013.igem.org/Team:Peking/HumanPractice/ModeliGEM">Model iGEM</a></li>
+
                                 <li><a href="https://2013.igem.org/Team:Peking/HumanPractice/ModeliGEM">Practical Analysis</a></li>
 +
<li><a href="https://2013.igem.org/Team:Peking/HumanPractice/iGEMWorkshop">Team Communication</a></li>
 +
 
</ul>
</ul>
</li>
</li>
</ul>
</ul>
-
         <a href="https://igem.org/Team_Wikis?year=2013"><img id="iGEM_logo" src="https://static.igem.org/mediawiki/igem.org/4/48/Peking_igemlogo.jpg"/></a>
+
         <a href="https://2013.igem.org/Main_Page"><img id="iGEM_logo" src="https://static.igem.org/mediawiki/igem.org/4/48/Peking_igemlogo.jpg"/></a>
</div>
</div>
<!--end navigationbar-->
<!--end navigationbar-->
Line 260: Line 294:
<div id="MajorBody">   
<div id="MajorBody">   
     <div id="LeftNavigation">
     <div id="LeftNavigation">
-
                 <h1 id="SensorsListTitle"><a href="https://2013.igem.org/Team:Peking/Project/BioSensors">Biosensor Mining</a></h1>
+
                 <h1 id="SensorsListTitle">Biosensor Mining</h1>
                 <ul id="SensorsList">
                 <ul id="SensorsList">
      
      
                     <li class="SensorsListItem"><a href="#MileStone1">Method</a></li>
                     <li class="SensorsListItem"><a href="#MileStone1">Method</a></li>
                     <li class="SensorsListItem"><a href="#MileStone2">Result</a></li>
                     <li class="SensorsListItem"><a href="#MileStone2">Result</a></li>
-
                      
+
                     <li class="SensorsListItem"><a href="#MileStone3">Source Code</a></li>
                    
                    
                 </ul>
                 </ul>
Line 277: Line 311:
     </div>
     </div>
 +
    <div id="FixedWhiteBackground"></div>
     <div id="BiosensorMiningEditingArea">     
     <div id="BiosensorMiningEditingArea">     
   
   
-
     <p id="Mining_Para1">In order to comprehensively profile aromatics in environment, our toolkit should be equipped with biosensors responding to various aromatic components. Abundant with protein informations, large protein databases, like uniprot, are ideal gold mines to look for new biobriks. Peking iGEM team has developed a four step sieving method to screen out feasible and well characterized aromatic sensors from the protein database uniprot. This method consists of several computer programs to process massive data and a manual adjustment step to guarantee a reliable result.</p>
+
     <p id="Mining_Para1"><br/><br/>To comprehensively profile aromatics in the environment, our toolkit should be equipped with a collection of biosensors that sense diverse aromatic compounds. However, no such comprehensive collection of biosensors is currently available. Noting the abundant genomic and proteomic data in today's databases, we speculated that large protein databases, such as UniProt, are ideal gold mines for finding new BioBricks. This year, the Peking iGEM team developed a four-step bioinformatic mining method to screen out feasible and well-characterized aromatics-sensing transcriptional regulators from the protein database. The method consists of several computer programs that process massive data and a manual adjustment step that further ensures the reliability of the mining results: </p>
-
    <img id="Mining_Figure1" src="https://static.igem.org/mediawiki/igem.org/f/f9/Peking2013_SensorMining_Figure1.PNG" />
+
 
-
     <p id="Mining_Legend1"><b>Figure 1.</b> The flow chart of sieving aromatic sensors. Step 1, narrowing down the scope of proteins into transcription factors (TFs) in specific bacteria species; step 2, screening out aromatics related transcription factors; step 3, examining whether the selected transcription factors are well studied; step 4, manual adjustment to verify the feasibility of the selected transcription factors.</p>
+
    <!--flowchart-->
-
     <p id="Mining_Para2">First, we narrowed down the scope of proteins into transcription factors in specific bacteria species. We chose Pseudomonas putida, pseudomonas sp and pseudomonas nitroreducens as our source organisms for they live in aromatic rich environments and chose E coli and bacillus subtilis for their clear genetic contexts. We downloaded all 21,096 entries of transcription regulation related proteins of these five bacteria species from the protein data base uniprot.
+
<div id="MiningFlowchart">
 +
    <img id="FlowStep0" src="https://static.igem.org/mediawiki/igem.org/3/38/Database.png" />
 +
    <img id="FlowStep1" src="https://static.igem.org/mediawiki/igem.org/6/63/Peking_miningStep1.png" />
 +
    <img id="FlowStep2" src="https://static.igem.org/mediawiki/igem.org/f/f0/Peking_miningStep2.png" />
 +
    <img id="FlowStep3" src="https://static.igem.org/mediawiki/igem.org/1/1b/Peking_miningStep3.png" />
 +
    <img id="FlowStep4" src="https://static.igem.org/mediawiki/igem.org/2/22/Peking_miningStep4.png" />
 +
    <div id="Flowcover0"></div>
 +
    <div id="Flowcover1"></div>
 +
    <div id="Flowcover2"></div>
 +
    <div id="Flowcover3"></div>
 +
    <div id="Flowcover4"></div>
 +
</div>
 +
    <!--endofflowchart-->
 +
 
 +
     <p id="Mining_Legend1"><b>Figure 1.</b> Flow chart for mining aromatics-sensing transcriptional regulators from the UniProt database. <b>Step 1</b>, narrowing the scope of proteins down to transcription factors (TFs) in specific bacterial species. <b>Step 2</b>, screening for aromatics-related transcription factors. <b>Step 3</b>, selecting the aromatics-related transcription factors that have been studied in the most detail. <b>Step 4</b>, manual adjustment to further evaluate the reliability of the selected transcription factors. <b>Move the mouse cursor over the flow chart to see detailed explanations of the individual steps. </b><br/><br/></p>
 +
    <img id="Mining_Figure2" src="https://static.igem.org/mediawiki/2013/8/87/Peking2013_SensorMining_Figure2.PNG" />
 +
    <p id="Mining_Legend2"><b>Figure 2.</b> Summary of the data mining process and screening criteria. The number of candidates remaining after each step is shown on the left face of the pyramid. The screening criteria are shown on the right.<br/><br/> </p>
 +
     <p id="Mining_Para2">First (Step 1 in <b>Fig. 1</b>), we narrowed the scope of proteins down to transcription factors of specific bacterial species. We chose <i>Pseudomonas putida</i>, <i>Pseudomonas sp.</i> and <i>Pseudomonas nitroreducens</i> as our source organisms because they live in aromatics-rich environments, and chose <i>E. coli</i> and <i>Bacillus subtilis</i> because of their clear genetic contexts. We downloaded all <b>21,096</b> entries of transcription-regulation-related proteins of these five bacterial species from the protein database UniProt.
<br/><br/>
<br/><br/>
-
Second, we screened out aromatics related transcription factors by analyzing the downloaded entries with a computer program. The computer program searched all the entries with a list of keywords (aromatic, benzene, phenol, phenyl, naphthalene, benzoic, benzaldehyde, tolyl, toluene, xylene, styrene) and rated the proteins. Once a keyword appeared in a protein’s entry, the program added one point to its rate. 912 proteins rated more than 0 point remained after this step.
+
Second (Step 2 in <b>Fig. 1</b>), we screened for aromatics-related transcription factors by analyzing the downloaded entries with a computer program. The program searched all the entries for a list of keywords (aromatic, benzene, phenol, phenyl, naphthalene, benzoic, benzaldehyde, tolyl, toluene, xylene, styrene) and scored the proteins. When a keyword appeared in a protein’s entry, the program added one point to its score. <b>912</b> proteins with scores higher than 0 remained after this step.
<br/><br/>
<br/><br/>
-
Third, we used another computer program to examine whether the transcription factors remained after step two are well studied. The computer program excluded unnamed proteins that have open reading frame numbers only. Because proteins that have been characterized in E coli are more likely to work well in our host species, the computer program then searched the names of the remaining  proteins together with the keyword “E coli” in google scholar and added k/10 point to its rate ( k is the number of papers in the result). 60 proteins rated more than 10 points remained after this step.
+
Third (Step 3 in <b>Fig. 1</b>), we used another computer program to examine whether the transcription factors remaining after Step 2 were well studied. The program excluded unnamed proteins that have only open reading frame numbers. Because proteins that have been characterized in <i>E. coli</i> are more likely to work well in our intended biosensor circuits (which work in <i>E. coli</i>), the program then searched Google Scholar for the name of each remaining protein together with the keyword “E. coli” and added <i>k</i>/10 points to its score (<i>k</i> is the number of citations). <b>60</b> proteins with scores higher than 10 points remained after this step.
<br/><br/>
<br/><br/>
-
Finally, we carried on a manual adjustment on the selected 60 proteins to verify their feasibility. Proteins that regulate aromatic degradation pathways without actually responding to aromatic compounds and those originated from two component systems are excluded. 19 proteins are manually selected at last (<b>Table 1</b>).
+
Finally (Step 4 in <b>Fig. 1</b>), we carried out a manual adjustment on the 60 proteins to confirm their reliability. Proteins that have no actual ability to sense aromatic compounds, as well as other possible false positives such as bacterial two-component systems (their performance is highly genetic-context-dependent across different bacterial species), were excluded. In the end, <b>17</b> proteins were manually selected (<b>Table 1</b>). The entire mining process is summarized in <b>Fig. 2</b>.
</p>
</p>
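The scoring in Steps 2 and 3 can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions, not the team's C++ programs: the function names, the `name`/`text`/`hits` fields and the case-insensitive substring matching are hypothetical, and the sketch assumes one point per distinct keyword found in an entry. The number of search results `k` is supplied by the caller as `hits`; the exclusion of unnamed proteins and the actual Google Scholar query are left out.

```javascript
// Keyword list copied from the Step 2 description.
const KEYWORDS = ["aromatic", "benzene", "phenol", "phenyl", "naphthalene",
  "benzoic", "benzaldehyde", "tolyl", "toluene", "xylene", "styrene"];

// Step 2 (assumed reading): one point per distinct keyword found in the entry text.
function keywordScore(entryText) {
  const text = entryText.toLowerCase();
  return KEYWORDS.filter((kw) => text.includes(kw)).length;
}

// Step 3: add k/10 points, where k is the number of search results for the protein.
function literatureScore(score, k) {
  return score + k / 10;
}

// Apply both cuts: score > 0 after Step 2, then score > threshold after Step 3.
// `entries` is a hypothetical array of { name, text, hits } objects.
function selectCandidates(entries, threshold = 10) {
  return entries
    .map((e) => ({ name: e.name, hits: e.hits, score: keywordScore(e.text) }))
    .filter((e) => e.score > 0)
    .map((e) => ({ name: e.name, score: literatureScore(e.score, e.hits) }))
    .filter((e) => e.score > threshold)
    .sort((a, b) => b.score - a.score);
}

// Made-up example entries, for illustration only.
const demoEntries = [
  { name: "A", text: "Regulates phenol and toluene degradation", hits: 100 },
  { name: "B", text: "Phenol hydroxylase regulator", hits: 50 },
  { name: "C", text: "Heat shock protein", hits: 1000 },
];
console.log(selectCandidates(demoEntries));
```

With these made-up entries, only "A" passes both cuts: "C" matches no keyword and is dropped in Step 2, while "B" reaches a score of only 6 in Step 3.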
   
   
-
     <h1 id="SensorMiningTableTitle"><b>Table 1. Proteins selected after manual adjustment</b></h1>
+
     <h1 id="SensorMiningTableTitle"><b>Table 1. Aromatics-sensing transcriptional regulators mined from UniProt</b></h1>
     <table border="1" id="SensorMiningTable">
     <table border="1" id="SensorMiningTable">
-
       <tr><th>Protein names</th><th>Sources</th><th>Reported Typical Inducers</th><th>Scores</th></tr>
+
       <tr><th>Protein names</th><th>Sources</th><th>Reported Typical Inducers <br/>(<b><a href="https://static.igem.org/mediawiki/igem.org/2/24/Peking2013_Chemicals_V3%2B.pdf">Click here</a> for the chemical formulas of the aromatic compounds</b>)</th><th>Scores</th></tr>
-
       <tr><td>XylS</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>Benzoic acid</td><td>259</td></tr>
+
       <tr><td><a href="https://2013.igem.org/Team:Peking/Project/BioSensors/XylS">XylS</a></td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>Benzoic acid</td><td>259</td></tr>
-
       <tr><td>XylR</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>m-Xylene</td><td>219</td></tr>
+
       <tr><td><a href="https://2013.igem.org/Team:Peking/Project/BioSensors/XylR">XylR</a></td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>m-Xylene</td><td>219</td></tr>
       <tr><td>tyrR</td><td><I>Escherichia coli</I> (strain K12)</td><td>tyrosine</td><td>160</td></tr>
       <tr><td>tyrR</td><td><I>Escherichia coli</I> (strain K12)</td><td>tyrosine</td><td>160</td></tr>
-
       <tr><td>nahR</td><td><I>Pseudomonas putida </I>(Arthrobacter siderocapsulatus)</td><td>Salicylic acid</td><td>106</td></tr>
+
       <tr><td><a href="https://2013.igem.org/Team:Peking/Project/BioSensors/NahR">nahR</a></td><td><I>Pseudomonas putida </I>(Arthrobacter siderocapsulatus)</td><td>Salicylic acid</td><td>106</td></tr>
-
      <tr><td>catR</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>catechol</td><td>104</td></tr>
+
       <tr><td>CapR</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>phenol</td><td>80</td></tr>
       <tr><td>CapR</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>phenol</td><td>80</td></tr>
-
       <tr><td>hcaR</td><td><I>Escherichia coli</I> (strain K12)</td><td>3-Phenyl-propionic acid</td><td>56</td></tr>
+
       <tr><td><a href="https://2013.igem.org/Team:Peking/Project/BioSensors/HcaR">hcaR</a></td><td><I>Escherichia coli</I> (strain K12)</td><td>3-Phenyl-propionic acid</td><td>56</td></tr>
-
       <tr><td>padR</td><td><I>Bacillus subtilis</I> (strain 168)</td><td>coumalic acid</td><td>45</td></tr>
+
       <tr><td><a href="https://2013.igem.org/Team:Peking/Project/BioSensors/DmpR">dmpR</a></td><td><I>Pseudomonas sp. </I>(strain CF600).</td><td>phenol</td><td>43</td></tr>
-
      <tr><td>dmpR</td><td><I>Pseudomonas sp. </I>(strain CF600).</td><td>phenol</td><td>43</td></tr>
+
       <tr><td>pobR</td><td><I>Pseudomonas putida</I>(Arthrobacter siderocapsulatus)</td><td>p-Hydroxybenzoic acid</td><td>29</td></tr>
       <tr><td>pobR</td><td><I>Pseudomonas putida</I>(Arthrobacter siderocapsulatus)</td><td>p-Hydroxybenzoic acid</td><td>29</td></tr>
       <tr><td>CymR</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>4-Isopropyl benzoate</td><td>23</td></tr>
       <tr><td>CymR</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>4-Isopropyl benzoate</td><td>23</td></tr>
-
       <tr><td>Paax</td><td><I>Escherichia coli </I>(strain K12)</td><td>phenylacedtyl-CoA</td><td>20</td></tr>
+
       <tr><td><a href="https://2013.igem.org/Team:Peking/Project/BioSensors/HpaR#ContentHpaR4">PaaX</a></td><td><I>Escherichia coli </I>(strain K12)</td><td>phenylacetyl-CoA</td><td>20</td></tr>
-
       <tr><td>hpaR</td><td><I>Pseudomonas putida </I>(Arthrobacter siderocapsulatus)</td><td>(3-Hydroxy-phenyl)-acetic acid</td><td>18</td></tr>
+
       <tr><td><a href="https://2013.igem.org/Team:Peking/Project/BioSensors/HpaR">hpaR</a></td><td><I>Pseudomonas putida </I>(Arthrobacter siderocapsulatus)</td><td>(3-Hydroxy-phenyl)-acetic acid</td><td>18</td></tr>
       <tr><td>mhpR</td><td><I>Escherichia coli</I> (strain K12)</td><td>(3-Hydroxy-phenyl)-propionic acid</td><td>18</td></tr>
       <tr><td>mhpR</td><td><I>Escherichia coli</I> (strain K12)</td><td>(3-Hydroxy-phenyl)-propionic acid</td><td>18</td></tr>
       <tr><td>phhR</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>phenylalanine</td><td>16</td></tr>
       <tr><td>phhR</td><td><I>Pseudomonas putida</I> (Arthrobacter siderocapsulatus)</td><td>phenylalanine</td><td>16</td></tr>
       <tr><td>bphS</td><td><I>Pseudomonas sp.</I> (strain CF600).</td><td>2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoic acid</td><td>16</td></tr>
       <tr><td>bphS</td><td><I>Pseudomonas sp.</I> (strain CF600).</td><td>2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoic acid</td><td>16</td></tr>
-
       <tr><td>HbpR</td><td><I>Pseudomonas nitroreducens</I></td><td>2-Hydroxybiphenyl</td><td>12</td></tr>
+
       <tr><td><a href="https://2013.igem.org/Team:Peking/Project/BioSensors/HbpR">HbpR</a></td><td><I>Pseudomonas nitroreducens</I></td><td>2-Hydroxybiphenyl</td><td>12</td></tr>
       <tr><td>phcR</td><td><I>Pseudomonas putida </I>(Arthrobacter siderocapsulatus)</td><td>phenol</td><td>11</td></tr>
       <tr><td>phcR</td><td><I>Pseudomonas putida </I>(Arthrobacter siderocapsulatus)</td><td>phenol</td><td>11</td></tr>
       <tr><td>yodB</td><td><I>Bacillus subtilis </I>(strain 168)</td><td>2-methyl hydroquinone</td><td>11</td></tr>
       <tr><td>yodB</td><td><I>Bacillus subtilis </I>(strain 168)</td><td>2-methyl hydroquinone</td><td>11</td></tr>
Line 317: Line 367:
-
     <p id="Mining_Para3">Peking iGEM team has successfully screened out a set of feasible aromatic sensors using the four step sieving method. Because of its good transferability and massive data processing ability, we also believe that this method will be useful in other kinds of biobriks mining in this information explosion age. </p>
+
     <p id="Mining_Para3"><br/>
-
    <img id="Mining_Figure2" src="https://static.igem.org/mediawiki/igem.org/4/46/Peking2013_SensorMning_Figure2.PNG" />
+
In summary, using the four-step bioinformatic data mining method, we have successfully screened out a set of aromatics-sensing transcriptional regulators (<b>Fig. 2</b>). These 17 aromatics-sensing regulators are expected to be reliable and well studied.
-
    <p id="Mining_Legend2"><b>Figure 2.</b> Sieving conditions and sieving results of each step. Numbers of selected proteins after each step are showing on the left surface of the pyramid. Sieving conditions are showing on the right. </p>
+
<br/><br/>
 +
<b>We believe that this method may also be applied to mine other types of BioBricks. Moreover, although our data mining method is conventional in the bioinformatics field, we consider such an approach highly instructive for routine synthetic biology research, because it will greatly strengthen our ability to mine rich collections of high-quality BioBricks from increasingly massive data in an automated manner</b>.  
 +
<br/><br/>
 +
In the subsequent study, we take these regulators as core components to build <a href="https://2013.igem.org/Team:Peking/Project/BioSensors">a comprehensive set of biosensor circuits</a> for aromatics detection.<br/><br/>
 +
</p>
 +
   
 +
 
     <div id="MileStone1"></div>
     <div id="MileStone1"></div>
     <div id="MileStone2"></div>
     <div id="MileStone2"></div>
 +
    <div id="MileStone3"></div>
 +
 +
    <h1 id="SourceCodeTitle">Source Code</h1>
 +
    <p id="SourceCodeProA">Source code for protein sorting: <a href="https://static.igem.org/mediawiki/igem.org/b/b5/Peking2013_Mining_ProteinAnalysis.cpp.txt">Protein analysis.cpp,</a> <a href="https://static.igem.org/mediawiki/igem.org/1/1f/Peking2013_Mining_Protein_heap.cpp.txt">Protein heap.cpp,</a> <a href="https://static.igem.org/mediawiki/igem.org/b/bd/Peking2013_Mining_Proteinheap.h.txt">Proteinheap.h.</a></p>
 +
    <p id="SourceCodeCrawl">Source code for internet crawler: <a href="https://static.igem.org/mediawiki/igem.org/d/df/Peking2013_Mining_NewCrawler.cpp.txt">New crawler.cpp,</a> <a href="https://static.igem.org/mediawiki/igem.org/1/1f/Peking2013_Mining_Protein_heap.cpp.txt">Protein heap.cpp,</a> <a href="https://static.igem.org/mediawiki/igem.org/b/bd/Peking2013_Mining_Proteinheap.h.txt">Proteinheap.h,</a>
 +
<a href="https://static.igem.org/mediawiki/igem.org/7/7e/Peking2013_Mining_Heap_oper.cpp.txt">heap_oper.cpp,</a>
 +
<a href="https://static.igem.org/mediawiki/igem.org/d/d3/Peking2013_Mining_Heap.h.txt">heap.h.</a> </p>
 +
   
     </div>
     </div>
Line 343: Line 407:
-
function MoveInSlide(SlideId)
+
/*flowchart*/
-
{
+
var TimerInterval=5000;
-
$(SlideId).animate({top:"0px"});
+
var myTimerID = setInterval("NextStep()",TimerInterval);
-
         
+
 
-
};
+
function NextStep()
 +
{  
 +
  if($('#FlowStep0').css('display')=='block')
 +
  {
 +
        Activate1();
 +
  }
 +
  else if($('#FlowStep1').css('display')=='block')
 +
  {
 +
        Activate2();
 +
  }
 +
  else if($('#FlowStep2').css('display')=='block')
 +
  {
 +
        Activate3();
 +
  }
 +
  else if($('#FlowStep3').css('display')=='block')
 +
  {
 +
        Activate4();
 +
  }
 +
  else if($('#FlowStep4').css('display')=='block')
 +
  {
 +
        Activate0();
 +
  }
 +
}
 +
 
 +
 
 +
 
 +
document.getElementById("Flowcover0").onmouseover=function()//hover 0
 +
{
 +
        clearInterval(myTimerID);
 +
        Activate0();
 +
}
 +
document.getElementById("Flowcover0").onmouseout=function()
 +
{
 +
        myTimerID = setInterval("NextStep()",TimerInterval);
 +
}
 +
document.getElementById("Flowcover1").onmouseover=function()//hover 1
 +
{
 +
        clearInterval(myTimerID);
 +
        Activate1();
 +
}
 +
document.getElementById("Flowcover1").onmouseout=function()
 +
{
 +
        myTimerID = setInterval("NextStep()",TimerInterval);
 +
}
 +
document.getElementById("Flowcover2").onmouseover=function()//hover 2
 +
{
 +
        clearInterval(myTimerID);
 +
        Activate2();
 +
}
 +
document.getElementById("Flowcover2").onmouseout=function()
 +
{
 +
        myTimerID = setInterval("NextStep()",TimerInterval);
 +
}
 +
document.getElementById("Flowcover3").onmouseover=function()//hover 3
 +
{
 +
        clearInterval(myTimerID);
 +
        Activate3();
 +
}
 +
document.getElementById("Flowcover3").onmouseout=function()
 +
{
 +
        myTimerID = setInterval("NextStep()",TimerInterval);
 +
}
 +
document.getElementById("Flowcover4").onmouseover=function()//hover 4
 +
{
 +
        clearInterval(myTimerID);
 +
        Activate4();
 +
}
 +
document.getElementById("Flowcover4").onmouseout=function()
 +
{
 +
        myTimerID = setInterval("NextStep()",TimerInterval);
 +
}
 +
 
 +
function Activate0()
 +
{
 +
    document.getElementById("Flowcover0").style.opacity="0";
 +
    document.getElementById("Flowcover1").style.opacity="0.8";
 +
    document.getElementById("Flowcover2").style.opacity="0.8";
 +
    document.getElementById("Flowcover3").style.opacity="0.8";
 +
    document.getElementById("Flowcover4").style.opacity="0.8";
 +
    document.getElementById("FlowStep0").style.display="block";
 +
    document.getElementById("FlowStep1").style.display="none";
 +
    document.getElementById("FlowStep2").style.display="none";
 +
    document.getElementById("FlowStep3").style.display="none";
 +
    document.getElementById("FlowStep4").style.display="none";
 +
 
 +
    return 0;
 +
 
 +
}
 +
function Activate1()
 +
{
 +
    document.getElementById("Flowcover0").style.opacity="0.8";
 +
    document.getElementById("Flowcover1").style.opacity="0";
 +
    document.getElementById("Flowcover2").style.opacity="0.8";
 +
    document.getElementById("Flowcover3").style.opacity="0.8";
 +
    document.getElementById("Flowcover4").style.opacity="0.8";
 +
    document.getElementById("FlowStep0").style.display="none";
 +
    document.getElementById("FlowStep1").style.display="block";
 +
    document.getElementById("FlowStep2").style.display="none";
 +
    document.getElementById("FlowStep3").style.display="none";
 +
    document.getElementById("FlowStep4").style.display="none";
 +
 
 +
    return 1;
 +
 
 +
}
 +
function Activate2()
 +
{
 +
    document.getElementById("Flowcover0").style.opacity="0.8";
 +
    document.getElementById("Flowcover1").style.opacity="0.8";
 +
    document.getElementById("Flowcover2").style.opacity="0";
 +
    document.getElementById("Flowcover3").style.opacity="0.8";
 +
    document.getElementById("Flowcover4").style.opacity="0.8";
 +
    document.getElementById("FlowStep0").style.display="none";
 +
    document.getElementById("FlowStep1").style.display="none";
 +
    document.getElementById("FlowStep2").style.display="block";
 +
    document.getElementById("FlowStep3").style.display="none";
 +
    document.getElementById("FlowStep4").style.display="none";
 +
 
 +
    return 2;
 +
 
 +
}
 +
function Activate3()
 +
{
 +
    document.getElementById("Flowcover0").style.opacity="0.8";
 +
    document.getElementById("Flowcover1").style.opacity="0.8";
 +
    document.getElementById("Flowcover2").style.opacity="0.8";
 +
    document.getElementById("Flowcover3").style.opacity="0";
 +
    document.getElementById("Flowcover4").style.opacity="0.8";
 +
    document.getElementById("FlowStep0").style.display="none";
 +
    document.getElementById("FlowStep1").style.display="none";
 +
    document.getElementById("FlowStep2").style.display="none";
 +
    document.getElementById("FlowStep3").style.display="block";
 +
    document.getElementById("FlowStep4").style.display="none";
 +
 
 +
    return 3;
 +
 
 +
}
 +
function Activate4()
 +
{
 +
    document.getElementById("Flowcover0").style.opacity="0.8";
 +
    document.getElementById("Flowcover1").style.opacity="0.8";
 +
    document.getElementById("Flowcover2").style.opacity="0.8";
 +
    document.getElementById("Flowcover3").style.opacity="0.8";
 +
    document.getElementById("Flowcover4").style.opacity="0";
 +
    document.getElementById("FlowStep0").style.display="none";
 +
    document.getElementById("FlowStep1").style.display="none";
 +
    document.getElementById("FlowStep2").style.display="none";
 +
    document.getElementById("FlowStep3").style.display="none";
 +
    document.getElementById("FlowStep4").style.display="block";
 +
 
 +
    return 4;
 +
 
 +
}
 +
 
 +
 
 +
 
-
function MoveOutSlide(SlideId)
+
/*endofflowchart*/
-
{
+
-
$(SlideId).animate({top:"280px"});
+
-
       
+
-
};
+

Latest revision as of 18:12, 28 October 2013

Biosensor Mining



To profile aromatics in the environment comprehensively, our toolkit needs a collection of biosensors that sense diverse aromatic compounds. However, no such comprehensive collection is currently available. Given the abundant genomic and proteomic data in today's databases, we reasoned that large protein databases such as UniProt are ideal gold mines for finding new BioBricks. This year, the Peking iGEM team developed a four-step bioinformatic mining method to screen feasible, well-characterized aromatics-sensing transcriptional regulators from the protein database. The method combines several computer programs that process the bulk data with a manual adjustment step that further ensures the reliability of the mining results:

Figure 1. Flow chart for mining aromatics-sensing transcriptional regulators from the UniProt database. Step 1: narrow the scope of proteins to transcription factors (TFs) in specific bacterial species. Step 2: screen for aromatics-related transcription factors. Step 3: select the aromatics-related transcription factors that have been studied in the most detail. Step 4: manually adjust the list to further evaluate the reliability of the selected transcription factors. Move the mouse cursor over a step to see its detailed explanation.

Figure 2. Summary of the data mining process and screening criteria. The number of candidates remaining after each step is shown on the left face of the pyramid; the screening criteria are shown on the right.

First (Step 1 in Fig. 1), we narrowed the scope of proteins to transcription factors of specific bacterial species. We chose Pseudomonas putida, Pseudomonas sp. and Pseudomonas nitroreducens as source organisms because they live in aromatics-rich environments, and E. coli and Bacillus subtilis because their genetic contexts are well understood. We downloaded all 21,096 entries for transcription-regulation-related proteins of these five bacterial species from the UniProt protein database.

Second (Step 2 in Fig. 1), we screened for aromatics-related transcription factors by analyzing the downloaded entries with a computer program. The program searched every entry for a list of keywords (aromatic, benzene, phenol, phenyl, naphthalene, benzoic, benzaldehyde, tolyl, toluene, xylene, styrene) and scored each protein: whenever a keyword appeared in a protein’s entry, the program added one point to that protein’s score. After this step, 912 proteins with scores higher than 0 remained.
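The keyword scoring of Step 2 can be sketched in a few lines of C++. This is only an illustration written for this page, not the team's Protein analysis.cpp: the function names `to_lower` and `keyword_score` are hypothetical, and it assumes case-insensitive substring matching with one point per distinct keyword found, which the description above does not specify.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Keyword list quoted from the Step 2 description.
static const std::vector<std::string> kKeywords = {
    "aromatic", "benzene", "phenol", "phenyl", "naphthalene", "benzoic",
    "benzaldehyde", "tolyl", "toluene", "xylene", "styrene"};

// Return a lower-cased copy so that matching ignores case (assumption).
std::string to_lower(const std::string& text) {
    std::string out(text);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return out;
}

// Hypothetical helper: one point for each keyword that occurs at least once
// anywhere in the entry text (assumption: repeated hits are not counted twice).
int keyword_score(const std::string& entry_text) {
    const std::string lowered = to_lower(entry_text);
    int score = 0;
    for (const std::string& keyword : kKeywords) {
        if (lowered.find(keyword) != std::string::npos) {
            ++score;
        }
    }
    return score;
}
```

Under these assumptions, an entry passes Step 2 when `keyword_score` returns a value above 0. Substring matching also means that "phenyl" matches inside longer words such as "phenylalanine", which may or may not be what the original program did.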

Third (Step 3 in Fig. 1), we used another computer program to check whether the transcription factors remaining after Step 2 were well studied. The program first excluded unnamed proteins that have only open reading frame numbers. Because proteins that have been characterized in E. coli are more likely to work well in our intended biosensor circuits (which operate in E. coli), the program then searched Google Scholar for the name of each remaining protein together with the keyword “E. coli” and added k/10 points to the protein’s score, where k is the number of citations. After this step, 60 proteins with scores higher than 10 remained.
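The Step 3 filter can be sketched as follows. This is a simplified illustration, not the team's crawler: the `Candidate` struct, its field names and the functions `total_score` and `select_well_studied` are hypothetical, the citation count k is taken as an input rather than fetched from Google Scholar, and an empty name stands in for a protein known only by its open reading frame number.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one protein that survived Step 2.
struct Candidate {
    std::string name;    // protein name; empty if only an ORF number exists (assumption)
    int keyword_points;  // score accumulated in Step 2
    int citations;       // k, the citation count found for "<name> E. coli"
};

// Score after Step 3: the Step 2 points plus k/10, as described in the text.
double total_score(const Candidate& c) {
    return c.keyword_points + c.citations / 10.0;
}

// Drop unnamed proteins, then keep those scoring strictly above the threshold
// (10 points in the text).
std::vector<Candidate> select_well_studied(const std::vector<Candidate>& candidates,
                                           double threshold = 10.0) {
    std::vector<Candidate> kept;
    for (const Candidate& c : candidates) {
        if (c.name.empty()) {
            continue;  // unnamed protein: excluded before scoring
        }
        if (total_score(c) > threshold) {
            kept.push_back(c);
        }
    }
    return kept;
}
```

Dividing by `10.0` rather than `10` keeps the fractional part of k/10; whether the original program truncated the quotient is not stated, so this is a design choice of the sketch.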

Finally (Step 4 in Fig. 1), we manually reviewed the 60 proteins to confirm their reliability. We excluded proteins that have no actual ability to sense aromatic compounds, as well as other likely false positives such as bacterial two-component systems, whose performance depends strongly on genetic context and varies across bacterial species. In the end, 17 proteins were retained (Table 1). The entire mining process is summarized in Fig. 2.

Table 1. Aromatics-sensing transcriptional regulators mined from UniProt

Protein name | Source | Reported typical inducer | Score
XylS | Pseudomonas putida (Arthrobacter siderocapsulatus) | Benzoic acid | 259
XylR | Pseudomonas putida (Arthrobacter siderocapsulatus) | m-Xylene | 219
tyrR | Escherichia coli (strain K12) | tyrosine | 160
nahR | Pseudomonas putida (Arthrobacter siderocapsulatus) | Salicylic acid | 106
CapR | Pseudomonas putida (Arthrobacter siderocapsulatus) | phenol | 80
hcaR | Escherichia coli (strain K12) | 3-Phenyl-propionic acid | 56
dmpR | Pseudomonas sp. (strain CF600) | phenol | 43
pobR | Pseudomonas putida (Arthrobacter siderocapsulatus) | p-Hydroxybenzoic acid | 29
CymR | Pseudomonas putida (Arthrobacter siderocapsulatus) | 4-Isopropyl benzoate | 23
Paax | Escherichia coli (strain K12) | phenylacetyl-CoA | 20
hpaR | Pseudomonas putida (Arthrobacter siderocapsulatus) | (3-Hydroxy-phenyl)-acetic acid | 18
mhpR | Escherichia coli (strain K12) | (3-Hydroxy-phenyl)-propionic acid | 18
phhR | Pseudomonas putida (Arthrobacter siderocapsulatus) | phenylalanine | 16
bphS | Pseudomonas sp. (strain CF600) | 2-hydroxy-6-oxo-6-phenylhexa-2,4-dienoic acid | 16
HbpR | Pseudomonas nitroreducens | 2-Hydroxybiphenyl | 12
phcR | Pseudomonas putida (Arthrobacter siderocapsulatus) | phenol | 11
yodB | Bacillus subtilis (strain 168) | 2-methyl hydroquinone | 11


In summary, using the four-step bioinformatic data mining method, we have successfully screened out a set of aromatics-sensing transcriptional regulators (Fig. 2). These 17 aromatics-sensing regulators are expected to be reliable and well studied.

We believe that this method could also be applied to mine other types of BioBricks. Moreover, although our data mining method is conventional in the bioinformatics field, we consider such an approach highly instructive for routine synthetic biology research, because it strengthens our ability to mine rich collections of high-quality BioBricks from increasingly massive data in an automated manner.

In the following study, we will take these regulators as core components to build a comprehensive set of biosensor circuits for aromatics detection.

Source Code

Source code for protein sorting: Protein analysis.cpp, Protein heap.cpp, Proteinheap.h.

Source code for internet crawler: New crawler.cpp, Protein heap.cpp, Proteinheap.h, heap_oper.cpp, heap.h.