Team:XMU Software
From 2013.igem.org
The XMU Software 2013 team consists of 11 team members, 2 instructors and 3 advisors. This extraordinarily energetic and creative team is tightly connected by trust, collaboration and affection among its members. Learn more about our team on the Team page.
Our project includes 2 independent software tools - the Brick Worker and the E' NOTE. The former is a software suite for the evaluation and optimization of BioBricks, i.e., promoters, RBSs, protein coding sequences and terminators. E' NOTE is a web application serving as an assistant for experiments. Its functions, such as experiment recording and experimental template customization, make the experimental process easier and more enjoyable. To learn more about our project, please visit the Project page.
See more about XMU Software 2013 safety form on the Safety form page.
XMU Software 2013 has hosted a series of activities aimed at facilitating communication and collaboration among iGEM teams, publicizing the iGEM competition and promoting the development of synthetic biology. These efforts have paid off and received satisfying feedback. More information is available on the Outreach page.
See more about XMU Software 2013 notebook on the Notebook page.
Dr. Baishan Fang is a professor in the College of Chemistry & Chemical Engineering at Xiamen University. Within the biocatalysis and biotransformation group, his research focuses mainly on synthetic biology, the mining and transformation of enzymes, the construction of bio-molecular machines and the application of new biocatalysts. His major role in the XMU iGEM team is to mentor and enlighten all the members.
Dr. I-Son Ng is an associate professor in the College of Chemistry & Chemical Engineering at Xiamen University. Her research interests are biofuels, enzyme and protein engineering, zymology, genetic engineering, biochemical separation procedures and proteomics. Her role in the project is to provide suggestions and instruments.
It is great for a team to have an omniscient advisor, and even better to have an inspiring one. Ruosang Qiu, our beloved advisor, is definitely both. Her hard work and undoubtedly adorable personality motivate our team members' efforts, and her clear mind combined with provident planning lays the foundation of our successful project. To quote her words: "I was a happy iGEMer in 2012, and I'm going to make you all happy iGEMers in 2013."
The past three years have seen Xin Wu's tremendous dedication to iGEM: a passionate team member in XMU China 2011, a devoted team leader in XMU China 2012 and now an invaluable advisor in XMU Software 2013. Had it not been for Xin Wu's constant encouragement and guidance, we, the inexperienced iGEMers, might have been faint-hearted and failed to face up to the challenges along the way. It is by drawing on his expertise in synthetic biology and his experience in the iGEM competition that we have solved the seemingly unsolvable and conquered the seemingly unconquerable.
Youbin Mo is one of the great advisors of the XMU Software team in 2013. As a computational biophysicist, he is an unquestionable master of biological modeling and computer programming. Website construction is another of Mo's technical skills, which he acquired by participating in iGEM last year. Youbin puts his talent to use by teaching fundamental programming skills to new iGEMers and guiding them to become self-reliant synthetic biologists.
Team Members
Yumin Hong
- Human Practice
- Photographer
- Design User Interface of all software and tools
- Design Visual Identity of team (team logo, mascot, wiki, poster and so on)
Yuezhen Chen
- Constructor of the circuit
- Wet Lab Journal
- Lecture and Theme Campus Party
- Wiki designer and implementer
- E' NOTE designer and implementer
Yijuan Zhang
- BioBrick designer
- Circuit construction
- Data analyzer
- The designer of Promoter-decoder, RBS-decoder
- Lecture and Theme Campus Party
- The chief translator
Xin Huang
- Fluorescence test
- Mascot maker
- Travel management
- Wetlab assistance
- Data analyzer
- Programmer
Tao Han
- Talented programmer
- Human practice
Shen Lin
- Designer of the SynoProteiner
- Human practice
- Translator
- Financial management
- Wiki designer
Likai Qiu
- E’ NOTE designer and main implementer
- Wiki implementer
- Network advisor of XMU-China
Jianxing Huang
- Meeting memo
- Draft specification of E’ NOTE
- PPT of presentation
Jiang Huang
- Fluorescence test
- Designer of SynoProteiner part
- Wetlab assistance
Han Cheng
- Constructor of the circuit and fluorescence test
- The chief of Human practice
- Mascot maker
- First lord of the treasury
Advisors
Youbin Mo
- Software training
- Wiki checker
Xin Wu
- Snacks sponsor
- Rescue worker
Ruosang Qiu
- Charge of the whole project
- Experiment training
All work described on this wiki or on our parts registry pages was done by iGEM Team XMU Software 2013. We managed to finish the whole project ourselves, from planning and financing to the complete dry and wet lab work. Nevertheless, we could not have done all this work without the help, advice and guidance of several people. Therefore, special thanks go to the following people:
1. Team XMU China, the wet lab team of our university, gave us a hand with construction work.
2. Tina Zhang, the advisor of team XMU China, helped us with experiments, especially site-specific mutagenesis and PCR.
3. Prof. Zhiliang Ji, College of Life Science, provided us with valuable guidance in choosing the project.
4. Prof. Shoufa Han provided us with many instruments, such as the ELISA reader used to test fluorescence.
5. Qiang Kou, the previous team leader of the SYSU-Software team, gave us a lot of help.
6. Wenjun Rao, a shy boy from XMU Software College, built the beautiful interfaces for Brick Worker.
Xiamen, also known as Amoy in the West, is a cozy city located in the southeastern part of China, with a relaxing coastal charm and a population of 1.3 million. It is a historical harbor city that was founded in the mid-14th century, in the early years of the Ming Dynasty. In the early 1980s, Xiamen was declared one of China's first Special Economic Zones, taking advantage of the city's heritage as a trading center and its proximity to Taiwan. In 2004 the city won the finals of the world's Human Settlements and Environment Award, "Nations in Bloom". Xiamen is one of China's most attractive and best-maintained resort cities, and attracts a large number of foreign and local tourists. The city is easily accessible by air, and there are direct flights from Hong Kong, Kuala Lumpur, Osaka, Seoul, Singapore and Tokyo. Within China, Xiamen airport is linked to more than 30 domestic airports.
Xiamen University (XMU), also known as Universitas Amoiensis in Latin, is one of the top universities in China. It was founded in 1921 by Tan Kah-Kee, the well-known patriotic overseas Chinese leader. As a comprehensive university, XMU offers a wide range of disciplines as well as many specialized institutes. Economics, accounting, chemistry, life science and marine science all enjoy high reputations nationwide and even worldwide. The main campus of XMU is located in a picturesque setting between the sea and a scenic mountain, spreads over 150 hectares, and is generally regarded as the most beautiful campus in China.
Abstract
In a promoter sequence, the sigma factor binding site and other transcription factor binding sites significantly affect the strength of binding. Existing promoter annotation software mostly focuses on predicting either other transcription factors or one particular type of sigma factor, and fails to analyze promoters with both sigma factors and other transcription factors.1-2 To solve this problem, we designed a module of our software that can analyze and evaluate promoters.
Our software uses the PWM method to calculate the similarity between promoter sequences and the position frequency matrices of transcription factor binding sites (TFBS), in order to locate the TFBS and to predict the relative strength of the promoter. Promoter-Decoder stands out from its counterparts by offering all-round analysis together with the prediction of promoter strength. It enables users to identify promoter types, predict promoter strength, change it by mutating the key sites, and even change the properties of a promoter by adding new TFBS to the promoter sequence.
Background
Sigma Factors
Bacteria encode several thousands of different proteins, which are necessary for normal cell functions or for adaptation to environmental changes.3 These proteins are not required at the same time or in the same amount. Regulation of gene expression therefore enables the cell to control the production of proteins needed for its life cycle or for adaptation to extracellular changes. The various steps during transcription and translation are therefore subject to different regulatory mechanisms.4
The most prominent step in gene regulation is the initiation of transcription in which the DNA-dependent RNA polymerase (RNAP) is the key enzyme. The RNAP or the RNAP core enzyme is the catalytic machinery for the synthesis of RNA from a DNA template. However, RNAP cannot initiate transcription by itself. Initiation of transcription requires an additional polypeptide known as a sigma-factor.5 Sigma-factors are a family of relatively small proteins that can associate in a reversible way with the RNAP core enzyme. Together, the sigma-factor and the RNAP core enzyme form an initiation-specific enzyme, the RNAP holoenzyme.
The sigma-factor directs RNA polymerase to a specific class of promoter sequences. Most bacterial species synthesize several different sigma-factors that recognize different consensus sequences.6
This variety in sigma-factors gives bacteria the opportunity to maintain basal gene expression as well as to regulate gene expression in response to altered environmental or developmental signals.
The frequency at which the RNAP holoenzyme initiates transcription, also known as the strength of a promoter, is influenced by the promoter sequence and the conformation of the DNA in the promoter region. The sigma-factors recognize two conserved sequences in the promoter region, known as the promoter consensus sequences. Sigma-factors, or fragments of sigma-factors, bind specifically to promoter DNA sequences, and this binding can be altered by specific base pair substitutions in the promoter consensus sequences or amino acid substitutions in the sigma factors.
The identification of bacterial promoters is an essential step in the elucidation of gene regulation.7
As a general rule, the more complex the life-cycle and environmental niche of a bacterium, the greater the number of sigma factors with corresponding promoter types. Typically, however, the most common promoter type is the one that regulates the housekeeping genes, and the corresponding major sigma-factor is shared by all bacteria (sigma 70 in the well-studied E. coli, and its homologues in other species). The binding site for the sigma 70 family of promoters is defined by two consensus hexamers, TTGACA and TATAAT, located at approximately −35 and −10, respectively, relative to the transcription start site (TSS) and spaced 15–21 base pairs (bp) apart.2 The RNA polymerase core enzyme associates with the major sigma-factor to form the holoenzyme, which in turn binds to its cognate promoters to initiate transcription.
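The hexamer-and-spacer description above can be turned into a simple scan. The following is only a minimal sketch, not the Promoter-Decoder implementation: the function names, the mismatch budget and the reading of "spaced 15–21 bp apart" as the gap between the end of the −35 hexamer and the start of the −10 hexamer are all assumptions made for illustration.

```python
CONSENSUS_35 = "TTGACA"  # -35 consensus hexamer quoted in the text
CONSENSUS_10 = "TATAAT"  # -10 consensus hexamer quoted in the text


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sigma70_candidates(seq, max_mismatches=2, spacer_range=(15, 21)):
    """Return (pos35, pos10, spacer, total_mismatches) for candidate sites.

    Assumption: the spacer is the number of bases between the two hexamers.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        mm35 = mismatches(seq[i:i + 6], CONSENSUS_35)
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + 6 + spacer
            if j + 6 > len(seq):
                break  # the -10 hexamer would run past the end of the sequence
            total = mm35 + mismatches(seq[j:j + 6], CONSENSUS_10)
            if total <= max_mismatches:
                hits.append((i, j, spacer, total))
    return hits


# Toy sequence: perfect hexamers separated by a 17 bp spacer.
toy = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GG"
print(find_sigma70_candidates(toy, max_mismatches=0))
```

A mismatch count is a crude measure. It treats every position of the hexamer as equally important, which is the limitation that weight-matrix methods address.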
In prokaryotes, the minimum requirement for RNA polymerase binding is recognition of the promoter by the sigma factor. In general, prokaryotic RNA polymerases can interchange a number of sigma factors, which bind to and initiate transcription of different groups of genes.3
Transcription Factors
Sigma factors are essential for the transcription initiation in E. coli.10
In addition, promoter strength is not determined purely by the binding of the sigma factor. Other transcription factors can bind specific sequences surrounding or overlapping the promoter to either activate or repress transcription.4 Mechanistically, transcriptional activators increase, and repressors decrease, the accessibility of the DNA to the RNA polymerase.12
These transcription-regulating nuclear proteins bind to specific binding sites in the regulatory regions (e.g. promoters, enhancers) of the genes thus providing their activation or repression.
Computational methods of predicting TF binding sites in DNA are very important for understanding the molecular mechanisms of gene regulation.
The binding sites of the same transcription factor show significant sequence conservation, which is often summarized as a short (5–20 bases long) common pattern called a transcription factor binding site (TFBS) or binding consensus. Our software aims to identify the possible TFBS in promoters and to locate them precisely, so that the user knows the exact sites that play a role in regulating transcription.
In prokaryotes (lower organisms without nuclei), there are fewer TFs, their motifs tend to be relatively long, and the strength of regulation for a particular gene often depends on how closely a particular site matches the consensus for the motif. The more mismatches to the consensus in a binding site, the less often the TF will bind and therefore the less control it will exert on the target gene. Our software therefore calculates the similarity between each possible TFBS in the promoter and the standard motifs, so that the user knows to what extent the transcription factor will control transcription from the promoter.
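As a rough illustration of "how closely a site matches the consensus", the sketch below derives a consensus from a few aligned sites and slides it along a promoter. It is a simplified stand-in, not the scoring used by the software, and the helper names and toy sequences are hypothetical.

```python
from collections import Counter


def consensus(sites):
    """Most frequent base at each column of equal-length aligned sites."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*sites))


def similarity(site, cons):
    """Fraction of positions at which a site agrees with the consensus."""
    if len(site) != len(cons):
        raise ValueError("site and consensus must have the same length")
    return sum(a == b for a, b in zip(site, cons)) / len(cons)


def best_match(promoter, cons):
    """Return (position, similarity) of the window closest to the consensus."""
    width = len(cons)
    windows = range(len(promoter) - width + 1)
    return max(((i, similarity(promoter[i:i + width], cons)) for i in windows),
               key=lambda hit: hit[1])


# Toy aligned sites (made up for illustration, not taken from RegulonDB).
sites = ["TTGACA", "TTGACT", "TTTACA", "ATGACA"]
cons = consensus(sites)
print(cons, best_match("CCCTTGTCACCC", cons))
```

Here a site with one mismatch out of six scores 5/6. A lower score corresponds to the weaker binding described above.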
Primer Design
To facilitate the design of PCR primers for various promoters, we have developed an additional function in this part of our program, namely primer design. After the promoter sequence is entered, the software identifies the most suitable primers based on the theory of Thomas Kämpke, Markus Kieninger, and Michael Mecklenburg.13
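The page does not spell out the primer scoring itself, so the sketch below does not reproduce the method of Kämpke et al. It only shows the kind of basic quantities a primer-design routine typically inspects: GC content, a rough melting temperature from the Wallace rule (2 °C per A/T, 4 °C per G/C, a rule of thumb for short oligonucleotides), and a naive primer pair taken from the two ends of the template. All function names and the default primer length are assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C using the Wallace rule."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def naive_primer_pair(template, length=18):
    """Forward primer from the 5' end, reverse primer from the 3' end."""
    if len(template) < 2 * length:
        raise ValueError("template too short for two primers of this length")
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


fwd, rev = naive_primer_pair("ATGCAAAAGGGA", length=4)
print(fwd, rev, wallace_tm(fwd), gc_content(fwd))
```

A real design step would compare many candidate pairs on criteria such as matched melting temperatures and self-complementarity. This sketch only computes the ingredients.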
Data Source
RegulonDB
Genes and operons that are under control of the same TF are members of that TF's regulon. Although methods for the prediction of regulons have been substantially improved, they are still far from perfect.
Comparative genomics tools can be used to predict regulons in bacterial genomes but the procedure can lead to incorrect regulon calling. Despite this drawback, several regulon databases are available that are based on comparative genomics methods and lack experimental evidence.
Probably the most extensive and accurate database of regulons for E. coli is RegulonDB, which provides the data source for our program.
Algorithm
Experimental results show that promoters matching the consensus are the strongest that have been characterized in vitro so far, confirming the hypothesis that the consensus promoter sequence is "best". To calculate the similarity between a promoter sequence and this best sequence, we implement the PWM method6 in our program.
PWM (Position Weight Matrix)
Molecular techniques for the identification of promoters are both costly and time consuming, hence in silico methods are an attractive and well explored alternative. The most common in silico method to identify sigma 70 promoters uses position weight matrices (PWMs) and depends on the relative conservation of the transcription factor binding site (TFBS, or motif).
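For readers unfamiliar with PWMs, a generic log-odds version can be sketched as follows. This is the textbook construction rather than the team's exact weighting; the pseudocount, the uniform background of 0.25 and the toy −10 sites are assumptions.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds weight for each base at each column of aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, window):
    """Sum of the per-position weights for one window of the sequence."""
    return sum(col[base] for col, base in zip(pwm, window))


def scan(pwm, seq):
    """Score every window of the sequence; returns (position, score) pairs."""
    width = len(pwm)
    return [(i, score(pwm, seq[i:i + width])) for i in range(len(seq) - width + 1)]


# Toy -10-like sites, made up for illustration.
pwm = build_pwm(["TATAAT", "TATAAT", "TATGAT", "TACAAT"])
best = max(scan(pwm, "CCTATAATCC"), key=lambda hit: hit[1])
print(best)
```

Positive weights mark bases that are enriched at a position relative to background, so the highest-scoring window is the one most similar to the collection of known sites.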
The algorithm can be divided into two parts, reflecting the difference between the motifs of sigma factors and those of other transcription factors.
Part 1: the recognition of other transcription factors.7
Other transcription factors are proteins that can bind to specific DNA sequences (motifs) and regulate transcription from the promoter. To recognize these possible motifs in a given promoter sequence, we calculate the Matrix Similarity Score (MSS) of every possible site in the promoter sequence using the position frequency matrices of 86 transcription factors published by RegulonDB. The algorithm reports only those matches of a matrix whose MSS is higher than the chosen threshold. The MSS for a subsequence x of length L is calculated from the following quantities:
$f_{i,B}$: frequency of nucleotide $B$ at position $i$ of the matrix ($B \in \{A, T, G, C\}$)
$f_i^{\min}$: frequency of the nucleotide that is rarest at position $i$ of the matrix
$f_i^{\max}$: highest frequency at position $i$
The information vector describes the conservation of each position $i$ in a matrix. Multiplication of the frequencies with the information vector leads to a higher acceptance of mismatches in less conserved regions, whereas mismatches in highly conserved regions are very much discouraged. This leads to a better performance in recognition of TF binding sites compared with methods that do not use the information vector.
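The equations themselves do not appear in the text here. The quantities defined above match the Matrix Similarity Score of the MATCH algorithm, so the sketch below assumes that form: the information vector is $I(i) = \sum_B f_{i,B} \ln(4 f_{i,B})$, and $\mathrm{MSS} = (\mathrm{Current} - \mathrm{Min}) / (\mathrm{Max} - \mathrm{Min})$ with $\mathrm{Current} = \sum_i I(i) f_{i,b_i}$, $\mathrm{Min} = \sum_i I(i) f_i^{\min}$ and $\mathrm{Max} = \sum_i I(i) f_i^{\max}$, where $b_i$ is the base of the subsequence at position $i$. Treating the matrix entries as relative frequencies, and the function names and toy matrix, are further assumptions.

```python
import math


def column_frequencies(column):
    """Relative frequency of each base in one column of a count matrix."""
    total = sum(column.values())
    return {base: column.get(base, 0) / total for base in "ACGT"}


def information_vector(pfm):
    """I(i) = sum_B f(i,B) * ln(4 * f(i,B)), taking 0 * ln(0) as 0."""
    info = []
    for column in pfm:
        freqs = column_frequencies(column)
        info.append(sum(f * math.log(4 * f) for f in freqs.values() if f > 0))
    return info


def mss(pfm, window):
    """Matrix Similarity Score of one window, between 0 and 1."""
    if len(window) != len(pfm):
        raise ValueError("window length must equal matrix length")
    current = minimum = maximum = 0.0
    for column, weight, base in zip(pfm, information_vector(pfm), window.upper()):
        freqs = column_frequencies(column)
        current += weight * freqs[base]
        minimum += weight * min(freqs.values())
        maximum += weight * max(freqs.values())
    if maximum == minimum:
        raise ValueError("matrix carries no information")
    return (current - minimum) / (maximum - minimum)


# Toy 3-column count matrix: two fully conserved positions, one weak one.
pfm = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 4, "C": 2, "G": 1, "T": 1},
]
print(mss(pfm, "ATA"), mss(pfm, "ATG"), mss(pfm, "CTA"))
```

In the toy matrix, a mismatch at the weakly conserved third position (ATG) lowers the score far less than a mismatch at the fully conserved first position (CTA), which is the behaviour the paragraph above attributes to the information vector.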
To determine the best threshold for the motif-finding algorithm, we test various threshold values and analyze the false negative and false positive rates of each. The ideal threshold is the one with the lowest false negative and false positive rates.
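A threshold sweep of this kind could look like the sketch below. The labelled scores are invented toy data, and combining the two error rates by simple addition is an assumption; only the candidate threshold values are taken from the table that follows.

```python
def error_rates(scored_sites, threshold):
    """Return (false_negative_rate, false_positive_rate) at a threshold.

    scored_sites is a list of (score, is_true_site) pairs. A site is
    reported when its score is strictly higher than the threshold.
    """
    positives = sum(1 for _, is_true in scored_sites if is_true)
    negatives = len(scored_sites) - positives
    fn = sum(1 for s, is_true in scored_sites if is_true and s <= threshold)
    fp = sum(1 for s, is_true in scored_sites if not is_true and s > threshold)
    fn_rate = fn / positives if positives else 0.0
    fp_rate = fp / negatives if negatives else 0.0
    return fn_rate, fp_rate


def best_threshold(scored_sites, candidates):
    """Candidate with the smallest sum of the two error rates (assumed criterion)."""
    return min(candidates, key=lambda t: sum(error_rates(scored_sites, t)))


# Invented toy data: (MSS, whether the site is a known binding site).
toy_scores = [(0.95, True), (0.90, True), (0.72, True),
              (0.695, False), (0.65, False), (0.50, False)]
candidates = [0.5977, 0.598, 0.69, 0.7]
for t in candidates:
    print(t, error_rates(toy_scores, t))
print("chosen:", best_threshold(toy_scores, candidates))
```

Other ways of weighing the two rates, for example tolerating more false positives to avoid missing real sites, would change which threshold is chosen.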
Threshold | 0.5977 | 0.598 | 0.69 | 0.7 |