From 2013.igem.org

Revision as of 09:05, 27 October 2013 by BillXue (Talk | contribs)


Transpeeder

Control the translation speed


Background

Codons are the basic units through which genetic information is translated into protein. Each amino acid is encoded by at least one and at most six codons, and codons that encode the same amino acid are called synonymous codons. Synonymous codons are not used equally often across organisms or genes; instead, one or a few specific codons tend to be used more often, a phenomenon termed codon usage bias (separate codon usages). In 1991, Danchin and his colleagues first reported separate codon usages in E. coli. With the development of genome sequencing, more and more researchers have become interested in codon usage bias and have found that it differs between species.
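The "at least 1 and at most 6 codons" statement can be checked directly against the standard genetic code. The following is a minimal Python sketch, not part of Transpeeder itself; the names `CODON_TABLE` and `synonymous_codons` are our own, and the table is the standard code (NCBI translation table 1) written out in TCAG order.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 64 codons to its one-letter amino acid code
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons():
    """Group the sense codons by the amino acid they encode."""
    groups = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # skip the three stop codons
            groups[aa].append(codon)
    return dict(groups)


groups = synonymous_codons()
sizes = {aa: len(codons) for aa, codons in groups.items()}
print(min(sizes.values()), max(sizes.values()))  # prints: 1 6
```

Methionine and tryptophan each have a single codon, while leucine, serine and arginine have six, which is where the room for synonymous recoding comes from.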

Studies have shown that codon usage bias not only affects protein expression levels but also plays a role in the translational regulation of genes. The rate of protein synthesis depends on the translation initiation rate and the peptide chain elongation rate. The translation initiation rate is in turn determined by the rate at which ribosomes bind mRNA, so the concentrations of ribosomes and mRNA are among the main factors influencing protein synthesis. Codon usage has also been reported to affect the efficiency of transcription and thereby the concentration of mRNA. Furthermore, a correlation analysis of codon usage bias, mRNA concentration and protein length in yeast showed that codon usage bias may help to increase the efficiency of transcription and translation (for example by improving accuracy) and to reduce the resources consumed during translation. In 2009, Science reported the ribosome profiling technology ("Genome-Wide Analysis in Vivo of Translation with Nucleotide Resolution Using Ribosome Profiling"), and in 2012, Nature published an analysis of codon choice based on ribosome profiling ("The anti-Shine-Dalgarno sequence drives translational pausing and codon choice in bacteria"). This work indicates that codon choice within gene sequences tends to avoid similarity to ribosome binding site (RBS) sequences, which improves translation speed.

Learning from the above two articles, we developed Transpeeder, a tool for regulating gene translation that is based on the Shine-Dalgarno (SD) sequence of a species. The tool provides a reference for researchers designing synthetic genes. iGEMers can also use Transpeeder to mutate part sequences in order to regulate translation speed and meet experimental requirements when designing or using parts in iGEM.

We have developed two versions of Transpeeder, an online version and a desktop version, which share the same functions. Users can submit an amino acid sequence or a nucleic acid sequence, upload a local file, or choose an iGEM part ID. Transpeeder then outputs a slow-speed sequence and a fast-speed sequence.

Modeling

Based on the background described above, we propose the following algorithm to implement the tool.

Users can submit an amino acid or a nucleic acid sequence. If the sequence conforms to the input rules, ClustalW is run. Transpeeder then chooses the sequence with the highest similarity to the SD sequences as the slow-speed sequence and the one with the lowest similarity as the fast-speed sequence.

The algorithm can be summarized as follows:

A. Determine the input sequence. Users select the input sequence type: amino acid or nucleotide.

a. For an amino acid sequence, check that it contains only valid amino acid characters and report the result to the user.

b. For a nucleotide sequence, check that it contains only valid nucleotide characters, then identify whether it contains a gene sequence (ORF identification).

B. Process the selected SD sequence. Taking E. coli as an example, SD = AGGAGGT. Reverse the SD sequence to form a new sequence SD’, SD’ = AGGTAGG.

C. Split the gene sequence. Because SD sequences are 5-7 nucleotides long and the ribosome accommodates about 2 codons, we split the gene sequence into segments of 6 nucleotides (codon pairs) with a step length of 6. Each segment (codon pair) is denoted SS.

D. Sequence alignment. Call the ClustalW2 program to align SS with SD and SD’.

E. Choose the corresponding alignment results according to the user’s demands.
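Steps A to E can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the Transpeeder implementation: the ClustalW2 alignment of step D is replaced by a simple ungapped match count (`similarity`), ORF identification is reduced to a length check, and the candidate sequences for each codon pair are assumed to be its synonymous recodings. All function names are hypothetical; SD and SD’ are taken verbatim from step B.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order ("*" = stop)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

SD = "AGGAGGT"        # E. coli SD sequence, as given in step B
SD_PRIME = "AGGTAGG"  # SD' copied from step B, not recomputed here


def validate_nucleotides(seq):
    """Step A(b), simplified: check the alphabet and that codons are whole."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence contains non-nucleotide characters")
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return seq


def split_codon_pairs(seq, size=6, step=6):
    """Step C: 6-nt segments with step 6 (a trailing single codon is kept)."""
    return [seq[i:i + size] for i in range(0, len(seq), step)]


def similarity(segment, motif):
    """Stand-in for step D: best number of matching positions over all
    ungapped offsets of `segment` against `motif` (assumption, not ClustalW2)."""
    best = 0
    for offset in range(-(len(segment) - 1), len(motif)):
        matches = sum(
            1 for i, base in enumerate(segment)
            if 0 <= i + offset < len(motif) and motif[i + offset] == base
        )
        best = max(best, matches)
    return best


def sd_score(segment):
    """Similarity of a segment (SS) to SD or SD', whichever is higher."""
    return max(similarity(segment, SD), similarity(segment, SD_PRIME))


def synonymous_variants(segment):
    """All nucleotide segments encoding the same amino acids as `segment`."""
    options = []
    for i in range(0, len(segment), 3):
        aa = CODON_TABLE[segment[i:i + 3]]
        options.append([c for c, a in CODON_TABLE.items() if a == aa])
    return ["".join(p) for p in product(*options)]


def recode(seq):
    """Step E: return (fast, slow) recodings of a coding sequence."""
    seq = validate_nucleotides(seq)
    fast, slow = [], []
    for segment in split_codon_pairs(seq):
        variants = synonymous_variants(segment)
        fast.append(min(variants, key=sd_score))  # least SD-like
        slow.append(max(variants, key=sd_score))  # most SD-like
    return "".join(fast), "".join(slow)


fast, slow = recode("ATGCGTCGTAAAGGAGGTCTG")
print(fast)
print(slow)
```

Both outputs encode the same protein as the input; they differ only in how closely each codon pair resembles the SD sequences. Ties between equally scored variants are broken by table order here, which a real tool would likely replace with a codon usage criterion.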

Experiments

Using the preliminary algorithm, we mutated GFP_M62653 into a fast-speed and a slow-speed sequence and calculated the codon adaptation index (CAI) of each: GFP_FAST(M62653) CAI: 0.695, GFP_LOW(M62653) CAI: 0.611. This is in accordance with previous work. Team UESTC_life helped us with experimental verification, and the results showed that the sequences mutated by our algorithm are consistent with the experimental results.

Fig. 3 Experimental results
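For readers unfamiliar with the CAI quoted above, the sketch below shows how it is commonly computed, following the definition by Sharp and Li (1987): the geometric mean of the relative adaptiveness w of each codon, where w is a codon's frequency divided by that of the most frequent synonymous codon in a reference gene set. The page does not state which CAI implementation or reference set was used, so the function names and the toy counts below are purely hypothetical and will not reproduce the values 0.695 and 0.611.

```python
import math


def relative_adaptiveness(counts_by_amino_acid):
    """w = count of a codon / count of the most used synonymous codon.
    Zero counts would need a pseudocount before taking logarithms."""
    weights = {}
    for codon_counts in counts_by_amino_acid.values():
        top = max(codon_counts.values())
        for codon, n in codon_counts.items():
            weights[codon] = n / top
    return weights


def cai(seq, weights):
    """Geometric mean of w over the codons of `seq` found in `weights`."""
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Hypothetical reference counts for two amino acids (illustration only)
toy_counts = {
    "K": {"AAA": 30, "AAG": 10},
    "E": {"GAA": 40, "GAG": 20},
}
weights = relative_adaptiveness(toy_counts)
print(cai("AAAGAA", weights))  # both codons are the preferred ones
print(cai("AAGGAG", weights))  # both codons are the rarer ones
```

A sequence built only from the preferred codons scores 1.0, and the score drops as rarer synonymous codons are substituted.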

Future work

Transpeeder currently supports only one host, E. coli. In the next edition we will add more species. It will also support more file formats and add a sharing function, so that you can share your results via Twitter, Weibo or Facebook.

Team:UESTC/transpeeder