Team:Shenzhen BGIC 0101/Tutorial/neochr
From 2013.igem.org
Latest revision as of 05:31, 28 October 2013
Tutorial
NeoChr
The NeoChr module helps users manually select related genes from different pathways, logically rewire the relationships between genes*, and replace genes with higher-scoring orthologs*. First, it lets users define gene order and orientation by drag and drop. Second, it decouples any genes that overlap, so that all genes are non-redundant. Finally, it adds chromosome features to build a new chromosome and displays it in JBrowse. Users can also drag a window in JBrowse and delete any gene within that window.
Note:
*These functions are not available yet; they are planned for version 2.
**You can also add anything here, including your own watermark.
Plugin Scripts
This module contains three plugins: Decouple.pl, Add.pl and Delete.pl.
1.1 Decouple.pl
This plugin decouples genes whose regions overlap. Overlapping genes can be decoupled if they meet the following conditions: (1) the 5’-UTR of the latter gene does not cover the start codon (ATG) of the former gene; (2) the start coordinate of the overlapping region lies within the coding DNA sequence (CDS) of the gene to be decoupled; (3) a synonymous codon is available to substitute at the decoupling site in the CDS. After decoupling, the non-redundant genes are used to generate a GFF file and a FASTA file.
1.1.1 Internal operation
First, the plugin extracts the base sequences from the genome file according to the gene order list and records the order of the genes in that list. It then records the annotation information from the species GFF file. If the GFF file does not annotate the 5’-UTR and 3’-UTR, the plugin extends each gene CDS 600 bp upstream as the 5’-UTR and 100 bp downstream as the 3’-UTR.
Second, the plugin detects overlapping genes on the same chromosome. When overlapping genes are found, it checks whether the start site of the overlap lies within a CDS region and determines whether that site is in phase 0, 1 or 2.
Third, the plugin attempts a synonymous codon substitution to break the start codon that lies inside the CDS. It prints a message reporting whether each decoupling succeeded, and the non-redundant genes are generated.
Finally, the plugin links the non-redundant genes in the given gene order to construct a new chromosome.
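The third step can be illustrated with a minimal Python sketch. It is not the code of Decouple.pl; it shows one reading of the step, assuming the CDS is given in frame from its first base and that a single synonymous codon swap is enough. The function names `translate` and `break_internal_atg` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate an in-frame CDS, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def break_internal_atg(cds, pos):
    """Try to destroy the ATG at 0-based offset `pos` inside `cds`.

    Returns (new_cds, phase). new_cds is None when no single synonymous
    codon substitution removes the ATG, for example in phase 0, where the
    ATG is itself a methionine codon and has no synonym.
    """
    if cds[pos:pos + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    phase = pos % 3
    # The one or two host codons that overlap the three ATG bases.
    codon_starts = sorted({(pos // 3) * 3, ((pos + 2) // 3) * 3})
    for start in codon_starts:
        codon = cds[start:start + 3]
        if len(codon) < 3:
            continue  # ATG runs past the last complete codon
        for alt, amino in CODON_TABLE.items():
            if alt == codon or amino != CODON_TABLE[codon]:
                continue  # keep only true synonyms
            candidate = cds[:start] + alt + cds[start + 3:]
            if candidate[pos:pos + 3] != "ATG":
                assert translate(candidate) == translate(cds)
                return candidate, phase
    return None, phase
```

For instance, in `ATGGATGCATAA` the internal ATG at offset 4 is in phase 1 and can be removed by swapping `GAT` for its synonym `GAC`, while an in-frame ATG (phase 0) cannot be removed this way.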
1.1.2 Example
The plugin accepts the gene order list in two forms:
1. As a string:
perl GeneDecouple.pl --species saccharomyces_cerevisiae_chr --list_format string --gene_order="YAL054C -,YAL038W +,YBR019C -,YBR145W +,YCL040W +,YCR012W +,YCR105W +,YDL168W +,YPL017C -,YIL177C -,YIL177W-A +,YIL172C -,YIL171W-A +," --geneset_dir ../gene_set --upstream_extend 600 --downstream_extend 100 --neo_chr_gff neochr.gff --neo_chr_fa neochr.fa
2. As a file:
perl GeneDecouple.pl --species saccharomyces_cerevisiae_chr --list_format file --gene_order gene_ordre.list --geneset_dir ../gene_set --upstream_extend 600 --downstream_extend 100 --neo_chr_gff neochr.gff --neo_chr_fa neochr.fa
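As an illustration of the string form, the following Python sketch splits a gene order string into (gene, strand) pairs. It assumes only what the example command shows: entries separated by commas, each consisting of a gene name, a space and a `+` or `-` strand, with a trailing comma allowed. The function name `parse_gene_order` is hypothetical and is not part of the plugin.

```python
def parse_gene_order(text):
    """Split 'GENE STRAND,GENE STRAND,...' into a list of (gene, strand)."""
    order = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate the trailing comma seen in the example
        gene, strand = item.split()
        if strand not in ("+", "-"):
            raise ValueError(f"bad strand for {gene}: {strand!r}")
        order.append((gene, strand))
    return order


print(parse_gene_order("YAL054C -,YAL038W +,YBR019C -,"))
# [('YAL054C', '-'), ('YAL038W', '+'), ('YBR019C', '-')]
```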
1.1.3 Parameters
Parameter | Description | Default | Selectable range |
---|---|---|---|
list_format | Set the input form of the gene order list | string | string/file |
gene_order | Set the input gene order list (including pathway genes and additional genes) | | |
geneset_dir | Set the species annotation directory | | |
upstream_extend | Set the length of the upstream extension of each gene (bp) | 600 | |
downstream_extend | Set the length of the downstream extension of each gene (bp) | 100 | |
neo_chr_gff | Set the name of the output neochr GFF file | | |
neo_chr_fa | Set the name of the output neochr FASTA file | | |
help | Show help information | | |
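The effect of upstream_extend and downstream_extend can be sketched as follows. This is an assumption-laden illustration, not the plugin's code: it assumes 1-based inclusive coordinates as in GFF, assumes that "upstream" follows the gene's strand, and clamps the region to the chromosome ends. The function name `extend_gene` is hypothetical.

```python
def extend_gene(cds_start, cds_end, strand, chrom_len,
                upstream=600, downstream=100):
    """Return (start, end) of the CDS plus assumed 5'-UTR and 3'-UTR.

    Coordinates are 1-based and inclusive. On the minus strand the
    upstream side is the higher coordinate, so the two extensions swap.
    """
    if strand == "+":
        start, end = cds_start - upstream, cds_end + downstream
    elif strand == "-":
        start, end = cds_start - downstream, cds_end + upstream
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    # Clamping at the chromosome ends is an assumption of this sketch.
    return max(1, start), min(chrom_len, end)


print(extend_gene(1000, 2000, "+", 10000))  # (400, 2100)
print(extend_gene(1000, 2000, "-", 10000))  # (900, 2600)
```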
1.1.4 The format of output file
The output files contain the decoupled genes in standard GFF and FASTA formats.
1. Decoupled GFF file
2. Decoupled FASTA file
1.2 Add.pl
This plugin adds the loxPsym sequence and the customized left and right telomeres, centromere and autonomously replicating sequence (ARS) to the FASTA and GFF files generated by Decouple.pl.
1.2.1 Internal operation
The plugin inserts loxPsym after the first 3 bp of the 3’-UTR of each gene and adds the telomeres, centromere and ARS according to this layout:
left_telomere + gene1 + centromere + gene2 + ARS + gene3 + right_telomere
The distance between the centromere and the ARS is less than 30 kb.
Finally, the user can view the chromosome with the newly added features in JBrowse.
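The layout above can be sketched in Python. This is only an illustration under stated assumptions: each gene is a string holding 5’-UTR + CDS + 3’-UTR read 5’ to 3’, the 3’-UTR length is known, and the 30 kb rule is interpreted as a limit on the length of the gene placed between the centromere and the ARS. The function names `add_loxpsym` and `assemble_chromosome` are hypothetical.

```python
LOXPSYM = "ATAACTTCGTATAATGTACATTATACGAAGTTAT"


def add_loxpsym(gene_seq, utr3_len, loxp=LOXPSYM):
    """Insert `loxp` after the first 3 bp of the 3'-UTR of one gene."""
    if utr3_len < 3 or utr3_len > len(gene_seq):
        raise ValueError("3'-UTR length must be between 3 and the gene length")
    cut = len(gene_seq) - utr3_len + 3
    return gene_seq[:cut] + loxp + gene_seq[cut:]


def assemble_chromosome(genes, left_telomere, right_telomere,
                        centromere, ars, max_cen_ars=30000):
    """Join the parts as
    left_telomere + gene1 + centromere + gene2 + ARS + gene3... + right_telomere.
    """
    if len(genes) < 2:
        raise ValueError("need at least two genes to place centromere and ARS")
    # Assumed reading of the 30 kb rule: gene2 separates centromere and ARS.
    if len(genes[1]) >= max_cen_ars:
        raise ValueError("centromere and ARS would be 30 kb or more apart")
    parts = [left_telomere, genes[0], centromere, genes[1], ars,
             *genes[2:], right_telomere]
    return "".join(parts)
```

In use, `add_loxpsym` would be applied to every gene first, and the resulting sequences passed to `assemble_chromosome` in the chosen gene order.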
1.2.2 Example
perl 04.Add.pl --loxp loxPsym.feat --left_telomere UTC_left.feat --right_telomere UTC_right.feat --ars chromosome_I_ARS108.feature --centromere chromosome_I_centromere.feat --chr_gff neochr.gff --chr_seq neochr.fa --neochr_seq neochr.final.fa --neochr_gff neochr.final.gff
All feature files use a four-line format, for example:
name = site_specific_recombination_target_region
type = loxPsym
source = BIO
sequence = ATAACTTCGTATAATGTACATTATACGAAGTTAT
Note: the first line is the detailed name of the feature, the second line is its type, the third line is its source and the last line is its sequence.
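A four-line feature file of this kind could be read with a few lines of Python. The sketch below assumes one `key = value` pair per line, exactly as in the example above. The function name `parse_feature` is hypothetical and is not part of Add.pl.

```python
REQUIRED_KEYS = {"name", "type", "source", "sequence"}


def parse_feature(text):
    """Parse the four 'key = value' lines of a feature file into a dict."""
    feature = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'key = value', got: {line!r}")
        feature[key.strip()] = value.strip()
    missing = REQUIRED_KEYS - feature.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return feature


example = """name = site_specific_recombination_target_region
type = loxPsym
source = BIO
sequence = ATAACTTCGTATAATGTACATTATACGAAGTTAT
"""
print(parse_feature(example)["type"])  # loxPsym
```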
1.2.3 Parameters
Parameter | Description | Default | Selectable range |
---|---|---|---|
loxp | Set the sequence of loxP | ATAACTTCGTATAATGTATGCTATACGAAGTTAT | |
left_telomere | Set the sequence of the left telomere | | |
right_telomere | Set the sequence of the right telomere | | |
chr_gff | Set the input neochr GFF file | | |
chr_seq | Set the input neochr FASTA file | | |
neochr_seq | Set the name of the output neochr FASTA file with loxP sites and telomeres added | | |
neochr_gff | Set the name of the output neochr GFF file with loxP sites and telomeres added | | |
1.2.4 The format of output
The output files describe the chromosome with the added features in standard GFF and FASTA formats.
1. GFF file with added features
1.3 Delete.pl
This plugin modifies the GFF and FASTA files generated by Add.pl: the user drags a window in JBrowse and deletes any gene within that window.
1.3.1 Internal operation
First, the user drags a window with the mouse over the feature-added chromosome shown in JBrowse, and JBrowse displays all the genes in this window. Second, the user decides which genes should be deleted from the new chromosome, and the plugin deletes those genes from the GFF file and modifies the FASTA file at the same time.
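Keeping the GFF and FASTA files consistent during deletion can be sketched as below. This is a simplified illustration, not the code of Delete.pl: it assumes each gene is a single, non-overlapping interval with 1-based inclusive coordinates, and it ignores any sub-features attached to a gene. The function name `delete_genes` is hypothetical.

```python
def delete_genes(seq, features, to_delete):
    """Remove the listed genes from the sequence and shift the rest.

    seq       -- chromosome sequence as a string
    features  -- list of (gene_id, start, end), 1-based inclusive,
                 assumed not to overlap
    to_delete -- set of gene IDs to remove
    Returns (new_seq, kept_features) with updated coordinates.
    """
    kept, pieces = [], []
    shift = 0      # number of bases removed so far
    prev_end = 0   # end of the previous feature (0-based, exclusive)
    for gene_id, start, end in sorted(features, key=lambda f: f[1]):
        pieces.append(seq[prev_end:start - 1])  # sequence between features
        if gene_id in to_delete:
            shift += end - start + 1
        else:
            pieces.append(seq[start - 1:end])
            kept.append((gene_id, start - shift, end - shift))
        prev_end = end
    pieces.append(seq[prev_end:])
    return "".join(pieces), kept


new_seq, kept = delete_genes("TTAAAACCGGGGTT",
                             [("A", 3, 6), ("G", 9, 12)], {"A"})
print(new_seq, kept)  # TTCCGGGGTT [('G', 5, 8)]
```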
1.3.2 Example
perl 05.delete.pl --delete="YAL054C,YAL038W" --neochr_gff neochr.refine.final.gff --neochr_fa neochr.refine.final.fa --slim_gff neochr.refine.delete.gff --slim_fa neochr.refine.delete.fa
1.3.3 Parameters
Parameter | Description | Default | Selectable range |
---|---|---|---|
delete | Set the list of genes to be deleted | ||
neochr_gff | Set the input GFF file which is generated by Add.pl | ||
neochr_fa | Set the input FASTA file which is generated by Add.pl | ||
slim_gff | Set the output GFF file | ||
slim_fa | Set the output FASTA file |
1.3.4 The format of output
The output files describe the chromosome after gene deletion in standard GFF and FASTA formats.