Team:TU-Eindhoven/ProteinSelection
From 2013.igem.org
Pascalaldo (Talk | contribs) |
(→CEST Based Marker Proteins) |
||
(40 intermediate revisions not shown) | |||
Line 2: | Line 2: | ||
{{:Team:TU-Eindhoven/Template:MenuBar}} | {{:Team:TU-Eindhoven/Template:MenuBar}} | ||
- | + | =CEST Based Marker Proteins= | |
- | {{:Team:TU-Eindhoven/Template:Lead}}To create bacteria with the ability to generate contrast on a CEST MRI scan, polypeptides that have that ability had to be found. Various solutions based on short Lysine, | + | {{:Team:TU-Eindhoven/Template:Lead}}To create bacteria with the ability to generate contrast on a CEST MRI scan, polypeptides that have that ability had to be found. Various solutions based on short Lysine, Arginine, Threonine or Serine rich sequences were proposed.{{:Team:TU-Eindhoven/Template:Ref | id=McMahonDIACEST | author=M.T. McMahon | title=New "Multicolor" Polypeptide Diamagnetic Chemical Exchange Saturation Transfer (DIACEST) Contrast Agents for MRI | journal=Magnetic Resonance in Medicine | edition=60 | pages=803-812 | year=2008 }} However, it is hard to predict how well these sequences will express in bacteria and whether they are sufficiently stable in vivo. To avoid these problems a new approach was taken. The suitability as a CEST based marker was estimated for proteins of which the structure is already clarified.{{:Team:TU-Eindhoven/Template:LeadEnd}} |
- | + | ==Scanning for Candidates== | |
- | To find suitable proteins the [pdb.org RCSB Protein Data Bank] was queried for all entries containing proteins. | + | For good CEST contrast a protein should have a high Lysine or Arginine content.{{:Team:TU-Eindhoven/Template:RefAgain | id=McMahonDIACEST }}{{:Team:TU-Eindhoven/Template:Ref | id=GiladArtificialGene | author=A.A. Gilad | title=Artificial reporter gene providing MRI contrast based on proton exchange. | journal=Nature biotechnology | edition=25.2 | pages=217-219 | year=2007 }} To find these suitable proteins the [http://www.pdb.org RCSB Protein Data Bank] was queried for all entries containing proteins. This was done using the SEARCH Web Service of PDB.org, for the XML query used see [[Team:TU-Eindhoven/Code:PDBQuery#query.xml | query.xml]]. A Python program was written to analyze the obtained amino acid sequences and calculate the ratio of Lysine or Arginine to the total chain length (see [[Team:TU-Eindhoven/Code:PDBQuery#query.xml | PDB.py]] and [[Team:TU-Eindhoven/Code:PDBQuery#query.xml | queryPDB.py]]). The results are visualized in {{:Team:TU-Eindhoven/Template:Figure | id=lysineRatioPlot }} and {{:Team:TU-Eindhoven/Template:Figure | id=arginineRatioPlot }}. |
- | {{:Team:TU-Eindhoven/Template: | + | {{:Team:TU-Eindhoven/Template:Float | position=left | size=6 }} |
- | + | {{:Team:TU-Eindhoven/Template:Image | filename=LysineRatio.png }} | |
- | + | {{:Team:TU-Eindhoven/Template:FloatEnd | caption=Plot of the Lysine ratio and count of a broad range of proteins from PDB.org. Red dots represent potentially interesting proteins. | id=lysineRatioPlot }}{{:Team:TU-Eindhoven/Template:Float | position=left | size=6 }} | |
- | + | {{:Team:TU-Eindhoven/Template:Image | filename=ArginineRatio.png }} | |
- | + | {{:Team:TU-Eindhoven/Template:FloatEnd | caption=Plot of the Arginine ratio and count of a broad range of proteins from PDB.org. Red dots represent potentially interesting proteins. | id=arginineRatioPlot }} | |
- | + | Using this data a selection of interesting proteins was made for further analysis. The protein list was also filtered by chain length and general practicality of the sequences. The selection is shown below: | |
- | + | {|class="table table-striped" | |
- | + | ! PDB ID | |
- | + | ! Frequent Amino Acid | |
- | + | ! Amount of Frequent Amino Acids (#) | |
- | + | ! Chain Length (#) | |
+ | ! Ratio (-) | ||
+ | |- | ||
+ | | 1ETF | ||
+ | | Arginine | ||
+ | | 11 | ||
+ | | 23 | ||
+ | | 0.48 | ||
+ | |- | ||
+ | | 1IWQ | ||
+ | | Lysine | ||
+ | | 7 | ||
+ | | 19 | ||
+ | | 0.37 | ||
+ | |- | ||
+ | | 1PJN | ||
+ | | Lysine | ||
+ | | 8 | ||
+ | | 21 | ||
+ | | 0.38 | ||
+ | |- | ||
+ | | 2IGR | ||
+ | | Lysine | ||
+ | | 15 | ||
+ | | 34 | ||
+ | | 0.44 | ||
+ | |- | ||
+ | <!--| 2KLW | ||
+ | | Lysine | ||
+ | | 10 | ||
+ | | 32 | ||
+ | | 0.31 | ||
+ | |- | ||
+ | | 2PCO | ||
+ | | Lysine | ||
+ | | 8 | ||
+ | | 26 | ||
+ | | 0.30 | ||
+ | |- --> | ||
+ | | 1G70 | ||
+ | | Arginine | ||
+ | | 10 | ||
+ | | 22 | ||
+ | | 0.46 | ||
+ | |- | ||
+ | | 1BY0 | ||
+ | | Lysine | ||
+ | | 8 | ||
+ | | 27 | ||
+ | | 0.30 | ||
+ | |- | ||
+ | | 1NWD | ||
+ | | Lysine | ||
+ | | 8 | ||
+ | | 28 | ||
+ | | 0.29 | ||
+ | |- | ||
+ | <!--| 1PEH | ||
+ | | Lysine | ||
+ | | 10 | ||
+ | | 35 | ||
+ | | 0.29 | ||
+ | |- --> | ||
+ | | 2L9A | ||
+ | | Lysine | ||
+ | | 8 | ||
+ | | 24 | ||
+ | | 0.33 | ||
+ | |- | ||
+ | <!--| 2L96 | ||
+ | | Lysine | ||
+ | | 8 | ||
+ | | 24 | ||
+ | | 0.33 | ||
+ | |- | ||
+ | | 2L99 | ||
+ | | Lysine | ||
+ | | 8 | ||
+ | | 24 | ||
+ | | 0.33 | ||
+ | |- | ||
+ | | 1LYP | ||
+ | | Lysine | ||
+ | | 9 | ||
+ | | 32 | ||
+ | | 0.28 | ||
+ | |- --> | ||
+ | | 1LQ7 | ||
+ | | Lysine | ||
+ | | 17 | ||
+ | | 67 | ||
+ | | 0.25 | ||
+ | |} | ||
+ | ==Molecular Dynamics== | ||
+ | To refine the selection the accessibility of the various exchangeable hydrogen atoms of the Lysines and Arginines was taken into account. Hereto Molecular Dynamics simulations of the proteins in water were carried out. Gromacs{{:Team:TU-Eindhoven/Template:Ref | id=Gromacs1 | author=B. Hess, C.Kutzner, D. van der Spoel and E. Lindahl | title=Gromacs 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation | journal=J. Chem. Theory Comput. | edition=4 | pages=435-447 | year=2008 }} was used to run the sequence of simulations. For every protein a ''in vacuo'' energy minimization was carried out to stabilize the protein. Subsequently an energy minimization in water with ions (0.15 M KCl) was executed to stabilize the whole system. Further stabilization was done by running a NVT and NPT simulation with position restraints followed by a NPT simulation without the restraints. Then the final run, of which the data was used for further analysis, was carried out. | ||
- | + | For the analysis of the data the ''g_dist'' utility provided with Gromacs was used in combination with self-made Matlab scripts. Using ''g_dist'' the number of water molecules within a 5 nm radius of certain characteristic groups was calculated for each protein. This value was divided by the runtime of the simulation, resulting in an indication of the accessibility of the characteristic group by water molecules. The results are shown in the table below, numbers should be interpreted as an index value. | |
+ | {|class="table table-striped" | ||
+ | ! PDB ID | ||
+ | ! Backbone Secondary Amine | ||
+ | ! Arginine Guanidine | ||
+ | ! Arginine Secondary Amine | ||
+ | ! Secondary Amine | ||
+ | ! Lysine Primary Amine | ||
+ | |- | ||
+ | | ''1ETF'' | ||
+ | | ''7.49'' | ||
+ | | ''7.23'' | ||
+ | | ''5.34'' | ||
+ | | ''5.81'' | ||
+ | | ''X'' | ||
+ | |- | ||
+ | | 1IWQ | ||
+ | | 44.4 | ||
+ | | 38.9 | ||
+ | | 32.2 | ||
+ | | 44.6 | ||
+ | | 42.1 | ||
+ | |- | ||
+ | | ''1PJN'' | ||
+ | | ''64.7'' | ||
+ | | ''34.7'' | ||
+ | | ''36.9'' | ||
+ | | ''68.1'' | ||
+ | | ''57.5'' | ||
+ | |- | ||
+ | | 2IGR | ||
+ | | 38.9 | ||
+ | | X | ||
+ | | X | ||
+ | | 38.9 | ||
+ | | 25.4 | ||
+ | |- | ||
+ | <!--| 2KLW | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |- | ||
+ | | 2PCO | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |- --> | ||
+ | | ''1G70'' | ||
+ | | ''23.8'' | ||
+ | | ''17.7'' | ||
+ | | ''17.4'' | ||
+ | | ''21.9'' | ||
+ | | ''X'' | ||
+ | |- | ||
+ | | 1BY0 | ||
+ | | 51.6 | ||
+ | | 61.5 | ||
+ | | 57.9 | ||
+ | | 53.8 | ||
+ | | 16.6 | ||
+ | |- | ||
+ | | 1NWD | ||
+ | | 20.9 | ||
+ | | X | ||
+ | | X | ||
+ | | 20.9 | ||
+ | | 27.3 | ||
+ | |- | ||
+ | <!--| 1PEH | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |- --> | ||
+ | | 2L9A | ||
+ | | 26.8 | ||
+ | | X | ||
+ | | X | ||
+ | | 26.8 | ||
+ | | 26.2 | ||
+ | |- | ||
+ | <!--| 2L96 | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |- | ||
+ | | 2L99 | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |- | ||
+ | | 1LYP | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | | | ||
+ | |- --> | ||
+ | | 1LQ7 | ||
+ | | 0.585 | ||
+ | | 5.86 | ||
+ | | 2.75 | ||
+ | | 0.581 | ||
+ | | 15.3 | ||
+ | |} | ||
+ | In this case a higher value is expected to give a better CEST contrast. Taking this into account a selection was made. For the contrast Guanidine (of Arginine) based contrast '''1G70''' and '''1ETF''' were selected. Based on the simulations it is expected that 1G70 will give better contrast. '''1PJN''' was also tested to see the result of a protein that is expected to give high contrast with saturation pulses at the chemical shifts of both characteristic groups (Guanidine of Arginine and the primary Amide of Lysine). | ||
==References== | ==References== | ||
Line 26: | Line 234: | ||
{{:Team:TU-Eindhoven/Template:BaseFooter}} | {{:Team:TU-Eindhoven/Template:BaseFooter}} | ||
+ | {{:Team:TU-Eindhoven/Template:Sponsors}} | ||
{{:Team:TU-Eindhoven/Template:SetTitle | menu=drylab | page=Protein Selection }} | {{:Team:TU-Eindhoven/Template:SetTitle | menu=drylab | page=Protein Selection }} | ||
{{:Team:TU-Eindhoven/Template:UseReferencing}} | {{:Team:TU-Eindhoven/Template:UseReferencing}} | ||
+ | {{:Team:TU-Eindhoven/Template:UseFigures}} | ||
+ | {{:Team:TU-Eindhoven/Template:SetHeader | nr=4}} |
Latest revision as of 23:36, 18 October 2013
CEST Based Marker Proteins
To create bacteria that can generate contrast on a CEST MRI scan, polypeptides with that ability had to be found. Various solutions based on short Lysine, Arginine, Threonine or Serine rich sequences have been proposed (M.T. McMahon, New "Multicolor" Polypeptide Diamagnetic Chemical Exchange Saturation Transfer (DIACEST) Contrast Agents for MRI. Magnetic Resonance in Medicine 60, 803-812 (2008)). However, it is hard to predict how well these sequences will express in bacteria and whether they are sufficiently stable in vivo. To avoid these problems a new approach was taken: the suitability as a CEST based marker was estimated for proteins whose structure has already been determined.
Scanning for Candidates
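For good CEST contrast a protein should have a high Lysine or Arginine content (McMahon, Magnetic Resonance in Medicine 60, 803-812 (2008); A.A. Gilad, Artificial reporter gene providing MRI contrast based on proton exchange. Nature biotechnology 25.2, 217-219 (2007)). To find suitable proteins, the RCSB Protein Data Bank (http://www.pdb.org) was queried for all entries containing proteins. This was done using the SEARCH Web Service of PDB.org; for the XML query used see query.xml. A Python program was written to analyze the obtained amino acid sequences and calculate the ratio of Lysine or Arginine to the total chain length (see PDB.py and queryPDB.py). The results are visualized in two plots, one of the Lysine ratio and count and one of the Arginine ratio and count, for a broad range of proteins from PDB.org; red dots in these plots represent potentially interesting proteins.
Using this data a selection of interesting proteins was made for further analysis. The protein list was also filtered by chain length and general practicality of the sequences. The selection is shown below: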
PDB ID | Frequent Amino Acid | Amount of Frequent Amino Acids (#) | Chain Length (#) | Ratio (-) |
---|---|---|---|---|
1ETF | Arginine | 11 | 23 | 0.48 |
1IWQ | Lysine | 7 | 19 | 0.37 |
1PJN | Lysine | 8 | 21 | 0.38 |
2IGR | Lysine | 15 | 34 | 0.44 |
1G70 | Arginine | 10 | 22 | 0.46 |
1BY0 | Lysine | 8 | 27 | 0.30 |
1NWD | Lysine | 8 | 28 | 0.29 |
2L9A | Lysine | 8 | 24 | 0.33 |
1LQ7 | Lysine | 17 | 67 | 0.25 |
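The ratio column above is simply the count of the frequent residue divided by the chain length. The following is a minimal sketch of that calculation, not the team's PDB.py; the function names and the example sequence are hypothetical and only illustrate the idea, using the one-letter codes K (Lysine) and R (Arginine).

```python
def residue_ratio(sequence, residue):
    """Return (count, chain_length, ratio) for a one-letter residue code."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    count = seq.count(residue.upper())
    return count, len(seq), count / len(seq)


def most_frequent_of(sequence, residues=("K", "R")):
    """Pick whichever of the given residues has the highest ratio in the chain."""
    return max(residues, key=lambda r: residue_ratio(sequence, r)[2])


# Hypothetical chain, made up for illustration (not taken from a PDB entry).
example = "KKAKGKKLKKAEKKRKSKKG"
frequent = most_frequent_of(example)
count, length, ratio = residue_ratio(example, frequent)
print(frequent, count, length, round(ratio, 2))
```

Filtering on chain length, as described in the text, could then be a simple comparison on the second value returned by `residue_ratio`.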
Molecular Dynamics
To refine the selection, the accessibility of the exchangeable hydrogen atoms of the Lysines and Arginines was taken into account. To this end, Molecular Dynamics simulations of the proteins in water were carried out. Gromacs (B. Hess, C. Kutzner, D. van der Spoel and E. Lindahl, Gromacs 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation. J. Chem. Theory Comput. 4, 435-447 (2008)) was used to run the sequence of simulations. For every protein an in vacuo energy minimization was carried out to stabilize the protein. Subsequently an energy minimization in water with ions (0.15 M KCl) was executed to stabilize the whole system. Further stabilization was done by running an NVT and an NPT simulation with position restraints, followed by an NPT simulation without the restraints. Then the final run, whose data was used for further analysis, was carried out.
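The order of the simulation stages can be written down as a small pipeline. The sketch below only builds the command lines and does not run them; it assumes the Gromacs 4 tools `grompp` and `mdrun`, and all file names (`protein.gro`, `solvated_ions.gro`, `topol.top`, the `.mdp` files) are hypothetical. Solvation and ion placement are assumed to happen in a separate step that produces `solvated_ions.gro`, and position restraints are assumed to be switched on inside the corresponding `.mdp` files.

```python
# Stage names follow the description in the text; file names are hypothetical.
STAGES = [
    ("em_vacuum", "in vacuo energy minimization"),
    ("em_solvated", "energy minimization in water with 0.15 M KCl"),
    ("nvt_posres", "NVT with position restraints"),
    ("npt_posres", "NPT with position restraints"),
    ("npt_free", "NPT without restraints"),
    ("production", "final run used for analysis"),
]

# Assumption: solvation and ion placement happen outside this sketch and
# produce the starting structure for the solvated minimization.
INPUT_OVERRIDE = {"em_solvated": "solvated_ions.gro"}


def stage_commands(name, start_structure, topology="topol.top"):
    """Return the grompp and mdrun command lines for one stage."""
    grompp = ["grompp", "-f", f"{name}.mdp", "-c", start_structure,
              "-p", topology, "-o", f"{name}.tpr"]
    mdrun = ["mdrun", "-deffnm", name]
    return [grompp, mdrun]


def build_pipeline(first_structure="protein.gro"):
    """Chain the stages so each one starts from the previous stage's output."""
    commands = []
    structure = first_structure
    for name, _description in STAGES:
        structure = INPUT_OVERRIDE.get(name, structure)
        commands.extend(stage_commands(name, structure))
        structure = f"{name}.gro"  # mdrun -deffnm <name> writes <name>.gro
    return commands


for command in build_pipeline():
    print(" ".join(command))
```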
For the analysis of the data the g_dist utility provided with Gromacs was used in combination with self-made Matlab scripts. Using g_dist, the number of water molecules within a 5 nm radius of certain characteristic groups was calculated for each protein. This value was divided by the runtime of the simulation, giving an indication of how accessible the characteristic group is to water molecules. The results are shown in the table below; the numbers should be interpreted as index values.
PDB ID | Backbone Secondary Amine | Arginine Guanidine | Arginine Secondary Amine | Secondary Amine | Lysine Primary Amine |
---|---|---|---|---|---|
1ETF | 7.49 | 7.23 | 5.34 | 5.81 | X |
1IWQ | 44.4 | 38.9 | 32.2 | 44.6 | 42.1 |
1PJN | 64.7 | 34.7 | 36.9 | 68.1 | 57.5 |
2IGR | 38.9 | X | X | 38.9 | 25.4 |
1G70 | 23.8 | 17.7 | 17.4 | 21.9 | X |
1BY0 | 51.6 | 61.5 | 57.9 | 53.8 | 16.6 |
1NWD | 20.9 | X | X | 20.9 | 27.3 |
2L9A | 26.8 | X | X | 26.8 | 26.2 |
1LQ7 | 0.585 | 5.86 | 2.75 | 0.581 | 15.3 |
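The index values in the table above come from counting water molecules near a characteristic group and dividing by the simulation runtime. The sketch below illustrates that idea in Python rather than reproducing the g_dist and Matlab workflow; the frame layout, the function names and the time unit of `runtime` are assumptions, and the coordinates are toy data.

```python
import math


def count_within(center, waters, cutoff_nm):
    """Count water positions within cutoff_nm of the group center."""
    return sum(1 for water in waters if math.dist(center, water) <= cutoff_nm)


def accessibility_index(frames, cutoff_nm, runtime):
    """Sum the per-frame water counts and divide by the simulation runtime.

    frames: list of (group_center, water_positions) tuples, one per frame.
    runtime: total simulated time, in whatever unit the index is reported in.
    """
    if runtime <= 0:
        raise ValueError("runtime must be positive")
    total = sum(count_within(center, waters, cutoff_nm)
                for center, waters in frames)
    return total / runtime


# Toy trajectory with two frames (coordinates in nm, made up for illustration).
toy_frames = [
    ((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (6.0, 0.0, 0.0)]),
    ((0.0, 0.0, 0.0), [(0.0, 3.0, 4.0), (0.0, 0.0, 5.1)]),
]
print(accessibility_index(toy_frames, cutoff_nm=5.0, runtime=4.0))
```

Because the result depends on the cutoff, the frame spacing and the runtime unit, values computed this way are only comparable between proteins simulated and analyzed with the same settings, which is why the table is read as an index.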
A higher value is expected to give better CEST contrast, and the selection was made on that basis. For Guanidine (of Arginine) based contrast, 1G70 and 1ETF were selected. Based on the simulations, 1G70 is expected to give the better contrast of the two. 1PJN was also tested to see the result for a protein that is expected to give high contrast with saturation pulses at the chemical shifts of both characteristic groups (the Guanidine of Arginine and the primary Amine of Lysine).
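The comparison behind this selection can be checked directly against the index table. The snippet below is a small sketch using the values reported for the three selected proteins; the dictionary layout and function name are illustrative, and an "X" in the table (no value reported) is represented as `None`.

```python
# Index values copied from the accessibility table for the selected proteins.
ACCESSIBILITY = {
    "1ETF": {"arginine_guanidine": 7.23, "lysine_primary_amine": None},
    "1G70": {"arginine_guanidine": 17.7, "lysine_primary_amine": None},
    "1PJN": {"arginine_guanidine": 34.7, "lysine_primary_amine": 57.5},
}


def rank_by(group, candidates):
    """Sort candidates by their index for one group, highest first."""
    scored = [(ACCESSIBILITY[pdb_id][group], pdb_id)
              for pdb_id in candidates
              if ACCESSIBILITY[pdb_id][group] is not None]
    return [pdb_id for _value, pdb_id in sorted(scored, reverse=True)]


# The two Arginine-rich candidates, compared on the Guanidine index.
print(rank_by("arginine_guanidine", ["1ETF", "1G70"]))

# Proteins with a reported value for both characteristic groups.
both = [pdb_id for pdb_id, values in ACCESSIBILITY.items()
        if all(value is not None for value in values.values())]
print(both)
```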
References