Team:Calgary/Notebook/Journal/Modelling

From 2013.igem.org

Modelling Journal

Week 5: May 27 - May 31

Modelling sub-team formed. Currently we do not know if our method of using ferritin to identify and separate DNA of interest will be successful. I was in charge of finding alternative reporter genes that will indicate the presence of DNA of interest. Specifically Lisa and Rai came up a list of interesting and widely used reporter genes that I should look in to, in the hope that eventually a differential equation based model can be developed. The differential model would help us predict their signal strength for any given input conditions of cells etc. This week I mostly did qualitative research, where I found out how the genes work, what equipment is needed, and started on learning their kinetics.

Those reporter genes can be divided into two categories: enzymes or proteins. β-galactosidase, luciferase and horseradish peroxidase all report promoter activities by reacting with a substrate that will either emit light or undergo color change. I have found that between the two enzymes luciferase has superior sensitivity and linear response range over GFP and β-glactosidase (Nakajima and Ohmiya, 2010). On the other hand GFP, FRET and BRET behave similarly but themselves don't catalyze substrates. GFP is able to fluoresces on its own. For FRET two things (nanoparticles or FPs) acting as donor/acceptor are required; when they are close enough and an external source provides excitement, the donor transfer light received to acceptor which emits a different light (Piston and Kremers, 2007). BRET is similar to FRET in that donor/acceptor pair is required, but BRET need not external source, rather it is powered by oxidation of substrates (Pfleger and Eidne, 2006). It seems that for our purposes FRET and BRET are too complicated and doesn't offer good enough signal to noise ratio in our application to justify the use of them. Instead we should focus on simple biochemical reporters, i.e. those enzymes mentioned above.

I have started working on the kinetics of those reporter enzymes.

Week 6: June 3 - June 7

The focus of this week has been the kinetics of those enzyme based reporter genes. There are two approaches to their kinetics, one based on chemical kinetics uses coupled ODEs and another using the results of enzyme kinetics reporting values such as Km and V. I was not trained in enzyme kinetics so I found a textbook by Cornish-Bowden to learn the techniques (Cornish-Bowden, 2012). This took up most of the week.

The kinetics of horseradish peroxidase in the context of being used for waste water treatment can be found by the work of Carvalho et al. (2006). Even though we intend to use horseradish peroxidase in the context of reporter gene, but we believe the catalytic activities will be the same, so the reaction pathways for pehnols in that paper can be readily adopted for our use. On the other hand the kinetics of fluorescence of β-galactosidase has been studied by Huang (1991). However the kinetics presented is very simple compared to work done by Metelkin et al. (2009). We need to decide whatever the detailed model will give us better results to justify their complexity.

Week 7: June 10 - June 14

In this week four models of enzyme kinetics have been built in Mathematica for the study of β-galactosidase, luciferase and horseradish peroxidase. By using the built-in function Manipulate, we can inspect the impact of substrate/enzyme concentration on the time course of the product of interest. In special cases, such as one model for horseradish peroxidase, even temperature can be varied as well.

It was noticed that some models predict negative concentrations of enzyme intermediate even though the governing equations were correct. We believe this was due to the stiffness of the system, for some reactions occur 7 magnitudes faster than others. When using a different technique, say stiffness switching, the simulation takes a very long time while progresses little, reaching maximum iteration step sets as 10^6. Fortunately the negative concentrations predicted were only of magnitude of 10^(-30) micro molar, such that it doesn't effect the result much. To fix the issue completely, however, may take more time by simplifying the governing equations.

Currently luciferase was modelled in luminescence output while β-galactosidase and horseradish peroxidase was modelled in color metric output. To make a proper comparison we wish to model the luminescence of horseradish peroxidase too. Even though the unit of measurement of luminescence and color metric output were in arbitrary units, we wish to obtain good insights in the shape of their response curve to select the one best suited for our needs.

Week 8: June 17 - June 21

To add information about mass balance to the reaction scheme of chemiluminescence of horseradish peroxidase we used a system of differential algebraic equations (DAEs). That is, in addition to differential equations governing the rate of change, because of the chemical equilibrium between species, algebraic equations constraining the relation between them are also included. Mathematica complained about the system being over determined even if no initial condition was given. In fact the form this DAEs has is not solvable by Mathematica.

The workaround for this problem I have found was to reduce everything to the fundamentals. So instead of equilibrium expressed as system of algebraic relations, I broke them into two reactions with rate constants large enough to be fast still obey the proportionality required by the equilibrium constant.

Maya 2012 couldn't open the Ferritin file created in Maya 2014. So I switched to Maya 2014 and now everything works fine.

Week 9: June 24 - June 28

The modelling is finished for now; until data from our experiments start to coming in, there is nothing to model. To help prepare the experiments I spent time studying experimental design and analysis techniques, with focus on analysis of variance (ANOVA) and factorial design. The book I had chosen to study from was Montgomery's Design and Analysis of Experiments textbook.

Week 10: July 1 - July 5

After talking to Taylor it has been determined that we will do a factorial design to investigate the effects of concentration and PH on the magneto-ferritin we will prepare in our lab. The goal is to quickly find better operating conditions to move on to next stages of the experiment. My next step needs the raw materials to be here and processed, so I am continuing to learn more about factorial design.

It has been decided that for the modelling in Autodesk Maya, we shall use the plug-in ePMV, which is only available for 2013 as the latest version. So installing both Maya 2013 and ePMV. Our vision is to simulate the bonding between parts, so I was studying dynamics and other related techniques in Maya as well.

Week 11: July 8 - July 12

This week's focus has been on Maya. Mostly figuring out how to import protein data base files (.pdb) and how to install and use various extensions in ePMV. I found out that the ePMV website isn't always clear, but fortunately YouTube has wonderful video tutorials on ePMV.

I loaded the ferritin sphere into the database file and changed the color scheme. The resulting Maya file can be as large as 50 MB, sometimes 100 MB more.

Week 12: July 15 - July 19

I kept working on Maya. Now all parts of the assembly have been loaded to Maya and ribbon, coarse surface, surface and atom models have been constructed. We were trying to print the modelled scenes from Maya, by rendering the scenes to images. But so far we have no success; all rendering methods in Maya resulted in poor images that we can't visualize the 3D structure. Sometimes the colors were lost resulted in a region of red surface. This problem occurred on all computers with Maya installed, under many different settings, even in Autodesk Max, but not on our in-house expert Merry's computer. For now we just use print screen to get the image and will worry about rendering later.

At the same time Ceser asked us to study the following techniques in Maya: Dynamics, nParticle, and APBS extension of ePMV for calculating the electrostatic force of a protein. Like everything else in Maya, the APBS extension did not work as we expected; for some reason it complains error occurred when executing a line in the script but without the details necessary for debugging. As a result nothing came out of APBS extension. Even worse we were going for the IMD extension of ePMV for calculating the molecular interactions, but that was not released which we just found out this week.

Week 13: July 22 - July 26

Turns out by re-configuring the light source immediately after importing .pdb files into Maya, the ribbon and atom models will render beautifully. Unfortunately this trick doesn't work for the surface models. Nonetheless we generated some really high definition pictures (8192*8192 pixels with 220 pixels/inch) for the website. At the same time Justin has been transferred to work on Maya with us. I taught him everything I knew, from creating simple objects to do basic dynamics.

Rai asked me to do some modelling on β-lactamase, which will likely be the reporter in our system. What she wants to get out of this study is to understand, when given two different concentrations of substrates, at what time will their differences be largest such that we can tell them apart. Unfortunately β-lactamase is very fast enzyme and that reaction mechanism has not been studied in detail. Then I remembered what my chemical engineering professor taught me: when reaction is fast, model mass transfer. For our lateral flow strip system, the substance will spend most of its time diffusing before react with β-lactamse, whose reaction will be rapid. Then by timing when the reaction occurs we have a measurement of the diffusion and hence will tell us how much substance was there initially.

Week 14: July 29 - August 2

Cesar asked us (Justin and I) to divide up our work. So I was working on the quantitative modelling while he worked on the qualitative modelling. The task I was asked to do was to figure out how to simulate TALE binding to DNA, according to laws of physics. We were hoping for using the electrostatics plug-in and derive the reactions from Brownian motion, but I had no luck with any of these. We are still working to understand how Maya will perform for dynamic models. We are uncertain if Maya is the best tool to achieve our vision for quantitative modelling.

Justin has made good progress with animating TALE binding to DNA without any physics. He did the most of the work of creating two animations while I rendered them into .avi files. Together we now understand the entire process from importing .pdb files to output animations to .avi files; we are ready for production.

For next week, after meeting with Cesar, we had agreed to focus on animation for now. The physics and details can be added later as we gain experience and insight with the molecules we working on.

Week 15: August 5 - August 9

Justin kept working on animating TALE binding to DNA, except this time he will also add ferritin, coils and linkers. The result is ferritin sphere + 12 TALE's + 24 coils and linkers + 1 DNA. Unfortunately the computer he is working on is not powerful enough to smoothly handle this, as a result of the slow response from the computer his progress has been slow. For me the work is to animate immobilized TALE binding to DNA. The difficulty lies in finding out how TALE should deform such that it can bind to DNA while one end of it is immobilized. For this Justin and I actually spend some time playing with wires that represent TALE and DNA to figure out the proper motion. Moreover Maya just behaves oddly on our computers; sometimes we can't generate DNA using the ePMV plug-in, sometimes the button on the Maya shelf will disappear. As a result we have accumulated a lot of technical questions that we were hoping to address.

We spent a lot this weeks meeting with Cesar talking about the big pictures and the approach he would like to see. He advised us to abandon the way we had been doing animation, which is by key framing, and switch to Nucleus engine. We are going to hold off on these adjustments until after our meeting with Merry, the expert on Maya, has taken place.

Week 16: August 12 - August 16

The computer Justin was using for his animation work complained about the depletion of RAM. As a result he couldn't work effectively in the lab, so we agreed to let him work at home for he has a much powerful desktop. This means that except Monday he was not here with the rest of the lab, but he will show up on the group meeting next Monday to report upon his progress.

Our meeting with Cesar on Monday resulted in division of labour between Justin and me. From that time I will be working on the quantitative modelling (on what problem is yet determined) while Justin will be responsible for the animation (that is, the entire story telling). This supports the decision of letting him work at home, for now there is no need of cooperation between the two of us. For now I studied nDynamics per request of Cesar in case the quantitative modelling will be focused on spatial problems, and then moved on for the modelling of β-lactamase, a request from Rai with newly found papers on its kinetics.

I also took some time doing readings on Fogler's book Elements of Chemical Reaction Engineering. This is because my prior study of enzyme kinetics was still confusing, mostly because the Cornish-Brown (2012) book was not presented in the way consistent with my chemical engineering background. Fogler's book, especially Chapter 7, bridged the gap.

Week 17: August 19 - August 23

Meeting with Cesar resulted in the topic of the quantitative modelling being set to the interaction & interaction of TALE and DNA. During the meeting Justin reported his progress on his last movie, which he completed using key-frame. To this Cesar pushed us try harder with nDynamics and Justin responded by that he needs at least a week to study the techniques properly. At the same meeting Iain asked us to start writing our progress report and filling in the wiki.

While Justin practised at home, I was working on the quantitative model. I adopted the approach of Markov chain by modelling 2D TALE-DNA binding on the strip, with discrete time and space. Since TALE is immobilized to the strip, with each time increment the DNA moves according to the movement rule specified by me, and when certain conditions are met it can either be bind to TALE permanently or can exit the strip. The difficulty was to translate those rules to the transition matrix, whose n-th power gives us the probability of state transition after n-th time increment, which we can obtain the probability of DNA capture. This model essentially hides unnecessarily fine details such as how TALE binds to DNA, and focus on the efficiency of DNA biding directly. The probability of DNA capture obtained from the model can be tested against empirical data such that certain parameters (i.e. the rules) can be modified to better describe reality.

So far a model with hard-coded constants was constructed as a proof of concept. The revision of the code with functions were partially done but was not tested for all combinations of TALE and DNA. So far the model has only considered 1 TALE and 1 DNA strand. The movement rule is a type of Brownian motion, with no TALE-DNA dissociation, with no noise or false DNA possible for binding, and with no DNA and strip interaction. But those are future directions that I can work on to make the model more sophisticated.

Week 18: August 26 - August 30

After the quantitative modelling was presented to the entire team on Monday, some senior members gave us some great suggestions. One of them requested the implementation of the concept of saturation of TALE such that each TALE can only bing one DNA. It looks simple to implement, but in reality it increases the computational effort significantly. If there are n possible states for DNA to be with an unsaturated TALE, then another n states with saturated TALE needs to be included in the state transition matrix. This effectively doubles the number of states needed and quadruples the number of calculations. Given that TALE and DNA will be much smaller than the space we consider, a large number of states will be necessary, which translates that the computational effort will be enormous. Here Dr. Nygren's suggestion of using partial differential equations for the modelling of diffusion and reaction might be appropriate, except that we were unable to find the parameters needed for a detailed DNA TALE binding scheme.

After a few days of research and thinking, two approaches seem to be worth a try: either do Monte Carlo simulations of the discretized system, or Bayesian network to exploit the overlapping sub-problems. The Monte Carlo simulations would be the more straight forward to code, for it simply describes the system in computer language then runs thousands of simulations to empirically determine the probabilities. The Bayesian network it would be a little bit complicated. For an investigation of a DNA and b TALE, we first compute the probabilities of 1 DNA binding, existing on the same space with {b,b-1,b-2,¡­,1} TALE respectively. Then to find the probabilities of n DNA we assume no interaction and apply binomial distribution to get the result from the corresponding 1 DNA case. Thus we obtained the probability of moving from the state of n DNA with m TALE to, the state of n-1 DNA with m TALE (if DNA exists) or the state of n-1 DNA with m-1 TALE (if DNA binds). Then the entire network from n DNA with m TALE to 0 DNA m-x TALE will give us the distribution where we can ask the expected number of DNA binds.

For now the simulation method is very easy to code so work has been started on that front. The Bayesian network approach is very interesting in its own right and we wish to develop the method further. Modelling work paused.

Week 23: September 30 - October 4

Practice, practice, practice for our Regional Jamboree presentation!

Week 24: October 7 - October 11

New wetlab data has become available and we have resumed our modeling work. We are currently asking what concentrations we need to be playing with in the lab to see the response we want from Taylor's prussian blue ferritin. To do this we've been looking at differential equations that will show binding of our various TALEs to their DNA targets to form complexes of immobilized DNA (DNA + the TALE on the strip) and then the immobilized reporter (when the FerriTALE is exposed to the immobilized DNA). We can then use the constants Taylor elucidated for his prussian blue ferritin to calculate how quickly we hit the visible threshold.

Week 25: October 14 - October 18

The equations have been put together in a MATLAB clone called Scilab. We are currently looking for constants to plug into the model for the DNA interactions of the TALEs. We have the Kd values for the TALEs, but cannot find any literature relating to separating this into the forward and reverse reactions for TALEs. One thing we plan on doing is testing this to try and feed some data into the model, but haven't been able to do this yet due to lack of protein.

Week 26: October 21 - October 25

The model is ready to test with our experimental data. We were able to obtain the kinetic constants for our TALEs to plug into the model. We are able look at 6 variables in our model and test them over time to see how fast we can get a result in our system. The preliminary results show that our system should be able to give a visible response in a reasonable amount of time.

Week 27: October 28 - November 1

It's our favourite time of the year: Wiki Freeze! See you in Boston.