Our model now accounts for the maturation of KillerRed and the accumulation of damage done to the bacteria. It can explain and predict the evolution of all three observed quantities: the optical density of the suspension, its fluorescence, and the density of living cells. We still have to determine the parameters that fit the data best.
There are 6 parameters to find:
$r$: the growth rate of the bacteria, in $min^{-1}$
$a$: the production of KillerRed per bacterium, in $UF.OD^{-1}.min^{-1}$
$b$: the efficiency of photobleaching, in $UF.UL^{-1}.min^{-1}$
$m$: the maturation rate of KillerRed, in $min^{-1}$
$k$: the toxicity of KillerRed, in $OD.UF^{-1}.UL^{-1}.min^{-1}$
$l$: the ability of the bacteria to repair the damage caused by ROS, dimensionless
The units are:
$OD$ is the Optical Density at $\lambda = 600nm$
$UF$ is an arbitrary Unit of Fluorescence (with $\lambda_{absorption}=585nm$ and $\lambda_{emission}=610nm$)
$UL$ is an arbitrary Unit of Light, related to the energy received by the bacteria. $1 UL$ is defined as the light energy received by an Erlenmeyer flask with an MR16 LED on its side at full power.
The aim is to find the set of parameters that best explains the observed curves of OD and fluorescence. Because several parameters have opposing effects, they cannot be determined separately; instead, we search for the set of parameters that minimizes the distance between the outputs of the model and the experimental data. The distance chosen is the Euclidean distance, measured through the Sum of Squared Residuals (SSR). In our case, the easiest and quickest methods are unusable:
$\diamond$ A regression requires the solutions to be analytic functions, such as polynomials or exponentials, onto which the data points can be projected.
$\diamond$ Gradient or Newton methods require the effect of the parameters to be regular, which is not the case here.
$\diamond$ The design of experiments technique is unusable for the same reason.
So we used an alternative method.
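For reference, the distance being minimized can be sketched as follows. This is a minimal illustration, not the code used for the project: the function name `ssr` and the way several curves might be combined are assumptions, since the text does not say how the OD and fluorescence residuals are weighted against each other.

```python
def ssr(predicted, observed):
    """Sum of Squared Residuals between a predicted and an observed curve.

    Both arguments are sequences sampled at the same time points.
    """
    if len(predicted) != len(observed):
        raise ValueError("curves must have the same number of points")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


# Example with made-up numbers: only the last point differs, by 2.
example = ssr([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Minimizing the SSR is equivalent to minimizing the Euclidean distance, since the latter is the square root of the former.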
At first sight, the only way to find our parameters was to adjust them by hand until the predictions seemed good enough. This was not slow, since we could anticipate how the output of the calculations would change when varying each parameter. But it gave no guarantee that the solution found was the best one. Such a guarantee can be obtained by an exhaustive search, but this is a long process. Testing 10 values of each parameter requires $10^6$ tests, and each test consists of the calculation of 1000 points. On a standard computer, this represents 2 hours of continuous processing. Since 10 values per parameter are not enough to get a precise answer, this approach would have been impractical.
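The cost of the exhaustive search can be checked with a few lines of arithmetic. The sketch below takes the figures quoted above (6 parameters, 10 values each, 2 hours) as the reference and assumes that the time per test stays constant; the function name is hypothetical.

```python
def exhaustive_hours(values_per_param, n_params=6, ref_values=10, ref_hours=2.0):
    """Estimated duration of a full grid search, scaled from the reference run.

    Assumes every test costs the same time, so the duration grows like
    values_per_param ** n_params.
    """
    return ref_hours * (values_per_param / ref_values) ** n_params


n_tests = 10 ** 6                 # 10 values for each of the 6 parameters
n_points = n_tests * 1000         # 1000 calculated points per test
hours_with_20_values = exhaustive_hours(20)
```

Under this assumption, merely doubling the resolution to 20 values per parameter multiplies the duration by $2^6 = 64$.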
That's why we used genetic algorithms.
The idea of a genetic algorithm is based on the evolution of a wild population and the natural selection of the phenotypes best adapted to their environment. Here, a phenotype is a set of parameters, and the measure of adaptation is the distance between the predicted and the observed kinetics.
1. First, we start with a randomly chosen population (not too random, in order to speed up the process).
2. The best phenotypes, those that minimize the distance between predictions and observations, are selected.
3. From these, new phenotypes are created by mixing the values of their parameters (crossing-over) and slightly modifying some of them (mutations).
4. We now have a second-generation population. If all its members are close enough to the solution (i.e., the distance between predictions and observations is small enough), the algorithm is considered 'stabilized': the best phenotype is chosen and the process stops. If not, the algorithm goes back to step 2 with these new phenotypes.
For 6 parameters, this genetic algorithm works well with a population of 21 phenotypes, 6 of which are selected to breed the next generation. To make sure that many sets of parameters are explored, we do not simply take the 6 best ones: we randomly choose 5 of the 6 best ones and 1 of the 15 others. This makes the process longer but yields better solutions.
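The procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the original implementation. The population size (21) and the selection rule (5 of the 6 best, 1 of the 15 others) come from the text. The breeding scheme is assumed: the 6 parents are kept and one child is created per pair of parents (15 pairs), which happens to give 21 phenotypes. The uniform crossing-over, the Gaussian mutation, the tolerance and the toy objective are also assumptions, and all function names are hypothetical.

```python
import random
from itertools import combinations

POP_SIZE = 21     # population size quoted in the text
N_BEST = 6        # size of the group of best phenotypes
N_FROM_BEST = 5   # parents drawn at random among the 6 best
N_FROM_REST = 1   # parent drawn at random among the 15 others


def select_parents(population, scores, rng):
    """Pick 5 of the 6 best phenotypes and 1 of the 15 others, at random."""
    ranked = [p for _, p in sorted(zip(scores, population), key=lambda t: t[0])]
    best, rest = ranked[:N_BEST], ranked[N_BEST:]
    return rng.sample(best, N_FROM_BEST) + rng.sample(rest, N_FROM_REST)


def breed(parents, bounds, rng, mutation_rate=0.2, mutation_scale=0.1):
    """Assumed scheme: keep the 6 parents, add one child per pair (15 pairs)."""
    children = [list(p) for p in parents]
    for mother, father in combinations(parents, 2):
        # Crossing-over: each parameter comes from one of the two parents.
        child = [rng.choice(pair) for pair in zip(mother, father)]
        for i, (low, high) in enumerate(bounds):
            if rng.random() < mutation_rate:
                # Mutation: small Gaussian change, clipped to the bounds.
                child[i] += rng.gauss(0.0, mutation_scale * (high - low))
                child[i] = min(max(child[i], low), high)
        children.append(child)
    return children


def genetic_search(objective, bounds, tolerance=1e-3, max_generations=200, seed=0):
    """Minimize objective(params); returns (best_params, best_score, history)."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(POP_SIZE)]
    best_params, best_score, history = None, float("inf"), []
    for _ in range(max_generations):
        scores = [objective(p) for p in population]
        i_best = min(range(len(scores)), key=scores.__getitem__)
        if scores[i_best] < best_score:
            # Remember the best phenotype seen so far, in any generation.
            best_params, best_score = list(population[i_best]), scores[i_best]
        history.append(best_score)
        if max(scores) < tolerance:
            break  # whole population close enough: 'stabilized'
        population = breed(select_parents(population, scores, rng), bounds, rng)
    return best_params, best_score, history


def toy_objective(params):
    """Stand-in for the SSR of the real model: distance to a known target."""
    return sum((p - 0.5) ** 2 for p in params)


best, score, history = genetic_search(toy_objective, [(0.0, 1.0)] * 6)
```

In real use, `toy_objective` would be replaced by a function that runs the model with the candidate parameters and returns the SSR against the measured curves. Because the best phenotype of a generation may not be picked as a parent, the sketch keeps track of the best phenotype seen so far.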
As this algorithm is not deterministic, the only way to compare it with the exhaustive search is to time it with a stopwatch. The gain is substantial: in only five minutes, we obtain a much more precise result.
We ran 5 experiments to find our parameters, all with the same strain and the same initial conditions (see 'KillerRed characterisation'), but with different kinds of illumination:
In the first 3, the light is switched off for the first 200 minutes, then switched on at full power.
In the last 2, the light is switched off for the first 180 minutes, then switched on at full power, after which we tried to stabilize the bacterial growth with the help of the predictions. The last one was a success.
For each experiment, we searched for the best set of parameters.
Parameter | exp 1 | exp 2 | exp 3 | exp 4 | exp 5 | Mean | Standard deviation | Coefficient of variation $\left(\frac{\sigma}{\mu}, in \%\right)$ |
---|---|---|---|---|---|---|---|---|
$R$ (min) | 75 | 82 | 81 | 81 | 95 | 82.8 | 7.36 | 8.89 |
$a$ | 93 | 150 | 150 | 170 | 100 | 132 | 34.0 | 25.7 |
$b$ ($\times 10^{-2}$) | 1.5 | 1.0 | 0.8 | 0.3 | 0.7 | 0.86 | 0.439 | 51.1 |
$M$ (min) | 120 | 100 | 100 | 120 | 100 | 108 | 11.0 | 10.1 |
$k$ ($\times 10^{-7}$) | 1.1 | 1.5 | 1.1 | 0.15 | 0.5 | 0.87 | 0.538 | 61.8 |
$1-l$ | 0.004 | 0.02 | 0.03 | 0.018 | 0.011 | 0.0166 | 0.00979 | 59.0 |
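The last three columns can be recomputed from the five fitted values. The short sketch below does so for $R$, assuming the sample standard deviation (denominator $n-1$), which is consistent with the figures reported in the table. The function name `summarize` is hypothetical.

```python
from statistics import mean, stdev

R_values = [75, 82, 81, 81, 95]  # fitted R for experiments 1 to 5, in min


def summarize(values):
    """Return the mean, the sample standard deviation and sigma/mu in %."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 denominator)
    return mu, sigma, 100 * sigma / mu


mu_R, sigma_R, cv_R = summarize(R_values)
```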
$M$ and $R$ are not the variables used in the equations, but they are directly linked to them:
$R$ is the doubling time of the bacteria, so $r=\frac{\ln(2)}{R}$.
$M$ is the time at which half of the KillerRed has matured, so $m=\frac{\ln(2)}{M}$.
Both are actual times (in $min$), which makes the figures easier to interpret.
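As a worked example of these two relations, the sketch below converts the mean values of $R$ and $M$ from the table into the rates $r$ and $m$ used by the model. The function name is hypothetical.

```python
import math


def rate_from_half_time(t_half_min):
    """Convert a doubling time or half-time (in min) into a rate (in 1/min)."""
    return math.log(2) / t_half_min


r = rate_from_half_time(82.8)  # mean R from the table, in min
m = rate_from_half_time(108)   # mean M from the table, in min
```

With these inputs, $r \approx 8.4 \times 10^{-3}\ min^{-1}$ and $m \approx 6.4 \times 10^{-3}\ min^{-1}$.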
We use $1-l$ rather than $l$ because $l$ has a huge effect on the predictions when $l\rightarrow1$.
Analyzing the table shows that $M$ and $R$ are the least variable parameters. In contrast, $a$, $b$ and $k$, the parameters that characterize KillerRed production, photobleaching and phototoxicity, exhibit large variations. These parameters should therefore be determined in each experiment, by a standard procedure described later.