Screening Factors
The first step of any DOE method is to investigate all variables that affect the results and to select the controllable factors for the experiment. In this experiment, the factors fall into two kinds: environmental factors, such as temperature and the rotation speed of the shaker, and the components of the medium. We consulted several papers on optimization experiments with B. subtilis and found shaker speeds ranging from 100 r/min to 250 r/min, with rotation speed generally playing only a minor role. Additionally, our lab has only two shakers. While we can place twenty different media in one shaker at a time, we would have to run the shakers again every time we altered the speed, which would take considerably longer. Thus, we fixed the rotation speed at 200 r/min. However, temperature and inoculation time are both vital environmental factors whose effects cannot be ignored.
Inoculation amount and filling volume (medium volume per flask volume) are two further factors that affect the results only slightly. We fixed them at 5 percent and 30 mL/500 mL respectively, following earlier reliable experiments.
A typical medium consists of a carbon source, a nitrogen source and inorganic salts, all of which are essential for the normal metabolism of the engineered bacteria. For convenience, we referred to the composition of typical LB medium and settled on three independent medium factors: peptone, yeast extract and sodium chloride (NaCl). Peptone provides nitrogen and carbon for the colonies, while yeast extract contains most of the required inorganic salts, so we did not list any inorganic salt other than NaCl. We did not know why NaCl is listed separately in LB medium, and we doubted that NaCl has much influence, since yeast extract already contains sodium.
Thus, we had five independent factors: temperature, inoculation time, peptone, yeast extract and NaCl. We then consulted further papers to define their ranges. The following table gives their levels; peptone, yeast extract and NaCl are in g/L:
Factor | Low | High |
Temperature | 25℃ | 35℃ |
Time | 12h | 24h |
Peptone | 5 | 15 |
Yeast Extract | 2.5 | 7.5 |
NaCl | 5 | 15 |
Table 1. Factors and their values of our design
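Response-surface software usually works in coded units, where the low, center and high levels of each factor map to -1, 0 and +1. A minimal Python sketch of that conversion for the ranges in Table 1 follows; the dictionary and function names are illustrative and not taken from Minitab.

```python
# Low and high levels from Table 1 (peptone, yeast extract, NaCl in g/L).
LEVELS = {
    "Temperature": (25.0, 35.0),    # degrees Celsius
    "Time": (12.0, 24.0),           # hours
    "Peptone": (5.0, 15.0),
    "Yeast Extract": (2.5, 7.5),
    "NaCl": (5.0, 15.0),
}


def to_coded(factor, actual):
    """Map an actual value to coded units (-1 = low, 0 = center, +1 = high)."""
    low, high = LEVELS[factor]
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (actual - center) / half_range


def to_actual(factor, coded):
    """Map a coded value back to the actual units of Table 1."""
    low, high = LEVELS[factor]
    return (low + high) / 2 + coded * (high - low) / 2


# Print the center point of every factor in actual units.
for name in LEVELS:
    print(name, to_actual(name, 0.0))
```

The center point obtained this way (30℃, 18 h, 10, 5 and 10 g/L) is the treatment that the design replicates.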
Designs & Results
RSM designs fall into two main types: Central Composite Design (CCD) and Box-Behnken Design. For a given number of factors, Box-Behnken Design generally requires fewer runs, but CCD is often recommended when sequential experimentation is required, because it can build on the information from a properly planned factorial experiment. In our experiment, time is more precious than reagents, and because time itself is one of the independent factors, adopting Box-Behnken Design would not have saved any time. Thus we selected CCD.
CCD itself comes in three variants: Central Composite Circumscribed design (CCC), Central Composite Inscribed design (CCI) and Central Composite Face-centered design (CCF). In CCC the value of α depends on the number of factors, whereas in CCF α is fixed at 1; CCC is rotatable and CCF is not. Rotatability makes CCC mathematically preferable, but α for a five-factor CCC is over 2. In other words, had we adopted CCC, some treatments would have been absurd, with negative concentrations of certain components. Had we instead narrowed the ranges so that every medium component stayed positive in every treatment, the ranges would have been too narrow to yield convincing results. Therefore, we finally selected CCF.
We conducted our experiments according to the following table, which was generated by Minitab; the results, measured as OD values, are also included:
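The argument about negative concentrations can be checked numerically. The sketch below assumes a full 2^5 factorial core and the usual rotatability condition, α equal to the fourth root of the number of factorial runs; the variable names are illustrative.

```python
N_FACTORS = 5
factorial_runs = 2 ** N_FACTORS      # full two-level factorial: 32 runs (assumed)
alpha = factorial_runs ** 0.25       # rotatable axial distance, about 2.38

# (center, half-range) in g/L for the medium components, from Table 1.
MEDIUM = {
    "Peptone": (10.0, 5.0),
    "Yeast Extract": (5.0, 2.5),
    "NaCl": (10.0, 5.0),
}


def axial_low(center, half_range, a):
    """Actual value of the lower axial point at coded distance -a."""
    return center - a * half_range


lows = {name: axial_low(c, h, alpha) for name, (c, h) in MEDIUM.items()}
for name, value in lows.items():
    print(f"{name}: lower axial point at {value:.2f} g/L")
```

Under these assumptions every lower axial concentration is below zero, which is why a CCC over the ranges of Table 1 is not physically realizable.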
No. | Temperature (℃) | Time (h) | Peptone (g/L) | Yeast extract (g/L) | NaCl (g/L) | OD |
1 | 25 | 12 | 5 | 2.5 | 5 | 0.511 |
2 | 35 | 24 | 5 | 2.5 | 5 | 1.625 |
3 | 35 | 12 | 15 | 2.5 | 5 | 2.783 |
4 | 25 | 24 | 15 | 2.5 | 5 | 1.74 |
5 | 35 | 12 | 5 | 7.5 | 5 | 2.317 |
6 | 25 | 24 | 5 | 7.5 | 5 | 2.4 |
7 | 25 | 12 | 15 | 7.5 | 5 | 0.912 |
8 | 35 | 24 | 15 | 7.5 | 5 | 3 |
9 | 35 | 12 | 5 | 2.5 | 15 | 2.169 |
10 | 25 | 24 | 5 | 2.5 | 15 | 1.77 |
11 | 25 | 12 | 15 | 2.5 | 15 | 0.371 |
12 | 35 | 24 | 15 | 2.5 | 15 | 2.7 |
13 | 25 | 12 | 5 | 7.5 | 15 | 0.754 |
14 | 35 | 24 | 5 | 7.5 | 15 | 2.58 |
15 | 35 | 12 | 15 | 7.5 | 15 | 3.128 |
16 | 25 | 24 | 15 | 7.5 | 15 | 2.38 |
17 | 30 | 18 | 10 | 5 | 10 | 2.908 |
18 | 30 | 18 | 10 | 5 | 10 | 2.908 |
19 | 30 | 18 | 10 | 5 | 10 | 1.75 |
20 | 30 | 18 | 10 | 5 | 10 | 2.908 |
21 | 35 | 12 | 5 | 2.5 | 5 | 2.082 |
22 | 25 | 24 | 5 | 2.5 | 5 | 1.75 |
23 | 25 | 12 | 15 | 2.5 | 5 | 0.508 |
24 | 35 | 24 | 15 | 2.5 | 5 | 2.6 |
25 | 25 | 12 | 5 | 7.5 | 5 | 0.989 |
26 | 35 | 24 | 5 | 7.5 | 5 | 2.8 |
27 | 35 | 12 | 15 | 7.5 | 5 | 2.782 |
28 | 25 | 24 | 15 | 7.5 | 5 | 1.7 |
29 | 25 | 12 | 5 | 2.5 | 15 | 0.508 |
30 | 35 | 24 | 5 | 2.5 | 15 | 1.338 |
31 | 35 | 12 | 15 | 2.5 | 15 | 3.061 |
32 | 25 | 24 | 15 | 2.5 | 15 | 2.2 |
33 | 35 | 12 | 5 | 7.5 | 15 | 2.167 |
34 | 25 | 24 | 5 | 7.5 | 15 | 1.53 |
35 | 25 | 12 | 15 | 7.5 | 15 | 0.555 |
36 | 35 | 24 | 15 | 7.5 | 15 | 2.9 |
37 | 30 | 18 | 10 | 5 | 10 | 2.908 |
38 | 30 | 18 | 10 | 5 | 10 | 2.908 |
39 | 30 | 18 | 10 | 5 | 10 | 2.908 |
40 | 30 | 18 | 10 | 5 | 10 | 2.957 |
41 | 25 | 18 | 10 | 5 | 10 | 1.907 |
42 | 35 | 18 | 10 | 5 | 10 | |
43 | 30 | 12 | 10 | 5 | 10 | 2.652 |
44 | 30 | 24 | 10 | 5 | 10 | 2.908 |
45 | 30 | 18 | 5 | 5 | 10 | 2.726 |
46 | 30 | 18 | 15 | 5 | 10 | 3.042 |
47 | 30 | 18 | 10 | 2.5 | 10 | 2.598 |
48 | 30 | 18 | 10 | 7.5 | 10 | 3.124 |
49 | 30 | 18 | 10 | 5 | 5 | 2.999 |
50 | 30 | 18 | 10 | 5 | 15 | 2.834 |
51 | 30 | 18 | 10 | 5 | 10 | 2.908 |
52 | 30 | 18 | 10 | 5 | 10 | 2.908 |
53 | 30 | 18 | 10 | 5 | 10 | 2.908 |
54 | 30 | 18 | 10 | 5 | 10 | 2.908 |
Table 2. Treatments and results of our experiment
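The structure of Table 2 (32 factorial corners, 10 face-centered axial points and 12 center points) can be reproduced with a short Python sketch. This is not Minitab's algorithm: the run order and blocking differ, and the function names are illustrative.

```python
from itertools import product

FACTORS = ["Temperature", "Time", "Peptone", "Yeast Extract", "NaCl"]
LOW = [25.0, 12.0, 5.0, 2.5, 5.0]      # low levels from Table 1
HIGH = [35.0, 24.0, 15.0, 7.5, 15.0]   # high levels from Table 1


def ccf_design(n_center):
    """Face-centered central composite design in coded units."""
    k = len(FACTORS)
    corners = [list(p) for p in product((-1, 1), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1, 1):
            point = [0] * k
            point[i] = sign        # alpha = 1: axial points lie on the faces
            axial.append(point)
    centers = [[0] * k for _ in range(n_center)]
    return corners + axial + centers


def decode(point):
    """Convert one coded design point to actual units."""
    return [(lo + hi) / 2 + c * (hi - lo) / 2
            for c, lo, hi in zip(point, LOW, HIGH)]


# Table 2 contains 12 rows at the center point.
design = [decode(p) for p in ccf_design(n_center=12)]
print(len(design), "runs")
```

Each generated row corresponds to one treatment in Table 2, for example the corner at 35℃, 12 h, 15, 7.5 and 15 g/L, which is medium No. 15.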
The result for medium No. 42 was lost for unknown reasons. Additionally, replicating the center point, that is, running several experiments at the center with identical treatments, is very common practice in DOE; because of our limited time and reagents, however, we ran the center point only once and reused its result.
Estimated Regression Coefficients for OD
Term | Coef | SE Coef | T | P |
Constant | 2.87625 | 0.07126 | 40.361 | 0.000 |
Temperature | 0.60225 | 0.05210 | 11.560 | 0.000 |
Time | 0.28447 | 0.05072 | 5.608 | 0.000 |
Peptone | 0.18665 | 0.05072 | 3.680 | 0.001 |
Yeast Extract | 0.16776 | 0.05072 | 3.308 | 0.002 |
NaCl | -0.01626 | 0.05072 | -0.321 | 0.751 |
Temperature*Temperature | -0.54900 | 0.24585 | -2.233 | 0.033 |
Time*Time | -0.18725 | 0.19289 | -0.971 | 0.339 |
Peptone*Peptone | -0.08325 | 0.19289 | -0.432 | 0.669 |
Yeast Extract*Yeast Extract | -0.10625 | 0.19289 | -0.551 | 0.586 |
NaCl*NaCl | -0.05075 | 0.19289 | -0.263 | 0.794 |
Temperature*Time | -0.358338 | 0.05228 | -6.579 | 0.000 |
Temperature*Peptone | 0.17881 | 0.05228 | 3.420 | 0.002 |
Temperature*Yeast Extract | 0.04544 | 0.05228 | 0.869 | 0.391 |
Temperature*NaCl | 0.01550 | 0.05228 | 0.296 | 0.769 |
Time*Peptone | 0.02575 | 0.05228 | 0.493 | 0.626 |
Time*Yeast Extract | 0.06112 | 0.05228 | 1.169 | 0.261 |
Time*NaCl | -0.00144 | 0.05228 | -0.027 | 0.978 |
Peptone*Yeast Extract | -0.07469 | 0.05228 | -1.429 | 0.163 |
Peptone*NaCl | 0.09150 | 0.05228 | 1.750 | 0.090 |
Yeast Extract*NaCl | -0.04450 | 0.05228 | -0.851 | 0.401 |
S = 0.295758 PRESS = 7.78904
R-Sq = 92.25% R-Sq(pred) = 78.45% R-Sq(adj) = 87.41%
Table 3. Estimated Regression Coefficients for OD
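The summary statistics reported with Table 3 are linked by standard formulas, which gives a quick consistency check. The sketch below assumes that the fit used the 53 runs with a recorded OD (54 minus the lost run No. 42) and the 20 non-constant terms of Table 3; the variable names are illustrative.

```python
n_obs = 53          # assumed: 54 planned runs minus the lost run No. 42
n_terms = 20        # linear, square and interaction terms in Table 3
s = 0.295758        # S, the residual standard deviation
press = 7.78904     # PRESS
r_sq = 0.9225       # R-Sq

df_error = n_obs - n_terms - 1                 # residual degrees of freedom
sse = s ** 2 * df_error                        # residual sum of squares
sst = sse / (1 - r_sq)                         # total sum of squares
r_sq_adj = 1 - (1 - r_sq) * (n_obs - 1) / df_error
r_sq_pred = 1 - press / sst

print(f"R-Sq(adj)  = {100 * r_sq_adj:.2f}%")
print(f"R-Sq(pred) = {100 * r_sq_pred:.2f}%")
```

Both recomputed values agree with the reported R-Sq(adj) and R-Sq(pred) to within rounding, which supports the assumption that 53 observations entered the fit.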
We denote each term by the mark given in the following table:
Term | Mark |
OD | F |
Temperature | T |
Time | t |
Peptone | P |
Yeast Extract | Y |
NaCl | C |
Table 4. Mark for each term
From the analysis calculated by Minitab, we obtained a fitted second-order expression for OD whose coefficients are those listed in Table 3.
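Written out in full, the second-order model can be assembled term by term from Table 3. The form below is a reconstruction, not the authors' own rendering: it assumes the coefficients are in coded units (each factor scaled to the range -1 to +1, which the near-center value of the constant suggests), and it writes F for OD, T for temperature, P for peptone, Y for yeast extract and C for NaCl, with a lowercase t for time to keep it distinct from temperature.

```latex
\begin{aligned}
F ={}& 2.87625 + 0.60225\,T + 0.28447\,t + 0.18665\,P + 0.16776\,Y - 0.01626\,C \\
     & - 0.54900\,T^{2} - 0.18725\,t^{2} - 0.08325\,P^{2} - 0.10625\,Y^{2} - 0.05075\,C^{2} \\
     & - 0.358338\,Tt + 0.17881\,TP + 0.04544\,TY + 0.01550\,TC \\
     & + 0.02575\,tP + 0.06112\,tY - 0.00144\,tC \\
     & - 0.07469\,PY + 0.09150\,PC - 0.04450\,YC
\end{aligned}
```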
P is the p-value of each term, a key criterion for judging the reliability of the fitted function: it is the probability of obtaining an estimate at least this extreme if the true coefficient of the term were zero, so a term with P below 0.05 is conventionally regarded as statistically significant. The coefficient of determination (R²) was 0.9225, indicating that the model explains about 92% of the variability in OD. From the above table we can identify eight statistically significant and reliable terms:
- Constant;
- Temperature;
- Time;
- Yeast Extract;
- Peptone;
- Temperature*Temperature;
- Temperature*Time;
- Temperature*Peptone.
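The screening of terms can be sketched as a filter on the P column of Table 3. The 0.05 threshold is the conventional choice and is an assumption here, as are the variable names.

```python
# P values copied from Table 3.
P_VALUES = {
    "Constant": 0.000,
    "Temperature": 0.000,
    "Time": 0.000,
    "Peptone": 0.001,
    "Yeast Extract": 0.002,
    "NaCl": 0.751,
    "Temperature*Temperature": 0.033,
    "Time*Time": 0.339,
    "Peptone*Peptone": 0.669,
    "Yeast Extract*Yeast Extract": 0.586,
    "NaCl*NaCl": 0.794,
    "Temperature*Time": 0.000,
    "Temperature*Peptone": 0.002,
    "Temperature*Yeast Extract": 0.391,
    "Temperature*NaCl": 0.769,
    "Time*Peptone": 0.626,
    "Time*Yeast Extract": 0.261,
    "Time*NaCl": 0.978,
    "Peptone*Yeast Extract": 0.163,
    "Peptone*NaCl": 0.090,
    "Yeast Extract*NaCl": 0.401,
}

SIGNIFICANCE_LEVEL = 0.05   # conventional threshold (assumed)

# Keep the terms whose p-value falls below the threshold.
significant = [term for term, p in P_VALUES.items() if p < SIGNIFICANCE_LEVEL]
for term in significant:
    print(term)
```

A stricter or looser threshold would change the selection; Peptone*NaCl, for instance, would pass at a level of 0.10.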
The linear terms predominated, with the exception of NaCl, which confirmed our suspicion, whereas most square and interaction terms were negligible and not statistically significant. Temperature and time are the two most influential factors.
As the complete response surface is six-dimensional (five factors plus the response), it cannot be drawn in our three-dimensional world. We therefore fixed some factors to reduce the dimensions and drew contours and surfaces; the full hypersurface can be pictured by combining these plots:
Figure 1. Surface plots of OD vs time and temperature.
Figure 2. Contour plots of OD vs time and temperature.
Figure 3. Surface plots of OD vs time and peptone.
Figure 4. Contour plots of OD vs time and peptone.
Figure 5. Surface plots of OD vs peptone and temperature.
Figure 6. Contour plots of OD vs peptone and temperature.
Figure 7. Surface plots of OD vs yeast extract and temperature.
Figure 8. Contour plots of OD vs yeast extract and temperature.
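Each plot of this kind is a two-factor slice of the fitted model with the remaining factors held fixed. The sketch below evaluates such a slice on a grid, assuming the coefficients of Table 3 are in coded units and that the other factors are held at the center; the function names and grid size are illustrative, and no plotting library is used.

```python
NAMES = ["Temperature", "Time", "Peptone", "Yeast Extract", "NaCl"]
CONSTANT = 2.87625
LINEAR = [0.60225, 0.28447, 0.18665, 0.16776, -0.01626]
SQUARE = [-0.54900, -0.18725, -0.08325, -0.10625, -0.05075]
# Interaction coefficients from Table 3, keyed by factor index pairs.
INTERACTION = {
    (0, 1): -0.358338, (0, 2): 0.17881, (0, 3): 0.04544, (0, 4): 0.01550,
    (1, 2): 0.02575, (1, 3): 0.06112, (1, 4): -0.00144,
    (2, 3): -0.07469, (2, 4): 0.09150, (3, 4): -0.04450,
}


def predict_od(x):
    """Fitted OD at a point x in coded units (each entry between -1 and 1)."""
    value = CONSTANT
    for i, xi in enumerate(x):
        value += LINEAR[i] * xi + SQUARE[i] * xi * xi
    for (i, j), coef in INTERACTION.items():
        value += coef * x[i] * x[j]
    return value


def slice_grid(i, j, steps=5):
    """Fitted OD over factors i and j; all other factors stay at 0."""
    ticks = [-1 + 2 * s / (steps - 1) for s in range(steps)]
    grid = []
    for a in ticks:
        row = []
        for b in ticks:
            x = [0.0] * len(NAMES)
            x[i], x[j] = a, b
            row.append(predict_od(x))
        grid.append(row)
    return ticks, grid


# Temperature (index 0) against time (index 1).
ticks, grid = slice_grid(0, 1)
for row in grid:
    print(" ".join(f"{v:6.3f}" for v in row))
```

The resulting grid can be passed to any contour or surface plotting routine; holding the fixed factors at other levels than the center would give a different slice of the same hypersurface.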
The following four plots illustrate the distribution of the residuals:
Figure 9. Residual error vs order
Figure 10. Histogram of residual error
Figure 11. Residual error vs fits
Figure 12. Normal probability plot of residual error
Optimization
One notable characteristic of CCD is that it is sequential, which is also the essence of RSM. Once the fitted function has been obtained, the next step is to calculate its gradient and to choose a small step length. Further experiments are then conducted, starting from the initial point and moving along the gradient by the step length, until the treatment giving the maximum is located. RSM is like climbing a mountain whose peak is unknown: we adjust our direction according to the terrain. The fitted surface, often a hypersurface in a higher-dimensional space, corresponds to the mountain with no visible peak, and calculating the gradient corresponds to finding our bearings.
Unfortunately, our remaining time was not enough for further experiments. When we looked at other studies using RSM, none of them had carried out a second round of experiments either, and we realized that this is perhaps the difference between scientific research and a real industrial procedure. Still, the response surface methodology served as a powerful tool for ANOVA. As a rough estimate, we can take the treatment of medium No. 15 (temperature 35℃, time 12 h, peptone 15 g/L, yeast extract 7.5 g/L, NaCl 15 g/L), which gave the highest measured OD, as the optimal condition for B. subtilis.
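The gradient step described here can be sketched in a few lines of Python. The sketch assumes the coefficients of Table 3 are in coded units, so that the gradient of the fitted model at the design center equals its linear coefficients; the starting point (the center), the step length of 0.5 coded units and the function name are illustrative choices, not part of our procedure.

```python
import math

NAMES = ["Temperature", "Time", "Peptone", "Yeast Extract", "NaCl"]
CENTER = [30.0, 18.0, 10.0, 5.0, 10.0]      # center point in actual units
HALF_RANGE = [5.0, 6.0, 5.0, 2.5, 5.0]      # (high - low) / 2 from Table 1
# At the coded center the square and interaction terms have zero slope,
# so the gradient is the vector of linear coefficients from Table 3.
GRADIENT_AT_CENTER = [0.60225, 0.28447, 0.18665, 0.16776, -0.01626]


def ascent_path(n_steps, step_length=0.5):
    """Candidate treatments along the steepest-ascent direction."""
    norm = math.sqrt(sum(g * g for g in GRADIENT_AT_CENTER))
    direction = [g / norm for g in GRADIENT_AT_CENTER]
    path = []
    for k in range(1, n_steps + 1):
        coded = [k * step_length * d for d in direction]
        actual = [c + x * h for c, x, h in zip(CENTER, coded, HALF_RANGE)]
        path.append(actual)
    return path


path = ascent_path(3)
for point in path:
    print(", ".join(f"{n}={v:.2f}" for n, v in zip(NAMES, point)))
```

In a real second round, each point on the path would be run as an experiment and the walk stopped once OD no longer increases, after which a new design would be fitted around the best point.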