【Why We Designed This Experiment】
Bacillus subtilis is widely used as an engineered host, especially in the food and pharmaceutical industries, because it is safe and secretes proteins efficiently. After comparing the characteristics of several mutants, we selected the Bacillus subtilis WB800N mutant as our engineered strain and surveyed the literature for its optimal culture conditions. To our disappointment, very few experiments have been done on the WB800N mutant, and most optimization studies on Bacillus subtilis focus solely on the yield of one specific protein. Considering the final goal of our project, we had to design this experiment ourselves to find the best conditions for Bacillus subtilis WB800N.
【Methodology】
Any optimization study inevitably draws on Design of Experiments (DOE), which comprises several families of designs. Among them, Orthogonal Design and Response Surface Methodology (RSM) are the two most common in biological experiments. Generally, Orthogonal Design takes less time and has been used more widely, yet it is less rigorous mathematically and can overlook interactions and aliasing between or among factors. In contrast, RSM rests on rigorous mathematical theory and excels in data analysis. Having weighed the features of the two methods carefully, we chose RSM.
【Sweeping Factors】
The first step of any DOE method is to investigate all variables that affect the results and to select controllable factors for the experiment. In this experiment, the factors fall into two kinds: environmental factors, such as temperature and the rotation speed of the shaker, and the components of the medium. The papers we consulted on optimization experiments with Bacillus subtilis used shaker speeds ranging from 100 r/min to 250 r/min, and rotation speed generally played only a minor role. Additionally, our lab has only two shakers. While we can place twenty different media in one shaker at a time, every change of speed requires a separate shaker run, which would take considerably longer. Thus, we fixed the rotation speed at 200 r/min. However, temperature and inoculation time are both vital environmental factors whose effects cannot be ignored.
Inoculum size and filling volume (the volume of medium per flask) are two further factors that affect the results only slightly. We fixed them at 5 percent and 30 mL per 500 mL flask respectively, following earlier published experiments.
A typical medium consists of a carbon source, a nitrogen source and inorganic salts, all of which are essential for the normal metabolism of the engineered bacteria. For convenience, we referred to the composition of typical LB medium and settled on three independent medium factors: peptone, yeast extract and sodium chloride (NaCl). Peptone provides nitrogen and carbon for the cells, while yeast extract supplies most of the required inorganic salts, so we did not list any inorganic salt other than NaCl. We did not know why NaCl is listed separately in LB medium, and we doubted that NaCl has much influence, since yeast extract already contains sodium.
Thus, we had five independent factors: temperature, inoculation time, peptone, yeast extract and NaCl. We further investigated some papers and defined their ranges. The following table displays their levels, and the unit of peptone, yeast extract and NaCl is g/L:
Factor | Low | High
Temperature | 25℃ | 35℃
Time | 12h | 24h
Peptone | 5 | 15
Yeast Extract | 2.5 | 7.5
NaCl | 5 | 15
Table 1. Factors and their values of our design
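Response surface software usually works in coded units, where the low level of each factor is -1, the high level is +1 and the midpoint is 0. The conversion can be sketched as follows, using the low and high levels of Table 1; the names `LEVELS`, `to_coded` and `to_natural` are illustrative and not taken from Minitab.

```python
# Low and high levels from Table 1 (temperature in ℃, time in h, the rest in g/L).
LEVELS = {
    "Temperature": (25.0, 35.0),
    "Time": (12.0, 24.0),
    "Peptone": (5.0, 15.0),
    "Yeast Extract": (2.5, 7.5),
    "NaCl": (5.0, 15.0),
}


def to_coded(factor, value):
    """Map a natural value onto the coded scale (low = -1, high = +1)."""
    low, high = LEVELS[factor]
    center = (low + high) / 2
    half_range = (high - low) / 2
    return (value - center) / half_range


def to_natural(factor, coded):
    """Inverse of to_coded: map a coded value back to natural units."""
    low, high = LEVELS[factor]
    return (low + high) / 2 + coded * (high - low) / 2


if __name__ == "__main__":
    for name, (low, high) in LEVELS.items():
        midpoint = (low + high) / 2
        print(name, to_coded(name, low), to_coded(name, midpoint), to_coded(name, high))
```

For example, 30℃ maps to 0 for temperature, and a coded value of -1 for time maps back to 12 h.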
【Designs&Results】
The designs used in RSM fall into two main types: Central Composite Designs (CCD) and Box-Behnken Designs. For a given number of factors, a Box-Behnken Design generally needs fewer runs, but Central Composite Designs are often recommended when the plan calls for sequential experimentation, because they can incorporate information from a properly planned factorial experiment. In our experiment, time was more precious than reagents, and since time is itself an independent factor, a Box-Behnken Design would not have saved any time. Thus we selected CCD.
CCD itself comes in three variants: Central Composite Circumscribed (CCC), Central Composite Inscribed (CCI) and Central Composite Face-centered (CCF). In CCC, the axial distance α depends on the number of factors, whereas in CCF α is fixed at 1, so CCF is not rotatable while CCC is. Rotatability makes CCC mathematically preferable, yet α for a five-factor CCC exceeds 2. In other words, had we adopted CCC with the levels in Table 1, some treatments would have been absurd, with negative concentrations of some medium components. Had we narrowed the ranges to keep every concentration positive, the ranges of all three medium factors would have been too narrow to yield convincing results. Therefore, we selected CCF.
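The argument about negative concentrations can be checked with a short calculation. The sketch below assumes the standard rotatability rule α = F^(1/4), where F is the number of factorial (corner) points, and assumes a full 2^5 factorial portion; the function names are illustrative.

```python
def rotatable_alpha(n_factorial_points):
    """Axial distance that makes a CCD rotatable: alpha = F ** (1/4)."""
    return n_factorial_points ** 0.25


def axial_levels(low, high, alpha):
    """Natural-unit axial (star) levels when low/high are the coded -1/+1 levels."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return center - alpha * half_range, center + alpha * half_range


# Assumption: full 2^5 factorial, i.e. 32 corner points.
ALPHA_FULL = rotatable_alpha(2 ** 5)

# Medium factors in g/L, low and high levels from Table 1.
MEDIUM = {"Peptone": (5.0, 15.0), "Yeast Extract": (2.5, 7.5), "NaCl": (5.0, 15.0)}

if __name__ == "__main__":
    print("alpha for a rotatable five-factor CCC: %.3f" % ALPHA_FULL)
    for name, (low, high) in MEDIUM.items():
        ccc = axial_levels(low, high, ALPHA_FULL)
        ccf = axial_levels(low, high, 1.0)
        print("%s: CCC axial levels %.2f / %.2f, CCF axial levels %.2f / %.2f"
              % (name, ccc[0], ccc[1], ccf[0], ccf[1]))
```

Under these assumptions α is about 2.38, so the lower axial level of every medium factor falls below zero in CCC, while in CCF the axial points coincide with the low and high levels.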
We ran the experiments according to the following table, which was generated by Minitab; the results, measured as OD values, are included:
No. | Temperature (℃) | Time (h) | Peptone (g/L) | Yeast extract (g/L) | NaCl (g/L) | OD
1 | 25 | 12 | 5 | 2.5 | 5 | 0.511
2 | 35 | 24 | 5 | 2.5 | 5 | 1.625
3 | 35 | 12 | 15 | 2.5 | 5 | 2.783
4 | 25 | 24 | 15 | 2.5 | 5 | 1.74
5 | 35 | 12 | 5 | 7.5 | 5 | 2.317
6 | 25 | 24 | 5 | 7.5 | 5 | 2.4
7 | 25 | 12 | 15 | 7.5 | 5 | 0.912
8 | 35 | 24 | 15 | 7.5 | 5 | 3
9 | 35 | 12 | 5 | 2.5 | 15 | 2.169
10 | 25 | 24 | 5 | 2.5 | 15 | 1.77
11 | 25 | 12 | 15 | 2.5 | 15 | 0.371
12 | 35 | 24 | 15 | 2.5 | 15 | 2.7
13 | 25 | 12 | 5 | 7.5 | 15 | 0.754
14 | 35 | 24 | 5 | 7.5 | 15 | 2.58
15 | 35 | 12 | 15 | 7.5 | 15 | 3.128
16 | 25 | 24 | 15 | 7.5 | 15 | 2.38
17 | 30 | 18 | 10 | 5 | 10 | 2.908
18 | 30 | 18 | 10 | 5 | 10 | 2.908
19 | 30 | 18 | 10 | 5 | 10 | 1.75
20 | 30 | 18 | 10 | 5 | 10 | 2.908
21 | 35 | 12 | 5 | 2.5 | 5 | 2.082
22 | 25 | 24 | 5 | 2.5 | 5 | 1.75
23 | 25 | 12 | 15 | 2.5 | 5 | 0.508
24 | 35 | 24 | 15 | 2.5 | 5 | 2.6
25 | 25 | 12 | 5 | 7.5 | 5 | 0.989
26 | 35 | 24 | 5 | 7.5 | 5 | 2.8
27 | 35 | 12 | 15 | 7.5 | 5 | 2.782
28 | 25 | 24 | 15 | 7.5 | 5 | 1.7
29 | 25 | 12 | 5 | 2.5 | 15 | 0.508
30 | 35 | 24 | 5 | 2.5 | 15 | 1.338
31 | 35 | 12 | 15 | 2.5 | 15 | 3.061
32 | 25 | 24 | 15 | 2.5 | 15 | 2.2
33 | 35 | 12 | 5 | 7.5 | 15 | 2.167
34 | 25 | 24 | 5 | 7.5 | 15 | 1.53
35 | 25 | 12 | 15 | 7.5 | 15 | 0.555
36 | 35 | 24 | 15 | 7.5 | 15 | 2.9
37 | 30 | 18 | 10 | 5 | 10 | 2.908
38 | 30 | 18 | 10 | 5 | 10 | 2.908
39 | 30 | 18 | 10 | 5 | 10 | 2.908
40 | 30 | 18 | 10 | 5 | 10 | 2.957
41 | 25 | 18 | 10 | 5 | 10 | 1.907
42 | 35 | 18 | 10 | 5 | 10 | (lost)
43 | 30 | 12 | 10 | 5 | 10 | 2.652
44 | 30 | 24 | 10 | 5 | 10 | 2.908
45 | 30 | 18 | 5 | 5 | 10 | 2.726
46 | 30 | 18 | 15 | 5 | 10 | 3.042
47 | 30 | 18 | 10 | 2.5 | 10 | 2.598
48 | 30 | 18 | 10 | 7.5 | 10 | 3.124
49 | 30 | 18 | 10 | 5 | 5 | 2.999
50 | 30 | 18 | 10 | 5 | 15 | 2.834
51 | 30 | 18 | 10 | 5 | 10 | 2.908
52 | 30 | 18 | 10 | 5 | 10 | 2.908
53 | 30 | 18 | 10 | 5 | 10 | 2.908
54 | 30 | 18 | 10 | 5 | 10 | 2.908
Table 2. Treatments and results of our experiment
The result for medium No. 42 was lost. Additionally, replicating the center point, that is, running several experiments with the identical center treatment, is standard practice in DOE; because of our limited time and reagents, however, we decided to run the center point only once and reuse its result.
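The structure of Table 2 (32 corner runs, 10 face-center runs and 12 center-point rows) can be reproduced in coded units with a few lines of code. This is a minimal sketch: the function name `face_centered_ccd` is hypothetical, and the run order and blocking that Minitab produced are not reproduced.

```python
from itertools import product

FACTORS = ["Temperature", "Time", "Peptone", "Yeast Extract", "NaCl"]


def face_centered_ccd(n_factors, n_center):
    """Coded runs of a face-centered CCD (alpha = 1).

    Returns 2^k corner points, 2k face-center (axial) points and
    n_center center points, in that order.
    """
    corners = [list(p) for p in product((-1, 1), repeat=n_factors)]
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level  # one factor at its low or high level, the rest at the center
            axial.append(point)
    centers = [[0] * n_factors for _ in range(n_center)]
    return corners + axial + centers


# 12 center-point rows, as counted in Table 2.
DESIGN = face_centered_ccd(len(FACTORS), 12)

if __name__ == "__main__":
    print(len(DESIGN))  # 32 + 10 + 12 = 54
```

Each coded row can be turned into a treatment by mapping -1, 0 and +1 to the low, middle and high level of the corresponding factor.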
Estimated Regression Coefficients for OD
Term | Coef | SE Coef | T | P
Constant | 2.87625 | 0.07126 | 40.361 | 0.000
Temperature | 0.60225 | 0.05210 | 11.560 | 0.000
Time | 0.28447 | 0.05072 | 5.608 | 0.000
Peptone | 0.18665 | 0.05072 | 3.680 | 0.001
Yeast Extract | 0.16776 | 0.05072 | 3.308 | 0.002
NaCl | -0.01626 | 0.05072 | -0.321 | 0.751
Temperature*Temperature | -0.54900 | 0.24585 | -2.233 | 0.033
Time*Time | -0.18725 | 0.19289 | -0.971 | 0.339
Peptone*Peptone | -0.08325 | 0.19289 | -0.432 | 0.669
Yeast Extract*Yeast Extract | -0.10625 | 0.19289 | -0.551 | 0.586
NaCl*NaCl | -0.05075 | 0.19289 | -0.263 | 0.794
Temperature*Time | -0.358338 | 0.05228 | -6.579 | 0.000
Temperature*Peptone | 0.17881 | 0.05228 | 3.420 | 0.002
Temperature*Yeast Extract | 0.04544 | 0.05228 | 0.869 | 0.391
Temperature*NaCl | 0.01550 | 0.05228 | 0.296 | 0.769
Time*Peptone | 0.02575 | 0.05228 | 0.493 | 0.626
Time*Yeast Extract | 0.06112 | 0.05228 | 1.169 | 0.261
Time*NaCl | -0.00144 | 0.05228 | -0.027 | 0.978
Peptone*Yeast Extract | -0.07469 | 0.05228 | -1.429 | 0.163
Peptone*NaCl | 0.09150 | 0.05228 | 1.750 | 0.090
Yeast Extract*NaCl | -0.04450 | 0.05228 | -0.851 | 0.401
S = 0.295758 PRESS = 7.78904
R-Sq = 92.25% R-Sq(pred) = 78.45% R-Sq(adj) = 87.41%
Table 3. Estimated Regression Coefficients for OD
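The summary statistics under Table 3 are linked by standard formulas, which gives a quick consistency check. The sketch below assumes 53 usable runs (54 minus the lost No. 42) and 20 model terms besides the constant, with no block terms; these counts are our reading of Tables 2 and 3, not values reported by Minitab.

```python
N_RUNS = 53    # assumption: 54 planned runs minus the lost result of No. 42
N_TERMS = 20   # 5 linear + 5 square + 10 interaction terms, constant excluded
S = 0.295758   # residual standard deviation reported under Table 3
PRESS = 7.78904
R_SQ = 0.9225

df_error = N_RUNS - N_TERMS - 1            # residual degrees of freedom
ss_error = S ** 2 * df_error               # residual sum of squares
ss_total = ss_error / (1 - R_SQ)           # total sum of squares implied by R-Sq

# Adjusted R-Sq penalizes the number of fitted terms.
r_sq_adj = 1 - (1 - R_SQ) * (N_RUNS - 1) / df_error
# Predicted R-Sq replaces the residual sum of squares with PRESS.
r_sq_pred = 1 - PRESS / ss_total

if __name__ == "__main__":
    print("R-Sq(adj)  = %.2f%%" % (100 * r_sq_adj))
    print("R-Sq(pred) = %.2f%%" % (100 * r_sq_pred))
```

With these inputs the adjusted value comes out at about 87.4% and the predicted value at about 78.4%, in line with the reported figures up to rounding of R-Sq.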
Suppose we redefine the factors according to the following table:
Term | Mark
OD | F
Temperature | T
Time | T
Peptone | P
Yeast Extract | Y
NaCl | C
Table 4. Mark for each term
From the regression analysis calculated by Minitab, we obtained the expression of OD:
P is the p-value of the t-test on each term, a key measure of how reliable that term of the fitted function is: it is the probability of obtaining a coefficient at least this far from zero if the term actually had no effect. We treat terms with P < 0.05 as statistically significant. The coefficient of determination (R-Sq) is 0.9225, indicating that the model explains about 92% of the variability. From the above table we can identify eight statistically significant terms:
- Constant;
- Temperature;
- Time;
- Yeast Extract;
- Peptone;
- Temperature*Temperature;
- Temperature*Time;
- Temperature*Peptone.
The linear terms predominated, except for NaCl, which substantiated our suspicion, whereas most square and interaction terms were negligible and not statistically significant. Temperature and time are the two most influential factors.
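The selection of significant terms can be reproduced directly from the P column of Table 3. The sketch below uses the conventional threshold of 0.05, which is our assumption for the cut-off; the variable names are illustrative.

```python
# (term, P) pairs copied from Table 3.
P_VALUES = [
    ("Constant", 0.000),
    ("Temperature", 0.000),
    ("Time", 0.000),
    ("Peptone", 0.001),
    ("Yeast Extract", 0.002),
    ("NaCl", 0.751),
    ("Temperature*Temperature", 0.033),
    ("Time*Time", 0.339),
    ("Peptone*Peptone", 0.669),
    ("Yeast Extract*Yeast Extract", 0.586),
    ("NaCl*NaCl", 0.794),
    ("Temperature*Time", 0.000),
    ("Temperature*Peptone", 0.002),
    ("Temperature*Yeast Extract", 0.391),
    ("Temperature*NaCl", 0.769),
    ("Time*Peptone", 0.626),
    ("Time*Yeast Extract", 0.261),
    ("Time*NaCl", 0.978),
    ("Peptone*Yeast Extract", 0.163),
    ("Peptone*NaCl", 0.090),
    ("Yeast Extract*NaCl", 0.401),
]

THRESHOLD = 0.05  # assumed significance level

significant = [term for term, p in P_VALUES if p < THRESHOLD]

if __name__ == "__main__":
    for term in significant:
        print(term)
```

At this threshold, eight terms pass: the constant, four linear terms, one square term and two interactions, both of which involve temperature.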
Our world is three-dimensional, but the full response surface lives in six dimensions (five factors plus OD), so it cannot be drawn in one piece. We can, however, fix some factors to lower the dimension, which lets us picture the full surface. Here are some surface and contour plots of the fitted model; combining them gives an impression of the whole hypersurface:
Figure 1. Surface plots of OD vs time and temperature.
Figure 2. Contour plots of OD vs time and temperature.
Figure 3. Surface plots of OD vs time and peptone.
Figure 4. Contour plots of OD vs time and peptone.
Figure 5. Surface plots of OD vs peptone and temperature.
Figure 6. Contour plots of OD vs peptone and temperature.
Figure 7. Surface plots of OD vs yeast extract and temperature.
Figure 8. Contour plots of OD vs yeast extract and temperature.
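Each plot is a two-factor slice of the fitted model. A minimal sketch of how such a slice can be computed is given below. It assumes that the coefficients of Table 3 are in coded units (-1 for the low level, +1 for the high level) and that the three factors not shown are held at their center level 0; both are assumptions, since the settings used for the figures are not stated. The function names are illustrative.

```python
# Coefficients copied from Table 3.
# Factor order: Temperature, Time, Peptone, Yeast Extract, NaCl.
B0 = 2.87625
LINEAR = [0.60225, 0.28447, 0.18665, 0.16776, -0.01626]
SQUARE = [-0.54900, -0.18725, -0.08325, -0.10625, -0.05075]
INTERACTION = {
    (0, 1): -0.358338, (0, 2): 0.17881, (0, 3): 0.04544, (0, 4): 0.01550,
    (1, 2): 0.02575, (1, 3): 0.06112, (1, 4): -0.00144,
    (2, 3): -0.07469, (2, 4): 0.09150, (3, 4): -0.04450,
}


def predict_od(x):
    """Evaluate the second-order model at a point x given in coded units."""
    y = B0
    for i in range(5):
        y += LINEAR[i] * x[i] + SQUARE[i] * x[i] ** 2
    for (i, j), b in INTERACTION.items():
        y += b * x[i] * x[j]
    return y


def slice_temperature_time(levels=(-1.0, 0.0, 1.0)):
    """Predicted OD on a temperature x time grid, other factors held at 0."""
    return [[predict_od([t, h, 0.0, 0.0, 0.0]) for h in levels] for t in levels]


if __name__ == "__main__":
    for row in slice_temperature_time():
        print(["%.3f" % value for value in row])
```

A finer grid of levels between -1 and +1 gives the values behind a surface or contour plot; slices over other factor pairs follow the same pattern.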
The following four pictures illustrate the distribution of residual error:
【Optimization】
One notable feature of CCD is that it is sequential, which is also the essence of RSM. Once the fitted function is available, the next step is to calculate its gradient and choose a small step length. Further experiments are then run from the starting point along the gradient, one step at a time, until the maximum is located. RSM is like climbing a mountain whose peak is unknown: we adjust our heading according to the terrain. The fitted surface, which is often a hypersurface in a higher-dimensional space, plays the role of the mountain, and calculating the gradient plays the role of finding our bearings.
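The gradient step described here can be sketched in a few lines. The code below assumes that the coefficients of Table 3 are in coded units, starts at the center point and uses a step length of 0.1 coded units; the step length, the clipping to the explored range [-1, 1] and the function names are illustrative choices, not part of our experiment.

```python
import math

# Coefficients copied from Table 3.
# Factor order: Temperature, Time, Peptone, Yeast Extract, NaCl.
LINEAR = [0.60225, 0.28447, 0.18665, 0.16776, -0.01626]
SQUARE = [-0.54900, -0.18725, -0.08325, -0.10625, -0.05075]
INTERACTION = {
    (0, 1): -0.358338, (0, 2): 0.17881, (0, 3): 0.04544, (0, 4): 0.01550,
    (1, 2): 0.02575, (1, 3): 0.06112, (1, 4): -0.00144,
    (2, 3): -0.07469, (2, 4): 0.09150, (3, 4): -0.04450,
}


def gradient(x):
    """Gradient of the second-order model at x (coded units)."""
    g = [LINEAR[i] + 2 * SQUARE[i] * x[i] for i in range(5)]
    for (i, j), b in INTERACTION.items():
        g[i] += b * x[j]
        g[j] += b * x[i]
    return g


def ascent_path(start, step, n_steps):
    """Follow the gradient in steps of fixed length, clipped to [-1, 1]."""
    path = [list(start)]
    x = list(start)
    for _ in range(n_steps):
        g = gradient(x)
        norm = math.sqrt(sum(c * c for c in g))
        if norm == 0:
            break  # stationary point, no direction to follow
        x = [min(1.0, max(-1.0, xi + step * gi / norm)) for xi, gi in zip(x, g)]
        path.append(x)
    return path


if __name__ == "__main__":
    for point in ascent_path([0.0] * 5, step=0.1, n_steps=5):
        print(["%.3f" % v for v in point])
```

Each point on the path is a candidate treatment for a follow-up experiment once it is converted back to natural units; in practice the model would be refitted as new results arrive.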
Unfortunately, our remaining time was not enough for further experiments. None of the other RSM studies we consulted ran a second round of experiments either, and we realized that this is perhaps the difference between scientific research and a real industrial procedure. Still, response surface analysis served as a powerful analytical tool. As a rough estimate, we take the treatment of medium No. 15 (temperature 35℃, time 12h, peptone 15 g/L, yeast extract 7.5 g/L, NaCl 15 g/L), which gave the highest measured OD (3.128), as the best condition for Bacillus subtilis.