Team:USTC CHINA/Modeling/B.SubtilisCulture

From 2013.igem.org

Why Do We Design This Experiment

B.subtilis has been widely used as an engineered bacterium, especially in the food and pharmaceutical industries, because of its safety and excellent secretion capacity. After comparing the characteristics of different mutants, we selected the B.subtilis WB800N mutant as our engineered bacterium and surveyed the literature for the optimal conditions for our experiment. To our disappointment, very few experiments have been done on the WB800N mutant, and most optimization experiments on B.subtilis focus solely on maximizing the production of specific proteins. Considering the final goal of our project, we had to design this experiment ourselves to find the best culture conditions for B.subtilis WB800N.

Methodology

Any optimization design inevitably draws on Design of Experiments (DOE), which comprises several families of designs. Among them, Orthogonal Design and Response Surface Methodology (RSM) are the two most common in biological experiments. Generally, Orthogonal Design takes less time and is more widely used, yet it is less rigorous mathematically and sometimes overlooks interactions and aliasing among factors. In contrast, RSM is built on rigorous mathematical theory and excels at data analysis. Having weighed the features of the two methods carefully, we chose RSM.

Screening Factors

The first step of any DOE method is to investigate all variables that affect the results and select the controllable factors for the experiment. In this experiment, the factors fall into two kinds: environmental factors, such as temperature and the rotation speed of the shaker, and the components of the medium. In the papers we consulted on optimizing B.subtilis culture, the rotation speed of the shakers ranged from 100 r/min to 250 r/min and generally played only a minor role. Additionally, our lab has only two shakers. While we can place twenty different media in one shaker at a time, every change of speed requires a separate shaker run, which takes considerably longer. Thus, we fixed the rotation speed at 200 r/min. Temperature and inoculation time, however, are both vital environmental factors whose effects cannot be ignored.
Inoculation amount and filling volume are two further factors that affect the results only slightly. We fixed them at 5 percent and 30 mL/500 mL respectively, following earlier experiments.
A typical medium consists of a carbon source, a nitrogen source and inorganic salts, all of which are essential for the normal metabolism of the engineered bacteria. For convenience, we referred to the composition of typical LB medium and settled on three independent medium factors: peptone, yeast extract and sodium chloride (NaCl). Peptone provides nitrogen and carbon for the colonies, while yeast extract contains most of the required inorganic salts, so we did not list any inorganic salt other than NaCl. We did not know why NaCl is listed separately, and we doubted that NaCl had much influence, since yeast extract already contains sodium.
Thus, we had five independent factors: temperature, inoculation time, peptone, yeast extract and NaCl. We consulted further papers to define their ranges. The following table displays their levels; peptone, yeast extract and NaCl are given in g/L:

Factor Low High
Temperature 25℃ 35℃
Time 12h 24h
Peptone 5 15
Yeast Extract 2.5 7.5
NaCl 5 15

Table 1. Factors and their values of our design

Designs & Results

RSM designs fall into two main types: Central Composite Design (CCD) and Box-Behnken Design. For a given number of factors, Box-Behnken Design generally needs fewer runs, but Central Composite Design is often recommended when sequential experimentation is required, because it incorporates information from a properly planned factorial experiment. In our experiment, time is more precious than reagents, and since time itself is also an independent factor, Box-Behnken Design would not have saved any time. Thus we selected CCD.
CCD can in turn be classified into three variants: Central Composite Circumscribed design (CCC), Central Composite Inscribed design (CCI) and Central Composite Face-centered design (CCF). In CCC the value of α depends on the number of factors, whereas in CCF α is fixed at 1, so CCF is not rotatable. Rotatability makes CCC mathematically preferable, but α for a five-factor CCC is over 2. In other words, had we adopted CCC, some treatments would have been absurd, with negative concentrations of certain medium components. Had we narrowed the ranges so that the concentration of every medium component stayed positive in every treatment, the ranges would have been too narrow to yield convincing results. Therefore, we finally selected CCF.
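A quick calculation illustrates the problem with CCC. The sketch below assumes a full 2^5 factorial core, for which a rotatable design uses α = (2^5)^(1/4), and assumes that the low and high levels of Table 1 sit at the coded levels -1 and +1. The variable names are illustrative only.

```python
# Rotatable CCC: alpha = (number of factorial points) ** (1/4).
# Assumption: a full 2**k factorial core with k = 5 factors.
k = 5
alpha = (2 ** k) ** 0.25

# (low, high) levels of the medium components from Table 1, in g/L.
ranges = {
    "Peptone": (5.0, 15.0),
    "Yeast Extract": (2.5, 7.5),
    "NaCl": (5.0, 15.0),
}

# Lowest axial level of each component: center - alpha * half-range.
axial_low = {}
for name, (low, high) in ranges.items():
    center = (low + high) / 2
    half_range = (high - low) / 2
    axial_low[name] = center - alpha * half_range

print(f"alpha = {alpha:.3f}")
for name, value in axial_low.items():
    print(f"{name}: lowest axial level = {value:.2f} g/L")
```

Under these assumptions every medium component would need a negative concentration at its lower axial point, which is the absurd treatment described above.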
We conducted our experiments according to the following table, which was generated with Minitab; the results, measured as OD values, are also included:

No.  Temperature (℃)  Time (h)  Peptone (g/L)  Yeast extract (g/L)  NaCl (g/L)  OD
1    25               12        5              2.5                  5           0.511
2    35               24        5              2.5                  5           1.625
3    35               12        15             2.5                  5           2.783
4    25               24        15             2.5                  5           1.74
5    35               12        5              7.5                  5           2.317
6    25               24        5              7.5                  5           2.4
7    25               12        15             7.5                  5           0.912
8    35               24        15             7.5                  5           3
9    35               12        5              2.5                  15          2.169
10   25               24        5              2.5                  15          1.77
11   25               12        15             2.5                  15          0.371
12   35               24        15             2.5                  15          2.7
13   25               12        5              7.5                  15          0.754
14   35               24        5              7.5                  15          2.58
15   35               12        15             7.5                  15          3.128
16   25               24        15             7.5                  15          2.38
17   30               18        10             5                    10          2.908
18   30               18        10             5                    10          2.908
19   30               18        10             5                    10          1.75
20   30               18        10             5                    10          2.908
21   35               12        5              2.5                  5           2.082
22   25               24        5              2.5                  5           1.75
23   25               12        15             2.5                  5           0.508
24   35               24        15             2.5                  5           2.6
25   25               12        5              7.5                  5           0.989
26   35               24        5              7.5                  5           2.8
27   35               12        15             7.5                  5           2.782
28   25               24        15             7.5                  5           1.7
29   25               12        5              2.5                  15          0.508
30   35               24        5              2.5                  15          1.338
31   35               12        15             2.5                  15          3.061
32   25               24        15             2.5                  15          2.2
33   35               12        5              7.5                  15          2.167
34   25               24        5              7.5                  15          1.53
35   25               12        15             7.5                  15          0.555
36   35               24        15             7.5                  15          2.9
37   30               18        10             5                    10          2.908
38   30               18        10             5                    10          2.908
39   30               18        10             5                    10          2.908
40   30               18        10             5                    10          2.957
41   25               18        10             5                    10          1.907
42   35               18        10             5                    10          -
43   30               12        10             5                    10          2.652
44   30               24        10             5                    10          2.908
45   30               18        5              5                    10          2.726
46   30               18        15             5                    10          3.042
47   30               18        10             2.5                  10          2.598
48   30               18        10             7.5                  10          3.124
49   30               18        10             5                    5           2.999
50   30               18        10             5                    15          2.834
51   30               18        10             5                    10          2.908
52   30               18        10             5                    10          2.908
53   30               18        10             5                    10          2.908
54   30               18        10             5                    10          2.908

Table 2. Treatments and results of our experiment
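The treatments in Table 2 follow the usual structure of a face-centered design: factorial corners, axial points on the faces, and the center point. As a minimal sketch (not the Minitab procedure itself), the distinct treatments can be enumerated from the ranges in Table 1; the function and variable names here are hypothetical.

```python
from itertools import product

# (low, high) levels from Table 1; medium components in g/L.
factors = {
    "Temperature": (25.0, 35.0),
    "Time": (12.0, 24.0),
    "Peptone": (5.0, 15.0),
    "Yeast Extract": (2.5, 7.5),
    "NaCl": (5.0, 15.0),
}

def decode(coded_point):
    """Map coded levels (-1, 0, +1) to actual factor values."""
    actual = []
    for level, (low, high) in zip(coded_point, factors.values()):
        center = (low + high) / 2
        half_range = (high - low) / 2
        actual.append(center + level * half_range)
    return tuple(actual)

k = len(factors)

# Factorial corners: every combination of low (-1) and high (+1).
factorial_points = list(product((-1, 1), repeat=k))

# Axial points with alpha = 1: one factor at -1 or +1, the rest at 0.
axial_points = []
for i in range(k):
    for level in (-1, 1):
        point = [0] * k
        point[i] = level
        axial_points.append(tuple(point))

center_point = (0,) * k

distinct_points = factorial_points + axial_points + [center_point]
treatments = [decode(p) for p in distinct_points]

print(len(factorial_points), "factorial,", len(axial_points), "axial, 1 center")
print("center treatment:", decode(center_point))
```

The 54 runs of Table 2 consist of these 32 factorial and 10 axial treatments plus 12 runs at the center point.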


The result of the No. 42 medium was lost for unknown reasons. Additionally, running multiple center points, that is, conducting several experiments at the center point with identical treatments, is common practice in DOE. Because of our limited time and reagents, however, we did not repeat the center-point experiment every time and instead reused its result.
Estimated Regression Coefficients for OD
S = 0.295758 PRESS = 7.78904
R-Sq = 92.25% R-Sq(pred) = 78.45% R-Sq(adj) = 87.41%
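The summary statistics above are related by standard formulas, which can be checked with a few lines of arithmetic. This sketch assumes 53 usable runs (54 minus the lost No. 42) and 20 model terms besides the constant (5 linear, 5 square, 10 interaction); neither count is stated explicitly in the Minitab output.

```python
# Summary values reported by Minitab above.
S = 0.295758      # residual standard deviation
PRESS = 7.78904   # prediction error sum of squares
r_sq = 0.9225     # R-Sq

# Assumptions: 53 usable runs and 20 model terms besides the constant.
n_runs = 53
n_terms = 20
df_error = n_runs - n_terms - 1

# Adjusted R-squared penalizes the number of fitted terms.
r_sq_adj = 1 - (1 - r_sq) * (n_runs - 1) / df_error

# Recover the sums of squares from S and R-Sq, then predicted R-squared.
ss_error = S ** 2 * df_error
ss_total = ss_error / (1 - r_sq)
r_sq_pred = 1 - PRESS / ss_total

print(f"R-Sq(adj)  = {r_sq_adj:.2%}")
print(f"R-Sq(pred) = {r_sq_pred:.2%}")
```

Under these assumptions both values come out close to the reported 87.41% and 78.45%; small differences are expected because R-Sq is rounded to four digits.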


Term                          Coef       SE Coef  T        P
Constant                      2.87625    0.07126  40.361   0.000
Temperature                   0.60225    0.05210  11.560   0.000
Time                          0.28447    0.05072  5.608    0.000
Peptone                       0.18665    0.05072  3.680    0.001
Yeast Extract                 0.16776    0.05072  3.308    0.002
NaCl                          -0.01626   0.05072  -0.321   0.751
Temperature*Temperature       -0.54900   0.24585  -2.233   0.033
Time*Time                     -0.18725   0.19289  -0.971   0.339
Peptone*Peptone               -0.08325   0.19289  -0.432   0.669
Yeast Extract*Yeast Extract   -0.10625   0.19289  -0.551   0.586
NaCl*NaCl                     -0.05075   0.19289  -0.263   0.794
Temperature*Time              -0.358338  0.05228  -6.579   0.000
Temperature*Peptone           0.17881    0.05228  3.420    0.002
Temperature*Yeast Extract     0.04544    0.05228  0.869    0.391
Temperature*NaCl              0.01550    0.05228  0.296    0.769
Time*Peptone                  0.02575    0.05228  0.493    0.626
Time*Yeast Extract            0.06112    0.05228  1.169    0.261
Time*NaCl                     -0.00144   0.05228  -0.027   0.978
Peptone*Yeast Extract         -0.07469   0.05228  -1.429   0.163
Peptone*NaCl                  0.09150    0.05228  1.750    0.090
Yeast Extract*NaCl            -0.04450   0.05228  -0.851   0.401

Table 3. Estimated Regression Coefficients for OD
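To show how the coefficients in Table 3 combine into a prediction, the sketch below evaluates the second-order model for a given treatment. It assumes the coefficients are in coded units, with -1 and +1 corresponding to the low and high levels of Table 1; the page does not say which units Minitab reported, so this is an assumption. The coefficients are copied as printed, and the function names are hypothetical.

```python
# Factor order: Temperature, Time, Peptone, Yeast Extract, NaCl.
LOW_HIGH = [(25, 35), (12, 24), (5, 15), (2.5, 7.5), (5, 15)]

# Coefficients copied from Table 3 (assumed to be in coded units).
CONSTANT = 2.87625
LINEAR = [0.60225, 0.28447, 0.18665, 0.16776, -0.01626]
SQUARE = [-0.54900, -0.18725, -0.08325, -0.10625, -0.05075]
# Interaction coefficients keyed by factor index pairs (i < j).
INTERACTION = {
    (0, 1): -0.358338, (0, 2): 0.17881, (0, 3): 0.04544, (0, 4): 0.01550,
    (1, 2): 0.02575, (1, 3): 0.06112, (1, 4): -0.00144,
    (2, 3): -0.07469, (2, 4): 0.09150, (3, 4): -0.04450,
}

def to_coded(actual):
    """Convert actual factor values to coded units in [-1, +1]."""
    return [(x - (lo + hi) / 2) / ((hi - lo) / 2)
            for x, (lo, hi) in zip(actual, LOW_HIGH)]

def predict_od(actual):
    """Evaluate the fitted quadratic model at one treatment."""
    x = to_coded(actual)
    od = CONSTANT
    od += sum(b * xi for b, xi in zip(LINEAR, x))
    od += sum(b * xi ** 2 for b, xi in zip(SQUARE, x))
    od += sum(b * x[i] * x[j] for (i, j), b in INTERACTION.items())
    return od

center = [30, 18, 10, 5, 10]
no_15 = [35, 12, 15, 7.5, 15]
print("center point:", round(predict_od(center), 3))
print("No. 15 medium:", round(predict_od(no_15), 3))
```

At the center point all coded values are zero, so the prediction reduces to the constant term.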


Suppose we redefine the factors according to the following table:

Term           Mark
OD             F
Temperature    T
Time           T
Peptone        P
Yeast Extract  Y
NaCl           C


Table 4. Mark for each term

According to the ANOVA calculated by Minitab, we obtained the expression for OD:

P is the p-value of each term, a key criterion for judging the reliability of the fitted function. For example, P = 0.05 means that, if the term truly had no effect, a coefficient at least this large would still arise by chance with a probability of 5%. The coefficient of determination (R²) was calculated to be 0.9225, indicating that the model explains about 92% of the variability. From the above table we can identify eight statistically significant and reliable terms:

  • Constant;
  • Temperature;
  • Time;
  • Yeast Extract;
  • Peptone;
  • Temperature*Temperature;
  • Temperature*Time;
  • Temperature*Peptone;
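The selection of significant terms can be reproduced by filtering the P column of Table 3. The sketch below assumes the conventional 0.05 threshold, which the page does not state explicitly.

```python
# P-values copied from Table 3.
P_VALUES = {
    "Constant": 0.000,
    "Temperature": 0.000,
    "Time": 0.000,
    "Peptone": 0.001,
    "Yeast Extract": 0.002,
    "NaCl": 0.751,
    "Temperature*Temperature": 0.033,
    "Time*Time": 0.339,
    "Peptone*Peptone": 0.669,
    "Yeast Extract*Yeast Extract": 0.586,
    "NaCl*NaCl": 0.794,
    "Temperature*Time": 0.000,
    "Temperature*Peptone": 0.002,
    "Temperature*Yeast Extract": 0.391,
    "Temperature*NaCl": 0.769,
    "Time*Peptone": 0.626,
    "Time*Yeast Extract": 0.261,
    "Time*NaCl": 0.978,
    "Peptone*Yeast Extract": 0.163,
    "Peptone*NaCl": 0.090,
    "Yeast Extract*NaCl": 0.401,
}

# Assumed significance threshold.
THRESHOLD = 0.05

significant = [term for term, p in P_VALUES.items() if p < THRESHOLD]
for term in significant:
    print(term)
```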

The linear terms predominated, except for NaCl, which confirmed our suspicion, whereas most square and interaction terms were negligible and statistically unreliable. Temperature and time are the two most influential factors.
As the full response surface is six-dimensional, it cannot be drawn in our three-dimensional world. We therefore fixed some factors to reduce the dimensions and drew contour and surface plots; the full hypersurface can be pictured by combining these plots:


Figure 1. Surface plots of OD vs time and temperature.





Figure 2. Contour plots of OD vs time and temperature.





Figure 3. Surface plots of OD vs time and peptone.





Figure 4. Contour plots of OD vs time and peptone.





Figure 5. Surface plots of OD vs peptone and temperature.





Figure 6. Contour plots of OD vs peptone and temperature.





Figure 7. Contour plots of OD vs yeast extract and temperature.



Figure 8. Contour plots of OD vs yeast extract and temperature.




The following four pictures illustrate the distribution of residual error:


Figure 9. Residual error vs order





Figure 10. Histogram of residual error





Figure 11. Residual error vs fits





Figure 12. Normal probability plot of residual error




Optimization

One remarkable characteristic of CCD is that it is sequential, which is also the essence of RSM. Once the fitted function has been obtained, the next step is to calculate its gradient and choose a small step length. Further experiments are then conducted from the starting point along the gradient, one step at a time, until the treatment that gives the maximum is located. RSM is like climbing a mountain whose peak is unknown: we adjust our direction according to the terrain. The fitted surface, often a hypersurface in a higher-dimensional space, is the mountain without a visible peak, and calculating the gradient is how we find our bearings.
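The gradient step described above can be sketched as follows. This is an illustration only, not a procedure we carried out: it assumes the coefficients of Table 3 are in coded units, starts from the center point, and uses an arbitrary step length of 0.1 coded units. Each point on the path would still have to be converted to actual settings and tested experimentally.

```python
import math

# Factor order: Temperature, Time, Peptone, Yeast Extract, NaCl.
# Coefficients copied from Table 3 (assumed to be in coded units).
LINEAR = [0.60225, 0.28447, 0.18665, 0.16776, -0.01626]
SQUARE = [-0.54900, -0.18725, -0.08325, -0.10625, -0.05075]
INTERACTION = {
    (0, 1): -0.358338, (0, 2): 0.17881, (0, 3): 0.04544, (0, 4): 0.01550,
    (1, 2): 0.02575, (1, 3): 0.06112, (1, 4): -0.00144,
    (2, 3): -0.07469, (2, 4): 0.09150, (3, 4): -0.04450,
}

def gradient(x):
    """Gradient of the quadratic model: b_i + 2*b_ii*x_i + sum_j b_ij*x_j."""
    grad = [b + 2 * q * xi for b, q, xi in zip(LINEAR, SQUARE, x)]
    for (i, j), b in INTERACTION.items():
        grad[i] += b * x[j]
        grad[j] += b * x[i]
    return grad

def ascent_path(start, step=0.1, n_steps=5):
    """Follow the normalized gradient for a few fixed-length steps."""
    x = list(start)
    path = [x]
    for _ in range(n_steps):
        g = gradient(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm == 0:
            break  # stationary point: no uphill direction
        x = [xi + step * gi / norm for xi, gi in zip(x, g)]
        path.append(x)
    return path

path = ascent_path([0.0, 0.0, 0.0, 0.0, 0.0])
for point in path:
    print([round(v, 3) for v in point])
```

At the center point the gradient equals the linear coefficients, so the first step moves mostly along temperature and time.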
Unfortunately, our remaining time did not allow further experiments. Among the other studies using RSM that we consulted, none ran a second round of experiments, and we realized that this is perhaps the difference between scientific research and a real industrial procedure. Even so, response surface analysis still served as a powerful tool for analyzing our data. As a rough estimate, we can take the treatment of the No. 15 medium (Temperature 35℃, Time 12h, Peptone 15 g/L, Yeast Extract 7.5 g/L, NaCl 15 g/L) as the best condition for B.subtilis.