Team:USTC-Software/Project/Examples

From 2013.igem.org

Revision as of 08:59, 26 October 2013 by Zigzag (Talk | contribs)




Examples

Testing and verification against the experimental literature

To test our software's reliability, we searched the literature extensively. It is hard to find a study that examines the effect of introducing an exogenous gene into E. coli K-12. However, our software can also simulate a change in an endogenous gene: we supply that gene's own promoter and gene sequence as the input.

In the end, we found four papers against which to test and verify our software.


Global Gene Expression Profiling in Escherichia coli K12 THE EFFECTS OF INTEGRATION HOST FACTOR

In this paper, Arfin and colleagues measured gene expression profiles in otherwise isogenic integration host factor IHF+ and IHF- strains. IHF is one of the genes in our genetic regulatory network (GRN).

By importing the promoter and gene sequence of IHF, we used our software to simulate enhanced IHF expression and compared the result with the gene expression profile reported in the paper.

Thirty genes in that profile are also in our GRN. They are listed below, taken from the genes differentially expressed between E. coli K12 strains IH100 (IHF+) and IH105 (IHF-) with a p value less than 0.0005:

| Gene | Avg IH100 | Avg IH105 | S.D. IH100 | S.D. IH105 | p value | Fold | Compare result |
|---|---|---|---|---|---|---|---|
| glnA | 2.91E-03 | 9.39E-04 | 6.80E-04 | 1.33E-04 | 1.30E-03 | -3.1 | fit |
| ilvA | 5.06E-04 | 3.42E-04 | 1.86E-05 | 2.26E-05 | 3.00E-05 | -1.48 | unfit |
| ilvE | 5.81E-04 | 3.58E-04 | 4.70E-05 | 5.77E-05 | 9.80E-04 | -1.62 | unfit |
| ilvG | 1.97E-04 | 7.67E-05 | 2.65E-05 | 2.23E-05 | 4.40E-04 | -2.57 | unfit |
| leuA | 6.99E-04 | 1.07E-03 | 9.21E-05 | 9.23E-05 | 1.30E-03 | 1.53 | fit |
| cobT | 1.00E-05 | 7.97E-05 | 7.82E-06 | 2.13E-05 | 8.50E-04 | 7.97 | unfit |
| cobU | 4.26E-05 | 1.22E-04 | 1.79E-05 | 1.95E-05 | 9.90E-04 | 2.85 | unfit |
| lacA | 5.14E-03 | 1.21E-03 | 1.54E-03 | 3.52E-04 | 2.50E-03 | -4.24 | unfit |
| lacZ | 2.10E-03 | 5.14E-04 | 3.77E-04 | 1.34E-04 | 2.20E-04 | -4.08 | unfit |
| lacY | 1.62E-03 | 4.08E-04 | 2.53E-04 | 7.95E-05 | 9.80E-05 | -3.96 | unfit |
| ompF | 7.23E-03 | 2.35E-03 | 1.90E-03 | 3.69E-04 | 2.40E-03 | -3.07 | fit |
| gltD | 9.91E-04 | 1.40E-04 | 1.88E-04 | 3.06E-05 | 1.10E-04 | -7.1 | fit |
| lpdA | 1.07E-03 | 7.60E-04 | 1.17E-04 | 7.75E-05 | 4.60E-03 | -1.41 | fit |
| rffT | 5.81E-06 | 3.65E-05 | 4.66E-06 | 2.86E-05 | 9.40E-04 | 6.28 | fit |
| ndh | 5.03E-05 | 1.46E-04 | 1.94E-05 | 3.29E-05 | 2.50E-03 | 2.9 | fit |
| cheR | 1.29E-04 | 2.68E-05 | 2.07E-04 | 1.75E-05 | 1.30E-03 | -4.82 | fit |
| sodA | 3.80E-04 | 9.74E-04 | 1.06E-04 | 6.26E-05 | 7.00E-05 | 2.57 | fit |
| sodB | 7.80E-04 | 1.91E-03 | 2.45E-04 | 4.11E-04 | 3.30E-03 | 2.44 | fit |
| cpdB | 1.92E-05 | 7.56E-05 | 1.24E-05 | 1.40E-05 | 9.50E-04 | 3.94 | fit |
| guaA | 8.25E-04 | 4.31E-04 | 5.43E-05 | 1.34E-04 | 1.60E-03 | -1.91 | unfit |
| yiaJ | 3.47E-05 | 6.15E-04 | 1.74E-05 | 1.64E-04 | 4.10E-04 | 17.74 | fit |
| dsdX | 1.05E-05 | 3.88E-05 | 5.23E-06 | 2.44E-05 | 1.70E-03 | 3.7 | fit |
| oppD | 2.32E-05 | 8.02E-05 | 1.81E-05 | 1.66E-05 | 3.50E-03 | 3.46 | fit |
| glnL | 2.41E-04 | 3.99E-05 | 4.81E-05 | 2.81E-05 | 3.60E-04 | -6.04 | fit |
| oppA | 2.54E-03 | 5.06E-03 | 1.72E-04 | 5.68E-04 | 1.40E-04 | 2 | fit |
| oppB | 1.06E-04 | 3.57E-04 | 3.06E-05 | 6.22E-05 | 3.60E-04 | 3.35 | fit |
| proV | 2.50E-05 | 5.30E-05 | 7.34E-06 | 9.57E-06 | 3.60E-03 | 2.12 | fit |
| rbsC | 4.20E-05 | 1.12E-04 | 1.47E-05 | 2.70E-05 | 3.90E-03 | 2.67 | fit |
| hdeB | 1.09E-03 | 5.51E-06 | 1.80E-04 | 3.47E-06 | 2.00E-05 | -198.5 | fit |
| yefM | 4.63E-04 | 8.12E-04 | 5.02E-05 | 6.07E-05 | 1.10E-04 | 1.75 | fit |

The Compare result column indicates whether our software's prediction agrees with the measured gene expression profile. Of these 30 genes, 21 agree with gNAP's simulation, 70% of the total.
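The agreement rate quoted above can be re-derived from the Compare result column. The following is a minimal sketch, not part of gNAP itself: the function name `agreement_rate` and the dictionary `ihf_results` are illustrative names introduced here, and the labels are copied from the IHF table.

```python
from collections import Counter

def agreement_rate(labels):
    """Return (n_fit, n_total, percent) for an iterable of 'fit'/'unfit' labels."""
    counts = Counter(labels)
    n_fit = counts["fit"]
    n_total = counts["fit"] + counts["unfit"]
    return n_fit, n_total, round(100.0 * n_fit / n_total, 1)

# Compare-result labels transcribed from the IHF table (hypothetical variable name).
ihf_results = {
    "glnA": "fit", "ilvA": "unfit", "ilvE": "unfit", "ilvG": "unfit", "leuA": "fit",
    "cobT": "unfit", "cobU": "unfit", "lacA": "unfit", "lacZ": "unfit", "lacY": "unfit",
    "ompF": "fit", "gltD": "fit", "lpdA": "fit", "rffT": "fit", "ndh": "fit",
    "cheR": "fit", "sodA": "fit", "sodB": "fit", "cpdB": "fit", "guaA": "unfit",
    "yiaJ": "fit", "dsdX": "fit", "oppD": "fit", "glnL": "fit", "oppA": "fit",
    "oppB": "fit", "proV": "fit", "rbsC": "fit", "hdeB": "fit", "yefM": "fit",
}

print(agreement_rate(ihf_results.values()))  # (21, 30, 70.0)
```

The same function can be applied to the label column of any of the other tables on this page.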


Global Gene Expression Profiling in Escherichia coli K12 THE EFFECTS OF LEUCINE-RESPONSIVE REGULATORY PROTEIN

In this paper, the researchers measured gene expression profiles in Escherichia coli K12 under the influence of the leucine-responsive regulatory protein (Lrp). Lrp is one of the genes in our genetic regulatory network (GRN).

By importing the promoter and gene sequence of Lrp, we used our software to simulate enhanced Lrp expression and compared the result with the gene expression profile reported in the paper.

Twenty-two genes in that profile are also in our GRN. They are listed below, taken from the genes differentially expressed between lrp+ and lrp- (control vs. experimental) E. coli strains with a p value less than 0.001:

| Gene name | Control mean | Experimental mean | Control S.D. | Experimental S.D. | p value | PPDE(<p) | Fold | Compare result |
|---|---|---|---|---|---|---|---|---|
| uvrA | 0.00128 | 0.00104 | 1.50E-05 | 3.37E-05 | 1.70E-05 | 0.99386 | -1.23 | unfit |
| gdhA | 9.16E-05 | 2.73E-04 | 1.52E-05 | 2.16E-05 | 2.18E-05 | 0.99329 | 2.98 | unfit |
| oppB* | 7.51E-05 | 0.00114 | 2.12E-05 | 3.79E-04 | 2.48E-05 | 0.99298 | 15.12 | fit |
| artP | 6.73E-05 | 4.23E-04 | 1.24E-05 | 1.16E-04 | 3.60E-05 | 0.992 | 6.28 | fit |
| oppC* | 2.01E-04 | 0.00108 | 2.34E-05 | 3.61E-04 | 5.44E-05 | 0.99074 | 5.38 | fit |
| gltD* | 5.28E-04 | 2.74E-05 | 1.28E-04 | 1.42E-05 | 5.87E-05 | 0.99049 | -19.27 | fit |
| oppA* | 0.00162 | 0.0316 | 7.63E-04 | 0.0103 | 8.45E-05 | 0.9892 | 19.44 | fit |
| malE* | 3.56E-04 | 2.01E-04 | 2.32E-05 | 2.17E-05 | 1.16E-04 | 0.98793 | -1.78 | fit |
| oppD* | 8.97E-05 | 6.55E-04 | 2.76E-05 | 2.05E-04 | 1.16E-04 | 0.98793 | 7.3 | fit |
| galP | 3.75E-04 | 2.11E-04 | 2.25E-05 | 2.40E-05 | 1.31E-04 | 0.9874 | -1.78 | fit |
| lysU* | 1.81E-04 | 0.00124 | 7.48E-05 | 2.78E-04 | 1.44E-04 | 0.98697 | 6.87 | fit |
| hybA | 3.53E-04 | 2.47E-04 | 2.11E-05 | 1.50E-05 | 1.49E-04 | 0.98682 | -1.43 | unfit |
| hybC | 3.54E-04 | 2.34E-04 | 2.20E-05 | 1.81E-05 | 1.61E-04 | 0.98646 | -1.51 | unfit |
| ilvG_1* | 4.21E-04 | 9.15E-04 | 7.55E-05 | 6.85E-05 | 2.54E-04 | 0.98411 | 2.17 | fit |
| phoP | 8.29E-05 | 2.10E-04 | 1.20E-05 | 4.42E-05 | 3.16E-04 | 0.98285 | 2.54 | fit |
| emrA | 3.58E-04 | 2.78E-04 | 2.43E-05 | 4.57E-06 | 3.95E-04 | 0.98147 | -1.29 | unfit |
| glpA | 1.28E-04 | 8.01E-05 | 8.54E-06 | 9.26E-06 | 4.71E-04 | 0.98029 | -1.59 | fit |
| manA | 8.71E-05 | 2.40E-04 | 2.16E-05 | 4.08E-05 | 4.80E-04 | 0.98016 | 2.75 | fit |
| amn | 4.31E-04 | 6.51E-04 | 4.47E-05 | 4.72E-05 | 6.07E-04 | 0.97848 | 1.51 | unfit |
| speB | 1.21E-04 | 3.56E-05 | 2.09E-05 | 1.08E-05 | 7.73E-04 | 0.97659 | -3.4 | fit |
| hdeA | 2.40E-04 | 8.29E-04 | 8.46E-05 | 9.90E-05 | 8.12E-04 | 0.97619 | 3.45 | fit |
| lrp* | 2.96E-04 | 1.11E-04 | 6.21E-05 | 2.22E-05 | 8.27E-04 | 0.97604 | -2.67 | unfit |

The Compare result column indicates whether our software's prediction agrees with the measured gene expression profile. Of these 22 genes, 15 agree with gNAP's simulation, 68.2% of the total.
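The Fold column in the IHF and Lrp tables appears to follow a signed-ratio convention: the ratio of the two means when expression rises, and the negative inverse ratio when it falls. This reading is inferred from the tabulated numbers, not stated on this page. A small sketch, with the hypothetical helper name `signed_fold`, reproduces two of the Lrp rows from their printed means:

```python
def signed_fold(control_mean, experimental_mean):
    """Signed fold change between two positive mean expression levels.

    Returns experimental/control when expression increases, and
    -(control/experimental) when it decreases (assumed convention).
    """
    if control_mean <= 0 or experimental_mean <= 0:
        raise ValueError("means must be positive")
    if experimental_mean >= control_mean:
        return experimental_mean / control_mean
    return -(control_mean / experimental_mean)

print(round(signed_fold(1.28e-3, 1.04e-3), 2))  # uvrA: -1.23
print(round(signed_fold(9.16e-5, 2.73e-4), 2))  # gdhA: 2.98
```

Some rows differ from the printed Fold in the last digits (for example oppB*), presumably because the tabulated means are themselves rounded.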


Global Gene Expression Profiling in Escherichia coli K12 THE EFFECTS OF OXYGEN AVAILABILITY AND FNR

In this paper, the researchers measured gene expression profiles in Escherichia coli K12 under the influence of oxygen availability and FNR. FNR is one of the genes in our genetic regulatory network (GRN). We do not model the effect of oxygen; instead, we hold the oxygen condition fixed and consider only the difference between FNR+ and FNR-.

By importing the promoter and gene sequence of FNR, we used our software to simulate enhanced FNR expression and compared the result with the gene expression profile reported in the paper.

Thirty-eight genes in that profile are also in our GRN. They are listed below, taken from the genes differentially expressed between FNR+ and FNR- E. coli strains:

| Gene | p value | PPDE(<p) | Fold | Compare result |
|---|---|---|---|---|
| trpB | 4.94E-04 | 0.99872 | 9.24 | fit |
| cyoA | 6.96E-07 | 0.99999 | 13.05 | fit |
| gpmA | 1.73E-04 | 0.99939 | 10.41 | fit |
| crr | 6.26E-05 | 0.9997 | 3.47 | unfit |
| nuoE | 1.17E-04 | 0.99953 | 4.49 | fit |
| rplM | 5.72E-06 | 0.99994 | 14.32 | fit |
| gatY | 1.92E-04 | 0.99934 | 8 | unfit |
| trmD | 1.64E-04 | 0.99941 | 2.5 | fit |
| ndh | 8.73E-06 | 0.99992 | 5.06 | fit |
| manY | 2.49E-04 | 0.99921 | 8.35 | unfit |
| manZ | 1.07E-04 | 0.99956 | 2.53 | unfit |
| ompA | 6.57E-06 | 0.99994 | 3.41 | unfit |
| rplT | 3.68E-05 | 0.99979 | 8.41 | fit |
| rpsJ | 6.23E-04 | 0.99849 | 4.27 | unfit |
| cydA | 3.15E-04 | 0.99906 | 4.73 | unfit |
| rplS | 2.70E-07 | 0.99999 | 6.24 | fit |
| ptsG | 3.41E-04 | 0.99901 | 3.17 | fit |
| oppA | 9.06E-14 | 1 | 4.53 | fit |
| talA | 7.62E-05 | 0.99965 | 2.76 | unfit |
| fdhF | 1.84E-04 | 0.99936 | -2.28 | fit |
| caiT | 2.28E-07 | 0.99999 | -6.62 | fit |
| pyrD | 6.25E-06 | 0.99994 | -13.74 | unfit |
| recC | 6.38E-05 | 0.99969 | -2.29 | fit |
| tdh | 1.06E-05 | 0.99991 | -3.01 | fit |
| araB | 1.21E-04 | 0.99951 | -3.59 | fit |
| nanT | 1.13E-05 | 0.99991 | -3.04 | fit |
| acrF | 1.53E-06 | 0.99998 | -6.83 | fit |
| pstS | 4.63E-05 | 0.99975 | -4.35 | fit |
| metL | 3.16E-06 | 0.99996 | -5.65 | fit |
| mhpF | 2.77E-05 | 0.99983 | -5.92 | fit |
| glgA | 3.51E-04 | 0.99899 | -2.33 | fit |
| glnD | 1.09E-05 | 0.99991 | -6.32 | unfit |
| uraA | 2.11E-04 | 0.99929 | -2.48 | fit |
| speC | 2.49E-06 | 0.99997 | -3.54 | unfit |
| fliP | 6.45E-04 | 0.99846 | -3.58 | fit |
| dinG | 4.66E-05 | 0.99975 | -3.14 | unfit |
| proW | 3.73E-06 | 0.99996 | -4.97 | unfit |
| sbcC | 5.66E-04 | 0.99859 | -3.06 | unfit |

The Compare result column indicates whether our software's prediction agrees with the measured gene expression profile. Of these 38 genes, 25 agree with gNAP's simulation, 65.8% of the total.


Global Gene Expression Profiling in Escherichia coli K12 EFFECTS OF OXYGEN AVAILABILITY AND ArcA

In this paper, the researchers measured gene expression profiles in Escherichia coli K12 under the influence of oxygen availability and arcA. arcA is one of the genes in our genetic regulatory network (GRN). We do not model the effect of oxygen; instead, we hold the oxygen condition fixed and consider only the difference between arcA+ and arcA-.

By importing the promoter and gene sequence of arcA, we used our software to simulate enhanced arcA expression and compared the result with the gene expression profile reported in the paper.

Forty-three genes in that profile are also in our GRN. They are listed below, taken from the genes differentially expressed between arcA+ and arcA- E. coli strains:

| Gene name (NIH) and b no. | p value | PPDE(<p) | Fold | Compare result |
|---|---|---|---|---|
| talA(b2464) | 2.21E-04 | 0.99933 | 3.46 | unfit |
| crr(b2417) | 2.14E-09 | 1 | 3.63 | fit |
| oppA(b1243) | 8.78E-05 | 0.99966 | 3.79 | fit |
| rpsJ(b3321) | 6.28E-06 | 0.99995 | 3.8 | fit |
| ompA(b0957) | 1.48E-06 | 0.99998 | 4.02 | unfit |
| rbsD(b3748) | 3.85E-08 | 1 | 4.83 | unfit |
| rplS(b2606) | 3.02E-05 | 0.99984 | 5.81 | fit |
| nuoE(b2285) | 2.34E-09 | 1 | 8.83 | unfit |
| rplT(b1716) | 4.41E-06 | 0.99996 | 9.01 | fit |
| gatY(b2096) | 3.49E-07 | 0.99999 | 10.72 | fit |
| sdhA(b0723) | 2.06E-07 | 1 | 14.54 | unfit |
| gpmA(b0755) | 1.27E-09 | 1 | 16.95 | fit |
| rplM(b3231) | 3.40E-07 | 0.99999 | 17.05 | fit |
| mdh(b3236) | 4.00E-04 | 0.99896 | 17.95 | unfit |
| nuoB(b2287) | 1.32E-04 | 0.99954 | 19.34 | unfit |
| trpB(b1261) | 3.12E-10 | 1 | 19.97 | fit |
| cyoA(b0432) | 9.70E-10 | 1 | 23.3 | unfit |
| sdhB(b0724) | 1.25E-05 | 0.99992 | 27.87 | unfit |
| sucD(b0729) | 2.42E-05 | 0.99987 | 86.14 | unfit |
| gltA(b0720) | 3.09E-05 | 0.99984 | 107.01 | fit |
| pyrD(b0945) | 5.36E-06 | 0.99996 | -18.91 | unfit |
| dinG(b0799) | 3.20E-04 | 0.99912 | -11.79 | unfit |
| gadB(b1493) | 1.87E-08 | 1 | -11.23 | fit |
| gadA(b3517) | 5.14E-07 | 0.99999 | -9.44 | fit |
| glnD(b0167) | 8.77E-05 | 0.99966 | -8.58 | unfit |
| aroM(b0390) | 1.81E-04 | 0.99942 | -7.13 | unfit |
| pnuC(b0751) | 1.71E-05 | 0.9999 | -6.22 | fit |
| gadX(b3516) | 2.32E-06 | 0.99998 | -6.11 | fit |
| sbcC(b0397) | 3.89E-04 | 0.99898 | -5.59 | unfit |
| xylR(b3569) | 3.55E-04 | 0.99905 | -5.11 | fit |
| gadW(b3515) | 2.41E-05 | 0.99987 | -4.63 | fit |
| recC(b2822) | 2.98E-05 | 0.99985 | -4.62 | fit |
| appC(b0978) | 4.83E-09 | 1 | -4.01 | fit |
| speC(b2965) | 2.86E-04 | 0.99919 | -2.97 | unfit |
| glgA(b3429) | 1.74E-04 | 0.99944 | -2.9 | fit |
| nanT(b3224) | 7.22E-05 | 0.99971 | -2.46 | fit |
| appB(b0979) | 3.31E-09 | 1 | -2.43 | fit |
| rhaA(b3903) | 1.48E-04 | 0.9995 | -2.41 | fit |
| hycD(b2722) | 1.50E-05 | 0.99991 | -2.21 | fit |
| hdeA(b3510) | 4.82E-09 | 1 | -2.83 | fit |
| hyaB(b0973) | 7.30E-08 | 1 | -2.83 | fit |
| uraA(b2497) | 7.91E-05 | 0.99968 | -2.6 | fit |
| glgC(b3430) | 4.69E-04 | 0.99883 | -2.15 | fit |

The Compare result column indicates whether our software's prediction agrees with the measured gene expression profile. Of these 43 genes, 27 agree with gNAP's simulation, 62.8% of the total.

Consistency

The consistency of the program has also been tested. We inserted a gene identical to one already in the network and compared the regulations predicted by the program with the original regulations. Without filtering out random similarities, the actual regulations were submerged in the network noise.

Figure 1. The green line represents regulating values.
The blue line represents regulated values.

Figure 2. Predicted regulations without filtering.
The actual regulations are submerged in the noise.

With random similarities filtered out, all of the original regulations were recovered. The result shows that the program is consistent with the original network.


Figure 3. The signal-to-noise ratio (SNR) is better.
The actual regulations are picked out.
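This page does not say how the random similarities were filtered. Purely as an illustration of the idea, the sketch below keeps only predicted regulation values that stand well clear of the background, using a median plus k times MAD (median absolute deviation) threshold. The function name `filter_regulations`, the threshold rule, the value of `k`, and the example scores are all assumptions made here, not gNAP's actual method or data.

```python
import statistics

def filter_regulations(scores, k=3.0):
    """Keep entries whose |score| exceeds median + k * MAD of all |scores|.

    `scores` maps a gene name to a predicted regulation value.
    The threshold rule is an illustrative assumption, not gNAP's filter.
    """
    magnitudes = [abs(s) for s in scores.values()]
    med = statistics.median(magnitudes)
    mad = statistics.median([abs(m - med) for m in magnitudes])
    threshold = med + k * mad
    return {gene: s for gene, s in scores.items() if abs(s) > threshold}

# Made-up example: two strong regulations among weak random similarities.
predicted = {
    "geneA": 0.9, "geneB": -0.8,
    "n1": 0.05, "n2": -0.04, "n3": 0.06, "n4": 0.03, "n5": -0.05, "n6": 0.04,
}
kept = filter_regulations(predicted)
print(sorted(kept))  # ['geneA', 'geneB']
```

A robust statistic such as the MAD is used in this sketch so that the few strong regulations do not inflate the noise estimate they are being compared against.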

Summary

In the first two papers, where oxygen was not a factor, the average fitness is 69.1%. In the other two papers, the average fitness is 64.3%. We suspect that the oxygen conditions affect the expression of each gene, which lowers the agreement. Gene regulatory network analysis is weak at capturing changes in the environment.

Overall, the average fitness across the four papers is 66.7%. We therefore conclude that our software can, to some extent, simulate the impact of a new gene.
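The averages in this summary are unweighted means of the per-paper agreement rates reported above. A short sketch of that arithmetic follows; the names `rates` and `mean_rate` are introduced here only for illustration.

```python
# Agreement rates (percent) as reported for each of the four papers.
rates = {"IHF": 70.0, "Lrp": 68.2, "FNR": 65.8, "ArcA": 62.8}

def mean_rate(names):
    """Unweighted mean of the named papers' rates, rounded to one decimal."""
    return round(sum(rates[n] for n in names) / len(names), 1)

without_oxygen = mean_rate(["IHF", "Lrp"])   # papers without an oxygen factor
with_oxygen = mean_rate(["FNR", "ArcA"])     # papers that also vary oxygen
overall = mean_rate(list(rates))

print(without_oxygen, with_oxygen, overall)  # 69.1 64.3 66.7
```

Because the means are unweighted, each paper counts equally regardless of how many genes it contributes.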

References

Arfin S M, Long A D, Ito E T, et al. Global Gene Expression Profiling in Escherichia coli K12 THE EFFECTS OF INTEGRATION HOST FACTOR[J]. Journal of Biological Chemistry, 2000, 275(38): 29672-29684.

Hung S, Baldi P, Hatfield G W. Global Gene Expression Profiling in Escherichia coli K12 THE EFFECTS OF LEUCINE-RESPONSIVE REGULATORY PROTEIN[J]. Journal of Biological Chemistry, 2002, 277(43): 40309-40323.

Salmon K, Hung S, Mekjian K, et al. Global Gene Expression Profiling in Escherichia coli K12 THE EFFECTS OF OXYGEN AVAILABILITY AND FNR[J]. Journal of Biological Chemistry, 2003, 278(32): 29837-29855.

Salmon K A, Hung S, Steffen N R, et al. Global Gene Expression Profiling in Escherichia coli K12 EFFECTS OF OXYGEN AVAILABILITY AND ArcA[J]. Journal of Biological Chemistry, 2005, 280(15): 15084-15096.