Team:Wageningen UR/Flux balance analysis
From 2013.igem.org
Modeling
“When I came out of school I didn't even think that modeling was a job.”
Introduction
Developing and investigating mathematical models of metabolic processes is one of the primary challenges in systems biology. As a proof of concept of our modular domain approach, lovastatin has been chosen and its production in several Aspergilli will be modeled. To investigate its potential, lovastatin production in Aspergillus niger will be compared to that in Aspergillus nidulans, Aspergillus oryzae and Aspergillus terreus.
Rationale
Producing a compound in a novel host first requires investigating whether this is possible. Since the compounds required for biosynthesis of lovastatin occur naturally in metabolic routes such as the citric acid cycle and fatty acid synthesis pathways, all of the Aspergilli that are modeled have the potential to produce lovastatin once the required genes are introduced. Analysis and comparison of the different models gives broad insight into efficient biosynthesis strategies.
Aim
• Model and balance the lovastatin pathway
• Expand the metabolic models of A. niger, A. nidulans and A. oryzae with the lovastatin biosynthesis pathway
• Perform flux balance analysis to analyze the flux of lovastatin and compare this with the model of A. terreus
• Perform flux variability analysis to determine the ranges of fluxes that correspond to an optimal solution determined through flux balance analysis
• Change media composition in the model to investigate its effect on lovastatin production
• Use OptKnock to determine gene deletion strategies leading to increased production of lovastatin
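For reference, the flux balance and flux variability problems named in the aims can be written in their standard textbook form. The symbols are assumptions introduced here and do not come from the models themselves: S is the stoichiometric matrix, v the vector of reaction fluxes, c the objective vector (for example a single 1 at the lovastatin exchange reaction), and v^lb, v^ub the flux bounds.

```latex
% Flux balance analysis: maximise the objective over all steady-state fluxes
\[
\begin{aligned}
Z^{*} = \max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S v = 0, \\
& v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}
\end{aligned}
\]

% Flux variability analysis: range of each flux v_j among the optimal solutions
\[
\begin{aligned}
v_j^{\min} = \min_{v}\; v_j, \qquad v_j^{\max} = \max_{v}\; v_j \\
\text{s.t.}\quad S v = 0, \quad v^{\mathrm{lb}} \le v \le v^{\mathrm{ub}}, \quad c^{\top} v = Z^{*}
\end{aligned}
\]
```

The second problem is solved twice per reaction, so its cost grows with the number of reactions in the model.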
Approach
First we need to make the models consistent, meaning that similar compounds and reactions must have the same names in the different models. The models come from different sources, and even models that originate from the same research group contain differences that complicate a comparative analysis. After having generated a generic namespace for both reactions and metabolites, we will analyze the metabolic flux towards lovastatin and the corresponding state space. Changing the medium conditions will allow us to gain insight into the effect of medium composition and to deduce efficient production media. Finally, we will use a computationally intensive script to determine which gene deletion strategies are most favorable.
Research methods
First of all we extract the models from their respective sources. Since the A. terreus model is not in xml format, we need to create this ourselves. To do so, and to make the models consistent, we make use of MetanetX, an initiative that tries to standardise metabolic models. The models that we investigate are those of A. terreus, A. niger, A. nidulans and A. oryzae. After we have obtained all the models in xml format we make use of the COBRA toolbox within MATLAB. The COBRA toolbox makes it easy to load a metabolic model in the Systems Biology Markup Language (SBML) and to perform the calculations in MATLAB. Once a model has been expanded, flux balance analysis allows for a genome-scale approach. OptKnock can be used to determine which gene knockouts should increase the metabolic flux towards lovastatin.
Lovastatin pathway
Lovastatin biosynthesis starts with the synthesis of dihydromonacolin L by a large iterative polyketide synthase (LovB) and an enoyl reductase (LovC). The iterative polyketide synthase starts with the condensation of acetyl-CoA and malonyl-CoA in the first step, after which another malonyl group is added at each subsequent step. At one point a methyl group is added, which is derived from S-adenosyl-methionine. Together, LovB and LovC catalyze 18 reactions to form this intermediate of lovastatin. A Diels-Alder cyclization also occurs during the process, though this reaction occurs spontaneously. To model the biosynthesis of this intermediate, several steps have been lumped into a total of 8 reactions, because the exact details of the intermediates formed are unknown and this is the highest level of detail possible.
In parallel with this process, another very similar polyketide synthase, LovF, synthesizes the intermediate 2-methylbutyryl-CoA from the same starting substrates, acetyl-CoA and malonyl-CoA.
In the final step of the process, LovD joins monacolin J and 2-methylbutyryl-CoA to form the product lovastatin.
However, in order to balance the pathway more detail is required. It turns out that LovB requires the co-factor NADPH, a non-trivial detail that was found via UniProt.
Results
In order to analyze the differences and compare the different models, it is best to use a standardized namespace. Although MetanetX attempts to provide one, it is not perfect, and therefore one must carefully check the changes imposed by the system when uploading a model. In addition, the MATLAB script is written to be generic, so that it keeps functioning properly when the models are modified.
Since the files are in SBML format (.xml), converting them to MATLAB (.mat) files allows for much faster loading and saving of the models. To avoid overwriting the previous model when saving, a prefix is added to the file name.
%choose the set of paths that matches the file format in use
modelPath = {'/Users/.../model1.xml','/Users/.../model2.xml','/Users/.../model3.xml'};
%modelPath = {'/Users/.../model1.mat','/Users/.../model2.mat','/Users/.../model3.mat'};
%modelPath = {'/Users/.../model1.sbml2.xml','/Users/.../model2.sbml2.xml','/Users/.../model3.sbml2.xml'};
for i = 1:numel(modelPath)
model = readCbModel(modelPath{i}); % .xml files
%load(modelPath{i}); % .mat files
%prefix the file name with 'new_' so that the previous model is not overwritten
modelNames = regexprep(modelPath{i},{'\.mat','.*/([^/]*)$'},{'.xml','new_$1'});
writeCbModel(model,'sbml',modelNames); % .xml files
%save(modelNames,'model'); % .mat files
end
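Note that the regular expression above replaces the whole path up to the last slash, so the renamed file ends up in the current working folder. If the new file should stay next to the original, a minimal sketch along the following lines could be used. The folder and file name are made-up placeholders, and only built-in MATLAB functions are used.

```matlab
% Hypothetical example path; replace with the location of a real model file.
oldPath = fullfile('models', 'model1.xml');

% Split the path, then rebuild it with a 'new_' prefix on the file name only.
[folder, name, ext] = fileparts(oldPath);
newPath = fullfile(folder, ['new_' name ext]);

disp(newPath)
```

Because the folder is kept separate from the file name, the same lines work for .xml and .mat files alike.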
Converting and improving the A. terreus model
The A. terreus model was obtained in Excel format. The first step is to convert the Excel file (.xls) into .xml format so that the model can be accessed via MATLAB. For A. terreus the biomass reaction contained a great number of metabolites. By adding an additional compound called 'BIOMASS' to the reaction with a stoichiometric coefficient of 1, we can add an exchange reaction for biomass. This allows for optimization of this exchange reaction in FBA, a feature that should be generic for all models.
The A. terreus model already contains a pathway for lovastatin production, but this pathway turns out not to be functional because it is not balanced: the first step, which lumps the 35 reactions performed by LovB and LovC, is missing the co-factor NADPH on one side, while NADP+ is present on the other.
J. Liu et al. (2013). Genome-scale reconstruction and in silico analysis of Aspergillus terreus metabolism. Molecular BioSystems, Vol. 9, pp. 1939-1948.
Standardising the models
Before conversion to the MNXM namespace of MetanetX can be finalized, uploading the model to MetanetX yields a mapping summary. This summary reports, for the metabolites, compartments and reactions, whether mapping succeeded. Since the system does not recognize all of them properly, the report has to be checked manually.
Metabolites
Metabolites are mapped by recognition of their IDs and descriptions. To check whether all metabolites were mapped correctly, the mapping of those that do not have similar descriptions, IDs, or both has to be curated manually. Wrongly mapped metabolites were either set to their correct MNXM ID or given the prefix 'my_' to ensure proper mapping.
model.mets = ...
regexprep(model.mets,{'^COA\[','^ACCOA\[','^MALCOA\[','^NADPH\[','^NADP\[','^CO2\[','^H2O\[','^SAM\[','^SAH\[','^nadp\[','^accoa\[','^c160\[','^h2o\[','^coa\[','^hcya\[','^glcnt\[','^papce\[','^fgam\[','^sam\[','^co2\[','^sah\[','^malcoa\[','^nadph\['},{'MNXM12[','MNXM21[','MNXM40[','MNXM5[','MNXM6[','MNXM13[','MNXM2[','MNXM16[','MNXM19[','MNXM5[','MNXM21[','my_c160[','MNXM2[','MNXM12[','my_hcya[','my_glcnt[','pcace[','my_fgam[','MNXM16[','MNXM13[','MNXM19[','MNXM40[','MNXM6['});
model.mets = ...
regexprep(model.mets,{'^NADPLUS\[','^ACE\[','^PHACAL\['},{'NAD[','AC[','my_PHACAL['});
model.mets = ...
regexprep(model.mets,{'^COA\[','^OIVAL\[','^NADP\[','^CO2\[','^NADPH\[','^H2O\[','^ACCOA\['},{'MNXM12[','my_OIVAL[','MNXM5[','MNXM13[','MNXM6[','MNXM2[','MNXM21['});
model.mets = ...
regexprep(model.mets,{'^nadp\[','^accoa\[','^c160\[','^h2o\[','^coa\[','^hcya\[','^glcnt\[','^papce\[','^fgam\[','^sam\[','^co2\[','^sah\[','^malcoa\[','^nadph\['},{'MNXM5[','MNXM21[','my_c160[','MNXM2[','MNXM12[','my_hcya[','my_glcnt[','pcace[','my_fgam[','MNXM16[','MNXM13[','MNXM19[','MNXM40[','MNXM6['});
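The long parallel lists above are easy to get out of step. As a hedged alternative, the same renaming can be sketched with a small lookup table. The three ID pairs and the 'my_' convention are taken from the lists above; the toy metabolite list and the variable names are assumptions made for illustration.

```matlab
% Lookup table: old metabolite ID -> MNXM ID (three pairs copied from the lists above).
oldIds = {'COA', 'ACCOA', 'H2O'};
newIds = {'MNXM12', 'MNXM21', 'MNXM2'};

% Toy metabolite list with compartment suffixes; PHACAL has no MNXM entry here.
mets = {'COA[c]'; 'ACCOA[m]'; 'H2O[e]'; 'PHACAL[c]'};

renamed = mets;
for k = 1:numel(mets)
    % Split 'ID[comp]' into the bare ID and the bracketed compartment suffix.
    tok = regexp(mets{k}, '^(.*)(\[\w\])$', 'tokens', 'once');
    [found, pos] = ismember(tok{1}, oldIds);
    if found
        renamed{k} = [newIds{pos} tok{2}];
    else
        renamed{k} = ['my_' tok{1} tok{2}];   % flag unmapped IDs with 'my_'
    end
end
disp(renamed)
```

Each old ID sits next to its new ID, so a swapped pair is easier to spot than in two separate lists of more than twenty entries.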
Compartments
To ensure mapping to the right compartment, it was found useful to append the compartment name to each metabolite ID as a suffix in square brackets. In a few cases mapping to the wrong compartment still occurred for compounds that were not yet in the MNXM namespace, so these compounds were modified manually. The code below assigns the metabolite compartments, imports the manual corrections from Excel and applies the new names to prevent mismapping of metabolites.
%metabolite compartments: turn the trailing compartment letter into a bracketed suffix
model.mets = regexprep(model.mets,{'m$','e$','p$','([^\]])$'},{'[m]','[e]','[x]','$1[c]'});
[newmets,newMNXM] = import_excel; %function, defined below
model = newname(model,newmets,newMNXM); %function, defined below
function [mets,MNXM] = import_excel
%% Import data from spreadsheet
% Script for importing data from the following spreadsheet
% Workbook: /Users/.../correction.xlsx
% Worksheet: Sheet1
% To extend the code for use with different selected data or a different
% spreadsheet, generate a function instead of a script.
% Auto-generated by MATLAB on 2013/09/05 16:25:41
%% Import the data
[~, ~, raw] = xlsread('/Users/.../model.xlsx','Sheet1');
raw = raw(:,1:3);
raw(cellfun(@(x) ~isempty(x) && isnumeric(x) && isnan(x),raw)) = {''};
cellVectors = raw(:,[1,2,3]);
%% Allocate imported array to column variable names
mets = cellVectors(:,1);
metNames = cellVectors(:,2);
MNXM = cellVectors(:,3);
%% Clear temporary variables
clearvars data raw cellVectors;
end
function model = newname(model,mets,MNXM)
mets = regexprep(mets,{'^M_','_[cmep]$'},{'',''});
for i=1:numel(mets)
if strcmp(MNXM{i},'')
model.mets = regexprep(model.mets,['^(' mets{i} '\[.*)'],'my_$1');
elseif strncmp(MNXM(i),'MNXM',4)
name = MNXM{i};
model.mets = regexprep(model.mets,['^' mets{i} '\['],[name '[']);
else
disp('unexpected entity')
end
end
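A trailing-letter pattern such as 'p$' also matches IDs that merely end in that letter. When the IDs carry an explicit underscore-delimited compartment, as the patterns in newname suggest, the suffix can be converted more selectively. The following is a minimal sketch under that assumption; the example IDs are hypothetical.

```matlab
% Hypothetical IDs in the 'M_<id>_<compartment>' style that newname strips.
ids = {'M_atp_c'; 'M_glc_e'; 'M_h2o_m'; 'M_cat_p'};

% 1) drop the 'M_' prefix, 2) turn '_c' into '[c]' etc., 3) rename [p] to [x].
mets = regexprep(ids, {'^M_', '_([cmep])$', '\[p\]$'}, {'', '[$1]', '[x]'});
disp(mets)
```

Here 'atp' keeps its final 'p' because only the part after the last underscore is treated as the compartment.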
Reactions
Similar to the A. terreus model, the A. niger and A. nidulans models don't have an exchange reaction for biomass. One was added so that this parameter can be optimized with the same script for all models. The last non-trivial point that had to be taken into account was the balancing of the reactions. To balance a reaction, protons are added to the side where they are deemed to be missing. A serious downside of this feature is that when unrecognized metabolites are involved, the protons taken up or released by these compounds cannot be taken into account. Protons are still added to balance the reaction, but not correctly.
if i==4 %add biomass for terreus
model = addMetabolites(model,{'BIOMASS[c]'},{''});
model = removeRxns(model,{'BIOMASS'});
rxnFormula = '71.4 h2o[c] + 0.14613 asp[c] + 0.15371 glu[c] + 0.22294 ala[c] + 0.17484 gly[c] + 0.10126 gln[c] + 71.4 atp[c] + 0.08887 asn[c] + 0.15713 pro[c] + 0.03297 cys[c] + 0.16063 arg[c] + 0.05566 met[c] + 0.2067 ser[c] + 0.03822 trp[c] + 0.15021 thr[c] + 0.06317 his[c] + 0.09533 phe[c] + 0.07351 tyr[c] + 0.26575 chit[c] + 0.02795 utp[c] + 0.3348 mnt[c] + 0.03051 gtp[c] + 0.0015 glycogen[c] + 0.90068 13glucan[c] + 0.076 gl[c] + 0.07966 pc[c] + 0.06168 tagly[c] + 0.03399 pe[c] + 0.02264 my_paa[c] + 0.01378 ps[c] + 0.02091 ctp[c] + 0.16131 val[c] + 0.23243 leu[c] + 0.03153 my_ffa[c] + 0.00431 dgtp[c] + 0.00398 datp[c] + 0.00431 dctp[c] + 0.00398 dttp[c] + 0.12286 ile[c] + 0.1126 lys[c] -> 71.4 h[c] + 71.4 pi[c] + 71.4 adp[c] + 1 BIOMASS[c]';
model = addReaction(model,{'BIOMASS[c]','BIOMASSDEDUCED'},rxnFormula,[ ],[ ],[ ],[ ],1,'BIOMASS','',[ ],[ ],false);
end
biomass = model.mets(strcmpi(model.mets,'biomass[c]'));
if any(i==[2 3 4])
model = addExchangeRxn(model,biomass,-1000,1000);
end
if i==1 %oryzae
model = removeRxns(model,{'r1068','r1069','r499','r500'},false,false);
rxnFormula = 'ATP[c] + GTP[c] -> AMP[c] + pppGp[x] + 6 H[c]';
model = addReaction(model,{'r1068','GTP diphosphokinase'},rxnFormula,[ ],[ ],[ ],[ ],1,'GTP diphosphokinase','',[ ],[ ],false);
rxnFormula = 'H2O[c] + pppGp[x] + 4 H[c] <=> PI[c] + ppGp[x]';
model = addReaction(model,{'r1069','Exopolyphosphatase'},rxnFormula,[ ],[ ],[ ],[ ],1,'Exopolyphosphatase','',[ ],[ ],false);
rxnFormula = 'NADP[c] + 2 FERI[m] -> NADPH[c] + 2 FERO[m] + 4 H[m]';
model = addReaction(model,{'r499','NADPH-cytochrome p-450 reductase'},rxnFormula,[ ],[ ],[ ],[ ],1,'NADPH-cytochrome p-450 reductase','',[ ],[ ],false);
model = addReaction(model,{'r500','NADPH-cytochrome p-450 reductase'},rxnFormula,[ ],[ ],[ ],[ ],1,'NADPH-cytochrome p-450 reductase','',[ ],[ ],false);
end
if i==3 %niger
model = removeRxns(model,{'312'});
rxnFormula = '2 FERI[m] + NADP[m] -> 2 FERO[m] + NADPH[m] + 4 H[m]';
model = addReaction(model,{'312','NADPH-ferrihemoprotein reductase (cprA)'},rxnFormula,[],[],[],[],1,'NADPH-ferrihemoprotein reductase (cprA)','',[],[],false);
end
if i==4 %terreus
model = removeRxns(model,{'r0023'});
rxnFormula = 'fad[m] + hpro[c] -> fadh2[m] + phc[c] + h[c]';
model = addReaction(model,{'r0023','hydroxyproline oxidase'},rxnFormula,[],[],[],[],1,'hydroxyproline oxidase','',[],[],false);
end
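Because automatically added protons can be wrong, it helps to check the elemental balance of a reaction directly. The sketch below does this for a single reaction, dihydromonacolin L acid to monacolin L acid, using the two elemental formulas from the lovastatin metabolite list on this page. The variable names and the restriction to C, H and O are assumptions made for illustration.

```matlab
% Elemental formulas of substrate and product (from the lovastatin metabolite list).
formulas = {'C19H32O4', 'C19H30O4'};   % dihydromonacolin L acid, monacolin L acid
stoich   = [-1, 1];                    % substrate negative, product positive
elements = {'C', 'H', 'O'};            % elements to check (assumed sufficient here)

balance = zeros(1, numel(elements));
for k = 1:numel(formulas)
    % Each token is {element symbol, count}; a missing count means one atom.
    tok = regexp(formulas{k}, '([A-Z][a-z]?)(\d*)', 'tokens');
    for t = 1:numel(tok)
        if numel(tok{t}) < 2 || isempty(tok{t}{2})
            n = 1;
        else
            n = str2double(tok{t}{2});
        end
        idx = strcmp(elements, tok{t}{1});
        balance(idx) = balance(idx) + stoich(k) * n;
    end
end
disp(balance)   % net atoms per element; all zeros means balanced
```

A nonzero entry shows which element is unbalanced and by how many atoms, which indicates what still has to be added to the reaction by hand.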
Adding the lovastatin pathway
Since the lovastatin pathway is only present in the A. terreus model, and is not well balanced there, a function was written to add the pathway for lovastatin biosynthesis. First, all the metabolites that are not yet in the model need to be added. Then the pathway up to the production of the intermediate dihydromonacolin L acid is added. This pathway consists of 35 reactions that have been lumped into 10 reactions. Together with the remaining 4 reactions, the total lovastatin biosynthesis pathway has been modeled in 14 steps. Finally, an exchange reaction for lovastatin was added so that its production can be optimised.
The prefix 'Ex_' was added to all exchange reactions for easy recognition, and an additional function was implemented that makes it easy to change the optimisation criterion. Both the metabolites and the reactions need to be added to the model.
%add metabolites for lovastatin pathway
j = ...
{'diketide_LovBbound[c]','triketide_LovBbound[c]','tetraketide_LovBbound[c]','pentaketide_LovBbound[c]','hexaketide_LovBbound[c]','cyclic_hexaketide_LovBbound[c]','heptaketide_LovBbound[c]','octaketide_LovBbound[c]','nonaketide_LovBbound[c]','dihydromonacolin_L_acid[c]','monacolin_L_acid[c]','monacolin_J_acid[c]','2-metylbutyryl-COA[c]','lovastatin_acid[c]'};
h = ...
{'C4H5O','C6H7O','C9H13O','C11H17O','C13H19O','C13H19O','C15H23O','C17H27O2','C19H32O4','C19H32O4','C19H30O4','C19H30O5','C26H45N7O17P3S','C24H36O6'};
model = addMetabolites(model,j,h); %function
model = addLovPathway(model); %add reactions for lovastatin
model = addExchangeRxn(model,{'lovastatin_acid[c]'},0,1000); %add exchange reactions for lovastatin production
model.c(:) =0;
function model = addLovPathway(model)
metNames = {{'ACCOA','MALCOA', 'NADPH','diketide_LovBbound', 'NADP','COA','CO2','H2O'}, ...
{'MALCOA','diketide_LovBbound', 'NADPH','triketide_LovBbound','NADP','COA','CO2','H2O'}, ...
{'MALCOA','triketide_LovBbound','SAM','NADPH','tetraketide_LovBbound','SAH','NADP','COA','CO2','H2O'}, ...
{'MALCOA','tetraketide_LovBbound','NADPH','pentaketide_LovBbound','NADP','COA','CO2','H2O'}, ...
{'MALCOA','pentaketide_LovBbound','NADPH','hexaketide_LovBbound','NADP','COA','CO2','H2O'}, ...
{'hexaketide_LovBbound','cyclic_hexaketide_LovBbound'}, ...
{'MALCOA','cyclic_hexaketide_LovBbound','NADPH','heptaketide_LovBbound','NADP','COA','CO2','H2O'}, ...
{'MALCOA','heptaketide_LovBbound','NADPH','octaketide_LovBbound','NADP','COA','CO2'}, ...
{'MALCOA','octaketide_LovBbound','NADPH','H2O','nonaketide_LovBbound','NADP','COA','CO2'}, ...
{'nonaketide_LovBbound','dihydromonacolin_L_acid'}, ...
{'dihydromonacolin_L_acid','monacolin_L_acid'}, ...
{'monacolin_L_acid','H2O','monacolin_J_acid'}, ...
{'ACCOA','MALCOA','SAM','2-metylbutyryl-COA','COA','SAH','CO2','H2O'}, ...
{'2-metylbutyryl-COA','monacolin_J_acid','lovastatin_acid','COA'} ...
};
stoich = {[-1 -1 -1 1 1 2 1 1], ...
[-1 -1 -1 1 1 1 1 1], ...
[-1 -1 -1 -2 1 1 2 1 1 1], ...
[-1 -1 -2 1 2 1 1 1], ...
[-1 -1 -1 1 1 1 1 1], ...
[-1 1], ...
[-1 -1 -2 1 2 1 1 1], ...
[-1 -1 -1 1 1 1 1], ...
[-1 -1 -1 1 1 1 1 1], ...
[-1 1], ...
[-1 1], ...
[-1 -1 1], ...
[-1 -1 -1 1 1 1 1 1], ...
[-1 -1 1 1]};
for i=1:numel(stoich)
metNames{i} = strcat(metNames{i},'[c]');
metNames{i} = regexprep(metNames{i},' ','_');
model = addReaction(model,['rxnName' num2str(i)],metNames{i},stoich{i});
end
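The metabolite and coefficient lists in addLovPathway must line up entry by entry. As an illustrative check, the sketch below builds a small stoichiometric matrix from the last two reactions of the pathway, copied from the lists above, and verifies that each name list has as many entries as its coefficient list. The variable names allMets, S and net are assumptions; no COBRA functions are used.

```matlab
% Last two reactions of the pathway, copied from addLovPathway.
metNames = {{'ACCOA','MALCOA','SAM','2-metylbutyryl-COA','COA','SAH','CO2','H2O'}, ...
            {'2-metylbutyryl-COA','monacolin_J_acid','lovastatin_acid','COA'}};
stoich   = {[-1 -1 -1 1 1 1 1 1], [-1 -1 1 1]};

% Collect every metabolite once (sorted) and fill the stoichiometric matrix S.
allMets = unique([metNames{:}]);
S = zeros(numel(allMets), numel(stoich));
for r = 1:numel(stoich)
    % Guard against a name list and a coefficient list of different lengths.
    assert(numel(metNames{r}) == numel(stoich{r}), 'Reaction %d: length mismatch', r);
    [~, rows] = ismember(metNames{r}, allMets);
    S(rows, r) = stoich{r}(:);
end

% Net conversion when both reactions carry one unit of flux.
net = S * [1; 1];
disp([allMets(:), num2cell(net)])
```

With equal flux through both reactions the intermediate 2-metylbutyryl-COA cancels out, which is the steady-state condition that flux balance analysis imposes on every internal metabolite.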
Medium constraints
In order to find the medium constraints of the models, the lower and upper bounds of all exchange reactions are first set to -1000 and 1000 arbitrary units, respectively. Then, for one exchange reaction at a time, first the lower bound and subsequently the upper bound is set to 0. If the model is then no longer able to optimise the reaction of interest, exchange of this component with the medium is a prerequisite. If a nonzero lower bound is required, the component needs to be present in the in silico medium. If an upper bound larger than 0 is required, the opposite holds: the metabolite must be secreted, because it would otherwise accumulate detrimentally.
function model = mediumCheck(model)
model.lb(findExcRxns(model))=-1000;
model.ub(findExcRxns(model))=1000;
er = find(findExcRxns(model));
for i=1:numel(er)
model.lb(er(i)) = 0;
check = optimizeCbModel(model);
if check.f<10^-6
display 'essential metabolite required, knowing:'
[model.rxns(er(i)) model.rxnNames(er(i))]
end
model.lb(er(i)) = -1000;
end
for i=1:numel(er)
model.ub(er(i))=0;
check = optimizeCbModel(model);
if check.f<10^-6
display 'detrimental accumulation of'
[model.rxns(er(i)) model.rxnNames(er(i))]
end
model.ub(er(i)) = 1000;
end
Media
In order to define a proper in silico medium, we need to know which exchange reactions the models have in common and which of these are essential.
Flux Balance Analysis
Gene knockout strategies