CoReCo – New metabolic modelling tool for production strain development

by Mikko Arvas

 
Metabolic engineering is required to make a microbe produce a new chemical or to improve the production of an existing product. But how do we select the right genes and pathways to engineer?

 
Stoichiometric metabolic modelling encompasses numerous techniques for making these selections using state-of-the-art computational tools and databases of chemical reactions and compounds. At its heart are the metabolic models of organisms: to model the metabolism of an organism, and hence to select the required genetic modifications, a metabolic model for that organism is needed.
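As a rough illustration of what such a model encodes, the sketch below builds a stoichiometric matrix for a toy three-reaction network and checks whether a set of reaction rates (fluxes) leaves every metabolite at steady state, i.e. neither accumulating nor being depleted. The network, the metabolite and reaction names and the helper functions are hypothetical and chosen only for this example; they are not taken from CoReCo or from any real organism.

```python
# Toy network (hypothetical, for illustration only):
#   R_uptake:   -> A      (substrate enters the cell)
#   R_convert:  A -> B    (enzymatic conversion)
#   R_export:   B ->      (product leaves the cell)
METABOLITES = ["A", "B"]
REACTIONS = ["R_uptake", "R_convert", "R_export"]

# Stoichiometric matrix S: one row per metabolite, one column per reaction.
# A negative entry means the reaction consumes the metabolite,
# a positive entry means the reaction produces it.
S = [
    [1, -1, 0],   # A
    [0, 1, -1],   # B
]


def net_production(S, fluxes):
    """Return S * v, the net production rate of each metabolite."""
    return [sum(coef * v for coef, v in zip(row, fluxes)) for row in S]


def is_steady_state(S, fluxes, tol=1e-9):
    """True if no metabolite accumulates or is depleted (S * v = 0)."""
    return all(abs(rate) <= tol for rate in net_production(S, fluxes))


balanced = [2.0, 2.0, 2.0]     # everything taken up is converted and exported
unbalanced = [2.0, 1.0, 1.0]   # A is taken up faster than it is converted

print(is_steady_state(S, balanced))    # True
print(is_steady_state(S, unbalanced))  # False
```

In practice, modelling tools search among all flux vectors that satisfy this steady-state condition, for example for the one that maximises biomass or product formation; the check above only shows the constraint itself.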

 
Our task at VTT is to develop products, production strains and production processes for the biotechnology industry using microbial production systems. Our focus is on the production of bulk chemicals (for example polymer precursors or biofuels) and proteins (for example biomass-degrading enzymes such as cellulases).

 
In collaboration with Aalto University and the University of Helsinki, we have developed a novel tool, CoReCo (Comparative ReConstruction), to reconstruct genome-wide metabolic models from genome sequence alone (Figure). Unlike previous tools, it takes into account information from related species through a phylogenetic approach and verifies the correctness of reactions using atom maps.

Figure. CoReCo (Comparative ReConstruction) process. Sequence homology searches (InterProScan, BLAST and GTG) are carried out for a set of genomes, i.e. a genome of interest and some related genomes. A probabilistic model is built for the presence of each enzyme in the set of species and their ancestors. After that, atom-mapped, electron- and element-balanced reactions are taken from a reaction database (for example KEGG) to reconstruct a metabolic network of the reactions that the enzymes can carry out. The end product of the process is a Systems Biology Markup Language (SBML) model, which contains the stoichiometric matrix required for stoichiometric modelling.
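To give a feel for what "element balanced" means, here is a minimal sketch in Python that checks whether the atoms of each element add up on both sides of a reaction. It is not CoReCo's implementation: the small formula table is entered by hand, the function names are hypothetical, and the check covers element balance only, not electron balance or atom mapping.

```python
from collections import Counter

# Element composition of a few compounds, entered by hand for this example.
FORMULAS = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "ethanol": {"C": 2, "H": 6, "O": 1},
    "CO2": {"C": 1, "O": 2},
}


def element_totals(side):
    """Sum the atoms of each element over one side of a reaction.

    `side` maps a compound name to its stoichiometric coefficient.
    """
    totals = Counter()
    for compound, coefficient in side.items():
        for element, count in FORMULAS[compound].items():
            totals[element] += coefficient * count
    return totals


def is_element_balanced(substrates, products):
    """True if both sides of the reaction contain the same atoms."""
    return element_totals(substrates) == element_totals(products)


# Overall ethanol fermentation: glucose -> 2 ethanol + 2 CO2
print(is_element_balanced({"glucose": 1}, {"ethanol": 2, "CO2": 2}))  # True

# The same reaction with a wrong coefficient loses one carbon atom
print(is_element_balanced({"glucose": 1}, {"ethanol": 2, "CO2": 1}))  # False
```

Filtering a reaction database with checks of this kind keeps reactions that would create or destroy matter out of the reconstructed network.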

 
We have demonstrated the functionality and usability of the reconstructed models with computational steady-state biomass production experiments (Pitkänen et al. (2014) PLoS Computational Biology, 10(2), e1003465). For example, we show that functional models can be built for species that are very distant from major model organisms such as baker’s yeast, and for incomplete genome sequences. Since the publication, we have carried out extensive development of the bacterial and fungal reaction databases and have also made algorithmic improvements.

 
With novel long-read sequencing techniques such as PacBio, obtaining a high-quality genome for a micro-organism is becoming very cost-efficient. For example, a yeast genome costs around 5 000 to 10 000 €. This opens up efficient genetic modification techniques and now, with CoReCo, also metabolic modelling for any cultivable micro-organism.

 
Well-established microbial production organisms, such as baker’s yeast, are heavily patented and represent only a tiny fraction of the natural variability of metabolism. Exploring novel production organisms with CoReCo therefore offers the industrial biotechnology sector considerable opportunities to create new production strains and intellectual property rights (IPR).

For more information about CoReCo and metabolic modelling, contact the author, Senior Scientist Dr Mikko Arvas. He has a background in genetics, but for the last ten years he has concentrated on computational genome analysis of fungi for the needs of industrial biotechnology.