Check that the stoichiometry of components in the biomass reaction is correct. #243

Open
ChristianLieven opened this issue Oct 13, 2017 · 7 comments

Comments

@ChristianLieven
Contributor

This is a future idea:

We could double-check that the conversions from a range of different input units to mmol/gDW are correct. A user could provide a table with all their measurements per metabolite in various units (mol/liter, molecules/cell, mg/gDW), memote could calculate the stoichiometric coefficients from that, and then check whether the components of the biomass equation in the reconstruction match them.
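A minimal sketch of what such a conversion check could look like. The function names, the 5% tolerance, and the per-cell dry weight input are assumptions made for illustration; none of this is existing memote API.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mol


def mg_per_gdw_to_mmol(mg_per_gdw, molar_mass):
    """mg/gDW divided by g/mol gives mmol/gDW."""
    return mg_per_gdw / molar_mass


def molecules_per_cell_to_mmol(molecules_per_cell, cell_dry_weight_g):
    """molecules/cell -> mmol/gDW, given an assumed dry weight per cell in g."""
    mmol_per_cell = molecules_per_cell / AVOGADRO * 1000.0
    return mmol_per_cell / cell_dry_weight_g


def coefficient_matches(measured_mmol, model_coefficient, rel_tol=0.05):
    """Compare a measurement with the magnitude of a model coefficient.

    Reactant coefficients are usually stored as negative numbers, hence abs().
    """
    return math.isclose(measured_mmol, abs(model_coefficient), rel_tol=rel_tol)


# Hypothetical measurement: 44.5 mg/gDW of a metabolite weighing 89.09 g/mol,
# compared against a hypothetical biomass coefficient of -0.4995.
measured = mg_per_gdw_to_mmol(44.5, 89.09)
print(coefficient_matches(measured, -0.4995))
```

A mol/liter input would additionally need a cell volume per gram dry weight, which is left out here because it depends on the organism and conditions.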

@rmahadevan12

I am not sure if this is captured elsewhere, but it is important to make sure these add up to 1 g, otherwise mass will be lost...

@Midnighter
Member

I am not sure if this is captured elsewhere, but it is important to make sure these add up to 1 g, otherwise mass will be lost...

That's already being tested :)

@kcorreia

Several comments about the biomass equation.

Real DCW contains ash and organic matter. Most biomass equations only include organic matter, even though I have seen ash content ranging from 5-10% in S. cerevisiae. If the biomass equation doesn't include ash, then I think it's fine to have a biomass equation that sums to between 0.9 and 1 gDCW.
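The mass check with an ash allowance could be sketched as follows. This is an illustration only, not memote's existing test; the function names, the default 10% ash allowance, and the toy metabolites "A" and "B" are all assumptions.

```python
def biomass_mass_g(net_consumed_mmol, molar_mass):
    """Grams of material incorporated per unit of biomass flux.

    net_consumed_mmol maps metabolite -> mmol/gDW consumed. Products that
    leave the biomass (released water, ADP, ...) should be entered as
    negative consumption so that only the net mass is counted.
    """
    total_mg = sum(
        amount * molar_mass[met] for met, amount in net_consumed_mmol.items()
    )
    return total_mg / 1000.0


def mass_is_plausible(total_g, includes_ash, ash_fraction=0.10, tol=1e-3):
    """Accept ~1 g if ash is modelled, otherwise allow down to 1 - ash_fraction."""
    lower = 1.0 - tol if includes_ash else 1.0 - ash_fraction - tol
    return lower <= total_g <= 1.0 + tol


# Toy example with made-up metabolites and molar masses (g/mol).
toy_coefficients = {"A": 5.0, "B": 2.0}
toy_masses = {"A": 100.0, "B": 225.0}
total = biomass_mass_g(toy_coefficients, toy_masses)
print(total, mass_is_plausible(total, includes_ash=False))
```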

There are two basic ways of specifying the biomass composition. A lumped reaction with all biomass precursors (most published models follow this format), and breaking the biomass equation into several reactions. I think more "points" should be given to the latter since it is easier to read and modify.

Lumped biomass reaction.
Not everyone uses the correct molar masses to balance the biomass equation, but I'd expect this to become clearer with memote. I have seen people use the full molar mass to balance the biomass equation. The "dehydrated molar mass" should be used for the amino acids and carbohydrates, and (MM of XTP - MM of PPi) for RNA/DNA.
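As a rough illustration of the two corrections, here is a small sketch. The helper names are hypothetical, and the masses are approximate values for the neutral forms; models that use charged formulas will give slightly different numbers.

```python
WATER = 18.015  # g/mol, approximate
PPI = 177.97    # g/mol, approximate, pyrophosphoric acid H4P2O7 (neutral form)


def dehydrated_mass(monomer_mass):
    """Mass a monomer contributes to a polymer after losing one water."""
    return monomer_mass - WATER


def nucleotide_residue_mass(xtp_mass):
    """Mass an (d)XTP contributes to RNA/DNA after releasing pyrophosphate."""
    return xtp_mass - PPI


# Approximate neutral-form masses: glycine ~75.07 g/mol, ATP ~507.18 g/mol.
print(dehydrated_mass(75.07))           # glycine residue in protein
print(nucleotide_residue_mass(507.18))  # AMP residue in RNA
```

Using the full monomer masses instead would overestimate the polymer mass by one water (or one pyrophosphate) per incorporated monomer.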

Segregated biomass reaction.
I actually have some code I wrote in python to build these reactions given some inputs which I can share.

  • Sum(vi AAi) + x H (required for a neutral protein compound) -> [Sum(vi) - 1] H2O + 1 gDCW protein (neutral charge). Input is mol AA / mol protein, which is what can be found experimentally.
  • Sum(vi XTP) + Sum(vi) H -> Sum(vi) ppi + H2O + 1 gDCW RNA. I calculated this from gene content.
  • Sum(vi dXTP) + Sum(vi) H -> Sum(vi) ppi + H2O + 1 gDCW DNA. Calculated from genome GC content (with the additional constraint that mol A = mol T and mol C = mol G).

I also use reactions for single species to convert them into 1 gram species, as below, since they can change depending on the organism or conditions:

  • 4.9208 chitin -> chitin_1g (and likewise for glycogen and trehalose).
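The protein reaction above can be sketched as a small scaling calculation. This is not the code mentioned in the comment; the function name and the toy composition are assumptions, and the proton term needed for charge neutrality is ignored for simplicity.

```python
WATER = 18.015  # g/mol, approximate


def protein_reaction_1g(mol_aa_per_mol_protein, molar_mass):
    """Scale 'Sum(vi AAi) -> [Sum(vi) - 1] H2O + protein' to 1 g of protein.

    Returns (reactants, water), both in mmol per gram of protein.
    """
    n_total = sum(mol_aa_per_mol_protein.values())
    water_released = n_total - 1  # one water per peptide bond
    protein_mass = (
        sum(n * molar_mass[aa] for aa, n in mol_aa_per_mol_protein.items())
        - water_released * WATER
    )  # g per mol of protein
    scale = 1000.0 / protein_mass  # mmol of protein per gram
    reactants = {aa: n * scale for aa, n in mol_aa_per_mol_protein.items()}
    return reactants, water_released * scale


# Toy "protein" made of 2 glycine + 1 alanine (approximate masses in g/mol).
composition = {"gly": 2, "ala": 1}
masses = {"gly": 75.067, "ala": 89.094}
reactants, water = protein_reaction_1g(composition, masses)
print(reactants, water)
```

The same scaling idea applies to single-species reactions such as the chitin one: the coefficient is 1000 divided by the molar mass of the species as it appears in the biomass.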

Many phospholipid reactions are quite repetitive or cumbersome to modify. Different versions I have seen include:
iMM904
0.02 dcacoa[c] + 0.06 ddcacoa[c] + glyc3p[c] + 0.17 hdcoa[c] + 0.09 ocdycacoa[c] + 0.24 odecoa[c] + 0.27 pmtcoa[c] + 0.05 stcoa[c] + 0.1 tdcoa[c] -> 0.01 1ag3p_SC[c] + coa[c]
YMN6.05
1-acyl-sn-glycerol 3-phosphate [endoplasmic reticulum] + acyl-CoA [endoplasmic reticulum]  <=> coenzyme A [endoplasmic reticulum] + 4 H+ [endoplasmic reticulum] + phosphatidate [endoplasmic reticulum] (similar to what I have seen in Pathway Tools)
YMN 7.6.1
1-acyl-sn-glycerol 3-phosphate (18:1) [Golgi membrane] + H2O [Golgi membrane] -> 1-monoglyceride (18:1) [Golgi membrane] + phosphate [Golgi membrane]

My preference is a combination of iMM904 and YMN6.05, where I specify a composition for the acyl-CoA pool, which then feeds into the reactions that consume it. The phospholipid molecules then make up the DW phospholipid.

  • 0.0115 tdcoa[c] + 0.1533 pmtcoa[c] + 0.0184 stcoa[c] + 0.5665 hdcoa[c] + 0.2504 odecoa[c] -> acoa[c] (monounsaturated)
  • 0.1482 pmtcoa[c] + 0.2347 hdcoa[c] + 0.3265 odecoa[c] + 0.2225 lnlccoa[c] + 0.0682 lnlncgcoa[c] -> acoa[c] (polyunsaturated)
  • 0.0791 tdcoa[c] + 0.7941 pmtcoa[c] + 0.1268 stcoa[c] -> acoa[c] (saturated)

  • pc + pa + ps + pg + pe + ptd1ino + clpn -> phospholipid_1g

The advantage of this method is that calculating the molar mass of the phospholipid compounds is easier (for 1 gDCW test), as is modifying the acyl-coa composition. Some databases do not have chemical information for these molecules (or did not last time I did model curation).
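One simple consistency check on such pool reactions is that the coefficients on the left sum to one, so that one acoa stands for one mol of average acyl-CoA. A minimal sketch using the coefficients quoted above; the helper name and the tolerance are assumptions, and the pool labels are copied from the comment.

```python
ACYL_COA_POOLS = {
    "monounsaturated": {
        "tdcoa": 0.0115, "pmtcoa": 0.1533, "stcoa": 0.0184,
        "hdcoa": 0.5665, "odecoa": 0.2504,
    },
    "polyunsaturated": {
        "pmtcoa": 0.1482, "hdcoa": 0.2347, "odecoa": 0.3265,
        "lnlccoa": 0.2225, "lnlncgcoa": 0.0682,
    },
    "saturated": {"tdcoa": 0.0791, "pmtcoa": 0.7941, "stcoa": 0.1268},
}


def unbalanced_pools(pools, tol=1e-3):
    """Return the names of pools whose coefficients do not sum to ~1."""
    return [
        name
        for name, pool in pools.items()
        if abs(sum(pool.values()) - 1.0) > tol
    ]


print(unbalanced_pools(ACYL_COA_POOLS))
```

The tolerance absorbs rounding in the published four-decimal coefficients.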

All of these reactions can then feed into the biomass equation, which can be easily edited (assuming the composition of amino acids in protein and of XTPs in RNA is constant). Another advantage of this segregated biomass equation is that it is easier to debug which macromolecule cannot be synthesized. It's an awful feeling to load a model and see no growth, but with the segregated biomass equation you can set each macromolecule as the objective function to see where the gap is.

a (1 gDW ash) + b (1 gDW phospholipids) + c (free fatty acids) + d (1 gDW carbs) + e (1 gDW protein) + f (1 gDW RNA) + g (1 gDW DNA) + h (vitamins/cofactors) -> 1 gDCW biomass
where a + b + c + d + e + f + g + h = 1.

I realize this is more complex but it's worth the effort in my opinion.

@ChristianLieven
Contributor Author

Although I personally have more experience working with a single lumped reaction I can see the benefits of this approach. In the end, we will have to support both, since both are in use and the difference between them boils down to preference.

a (1 gDW ash) + b (1 gDW phospholipids) + c (free fatty acids) + d (1 gDW carbs) + e (1 gDW protein) + f (1 gDW RNA) + g (1 gDW DNA) + h (vitamins/cofactors) -> 1 gDCW biomass

How do you handle the growth-associated maintenance here? Shouldn't it be:
a (1 gDW ash) + b (1 gDW phospholipids) + c (free fatty acids) + d (1 gDW carbs) + e (1 gDW protein) + f (1 gDW RNA) + g (1 gDW DNA) + h (vitamins/cofactors) + x ATP + x H2O -> 1 gDCW biomass + x ADP + x H + x Pi?

@kcorreia

kcorreia commented Oct 27, 2017

Oh I forgot to include GAM. Your formulation is correct.

That said, I have seen a recent model that had either a GAM for each macromolecule or a GAM calculated from protein content. I prefer a lumped GAM, since I don't think we have enough experimental data to delineate GAM based on biomass composition.

@BenjaSanchez

The distinction between a split (i.e. clustered) and a lumped (all parts in one reaction) biomass pseudoreaction might not be working properly. The biomass reaction of yeast-GEM is split (it has 14 metabolites in total, fewer than 15, which is the cut-off according to memote), yet the following results suggest that the reaction is detected as lumped and/or that the "Essential Biomass Precursors" metabolites are not detected (see the memote report here):

  • Number of Missing Essential Biomass Precursors = 37
  • Biomass Consistency = 0.00.

Additional problems with the biomass tests are outlined at SysBioChalmers/yeast-GEM#138

Current biomass pseudoreaction in yeast-GEM:

r_4041: 61.98 ATP + 61.98 H2O + lipid + 0.00099 riboflavin + 0.02 sulphate + 1e-06 heme a + protein + carbohydrate + RNA + DNA -> 61.98 ADP + biomass + 61.98 H + 61.98 phosphate 
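For reference, the metabolite count of r_4041 can be reproduced with a small parser. This is only a sketch: the parsing helpers are hypothetical, and the cut-off of 15 is taken from the comment above rather than from memote's source.

```python
import re

R_4041 = (
    "61.98 ATP + 61.98 H2O + lipid + 0.00099 riboflavin + 0.02 sulphate + "
    "1e-06 heme a + protein + carbohydrate + RNA + DNA -> "
    "61.98 ADP + biomass + 61.98 H + 61.98 phosphate"
)

# An optional leading number (plain or scientific notation), then the name.
TERM = re.compile(r"^(\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s+(.+)$")


def parse_side(side):
    """Return {metabolite: coefficient} for one side of a reaction string."""
    result = {}
    for term in side.split(" + "):
        term = term.strip()
        match = TERM.match(term)
        if match:
            result[match.group(2)] = float(match.group(1))
        else:
            result[term] = 1.0  # no explicit coefficient means 1
    return result


def count_metabolites(reaction):
    reactants, products = reaction.split(" -> ")
    return len(parse_side(reactants)) + len(parse_side(products))


def classify(reaction, cutoff=15):
    """Label a biomass reaction 'split' if it has fewer than cutoff metabolites."""
    return "split" if count_metabolites(reaction) < cutoff else "lumped"


print(count_metabolites(R_4041), classify(R_4041))
```

With 10 reactants and 4 products, the reaction has 14 metabolites and falls on the "split" side of a 15-metabolite cut-off.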

@ChristianLieven
Contributor Author

@BenjaSanchez thanks for reporting that! We'll take another look at that as soon as possible!


5 participants