-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new interface for samples and fba, fva solutions and workaround for #96 #98
base: develop
Are you sure you want to change the base?
Changes from 8 commits
727c0c2
6526db7
8b1410a
e319445
e6eeb9c
d138a58
dd468ef
e4acafd
a263584
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -218,6 +218,27 @@ max_biomass_objective = fva_output[3] | |
|
||
The output of FVA method is tuple that contains `numpy` arrays. The vectors `min_fluxes` and `max_fluxes` contains the minimum and the maximum values of each flux. The vector `max_biomass_flux_vector` is the optimal flux vector according to the biomass objective function and `max_biomass_objective` is the value of that optimal solution. | ||
|
||
```python | ||
fva_output_df = model.fva_to_df() | ||
``` | ||
|
||
returns the solution is an easier to query dataframe with the model's reactions as indices: | ||
```python | ||
>>> fva_output_df | ||
minimum maximum | ||
PFK 7.477306 7.477485 | ||
PFL 0.000000 0.000066 | ||
PGI 4.860632 4.861110 | ||
PGK -16.023610 -16.023451 | ||
PGL 4.959736 4.960214 | ||
>>> | ||
>>> df.loc["NH4t"] | ||
minimum 4.765316 | ||
maximum 4.765324 | ||
Name: NH4t, dtype: float64 | ||
``` | ||
|
||
|
||
To apply FBA method, | ||
|
||
```python | ||
|
@@ -229,6 +250,16 @@ max_biomass_objective = fba_output[1] | |
|
||
while the output vectors are the same with the previous example. | ||
|
||
Again, one may use the `fba_to_df()` to get the solution as a pandas dataframe with model's reactions as indices. | ||
```python | ||
>>> model.fba_to_df() | ||
fluxes | ||
PFK 7.477382 | ||
PFL 0.000000 | ||
PGI 4.860861 | ||
PGK -16.023526 | ||
PGL 4.959985 | ||
``` | ||
|
||
|
||
### Set the restriction in the flux space | ||
|
@@ -266,6 +297,29 @@ sampler = polytope_sampler(model) | |
steady_states = sampler.generate_steady_states() | ||
``` | ||
|
||
### Change the medium on your model | ||
|
||
To do that, you need to first describe the new medium and then assign it on your `model`. | ||
For example: | ||
|
||
```python | ||
initial_medium = model.medium | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this can be simplified avoiding to copy the entire dictionary (see #95 (comment)) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. check for my reply on the issue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, I see so there is still an issue in setting the model correctly. Should we address it in a different PR? |
||
model.medium | ||
# {'EX_co2_e': 1000.0, 'EX_glc__D_e': 10.0, 'EX_h_e': 1000.0, 'EX_h2o_e': 1000.0, 'EX_nh4_e': 1000.0, 'EX_o2_e': 1000.0, 'EX_pi_e': 1000.0} | ||
model.fba()[-1] | ||
# 0.8739215069684305 | ||
new_medium = initial_medium.copy() | ||
|
||
# Set anoxygenic conditions | ||
new_medium['EX_o2_e'] = 0 | ||
model.medium = new_medium | ||
model.fba()[-1] | ||
|
||
# Check the difference in the optimal value | ||
# 0.21166294973531055 | ||
``` | ||
|
||
|
||
|
||
|
||
### Plot flux marginals | ||
|
@@ -282,8 +336,8 @@ steady_states = sampler.generate_steady_states(ess = 3000) | |
# plot the histogram for the 14th reaction in e-coli (ACONTa) | ||
reactions = model.reactions | ||
plot_histogram( | ||
steady_states[13], | ||
reactions[13], | ||
steady_states.loc["ACONTa"], | ||
"ACONTa", | ||
n_bins = 60, | ||
) | ||
``` | ||
|
@@ -306,8 +360,8 @@ steady_states = sampler.generate_steady_states(ess = 3000) | |
# plot the copula between the 13th (PPC) and the 14th (ACONTa) reaction in e-coli | ||
reactions = model.reactions | ||
|
||
data_flux2=[steady_states[12],reactions[12]] | ||
data_flux1=[steady_states[13],reactions[13]] | ||
data_flux2=[steady_states.loc["ACONTa"], "ACONTa"] | ||
data_flux1=[steady_states.loc["PPC"], "PPC"] | ||
|
||
plot_copula(data_flux1, data_flux2, n=10) | ||
``` | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,12 +7,15 @@ | |
# Licensed under GNU LGPL.3, see LICENCE file | ||
|
||
import numpy as np | ||
import pandas as pd | ||
import sys | ||
from typing import Dict | ||
import cobra | ||
from dingo.loading_models import read_json_file, read_mat_file, read_sbml_file, parse_cobra_model | ||
from dingo.fva import slow_fva | ||
from dingo.fba import slow_fba | ||
import logging | ||
logger = logging.getLogger(__name__) | ||
|
||
try: | ||
import gurobipy | ||
|
@@ -33,10 +36,12 @@ def __init__(self, tuple_args): | |
import gurobipy | ||
|
||
self._parameters["fast_computations"] = True | ||
self._parameters["tol"] = 1e-06 | ||
except ImportError as e: | ||
self._parameters["fast_computations"] = False | ||
self._parameters["tol"] = 1e-03 | ||
|
||
if len(tuple_args) != 10: | ||
if len(tuple_args) != 12: | ||
raise Exception( | ||
"An unknown input format given to initialize a metabolic network object." | ||
) | ||
|
@@ -51,6 +56,8 @@ def __init__(self, tuple_args): | |
self._medium = tuple_args[7] | ||
self._medium_indices = tuple_args[8] | ||
self._exchanges = tuple_args[9] | ||
self._reactions_map = tuple_args[10] | ||
self._metabolites_map = tuple_args[11] | ||
|
||
try: | ||
if self._biomass_index is not None and ( | ||
|
@@ -108,30 +115,51 @@ def fva(self): | |
"""A member function to apply the FVA method on the metabolic network.""" | ||
|
||
if self._parameters["fast_computations"]: | ||
return fast_fva( | ||
min_fluxes, max_fluxes, max_biomass_flux_vector, max_biomass_objective = fast_fva( | ||
self._lb, | ||
self._ub, | ||
self._S, | ||
self._biomass_function, | ||
self._parameters["opt_percentage"], | ||
) | ||
else: | ||
return slow_fva( | ||
min_fluxes, max_fluxes, max_biomass_flux_vector, max_biomass_objective = slow_fva( | ||
self._lb, | ||
self._ub, | ||
self._S, | ||
self._biomass_function, | ||
self._parameters["opt_percentage"], | ||
) | ||
self._min_fluxes = min_fluxes | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we really need them to be stored in the model? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My initial thought was about the At first, I thought of keeping those in the model to provide them in the function but then I thought it would be simpler if the fva is performed with the Yet, maybe it's practical to keep them. No strong arguments on that. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, I was lost a bit. Are those class variables used even after that changes of this PR? As far as understand sample_from_fva_output function is not using them. |
||
self._max_fluxes = max_fluxes | ||
self._opt_vector = max_biomass_flux_vector | ||
self._opt_value = max_biomass_objective | ||
return min_fluxes, max_fluxes, max_biomass_flux_vector, max_biomass_objective | ||
|
||
def fba(self): | ||
"""A member function to apply the FBA method on the metabolic network.""" | ||
|
||
if self._parameters["fast_computations"]: | ||
return fast_fba(self._lb, self._ub, self._S, self._biomass_function) | ||
opt_vector, opt_value = fast_fba(self._lb, self._ub, self._S, self._biomass_function) | ||
else: | ||
return slow_fba(self._lb, self._ub, self._S, self._biomass_function) | ||
|
||
opt_vector, opt_value = slow_fba(self._lb, self._ub, self._S, self._biomass_function) | ||
self._opt_vector = opt_vector | ||
self._opt_value = opt_value | ||
return opt_vector, opt_value | ||
|
||
def fba_to_df(self): | ||
if not hasattr(self, '_opt_vector'): | ||
self.fba() | ||
fba_df = pd.DataFrame({'fluxes': self._opt_vector}, index=self._reactions) | ||
return fba_df | ||
|
||
def fva_to_df(self): | ||
if not hasattr(self, '_min_fluxes'): | ||
self.fva() | ||
fva_df = pd.DataFrame({'minimum': self._min_fluxes, 'maximum': self._max_fluxes}, index=self._reactions) | ||
return fva_df | ||
|
||
# Descriptors | ||
@property | ||
def lb(self): | ||
return self._lb | ||
|
@@ -172,6 +200,30 @@ def exchanges(self): | |
def parameters(self): | ||
return self._parameters | ||
|
||
@property | ||
def reactions_map(self): | ||
return self._reactions_map | ||
|
||
@property | ||
def metabolites_map(self): | ||
return self._metabolites_map | ||
|
||
@property | ||
def opt_value(self, value): | ||
self._opt_value = value | ||
|
||
@property | ||
def opt_vector(self, value): | ||
self._opt_vector = value | ||
|
||
@property | ||
def min_fluxes(self, value): | ||
self._min_fluxes = value | ||
|
||
@property | ||
def max_fluxes(self, value): | ||
self._max_fluxes = value | ||
|
||
@property | ||
def get_as_tuple(self): | ||
return ( | ||
|
@@ -183,16 +235,11 @@ def get_as_tuple(self): | |
self._biomass_index, | ||
self._biomass_function, | ||
self._medium, | ||
self._inter_medium, | ||
self._exchanges | ||
self._exchanges, | ||
self._reactions_map, | ||
self._metabolites_map | ||
) | ||
|
||
def num_of_reactions(self): | ||
return len(self._reactions) | ||
|
||
def num_of_metabolites(self): | ||
return len(self._metabolites) | ||
|
||
@lb.setter | ||
def lb(self, value): | ||
self._lb = value | ||
|
@@ -221,7 +268,6 @@ def biomass_index(self, value): | |
def biomass_function(self, value): | ||
self._biomass_function = value | ||
|
||
|
||
@medium.setter | ||
def medium(self, medium: Dict[str, float]) -> None: | ||
"""Set the constraints on the model exchanges. | ||
|
@@ -275,17 +321,51 @@ def set_active_bound(reaction: str, reac_index: int, bound: float) -> None: | |
# Turn off reactions not present in media | ||
for rxn_id in exchange_rxns - frozen_media_rxns: | ||
""" | ||
is_export for us, needs to check on the S | ||
order reactions to their lb and ub | ||
is_export for us, needs to check on the S | ||
order reactions to their lb and ub | ||
""" | ||
# is_export = rxn.reactants and not rxn.products | ||
reac_index = self._reactions.index(rxn_id) | ||
products = np.any(self._S[:,reac_index] > 0) | ||
products = np.any(self._S[:,reac_index] > 0) | ||
reactants_exist = np.any(self._S[:,reac_index] < 0) | ||
is_export = True if not products and reactants_exist else False | ||
set_active_bound( | ||
rxn_id, reac_index, min(0.0, -self._lb[reac_index] if is_export else self._ub[reac_index]) | ||
) | ||
self._medium = medium | ||
self._opt_vector = None | ||
|
||
@reactions_map.setter | ||
def reactions_map(self, value): | ||
self._reactions_map = value | ||
|
||
@metabolites_map.setter | ||
def metabolites_map(self, value): | ||
self._metabolites_map = value | ||
|
||
@min_fluxes.setter | ||
def min_fluxes(self, value): | ||
self._min_fluxes = value | ||
|
||
@max_fluxes.setter | ||
def max_fluxes(self, value): | ||
self._max_fluxes = value | ||
|
||
@opt_value.setter | ||
def opt_value(self, value): | ||
self._opt_value = value | ||
|
||
@opt_vector.setter | ||
def opt_vector(self, value): | ||
self._opt_vector = value | ||
|
||
|
||
# Attributes | ||
def num_of_reactions(self): | ||
return len(self._reactions) | ||
|
||
def num_of_metabolites(self): | ||
return len(self._metabolites) | ||
|
||
def set_fast_mode(self): | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. Do we still need the old method
model.fva()
that changes the internal state of model? I suppose that one function is enough. Then the new one could be calledfva()
it is more clear I think. And the old one should be made private. Actually it is even simpler if there exist only one functionfva()
with the new interface.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comments hold for
fva
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit tricky as the original
fva
returns a set of 4 things that we use in other parts of thePolytopeSampler
class. Yet, I changes those parts with an_fva
and_fba
and we now havefva()
andfba()
to return the dataframes. 😉There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, but as the name suggests
_fva
and_fba
should be private, right?Also, I still cannot see the need of two fva (and fba) functions. I think it complicates the interface. Is it possible to keep the old functions and then the user (if needed) compute a dataframe on demand (it is one line of code).