-
Notifications
You must be signed in to change notification settings - Fork 218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Nullspace calculation similar to COBRA (MATLAB) #1393
Comments
Yes absolutely, that would be nice to have, especially since null spaces are also used in other places (flux sampling for instance). Do you want to take a stab at this? Happy to review a PR... |
I suppose haha, I must say I will first have to figure out most of the stuff that goes on in there. I was porting some matlab pipeline to cobrapy and stumbled upon this. If there are people that are more familiar with the optimizations already implemented in scipy, and how the matlab sparseNull actually is faster, then they might be more suited. I will eventually figure something out that will make this run at reasonable times, so when I do, you will know about it |
Okay sounds good. Otherwise, somebody else may get to it at some point. Another option would be trying around with a sparse QR decomposition. That may be a bit simpler to implement. |
@cdiener So I wrote a functioning (I think, still need to write some tests) sparseNull with included partial code for the LUQ decomposition (as sparseNull doesn't utilize the full LUQ function (the "do_pivot" is not utilized)). I have never done any work for open source projects so are there things I should add in? When I have a little more time this week; Once I have verified it really is working in every instance, I will upload for review! As a side note, I realize we do not utilize Chan et al 2017's LLC method for addLoopLaw? Interestingly my matlab code for addLoopLawConstraints is different from the one present on the COBRA github. I am getting fairly bad performance solving the MILP with the full set of constraints (for the same model that I can solve in MATLAB with reduced constraints from the LLC method). Are there any plans to add in the LLC method into cobrapy? |
Hi @dagl1, that sounds great and I can help you with setting up the PR when you are done. The basic workflow would be:
No we have no plans for LLC, though it would be cool to have. Though it would probably be only the C_NS version because EFMs are currently not in COBRAPY either. We mostly provide the CycleFree method for fast loopless solutions but this has its own issues as pointed out in the Chan paper as well (though it is likely faster than LLC because it's just LPs). |
Perfect, thanks a lot for that workflow, will implement it at a later date! Regarding the CycleFree, if I understand correctly this method does not in fact make anything infeasible if it contains a loop, it just is some type of after-cleaning or is there a smart way of using this to also remove loops during a run? (I am attempting to calculate the highest average expression of a metabolic task (set input/output exchange reactions, all other boundary reactions closed) and without infeasible constraints on loops, the solution will just be loops of highly expressed reactions with a small set necessary for the actual completion of the metabolic task. In such cases, trying to find flux distributions afterwards that are loopless is counterintuitive as such a distribution might be taking a completely different path in the first place (with a different average score). Cyclefree cannot be used in any such manner right? |
Yes you are right, that wouldn't work because the loops themselves go into your objective. How would you go about maximizing that anyway? Because the absolute value trick (min |v_f - v_r| = min v_f + v_r) only works for minimization but not for maximization... |
Hmm so there are a few different things, I create an irrversible model so that all fluxes are >=0 and utilize binary variable Y denoting activity. I was also trying to figure out a way to get a new continous variable Z such that it is atleast the absolute value of V, For the actual implementation of average, the linearization is a bit... difficult (but someone had created a transformation (charnes cooper) and I have implemented the set of constraints for that from: https://math.stackexchange.com/questions/3500493/doing-a-charnes-cooper-transformation-with-matrices-and-an-zero-one-constraint |
Okay that makes sense and is the common strategy I have seen as well for absolute value maximization. For Z you could do simply Also for large problems I would also try the new hybrid solver which also has a faster MIP solver, unless you are already using cplex or gurobi. |
I already run everything in gurobi (computationally I get pretty bad performance in general when a task definition is complex). So when introducing Z as in Z >= V and Z >= -V, I don't actually have any insurance that V == 0 means z = 0. Either way, irreversible models seem the way to go in general, just because it is easier to work with, and the additional reactions don't seem to pose too many problems. |
So I would put everything in the standard form. So there are no negative v_i just v_f,i and v_b,i which are always >=0. Then impose the big M constraints on each individual forward and reverse flux and have the pair constraint like before y_f,i + y_b,i <= 1. In that case |v_i| = |v_f,i - v_b,i| = v_f,i + v_b,i, because one of them is strictly zero due to the previous constraint. |
Checklist
Problem
Using the same model, implementation of add_loops (addLoopLawConstraints in cobra toolbox) takes significantly longer (30-300 seconds versus <1 second) in cobrapy than its matlab counterpart. Timing each step I find that it is the function nullspace, and specifically inside this function the line: _, s, vh = np.linalg.svd(A) that is taking up the majority if the ocmputation time (98%+).
In the matlab implementation a custom (it seems?) function for the nullspace is utilized, and I wondered if it is possible to implement and add this to how cobrapy handles the nullspace approximation?
https://github.com/opencobra/cobratoolbox/blob/572cad82f50b93d7ddeab5d2b5e5e9fca9da4250/external/base/utilities/sparseNull.m#L4
Solution
Implement matlab's version of approximating the nullspace to achieve similar performance as in MATLAB
Alternatives
I attemtped using scipy's sparse svds function, but this was incompatible (produced no null space at all), as well as scipy's regular svd function (which in the few tests i did is about 2x as slow).
Anything else?
No response
The text was updated successfully, but these errors were encountered: