The intention is to leverage TensorFlow's gradient implementation, which computes symbolic partial derivatives of an output tensor with respect to a number of input tensors.
So, for example, Montblanc's TensorFlow implementation will be able to output the partial derivatives of a chunk of model visibilities with respect to (lm, stokes, alpha, ebeam, etc.).
Question for the mathematicians: is it possible to combine the partials to compute a derivative with respect to all input tensors? If not, it will have to be brute-forced by computing visibilities for different parameters and differencing.
@ArunAniyan made the point that the partial derivatives are all separate because their quantities are different (coordinate, flux, etc.). It should be possible to combine partial derivatives if their quantities are the same.
@bmerry points out that if the function is well behaved, you can take the dot product of the partials with the corresponding parameter offsets, as in the first-order expansion f(p + δ) ≈ f(p) + ∇f(p) · δ.
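A minimal NumPy sketch of this idea, using a hypothetical two-parameter toy function in place of the visibility model: the dot product of the analytic partials with small offsets closely matches the actual change in the function.

```python
import numpy as np

def f(params):
    # Toy "well-behaved" function standing in for the visibility model
    l, m = params
    return l**2 + 3.0 * l * m

def partials(params):
    # Analytic partial derivatives of f w.r.t. l and m
    l, m = params
    return np.array([2.0 * l + 3.0 * m, 3.0 * l])

p = np.array([1.0, 2.0])
delta = np.array([1e-3, -2e-3])

# First-order change: dot product of the partials with the offsets
approx_change = partials(p).dot(delta)
exact_change = f(p + delta) - f(p)
# approx_change and exact_change agree to first order in delta
```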
Things to consider when implementing:
Both computation and memory requirements will probably scale with P, the number of partial derivatives, so the memory budgeting mechanisms will need to take this into account.
I'm not sure I understand the question...
For the chi-squared gradient, you can obviously combine the partial derivatives of the visibilities to compute it, because of the chain rule.
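To illustrate the chain-rule point with a hypothetical toy model (the linear `model` and its parameter `p` below are made up for the sketch, not Montblanc's API): the gradient of chi-squared is assembled from the residuals and the partials of the model visibilities.

```python
import numpy as np

# Hypothetical toy model: "visibilities" depend linearly on one parameter p
data = np.array([1.0, 2.0, 3.0])
coeff = np.array([0.5, 1.0, 1.5])

def model(p):
    return coeff * p

def dmodel_dp(p):
    # Partial derivatives of the model visibilities w.r.t. p
    return coeff

p = 3.0
residual = model(p) - data
# Chain rule: d(chi2)/dp = sum over visibilities of 2 * residual * d(model)/dp
grad = np.sum(2.0 * residual * dmodel_dp(p))
```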
After discussion with @landmanbester and some experiments this morning, it looks like the TensorFlow function `tf.gradients` provides the Jacobian of some tensor with respect to input tensors. For example, this snippet computes y and the Jacobian of y w.r.t. x:
```python
import tensorflow as tf

xshape = (4, 2)
x = tf.ones(shape=xshape, dtype=tf.float64)
y = tf.reduce_sum(x**2 + x)
grad = tf.gradients(y, x)

with tf.Session() as S:
    # Compute y and its symbolic gradient w.r.t. x
    _y, _grad = S.run([y, grad])
    # _grad[0] contains partial of y w.r.t. x. Same shape as x.
    assert _grad[0].shape == xshape
    print(_y, _grad)
```
Then, to produce a gradient operator, one can flatten the Jacobian(s) into a single vector of parameters. So if one thinks of lm coordinates instead of x, one will have eight parameters (four l and four m coordinates).
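The flattening step can be sketched as follows. This is a NumPy stand-in, with made-up Jacobian values shaped like the parameters themselves (as `tf.gradients` returns them): the per-tensor Jacobians for `lm` and `stokes` are raveled and concatenated into the single parameter/gradient vector pair that most optimisers expect.

```python
import numpy as np

# Hypothetical parameter tensors: four (l, m) coordinates and four fluxes
lm = np.arange(8, dtype=np.float64).reshape(4, 2)
stokes = np.ones(4)

# Stand-ins for the Jacobians of a scalar objective w.r.t. each tensor;
# tf.gradients returns one array per input, shaped like that input
jac_lm = 2.0 * lm
jac_stokes = 3.0 * stokes

# Flatten and concatenate into one parameter vector and one gradient vector
param_vector = np.concatenate([lm.ravel(), stokes.ravel()])
grad_vector = np.concatenate([jac_lm.ravel(), jac_stokes.ravel()])
# lm alone contributes eight parameters: four l and four m coordinates
```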
/cc @marziarivi, @jtlz2, @SaiyanPrince and all Bayesian inference people, everywhere.