-
Notifications
You must be signed in to change notification settings - Fork 23
stochastic indexing
Some initial thoughts from Chris (5/1/17):
Here's an example that I think will cover many of the use cases:
y[i] ~ dnorm(mu[tmp[i]+1], 1)
In cases like this we presumably need to lift
index[i] <- tmp[i]+1
y[i] ~ dnorm(mu[index[i]], 1)
So I think we can focus on
y[i] ~ dnorm(mu[index[i]], 1)
In terms of nodeFunctions, we'll end up with code like model$logProb_y[i] = dnorm(model$mu[model$index[i]], 1)
but then we'll need to modify the INDEXEDNODEINFO functionality to handle model values (i.e., model$index[i]) as well as constants. I hope that this is a straightforward change.
I think the main challenge is in dealing with dynamic dependencies. At a high level, I think we want dynamic dependency objects that return the current dependents at run-time when used in calculate(), simulate(), etc., but that are only updated when needed (lazy evaluation) and may only be updated specific to the dependency information that is needed. Here are some further thoughts:
I think we'll want a dynamic depsObject class that can be queried by calculate(), simulate(), etc. and will return a vector of nodes. So there will be another level of indirection when using a nodeFunction because the vector of dependencies will not be static.
In the example above, the dependency of index[1] is y[1], while the dependent of mu[i] changes as index[1:n] changes.
depsObject might be something like this:
member data: currentDeps
member data: fixedDeps
member data: one or more stochDepObjects (see below)
member method: get -- a getter will return the currentDeps if these are up-to-date;
if not, it will get the current stochastic deps from stochDepObject and combine
with fixedDeps to get the sorted, unique deps. This will be stored in currentDeps
and returned
Note that depsObject will be created by algorithms and are not a part of a model. depsObject will need to exist in C++ as calls to the get method will be invoked at run-time.
stochDepObject will be another class that will have instances for each variable that involves stochastic indexing. So in the example above, there will be a stochDepObject for 'mu' that has a list of relevant indices, 'index[1]','index[2]',...
I think that stochDepObject will be a part of the model, as we want one for each variable that involves stochastic indexing.
stochDepObject might look like this:
member data: name of variable that is stochastically indexed (e.g., 'mu')
member data:
[1]: vector of indices that are equal to 1
[2]: vector of indices that are equal to 2
...
member data: queued_changes
member data: stored_index_values
member method: get(theindex) -- will return the nodes for the indices
indicated by the vector corresponding to theindex, if stochDepObject is
up-to-date. If not, it will see if the queued changes involve theindex.
If they do, it will update the relevant vectors. It will then return the
relevant nodes.
So a call to mu_stochDepObject$get(myindex)
will look up all the indices that are equal to myindex and will return the corresponding nodes. These nodes can then be queried for their dependencies using the depObjects functionality (this probably means that a call to getDependencies() in setup code may trigger some downstream additional calls to getDependencies for relevant indices.
Now here's the part that would be good computationally but may get rather involved. Whenever 'index[1]' (or other indices) changes, we want to pass that info to all stochDepObjects that involve 'index[1]', telling them that 'index[1]' has changed - this info would get put in the queued_changes slot. Then mu_stochDepObject knows that it is out-of-date and what needs to be recalculated.
So we somehow need that the dependencies of 'index[1]' include 'mu_stochDepObject' as some sort of special deterministic 'node' so that calculate(getDependencies('index[1]'))
modifies 'mu_stochDepObject'. And I think for the caching above, we need to store the most recent values for the various indices so that mu_stochDepObject can be updated efficiently. This may mean having a vector of index values in stochDepObject ('stored_index_values'). Then when index[1] changes and mu_stochDepObject updates itself by seeing that queued_values includes index[1], it can look up stored_index_values['index[1]']
and the current value of index[1] and use that to update the relevant member data vectors.
As a side note, we presumably need "indexID" as a notion analogous to "nodeID".
Further thoughts:
We'll presumably want to handle complications to basic scalar stochastic indexing, such as
y[i,] ~ dmnorm(mu[index, ], prec[ , , index])
I'm not sure if we want to allow multiple stochastic indices, e.g.,:
index <- c(rcat(1, p), rcat(1, p))
tmp[1:2] ~ dmnorm(mu[index], prec[,])
That might be handled by requiring the user to construct things manually but will need to think through in terms of dependency stuff:
for(i in 1:n) {
id <- rcat(1, p)
muvec[i] <- mu[id]
}
tmp[1:n] ~ dmnorm(muvec[1:n], prec[,])