communityPGLMM.bayes #30

arives · 2018-10-11T14:55:31Z

Russell,

I've just tried communityPGLMM.bayes with a binomial distribution in the cbind(success, failure) format, and I get the error

Error in INLA::inla(as.formula(inla_formula), data = data, verbose = verbose, :
dim(y...orig)[2] == 1 is not TRUE

Is this because of the format?

I'm trying the bayes version because I'm working with a large dataset, and the ML version isn't converging.

Thanks, Tony

daijiang · 2018-10-11T15:36:46Z

I don't think the binomial distribution has been implemented in our code yet. The binary version should work.

rdinnager · 2018-10-11T17:41:01Z

Hi Tony, As Daijiang mentioned, only binary coding works at the moment. It's on my to do list to add successes and trials coding. Should be pretty easy to do so I will try and get that done today. I'll let you know when the update is merged. Cheers, Russell

…

On Fri., 12 Oct. 2018, 2:04 am Anthony R. Ives, ***@***.***> wrote: Russell, I've just tried communityPGLMM.bayes with a binomial distribution in the cbind(success, failure) format, and I get the error Error in INLA::inla(as.formula(inla_formula), data = data, verbose = verbose, : dim(y...orig)[2] == 1 is not TRUE Is this because of the format? I'm trying the bayes version because I'm working with a large dataset, and the ML version isn't converging. Thanks, Tony — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#30>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AFq4OTWQXQYplkh7F-ApMobCRitVXwCGks5uj14AgaJpZM4XXuhO> .

rdinnager · 2018-10-12T03:34:46Z

I've made a pull request where I have added this functionality, which you can try once it has been merged. You have to add the number of trials as a vector to the communityPGLMM call, like this:

phyr::communityPGLMM(freq ~ 1 + shade + (1 | sp__) + (1 | site) + (1 | sp__@site), 
                                               dat, tree = phylotree, family = 'binomial', add.obs.re = F, bayes = TRUE,
                                               Ntrials = dat$freq + dat$freq2, ML.init = FALSE,
                                               prior = "pc.prior.auto")

I would recommend using prior = "pc.prior.auto", because the default priors are usually quite bad, especially for non-gaussian response. Also, ML.init = FALSE will be necessary since ML.init = TRUE (the default) will fit the ML version first to use as starting values, and since it isn't converging, that will not be too useful. Let me know if you run into any other issues.

Cheers!

daijiang · 2018-10-12T20:28:27Z

Tony,
Now you should be able to use the bind(success, fail) in binomial formula when using bayes = T. Remember to set ML.init = F so it won't fit the ML version first.
Cheers,
Daijiang

arives · 2018-10-13T11:49:32Z

Daijiang, Thanks so much for doing this! In the last 2-3 days I've looked more carefully for ways to improve the stability and speed of PGLMM. I'm doing it because the final version of Jesse's ms was accepted, but it really should include a binomial model. We're supposed to return a final version next week. There are some small improvements possible. A problem is that the working estimates of b can get very large or small, and this causes problems with the iterative estimates of the mean effects. The values of b can be truncated (e.g., b[b<(-8)] <- (-8)), but where the cutoff is affects the results sometimes. I also looked at the possibility of using a Laplace approximation, but there are technical reasons why I think this might not be a good solution. The Laplace transform, even though it solves the problem in a different way, still requires an iterative solution to the values of b. By the way, your choice of nelder-mead-nlopt is a good one! Cheers, Tony

arives · 2018-10-14T12:49:53Z

The bayes version still doesn't seem to take the bind() specification. I installed phyr from both the master and bayes branches, and both gave the same:

mod.pglmm.bayes <- communityPGLMM(value ~ env*trait, random.effects = list(re.sp, re.spV, re.env.sp, re.site, re.obs), family="binomial", data=dat, verbose=T, bayes = TRUE, ML.init = FALSE)
Error in INLA::inla(as.formula(inla_formula), data = data, verbose = verbose, :
dim(y...orig)[2] == 1 is not TRUE

daijiang · 2018-10-14T14:52:55Z

Tony,
Can you use cbind(success, fail) instead of value in the formula? It is tricky when using value in parsing the formula.
Best,
Daijiang

arives · 2018-10-14T16:26:16Z

Thanks. I took a quick look, and it wasn't obvious to me how to apply the same code used in the PGLMM approach in the bayes approach. It might be something worth updating just to match glmer. Formulating binomial distributions is always annoying, so making it match glmer might be easier for people. Overall, I'm really impressed with the bayes version. Thanks Russell! Honestly, for large problems or detailed simulation studies, my way of doing binomial distributions (using PQL) is just too slow. I guess we could try to hack glmer for their code. But a lot of the speed of glmer is based on being able to decompose non-nested matrices. I guess I could also use something like a Gibbs sampler. But bayes is working well. I think I might just stick with it.

daijiang · 2018-10-14T17:04:28Z

Hi Tony,
I think glmer also use the cbind(success, fail) form for binomial distribution, right?
Daijiang

arives · 2018-10-14T17:49:10Z

I don't think you can enter "cbind(s,f)", but you can do x$value <- cbind(s,f) and then use value. That's why I set up the non-bayes version like this.

daijiang · 2018-10-14T18:29:06Z

I think you can. https://rdrr.io/cran/lme4/man/glmer.html see the examples.

rdinnager · 2018-10-15T04:12:42Z

Hi Tony,

Thanks! I am glad you like the bayes version. I can't take too much credit however, I just figured out how to specify the model in INLA, which is a package which I am continually impressed with. So thanks goes to the awesome authors of INLA. I have been using it to expand the basic idea of PGLMMs into some cool extensions (such as incorporating space into the models). Anyway, I still want to write a few utility functions for communityPGLMM objects fit with bayes = TRUE, so that users can easily extract and plot things like the marginal posterior distributions of the parameters (or sample from the joint posterior). @daijiang I'm interested in make some plotting functions using ggplot2. Is it cool if I import some functions from ggplot2 into the phy namespace for that?

arives · 2018-10-15T11:55:24Z

Russell, I think Daijiang should have final say over importing ggplot2. Honestly, it annoys me, because it seems intentionally designed to be incompatible with other data structures. But I know a lot of people use it, including Daijiang. Boy, I do seem to sound more and more like a grumpy old fart. But you have to know that I didn't get rid of my b&w monitor until 2001. So maybe I've been a grumpy old fart all my life. Cheers, Tony

daijiang · 2018-10-15T13:17:40Z

Hi Russell,
I think I should spend sometime to learn INLA later! Is it possible to use current syntax of phyr to do the models you want to do? Like including spatial correlation with (1|site__) and specify the V matrix for site? (If we treat time as "site" this can also be temporal autocorrelations) Currently, the data are limited to be organized as species nested within sites, it is possible to include another variable such as year so that temporal and spatial autocorrelations can be included at the same time.

In terms of ggplot2, I think it should be fine to import it in. We have some plotting functions to show the random term matrices in phyr using lattice, but nothing to plot posterior distributions. Is there any other packages have plot functions for INLA outputs?

Tony, I think compatibility is not an issue because we can easily transform the structure of the data to be plotted with base plot function in R. In most cases, ggplot2 uses the "long" format while base R functions use the "wide" format data sets.

Cheers,
Daijiang

arives · 2018-10-15T14:08:44Z

Russell and Daijiang, I think the general specification of communityPGLMM will handle pretty much any covariance matrix if you use the re=list() version with a single element in the list. If you have a factor other than sp and site, the re=list() with 3 or 4 elements should work too (and should be faster). As for ggplot2, I was just whining. Yes, as long as you don't need to interact with ggplot2, it doesn't matter: you can just give it stuff. It's just not very flexible. Cheers, Tony

arives · 2018-10-16T17:55:23Z

I just remembered why it would be good to have communityPGLMM take data.frames with a variable value = cbind(success, failure). simulate() and update() don't work for glmer() unless this format is used.

Cheers, Tony

daijiang · 2018-10-17T01:36:28Z

Hi Tony,
Can you post some code to show that it does not work? It seems work for me.

library(lme4)
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial)
update(gm1,devFunOnly=TRUE)
simulate(gm1, nsim = 1)

Cheers,
Daijiang

arives · 2018-10-17T10:57:56Z

Daijiang, Oh, the problem comes when you try to have update within another function. Here is an example. It took me a while to figure this out, and I'm not sure why lme4 responds this way. Cheers, Tony bootMer_LRT <- function(mod, formula.r, re.form=NA, nAGQ=0, nboot=2000, verbose=F, data=NA) { mod.f <- update(mod, nAGQ=nAGQ, control = glmerControl(calc.derivs=F)) mod.r <- update(mod, formula=formula.r, nAGQ=nAGQ, control = glmerControl(calc.derivs=F)) LLR.true <- logLik(mod.f) - logLik(mod.r) dist <- data.frame(LLR = rep(0,nboot)) sim.data <- model.frame(mod.f) for(j in 1:nboot){ if(is.na(re.form)){ simY <- simulate(mod.r) }else{ simY <- simulate(mod.r, re.form=formula(re.form)) } sim.data[,1] <- simY[,1] mod.f.boot <- update(mod.f, data=sim.data, start=list(theta=attributes(mod.f)$theta)) mod.r.boot <- update(mod.r, data=sim.data, start=list(theta=attributes(mod.r)$theta)) dist$LLR[j] <- logLik(mod.f.boot) - logLik(mod.r.boot) if(verbose==T) show(c(j, LLR.true, dist$LLR[j])) } P <- mean(LLR.true < dist$LLR) return(list(mod.r=mod.r, dist.LLR=dist$LLR, P=P)) } mod <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial) mod.r <- glmer(cbind(incidence, size - incidence) ~ 1 + (1 | herd), data = cbpp, family = binomial) bootMer_LRT(mod, mod.r) But this works cbpp$value <- cbind(cbpp$incidence, cbpp$size - cbpp$incidence) mod <- glmer(value ~ period + (1 | herd), data = cbpp, family = binomial) mod.r <- glmer(value ~ 1 + (1 | herd), data = cbpp, family = binomial) bootMer_LRT(mod, mod.r, nboot=20) From: Daijiang Li <[email protected]> Reply-To: daijiang/phyr <[email protected]> Date: Tuesday, October 16, 2018 at 8:36 PM To: daijiang/phyr <[email protected]> Cc: "Anthony R. Ives" <[email protected]>, Author <[email protected]> Subject: Re: [daijiang/phyr] communityPGLMM.bayes (#30) Hi Tony, Can you post some code to show that it does not work? It seems work for me. library(lme4) gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial) update(gm1,devFunOnly=TRUE) simulate(gm1, nsim = 1) Cheers, Daijiang — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub<#30 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/ALF_LM4KORUA9ovHhAErhV_O9EPtFSG2ks5ulomDgaJpZM4XXuhO>.

daijiang · 2018-10-17T15:41:41Z

Humm... I do not know. I think the formula is still cbind(success, fail) but the data frame from simulate() has the column name as "cbind(success, fail)" literally; then when update() trying to find success in the data frame, it cannot find it (because the name is "cbind(success, fail).success" now. This is not a problem for value because the name did not change after using simulate()...

Not know what to do about it yet.

daijiang · 2018-10-17T15:59:27Z

Tony,
A possible workaround for lme4:

library(lme4)

bootMer_LRT <- function(mod, formula.r = formula(mod.r), re.form=NA, nAGQ=0, nboot=20, verbose=F, data=NA) {
  mod.f <- update(mod, nAGQ=nAGQ, control = glmerControl(calc.derivs=F))
  mod.r <- update(mod, formula=formula.r, nAGQ=nAGQ, control = glmerControl(calc.derivs=F))
  LLR.true <- logLik(mod.f) - logLik(mod.r)
  dist <- data.frame(LLR = rep(0,nboot))
  for(j in 1:nboot){
    simY = simulate(mod.r)

    mod.f.boot <- lme4::refit(mod.f, newresp = simY) # not using update
    mod.r.boot <- lme4::refit(mod.r, newresp = simY)
    dist$LLR[j] <- logLik(mod.f.boot) - logLik(mod.r.boot)

    if(verbose==T) show(c(j, LLR.true, dist$LLR[j]))
  }
  P <- mean(LLR.true < dist$LLR)
  return(list(mod.r=mod.r, dist.LLR=dist$LLR, P=P))
}

cbpp$fail = cbpp$size - cbpp$incidence

mod <- glmer(cbind(incidence, fail) ~ period + (1 | herd),  data = cbpp, family = binomial)
mod.r <- glmer(cbind(incidence, fail) ~ 1 + (1 | herd), data = cbpp, family = binomial)
x1 = bootMer_LRT(mod, mod.r, verbose = T)

cbpp$value <- cbind(cbpp$incidence, cbpp$size - cbpp$incidence)
mod <- glmer(value ~ period + (1 | herd),  data = cbpp, family = binomial)
mod.r <- glmer(value ~ 1 + (1 | herd), data = cbpp, family = binomial)
x2 = bootMer_LRT(mod, mod.r, nboot=20, verbose = T)
x1
x2

arives · 2018-10-17T18:46:05Z

Daijiang, This is great. Thanks. I've updated the code for Jesse's ms with this. This is the type of things that drives me crazy! Thanks for figuring it out. Cheers, Tony

…

______________ Anthony R. Ives UW-Madison 459 Birge Hall 608-262-1519 From: Daijiang Li <[email protected]> Reply-To: daijiang/phyr <[email protected]> Date: Wednesday, October 17, 2018 at 10:59 AM To: daijiang/phyr <[email protected]> Cc: "Anthony R. Ives" <[email protected]>, Author <[email protected]> Subject: Re: [daijiang/phyr] communityPGLMM.bayes (#30) lme4::refit

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

communityPGLMM.bayes #30

communityPGLMM.bayes #30

arives commented Oct 11, 2018

daijiang commented Oct 11, 2018

rdinnager commented Oct 11, 2018 via email

rdinnager commented Oct 12, 2018

daijiang commented Oct 12, 2018

arives commented Oct 13, 2018 via email

arives commented Oct 14, 2018

daijiang commented Oct 14, 2018

arives commented Oct 14, 2018 via email

daijiang commented Oct 14, 2018

arives commented Oct 14, 2018 via email

daijiang commented Oct 14, 2018

rdinnager commented Oct 15, 2018

arives commented Oct 15, 2018 via email

daijiang commented Oct 15, 2018

arives commented Oct 15, 2018 via email

arives commented Oct 16, 2018

daijiang commented Oct 17, 2018

arives commented Oct 17, 2018 via email

daijiang commented Oct 17, 2018

daijiang commented Oct 17, 2018 •

edited

Loading

arives commented Oct 17, 2018 via email

communityPGLMM.bayes #30

communityPGLMM.bayes #30

Comments

arives commented Oct 11, 2018

daijiang commented Oct 11, 2018

rdinnager commented Oct 11, 2018 via email

rdinnager commented Oct 12, 2018

daijiang commented Oct 12, 2018

arives commented Oct 13, 2018 via email

arives commented Oct 14, 2018

daijiang commented Oct 14, 2018

arives commented Oct 14, 2018 via email

daijiang commented Oct 14, 2018

arives commented Oct 14, 2018 via email

daijiang commented Oct 14, 2018

rdinnager commented Oct 15, 2018

arives commented Oct 15, 2018 via email

daijiang commented Oct 15, 2018

arives commented Oct 15, 2018 via email

arives commented Oct 16, 2018

daijiang commented Oct 17, 2018

arives commented Oct 17, 2018 via email

daijiang commented Oct 17, 2018

daijiang commented Oct 17, 2018 • edited Loading

arives commented Oct 17, 2018 via email

daijiang commented Oct 17, 2018 •

edited

Loading