Skip to content

Commit

Permalink
commit
Browse files Browse the repository at this point in the history
  • Loading branch information
changwoo-lee committed Jan 12, 2024
1 parent ecac1f4 commit 8319c1e
Show file tree
Hide file tree
Showing 3 changed files with 39 additions and 37 deletions.
69 changes: 35 additions & 34 deletions R/bglm_me.R
Original file line number Diff line number Diff line change
@@ -1,33 +1,34 @@
#' Bayesian generalized linear models with spatial exposure measurement error.
#'
#' This function fits Bayesian generalized linear model in the presence of spatial exposure measurement error of covariate(s) \eqn{X}, represented as a multivariate normal prior.
#' One of the most important features of this function is that it allows sparse matrix input for the prior precision matrix of \eqn{X} for scalable computation.
#' This function fits a Bayesian generalized linear model in the presence of spatial exposure measurement error for covariate(s) \eqn{X}, represented as a multivariate normal prior.
#' One of the most important features of this function is that it allows a sparse matrix input for the prior precision matrix of \eqn{X} for scalable computation.
#' As of version 1.0.0, only Bayesian logistic regression model is supported among GLMs, and
#' function \code{bglm_me()} runs a Gibbs sampler to carry out posterior inference using Polya-Gamma augmentation (Polson et al., 2013).
#' See below "Details" section for the model description, and Lee et al. (2024) for an application example in environmental epidemiology.
#' See below "Details" section for the model description and Lee et al. (2024) for an application example in environmental epidemiology.
#'
#' Denote \eqn{Y_i} be a binary response, \eqn{X_i} be a \eqn{q\times 1} covariate that is subject to spatial exposure measurement error,
#' and \eqn{Z_i} be a \eqn{p\times 1} covariate without measurement error.
#' Consider logistic regression model, independently for each \eqn{i=1,\dots,n,}
#' Denote \eqn{Y_i} be a binary response, \eqn{X_i} be a \eqn{q\times 1} covariate vector that is subject to spatial exposure measurement error,
#' and \eqn{Z_i} be a \eqn{p\times 1} covariate vector without measurement error.
#' Consider a logistic regression model, independently for each \eqn{i=1,\dots,n,}
#' \deqn{\log(Pr(Y_i = 1)/Pr(Y_i = 0)) = \beta_0 + X_i^\top \beta_X + Z_i^\top \beta_Z,}
#' and spatial exposure measurement error of \eqn{X_i} for \eqn{i=1,\dots,n} is incorporated into the model as a multivariate normal prior.
#' For example when \eqn{q=1}, we have \eqn{n-}dimensional multivariate normal prior of \eqn{X = (X_1,\dots,X_n)^\top},
#' and spatial exposure measurement error of \eqn{X_i} for \eqn{i=1,\dots,n} is incorporated into the model using a multivariate normal prior.
#' For example when \eqn{q=1}, we have an \eqn{n-}dimensional multivariate normal prior of \eqn{X = (X_1,\dots,X_n)^\top},
#' \deqn{(X_1,\dots,X_n)\sim N_n(\mu_X, Q_X^{-1}).}
#' Most importantly, it allows sparse matrix input for the prior precision matrix \eqn{Q_X} for scalable computation, which could be obtained by Vecchia approximation.
#' Most importantly, it allows a sparse matrix input for the prior precision matrix \eqn{Q_X} for scalable computation, which could be obtained by Vecchia approximation.
#' When \eqn{q>1}, \eqn{q} independent \eqn{n-}dimensional multivariate normal priors are assumed.
#'
#' We consider normal priors for regression coefficients,
#' \deqn{\beta_0 \sim N(0, V_\beta), \quad \beta_{X,j} \stackrel{iid}{\sim} N(0, V_\beta), \quad \beta_{Z,k} \stackrel{iid}{\sim} N(0, V_\beta)}
#' where \code{var_beta} corresponds to \eqn{V_\beta}.
#'
#' @param Y *vector<int>*, n by 1 binary response vector.
#' @param X_mean *matrix<num>*, n by 1 prior mean matrix \eqn{\mu_X}. When there are q multiple exposures subject to measurement error, it can be a length q list of n by 1 matrices.
#' @param X_prec *matrix<num>*, n by 1 prior precision matrix \eqn{Q_X}, which allows sparse format from \link{Matrix} package. When there are q multiple exposures subject to measurement error, it can be a length q list of n by n matrices.
#' @param Z *matrix<num>*, n by p covariates that are not subject to measurement error.
#' @param X_mean *vector<num>*, n by 1 prior mean vector \eqn{\mu_X}. When there are q multiple exposures subject to measurement error, it can be a length q list of n by 1 vectors.
#' @param X_prec *matrix<num>*, n by n prior precision matrix \eqn{Q_X}, which allows sparse format from \link{Matrix} package. When there are q multiple exposures subject to measurement error, it can be a length q list of n by n matrices.
#' @param Z *matrix<num>*, n by p matrix containing covariates that are not subject to measurement error.
#' @param family *class \link[stats]{family}*, a description of the error distribution and link function to be used in the model. Currently only supports binomial(link = "logit").
#' @param nburn *integer*, burn-in iteration, default 1000.
#' @param nsave *integer*, number of posterior samples, default 1000. Total number of MCMC iteration is \code{nburn + nsave * nthin}.
#' @param nthin *integer*, thin-in rate, default 1.
#' @param prior *list*, list of prior parameters of regression model. Default is \code{list(var_beta = 100)}.
#' @param nburn *integer*, number of burn-in iteration (default 5000).
#' @param nsave *integer*, number of posterior samples (default 5000). Total number of MCMC iteration is \code{nburn + nsave * nthin}.
#' @param nthin *integer*, thin-in rate (default 1).
#' @param prior *list*, list of prior parameters of the regression model. Default is \code{list(var_beta = 100)}.
#' @param saveX *logical*, default FALSE, whether save posterior samples of X (exposure).
#'
#' @return List of the following:
Expand All @@ -49,9 +50,9 @@
#' data(health_sim)
#' library(fields)
#' library(maps)
#' # Obtain the predicted exposure mean and covariance at health subject locations based on Jan 10, 2012 data
#' # using Gaussian process prior with mean zero and exponential covariance kernel
#' # with fixed range 5 (in miles) and stdev 0.5 (in ppbv)
#' # Obtain the predicted exposure mean and covariance at simulated health subject locations based on Jan 10, 2012 NO2 data
#' # using a Gaussian process prior with mean zero and exponential covariance kernel
#' # with fixed range 8 (in km) and standard deviation 1.
#'
#' # exposure data
#' data_jan10 = NO2_Jan2012[NO2_Jan2012$date == as.POSIXct("2012-01-10"),]
Expand All @@ -60,36 +61,36 @@
#' # health data
#' coords_health = cbind(health_sim$lon, health_sim$lat)
#'
#' distmat_xx <- rdist.earth(coords_monitor)
#' distmat_xy <- rdist.earth(coords_monitor, coords_health)
#' distmat_yy <- rdist.earth(coords_health)
#' distmat_xx <- rdist.earth(coords_monitor, miles = F)
#' distmat_xy <- rdist.earth(coords_monitor, coords_health, miles = F)
#' distmat_yy <- rdist.earth(coords_health, miles = F)
#'
#' a = 5; sigma = 1; # assume known
#' a = 8; sigma = 1; # assume known
#'
#' Sigmaxx = fields::Matern(distmat_xx, smoothness = 0.5, range = a, phi = sigma^2)
#' Sigmaxy = fields::Matern(distmat_xy, smoothness = 0.5, range = a, phi = sigma^2)
#' Sigmayy = fields::Matern(distmat_yy, smoothness = 0.5, range = a, phi = sigma^2)
#'
#' # posterior predictive mean and covariance of exposure at health data
#' # posterior predictive mean and covariance of exposure at health subject locations
#' X_mean <- t(Sigmaxy) %*% solve(Sigmaxx, data_jan10$lnNO2)
#' X_cov <- Sigmayy - t(Sigmaxy) %*% solve(Sigmaxx,Sigmaxy) # n_y by n_y
#'
#' # visualize
#' # monitoring station exposure data
#' quilt.plot(cbind(data_jan10$lon, data_jan10$lat),
#' data_jan10$lnNO2, main = "NO2 exposures (in log) at 21 stations",
#' data_jan10$lnNO2, main = "NO2 exposures (in log) at 21 monitoring stations",
#' xlab = "longitude", ylab= "latitude", xlim = c(-96.5, -94.5), ylim = c(29, 30.5))
#' maps::map("county", "Texas", add = T)
#'
#' # posterior predictive mean of exposure at health data
#' # posterior predictive mean of exposure at health subject locations
#' quilt.plot(cbind(health_sim$lon, health_sim$lat),
#' X_mean, main = "posterior predictive mean of exposure at health data locations",
#' X_mean, main = "posterior predictive mean of exposure at health subject locations",
#' xlab = "longitude", ylab= "latitude", xlim = c(-96.5, -94.5), ylim = c(29, 30.5))
#' maps::map("county", "Texas", add = T)
#'
#' # posterior predictive sd of exposure at health data
#' # posterior predictive sd of exposure at health subject locations
#' quilt.plot(cbind(health_sim$lon, health_sim$lat),
#' sqrt(diag(X_cov)), main = "posterior predictive sd of exposure at health data locations",
#' sqrt(diag(X_cov)), main = "posterior predictive sd of exposure at health subject locations",
#' xlab = "longitude", ylab= "latitude", xlim = c(-96.5, -94.5), ylim = c(29, 30.5))
#' maps::map("county", "Texas", add = T)
#'
Expand All @@ -100,14 +101,14 @@
#' Q_sparse = run_vecchia$Q
#' run_vecchia$cputime
#'
#' # fit the model, continuous data
#' # fit the model, binary response
#' fit = bglm_me(Y = health_sim$Ybinary,
#' X_mean = X_mean,
#' X_prec = Q_sparse, # sparse precision matrix
#' Z = health_sim$Z,
#' family = binomial(link = "logit"),
#' nburn = 1000,
#' nsave = 1000,
#' nburn = 5000,
#' nsave = 5000,
#' nthin = 5)
#' fit$cputime
#' summary(fit$posterior)
Expand All @@ -120,8 +121,8 @@ bglm_me <- function(Y,
X_prec,
Z,
family = binomial(link = "logit"),
nburn = 1000,
nsave = 1000,
nburn = 5000,
nsave = 5000,
nthin = 1,
prior = NULL,
saveX = FALSE){
Expand Down
2 changes: 1 addition & 1 deletion README.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ knitr::opts_chunk$set(
# bspme
<!-- badges: start -->
<!-- badges: end -->
Go to package website: [[link]](https://changwoo-lee.github.io/bspme/) (NOTE: IT IS NOW UNPUBLISHED)
Go to the package website: [[link]](https://changwoo-lee.github.io/bspme/) (NOTE: IT IS NOW UNPUBLISHED)

See a vignette with ozone exposure data: [[link]](https://changwoo-lee.github.io/bspme/articles/Ozone-exposure-and-health-data-analysis.html) (NOTE: SEE BELOW "INSTALLATION" SECTION INSTEAD)

Expand Down
5 changes: 3 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,9 @@
<!-- badges: start -->
<!-- badges: end -->

Go to package website: [\[link\]](https://changwoo-lee.github.io/bspme/)
(NOTE: IT IS NOW UNPUBLISHED)
Go to the package website:
[\[link\]](https://changwoo-lee.github.io/bspme/) (NOTE: IT IS NOW
UNPUBLISHED)

See a vignette with ozone exposure data:
[\[link\]](https://changwoo-lee.github.io/bspme/articles/Ozone-exposure-and-health-data-analysis.html)
Expand Down

0 comments on commit 8319c1e

Please sign in to comment.