commit

changwoo-lee · Jan 12, 2024 · 8319c1e · 8319c1e
1 parent ecac1f4
commit 8319c1e
Show file tree

Hide file tree

Showing 3 changed files with 39 additions and 37 deletions.
diff --git a/R/bglm_me.R b/R/bglm_me.R
@@ -1,33 +1,34 @@
 #' Bayesian generalized linear models with spatial exposure measurement error.
 #'
-#' This function fits Bayesian generalized linear model in the presence of spatial exposure measurement error of covariate(s) \eqn{X}, represented as a multivariate normal prior.
-#' One of the most important features of this function is that it allows sparse matrix input for the prior precision matrix of \eqn{X} for scalable computation.
+#' This function fits a Bayesian generalized linear model in the presence of spatial exposure measurement error for covariate(s) \eqn{X}, represented as a multivariate normal prior.
+#' One of the most important features of this function is that it allows a sparse matrix input for the prior precision matrix of \eqn{X} for scalable computation.
 #' As of version 1.0.0, only Bayesian logistic regression model is supported among GLMs, and
 #' function \code{bglm_me()} runs a Gibbs sampler to carry out posterior inference using Polya-Gamma augmentation (Polson et al., 2013).
-#' See below "Details" section for the model description, and Lee et al. (2024) for an application example in environmental epidemiology.
+#' See below "Details" section for the model description and Lee et al. (2024) for an application example in environmental epidemiology.
 #'
-#' Denote \eqn{Y_i} be a binary response, \eqn{X_i} be a \eqn{q\times 1} covariate that is subject to spatial exposure measurement error,
-#' and \eqn{Z_i} be a \eqn{p\times 1} covariate without measurement error.
-#' Consider logistic regression model, independently for each \eqn{i=1,\dots,n,}
+#' Denote \eqn{Y_i} be a binary response, \eqn{X_i} be a \eqn{q\times 1} covariate vector that is subject to spatial exposure measurement error,
+#' and \eqn{Z_i} be a \eqn{p\times 1} covariate vector without measurement error.
+#' Consider a logistic regression model, independently for each \eqn{i=1,\dots,n,}
 #' \deqn{\log(Pr(Y_i = 1)/Pr(Y_i = 0)) = \beta_0 + X_i^\top \beta_X +  Z_i^\top \beta_Z,}
-#' and spatial exposure measurement error of \eqn{X_i} for \eqn{i=1,\dots,n} is incorporated into the model as a multivariate normal prior.
-#' For example when \eqn{q=1}, we have \eqn{n-}dimensional multivariate normal prior of \eqn{X = (X_1,\dots,X_n)^\top},
+#' and spatial exposure measurement error of \eqn{X_i} for \eqn{i=1,\dots,n} is incorporated into the model using a multivariate normal prior.
+#' For example when \eqn{q=1}, we have an \eqn{n-}dimensional multivariate normal prior of \eqn{X = (X_1,\dots,X_n)^\top},
 #' \deqn{(X_1,\dots,X_n)\sim N_n(\mu_X, Q_X^{-1}).}
-#' Most importantly, it allows sparse matrix input for the prior precision matrix \eqn{Q_X} for scalable computation, which could be obtained by Vecchia approximation.
+#' Most importantly, it allows a sparse matrix input for the prior precision matrix \eqn{Q_X} for scalable computation, which could be obtained by Vecchia approximation.
+#' When \eqn{q>1}, \eqn{q} independent \eqn{n-}dimensional multivariate normal priors are assumed.
 #'
 #' We consider normal priors for regression coefficients,
 #' \deqn{\beta_0 \sim N(0, V_\beta), \quad \beta_{X,j} \stackrel{iid}{\sim} N(0, V_\beta), \quad \beta_{Z,k} \stackrel{iid}{\sim} N(0, V_\beta)}
 #' where \code{var_beta} corresponds to \eqn{V_\beta}.
 #'
 #' @param Y *vector&lt;int&gt;*, n by 1 binary response vector.
-#' @param X_mean *matrix&lt;num&gt;*, n by 1 prior mean matrix \eqn{\mu_X}. When there are q multiple exposures subject to measurement error, it can be a length q list of n by 1 matrices.
-#' @param X_prec *matrix&lt;num&gt;*, n by 1 prior precision matrix \eqn{Q_X}, which allows sparse format from \link{Matrix} package. When there are q multiple exposures subject to measurement error, it can be a length q list of n by n matrices.
-#' @param Z *matrix&lt;num&gt;*, n by p covariates that are not subject to measurement error.
+#' @param X_mean *vector&lt;num&gt;*, n by 1 prior mean vector \eqn{\mu_X}. When there are q multiple exposures subject to measurement error, it can be a length q list of n by 1 vectors.
+#' @param X_prec *matrix&lt;num&gt;*, n by n prior precision matrix \eqn{Q_X}, which allows sparse format from \link{Matrix} package. When there are q multiple exposures subject to measurement error, it can be a length q list of n by n matrices.
+#' @param Z *matrix&lt;num&gt;*, n by p matrix containing covariates that are not subject to measurement error.
 #' @param family *class \link[stats]{family}*, a description of the error distribution and link function to be used in the model. Currently only supports binomial(link = "logit").
-#' @param nburn *integer*, burn-in iteration, default 1000.
-#' @param nsave *integer*, number of posterior samples, default 1000. Total number of MCMC iteration is \code{nburn + nsave * nthin}.
-#' @param nthin *integer*, thin-in rate, default 1.
-#' @param prior *list*, list of prior parameters of regression model. Default is \code{list(var_beta = 100)}.
+#' @param nburn *integer*, number of burn-in iteration (default 5000).
+#' @param nsave *integer*, number of posterior samples (default 5000). Total number of MCMC iteration is \code{nburn + nsave * nthin}.
+#' @param nthin *integer*, thin-in rate (default 1).
+#' @param prior *list*, list of prior parameters of the regression model. Default is \code{list(var_beta = 100)}.
 #' @param saveX *logical*, default FALSE, whether save posterior samples of X (exposure).
 #'
 #' @return List of the following:
@@ -49,9 +50,9 @@
 #' data(health_sim)
 #' library(fields)
 #' library(maps)
-#' # Obtain the predicted exposure mean and covariance at health subject locations based on Jan 10, 2012 data
-#' # using Gaussian process prior with mean zero and exponential covariance kernel
-#' # with fixed range 5 (in miles) and stdev 0.5 (in ppbv)
+#' # Obtain the predicted exposure mean and covariance at simulated health subject locations based on Jan 10, 2012 NO2 data
+#' # using a Gaussian process prior with mean zero and exponential covariance kernel
+#' # with fixed range 8 (in km) and standard deviation 1.
 #'
 #' # exposure data
 #' data_jan10 = NO2_Jan2012[NO2_Jan2012$date == as.POSIXct("2012-01-10"),]
@@ -60,36 +61,36 @@
 #' # health data
 #' coords_health = cbind(health_sim$lon, health_sim$lat)
 #'
-#' distmat_xx <- rdist.earth(coords_monitor)
-#' distmat_xy <- rdist.earth(coords_monitor, coords_health)
-#' distmat_yy <- rdist.earth(coords_health)
+#' distmat_xx <- rdist.earth(coords_monitor, miles = F)
+#' distmat_xy <- rdist.earth(coords_monitor, coords_health, miles = F)
+#' distmat_yy <- rdist.earth(coords_health, miles = F)
 #'
-#' a = 5; sigma = 1; # assume known
+#' a = 8; sigma = 1; # assume known
 #'
 #' Sigmaxx = fields::Matern(distmat_xx, smoothness = 0.5, range = a, phi = sigma^2)
 #' Sigmaxy = fields::Matern(distmat_xy, smoothness = 0.5, range = a, phi = sigma^2)
 #' Sigmayy = fields::Matern(distmat_yy, smoothness = 0.5, range = a, phi = sigma^2)
 #'
-#' # posterior predictive mean and covariance of exposure at health data
+#' # posterior predictive mean and covariance of exposure at health subject locations
 #' X_mean <- t(Sigmaxy) %*% solve(Sigmaxx, data_jan10$lnNO2)
 #' X_cov <- Sigmayy - t(Sigmaxy) %*% solve(Sigmaxx,Sigmaxy) # n_y by n_y
 #'
 #' # visualize
 #' # monitoring station exposure data
 #' quilt.plot(cbind(data_jan10$lon, data_jan10$lat),
-#'            data_jan10$lnNO2, main = "NO2 exposures (in log) at 21 stations",
+#'            data_jan10$lnNO2, main = "NO2 exposures (in log) at 21 monitoring stations",
 #'            xlab = "longitude", ylab= "latitude", xlim = c(-96.5, -94.5), ylim = c(29, 30.5))
 #' maps::map("county", "Texas", add = T)
 #'
-#' # posterior predictive mean of exposure at health data
+#' # posterior predictive mean of exposure at health subject locations
 #' quilt.plot(cbind(health_sim$lon, health_sim$lat),
-#'            X_mean, main = "posterior predictive mean of exposure at health data locations",
+#'            X_mean, main = "posterior predictive mean of exposure at health subject locations",
 #'            xlab = "longitude", ylab= "latitude", xlim = c(-96.5, -94.5), ylim = c(29, 30.5))
 #' maps::map("county", "Texas", add = T)
 #'
-#' # posterior predictive sd of exposure at health data
+#' # posterior predictive sd of exposure at health subject locations
 #' quilt.plot(cbind(health_sim$lon, health_sim$lat),
-#'            sqrt(diag(X_cov)), main = "posterior predictive sd of exposure at health data locations",
+#'            sqrt(diag(X_cov)), main = "posterior predictive sd of exposure at health subject locations",
 #'            xlab = "longitude", ylab= "latitude", xlim = c(-96.5, -94.5), ylim = c(29, 30.5))
 #' maps::map("county", "Texas", add = T)
 #'
@@ -100,14 +101,14 @@
 #' Q_sparse = run_vecchia$Q
 #' run_vecchia$cputime
 #'
-#' # fit the model, continuous data
+#' # fit the model, binary response
 #' fit = bglm_me(Y = health_sim$Ybinary,
 #'                 X_mean = X_mean,
 #'                 X_prec = Q_sparse, # sparse precision matrix
 #'                 Z = health_sim$Z,
 #'                 family = binomial(link = "logit"),
-#'                 nburn = 1000,
-#'                 nsave = 1000,
+#'                 nburn = 5000,
+#'                 nsave = 5000,
 #'                 nthin = 5)
 #' fit$cputime
 #' summary(fit$posterior)
@@ -120,8 +121,8 @@ bglm_me <- function(Y,
                        X_prec,
                        Z,
                        family = binomial(link = "logit"),
-                       nburn = 1000,
-                       nsave = 1000,
+                       nburn = 5000,
+                       nsave = 5000,
                        nthin = 1,
                        prior = NULL,
                        saveX = FALSE){

diff --git a/README.Rmd b/README.Rmd
@@ -20,7 +20,7 @@ knitr::opts_chunk$set(
 # bspme 
 <!-- badges: start -->
 <!-- badges: end -->
-Go to package website: [[link]](https://changwoo-lee.github.io/bspme/) (NOTE: IT IS NOW UNPUBLISHED)
+Go to the package website: [[link]](https://changwoo-lee.github.io/bspme/) (NOTE: IT IS NOW UNPUBLISHED)
 
 See a vignette with ozone exposure data: [[link]](https://changwoo-lee.github.io/bspme/articles/Ozone-exposure-and-health-data-analysis.html) (NOTE: SEE BELOW "INSTALLATION" SECTION INSTEAD)
 

diff --git a/README.md b/README.md
@@ -6,8 +6,9 @@
 <!-- badges: start -->
 <!-- badges: end -->
 
-Go to package website: [\[link\]](https://changwoo-lee.github.io/bspme/)
-(NOTE: IT IS NOW UNPUBLISHED)
+Go to the package website:
+[\[link\]](https://changwoo-lee.github.io/bspme/) (NOTE: IT IS NOW
+UNPUBLISHED)
 
 See a vignette with ozone exposure data:
 [\[link\]](https://changwoo-lee.github.io/bspme/articles/Ozone-exposure-and-health-data-analysis.html)