Title: | Bolstad Functions |
---|---|
Description: | A set of R functions and data sets for the book "Understanding Computational Bayesian Statistics." This book was written by Bill (WM) Bolstad and published in 2009 by John Wiley & Sons (ISBN 978-0470046098). |
Authors: | James Curran <[email protected]> |
Maintainer: | James Curran <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0-29 |
Built: | 2025-03-01 05:20:58 UTC |
Source: | https://github.com/jmcurran/bolstad2 |
Data from a hypothetical HMO-HIV+ study shown in Table 1.1 of Hosmer, D.W. and Lemeshow, S. (1998) Applied Survival Analysis: Regression Modeling of Time to Event Data, John Wiley and Sons Inc., New York, NY
A data frame with 100 observations on 7 variables.
[,1] | id | numeric | l Subject ID code |
[,2] | entdate | date | Entry date (ddmmyr) |
[,3] | enddate | date | Entry date (ddmmyr) |
[,4] | time | numeric | Survival Time = days between Entry date and End date |
[,5] | age | numeric | Age in years |
[,6] | drug | factor | History of IV drug use (0 = No, 1 = Yes) |
[,7] | censor | factor | Follow-Up Status1 = Death due to AIDS or AIDS |
related factors (0 = Alive at study end or lost to follow-up) | |||
Uses a Metropolis Hastings scheme on the proportional hazards model to draw sample from posterior. Uses a matched curvature Student's t candidate generating distribution with 4 degrees of freedom to give heavy tails.
BayesCPH( y, t, x, steps = 1000, priorMean = NULL, priorVar = NULL, mleMean = NULL, mleVar, startValue = NULL, randomSeed = NULL, plots = FALSE )
BayesCPH( y, t, x, steps = 1000, priorMean = NULL, priorVar = NULL, mleMean = NULL, mleVar, startValue = NULL, randomSeed = NULL, plots = FALSE )
y |
the Poisson censored response vector. It has value 0 when the variable is censored and 1 when it is not censored. |
t |
time |
x |
matrix of covariates |
steps |
the number of steps to use in the Metropolis-Hastings updating |
priorMean |
the mean of the prior |
priorVar |
the variance of the prior |
mleMean |
the mean of the matched curvature likelihood |
mleVar |
the covariance matrix of the matched curvature likelihood |
startValue |
a vector of starting values for all of the regression coefficients including the intercept |
randomSeed |
a random seed to use for different chains |
plots |
Plot the time series and auto correlation functions for each of the model coefficients |
A list containing the following components:
beta |
a data frame containing the sample of the model coefficients from the posterior distribution |
mleMean |
the mean of the matched curvature likelihood. This is useful if you've used a training set to estimate the value and wish to use it with another data set |
mleVar |
the covariance matrix of the matched curvature likelihood. See mleMean for why you'd want this |
Performas Metropolis Hastings on the logistic regression model to draw sample from posterior. Uses a matched curvature Student's t candidate generating distribution with 4 degrees of freedom to give heavy tails.
BayesLogistic( y, x, steps = 1000, priorMean = NULL, priorVar = NULL, mleMean = NULL, mleVar, startValue = NULL, randomSeed = NULL, plots = FALSE )
BayesLogistic( y, x, steps = 1000, priorMean = NULL, priorVar = NULL, mleMean = NULL, mleVar, startValue = NULL, randomSeed = NULL, plots = FALSE )
y |
the binary response vector |
x |
matrix of covariates |
steps |
the number of steps to use in the Metropolis-Hastings updating |
priorMean |
the mean of the prior |
priorVar |
the variance of the prior |
mleMean |
the mean of the matched curvature likelihood |
mleVar |
the covariance matrix of the matched curvature likelihood |
startValue |
a vector of starting values for all of the regression coefficients including the intercept |
randomSeed |
a random seed to use for different chains |
plots |
Plot the time series and auto correlation functions for each of the model coefficients |
A list containing the following components:
beta |
a data frame containing the sample of the model coefficients from the posterior distribution |
mleMean |
the mean of the matched curvature likelihood. This is useful if you've used a training set to estimate the value and wish to use it with another data set |
mleVar |
the covariance matrix of the matched curvature likelihood. See mleMean for why you'd want this |
data(logisticTest.df) BayesLogistic(logisticTest.df$y, logisticTest.df$x)
data(logisticTest.df) BayesLogistic(logisticTest.df$y, logisticTest.df$x)
Performs Metropolis Hastings on the logistic regression model to draw sample from posterior. Uses a matched curvature Student's t candidate generating distribution with 4 degrees of freedom to give heavy tails.
BayesPois( y, x, steps = 1000, priorMean = NULL, priorVar = NULL, mleMean = NULL, mleVar, startValue = NULL, randomSeed = NULL, plots = FALSE )
BayesPois( y, x, steps = 1000, priorMean = NULL, priorVar = NULL, mleMean = NULL, mleVar, startValue = NULL, randomSeed = NULL, plots = FALSE )
y |
the binary response vector |
x |
matrix of covariates |
steps |
the number of steps to use in the Metropolis-Hastings updating |
priorMean |
the mean of the prior |
priorVar |
the variance of the prior |
mleMean |
the mean of the matched curvature likelihood |
mleVar |
the covariance matrix of the matched curvature likelihood |
startValue |
a vector of starting values for all of the regression coefficients including the intercept |
randomSeed |
a random seed to use for different chains |
plots |
Plot the time series and auto correlation functions for each of the model coefficients |
A list containing the following components:
beta |
a data frame containing the sample of the model coefficients from the posterior distribution |
mleMean |
the mean of the matched curvature likelihood. This is useful if you've used a training set to estimate the value and wish to use it with another data set |
mleVar |
the covariance matrix of the matched curvature likelihood. See mleMean for why you'd want this |
data(poissonTest.df) results = BayesPois(poissonTest.df$y, poissonTest.df$x)
data(poissonTest.df) results = BayesPois(poissonTest.df$y, poissonTest.df$x)
This function uses the MetropolisHastings algorithm to draw a sample from a correlated bivariate normal target density using a random walk candidate and an independent candidate density respectively where we are drawing both parameters in a single draw. It can also use the blockwise Metropolis Hastings algorithm and Gibbs sampling respectively to draw a sample from the correlated bivariate normal target.
bivnormMH(rho, rho1 = 0.9, sigma = c(1.2, 1.2), steps = 1000, type = "ind")
bivnormMH(rho, rho1 = 0.9, sigma = c(1.2, 1.2), steps = 1000, type = "ind")
rho |
the correlation coefficient for the bivariate normal |
rho1 |
the correlation of the candidate distribution. Only used when type = 'ind' |
sigma |
the standard deviations of the marginal distributions of the independent candidate density. Only used when type = 'ind' |
steps |
the number of Metropolis Hastings steps |
type |
the type of candidate generation to use. Can be one of 'rw' = random walk, 'ind' = independent normals, 'gibbs' = Gibbs sampling or 'block' = blockwise. It is sufficient to use 'r','i','g', or 'b' |
returns a list which contains a data frame called targetSample with members x and y. These are the samples from the target density.
## independent chain chain1.df=bivnormMH(0.9)$targetSample ## random walk chain chain2.df=bivnormMH(0.9, type = 'r')$targetSample ## blockwise MH chain chain3.df=bivnormMH(0.9, type = 'b')$targetSample ## Gibbs sampling chain chain4.df=bivnormMH(0.9, type = 'g')$targetSample oldPar = par(mfrow=c(2,2)) plot(y ~ x, type = 'l', chain1.df, main = 'Independent') plot(y ~ x, type = 'l', chain2.df, main = 'Random Walk') plot(y ~ x, type = 'l', chain3.df, main = 'Blockwise') plot(y ~ x, type = 'l', chain4.df, main = 'Gibbs') par(oldPar)
## independent chain chain1.df=bivnormMH(0.9)$targetSample ## random walk chain chain2.df=bivnormMH(0.9, type = 'r')$targetSample ## blockwise MH chain chain3.df=bivnormMH(0.9, type = 'b')$targetSample ## Gibbs sampling chain chain4.df=bivnormMH(0.9, type = 'g')$targetSample oldPar = par(mfrow=c(2,2)) plot(y ~ x, type = 'l', chain1.df, main = 'Independent') plot(y ~ x, type = 'l', chain2.df, main = 'Random Walk') plot(y ~ x, type = 'l', chain3.df, main = 'Blockwise') plot(y ~ x, type = 'l', chain4.df, main = 'Gibbs') par(oldPar)
A random sample of size 10 from a distribution
where both mu and sigma are unknown parameters.
A data frame with 10 observations in a single variable called y
The age and coronory heart disease status of 100 individuals taken from Hosmer and Lemeshow (1989).
A data frame with 100 observations in two columns
[,1] | age | numeric | age in years |
[,2] | chd | numeric factor | coronary heat disease status. Levels (1 = Yes), (0 = No) |
Calculates a lower, upper, or two-sided credible interval from the numerical posterior CDF or from a sample from the posterior.
credInt(theta, cdf = NULL, conf = 0.95, type = "twosided")
credInt(theta, cdf = NULL, conf = 0.95, type = "twosided")
theta |
either a sample from the posterior density or the values over which the the posterior CDF is specified |
cdf |
the values of the CDF, |
conf |
the desired 'confidence' level |
type |
the type of interval to return, 'lower' = one sided lower bound, 'two-sided' = two - sided, or 'upper' = one sided upper bound. It is sufficient to use 'l','t' or 'u' |
This function uses linear interpolation to calculate bounds for points that may not be specified by CDF
a list containing the elements lower.bound, uppper.bound or both depending on type
## commands for calculating a numerical posterior CDF. ## In this example, the likelihood is proportional to ## \eqn{\theta^{3/2}\times \exp(-\theta/4)} and a N(6, 9) prior is used. theta = seq(from = 0.001, to = 40, by = 0.001) prior = dnorm(theta,6,3) ppnLike = theta^1.5*exp(-theta/4) ppnPost = prior*ppnLike scaleFactor = sintegral(theta, ppnPost)$int posterior = ppnPost/scaleFactor cdf = sintegral(theta, posterior)$y ci = credInt(theta, cdf) par(mfrow=c(2,2)) plot(prior ~ theta, type = 'l', main = 'Prior N(6, 9)') plot(ppnLike ~ theta, type = 'l', main = 'Proportional likelihood') plot(posterior ~ theta, type = 'l', main = 'Posterior') abline(v=c(unlist(ci))) ## Use an inverse method to take a random sample of size 1000 ## from the posterior suppressWarnings({Finv = approxfun(cdf,theta)}) thetaSample = Finv(runif(1000)) ci = credInt(thetaSample)
## commands for calculating a numerical posterior CDF. ## In this example, the likelihood is proportional to ## \eqn{\theta^{3/2}\times \exp(-\theta/4)} and a N(6, 9) prior is used. theta = seq(from = 0.001, to = 40, by = 0.001) prior = dnorm(theta,6,3) ppnLike = theta^1.5*exp(-theta/4) ppnPost = prior*ppnLike scaleFactor = sintegral(theta, ppnPost)$int posterior = ppnPost/scaleFactor cdf = sintegral(theta, posterior)$y ci = credInt(theta, cdf) par(mfrow=c(2,2)) plot(prior ~ theta, type = 'l', main = 'Prior N(6, 9)') plot(ppnLike ~ theta, type = 'l', main = 'Proportional likelihood') plot(posterior ~ theta, type = 'l', main = 'Posterior') abline(v=c(unlist(ci))) ## Use an inverse method to take a random sample of size 1000 ## from the posterior suppressWarnings({Finv = approxfun(cdf,theta)}) thetaSample = Finv(runif(1000)) ci = credInt(thetaSample)
Calculates a lower, upper, or two-sided credible interval from the numerical posterior CDF.
credIntNum(theta, cdf, conf = 0.95, type = "twosided")
credIntNum(theta, cdf, conf = 0.95, type = "twosided")
theta |
the values over which the the posterior CDF is specified |
cdf |
the values of the CDF, |
conf |
the desired 'confidence' level |
type |
the type of interval to return, 'lower' = one sided lower bound, 'two-sided' = two - sided, or 'upper' = one sided upper bound. It is sufficient to use 'l','t' or 'u' |
This function uses linear interpolation to calculate bounds for points that may not be specified by CDF
a list containing the elements lower.bound, uppper.bound or both depending on type
## commands for calculating a numerical posterior CDF. ## In this example, the likelihood is proportional to ## \eqn{\theta^{3/2}\times \exp(-\theta/4)} and a N(6, 9) prior is used. theta = seq(from = 0.001, to = 40, by = 0.001) prior = dnorm(theta,6,3) ppnLike = theta^1.5*exp(-theta/4) ppnPost = prior*ppnLike scaleFactor = sintegral(theta, ppnPost)$int posterior = ppnPost/scaleFactor cdf = sintegral(theta, posterior)$y ci=credIntNum(theta, cdf) par(mfrow=c(2,2)) plot(prior ~ theta, type = 'l', main = 'Prior N(6, 9)') plot(ppnLike ~ theta, type = 'l', main = 'Proportional likelihood') plot(posterior ~ theta, type = 'l', main = 'Posterior') abline(v=c(unlist(ci)))
## commands for calculating a numerical posterior CDF. ## In this example, the likelihood is proportional to ## \eqn{\theta^{3/2}\times \exp(-\theta/4)} and a N(6, 9) prior is used. theta = seq(from = 0.001, to = 40, by = 0.001) prior = dnorm(theta,6,3) ppnLike = theta^1.5*exp(-theta/4) ppnPost = prior*ppnLike scaleFactor = sintegral(theta, ppnPost)$int posterior = ppnPost/scaleFactor cdf = sintegral(theta, posterior)$y ci=credIntNum(theta, cdf) par(mfrow=c(2,2)) plot(prior ~ theta, type = 'l', main = 'Prior N(6, 9)') plot(ppnLike ~ theta, type = 'l', main = 'Proportional likelihood') plot(posterior ~ theta, type = 'l', main = 'Posterior') abline(v=c(unlist(ci)))
Calculates a lower, upper, or two-sided credible interval from the numerical posterior CDF.
credIntSamp(theta, conf = 0.95, type = "twosided")
credIntSamp(theta, conf = 0.95, type = "twosided")
theta |
a sample from the posterior density |
conf |
the desired 'confidence' level |
type |
the type of interval to return, 'lower' = one sided lower bound, 'two-sided' = two - sided, or 'upper' = one sided upper bound. It is sufficient to use 'l','t' or 'u' |
This function uses linear interpolation to calculate bounds for points that may not be specified by CDF
a list containing the elements lower.bound, uppper.bound or both depending on type
## posterior is N(0,1) theta = rnorm(1000) ci=credIntSamp(theta) plot(density(theta)) abline(v=c(unlist(ci)))
## posterior is N(0,1) theta = rnorm(1000) ci=credIntSamp(theta) plot(density(theta)) abline(v=c(unlist(ci)))
This function is designed to emulate the Minitab function DESCRIBE. It gives simple descriptive statistics for a data frame
describe(x, varNames = NULL)
describe(x, varNames = NULL)
x |
A matrix or data.frame with numeric entries. Different variables are represented by columns. |
varNames |
A vector of variable names for each of the columns |
A data.frame containing the following elements:
N |
The number of observations for each variable |
mean |
The sample mean for each variable |
stdev |
The sample standard deviation |
sterr |
The standard error of the mean |
min |
The minimum |
q1 |
The lower quartile |
med |
The median |
q3 |
The upper quartile |
max |
The maximum |
data(poissonTest.df) describe(poissonTest.df)
data(poissonTest.df) describe(poissonTest.df)
Calculate the Gelman Rubin statistic
GelmanRubin(theta)
GelmanRubin(theta)
theta |
A matrix containing samples from at least two chains on a parameter theta. Each chain should 2n iterations. The last n iterations will be used to calculate the statistic |
A list containing n, the between chain variance B, the within chain
variance W, the estimated variance of the parameter vHat, and the Gelman
Rubin statistic
Gelman, A. and Rubin, D.B. (1992) 'Inference from iterative simulations using multiple sequences with discussion.' Statistical Science 8, pp. 457-511
## take four chains sampling from a normal mixture density theta0 = c(0,1) theta1 = c(3,2) p = 0.6 candidate = c(0, 3) v1 = normMixMH(theta0, theta1, p, candidate, steps = 200) v2 = normMixMH(theta0, theta1, p, candidate, steps = 200) v3 = normMixMH(theta0, theta1, p, candidate, steps = 200) v4 = normMixMH(theta0, theta1, p, candidate, steps = 200) theta=cbind(v1,v2,v3,v4) GelmanRubin(theta)
## take four chains sampling from a normal mixture density theta0 = c(0,1) theta1 = c(3,2) p = 0.6 candidate = c(0, 3) v1 = normMixMH(theta0, theta1, p, candidate, steps = 200) v2 = normMixMH(theta0, theta1, p, candidate, steps = 200) v3 = normMixMH(theta0, theta1, p, candidate, steps = 200) v4 = normMixMH(theta0, theta1, p, candidate, steps = 200) theta=cbind(v1,v2,v3,v4) GelmanRubin(theta)
fits a hierarchical normal model of the form
hierMeanReg( design, priorTau, priorPsi, priorVar, priorBeta = NULL, steps = 1000, startValue = NULL, randomSeed = NULL )
hierMeanReg( design, priorTau, priorPsi, priorVar, priorBeta = NULL, steps = 1000, startValue = NULL, randomSeed = NULL )
design |
a list with elements y = response vector, group = grouping vector, x = matrix of covariates or NULL if there are no covariates |
priorTau |
a list with elements tau0 and v0 |
priorPsi |
a list with elements psi0 and eta0 |
priorVar |
a list with elements s0 and kappa0 |
priorBeta |
a list with elements b0 and bMat or NULL if x is NULL |
steps |
the number of Gibbs sampling steps to take |
startValue |
a list with possible elements tau, psi, mu, sigmasq and beta. tau, psi and sigmasq must all be scalars. mu and beta must be vectors with as many elements as there are groups and covariates respectively |
randomSeed |
a random seed for the random number generator |
A data frame with variables:
tau |
Samples from the posterior distribution of tau |
psi |
Samples from the posterior distribution of psi |
mu |
Samples from the posterior distribution of mu |
beta |
Samples from the posterior distribution of beta if there are any covariates |
sigmaSq |
Samples from the posterior distribution of
|
sigma |
Samples from the posterior distribution of sigma |
priorTau = list(tau0 = 0, v0 = 1000) priorPsi = list(psi0 = 500, eta0 = 1) priorVar = list(s0 = 500, kappa0 = 1) priorBeta = list(b0 = c(0,0), bMat = matrix(c(1000,100,100,1000), ncol = 2)) data(hiermeanRegTest.df) data.df = hiermeanRegTest.df design = list(y = data.df$y, group = data.df$group, x = as.matrix(data.df[,3:4])) r=hierMeanReg(design, priorTau, priorPsi, priorVar, priorBeta) oldPar = par(mfrow = c(3,3)) plot(density(r$tau)) plot(density(r$psi)) plot(density(r$mu.1)) plot(density(r$mu.2)) plot(density(r$mu.3)) plot(density(r$beta.1)) plot(density(r$beta.2)) plot(density(r$sigmaSq)) par(oldPar) ## example with no covariates priorTau = list(tau0 = 0, v0 = 1000) priorPsi = list(psi0 = 500, eta0 = 1) priorVar = list(s0 = 500, kappa0 = 1) data(hiermeanRegTest.df) data.df = hiermeanRegTest.df design = list(y = data.df$y, group = data.df$group, x = NULL) r=hierMeanReg(design, priorTau, priorPsi, priorVar) oldPar = par(mfrow = c(3,2)) plot(density(r$tau)) plot(density(r$psi)) plot(density(r$mu.1)) plot(density(r$mu.2)) plot(density(r$mu.3)) plot(density(r$sigmaSq)) par(oldPar)
priorTau = list(tau0 = 0, v0 = 1000) priorPsi = list(psi0 = 500, eta0 = 1) priorVar = list(s0 = 500, kappa0 = 1) priorBeta = list(b0 = c(0,0), bMat = matrix(c(1000,100,100,1000), ncol = 2)) data(hiermeanRegTest.df) data.df = hiermeanRegTest.df design = list(y = data.df$y, group = data.df$group, x = as.matrix(data.df[,3:4])) r=hierMeanReg(design, priorTau, priorPsi, priorVar, priorBeta) oldPar = par(mfrow = c(3,3)) plot(density(r$tau)) plot(density(r$psi)) plot(density(r$mu.1)) plot(density(r$mu.2)) plot(density(r$mu.3)) plot(density(r$beta.1)) plot(density(r$beta.2)) plot(density(r$sigmaSq)) par(oldPar) ## example with no covariates priorTau = list(tau0 = 0, v0 = 1000) priorPsi = list(psi0 = 500, eta0 = 1) priorVar = list(s0 = 500, kappa0 = 1) data(hiermeanRegTest.df) data.df = hiermeanRegTest.df design = list(y = data.df$y, group = data.df$group, x = NULL) r=hierMeanReg(design, priorTau, priorPsi, priorVar) oldPar = par(mfrow = c(3,2)) plot(density(r$tau)) plot(density(r$psi)) plot(density(r$mu.1)) plot(density(r$mu.2)) plot(density(r$mu.3)) plot(density(r$sigmaSq)) par(oldPar)
Data for testing hiermeanReg which uses Gibbs sampling on a hierarchical normal mean model with regression on covariates
A data frame with 30 observations on 4 variables.
\ [1,] | y | numeric | the response vector |
[2,] | group | factor | the grouping factor levels 1-3 |
[3,] | x1 | numeric | the first covariate |
[4,] | x2 | numeric | the second covariate |
hiermeanReg
A test data set for bayesLogisticReg
A data frame with 100 observations on 6 variables.
[1,] | x | numeric | the covariate |
[2,] | eps | numeric | the error in the response |
[3,] | logit.p | numeric | the logit of the probability of success given x = 2 + 3*x + eps |
[4,] | p | numeric | the probability of success given x |
[5,] | u | numeric | a U[0,1] random variable |
[6,] | y | binary | if u[i]<p[i] = 1, otherwise 0 |
bayesLogistic
normGibbs draws a Gibbs sample from the posterior distribution of the
parameters given the data fron normal distribution with unknown mean and
variance. The prior for given
is prior mean
and prior variance
. That means
is the 'equivalent
sample size.' The prior distribution of the variance is
times an
inverse chi-squared with
degrees of freedom. The joint prior is
the product
.
normGibbs(y, steps = 1000, type = "ind", ...)
normGibbs(y, steps = 1000, type = "ind", ...)
y |
A vector containing the data |
steps |
The number of iterations of Gibbs sampling to carry out |
type |
Either 'ind' for sampling from an independent conjugate prior or 'joint' for sampling from a joint conjugate prior. 'i' and 'j' can be used as compact notation |
... |
If type = 'ind' then the user can specify the prior for
|
A data frame containing three variables
[1,] | mu | a sample from the posterior distribution of the mean |
[2,] | sig | a sample from the posterior distribution of the standard deviation |
[3,] | mu | a sample from the posterior distribution of the variance = sig^2 |
James M. Curran
## firstly generate some random data mu = rnorm(1) sigma = rgamma(1,5,1) y = rnorm(100, mu, sigma) ## A \eqn{N(10,3^2)} prior for \eqn{\mu} and a 25 times inverse chi-squared ## with one degree of freedom prior for \eqn{\sigma^2} MCMCSampleInd = normGibbs(y, steps = 5000, priorMu = c(10,3), priorVar = c(25,1)) ## We can also use a joint conjugate prior for \eqn{\mu} and \eqn{\sigma^2}. ## This will be a \emph{normal}\eqn{(m,\sigma^2/n_0)} prior for \eqn{\mu} given ## the variance \eqn{\sigma^2}, and an \eqn{s0} times an \emph{inverse ## chi-squared} prior for \eqn{\sigma^2}. MCMCSampleJoint = normGibbs(y, steps = 5000, type = 'joint', priorMu = c(10,3), priorVar = c(25,1)) ## Now plot the results oldPar = par(mfrow=c(2,2)) plot(density(MCMCSampleInd$mu),xlab=expression(mu), main = 'Independent') abline(v=mu) plot(density(MCMCSampleInd$sig),xlab=expression(sig), main = 'Independent') abline(v=sigma) plot(density(MCMCSampleJoint$mu),xlab=expression(mu), main = 'Joint') abline(v=mu) plot(density(MCMCSampleJoint$sig),xlab=expression(sig), main = 'Joint') abline(v=sigma)
## firstly generate some random data mu = rnorm(1) sigma = rgamma(1,5,1) y = rnorm(100, mu, sigma) ## A \eqn{N(10,3^2)} prior for \eqn{\mu} and a 25 times inverse chi-squared ## with one degree of freedom prior for \eqn{\sigma^2} MCMCSampleInd = normGibbs(y, steps = 5000, priorMu = c(10,3), priorVar = c(25,1)) ## We can also use a joint conjugate prior for \eqn{\mu} and \eqn{\sigma^2}. ## This will be a \emph{normal}\eqn{(m,\sigma^2/n_0)} prior for \eqn{\mu} given ## the variance \eqn{\sigma^2}, and an \eqn{s0} times an \emph{inverse ## chi-squared} prior for \eqn{\sigma^2}. MCMCSampleJoint = normGibbs(y, steps = 5000, type = 'joint', priorMu = c(10,3), priorVar = c(25,1)) ## Now plot the results oldPar = par(mfrow=c(2,2)) plot(density(MCMCSampleInd$mu),xlab=expression(mu), main = 'Independent') abline(v=mu) plot(density(MCMCSampleInd$sig),xlab=expression(sig), main = 'Independent') abline(v=sigma) plot(density(MCMCSampleJoint$mu),xlab=expression(mu), main = 'Joint') abline(v=mu) plot(density(MCMCSampleJoint$sig),xlab=expression(sig), main = 'Joint') abline(v=sigma)
normMixMH uses the Metropolis-Hastings algorithm to draw a sample from a univariate target distribution that is a mixture of two normal distributions using an independent normal candidate density or a random walk normal candidate density.
normMixMH( theta0, theta1, p, candidate, steps = 1000, type = "ind", randomSeed = NULL, startValue = NULL )
normMixMH( theta0, theta1, p, candidate, steps = 1000, type = "ind", randomSeed = NULL, startValue = NULL )
theta0 |
A vector of length two containing the mean and standard deviation of the first component of the normal mixture |
theta1 |
A vector of length two containing the mean and standard deviation of the second component of the normal mixture |
p |
A value between 0 and 1 representing the mixture proportion, so
that the true density is |
candidate |
A vector of length two containing the mean and standard deviation of the candidate density |
steps |
The number of steps to be used in the Metropolis-Hastings algorithm. steps must be greater than 100 |
type |
Either 'ind' or 'rw' depending on whether a independent candidate density or random walk candidate density is to be used. 'i' and 'r' may be used as alternative compact notation |
randomSeed |
A seed for the random number generator. Only used when you want the same sequence of random numbers in the chain |
startValue |
A starting value for the chain |
A vector containing a sample from the normal mixture distribution.
## Set up the normal mixture theta0 = c(0,1) theta1 = c(3,2) p = 0.8 ## Sample from an independent N(0,3^2) candidate density candidate = c(0, 3) MCMCsampleInd = normMixMH(theta0, theta1, p, candidate) ## If we wish to use the alternative random walk N(0, 0.5^2) ## candidate density candidate = c(0, 0.5) MCMCsampleRW = normMixMH(theta0, theta1, p, candidate, type = 'rw')
## Set up the normal mixture theta0 = c(0,1) theta1 = c(3,2) p = 0.8 ## Sample from an independent N(0,3^2) candidate density candidate = c(0, 3) MCMCsampleInd = normMixMH(theta0, theta1, p, candidate) ## If we wish to use the alternative random walk N(0, 0.5^2) ## candidate density candidate = c(0, 0.5) MCMCsampleRW = normMixMH(theta0, theta1, p, candidate, type = 'rw')
Calculates the probability of a one sided null hypothesis from a numerically calculated posterior CDF or from a sample from the posterior.
pNull(theta0, theta, cdf = NULL, type = "upper")
pNull(theta0, theta, cdf = NULL, type = "upper")
theta0 |
the hypothesized value, i.e. H0: theta <= theta0 |
theta |
a sample of values from the posterior density, or, if cdf is not NULL then the values over which the the posterior CDF is specified |
cdf |
the values of the CDF, |
type |
the type of probability to return, 'lower' = Pr(theta <= theta0) or 'upper' = Pr(theta >= theta0). It is sufficient to use 'l' or 'u' |
This function uses linear interpolation to calculate bounds for points that may not be specified by CDF
a list containing the element prob which will be the upper or lower tail probability depending on type
## commands for calculating a numerical posterior CDF. ## In this example, the likelihood is proportional to ## \eqn{\theta^{3/2}\times \exp(-\theta/4)} and a N(6, 9) prior is used. theta = seq(from = 0.001, to = 40, by = 0.001) prior = dnorm(theta,6,3) ppnLike = theta^1.5*exp(-theta/4) ppnPost = prior*ppnLike scaleFactor = sintegral(theta, ppnPost)$int posterior = ppnPost/scaleFactor cdf = sintegral(theta, posterior)$y pNull(15, theta, cdf) ## Use an inverse method to take a random sample of size 1000 ## from the posterior suppressWarnings({Finv = approxfun(cdf, theta)}) thetaSample = Finv(runif(1000)) pNull(15, thetaSample)
## commands for calculating a numerical posterior CDF. ## In this example, the likelihood is proportional to ## \eqn{\theta^{3/2}\times \exp(-\theta/4)} and a N(6, 9) prior is used. theta = seq(from = 0.001, to = 40, by = 0.001) prior = dnorm(theta,6,3) ppnLike = theta^1.5*exp(-theta/4) ppnPost = prior*ppnLike scaleFactor = sintegral(theta, ppnPost)$int posterior = ppnPost/scaleFactor cdf = sintegral(theta, posterior)$y pNull(15, theta, cdf) ## Use an inverse method to take a random sample of size 1000 ## from the posterior suppressWarnings({Finv = approxfun(cdf, theta)}) thetaSample = Finv(runif(1000)) pNull(15, thetaSample)
Calculates the probability of a one sided null hypothesis from a numerically calculated posterior CDF.
pnullNum(theta0, theta, cdf, type = "upper")
pnullNum(theta0, theta, cdf, type = "upper")
theta0 |
the hypothesized value, i.e. H0: theta <= theta0 |
theta |
the values over which the the posterior CDF is specified |
cdf |
the values of the CDF, |
type |
the type of probability to return, 'lower' = Pr(theta <= theta0) or 'upper' = Pr(theta >= theta0). It is sufficient to use 'l' or 'u' |
This function uses linear interpolation to calculate bounds for points that may not be specified by CDF
a list containing the element prob which will be the upper or lower tail probability depending on type
## commands for calculating a numerical posterior CDF. ## In this example, the likelihood is proportional to ## \eqn{\theta^{3/2}\times \exp(-\theta/4)} and a N(6, 9) prior is used. theta = seq(from = 0.001, to = 40, by = 0.001) prior = dnorm(theta,6,3) ppnLike = theta^1.5*exp(-theta/4) ppnPost = prior*ppnLike scaleFactor = sintegral(theta, ppnPost)$int posterior = ppnPost/scaleFactor cdf = sintegral(theta, posterior)$y pnullNum(1, theta, cdf)
## commands for calculating a numerical posterior CDF. ## In this example, the likelihood is proportional to ## \eqn{\theta^{3/2}\times \exp(-\theta/4)} and a N(6, 9) prior is used. theta = seq(from = 0.001, to = 40, by = 0.001) prior = dnorm(theta,6,3) ppnLike = theta^1.5*exp(-theta/4) ppnPost = prior*ppnLike scaleFactor = sintegral(theta, ppnPost)$int posterior = ppnPost/scaleFactor cdf = sintegral(theta, posterior)$y pnullNum(1, theta, cdf)
Calculates the probability of a one sided null hypothesis from a sample from a posterior density.
pnullSamp(theta, theta0 = 0, type = "upper")
pnullSamp(theta, theta0 = 0, type = "upper")
theta |
a sample of values from a posterior density |
theta0 |
the hypothesized value, i.e. H0: theta <= theta0 |
type |
the type of probability to return, 'lower' = Pr(theta <= theta0) or 'upper' = Pr(theta >= theta0). It is sufficient to use 'l' or 'u' |
This function uses linear interpolation to calculate bounds for points that may not be specified by CDF
a list containing the element prob which will be the upper or lower tail probability depending on type
## The posterior density is N(3,1) theta = rnorm(1000,3) ## test whether the true mean is greater than 0 (it is obviously!) pnullSamp(theta)
## The posterior density is N(3,1) theta = rnorm(1000,3) ## test whether the true mean is greater than 0 (it is obviously!) pnullSamp(theta)
A test data set for bayesPois. The data come from the equation
where
comes from N(0,0.01).
A data frame with 100 observations on 5 variables.
[1,] | x | numeric | the covariate |
[2,] | eps | numeric | the error in the log response |
[3,] | log.lam | numeric |
where
|
[4,] | lam | numeric | |
[5,] | y | numeric | a Poisson random variate with mean |
bayesPois
Takes a vector of values and a corresponding set of postive
values and evaluates the area under the curve:
.
sintegral(x, fx, n.pts = 256)
sintegral(x, fx, n.pts = 256)
x |
a sequence of |
fx |
the value of the function to be integrated at |
n.pts |
the number of points to be used in the integration. |
returns a list with the following elements
x |
the x-values at which the integral has been evaluated |
y |
the cummulative integral |
int |
the value of the integral over the whole range |
## integrate the normal density from -3 to 3 x=seq(-3,3,length=100) fx=dnorm(x) estimate=sintegral(x,fx)$int true.val=diff(pnorm(c(-3,3))) cat(paste('Absolute error :',round(abs(estimate-true.val),7),'\n')) cat(paste('Relative percentage error :', 100*round((abs(estimate-true.val)/true.val),6),'%\n'))
## integrate the normal density from -3 to 3 x=seq(-3,3,length=100) fx=dnorm(x) estimate=sintegral(x,fx)$int true.val=diff(pnorm(c(-3,3))) cat(paste('Absolute error :',round(abs(estimate-true.val),7),'\n')) cat(paste('Relative percentage error :', 100*round((abs(estimate-true.val)/true.val),6),'%\n'))
Thins the output from an MCMC process
thin(x, k)
thin(x, k)
x |
A vector, matrix or data.frame containing output from an MCMC sampling scheme |
k |
An integer. This function takes every kth element from x |
Note this function does not check to see if k is sensible.
A thinned vector, matrix or data frame containing every kth element of x.
## A blockwise Metropolis-Hastings chain of 1000 elements, thinned to ## 5th element ## MCMCSampleBW = bivnormMH(0.9, type = 'block') MCMCSampleBW = thin(MCMCSampleBW, 5)
## A blockwise Metropolis-Hastings chain of 1000 elements, thinned to ## 5th element ## MCMCSampleBW = bivnormMH(0.9, type = 'block') MCMCSampleBW = thin(MCMCSampleBW, 5)