Package 'jaggR'

Title: Supporting Files and Functions for the Book Bayesian Modelling with 'JAGS'
Description: All the data and functions used to produce the book. We do not expect most people to use the package for any other reason than to get simple access to the 'JAGS' model files, the data, and perhaps run some of the simple examples. The authors of the book are David Lucy (now sadly deceased) and James Curran. It is anticipated that a manuscript will be provided to Taylor and Francis around August 2020, with bibliographic details to follow at that point. Until such time, further information can be obtained by emailing James Curran.
Authors: James Curran [aut, cre], David Lucy [aut]
Maintainer: James Curran <[email protected]>
License: GPL (>= 2)
Version: 0.1.1
Built: 2024-11-25 02:51:33 UTC
Source: https://github.com/jmcurran/jaggr

Help Index


Age estimation from aspartic acid concentration

Description

Aspartic acid data for modern upper and lower first pre-molars, taken from Gillard et al. (1991).

Usage

acid.df

Format

A data.frame with 37 rows and 3 columns:

age

Age in years.

period

Period of tooth, modern or Victorian.

aspartic

Percentage of D-aspartic acid.

Source

Gillard, R.D., Hardman, S.M., Pollard, A.M., Sutton, P.A. and Whittaker, D.K. (1991) 'Determinations of age at death in the archaeological populations using the D/L ratio of aspartic acid in dental collagen' in Archaeometry 90, eds. Pernicka, E. and Wagner, G.A., p.637-644, Birkhauser Verlag, Berlin.


Energy requirements for different activities

Description

An experiment was conducted to compare the energy requirements of three physical activities: running, walking and bicycle riding. Eight subjects were asked to run, walk and bicycle a measured distance, and the number of kilocalories expended per kilometre was measured for each subject during each activity. The activities were run in random order with time for recovery between activities. Each activity was monitored exactly once for each individual.

Usage

activity.df

Format

A data.frame with 24 rows and 3 columns:

subject

a subject ID.

activity

running, walking, riding.

energy

energy expended during activity, in kilocalories (Cal)

Source

Milton, J. S. (1992). Statistical Methods in the Biological and Health Sciences 2nd Edition, McGraw-Hill, New York, p. 316–319.


Books Data

Description

This data consists of 50 sentence lengths from each of 8 books. The books “Disclosure” and “Rising Sun” were written by Michael Crichton, whilst the others, “Four Past Midnight”, “The Dark Half”, “Eye of the Dragon”, “The Shining”, “The Stand” and “The Tommy-Knockers”, were written by Stephen King. The pages and sentences were chosen using a multistage design where the pages were selected at random, and then sentences within each page were selected at random. These data were collected by James Curran.

Usage

books.df

Format

The data frame consists of 400 observations on 2 variables.

length

sentence length

book

a factor with levels: 4.Past.Mid, Dark.Half, Disclosure, Eye.Drag, Rising.Sun, Shining, Stand, T.Knock.

author

a factor with levels: MC, SK.


Time taken to sort random vectors of various lengths using bubble sort.

Description

Students learning to programme are often taught the bubble sort algorithm for several reasons. Firstly, sorting is a commonly used operation in programming, so having a way of sorting vectors into order is useful. Secondly, it lets the instructor talk about the order of the algorithm, and how it is very inefficient. In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows. The bubble sort algorithm is known to be O(n^2). That is, the time taken to run the algorithm increases quadratically (with the square) with the size of the vector.

Usage

bsort.df

Format

A data.frame with 200 rows and 2 columns:

n

Size of the random vector.

time

Time in seconds taken to sort the vector using bubbleSort.

Details

This data set consists of 200 observations generated using the following code:

set.seed(123)
N = 200
bsort.df = data.frame(n = rep(0, N), time = rep(0, N))

n = sample(100:1000, size = N, replace = TRUE)

pb = txtProgressBar(0, N, style = 3)

for(i in 1:N){
  x = rnorm(n[i])
  bsort.df$n[i] = n[i]
  bsort.df$time[i] = system.time(bubbleSort(x))[1]
  setTxtProgressBar(pb, i)
}
close(pb)

It consists of the times taken to sort 200 vectors of random length between 100 and 1,000. The vectors themselves are random samples of size n[i] from the standard normal distribution.
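The quadratic growth described above can be illustrated without timing anything. The sketch below assumes the simplest form of bubble sort, with no early exit, which makes n(n - 1)/2 comparisons on a vector of length n. The variable names are illustrative and are not part of the package.

```r
# Comparisons made by a plain bubble sort (no early exit) on a vector of length n
n <- c(100, 200, 400, 800)
comparisons <- n * (n - 1) / 2

# Each doubling of n roughly quadruples the number of comparisons
ratios <- comparisons[-1] / comparisons[-length(comparisons)]

# A quadratic regression on these counts recovers the coefficient of n^2
fit <- lm(comparisons ~ n + I(n^2))
quad_coef <- unname(coef(fit)["I(n^2)"])
```

Fitting the same kind of quadratic model to the recorded times in bsort.df would be a natural next step, although real timings are noisy and will not follow the curve exactly.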

See Also

bubbleSort


Bubble sort

Description

Sorts the vector x into ascending order using a very inefficient bubble sort algorithm

Usage

bubbleSort(x)

Arguments

x

a vector of numbers

Value

the vector x sorted into ascending order

Examples

set.seed(123)
x = rnorm(10)
bubbleSort(x)
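
For readers who want to see what such an algorithm looks like, here is a minimal sketch of a bubble sort in R. It is a hypothetical re-implementation named bubble_sort_sketch, written for illustration; the package's own bubbleSort may differ in detail (for example, it may not stop early when a pass makes no swaps).

```r
# Hypothetical re-implementation for illustration; not the package's bubbleSort()
bubble_sort_sketch <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  for (pass in seq_len(n - 1)) {
    swapped <- FALSE
    # After each pass the largest remaining value has "bubbled" to the end
    for (j in seq_len(n - pass)) {
      if (x[j] > x[j + 1]) {
        tmp <- x[j]
        x[j] <- x[j + 1]
        x[j + 1] <- tmp
        swapped <- TRUE
      }
    }
    # Assumed optimisation: stop once a full pass makes no swaps
    if (!swapped) break
  }
  x
}

set.seed(123)
x <- rnorm(10)
sorted <- bubble_sort_sketch(x)
```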

Calculus marks

Description

Calculus marks from the 2012 first year calculus course from the Department of Mathematics and Statistics at Lancaster University.

Usage

calculus.df

Format

A data.frame with 147 rows and two columns:

coursework

final coursework mark out of 100.

examination

final examination mark out of 100.

Source

George Moran, Department of Mathematics and Statistics at Lancaster.


Car listings from trademe

Description

This data set consists of 3,618 listings scraped from the New Zealand website trademe. trademe is similar to ebay in that it is an online auction site which allows sellers to list new and used goods for sale. Goods may be purchased via auction, or outright if the seller has enabled that option. Many New Zealanders, including commercial car dealers, use the website to buy and sell cars. The listings gathered are mostly for Mazda 3 and Toyota Corolla vehicles, along with imported vehicles which may be the same car, but with different badging.

Usage

car.prices.df

Format

An object of class data.frame with 3618 rows and 13 columns.

Details

A data.frame with 3,618 rows and the following columns:

obs

The observation number, from 1 to 3618.

title

The listing title - basically the make and model of the car.

year

The year of manufacture of the vehicle.

age

The age of the vehicle as of 2013 (when this data was collected). So a car manufactured in 2009 would have an age of 4, for example.

price

The asking price, in NZD.

km

The number of kilometres on the odometer—i.e. the "mileage."

cc

The displacement of the engine in cubic centimetres.

fuel

The fuel used by the vehicle: either Petrol (gasoline) or Diesel.

doors

The number of doors in the car. Note 3 and 5 door cars are hatchbacks.

list.color

The colour of the car given in the listing.

simple.color

An attempt to standardise the colour to a reduced category. For example sky blue, and light blue would both get transformed to blue.

make

The manufacturer of the car: either Mazda or Toyota


Carbon isotopes in trees

Description

These observations were made by Robertson et al. They are the mean delta 13 C compositions of several individual trees from two locations in Central England. Mean temperatures from the Central England Temperature (CET) record are also given.

Usage

carbon.df

Format

A data.frame with 200 rows and 4 columns:

year
iso
temp

Cell survival data

Description

The data comes from an experiment to measure the mortality of cancer cells under radiation, undertaken in the Department of Radiology, University of Cape Town. Four hundred cells were placed on a dish, and three dishes were irradiated at a time, or occasion. After the cells were irradiated, the surviving cells were counted. Since cells would also die naturally, dishes with cells were put into the radiation chamber without being irradiated, to establish the natural mortality. These data give only the zero-dose data. The data are from OzDASL.

Usage

cell_surv.df

Format

An object of class data.frame with 27 rows and 2 columns.


Energy and fat in chocolate bars

Description

The amount of fat (g) and energy (Cal) in 16 chocolate bars. Source is unknown, but we would be happy to give credit if someone tells us.

Usage

chocolate.df

Format

A data.frame with 16 rows and 2 columns:

energy

energy, in Calories = kilocalories

fat

fat content, in grams

Source

Source is unknown, but we would be happy to give credit if someone tells us.


Does insulation make a difference?

Description

This data arose from an experiment conducted by David to test the insulation of the ground floor bedroom of his house, The Spinney. The idea was that the better the insulation, the slower the rate of cooling, so for an exponential model y(t) = y(0) exp(-lambda t), the value of lambda should be lower for a better insulated room.

In the experiment, David ran two extension cords into the room through a service port to power two electric heaters and a fan. He then sealed up the room by shutting the windows and door. The heaters were left to heat up the room as much as they could, which happened to be about 24.6 C. He then turned the heaters and fan off and recorded the rate of cooling by observing a temperature probe from outside the room for about two hours.

Standard theory says that the rate of cooling is proportional to the temperature differential between the indoor and outdoor temperatures. To control for this, days were selected which had approximately the same external temperatures. The room has walls which are external and internal. It was assumed that the outside and the internal house (no heating) had reached an equilibrium, so that only the temperature outside the room but inside the house needs to be known, rather than both.

Usage

cooling.df

Format

A data.frame with 47 rows and 3 columns:

time

The time since turning off the heaters and fan

uninsulated

The recorded temperature with absolutely no insulation in the room whatsoever—outside temperature 8.0 C.

insulated

The recorded temperature with part of a wall and the floor insulated— outside temperature 8.1 C

Source

David Lucy
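
The exponential model in the description can be made concrete with a short simulation. The sketch below is not the analysis from the book: it assumes the model applies to the excess of room temperature over the outside temperature, takes the starting temperature (24.6 C) and outside temperature (8.0 C) from the text, and uses an invented decay rate and time grid (in minutes) for illustration.

```r
# Assumed values: lambda_true and the time grid are invented for illustration
ambient <- 8.0        # outside temperature quoted for the uninsulated run
y0 <- 24.6            # approximate starting temperature from the description
lambda_true <- 0.01   # hypothetical cooling rate per minute

t <- seq(0, 120, by = 5)
temp <- ambient + (y0 - ambient) * exp(-lambda_true * t)

# On the log scale the excess temperature is linear in time with slope -lambda
fit <- lm(log(temp - ambient) ~ t)
lambda_hat <- -unname(coef(fit)["t"])
```

Under this reading, a better insulated room would give a smaller estimated lambda. With real data the fit would include measurement noise, and in a Bayesian treatment lambda would be given a prior rather than estimated by least squares.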


Extract sampled parameter values from an mcmc.list

Description

This function makes it easy to extract sampled values of one or more parameters. The function can extract multiple parameters from multiple chains

Usage

extractValues(x, params, chain = NULL, drop = TRUE, ...)

Arguments

x

an object of class mcmc.list - usually from coda.samples

params

a vector of one or more strings OR regular expressions which identifies the parameters we want to extract from the chain

chain

the chain, or chains we want to extract the parameters from. If chain is NULL then the values will be extracted from all chains.

drop

used to preserve the dimensions of an array. If a single parameter is requested, then the results will be returned as a vector rather than a matrix if drop == TRUE.

...

any other arguments. Not used yet.

Value

If there is only one chain or the user asks for results from exactly one chain, then a matrix with class mcmc will be returned containing only the parameters of interest in the columns. The column names of the matrix will correspond to the parameter. If there is more than one chain, and the user asks for results from more than one chain, or alternatively leaves chain as NULL, then a list of matrices with class mcmc will be returned where each matrix contains only the parameters of interest in the columns. The column names of each of the matrices will correspond to the parameter.
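
The selection logic described above can be sketched in base R. The example below does not use coda or the package's own extractValues: plain matrices stand in for mcmc objects, and extract_sketch is a hypothetical helper that only mimics the documented behaviour of params, chain and drop.

```r
# Two toy "chains": plain matrices standing in for mcmc objects (assumption)
set.seed(1)
par_names <- c("alpha", "beta[1]", "beta[2]", "sigma")
chains <- lapply(1:2, function(i) {
  matrix(rnorm(20), ncol = 4, dimnames = list(NULL, par_names))
})

# Hypothetical helper: keep columns whose names match any pattern in params
extract_sketch <- function(chains, params, chain = NULL, drop = TRUE) {
  if (is.null(chain)) chain <- seq_along(chains)
  pick <- function(m) {
    keep <- Reduce(`|`, lapply(params, grepl, x = colnames(m)))
    m[, keep, drop = drop]
  }
  out <- lapply(chains[chain], pick)
  # A single requested chain is returned directly rather than as a list
  if (length(out) == 1) out[[1]] else out
}

betas <- extract_sketch(chains, "^beta")                 # all chains, regular expression
alpha1 <- extract_sketch(chains, "^alpha$", chain = 1)   # one chain, one parameter
```

Here betas is a list with one matrix per chain, while alpha1 collapses to a plain vector because a single parameter was requested with drop left at TRUE.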


Height, weight and fingerprint measurements collected from 200 participants

Description

This dataset contains the height, weight and 4 fingerprint measurements (length, width, area and circumference), collected from 200 participants. This data was collected with the intention of performing regression analysis to assess whether a significant relationship exists between fingerprint size and physical stature.

Usage

fingerprints.df

Format

a data.frame with 200 rows and 11 columns:

number

participant number

gender

self-declared gender of participant female or male

age

age in years

hand

dominant hand left or right

height

height in centimetres, average of three measurements

weight

weight in kilograms, average of three measurements

temp

fingerprint temperature in degrees Celsius

fpheight

fingerprint height in millimetres

width

fingerprint width in millimetres

area

fingerprint area in squared millimetres

circumference

fingerprint circumference in millimetres

Source

McMurchie, Beth; Torrens, George; Kelly, Paul (2019). Height, weight and fingerprint measurements collected from 200 participants. Loughborough University. Dataset. https://doi.org/10.17028/rd.lboro.7539206.v1


Get a JAGS model file

Description

This function provides an easy way for readers to get the JAGS model files used in the book. The modelID is the 4-5 character identifier used in the book. For example to get 'model-001.bugs.R', you would use getModel("001").

Usage

getModel(modelID)

Arguments

modelID

a string containing a valid model ID

Value

a string containing the model. The intention is that this can be written to disk.

Examples

getModel("001")
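
Since the returned string is intended to be written to disk, a minimal sketch of that step follows. The model text here is a placeholder written for this example, not a model from the book; in practice it would be replaced by the result of a getModel call.

```r
# Placeholder model text (an assumption); stands in for the string from getModel()
model_string <- paste(
  "model {",
  "  for (i in 1:N) {",
  "    y[i] ~ dnorm(mu, tau)",
  "  }",
  "  mu ~ dnorm(0, 1.0E-6)",
  "  tau ~ dgamma(0.001, 0.001)",
  "}",
  sep = "\n"
)

# Write the model to a temporary file that a JAGS front end could then read
model_path <- tempfile(fileext = ".bugs.R")
writeLines(model_string, model_path)
```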

Age estimation based on changes in dental characteristics

Description

Age estimation based on changes in dental characteristics

Usage

gustafson.df

Format

a data.frame with 759 rows and 10 columns:

sex

sex of subject, female or male.

age

age, in years.

quadrant

location in mouth of tooth

tooth

tooth identifier

attrition
recession
dentine

qualitative assessment of remaining dentine


Hedgehog growth

Description

Hedgehog growth

Usage

hedgehog.growth.df

Format

a data.frame with 77 rows and 2 columns:

date

Date in DD-Month-YYYY format

weight

weight of the hedgehog, in grams

Source

David Lucy


Hedgehog survival

Description

The Bunnell Index (or BI) is a measurement of how tightly a hedgehog is curled into a ball. One measurement is taken round the middle of the animal to cross at the point where the nose ends ("A," latitudinal circumference). The other measurement, using a second tape measure already secured underneath the animal, is taken round the hedgehog from head to tail ("B," longitudinal circumference). Care must be taken with both measurements to ensure that the ends of the tape measure meet easily without altering the shape/positioning of the hedgehog. When obtaining measurement A, the positioning of the tape measure is crucial; a measurement taken lower down toward the tail can result in a lower (inaccurate) reading. Repeatedly measuring many hedgehogs over several consecutive days demonstrated consistent BI values and hence the reliability of the method. A is divided by B to give a value for the BI. It is important to determine the BI value to two decimal places (i.e., a value of 0.794 becomes 0.79, while a value of 0.805 becomes 0.81).

Usage

hedgehog.survival.df

Format

A data.frame with 31 observations and 2 columns:

BI

The Bunnell Index (BI) of the hedgehog at the time of admission.

survived

A logical variable recording whether the hedgehog survived or died.

Source

Bunnell, T. (2002) The Assessment of British Hedgehog (Erinaceus europaeus) Casualties on Arrival and Determination of Optimum Release Weights Using a New Index. Journal of Wildlife Rehabilitation 25 (4):11-21
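
The calculation of the index described above is a single division followed by rounding. The sketch below uses a hypothetical helper, bunnell_index, and invented circumferences; neither comes from the package or from the original paper.

```r
# Hypothetical helper: A and B are the two circumferences in the same units
bunnell_index <- function(A, B) round(A / B, 2)

# Invented measurements for illustration: 27 / 34 = 0.7941...
bi <- bunnell_index(A = 27, B = 34)
```

Values that sit exactly on a rounding boundary in decimal notation may not be represented exactly in floating point, so rounding of such values should be checked rather than assumed.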


Impact strength of insulation cuts in foot-pounds.

Description

Impact strength of insulation cuts in foot-pounds.

Usage

insulation.df

Format

a data.frame with 100 rows and 3 columns:

Lot

Lot of insulating material

Cut

Lengthwise (Length) or crosswise (Cross)

Strength

Impact strength, in foot-pounds (ft-lb)

Source

Ostle, B. (1963). Statistics in Research: Basic Concepts and Techniques for Research. Ames, Iowa. Iowa State University Press.


jaggR: Supporting files and functions for the book Bayesian Modelling with JAGS

Description

A set of functions used in teaching STATS 201/208 Data Analysis at the University of Auckland. The functions are designed to make parts of R more accessible to a large undergraduate population who are mostly not statistics majors.

Author(s)

James Curran, David Lucy


Michelson's speed of light data

Description

Michelson's speed of light data

Usage

lightspeed.df

Format

a data.frame with 43 rows and 2 columns:

speed

The scaled speed of light measured in a single experiment. The scaling is the measurement minus 299,000 km/s. E.g. the first entry in the data.frame is 850, which is 299,850 km/s.

year

The year in which the experiment was conducted, either 1879 or 1882.

Source

Stigler, S. M. (1977), "Do robust estimators work with real data?", The Annals of Statistics 5:1055-1098.
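
Recovering the measured speed from the scaled value is a matter of adding back the offset. The sketch below uses only the first entry quoted in the description of speed; the variable names are illustrative.

```r
# First scaled entry, as quoted in the description of the speed column
scaled <- 850

# Undo the scaling: add back the 299,000 km/s offset
speed_kms <- scaled + 299000
```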


Mortality rates for different species

Description

Ecologists Michael McCoy and James Gillooly were interested in predicting mortality rates for different species based on a number of variables including body mass and temperature. In their paper (McCoy and Gillooly, 2008) they explore the hypothesis that the natural logarithm of temperature‐corrected mortality rate should be a linear function of the natural logarithm of body mass. The temperature-corrected mortality rate is based upon previous work which draws on results from biology, biochemistry, and thermodynamics. Users are encouraged to read the original source for a deeper explanation.

Usage

mortality.df

Format

a data.frame with 2117 rows and 4 columns:

group

a factor indicating which one of the six taxonomic groups the observation belongs to: bird, fish, invertebrate, mammal, multicellular plant, and phytoplankton.

species

the species of the observation.

mass

the body mass in grams (g).

mortality

the mortality rate.

temp

the average body temperature in degrees Celsius.

E

average activation energy of heterotrophic respiration in animals (0.65 eV) or photosynthesis in plants (0.32 eV).

mort.corrected

mortality corrected by a Boltzmann-Arrhenius factor, specifically, divided by exp(-E/k * (1 / T - 1 / T20)), where k is the Boltzmann constant, 8.62 x 10^-5 eV/K, T20 is 20 degrees Celsius expressed in kelvins, i.e. 293, and T is the average body temperature temp in kelvins.

Source

McCoy, M.W. and Gillooly, J.F. (2008), Predicting natural mortality rates of plants and animals. Ecology Letters, 11: 710-716. https://doi.org/10.1111/j.1461-0248.2008.01190.x
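
The temperature correction described for mort.corrected can be written as a small function. This is a sketch under stated assumptions: correct_mortality is a hypothetical helper, and temperatures are converted to kelvins by adding 273 so that 20 degrees Celsius maps to the value of 293 given above.

```r
k_B <- 8.62e-5   # Boltzmann constant in eV per kelvin, as given above
T20 <- 293       # 20 degrees Celsius in kelvins, as given above

# Hypothetical helper: temp in degrees Celsius, E in eV
correct_mortality <- function(mortality, temp, E) {
  T_kelvin <- temp + 273   # assumed conversion, chosen to match T20 = 293
  mortality / exp(-E / k_B * (1 / T_kelvin - 1 / T20))
}

# At exactly 20 degrees the correction factor is 1
at_reference <- correct_mortality(mortality = 0.5, temp = 20, E = 0.65)

# A warmer organism has its mortality scaled down to the 20 degree reference
warmer <- correct_mortality(mortality = 0.5, temp = 30, E = 0.65)
```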


Distance travelled by paper planes

Description

A group from Queensland University of Technology conducted an experiment where they recorded the distance flown by paper aeroplanes. The experimenters used a sealed corridor at the University, and controlled the design of the aeroplane, the weight of the paper from which each aeroplane was constructed, and the angle of incidence at launch for each paper plane. The data and further notes for this experiment can be found at OzDASL - Australasian Data and Story Library.

Usage

planes.df

Format

A data.frame with 16 rows and 6 columns:

distance

Distance travelled in mm.

paper

Paper weight in grams per square metre (gsm), either 50 gsm or 80 gsm.

angle

Angle of launch, horizontal (0 degrees) or 45 degrees.

design

Design of the plane, either simple or advanced.

treat

The treatment number used in the experiment. There are eight combinations of the levels of the factors, so the treatment number corresponds to one of these unique combinations.

rep

Replicate number within treatment. Each treatment is repeated twice so rep is either 1 or 2.

Source

Mackisack, M. S. (1994). What is the use of experiments conducted by statistics students? Journal of Statistics Education, 2(1).

References

Smyth, G. K. (2011). Australasian Data and Story Library (OzDASL).
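
The layout of the experiment, three two-level factors with two replicates of each combination, can be reproduced with expand.grid. This is a sketch of the design only: the treatment numbering below is arbitrary and need not match the treat column in planes.df, and no distances are generated.

```r
# Three two-level factors crossed with two replicates gives 2 * 2 * 2 * 2 = 16 runs
layout <- expand.grid(
  rep = 1:2,
  design = c("simple", "advanced"),
  angle = c(0, 45),
  paper = c(50, 80),
  stringsAsFactors = FALSE
)

# Number the eight unique factor combinations (numbering is an assumption)
layout$treat <- as.integer(
  interaction(layout$paper, layout$angle, layout$design, drop = TRUE)
)
```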


S3 print method for objects of type summary.mcmc

Description

This function overrides the hidden method in the coda package that provides a print method for the output of the coda{summary} function. The idea is to be able to suppress some of the output so that only the summary statistics of interest are shown. This is primarily used in the preparation of the book.

Usage

## S3 method for class 'summary.mcmc'
print(
  x,
  digits = max(3, .Options$digits - 3),
  runDetails = FALSE,
  means = FALSE,
  quantiles = TRUE,
  ...
)

Arguments

x

an object of type summary.mcmc.

digits

The number of digits to print.

runDetails

if TRUE print the details of the sampling.

means

if TRUE print the posterior means.

quantiles

if TRUE print the posterior quantiles.

...

other arguments passed to print.

Value

x is invisibly returned


Radiation exposure and cancer mortality in Oregon counties

Description

... from the Commission facility in Hanford, Washington. One of the major safety problems encountered there has been the storage of radioactive wastes. Over the years, significant quantities of these substances - including strontium 90 and cesium 137 - have leaked from their open-pit storage areas into the nearby Columbia River, which flows along the Washington-Oregon border, and eventually empties into the Pacific Ocean.

To measure the health consequences of this contamination, an index of exposure was calculated for each of the nine Oregon counties having frontage on either the Columbia River or the Pacific Ocean. This particular index was based on several factors, including the county's stream distance from Hanford and the average distance of its population from any water frontage. As a covariate, the cancer mortality rate was determined for each of these same counties. The data give the index of exposure and the cancer mortality rate during 1959-1964 for the nine Oregon counties affected. Higher index values represent higher levels of contamination.

Usage

radiation.df

Format

An object of class data.frame with 9 rows and 3 columns.

Source

Fadeley, R. C. (1965). Oregon malignancy pattern physiographically related to Hanford, Washington, Radioisotope Storage. Journal of Environmental Health 27, 883-897.


Times taken for a rat to navigate through a maze

Description

Times taken for a rat to navigate through a maze

Usage

ratmaze.df

Format

A data.frame with 135 rows and 4 columns:

subject

An ID for each rat

treatment

The treatment administered to the subject: control/none, thiouracil, thyroxin.

test

A maze number.

time

time, in seconds taken for the rat to navigate the maze.


Age estimation by root dentine translucency

Description

Root dentine translucency is, in humans, an age related physiological feature. In the dentine of teeth in adult humans the tubular microstructures fill with a highly crystalline substance, making them become nearly invisible when looked at in normal light. This process starts from the apical foramen in early adulthood, and progresses up the tooth into advanced old age. Solheim (Lucy et al., 1996) collected data on age and root dentine translucency for 71 maxillary second incisors from a Norwegian population. The sex of each individual was also noted.

Usage

rdt.df

Format

A data.frame with 71 rows and 3 columns:

age

Age of subject, in years

sex

Sex of subject, female or male

rdt

root dentine translucency

Source

Lucy, D., Aykroyd, R.G., Pollard, A.M. and Solheim, T. (1996), "A Bayesian approach to adult human age estimation from dental observations by Johanson's age changes", Journal of Forensic Sciences 41(2):189-194.


Reorder the columns of mcmc objects in an mcmc.list

Description

Reorders the output from rjags{coda.samples} to match the preferred order of the user. The function will stop if one or more of the specified variable names does not match the variable names in the first mcmc object of x.

Usage

## S3 method for class 'mcmc.list'
reorder(x, variable.names, ...)

Arguments

x

an object of type mcmc.list

variable.names

a vector of variable names in user order.

...

other arguments. Currently ignored.

Value

an object of type mcmc.list
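
The reordering and the name check described above can be sketched in base R. Plain matrices stand in for mcmc objects here, and reorder_sketch is a hypothetical helper; whether the package's method also keeps columns that are not listed is not stated in this documentation, so the sketch simply returns the requested columns.

```r
# Hypothetical helper mimicking the documented behaviour on a list of matrices
reorder_sketch <- function(chains, variable.names) {
  # Names are checked against the first element only, as in the description
  unknown <- setdiff(variable.names, colnames(chains[[1]]))
  if (length(unknown) > 0) {
    stop("Unknown variable name(s): ", paste(unknown, collapse = ", "))
  }
  lapply(chains, function(m) m[, variable.names, drop = FALSE])
}

# Toy chains whose columns are not in the preferred order
col_names <- c("sigma", "alpha", "beta")
chains <- list(
  matrix(1:6, ncol = 3, dimnames = list(NULL, col_names)),
  matrix(7:12, ncol = 3, dimnames = list(NULL, col_names))
)

reordered <- reorder_sketch(chains, c("alpha", "beta", "sigma"))
```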


Set Plotting Preferences

Description

Set Plotting Preferences

Usage

setPlotPrefs(
  mar = c(3, 4, 1, 1),
  cex = 1,
  oma = c(0, 0, 0, 0),
  tcl = -0.35,
  mgp = c(1.5, 0.5, 0),
  las = 1,
  cex.lab = 1,
  font.lab = 1,
  lwd = 1,
  on.graph.line = 3,
  shading.density = 8,
  arrow.length = 0.1,
  on.graph.cex = 1,
  margin.cex = 1.2,
  ...
)

Arguments

mar

plot margins

cex

character expansion factor

oma

outer margins

tcl

tick length

mgp

margin lines for the axis title, axis labels and axis line (see par)

las

text rotation on axes

cex.lab

plot labels cex

font.lab

font of plot labels

lwd

line width

on.graph.line

no idea

shading.density

shading density

arrow.length

arrow head length

on.graph.cex

character expansion for text on graphs

margin.cex

character expansion for text for margins

...

other arguments to be passed to par

Value

the previous par settings so that they can be restored
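
Returning the previous settings follows the usual pattern for par, which can be sketched directly in base R. The example below calls par rather than setPlotPrefs, reuses a few of the default values from the Usage section, and opens a null PDF device so that no file is written; it is an illustration of the save-and-restore idea, not of the function's internals.

```r
# Open a device that writes no file, so par() has something to act on
pdf(NULL)

# par() returns the old values of the settings it changes
old <- par(mar = c(3, 4, 1, 1), mgp = c(1.5, 0.5, 0), tcl = -0.35, las = 1)
during <- par("mar")

# Passing the saved list back to par() restores the previous settings
par(old)
restored <- par("mar")

invisible(dev.off())
```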


Shotgun range data

Description

In order to test the validity of range-of-fire estimates obtained by the application of regression analysis to shotgun pellet patterns, a blind study was conducted in which questioned pellet patterns were fired at randomly selected ranges between 3.0 and 15.2 m (10 and 50 ft) with two different 12-gauge shotguns, each firing a different type of buckshot cartridge. Test firings at known ranges were also conducted with the same weapons and ammunition.

Usage

shotgun.df

Format

A data frame with 70 observations on 4 variables.

range

The range in feet of the firing.

gun

The model of shotgun used in the experiment.

expt

A factor recording whether the data was to be used for building/training the model, or testing it.

area

The area of the smallest rectangle that would enclose the pellet pattern.

Source

Rowe, W.F. and Hanson, S.R. (1985) Range-of-fire estimates from regression analysis applied to the spreads of shotgun pellet patterns: Results of a blind study, Forensic Science International, 28(3-4): 239-250.


Simulated weights of different breeds of terriers

Description

Simulated samples of weights from English terrier breeds with the parameter values for the means for the simulation taken from http://www.dogsindepth.com. The variances are assumed to be constant.

Usage

terriers.df

Format

A data.frame with 30 rows and 2 columns.

weight

Weight of dog in kg.

breed

Breed, either Skye, Manchester or Norwich.
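
A data set of this shape can be simulated in a few lines. The sketch below is not the code used to create terriers.df: the breed means and the common standard deviation are placeholders invented for illustration, and an equal split of ten dogs per breed is assumed from the 30 rows and three breeds.

```r
set.seed(1)

# Placeholder values, NOT the means used for terriers.df
breed_means <- c(Skye = 10, Manchester = 8, Norwich = 5)  # kg, hypothetical
common_sd <- 1                                            # kg, hypothetical
n_per_breed <- 10                                         # assumed equal group sizes

# Normal weights with a breed-specific mean and a constant variance
sim <- data.frame(
  weight = rnorm(
    n_per_breed * length(breed_means),
    mean = rep(breed_means, each = n_per_breed),
    sd = common_sd
  ),
  breed = factor(
    rep(names(breed_means), each = n_per_breed),
    levels = names(breed_means)
  )
)
```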


Tidy BUGS files

Description

This function cleans up the formatting of BUGS files.

Usage

tidy_bugs(
  path = ".",
  arrow = TRUE,
  brace.newline = FALSE,
  indent = 2,
  wrap = TRUE,
  width.cutoff = 50
)

Arguments

path

location of file(s)

arrow

use the <- operator if TRUE, = otherwise

brace.newline

move braces to a new line if TRUE

indent

number of spaces to indent code blocks

wrap

whether to wrap comments to the linewidth determined by width.cutoff

width.cutoff

passed to deparse: integer in [20, 500] determining the cutoff at which line-breaking is tried