Title: | Fit Zeta Distributions to Forensic Data |
---|---|
Description: | Fits Zeta distributions (discrete power laws) to data that arises from forensic surveys of clothing on the presence of glass and paint in various populations. The general method is described to some extent in Coulson, S.A., Buckleton, J.S., Gummer, A.B., and Triggs, C.M. (2001) <doi:10.1016/S1355-0306(01)71847-3>, although the implementation differs. |
Authors: | James Curran [aut, cre] |
Maintainer: | James Curran <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.2 |
Built: | 2024-10-31 04:05:44 UTC |
Source: | https://github.com/jmcurran/fitps |
psData
Tests to see if two objects of class psData
are equal. That is
their type
is the same, and the data contained in data
is the
same. See readData
for a description of the psData
class.
## S3 method for class 'psData' lhs == rhs
## S3 method for class 'psData' lhs == rhs
lhs |
an object of class |
rhs |
an object of class |
NOTE: the notes
member variable is ignored in this function
as it is unlikely that a user would want to see if the notes are the same.
TRUE if the two objects are equal
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) p1 = makePSData(n = 0:2, count = c(98, 1, 1), type = "P") p2 = makePSData(n = 0:2, count = c(97, 2, 1), type = "P") p == p1 ## TRUE p == p2 ## FALSE p1 == p2 ## FALSE
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) p1 = makePSData(n = 0:2, count = c(98, 1, 1), type = "P") p2 = makePSData(n = 0:2, count = c(97, 2, 1), type = "P") p == p1 ## TRUE p == p2 ## FALSE p1 == p2 ## FALSE
Add one or more new observations to an existing clothing survey object.
add(x, newData)
add(x, newData)
x |
an object of class |
newData |
either a |
an object of class pSData
add(Ssurveys$lau, c(11, 1))
add(Ssurveys$lau, c(11, 1))
psData
to a data.frame
Converts an object of class psData
—see readData
—to a
data.frame
that can be used with in functions in other packages such as
vglm
to fit more complicated models.
## S3 method for class 'psData' as.data.frame(x, ...)
## S3 method for class 'psData' as.data.frame(x, ...)
x |
an object of class |
... |
any other arguments passed to |
If x
is a psData
object of type "P"
, i.e. it
relates to numbers of groups of glass, then a data.frame
with a single variable
count
will be return where count = rep(x$data$n + 1,
x$data$rn)
. The counts have one added to them because the zeta
distribution requires that the counts are greater than or equal to one. If
x
is a psData
object of type "P"
, i.e. it relates to
group sizes, then a data.frame
with a single variable count
will be return where count = rep(x$data$n, x$data$rn)
.
a data.frame
with a single variable count
. The number
of rows in the data.frame
is equal to sum(x$data$rn)
.
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) p.df = as.data.frame(p) table(p.df$count) p$data
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) p.df = as.data.frame(p) table(p.df$count) p$data
Use boostrapping to generate confidence intervals, or confidence regions in the case of the zero-inflated model.
bootCI(x, ...) ## Default S3 method: bootCI( x, level = 0.95, B = 2000, model = c("zeta", "ziz"), returnBootValues = FALSE, silent = FALSE, plot = FALSE, parallel = TRUE, progressBar = FALSE, pbopts = list(type = "txt"), ... ) ## S3 method for class 'psData' bootCI(x, ...) ## S3 method for class 'psFit' bootCI(x, ...)
bootCI(x, ...) ## Default S3 method: bootCI( x, level = 0.95, B = 2000, model = c("zeta", "ziz"), returnBootValues = FALSE, silent = FALSE, plot = FALSE, parallel = TRUE, progressBar = FALSE, pbopts = list(type = "txt"), ... ) ## S3 method for class 'psData' bootCI(x, ...) ## S3 method for class 'psFit' bootCI(x, ...)
x |
a object either of class |
... |
other arguments. |
level |
the confidence level required—restricted to [0.75, 1). This may be a vector, in which case multiple intervals, or confidence regions will be returned. |
B |
the number of bootstrap samples to take. |
model |
which model to fit to the data, either |
returnBootValues |
if |
silent |
if |
plot |
if |
parallel |
if |
progressBar |
if |
pbopts |
a list of arguments for the |
This function uses bootstrapping to compute a confidence interval for
the shape parameter in the case of the zeta model and a confidence region in
the case of the zero-inflated zeta model. A smoothed bootstrap approach is
taken rather than a simple percentile method. The kernel density estimation
is performed by the ks
package using a smoothed cross-validated
bandwidth selection procedure.
If returnBootVals == TRUE
then the results are returned in a
list with elements named ci
and bootVals
for the zeta model
and confRegion
and bootVals
for the zero-inflated zeta model.
The structure of ci
and confregion
is described below. If
model == "zeta"
, then either a vector
or a data.frame
with elements/columns named "lower"
and "upper"
representing
the lower and upper bounds of the confidence interval(s). Multiple bounds
are returned in a data.frame
when level
has more than one
value. If model == "ziz
"
, then a list with length equal to the
length of level
is returned. The name of each element in the list is
the level with
list has a single element named "95%"
. It is possible for there to
be multiple contours for the confidence region for a given level
. If
there is only one contour for each value of level
, then each element
of the list consists of a list
with elements named pi
and
shape
which specify the coordinates of the contour(s) for that level.
There is a third element named level
which gives the height of the
kernel density estimate at that contour. If there are multiple contours for
a given value of level
then each list element is a list of lists with
the structure given above (level
, pi
, and shape
). NOTE:
it is quite possible that there are multiple contours for a given height. If
you want a way of thinking about this consider a mountain range with two
mountains of equal height. If you draw the contours for (almost) any
elevation, then you would expect to capture a region from each mountain.
bootCI(default)
: Bootstrap confidence intervals or regions
bootCI(psData)
: Bootstrap confidence intervals or regions
bootCI(psFit)
: Bootstrap confidence intervals or regions
## Not run: data(Psurveys) roux = Psurveys$roux confRegion = bootCI(roux, model = "ziz", parallel = FALSE, plot = TRUE) ## This will not work unless you have the sp package installed ## Count how many of the points lie within the 95% confidence region lapply(confRegion, function(cr){ table(sp::point.in.polygon(fit$pi,fit$shape, cr$pi, cr$shape)) . }) ## End(Not run)
## Not run: data(Psurveys) roux = Psurveys$roux confRegion = bootCI(roux, model = "ziz", parallel = FALSE, plot = TRUE) ## This will not work unless you have the sp package installed ## Count how many of the points lie within the 95% confidence region lapply(confRegion, function(cr){ table(sp::point.in.polygon(fit$pi,fit$shape, cr$pi, cr$shape)) . }) ## End(Not run)
Compare two surveys on the basis of their shape parameters
compareSurveys(x, ...) ## Default S3 method: compareSurveys( x, y, xname = NULL, yname = NULL, alternative = c("two.sided", "less", "greater"), null.value = 0, print = TRUE, ... ) ## S3 method for class 'psData' compareSurveys(x, y, ...) ## S3 method for class 'psFit' compareSurveys(x, y, ...) compare.surveys(x, ...) comp.survs(x, ...)
compareSurveys(x, ...) ## Default S3 method: compareSurveys( x, y, xname = NULL, yname = NULL, alternative = c("two.sided", "less", "greater"), null.value = 0, print = TRUE, ... ) ## S3 method for class 'psData' compareSurveys(x, y, ...) ## S3 method for class 'psFit' compareSurveys(x, y, ...) compare.surveys(x, ...) comp.survs(x, ...)
x |
either an object of class |
y |
either an object of class |
xname |
an optional name for the first survey object. |
yname |
an optional name for the second survey object. |
alternative |
one of |
null.value |
the true value of the difference in the shape parameters under the null hypothesis. |
print |
if |
... |
further arguments to be passed to or from methods. |
This function **only** works for the zeta distribution. It does not work for the zero-inflated zeta distribution. If the results from fitting ZIZ models are passed to this function, then it will ignore the zero-inflated part and simply refit a zeta model.
There is very little reason for null.value
to be set to be anything other than 0
. However it has been included for flexibility.
alternative = "greater"
is the alternative that x has a larger shape parameter than y.
alternative = "less"
is the alternative that x has a smaller shape parameter than y.
The function returns a list
of class "htest"
with the following elements:
statistic
– the test statistic.
p.value
– the P-value associated with the estimate.
estimate
– the estimated difference in the shape parameters.
null.value
– the specified hypothesized value of the difference in shape parameters—0
by default.
stderr
– the standard error of the difference.
alternative
– a character string describing the alternative hypothesis.
method
– a character string describing the method.
data.name
– a character string with the names of the two input data sets separated by " and ".
compareSurveys(default)
: Compare two surveys on the basis of their shape parameters
compareSurveys(psData)
: Compare two surveys on the basis of their shape parameters
compareSurveys(psFit)
: Compare two surveys on the basis of their shape parameters
compare.surveys()
: Compare two surveys on the basis of their shape parameters
comp.survs()
: Compare two surveys on the basis of their shape parameters
data(Psurveys) lau = Psurveys$lau jackson = Psurveys$jackson compareSurveys(lau, jackson) ## Example with fitted objects - note the function just refits the models fit.lau = fitDist(lau) fit.jackson = fitDist(jackson) compareSurveys(fit.lau, fit.jackson) ## Example with a bigger difference compareSurveys(Psurveys$roux, lau)
data(Psurveys) lau = Psurveys$lau jackson = Psurveys$jackson compareSurveys(lau, jackson) ## Example with fitted objects - note the function just refits the models fit.lau = fitDist(lau) fit.jackson = fitDist(jackson) compareSurveys(fit.lau, fit.jackson) ## Example with a bigger difference compareSurveys(Psurveys$roux, lau)
Compare two or more surveys on the basis of their shape parameters using a Likelihood Ratio Test
compareSurveysLRT(...)
compareSurveysLRT(...)
... |
two or more objects of class |
This function **only** works for the zeta distribution. The function carries out a likelihood ratio test (LRT) to test the null hypothesis
versus the alternative
where is the shape parameter for the zeta distribution of the
survey.
The function returns a list
of class "htest"
with the following elements:
statistic
– the test statistic.
parameter
– the degrees of freedom for the test
p.value
– the P-value associated with the estimate.
method
– a character string describing the method hypothesis.
data.name
– the names of the data sets used in the test
data(Psurveys) lau = Psurveys$lau jackson = Psurveys$jackson compareSurveysLRT(lau, jackson) ## Example with three surveys roux = Psurveys$roux compareSurveysLRT(lau, jackson, roux)
data(Psurveys) lau = Psurveys$lau jackson = Psurveys$jackson compareSurveysLRT(lau, jackson) ## Example with three surveys roux = Psurveys$roux compareSurveysLRT(lau, jackson, roux)
S3 confint method for objects of class psFit
## S3 method for class 'psFit' confint(object, parm, level = 0.95, ...)
## S3 method for class 'psFit' confint(object, parm, level = 0.95, ...)
object |
an object of class |
parm |
added for compatibility. Should be left empty as it is ignored. |
level |
the confidence level required—restricted to [0.75, 1) |
... |
in theory other parameters to be passed to |
NOTE: the method for ZIZ model is a little computationally intensive and possibly (almost certainly) unstable.
if the zeta model is used (i.e object
comes from a call to
fitDist
),then a list with two items: wald
and
prof
containing the Wald and profile likelihood confidence intervals
respectively for the shape parameter of the fitted zeta distribution is
returned. In general these should be relatively close to each other.
**NOTE** These values are for the VGAM parameterisation of the Zeta
distribution which uses . This means they
can be used without alteration in
dzeta
. If a
zero-inflated zeta model is used (i.e. object
comes from a call to
fitZIDist
) then list of a confidence regions is returned with
and element for each value of level
. The confidence regions are
data.frame
s with variables pi
and shape
which can be
used with lines
or polygon
to draw a the confidence region.
data(Psurveys) roux = Psurveys$roux fit = fitDist(roux) confint(fit) ## Not run: fit.zi = fitZIDist(roux) cr = confint(fit.zi, level = c(0.80, 0.95)) plot(cr[["0.95"]], type = "l") polygon(cr[["0.8"]]) ## End(Not run)
data(Psurveys) roux = Psurveys$roux fit = fitDist(roux) confint(fit) ## Not run: fit.zi = fitZIDist(roux) cr = confint(fit.zi, level = c(0.80, 0.95)) plot(cr[["0.95"]], type = "l") polygon(cr[["0.8"]]) ## End(Not run)
This function uses maximum likelihood estimation (MLE) to estimate the shape parameter of a zeta distribution from a set of observed counts for either the number of groups/sources of forensically interesting material (mostly glass or paint) recovered from clothing, or the number of fragments/particles in each group. This, in turn, allows the estimation of the P and S probabilities, as described by Evett and Buckleton (1990), which used in computing the likelihood ratio (LR) for activity level propositions. The data itself arises from clothing surveys. The general method is described in Coulson et al. (2001), although poor typesetting, and a lack of definition of terms makes it hard to see. This package improves on the estimation in that linear interpolation is not required, and standard numerical optimisation is used instead. The zeta distribution has probability mass function
where
is the Reimann Zeta function. Coulson et al. (2001) did not have an easy way
to rapidly compute this quantity, hence their use of linear interpolation.
fitDist(x, nterms = 10, start = 1, ...) fitdist(x, nterms = 10, start = 1, ...)
fitDist(x, nterms = 10, start = 1, ...) fitdist(x, nterms = 10, start = 1, ...)
x |
an object of type |
nterms |
the number of terms to compute the probability distribution for. |
start |
a starting value for the optimiser. |
... |
other parameters - not currently used. |
The function returns an object of class psFit
which is a
list
contains four elements:
psData
– an object of class psData
–see readData
,
fit
– the fitted object from optim
,
shape
– the maximum likelihood estimate of the shape parameter,
var.shape
- the maximum likelihood estimate of the shape parameter,
fitted
- a named vector
containing the first nterms of
the fitted distribution.
model
- set to "zeta"
for this model
.
The output can be used in a variety of ways. If the interest is just in the
shape parameter estimate, then the shape
member of the psFit
object contains this information. It is also displayed along with a number
of fitted probabilities by the print.psFit
method. The fitted
object can also be plotted using the plot method plot.psFit
,
and to create a probability function with probfun
. **NOTE**
The value of the shape parameter that is printed (if you print the fitted
object) is different from that value that is stored in shape
. The
stored value is for the VGAM parameterisation of the Zeta
distribution which uses . Therefore the
printed value is
. If you intend to use
the fitted value with
dzeta
, then you should use the
stored value .
If start
is not specified, then it is chosen randomly from (0.5, 1).
The reason the lower value is not zero is that small starting values seem
to cause instability in the likelihood. If you specify your own starting
value, it would be sensible to keep it above 0.5.
an object of class psFit
–see Details.
fitdist()
: Fit a Zeta Distribution to Forensic Data
export
Coulson, S. A., Buckleton, J. S., Gummer, A. B., and Triggs, C.M., "Glass on clothing and shoes of members of the general population and people suspected of breaking crimes", Science & Justice 2001: 41(1): 39–48.
Evett, I. W. and Buckleton, J. S., "The interpretation of glass evidence. A practical approach", Journal of the Forensic Science Society 1990: 30(4): 215–223.
plot.psFit
, print.psFit
,
probfun
.
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) fit = fitDist(p) fit
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) fit = fitDist(p) fit
psFit
S3 fitted method for an object of class psFit
## S3 method for class 'psFit' fitted(object, n = NULL, ...)
## S3 method for class 'psFit' fitted(object, n = NULL, ...)
object |
an object of class |
n |
This parameter is |
... |
other arguments passed to |
a named vector of fitted probabilities
This function uses maximum likelihood estimation (MLE) to estimate mixing parameter and the shape parameter of a zero-inflated zeta distribution from a set of observed counts for either the number of groups/sources of forensically interesting material (mostly glass or paint) recovered from clothing, or the number of fragments/particles in each group. This, in turn, allows the estimation of the P and S probabilities, as described by Evett and Buckleton (1990), which used in computing the likelihood ratio (LR) for activity level propositions. The data itself arises from clothing surveys. The zero-inflated zeta distribution has probability mass function
where is the Reimann Zeta function.
fitZIDist(x, nterms = 10, start = c(0.5, 1), ...) fitZIdist(x, nterms = 10, start = c(0.5, 1), ...) fitzidist(x, nterms = 10, start = c(0.5, 1), ...)
fitZIDist(x, nterms = 10, start = c(0.5, 1), ...) fitZIdist(x, nterms = 10, start = c(0.5, 1), ...) fitzidist(x, nterms = 10, start = c(0.5, 1), ...)
x |
an object of type |
nterms |
the number of terms to compute the probability distribution for. |
start |
a starting value for the optimiser. |
... |
other parameters - not currently used. |
The function returns an object of class psFit
which is a
list
contains seven elements:
psData
– an object of class psData
–see readData
,
fit
– the fitted object from optim
,
pi
- the maximum likelihood estimate of the mixing parameter,
shape
– the maximum likelihood estimate of the shape parameter,
var.cov
– the estimated variance-covariance matrix for the parameters,
fitted
– a named vector
containing the first nterms of
the fitted distribution.
model
– set to "ziz"
for this model.
The output can be used in a variety of ways. If the interest is just in the
mixing and shape parameter estimates, then the pi
and shape
member of the psFit
object contains this information. It is also
displayed along with a number of fitted probabilities by the
print.psFit
method. The fitted object can also be plotted
using the plot method plot.psFit
, and to create a probability
function with probfun
. **NOTE** The value of the shape
parameter that is printed (if you print the fitted object) is different
from that value that is stored in shape
. The stored value is for the
VGAM parameterisation of the zeta distribution which uses
. Therefore the printed value is
. If you intend to use the fitted value with
dzeta
, then you should use the stored value
.
If start
is not specified, then it is set to (0.5, 1). The reason
the starting values are not zero is that small starting values seem to
cause instability in the likelihood. If you specify your own starting
value, it would be sensible to keep both above 0.5.
an object of class psFit
–see Details.
Evett, I. W. and Buckleton, J. S., "The interpretation of glass evidence. A practical approach", Journal of the Forensic Science Society 1990: 30(4): 215–223.
plot.psFit
, print.psFit
,
probfun
.
data(Psurveys) roux = Psurveys$roux fit = fitZIDist(roux) fit
data(Psurveys) roux = Psurveys$roux fit = fitZIDist(roux) fit
Create a survey data set from the command line rather than reading data in from a file. This function is likely to be only useful where there are a very small number of group sizes, or sizes of groups of glass.
makePSData(n, count = NULL, type = c("P", "S"), notes = NULL) makeData(n, count = NULL, type = c("P", "S"), notes = NULL) createPSData(n, count = NULL, type = c("P", "S"), notes = NULL)
makePSData(n, count = NULL, type = c("P", "S"), notes = NULL) makeData(n, count = NULL, type = c("P", "S"), notes = NULL) createPSData(n, count = NULL, type = c("P", "S"), notes = NULL)
n |
Either the number of groups of glass or the size of different groups
of glass, or a |
count |
Either the number of people in the survey sample who had
|
type |
either |
notes |
a |
If count
is NULL
, then it is assumed that n
consists of actual observed group sizes or numbers of groups of glass found
on a survey of N individuals. That is, one could provide n = rep(0:1,
98, 1)
or n = 0:1, count = c(98, 1)
. The former is more useful when
performing simulation studies.
an object of type psData
—see readData
for more
details.
readData
## recreate the data read in the readData example p1 = makePSData(n = c(0, 1, 2), count = c(98, 1, 1), type = "P") s1 = makePSData(n = 1:3, count = c(1, 1, 1), type = "S") p1 s1
## recreate the data read in the readData example p1 = makePSData(n = c(0, 1, 2), count = c(98, 1, 1), type = "P") s1 = makePSData(n = 1:3, count = c(1, 1, 1), type = "S") p1 s1
An S3 method for computing the mean of clothing survey for the number of groups or size of groups
## S3 method for class 'psData' mean(x, ...)
## S3 method for class 'psData' mean(x, ...)
x |
an object of class |
... |
other arguments which are passed to |
the mean of the data. If there are observations of
the value
then the mean is given by
.
data(Psurveys) mean(Psurveys$roux)
data(Psurveys) mean(Psurveys$roux)
psFit
S3 plot method for an object of class psFit
## S3 method for class 'psFit' plot( x, ylim = c(0, 1), conf = FALSE, conf.level = 0.95, ci.type = c("wald", "prof"), log.scale = FALSE, ... )
## S3 method for class 'psFit' plot( x, ylim = c(0, 1), conf = FALSE, conf.level = 0.95, ci.type = c("wald", "prof"), log.scale = FALSE, ... )
x |
an object of class |
ylim |
the limits of the y-axis. |
conf |
if |
conf.level |
the confidence level for the confidence intervals. Must be between 0.75 and 0.99. |
ci.type |
Specifies the type of confidence interval. If |
log.scale |
if |
... |
other arguments passed to |
No return value, called for side effects
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) fit = fitDist(p) plot(fit) ## An example with Wald generated intervals plot(fit, conf = TRUE) plot(fit, conf = TRUE, ci.type = "p")
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) fit = fitDist(p) plot(fit) ## An example with Wald generated intervals plot(fit, conf = TRUE) plot(fit, conf = TRUE, ci.type = "p")
psFit
S3 predict method for an object of class psFit
## S3 method for class 'psFit' predict( object, newdata, interval = c("none", "prof", "wald"), level = 0.95, ... )
## S3 method for class 'psFit' predict( object, newdata, interval = c("none", "prof", "wald"), level = 0.95, ... )
object |
an object of class |
newdata |
an optional vector of integers at which to calculate
|
interval |
either |
level |
the level of a confidence interval. Ignored if |
... |
other arguments passed to |
either a named vector of fitted probabilities, or a data.frame
with columns predicted
, lower
, and upper
and the row
names set to show what terms are being calculatd
data(Psurveys) roux = Psurveys$roux fit = fitDist(roux) predict(fit, interval = "prof")
data(Psurveys) roux = Psurveys$roux fit = fitDist(roux) predict(fit, interval = "prof")
psData
S3 print method for an object of class psData
## S3 method for class 'psData' print(x, ...)
## S3 method for class 'psData' print(x, ...)
x |
an object of class |
... |
other arguments passed to |
No return value, called for side effects
psFit
S3 print method for an object of class psFit
## S3 method for class 'psFit' print(x, ...)
## S3 method for class 'psFit' print(x, ...)
x |
an object of class |
... |
other arguments passed to |
No return value, called for side effects.
Creates a probability function that allows the computation of any P or S term.
probfun(psFitobj)
probfun(psFitobj)
psFitobj |
a function that can be used to calculate any P or S term.
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) fit = fitDist(p) P = probfun(fit) P(0:5)
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) fit = fitDist(p) P = probfun(fit) P(0:5)
Count data from six different surveys looking at the number of sources/groups of glass found on the upper surfaces of clothing taken from the general public.
data(Psurveys)
data(Psurveys)
A list
with nine objects of class psData
—see
readData
for more details. The elements of the list are
named: coulson
, jackson
, lau
, lewis.all
,
lewis.clothing
, lewis.shoes
, pettard
,
ross
, and roux
, corresponding to the lead author in each of
the references given below. lau
, pettard
, and ross
were taken from Coulson et al. (2001) rather than the original source. The three
objects starting with lewis
represent the combined data (all
),
the groups of glass found on the outer clothing (clothing
), and the groups of
glass found on shoes/footwear (shoes
).
Coulson, S. A., Buckleton, J. S., Gummer, A. B., and Triggs, C. M. (2001) doi:10.1016/S1355-0306(01)71847-3 Glass on clothing and shoes of members of the general population and people suspected of breaking crimes, Science & Justice, 41(1):39–48.
Lau L, Beveridge AD, Callowhill BC, Conners N, Foster K, Groves RJ, Ohashi KN, Sumner AM, Wong H (1997). “The Frequency of Occurrence of Paint and Glass on the Clothing of High School Students.” Canadian Society of Forensic Science Journal, 30(4), 233–240. doi:10.1080/00085030.1997.10757103.
Lewis AD, Alexander LC, Ovide O, Duffett O, Curran JM, Buzzini P, Trejos T (2023). “A study on the occurrence of glass and paint across various cities in the United States—Part I: Background presence of glass in the general population.” Forensic Chemistry, 34, 100497. doi:10.1016/j.forc.2023.100497.
Petterd CI, McCallum I, Bradford L, Brinch K, Stewart S (1998). “Glass particles in the clothing of the general population in Canberra—a survey.” In Proceedings of the 14th International Symposium on the Forensic Sciences.
Ross P, Nguyen H (1998). “A survey of clothing for the presence of glass fragments.” In Proceedings of the 14th International Symposium on the Forensic Sciences.
Roux C, Kirk R, Benson S, Van Haren T, Petterd CI (2001). “Glass particles in footwear of members of the public in south-eastern Australia—a survey.” Forensic Science International, 116(2), 149–156. doi:10.1016/S0379-0738(00)00355-8.
Jackson F, Maynard P, Cavanagh-Steer K, Dusting T, Roux C (2013). “A survey of glass found on the headwear and head hair of a random population vs. people working with glass.” Forensic Science International, 226(1), 125–131. doi:10.1016/j.forsciint.2012.12.017.
Reads observed counts of either the number of groups or the size of the groups. The file must have only two columns. One of the columns must be labelled P or S and the other count. It does not matter if the column names are in upper case or not. The P column can have labels 0, 1, 2, ... representing the observation of 0, 1, 2, or more groups. The corresponding count column should contain a positive (non-zero) count for each number of groups. Similarly, if the file contains S counts, then the S column can contain labels 1, 2, ... representing the observation of 1, 2, ... fragments in a group. Note that zeros are neither allowed, or useful, in the file as they both simply result in log-likelihood terms of zero, and therefore make no difference.
readData(fileName, notes = NULL, ...)
readData(fileName, notes = NULL, ...)
fileName |
the name of the file to be read. Must be either a modern (xlsx) Excel file or a csv file. |
notes |
any additional information about the data, such as the source or a reference. |
... |
any additional parameters which will be passed to either
|
an object of class psData
which is a list containing member
variables:
type
– either "P"
or "S"
data
– a data.frame
which contains columns
n
and rn
, representing the number of groups/fragments, and the
number of times that was seen, respectively.
notes
— either a bibentry
or a character
string which allows extra information about the data to be stored, such as
the source, or reference.
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) p s = readData(system.file("extdata", "s.xlsx", package = "fitPS")) s
p = readData(system.file("extdata", "p.xlsx", package = "fitPS")) p s = readData(system.file("extdata", "s.xlsx", package = "fitPS")) s
Generate random variates from a zeta distribution
rzeta(n, shape)
rzeta(n, shape)
n |
Same as |
shape |
The positive shape parameter
. See |
Generate zero inflated zeta random variates
rZIzeta(n, pi = 0.5, shape = 1, offset = 0) rzizeta(n, pi = 0.5, shape = 1, offset = 0) rzizeta(n, pi = 0.5, shape = 1, offset = 0)
rZIzeta(n, pi = 0.5, shape = 1, offset = 0) rzizeta(n, pi = 0.5, shape = 1, offset = 0) rzizeta(n, pi = 0.5, shape = 1, offset = 0)
n |
the number of observations. |
pi |
the mixing parameter for the zero-inflated zeta model—must be in (0, 1). |
shape |
the shape parameter for the zero-inflated zeta. Must be greater than zero. |
offset |
the zeta distribution returns random variates that are greater
than, or equal to one. If the offset is greater than 0, then the
distribution is anchored on (has minimum value of) |
Technically this function returns values from the one-inflated zeta
distribution. However, if offset
is greater than zero (and typically
we expect it to be 1), then the minimium random variate value is 1 -
offset
. We chose the name "zero-inflated zeta" as more people are familiar
with zero-inflated models.
a vector of random variates from a zero-inflated zeta model
data(Psurveys) roux = Psurveys$roux fit.zi = fitZIDist(roux) x = rZIzeta(n = sum(roux$data$rn), pi = fit.zi$pi, shape = fit.zi$shape) table(x)
data(Psurveys) roux = Psurveys$roux fit.zi = fitZIDist(roux) x = rZIzeta(n = sum(roux$data$rn), pi = fit.zi$pi, shape = fit.zi$shape) table(x)
Count data from six different surveys looking at the number of sources/groups of glass found on the upper surfaces of clothing taken from the general public.
data(Psurveys)
data(Psurveys)
A list
with five objects of class psData
—see
readData
for more details. The elements of the list are
named: jackson
, lau
, pettard
, ross
, and
roux
, corresponding to the lead author in each of the references
given below. lau
, pettard
, and ross
were taken from
Coulson et al. (2001) rather than the original source.
Coulson, S. A., Buckleton, J. S., Gummer, A. B., and Triggs, C. M. (2001) doi:10.1016/S1355-0306(01)71847-3 Glass on clothing and shoes of members of the general population and people suspected of breaking crimes, Science & Justice, 41(1):39–48.
Lau L, Beveridge AD, Callowhill BC, Conners N, Foster K, Groves RJ, Ohashi KN, Sumner AM, Wong H (1997). “The Frequency of Occurrence of Paint and Glass on the Clothing of High School Students.” Canadian Society of Forensic Science Journal, 30(4), 233–240. doi:10.1080/00085030.1997.10757103.
Petterd CI, McCallum I, Bradford L, Brinch K, Stewart S (1998). “Glass particles in the clothing of the general population in Canberra—a survey.” In Proceedings of the 14th International Symposium on the Forensic Sciences.
Ross P, Nguyen H (1998). “A survey of clothing for the presence of glass fragments.” In Proceedings of the 14th International Symposium on the Forensic Sciences.
Coulson SA, Buckleton JS, Gummer AB, Triggs CM (20011). “Glass on clothing and shoes of members of the general population and people suspected of breaking crimes.” Science & Justice, 41(1), 39–48. doi:10.1016/S1355-0306(01)71847-3.
Roux C, Kirk R, Benson S, Van Haren T, Petterd CI (2001). “Glass particles in footwear of members of the public in south-eastern Australia—a survey.” Forensic Science International, 116(2), 149–156. doi:10.1016/S0379-0738(00)00355-8.
Jackson F, Maynard P, Cavanagh-Steer K, Dusting T, Roux C (2013). “A survey of glass found on the headwear and head hair of a random population vs. people working with glass.” Forensic Science International, 226(1), 125–131. doi:10.1016/j.forsciint.2012.12.017.
psFit
S3 summary method for an object of class psFit
## S3 method for class 'psFit' summary(object, ...)
## S3 method for class 'psFit' summary(object, ...)
object |
|
... |
other arguments passed to |
Experimental because I am unsure if it is useful. If object
is a zero-inflated zeta fitted object,
then the function carries out a likelihood ratio test for the value of pi. Currently not implemented for the logarithmic
distribution because we are currently not interested in the logarithmic distribution.
No return value, called for side effects
Variance generic
var(x, ...)
var(x, ...)
x |
an object for which we want to compute the sample variance. |
... |
Any additional arguments to be passed to |
An S3 method for computing the variance of clothing survey for the number of groups or size of groups
## S3 method for class 'psData' var(x, ...)
## S3 method for class 'psData' var(x, ...)
x |
an object of class |
... |
other arguments which are passed to |
the mean of the data. If there are observations of
the value
then the variance is computed by
, where
is computed using
, and
is computed by
. We realise that the
computational formula,
, is usually not
regarded as computationally stable, but the magnitude of the numbers
involved is such that, that this is not likely to cause an issue.
data(Psurveys) var(Psurveys$roux)
data(Psurveys) var(Psurveys$roux)