Title: | Multivariate Likelihood Ratio Calculation and Evaluation |
---|---|
Description: | Functions for calculating and evaluating likelihood ratios from uni/multivariate continuous observations. The package includes two-level functions that calculate the LR assuming multivariate normality, and another which drops this assumption and uses a multivariate kernel density estimate. The package also contains code to perform empirical cross entropy (ECE) calibration of likelihood ratios. The LR functions are based primarily on Aitken, C.G.G. and Lucy, D. (2004) <doi:10.1046/j.0035-9254.2003.05271.x>, "Evaluation of trace evidence in the form of multivariate data," Journal of the Royal Statistical Society: Series C (Applied Statistics), 53: 109-122. The ECE functions are based primarily on D. Ramos and J. Gonzalez-Rodriguez, (2008) "Cross-entropy analysis of the information in forensic speaker recognition," in Proc. IEEE Odyssey, Speaker Lang. Recognit. Workshop. |
Authors: | David Lucy [aut], James Curran [aut, cre], Agnieszka Martyna [aut] |
Maintainer: | James Curran <[email protected]> |
License: | GPL (>=2) |
Version: | 1.0-5 |
Built: | 2025-03-04 02:46:12 UTC |
Source: | https://github.com/jmcurran/comparison |
Calculates the empirical cross-entropy (ECE) for likelihood ratios from a sequence of same-item and different-item comparisons.
calc.ece(LR.ss, LR.ds, prior = seq(from = 0.01, to = 0.99, length = 99))
LR.ss | a vector of likelihood ratios (LRs) from same source calculations |
LR.ds | a vector of LRs from different source calculations |
prior | a vector of ordinates for the prior, in ascending order and between 0 and 1. The default is 99 equally spaced values from 0.01 to 0.99. |
The function to calculate the values of the likelihood ratio for the calibrated.set draws heavily upon the opt_loglr.m function from Niko Brummer's FoCal package for Matlab.
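As a rough illustration of the quantity involved, the sketch below evaluates the empirical cross-entropy as it is commonly written in the ECE literature: for a prior probability p with odds p/(1-p), the same source LRs contribute p times the mean of log2(1 + 1/(LR x odds)) and the different source LRs contribute (1-p) times the mean of log2(1 + LR x odds). The helper name ece_sketch is hypothetical and is not part of the package; it does not reproduce the calibrated set or the ece object that calc.ece() returns.

```r
# Hypothetical helper for illustration only; not the package's calc.ece().
ece_sketch <- function(LR.ss, LR.ds,
                       prior = seq(from = 0.01, to = 0.99, length.out = 99)) {
  sapply(prior, function(p) {
    odds <- p / (1 - p)                            # prior odds for same source
    p * mean(log2(1 + 1 / (LR.ss * odds))) +       # penalty on same source LRs
      (1 - p) * mean(log2(1 + LR.ds * odds))       # penalty on different source LRs
  })
}

LR.same <- c(0.5, 2, 4, 6, 8, 10)
LR.different <- c(0.2, 0.4, 0.6, 0.8, 1.1)

# ECE of the observed LRs over the default grid of priors
ece.observed <- ece_sketch(LR.same, LR.different)

# ECE of a reference system that always reports LR = 1 (no information)
ece.null <- ece_sketch(rep(1, 6), rep(1, 5))
```

With every LR equal to one, the expression reduces to the binary entropy of the prior, which is why the reference curve peaks at a prior of 0.5.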
Returns an S3 object of class ece
David Lucy
D. Ramos and J. Gonzalez-Rodriguez, (2008) "Cross-entropy analysis of the information in forensic speaker recognition," in Proc. IEEE Odyssey, Speaker Lang. Recognit. Workshop.
Zadora, G. & Ramos, D. (2010) Evaluation of glass samples for forensic purposes - an application of likelihood ratio model and information-theoretical approach. Chemometrics and Intelligent Laboratory Systems: 102; 63-83.
isotone::gpava(), calibrate.set()
LR.same = c(0.5, 2, 4, 6, 8, 10) # the same has 1 LR < 1
LR.different = c(0.2, 0.4, 0.6, 0.8, 1.1) # the different has 1 LR > 1
ece.1 = calc.ece(LR.same, LR.different) # simplest invocation
plot(ece.1) # use plot method
Calculates and returns the calibrated set of 'ideal' LRs from the observed LRs using the pool adjacent violators (PAV) algorithm. This is very much a rewrite of Niko Brummer's 'opt_loglr()' function for Matlab.
calibrate.set(LR.ss, LR.ds, method = c("raw", "laplace"))
LR.ss | a vector of likelihood ratios for the comparisons of items known to be from the same source |
LR.ds | a vector of likelihood ratios for the comparisons of items known to be from different sources |
method | the method used to perform the calculation, either "raw" or "laplace" |
This is an internal function, and is not meant to be called directly. However it has been exported just in case.
a list with two items:
- calibrated LRs for the comparison for same set
- calibrated LRs for the comparison for different set
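The idea behind this kind of calibration can be sketched with base R alone. The example below is an assumption-laden illustration, not the package's code: it uses stats::isoreg() in place of isotone::gpava(), covers only a "raw"-style fit with no Laplace correction, and the name pav_calibrate_sketch and the list element names ss and ds are hypothetical. The log LRs are pooled and sorted, an isotonic (monotone) fit of the 0/1 source labels gives a posterior probability for each score, and dividing the posterior odds by the sample prior odds turns that back into an LR.

```r
# Hypothetical helper for illustration only; not the package's calibrate.set().
pav_calibrate_sketch <- function(LR.ss, LR.ds) {
  scores <- log(c(LR.ss, LR.ds))
  labels <- c(rep(1, length(LR.ss)), rep(0, length(LR.ds)))  # 1 = same source

  # isotonic regression of the labels on the sorted scores
  ord <- order(scores)
  fit <- isoreg(scores[ord], labels[ord])

  # put the fitted posterior probabilities back in the input order
  posterior <- numeric(length(scores))
  posterior[ord] <- fit$yf

  # posterior odds divided by the prior odds implied by the sample sizes
  prior.odds <- length(LR.ss) / length(LR.ds)
  calibrated <- (posterior / (1 - posterior)) / prior.odds

  idx.ss <- seq_along(LR.ss)
  list(ss = calibrated[idx.ss], ds = calibrated[-idx.ss])
}

LR.same <- c(0.5, 2, 4, 6, 8, 10)
LR.different <- c(0.2, 0.4, 0.6, 0.8, 1.1)
cal <- pav_calibrate_sketch(LR.same, LR.different)
```

Without a correction, perfectly separated scores receive calibrated LRs of zero or infinity, which is the usual motivation for a Laplace-style adjustment.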
David Lucy
D. Ramos and J. Gonzalez-Rodriguez, (2008) "Cross-entropy analysis of the information in forensic speaker recognition," in Proc. IEEE Odyssey, Speaker Lang. Recognit. Workshop.
This package is for computing the weight of the evidence, i.e. the likelihood ratio (LR), for trace evidence which has been quantified with some instrument. For example, a forensic scientist might have determined the refractive indices (RI) of fragments of glass taken from a crime scene and of fragments of glass recovered from the clothing of the suspected breaker. This package evaluates the probability (density) of the evidence, $E$ (the RI values from the two samples), under the hypothesis $H_1$ that they originated from the same source, and alternatively under the hypothesis $H_2$ that they originated from another source. The LR is the ratio of these two quantities, i.e. $LR = f(E \mid H_1) / f(E \mid H_2)$. An $LR$ which is greater than one indicates that the evidence supports $H_1$, and an $LR$ which is less than one indicates that the evidence supports $H_2$.
The computation takes either univariate or multivariate observations of a physical object (for example, trace element measurements) and a similar set of uni/multivariate observations from another object, and calculates a likelihood ratio for the proposition that the first item came from the same source as the second, given some population data.
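To make the ratio-of-densities idea concrete, here is a deliberately simplified toy, not one of the package's models. It assumes the evidence has been reduced to a single difference d between two measurements, that d is normal with a small standard deviation when the items share a source, and normal with a larger one when they do not. The function name toy_lr and both standard deviations are invented for illustration.

```r
# Toy illustration only: an LR as the ratio of two densities of the same evidence.
# sd.same and sd.diff are made-up values, not estimates from any data.
toy_lr <- function(d, sd.same = 0.001, sd.diff = 0.01) {
  dnorm(d, mean = 0, sd = sd.same) /   # density of d if the source is the same
    dnorm(d, mean = 0, sd = sd.diff)   # density of d if the sources differ
}

lr.small.difference <- toy_lr(0)       # no observed difference
lr.large.difference <- toy_lr(0.005)   # a difference that is large for one source

# log10 of the LR is a common way to report the weight of the evidence
log10(c(lr.small.difference, lr.large.difference))
```

A small difference gives an LR above one (support for the same source) and a large difference gives an LR below one (support for different sources).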
In a package of functions such as these which have undergone a long development over a number of years, it is inevitable that a number of people, besides those directly cited, have helped to correct and add to the code. These people are (in alphabetical order): Ivo Alberink (NFI), Anabel Bolck (NFI), Sonja Menges (BKA), Geoff Morrison (Aston), Tereza Neocleous (Glasgow), Anders Nordgaard (SKL), Brad Patterson (George Mason), Phil Rose (ANU), Agnieszka Rzepecka (Jagiellonian), Marjan Sjerps (NFI) and Hanjing Zhang (Edinburgh).
Aitken, C.G.G. & Lucy, D. (2004) Evaluation of trace evidence in the form of multivariate data. Applied Statistics: 53(1):109-122.
These data are from Grzegorz (Greg) Zadora at the Institute of Forensic Research in Krakow, Poland. They are the logs of the ratios of each element to oxygen, so logNaO is the log (base 10) of the sodium to oxygen ratio, and logAlO is the log of the aluminium to oxygen ratio. The instrumental method was SEM-EDX.
data(glass)
a data.frame with 2400 rows and 9 columns.
Column | Type | Description |
---|---|---|
1 | factor | 200 levels - which item the measurements came from |
2 | factor | 4 levels - which of the four fragments from each item the observations were made upon |
3 | numeric | log of sodium concentration to oxygen concentration |
4 | numeric | log of magnesium concentration to oxygen concentration |
5 | numeric | log of aluminium concentration to oxygen concentration |
6 | numeric | log of silicon concentration to oxygen concentration |
7 | numeric | log of potassium concentration to oxygen concentration |
8 | numeric | log of calcium concentration to oxygen concentration |
9 | numeric | log of iron concentration to oxygen concentration |
The item indicates the object the glass came from. The levels for each item are unique to that item. The fragment can be considered a sub-item. When collecting these observations Greg took a glass object, say a jam jar, broke it, and extracted four fragments. Each fragment was measured three times on different parts of that fragment. The fragment labels are repeated, so, for example, fragment "f1" from item "s2" has nothing whatsoever to do with fragment "f1" from item "s101".
For two-level models use item as the lower level; three-level models can use the additional information from the individual fragments.
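The nesting described above can be mimicked with a simulated data frame, which may help when checking that code treats the fragment labels correctly. Everything in this sketch is synthetic: the replicate column does not exist in the real data, the logNaO values are random placeholders, and the fragment.id column is a hypothetical helper for three-level work.

```r
# Synthetic stand-in with the same layout: 200 items x 4 fragments x 3 measurements.
toy <- expand.grid(replicate = 1:3,
                   fragment = paste0("f", 1:4),
                   item = paste0("s", 1:200))
toy <- toy[, c("item", "fragment", "replicate")]

set.seed(42)
toy$logNaO <- rnorm(nrow(toy))   # placeholder values, not real measurements

# each item contributes 12 rows, and each item/fragment pair contributes 3
per.item <- table(toy$item)
per.fragment <- table(toy$item, toy$fragment)

# fragment labels repeat across items, so a unique fragment identifier
# has to combine the item and the fragment
toy$fragment.id <- interaction(toy$item, toy$fragment, drop = TRUE)
nlevels(toy$fragment.id)
```

Grouping by fragment alone would wrongly pool the "f1" fragments of all 200 items, which is why the combined identifier is built.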
Grzegorz Zadora Institute of Forensic Research, Krakow, Poland.
Aitken, C.G.G. Zadora, G. & Lucy, D. (2007) A Two-Level Model for Evidence Evaluation. Journal of Forensic Sciences: 52(2); 412-419.
An S3 plot method for objects of class ece
## S3 method for class 'ece'
plot(x, ...)
x | an S3 object of class ece |
... |
other arguments that are passed to the |
David Lucy
Create a compitem object.
This function creates a compitem object from a data.frame or matrix of observations from an item to be deemed a control, or a recovered, item.
two.level.comparison.items(data, data.columns)
data |
a |
data.columns |
vector of integers giving which columns in |
an object of class compitem
# load Greg Zadora's glass data
data(glass)
# calculate a compitem object representing the control item
control = two.level.comparison.items(glass[1:6,], c(7,8,9))
Takes a large sample from the background population and calculates the within and between covariance matrices, a vector of means, a vector of the counts of replicates for each item from the sample, and other bits needed to make up a compcovar object.
two.level.components(data, data.columns, item.column)
data |
a |
data.columns |
a |
item.column |
an integer indicating which column gives the item |
Uses ML estimation at the moment - this will almost certainly change in the future and hopefully allow regularisation methods to get a more stable (and non-singular) estimate.
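For orientation, the sketch below shows one simple way to obtain within-item and between-item covariance matrices, item counts, and a mean vector from data laid out like the background sample. It is not the package's estimator: it uses ANOVA-style moment estimates for roughly balanced data rather than ML, and the function name two_level_moments_sketch, the returned element names, and the simulated data are all assumptions made for illustration.

```r
# Hypothetical helper for illustration only; not the package's two.level.components().
two_level_moments_sketch <- function(data, data.columns, item.column) {
  x <- as.matrix(data[, data.columns, drop = FALSE])
  item <- as.integer(factor(data[[item.column]]))
  counts <- tabulate(item)       # number of replicates for each item
  m <- length(counts)            # number of items
  N <- nrow(x)                   # total number of observations

  item.means <- rowsum(x, item) / counts
  overall.mean <- colMeans(x)

  # within-item sums of squares and cross-products
  resid <- x - item.means[item, , drop = FALSE]
  within <- crossprod(resid) / (N - m)

  # between-item covariance: spread of the item means, less the part of
  # that spread which is due to within-item noise
  dev <- sweep(item.means, 2, overall.mean)
  between <- crossprod(dev) / (m - 1) - within / mean(counts)

  list(within = within, between = between,
       means = overall.mean, counts = counts)
}

# simulated background sample: 50 items, 4 replicates each, 2 variables
set.seed(1)
m <- 50
n <- 4
item <- rep(seq_len(m), each = n)
theta <- matrix(rnorm(m * 2, sd = 2), ncol = 2)   # item-level means
toy <- data.frame(item = item,
                  a = theta[item, 1] + rnorm(m * n, sd = 0.5),
                  b = theta[item, 2] + rnorm(m * n, sd = 0.5))
est <- two_level_moments_sketch(toy, c(2, 3), 1)
```

A moment estimate of the between-item matrix is not guaranteed to be positive definite in small samples, which is one reason regularised estimates are attractive.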
an object of class compcovar
# load Greg Zadora's glass data
data(glass)
# calculate a compcovar object based upon glass
# using K, Ca and Fe - warning - could take time
# on slower machines
Z = two.level.components(glass, c(7,8,9), 1)
Takes a compitem object which represents some control item, and a compitem object which represents a recovered item, then uses information from a compcovar object, which represents the information from the population, to calculate a likelihood ratio as a measure of the evidence given by the observations for the same/different source propositions.
two.level.density.LR(control, recovered, background)
control | a compitem object |
recovered | a compitem object |
background | a compcovar object |
an estimate of the likelihood ratio
Aitken, C.G.G. & Lucy, D. (2004) Evaluation of trace evidence in the form of multivariate data. Applied Statistics: 53(1); 109-122.
library(comparison)
# load Greg Zadora's glass data
data(glass)
# calculate a compcovar object based upon glass
# using K, Ca and Fe - warning - could take time
# on slower machines
Z = two.level.components(glass, c(7,8,9), 1)
# calculate a compitem object representing the control item
control = two.level.comparison.items(glass[1:6,], c(7,8,9))
# calculate a compitem object representing the recovered item
# known to be from the same item (item 1)
recovered.1 = two.level.comparison.items(glass[7:12,], c(7,8,9))
# calculate a compitem object representing the recovered item
# known to be from a different item (item 2)
recovered.2 = two.level.comparison.items(glass[19:24,], c(7,8,9))
# calculate the likelihood ratio for a known
# same source comparison - should be 20.59322
# 2020-08-01 Both this version and the previous version return 20.58967
lr.1 = two.level.density.LR(control, recovered.1, Z)
lr.1
# calculate the likelihood ratio for a known
# different source comparison - should be 0.02901532
# 2020-08-01 Both this version and the previous version return 0.01161392
lr.2 = two.level.density.LR(control, recovered.2, Z)
lr.2
Takes a compitem object which represents some control item, and a compitem object which represents a recovered item, then uses information from a compcovar object, which represents the information from the population, to calculate a likelihood ratio as a measure of the evidence given by the observations for the same/different source propositions.
two.level.lindley.LR(control, recovered, background)
control | a compitem object |
recovered | a compitem object |
background | a compcovar object |
Does the likelihood ratio calculations for a two-level model assuming that the between item distribution is univariate normal. This function is taken from the approach devised by Denis Lindley in his 1977 paper (details below) and represents the progenitor of all the functions in this package.
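The structure of a univariate two-level normal LR can be sketched directly, working from the two sample means. This is a textbook normal-normal calculation under stated assumptions, not Lindley's published expression and not the package's code: the within-item variance sigma2, the between-item variance tau2, and the population mean mu are treated as known, and the name normal_lr_sketch and all numeric inputs are invented. If both items share a source, the two means are jointly normal with covariance tau2; if not, they are independent with the same marginals.

```r
# Hypothetical helper for illustration only; not two.level.lindley.LR().
# y1, y2: means of the control and recovered measurements
# n1, n2: number of measurements behind each mean
normal_lr_sketch <- function(y1, y2, n1, n2, mu, tau2, sigma2) {
  a <- tau2 + sigma2 / n1        # marginal variance of the control mean
  b <- tau2 + sigma2 / n2        # marginal variance of the recovered mean

  # same source: bivariate normal with covariance tau2 between the means
  Sigma <- matrix(c(a, tau2, tau2, b), nrow = 2)
  d <- c(y1, y2) - mu
  quad <- drop(t(d) %*% solve(Sigma, d))
  joint <- exp(-0.5 * quad) / (2 * pi * sqrt(det(Sigma)))

  # different sources: the two means are independent
  joint / (dnorm(y1, mu, sqrt(a)) * dnorm(y2, mu, sqrt(b)))
}

# identical means: the evidence favours a common source
lr.close <- normal_lr_sketch(0.5, 0.5, n1 = 6, n2 = 6,
                             mu = 0, tau2 = 1, sigma2 = 0.01)
# widely separated means: the evidence favours different sources
lr.far <- normal_lr_sketch(-1, 1, n1 = 6, n2 = 6,
                           mu = 0, tau2 = 1, sigma2 = 0.01)
```

The LR also depends on how rare the observed values are in the population: two matching means far from mu give a larger LR than two matching means near mu.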
an estimate of the likelihood ratio
David Lucy
Lindley, D. (1977) A problem in forensic science. Biometrika: 64; 207-213.
# load Greg Zadora's glass data
data(glass)
# calculate a compcovar object based upon data
# using K
Z = two.level.components(glass, 7, 1)
# calculate a compitem object representing the control item
control = two.level.comparison.items(glass[1:6,], 7)
# calculate a compitem object representing the recovered item
# known to be from the same item (item 1)
recovered.1 = two.level.comparison.items(glass[7:12,], 7)
# calculate a compitem object representing the recovered item
# known to be from a different item (item 2)
recovered.2 = two.level.comparison.items(glass[19:24,], 7)
# calculate the likelihood ratio for a known
# same source comparison - should be 6.323941
# This value is 6.323327 in this version and in the last version written by David (1.0-4)
lr.1 = two.level.lindley.LR(control, recovered.1, Z)
lr.1
# calculate the likelihood ratio for a known
# different source comparison - should be 0.004422907
# This value is 0.004421978 in this version and the last version written by David (1.0-4)
lr.2 = two.level.lindley.LR(control, recovered.2, Z)
lr.2
Takes a compitem object which represents some control item, and a compitem object which represents a recovered item, then uses information from a compcovar object, which represents the information from the population, to calculate a likelihood ratio as a measure of the evidence given by the observations for the same/different source propositions.
two.level.normal.LR(control, recovered, background)
control | a compitem object |
recovered | a compitem object |
background | a compcovar object |
Does the likelihood ratio calculations for a two-level model assuming that the between item distribution is uni/multivariate normal.
an estimate of the likelihood ratio
Agnieszka Martyna and David Lucy
Aitken, C.G.G. & Lucy, D. (2004) Evaluation of trace evidence in the form of multivariate data. Applied Statistics: 53(1); 109-122.
# load Greg Zadora's glass data
data(glass)
# calculate a compcovar object based upon glass
# using K, Ca and Fe - warning - could take time
# on slower machines
Z <- two.level.components(glass, c(7,8,9), 1)
# calculate a compitem object representing the control item
control <- two.level.comparison.items(glass[1:6,], c(7,8,9))
# calculate a compitem object representing the recovered item
# known to be from the same item (item 1)
recovered.1 <- two.level.comparison.items(glass[7:12,], c(7,8,9))
# calculate a compitem object representing the recovered item
# known to be from a different item (item 2)
recovered.2 <- two.level.comparison.items(glass[19:24,], c(7,8,9))
# calculate the likelihood ratio for a known
# same source comparison - should be 51.16539
# This value is 51.14243 in this version and the last version David wrote (1.0-4)
lr.1 <- two.level.normal.LR(control, recovered.1, Z)
lr.1
# calculate the likelihood ratio for a known
# different source comparison - should be 0.02901532
# This value is 0.02899908 in this version and the last version David wrote (1.0-4)
lr.2 <- two.level.normal.LR(control, recovered.2, Z)
lr.2