If you found this file in an archive then use keyword "nutteingy" in a
search engine to find an updated version or related pages.
Updated file March 2006
Despite official government sites linking to these files there are
still corrupt persons knocking out my sites, so for the
purposes of search engines cross-linking them, files no longer
available on the original web hosting sites were on
http://www.nutteing.50megs.com/dnay.htm , http://www.nutteing.freeisp.co.uk/dnay.htm,
http://nutteing.no-frills.net/dnay.htm and http://nutteing3.no-frills.net/dnay.htm (last 2 due now to host failure)
http://www.nutteing.batcave.net/dnay.htm , http://home.graffiti.net/nutteing/dnay.htm
Articles posted to a moderated Yahoo community group for forensic science
personnel starting 16 July 2003, title of thread "Problems with DNA Profiles", under
my nom de guerre of Nona Revers or, as it turns up there, nonarevers.
Text in brown is my contribution - due to the nature of such dialogues the thread can get rather broken as it splits into different sub-themes at different times and involving different people.
It shows the very dangerous blinkered mindset of forensic scientists. I have to assume the same attitude is prevalent within the police and the judiciary.
Problems that even Prof Sir Alec Jeffreys will not address.
DNA profiles for unrelated people are very far from
Cases from real life
1/ http://tinyurl.com/dtfe Raymond Easton of Swindon ,Newspaper
15 August 2000
or if not showing there then
Seriously disabled but arrested for a cat burglary 200 miles away.
Peter Hamkin of Liverpool,
newspaper report ,10 March 2003
Arrested for murder 1000 miles away
3/ http://18.104.22.168/germany.asp?pad=190,205,&item_id=31550 27
May 2003 - the Goettingen prisoner ,newspaper report
also published in Die Welt ,24 May 2003.
The supposed murderer but he had the perfect alibi - he was in
prison at the time of the murder.
Add in the
http://news.bbc.co.uk/1/hi/england/3007854.stm Milly Dowler case
probably false match
All cases of two totally separate unrelated people having
the same DNA profile and that is just the ones that get
to be published - there are many unreported so called
'unresolved duplicate pairs' of such matches in the UK
Although DNA profiles are not necessarily unique, your statement is
misleading. The weight of a "DNA profile" is dependent on the loci
used. If only one locus is used, it is possible that almost everyone
in the population will match the DNA profile. If a sufficient number
of markers are used, the profile can approach uniqueness.
Just because one profile with a given typing system may be nearly
unique, this does not mean that every profile is unique. This is
particularly true with mixed samples.
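A minimal Python sketch of why the number of loci matters, assuming the standard product rule (Hardy-Weinberg genotype frequencies multiplied across independent loci). The allele frequencies below are made-up illustration values, not taken from any population study, and the function names are hypothetical.

```python
from math import prod


def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency at one locus.

    p*p for a homozygote (one allele frequency given),
    2*p*q for a heterozygote (two allele frequencies given).
    """
    if q is None:
        return p * p
    return 2 * p * q


def random_match_probability(genotype_freqs):
    """Product rule: multiply per-locus genotype frequencies,
    assuming the loci are independent."""
    return prod(genotype_freqs)


# Hypothetical example: the same heterozygous genotype (allele
# frequencies 0.2 and 0.1) at every locus, purely for illustration.
per_locus = [genotype_frequency(0.2, 0.1)] * 10

for n_loci in (1, 6, 10):
    rmp = random_match_probability(per_locus[:n_loci])
    print(f"{n_loci:2d} loci: random match probability about {rmp:.3g}")
```

With one locus the match probability is large; each added locus multiplies it by another small factor, which is the sense in which a profile "can approach uniqueness" without ever reaching a probability of zero.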
I find it slightly amusing when people that don't understand the facts
behind what they are saying give no explanation, but simply cite
publications that are written by people that don't really understand
the facts behind what THEY are saying.
Unless you know otherwise these are all
full sets of loci producing full matches so
that the innocent were arrested except the
Gottingen prisoner of course.
No other corroborative evidence was used
- just a trawl of DNA databases.
Remember it is happening in the UK first
because they have 'joined-up' databases
now consisting of over 1.8 million such profiles.
My own profile is too close to the UK caucasian
norm (nowhere more than 2 alleles away from the median
on each locus) to be comfortable
You proved my point. You don't have any knowledge as to whether these
are full profiles, mixtures or whatever. You make serious assumptions
in your argument.
Even if you have the most common alleles at each locus, it doesn't
indicate that your profile is common. I cannot give any additional
information to prove this, since I'm from America and we neither use
all the same loci as in SGM+ nor do I have the population study
information for the UK handy.
Although it is true that as a database size increases, it is possible
to have a match purely by chance, but these can be excluded by using
additional loci. This has been done in at least one of the cases you
Whether derived from mixed samples or not it
was considered sufficient grounds to arrest totally innocent
persons without the slightest corroboratory evidence.
My own profile based on the 10 loci
used in the UK FSS NDNAD structure is
( slightly altered for obvious reasons )
the UK caucasian median profile ie the 'average Joe' is
(assuming D2 Allele 20 UK subgroup for A,B as
the frequency plot is triple valued )
so normalised relative to UK caucasians ie taking the
difference between each element is
the UK caucasian 'average Joe 'would be all (0,0) numbers.
So you see the preponderance of 0s, 1s and 2s makes
me uncomfortable as far as being wrongly implicated
in some future crime whether in the UK or interpol-wise
somewhere else in Europe.
It was considered sufficient evidence to arrest him because it IS
sufficient evidence to arrest him. Arresting a person does not mean
the person must have committed the crime. It means that evidence
exists that shows the person is possibly or probably implicated in the
crime. If your gun is stolen and used to kill a person, you should
not be totally dumbstruck when the police come calling. If your blood
is at the scene of a crime because you accidentally cut yourself there
prior to the crime happening, it doesn't mean you committed the crime,
but the police have an obligation to investigate and possibly arrest
you. If the DNA types are a million times more likely if you left the
DNA than if a random person did, you should not be shocked if you are
arrested, because there is sufficient evidence to arrest you. If you
are CONVICTED, then there is a miscarriage of justice.
Your information about your DNA types and how they compare to the "average
Joe" shows a complete misunderstanding of DNA evidence. First of all
there is no "average" DNA type. I am assuming that your "average Joe"
type is a combination of the most common alleles (I have no idea where
you came up with this information). In any case, this would probably
be present in only a few individuals in Europe.
DNA typing has nothing to do with how many STR repeat units your type
is away from the most common allele. Your list of 0s, 1s and 2s is
complete nonsense. Before you question evidence, actually study it.
You are guilty of making the same error you are concerned about. You
rush to judgement before all the facts are gathered.
In your scenario I assume all guns in the USA have
unique serial numbers. Then indeed good evidence
but that is precisely my point DNA profiles are not unique.
Take a hypothetical for instance a bit in the future when the USA
(following the lead of the UK unusually) has structured a nationwide
DNA profile database covering all or at least many states and
If you in San Diego have a murder scene-of-crime DNA profile and you
interrogate the new states-wide database and find there is an exact
full loci match to an individual in New York. Despite there
being absolutely no independent corroboratory evidence
would you get NYPD to arrest this person the other side of the
I would posit that in this scenario there just isn't sufficient
reason to arrest someone. If after further investigation eg mobile
phone cell records/credit card transactions or whatever place
this New York person in your area at the relevant time - then there
My data on median groups within the UK is from
allele frequency tables in the forensic science journal
International Journal of Legal Medicine issues:
(2001) 114:147-155 By L.A. Foreman and I. W. Evett
and (1997) 110: 5-9 By I. W. Evett et al.
In forensic science terms the ideal frequency distribution of
alleles would be flat ie equal likelihood of occurrence right
through the range but alas profiles are an abstraction from biology
so a Gaussian distribution, generally speaking ,with
very specific locus/allele peaks for a given sub-population.
Table of the percentages of the caucasian population
within plus or minus 2 alleles of the median for each locus in
the 10 used in the UK FSS NDNAD
Locus / Percentage
eg the median allele for VWA is 17 with 27 percent ,those with 16,17
or 18 then 71 percent and for 15,16,17,18 or 19 (ie median +-2 )then
88 per cent of people.
So again in forensic science terms D2S1338 is best as 61 percent of
people are outside the median+-2 and the worst is D19 with
only 9 percent outside the median +-2.
Considering people within this combined median (average Joe + or -2 )
area then probabilities
of false matches between unrelated separate people with matching DNA
profiles is much less than the 100 trillion sort of figures normally
This principle equally applies to the USA but of course with
a different set of chosen loci to the UK SGM+ set.
Co-incidentally I came across this CBS report this weekend concerning
friction ridge ,dermal fingerprints problems
This has gone on long enough. There will be a great deal of coincidence at
an individual locus. There will be far less at a combination of loci. Your
example of the average Joe's profile and your own misses a crucial point.
The difference between the two means that no self respecting scientist would
call a match between a crime committed by somebody of that profile and
yourself, even taking margins of error into account.
There are legitimate concerns regarding whether DNA profiling is unique or
simply very rare. A full SGM+ profile is usually referred to as having a
random match statistic of 1:1bn. Given a population of the world of 6bn,
this suggests that there would be five other people in the world with
similar DNA to your own. These five people could be anywhere in the world.
It would be an incredible coincidence for one of them to be in the vicinity
of the crime and yourself, but not impossible. This is why DNA evidence
needs to be corroborated, but that corroboration must be tested in court. It
should in my opinion be perfectly acceptable for police to seek an
explanation from a suspect/ person to be eliminated of DNA evidence
suggesting the possibility that the person concerned may be responsible for
that crime. At the stage of arrest no determination of guilt or innocence
has been made, only that there is evidence that requires an explanation and
that there may be a case to answer.
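As a rough check on the arithmetic in the paragraph above, here is a minimal Python sketch. It assumes every person matches a given profile independently with probability 1/S, which ignores relatives and population substructure; the function names are illustrative only.

```python
import math


def expected_carriers(population, odds_against):
    """Expected number of people carrying a given profile,
    if each person matches independently with probability 1/odds_against."""
    return population / odds_against


def prob_at_least_one_other(population, odds_against):
    """Poisson approximation to the chance that at least one
    other person in the population shares the profile."""
    lam = (population - 1) / odds_against
    return 1.0 - math.exp(-lam)


world = 6_000_000_000   # world population used in the post
odds = 1_000_000_000    # quoted SGM+ random match statistic, 1 in 1bn

print(expected_carriers(world, odds))        # 6.0
print(prob_at_least_one_other(world, odds))  # a little under 1
```

The figure of six is the naive count of carriers; the "five other people" in the post corresponds to counting yourself as one of the six.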
Given your references to the UK I take it that you live here. Are you aware
of the case of Jeffrey Gafoor? He recently pleaded guilty to the murder of
Lynette White. The 1988 murder was then the most horrific murder in Welsh
history. He was caught by DNA evidence. Innovative police work regarding
particular alleles resulted in his capture after a partial match on the
National DNA Database indicated that a male relative was likely to be the
Gafoor gave buccal swabs which were matched to the crime scene samples.
Following that he attempted suicide. He was arrested solely on the basis of
the DNA results. He was subsequently interviewed and admitted his guilt. Are
you seriously saying that after a match was declared to Gafoor on the DNA
police should not have arrested him? If you are, then I suggest you join the
real world asap, as Gafoor's admissions would not have occurred without his
arrest and an extremely vicious killer would not only have remained at
liberty, but he could not ever have been brought to justice.
I would certainly agree that in court, if it gets that far, there needs to
be corroborating evidence. In your example, the New York alibi would mean
that barring incompetence, etc. the person will not be charged and convicted
and I doubt that anybody on this list or in the scientific community would
think they should be, so what is your point? I would suggest that after a
match is called police in San Diego would be duty-bound to investigate that
person. From having read Adam's posts for many months I am sure that once he
established there was no corroborating evidence, he would pursue other lines
of enquiry, but how do you suggest that police check whether the man was in
New York or not and what is your definition of corroboratory evidence? The
simplest method would be to ask him and his associates. After all, just
because he lives in New York, doesn't mean he can't have been in San Diego
at the relevant time does it?
You suggest that there is no corroboratory evidence in your example, but you
fail to indicate how corroboration would be sought. I think you have it
about face. The DNA hit indicates that the person is worthy of further
investigation and that an explanation of it may be required. How do you
propose to obtain corroboration and/or eliminate that person?
In Gafoor's case that required arrest. And for the record, Gafoor's arrest
did not prove his guilt, it meant that he was a person who had to be spoken
to under arrest in order to try to explain the DNA evidence. You surely
would accept that police were entitled to seek an explanation of the DNA
evidence from him wouldn't you? If not, history would not and could not have
been made in Cardiff Crown Court on July 4th. By the way, that case had seen
a previous miscarriage of justice. Without the DNA evidence all but proving
Gafoor's guilt, the whispering campaign against the original defendants
would still be carrying on as we speak! There may be some problems with DNA
evidence, but please let us not throw the baby out with the bathwater.
I agree that it would be very logical to increase the number of loci
looked at, as the statistical probability of there being a match
between several people would greatly decrease. However, how many
more should be added? Those who distrust DNA evidence or simply do
not understand it (and its statistical importance) will probably
never be satisfied with the amount of loci used. There will always
be that "grey area" unless every single person ever to live on the
earth has their DNA tested at every single locus, and the entire
universe is explored just to make sure that there are no humans or
other creatures with similar DNA that happen to be living anywhere
You're probably right, Shannon. Many will never be satisfied and that's OK
considering the awful consequences of being wrong in a death penalty case. But
as I remember we went from 6 to 13 loci with the last upgrade (RFLP to STR) in
DNA technology, maybe an increase will accompany the next generation (LCN?).
Anything that makes us more certain and less nervous about capital or long
sentence criminal convictions. Can you imagine the nightmare of being
imprisoned for someone else's crime?
Your posts continue to show poor understanding of population genetics
and the DNA testing performed. I cannot continue lengthy responses to
your post, so I will try to keep it simple.
1) The USA has a DNA database comparable to the database in the UK.
2) DNA profiles are unique if sufficient loci are used (excepting
3) The burden of proof for arresting a person is the lowest burden in
the justice system. It is simply "probable cause". Being arrested
does not mean you are guilty, it simply means that the law enforcement
agency has a reason to believe that you are probably involved.
4) If a search of the national database in the USA leads to a match,
it will probably be sufficient for an arrest. Usually additional work
is done, but it probably doesn't need to.
5) Although I don't have access to the journal that you cite, your
use of the information is seriously flawed. First of all, the
statistics don't have anything to do with how close to the most common
type you are. In addition, you have to account for dizygosity. You
are also neglecting to take into account the fact that many of the
people that have the most common allele at one locus have an uncommon
allele at another locus.
You have some idea that half of the people in England or Europe or
wherever have the same DNA type. If you actually tried to take the
time to understand what you are looking at, you will realize that you
are totally unjustified in your position. You are trying to come up
with some off-base argument to justify your position, but you don't
really understand what you are saying. Very few people have most of
the most common alleles.
One quick note on the statistical differences between RFLP and STR
technologies. With RFLP there were many more possible combinations at
each locus than with STRs. This is part of the reason why the FBI
required 13 loci for STRs vs. the six for RFLP. The stats for six
RFLPs are similar to 13 STRs. But STRs are preferred for many other
reasons (particularly since they require significantly less sample).
The profiles used in DNA analysis now are limited to 13 probes/loci (?) as I
understand it. There are millions or billions of genetics pairings. Even given
that genes are made up of more than one pairing, there must be millions of
billions of possible combinations. Even with just the 13, I've heard figures of
up to 9 1/2 trillion to one. Isn't the statistical uniqueness of a profile a
function of how many genetic characteristics you compare? So if there were to
be a question in a given case, just increase the number of comparison points,
Studies could be done to validate those contingency characteristics. Tempest in
I am not one of those who will never be satisfied. I happen to believe that
it is one of the most important tools in the fight not just against crime,
but also against miscarriages of justice as well. In Britain the system of
SGM+ played a pivotal role in finally resolving the Lynette White case.
I had been involved in that for twelve years. Every system that was used in
Britain from 1988 onwards was used in that case. As such it is an excellent
case to review the effectiveness of DNA testing systems over time.
To reply to your questions, Shannon, in Britain SGM+ tests at ten loci (it
also tests at amelogenin which is very sensitive for the Y-chromosome). The
random match statistic quoted in Britain is 1:1 billion. Whether this
statistic accurately reflects the level of discrimination of that system I
know not, but this is what juries hear.
In 1999 Britain went from SGM (six plus amelogenin) to SGM+. The previous
system was quoted as offering a random match statistic of 1:50 million -
approximately the population of the UK.
I know that there are compatible genes that could easily be added to SGM+. I
would think that only one locus would need to be added in order for the
random match statistic offered in evidence in Britain to be greater than the
population of the world and that would satisfy me as it would be a case of
the statistic offered in evidence being such that it would offer compelling
evidence of inclusion until and unless an example is found of a random match
occurring using this system. Were that to occur then the statistic offered
would have to be wrong.
To date it has not occurred in SGM+, but were it to occur it would not prove
the statistic offered to be wrong for reasons I referred to in a previous
post. Within the statistic offered by SGM+ at present there is a grey area
in which unscrupulous people can find refuge or a miscarriage of justice can
occur. I just don't see why we have to live in this grey area when it is so
easy to offer certainty, at least until proven otherwise.
In an ideal world we would have the DNA of everyone to compare with each
other and test at every locus. That is not going to happen, nor should it.
It would be time-consuming and outrageously expensive without improving
matters significantly. It may satisfy some doubters, but it would bring
everything to a grinding halt with no prospect of ever catching up with the
backlog that it would create. In my view it is simply not a viable option,
however desirable it might be. The random match statistic only needs to
offer a level of discrimination that is greater than the population of the
world as that will in effect offer exclusivity until and unless an example
can be found to disprove it.
I would add that to function as effectively as it is capable of DNA
Databases need to contain the DNA of all citizens of that jurisdiction and
that this should occur throughout the world. I believe this to be in
everyone's interests as it would (possibly) generate new lines of enquiry,
especially in cold cases and it would enable wrongful convictions to be
corrected and prevent visits to 'the usual suspects' where their DNA does
not match that obtained from the crime scene. The Gafoor case is a good
illustration of the need for this. But for excellent police work involving a
profile stored on the National DNA Database the case would never have been
solved, but what if Gafoor's nephew had never been arrested at all? If that
were the case his DNA would never have been on the UK Database and nor would
his uncle's. This would have meant that the real killer not only would have
remained at liberty, but that he never could have been caught. In my opinion
that is too high a price to pay, especially when it could be resolved so
Hope that clarifies my position.
Firstly a general comment then some individual replies.
I am amazed at the complacency here. I can safely assume
most people in this group, for elimination purposes, have
their DNA profile on a database somewhere.
Just because you work in the criminal justice system does
not make you immune from becoming a "Peter Hamkin" (a previous post)
perhaps 5 years down the line and (falsely) arrested for some serious
crime in another part of the country or abroad even. You ,though, have enough savvy to
demand an extended-loci DNA profile test to get yourself excluded
but that is no comfort to the Peter Hamkins of this world.
Ironically the only persons not in this Damoclean / Kafka-esque
world are the likes of the 'Goettingen Prisoner' (a previous post)
safely banged-up inside at the time he would have been accused of murder.
There are hundreds of "unresolved duplicate pairs of profiles"
in the UK NDNAD first reported in the journal Forensic Science
International 95 (1998) p30.
Many of these would be one individual arrested and processed more
than once but using aliases. Is no one concerned about these situations?
No one other than myself is
concerned about all the other 'matches'. I don't have clearance to
interrogate the relevant databases . From the NDNAD find all
such pairs of matches and take both names and DOBs and interrogate
the dermal fingerprint database. A match in both databases - then
one individual using an alias. Mismatch in the dermal fingerprint database
- then two individuals sharing the same DNA profile. Some would be
clerical errors of course but in that area you have false inclusions and
false exclusions. As a scientist I find the (deliberate?) non-investigations of
such anomalies abhorrent and alien to the scientific ethic.
Individual replies to
From the promega site concerning SGM+ in the UK they quote 100 billion : 1
for random match.
A DNA 'match' without any other evidence just shows a set of numbers
derived from a crime-scene match a set of numbers on a database somewhere,
it is just a coincidence until further evidence.
A DNA profile is just a snapshot of part of someone's DNA it is not a unique
I'm surprised you raise the Gafoor case - I don't know the generic
term for this new technique; I will call it sub-set trawling, 10 point
matches instead of 20 (UK). I'm
surprised the civil liberties lot have not been screaming their heads off.
Collaring of suspects by using the DNA of their blood relatives who
happen to be recorded on a DNA database plus disturbance
to the dozens or hundreds snagged along the way. How
many people within a family share half the alleles? Brothers, sisters, parents, sons
and daughters. No wonder the arrestee side of the NDNAD will be stopped
at 3million profiles - using 10 point trawls along with serious number crunching
then in effect perhaps 20 million (one third) of the UK population are snareable.
Gafoor and the victim as far as I am aware were related so evidence of inclusion
You could have mentioned the similar case of Joseph Kappen implicated by DNA profile
alone although deceased. The relatives of Kappen now have the stigma of
a murderer in the family but because he is dead they have no chance of
vindication in court. Over 100 other men presumably grilled or were volunteered
to surrender a DNA sample now added and irretrievable from this
Juggernaut database. The whole sub-set trawling exercise is supposed to be initiated
again in the Scottish, Helen Scott/ Christine Eadie murders re-investigation.
To Wallyl & Shannon
I am reminded of a mathematical treatment for infinity.
You give me the largest number you can think of and I can
always add 1 to make it bigger.
In the UK 6 loci were considered by the forensic statisticians to
be perfectly adequate until the Raymond Easton (a previous post)
case forced a rejig to 10 loci. However many loci are used there
is still a finite chance of a false match.
I wish I had a rare locus/allele combination somewhere in
my profile but my rarest ( D2S1338 /18 ) is shared by 9 percent of UK caucasians.
I would not feel vulnerable to a future false match determination if just one
of them was, say, sub 1 percent frequency of occurrence.
You need to be pretty careful with the statistics for this, as it's
not as simple and obvious as you suggest. The chance of duplicates
occurring in substantial sized population is actually much greater
than it might appear at first sight. It is rather similar to
the 'Birthday problem', the chance of finding any two people with
the same birthday in a group as small as 20 people.
I have done the calculations, and I'd tell you that in a population
the size of the UK (circa 6E7), a random match probability of 1E9:1
for the DNA test means you can expect to find on the order of 3.6E6
people in the UK population whose profile is not unique, or 6% of
Depending on the circumstances of how the match was located, that
means the chance of latching on to an innocent could be as high as
3%, if you randomly select the innocent person (half-profile) from
the 6% of those with the same profile. That is approximately a 1:33
chance within the UK population only, which is much, much greater
than your '5 random matches expected in the world' (correct as far
as it goes, but not the whole story) leads you to expect. Not such
an 'incredible coincidence' after all!
If arrests and investigations are frequently being launched with
chances of error as large as a few percent, then it's likely that
innocents will be accused on a regular basis, and a chance of error
as large as that is certainly enough to constitute reasonable doubt
by itself, requiring additional and convincing corroborating
evidence to support a case.
You had also better be sure that the test really is as accurate as
you suggest - if it turned out to be even somewhat less accurate,
then the chance of false matches rapidly increases.
Hope that helps...
BTW, I do of course mean 3,600,000 people out of the 60 million with
a non-unique profile. That is to say, 1,800,000 different
profile 'pairs' with one profile shared by two people. Within that
3.6E6 people, there would also be a much lesser number of triples
and higher profile n-tuples, but those are second-order, third-order
etc effects - the likely number of 'pairs' being the most
significant and potentially surprising/disturbing result of the
Sorry if that wasn't clear - it can be difficult to express
precisely without ambiguity...
That was very,very interesting. I know it's easier said than done -
any chance of posting the background maths ? Despite just being
ordinary text - algebraic formulae or whatever rather than numeric so
I can plug in my own figures .
Today I revisited the 'Birthday Problem' to brush up on this sort of
This is someone's site about it
My maths is exceedingly rusty but I did manage to pick up a
typographical error in his formula -
for 36530 it should read 365^30. The first 6 elements of his table
are correct anyway,I could check them on my basic calculator. It
still seems intuitively wrong that 23 people gives a better than
evens chance that 2 people share a birthday. I was also somewhat
surprised that with 57 people you are 99 per cent certain to have 2
people sharing a birthday. Again intuitively I would have said for
99% then something over 150 people would be required. But that is
precisely the problem intuition does not square with billion : 1
sorts of probabilities.
Anyway with a bit of study I hope to be able to get my head round
your maths should you be kind enough to post it here.
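For anyone wanting to check the birthday figures quoted above, a minimal Python sketch follows. It assumes 365 equally likely birthdays and no leap years; the function name is just an illustration.

```python
def shared_birthday_probability(people, days=365):
    """Chance that at least two of `people` share a birthday,
    assuming `days` equally likely birthdays."""
    p_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


for n in (20, 23, 30, 57):
    print(n, round(shared_birthday_probability(n), 4))
```

Under these assumptions 23 people is the smallest group with a better than evens chance, and 57 people pushes the chance just past 99 per cent.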
As a first-order approximation for the number of non-unique
individuals, use the square of size of the database divided by the
odds against a single random match.
That is to say, if the size of the database (let it be N) is 1.8
million, and the odds against two profiles chosen at random matching
are (let it be S) 1 billion:1, then expect the number of profiles
which are not unique to be N^2 / S, or 3240 for these numbers. The
number of 'pairs' is that divided by 2, or 1620 pairs.
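The first-order approximation above can be sketched in Python so that other figures can be plugged in. The function names are hypothetical, and the pair count N(N-1)/(2S) is included only as a cross-check, under the assumption that every pair of profiles matches independently with probability 1/S.

```python
def expected_non_unique(n_profiles, odds_against):
    """First-order estimate of how many profiles are not unique: N^2 / S."""
    return n_profiles ** 2 / odds_against


def expected_pairs(n_profiles, odds_against):
    """Expected number of matching pairs: the number of distinct pairs,
    N(N-1)/2, times the match probability 1/S."""
    return n_profiles * (n_profiles - 1) / 2 / odds_against


# Figures from the post: database of 1.8 million, odds of 1 billion to 1.
print(expected_non_unique(1_800_000, 10**9))   # 3240.0
print(expected_pairs(1_800_000, 10**9))        # just under 1620

# Same odds applied to a population of 60 million.
print(expected_non_unique(60_000_000, 10**9))  # 3600000.0
```

As the post says, this is only valid while the non-unique profiles remain a small fraction of N; it does not correct for triples or larger groups.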
The approximation is valid until the number of non-unique profiles
starts becoming a significant proportion of the population. As the
number grows, it will start to be an over-estimate (in fact, once
there starts to be a significant chance of getting more than two
people sharing the same profile, i.e. triples and higher n-tuples).
In the context of DNA databases such as we are talking about, it's a
suitable approximation because by the time we start seeing triples
etc there's already a big problem with the number of duplicates!
To give you an idea, using an example I quoted earlier, if we take N
as 6E7 for the UK population, and leave S as 1 billion:1, then N^2 /
S is 3,600,000. That's actually a slight over-estimate, as with
those numbers there are likely to be a little over 100,000 triples
of three people sharing the same profile, and if you make an
adjustment for that you'll find that the number of pairs is actually
more like 3.4 million.
But I'm sure you'll agree that the number not unique is large enough
to be problematic long before we need to consider such second-order
I was expecting loads of factorials and complex sigma functions not a
plain and simple N*N/S.
Very impressive nevertheless. I remember a concept in maths
'reductio ad absurdum' . If there is a complex conjecture or
formula that is difficult to prove then check it by entering
a simplified case. If it throws up an inconsistency then the complex
form is suspect. I don't consider the discrepancy between 23 and 27
in the birthday problem simplified case to be disproof.
The US/CODIS database is subject to the same statistics assuming
each state uses the same loci set. The use of 13 loci reduces the number
of duplicate pairs but of course does not eliminate them.
It is absolutely amazing the 'powers-that-be' can keep such
a simple piece of maths and data out of the public arena (and technical
arena for that matter) . The quote I have for 300 unresolved DNA profile matches
in the NDNAD may relate back years before 2002. Unless there are
two tv documentaries titled "DNA in the Dock" although shown
December 2002 it was recorded and first shown maybe 5 years previously.
So the NDNAD having many thousands of profile matches plus the
aliases matches would be consistent. There must be a very firm
lid to keep such info from the public domain.
The following quote from the Regina v Watters ,2000,appeal court
judgement is now confirmed on primary source Butterworths Lexis Nexis
[Typing error in the last sentence 'acquitted who when' in the Lexis Nexis version,
later amended to 'acquitted when'.]
The other evidence results from more stringent tests that have been done on the DNA material that was available in this case. That is partly as a result of a case in which a 6 point match was found to produce two possible suspects, one of whom had been charged despite living at the other end of the country and had to be acquitted who when it was appreciated that the DNA matched a second person.
So part of the judiciary at least is aware of duplicate pairs of profiles in the NDNAD.
In normal circumstances you would expect a complete monograph at least on an otherwise
From Hansard [ Record of proceedings of the Houses of Parliament ] 08 April 2003
Mr. Bob Ainsworth: The total number of profiles held on the [ NDNAD ] database at 25 March 2003 was 2,094,858. The Forensic Science Service calculate that these profiles relate to an estimated 1,886,000 different individuals.
So it took only 2 weeks to access, process and relay this basic data.
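The Hansard figures above imply a surplus of profiles over individuals. A minimal sketch of that arithmetic follows, using only the two numbers quoted in the answer. The variable names are illustrative, and the figures by themselves say nothing about how the surplus splits between repeat samples, aliases and genuinely different people.

```python
# Figures quoted from the Hansard answer of 08 April 2003
total_profiles = 2_094_858         # profiles held on the NDNAD at 25 March 2003
estimated_individuals = 1_886_000  # FSS estimate of different individuals

# Profiles in excess of the estimated number of individuals
surplus_profiles = total_profiles - estimated_individuals

# The same surplus as a share of all profiles held
surplus_share = surplus_profiles / total_profiles

print(surplus_profiles)
print(f"{surplus_share:.1%}")
```

On these figures roughly one profile in ten is accounted for as a repeat of some kind.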
It would be nice to see how many of the 2,094,858 are repeat DNA profiles
broken down into:
a) repeated acquisition of a profile from the same person at different times,
with personal details agreeing within the bounds of clerical error.
b) same profile, markedly different personal details, but same dermal fingerprint data within error bounds.
c) same profile, markedly different personal details, different dermal fingerprints.
But that would destroy public confidence - so it will never be done.
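The breakdown into a), b) and c) above could be sketched along the following lines. This is only an illustration under stated assumptions: the `Record` fields are hypothetical, exact equality of names stands in for "within the bounds of clerical error", and nothing here reflects how the NDNAD is actually structured or queried.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # Hypothetical fields, chosen only for this illustration
    profile: str                   # the DNA profile, reduced to a comparable string
    name: str                      # personal details, reduced to a name
    fingerprint_id: Optional[str]  # dermal fingerprint reference, if one was taken


def classify_pair(first: Record, second: Record) -> str:
    """Assign a pair of records to category a, b or c from the list above."""
    if first.profile != second.profile:
        return "no match"
    if first.name == second.name:
        return "a"  # same person sampled on different occasions
    if first.fingerprint_id is None or second.fingerprint_id is None:
        return "unresolved"  # no fingerprint data to cross-check against
    if first.fingerprint_id == second.fingerprint_id:
        return "b"  # same person under different personal details (alias)
    return "c"  # same profile, different fingerprints: two separate people


if __name__ == "__main__":
    print(classify_pair(Record("P1", "A Smith", "F1"), Record("P1", "B Jones", "F2")))
```

Pairs with no fingerprint on one side stay unresolved in this sketch, which is the situation raised later in the thread.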
Dear Wally L and others -
I have only posted, to the best of my recollection, once previously to this
group. I am definitely out of my league and not a qualified expert. However,
Mr. Lind, I find your attitude towards other posters and your conduct
outrageous, insufferable and inexcusable. DNA is NOT so widely accepted any longer without corroboration.
My previous post was about the subjective versus the objective approaches to
both DNA and fingerprints. You agreed on the fingerprint aspect but as I
recall, you disagreed on the DNA aspect. You made reference to the FBI
comparing a match in the trillions whereby now they just say "it's a match".
It is my understanding and belief, founded on the journals I have read, that
there is a great difference between a DNA match and a DNA profile. In the case
of most major law enforcement agencies, a comparison is made to the profile,
not the match. There have been hundreds of incorrectly arrested individuals
based on DNA who have had to be released because of DNA.
In a time in our life when science has almost perfected forensic evidence, I
find that the technicians are less concerned, in too much of a hurry and
maintain a cavalier "who cares" attitude about much of the processing itself.
Whether it is inadequate testing of DNA or an improper description of the year,
make & model of a car, I find that liberty and freedom in this great nation of
ours is far too precious to allow those in "position" to have a cavalier
attitude. Perhaps it is time that the threshold be elevated for what
constitutes "probable cause" for an arrest.
Although it is my understanding and belief that the conduct is more prevalent
in America, other countries such as the UK are much more willing to share the
truth and expose the inadequacies. Thus the release of articles such as the
ones I've included below. In the USofA I have found that although we are very
willing to admit that wrongly convicted innocent people have been released
because of the results of DNA testing which was not available at the time of
their conviction and we make certain that the news hits front page, we are not
admitting publicly too often about people who have been wrongly arrested based
on DNA evidence.
The Court of Appeal in the UK said that DNA statistics were found to be
insufficient to convict without corroboration in a decision they returned in
In most criminal prosecutions where DNA evidence is utilized, the evidence
serves to corroborate, in a powerful manner, other circumstances pointing to
guilt of the accused. But should DNA evidence alone be sufficient to convict
when there is no corroborative evidence, except of the most generalized and
non specific nature? Some high courts have said "NO".
DNA mystery in murder probe
27 May 2003 GOETTINGEN - German justice officials investigating a murder six
years ago are faced with a baffling problem after a DNA sample appeared to
confirm the killer. The sample prosecutors found in connection with the murder
fitted the DNA profile of a 40-year-old man. But their sole suspect had the
perfect alibi - he was in jail at the time.
"This is a very mysterious affair," admitted Hanover public prosecutor Thomas
Klinge. The September 1997 murder of a 61-year-old woman, whose body was left
on a playground in Hanover after she had been beaten about the head with a
stone, had baffled police for several years. But last year specialists achieved
a breakthrough when they discovered small traces of DNA material on the
victim's bicycle. A check of the BKA federal police department's DNA databank
confirmed it matched the profile of a suspect with a previous record of
sexual offences. However, the man has been held at a high-security unit at
Goettingen's closed mental health hospital since the middle of 1997. Officials
at the unit have confirmed that it is absolutely secure. Director Gunter Heinz
said he was "100 per cent certain" the suspect could not have left and
returned. Klinge said there could be no doubt about the accuracy of the DNA
which had been tested by several institutes. Neither was there any reason to
believe the evidence had somehow appeared on the bicycle after the crime. But
he added: "The alibi appears to be absolutely reliable, and we have no knowledge
the man has an identical twin brother."
Cleared murder accused victim of DNA blunder
Mar 10 2003
By Chris Johnson Daily Post Correspondent
A MAN arrested on DNA evidence has been cleared of murdering an Italian woman
after police admitted he was the victim of an Interpol blunder.
Bartender Peter Hamkin, 23, was arrested in Merseyside on an extradition
warrant and hauled up before Bow Street Magistrates' Court in London.
His DNA fingerprint was said to be a perfect match for the man who shot
Annalisa Vincenti in Tuscany in August 2002.
The killer had escaped in a Rover car and Mr Hamkin, from Litherland, was
said to match the e-fit produced with the help of witnesses by the Italian
Police arrested him while he was pulling pints behind the bar at Buckley's
Pub, at Litherland, when it was crowded with regulars.
Mr Hamkin protested his innocence from the outset and swore that he had never
even been to Italy, let alone murdered anyone.
Following a 20-day ordeal he has been informed that the Metropolitan Police
have now ruled him out following a second DNA test.
Mr Hamkin said: "This has been the worst three weeks of my life. I've not
been able to sleep or think straight with this nightmare hanging over me.
"I've been a prisoner in my own home, constantly on edge thinking the Italian
police were going to arrive to take me away. "You hear about miscarriages of
justice and innocent men have been hanged for murders they did not commit.
"They picked me out of millions of men in Europe.
"I had dozens of alibi witnesses but as far as they were concerned I was
guilty because the DNA said so.
"I told the police who arrested me that I have never been to Italy and I
could prove it but they just did not listen.
"They marched me off and threw me in a cell overnight and then stuck me in a
van for eight hours to take me to London.
"I felt like I was trapped in some kind of crazy film and told myself this
could not be happening to me."
Mr Hamkin's DNA profile was kept on record by police after he was convicted
of drink driving in 2001.
He added: "It was a stupid thing to do but I never thought it would land me
in court accused of murder."
At Bow Street, Mr Hamkin had to find a £10,000 surety before magistrates
agreed to grant him bail when he faced an extradition warrant on February 16.
Mr Hamkin, who has a two-year-old daughter Alicia, is consulting his lawyer
with a view to suing police for wrongful arrest and false imprisonment.
His solicitor, Rex Makin, said: "The police say that a 'more refined result'
from a second DNA sample that shows that he is not a match for the Italian
"I suspect that the original DNA sample was bungled and that has resulted in
this terrible experience for Mr Hamkin.
"It begs a lot of questions about the procedures surrounding the routine
sampling of DNA and the conduct of police in accepting DNA file evidence."
Mr Hamkin must attend Bow Street on March 25 when the extradition proceedings
will be formally dropped.
Disabled man turns down payout offer
Invalid Raymond Easton has been offered £2,000 compensation by police after
he was arrested for a burglary he did not commit.
Parkinson's disease sufferer Mr Easton was arrested on the basis of DNA
evidence which later proved false.
He says the amount is an insult and has instructed his solicitor to ask for
at least £500 more.
He said: "It is not the amount, it is the principle of the thing and what they
are offering is not enough for what I have gone through."
Mr Easton, 49, of Pound Lane, Pinehurst, was arrested last year by Swindon
Police for a burglary which he was supposed to have committed 200 miles away in
Despite not being able to dress or bath himself and being unable to walk more
than 10 yards unassisted, Mr Easton was arrested in April and charged with a
burglary during which electrical equipment worth £440 was taken.
Police matched the DNA found at the crime scene to Mr Easton's, which was on
file from a domestic incident four years ago that resulted in a caution.
He was told the chances of it being wrong were one in 37 million.
Despite this evidence against him, Mr Easton was adamant that on the day the
burglary was committed he was at home, looking after his unwell 16-year-old
After being arrested by Swindon police on behalf of Greater Manchester police
he was kept in a cell from 9am to 4pm.
Mr Easton's solicitor demanded another DNA test, which was more accurate than
the first and led to the case being dropped.
A spokesman for Greater Manchester Police would only confirm the force does
have a compensation claim being made against it and is waiting to hear from Mr
I was amazed when I read the following piece that an American had
written as an aside on a Usenet golfing usergroup.
"Making your own (golf) clubs is a lot of fun and can add a lot to
your enjoyment of the game. It does not require any sort of high tech
qualifications or excess of expertise. My colleagues think it is a big
deal that I can fix my thermal cycler (PCR machine) and automated DNA
sequencer.....it's not. They are simple electrical/mechanical devices
and pulling a board or replacing a pump is no different than doing
the same for a cheap radio or a dishwasher."
You don't tinker with these sorts of expensive and sensitive machines,
however well intentioned.
Just moving a wiring loom could upset the calibration. I won't
identify that individual beyond saying that he probably works in a Mississippi
There is the ongoing fiasco of the Houston crime lab's leaking roof and
numerous other problems with other USA crime labs (though none reported
from Mississippi yet). As far as I know the profile
databases in the USA are fragmented to state or county level, with
no 'federal' combined database. False matches only start turning up
with large databases, so they will occur in country-wide databases
first. To balance your observation concerning Europe being more open
about reporting false arrests due to DNA 'evidence', I find it a bit
suspicious that I've not read any similar reports concerning
systemic/procedural errors in UK forensic labs, considering I have
written evidence of a dyslexic employed within the main Birmingham
Forensic Science Service site.
I apologize if my views offend you. That said, a "profile" is the
result of a DNA analysis procedure on a known or unknown sample.
A "match" is the comparison of a
known and a crime scene or other profile. There must be thousands of
unresolved profiles in databases that report to CODIS, because no
profile has been submitted to compare it to. Is that what you meant?
You are flat wrong about the acceptance of DNA. It always did have
to fit the circumstances of the case. If the suspect's DNA is
identified in the vagina of a rape victim, you still have to prove
that the incident was rape. DNA is not only accepted, it is expected
in today's courtroom. Due to the popularizing of it on TV, nearly
everyone knows about it. And, like fingerprints, if you don't have
DNA in your case presentation, you almost feel like explaining to
the judge or jury why not. Is that what you meant?
As for freedom, many innocent people have been freed from prison
through DNA analysis. The people who have been wrongly convicted
with the aid of DNA evidence, as in the Houston PD case, have
suffered this fate because of incompetence or criminal activity, not
because there is anything wrong with DNA analysis. Any technology
can be screwed up. Lab accreditation and proper supervision are what
What is this guy talking about? What unsolved DNA "profiles"? Outside of twins
there aren't 5 people who have the same DNA in the 120,000 years modern man
has been around. It is theoretically possible in the same way that it's
theoretically possible that this guy might catch a ride to Alpha Centauri on
the next UFO. Even taking the 13 loci STR profile, which the FBI says has 9 1/2
trillion to 1 stats, a profile is unique among all the hominids that have ever
lived. DNA is widely accepted, because it's common sense to an educated public.
I have recourse to evidence, not abuse.
The FSI article referred back to data as of 24 October 1966
and concerned just 6884 DNA profiles. Within those
profiles were 11 matches. Of those, how many were
cases of aliases and how many were 2 unique individuals
was not resolved then and is still not resolved, although
the database now holds over 1.8 million such profiles.
The latest figure I have for these unresolved matches
was 300 but I don't know when that interrogation
of the NDNAD was made.
Don't blame me if the controllers of these databases
deliberately don't resolve these matches.
No,the unresolved matches refer purely and solely to the section of
the NDNAD containing the profiles of arrestees (plus some others such
as crime scene examiners) not the database of profiles derived from
crime scenes - any matches in that database are usually just the same
criminal detected on different occasions at different crimes. Because
of the large numbers involved in such a database (1.8m) and the
statistics relating to large numbers there are hundreds or maybe now
thousands of pairs of profiles on this part of the NDNAD that are
from TWO or more SEPARATE unrelated individuals. It is confused
because there must also be matches recorded from the same individuals
using an alias at different times of arrest. Statistics cannot help
in quantifying what proportion of the "unresolved matched pairs"
the alias cases constitute.
So, let me get this straight...
If two people match exactly in the DNA database, you
would interrogate the dermal fingerprint database to
see if two aliases match. End of story. The
identical profile is simply a duplicate.
If, however, alias #1 doesn't match dermal
fingerprints to alias #2... A HA!! DNA goes by the
way of the dinosaur.
What if alias #2 never had a dermal print taken (only
DNA)? Only you and OJ Simpson would still be
searching for the real killer (on a golf course, as
the joke goes).
Furthermore, are you not now, in order to discredit
the current method of DNA profiling, requiring an
enormous dermal fingerprint database to show beyond a
reasonable doubt that 2 individuals share the same DNA
The requirement of a global dermal fingerprint
database to disprove DNA is abhorrent and alien to me.
Besides, don't we all know that the "average Joe" is
only +/- two loops and whorls from everyone else in
Replace with interrogation of the mugshot PNC
database instead of dermal (I just considered it
more rigorous) if you prefer for the cross-correlation.
But my point is that no one is bothered about
these unresolved matches, which is utterly astonishing.
I really don't want to get dragged into a long debate here, but I
think you completely misunderstand the Gafoor case. There really is
no doubt that Gafoor is the real murderer. The DNA evidence is
absolutely compelling, so compelling Gafoor tried to kill himself and
admitted his guilt on his way to hospital, not just to police, but to
ambulance staff. He has consistently maintained his guilt ever since
his arrest. He even apologised to Lynette's family and the original
defendants through his QC. He accepts responsibility for Lynette's
murder. I can't see why you seem unable to accept the evidence and
his guilty plea. If you think that the evidence is a ten allele
profile then I'm afraid you are showing your ignorance of the facts;
it is a ten loci profile with 20 alleles. The evidence in this case
is not the 12 allele partial profile of the boy, it is the full
profile of Jeffrey Gafoor. The 12 allele partial hit of the boy did
nothing more than narrow the search to relatives of the boy. Your
number crunching is absurd.
Do you seriously believe that a twelve allele hit would ensnare 20
million people? If so please explain how instead of narrowing down
the search to a third of the population it narrowed it down to one
family within which there was one person and only one person whose
profile was a perfect 20 allele match to crime scene samples.
The use of his nephew's profile was to narrow down possible lines
of enquiry. What on earth is wrong with that? The match was between
Gafoor's own profile and several crime scene samples, both from
samples discovered at the time of the murder and ones discovered
during the re-investigation. There is a crucial difference between
Kappen and Gafoor. Gafoor has faced trial and been found guilty
beyond reasonable doubt. Kappen is deceased and will never face trial
and as such is entitled to be presumed innocent. Being entitled to be
presumed innocent does not mean he is innocent; it means he has not
been proved guilty and never will be. What do you suggest police
should do in the Kappen case - ignore the DNA evidence altogether?
And while we are at it should they ignore the DNA evidence in the
Gafoor case as well. And before pointing to the other evidence
remember that it was the DNA evidence that resulted in the other
evidence being obtained. Without it this case could not have been
I don't know where you get your statistic from, but I was in court
for Gafoor's guilty plea. The statistic quoted was 1:1 billion. I
know that for a fact. The nephew's profile was a 12 allele out of 20
match to the crime scene profiles. Gafoor's was a full 20 allele
match. You seem most concerned with protecting civil rights, but are
strangely silent on the fact that the evidence has already proved
Gafoor's guilt. It is overwhelming and in the absence of a full
database such innovative work was the only means of solving it.
Gafoor had the opportunity to challenge this evidence. He pleaded
guilty instead. I don't understand what your problem is regarding the
police's work in this re-investigation. For the record the relatives
of the boy were asked to give voluntary swabs, which they did
including Gafoor. The only person arrested over this was Gafoor AFTER
the results indicated a perfect match to the crime scene profiles,
because at that stage it WAS reasonable to seek an explanation from
him. Had police waited Gafoor's suicide attempt would have succeeded
and then no doubt you would have gleefully raised the similarity with
the Kappen case. Then it would have been similar. Now there is no similarity
at all. They intended to keep him under surveillance before arrest.
His suicide attempt prevented that. What do you suggest they should
I notice that in your concern to avoid a miscarriage of justice you
completely ignore the miscarriage of justice that did occur in this
case. It was not suffered by Mr Gafoor; it was suffered by the
original defendants, who were cleared by the DNA testing you condemn.
The conviction of Gafoor has silenced the whispering campaign against
the original defendants once and for all. Save your concern for the
five men who were wrongly arrested, three of whom were wrongfully
convicted, and for Lynette's family who endured a living hell for 15
years as a result of that. They deserve your sympathy and concern.
Gafoor does not.
Having worked on this case for 12 years, both to prove the
innocence of the Cardiff Three and to get justice for Lynette by
finding her real murderer, I can tell you that your concern over
Gafoor is misplaced. Police did nothing wrong in that investigation.
I wish I could say the same about the original investigation. Sadly I
can't. I suggest that you take a good look at what happened in the
original case and what happened in the latest one before suggesting
that there may have been a miscarriage of justice. As one who knows
more than most about this case, let me assure you the evidence, not
just the DNA, against Gafoor is absolutely compelling. He is not the
victim of a miscarriage of justice and nobody's civil rights were
violated. The boy was never arrested in connection with Lynette's
murder, nor was anyone who gave buccal swabs other than Jeffrey
Gafoor, because Gafoor's was the only profile that matched those
obtained from the crime scene.
By the way what do you mean by Gafoor and the victim were related,
so evidence of inclusion anyway? This does not make sense. They were
not relatives and prior to the DNA evidence Gafoor had not occurred
as a suspect. But for the innovative narrowing down of profiles of
interest he never would have either. But before you claim victory
remember he admits that he killed Lynette and has done so at every
opportunity since confronted with the DNA evidence. Without the
innovative police work he never would have and this murder would
never have been solved and five entirely innocent men would continue
to endure a thoroughly unjustified whispering campaign. That is
the injustice of this case, not what happened to Gafoor. Please
remember that but for the excellent work by the police in this case
Jeffrey Gafoor would never have admitted his guilt. He had 15 years
to come forward voluntarily. He failed to do so. He had no intention
of ever doing so. He even watched men he knew to be innocent go to
prison for his crime. Their lives and those of their families and
those of Lynette's family have been wrecked by Gafoor. Society is
entitled to a very long rest from him.
Commiserations for your email server problem [ His server was firing-off multiples of the same email].
A year ago as part of an 80-person Cc group I was on the receiving
end, like all 80, of the same repeated original email, twice an
hour, continuously for a week. A simple filter blocked the immediate
problem at my end but it seems the problem was a server in a small
analytical laboratory that had closed for
annual holiday. The intended recipient there had a full mailbox and
an autoredirect to someone else
on the same server, who also had a redirect back to the first person.
Anyway back to Gafoor. So the end justifies the means, does it? I'm
quite confident Gafoor is the guilty party, that is not the problem.
I was relying on memory that he was somehow related to the victim so
I was wrong there.
Up until his confession (better to ambulance staff than the notorious
jailbird confessions) he was just the victim of coincidence of SoCo
sample profile matching his own set of numbers.
I would be grateful if you could tell me what the accepted generic
name for what I call sub-set trawling is. If subsets linking a
target to nephews are included, then 3 million NDNAD profiles would amply make the
whole UK population ensnareable.
How many also-rans from the database trawl, other than the nephew, had
a 10 or more allele 'match' with the SoCo sample(s)?
If you must rely on DNA profiles only, with no other corroboration,
then I suggest at least repeating the test on both samples using Powerplex,
mitochondrial DNA (if the SoCo sample is amenable) or some other set of loci,
but that still leaves the attribution problem (below).
Now concerning the vindicated 3 and other miscarriages of justice, I
have no problem with the use of DNA profiles as evidence of
exclusion. It is evidence of inclusion in a system that is perceived
to carry the concept of uniqueness, when it does not, that the
problems start. Why in a DNA database trawl do they always arrest the
first person who matches? It just so happens that no SoCo sample has
matched one of the unresolved pairs in the NDNAD. Similarly, for
identifying dead bodies there is no problem (other than lab
cross-contamination/errors): the DNA sample came from that dead body,
with no possibility of misattribution, as is always a possibility with
SoCo samples, i.e. planted to implicate (say). Are SoCOs trained to
determine the difference between a glove (say) left at the crime
scene by a burglar and worn by the burglar, as distinct from a burglar
dropping a glove previously owned by his enemy or some totally
innocent person? I think not. Without independent corroboration DNA
profiles can be worse than worthless.
I still don't understand what your problem with the Gafoor case is. It is self
evident that there will be more partial hits at fewer loci. The point in the
Gafoor case is that the searching of the database for some of the alleles
allowed numerous people to be eliminated. It really is quite simple. The
similarity of crime scene profiles to that of the boy indicated that his male
relatives were people whose DNA profiles needed checking and elimination. All
of them including Gafoor voluntarily gave their buccal swabs.
I have absolutely no problem with the police work that happened in this case.
The trawling as you put it helped the police to generate a line of enquiry and
eliminate many other lines of enquiry. I have no problem with it at all in this
case. The problem in your world is that without this innovative police work
Gafoor would never have emerged as a suspect. I really do not understand what
your problem is. The boy was never a suspect. There were other partial hits. So
what? I repeat the partial hit to the boy was only used to narrow down the
potential search. It developed a line of enquiry that resulted in the
conviction of a particularly brutal murderer. The evidence in this case is
Gafoor's profile, which he gave voluntarily. He has admitted his guilt to the
ambulance staff; he volunteered it; it was not solicited. He admitted it to
police in formal interviews; he admitted it to his lawyers and
instructed them that he wanted to plead guilty and he admitted it in court. In
short from the moment he realised that police had caught up with him he
admitted his crime. He attempted to fabricate an explanation for the finding of
his DNA at the crime scene. He knew his DNA was there. He tried to claim that
he had sex with Lynette a week before her murder and asked if it was semen. The
question wasn't answered, but it alerted police that they had almost certainly
found their man as the DNA obtained from the flat had come from extensive
bloodstaining. His DNA was found on several samples, both ones discovered
during the original enquiry and in the recent re-investigation. His blood lay
beside the victim's on several samples. There is no way that this could be
explained innocently. Once Gafoor was confronted with this evidence he admitted
his guilt. Please note that his admission to the ambulance staff came after a
serious suicide attempt. He told them: "Just for the record I did kill Lynette
White. I sincerely hope to die." This was evidence and compelling evidence at
Now to return to your question, what the police did in relation to the National
DNA Database was nothing short of excellent police work. It narrowed down
possible lines of enquiry. You seem to completely miss the fact that its sole
purpose was to narrow down lines of enquiry; it was not and was never intended
to be evidence in its own right. The evidence was Gafoor's own profile. Please
get your facts straight, Gafoor's profile was not on the National DNA Database
at that time, nor would it have been. The boy's was and was considered similar
enough to the crime scene profiles to suggest that male members of the boy's
family needed to be conclusively linked to those profiles or eliminated. Had
the police tested all the relatives of the other partial hits they would have
been eliminated as the only 20 allele match to crime scene profiles was that of
To sum up, I not only have no problem with the police's innovative search of
the National DNA Database for partial alleles, I tip my hat to them for doing
so. They did an excellent job and I hope that other forces follow suit. I
notice you miss the wider picture here. Innocent people have been cleared by
DNA testing in a number of cases. They can instruct their lawyers to
investigate such possibilities in their own cases. After all, what better way
to prove your innocence than to be able to say, Mr or Miss X is the real
SGM+ tests ARE corroborated. Samples are tested and a corroboration test is
also conducted. Results are reported if both results corroborate each other.
Please get your facts straight. Mitochondrial DNA is useful in cases of hair
shafts or bone; it is ludicrous to attempt on blood as it uses more DNA for
less discrimination. I take it you are aware that conventional blood grouping
corroborated the results. In the original case there was good blood grouping
evidence and very poor quality DNA results. As DNA testing and
amplification techniques improved the quality of the DNA evidence improved to
the point that it became overwhelming. The innovative use of the Database
enabled this case to be solved as it developed THE line of enquiry that
unmasked the murderer. It is a technique that can be used in other cases, both
to help convict the guilty and potentially to clear the innocent. It is NOT
evidence, it is an INVESTIGATIVE tool, and a damn good one too. Have you ever
thought that the reason the crime scene profiles did not match any profile on
the database was because the donor of the crime scene samples, Jeffrey Gafoor,
was a person whose DNA profile was not on the Database, so there never would be
a full match on the Database? If you want independent corroboration of DNA,
what do you define as corroboration? And why are you complaining about this
case. There was corroboration of the DNA - compelling corroboration of
Once again I have no problem with what the police did in the second
investigation of this crime. It is not the end justifying the means; it is pure
and simple excellent police work. Please understand that there is a difference
between the investigative process (the search of the Database) and the evidence
presented in court (Gafoor's profile, admissions and guilty plea). The DNA
evidence in this case was corroborated. Call the next case!
Sorry I should have added that your point regarding the ensnaring of 3m
based on 10 alleles is fundamentally wrong. The evidence of inclusion that
resulted in conviction was a 20 allele profile. Nobody could be ensnared
without a 20 allele match, so unless you are claiming that 3 million people
coincidentally share Gafoor's profile 3m people in the UK most certainly are
I am not sure of the generic term for what was done, but call it subset or
allele trawling if you like. I prefer to call it innovative investigation of
particular alleles. While there were 12 allele partial hits the relevance of
the boy's profile was the particular ones that the partial hits were
obtained at and the similarity to crime scene samples - similar enough to
arouse interest, but nowhere near similar enough to suggest that the boy had
a case to answer. In fact the boy's alibi is compelling as he hadn't even
been born when Lynette was murdered! But as I have said before the boy's
profile merely alerted police to the possibility that a male relative was
likely to be the killer. They still had much investigative work to do.
Gafoor established himself as the prime suspect as a result of his blatant
attempt to fabricate an explanation for the discovery of his DNA at the
crime scene. That was prior to him giving a voluntary buccal swab. As a
result of that lie he was put under surveillance and set about obtaining
paracetamol. He took a massive overdose, hoping to die. The surveillance
resulted in his life being saved when police knocked his door down once the
DNA profiles obtained from his swabs were shown to match the DNA profiles
obtained from the crime scene. And I repeat NOBODY was ever going to be
arrested and charged over this case without a full 20 allele match - 12
simply would not cut it, so the number of matches at 12 alleles is
irrelevant, except in eliminating those not of interest and narrowing down
those who were of interest. I still say, this was excellent police work. I
say that as someone who was scathing of the failures in the original
investigation. The reinvestigation was a different story - an example of
exemplary police work.
One fly in the ointment for the Gafoor technique/
sub-set trawling is the
number of children not genetically fathered by
the person accredited with the fathering.
See the study by Elliot Elias Philipp, which found that in a
minimum of 30% of cases the accredited father could not have been the father.
It came from a random population being researched for medical/
genetic reasons and, as a side matter, threw up this
astonishing statistic. No reason to assume it is not generally
applicable. This study was from the 1950s and we've had the 60s etc. since.
Source: Law and Ethics of A.I.D and Embryo Transfer,
Ciba Foundation Symposium 1973 ,pub. Elsevier-Excerpta Medica,N. Holland.
I am not opposed to corroboration of DNA evidence, but remember there was
corroboration here and corroboration is only required in the courts, not
during the investigative process. The DNA evidence in Gafoor's case
established that he had a case to answer. The evidence as a whole proved his
guilt, none of which would have been obtained without the DNA evidence
establishing that Gafoor had some questions to answer.-----
Dear Mr. Lind -
Since I said, "DNA is NOT so widely accepted any longer without
corroboration", and then you said, "It IS accepted without corroboration", it is
obvious that it is a futile effort to attempt to share anything with you;
you are too closed-minded.
Obviously you did not read the Supreme Court decision from the UK which was
included in the post. Sorry but the world consists of more than just the U S
Two problems with your post as I see. First of all there is a
national database in the US that combines almost all of the state
databases (I think one or two states haven't joined yet). It currently
has well over a million profiles in it.
Secondly, I have some concerns about your statements from your
Mississippi source. I highly doubt that the person was working in a
forensic laboratory, but probably in a biotech lab. Even if he/she
did work in a crime lab, I doubt the tinkering would cause any
significant problems, since the thermal cycler would be calibrated
I've not really studied the American situation so I stand corrected
The golfing profiler could
well be in an academic environment; I only researched as far as a possible
Mississippi context and no further as to his identity.
Returning to the main thread this is a secondary reference to the
2000 UK appeal court decision concerning admissibility of DNA
profiles without independent corroboration that someone in the thread
I'm trying to find a more robust primary reference.
Note the following passage from that decision
The other evidence results from more stringent tests that have been
done on the DNA material that was available in this case. That is
partly as a result of a case in which a 6 point match was found to
produce two possible suspects, one of whom had been charged despite
living at the other end of the country and had to be acquitted when
it was appreciated that the DNA matched a second person.
This is another documented example of 'unresolved duplicate pairs',
resolved in this particular case and not due to the use of aliases. It harks
back to the 6 loci database situation when there were perhaps fewer
than 100,000 profiles. Balancing the increase from 6 to 10 loci is the increase of
the database up to 2 million profiles, so there are probably as many unrelated
matches occurring each year now as in the 6 loci days.
By the way I don't like the term 'duplicate pairs' as it might
Firstly I would prefer the 'sub-set trawling' was named after the
person who first promulgated the concept with the name of the first
investigator. Secondly I am amazed at the amount of man hours that
must have been spent to track down Gafoor when no one will put in a
few man hours to do a procedurally very easy 2 database cross-
correlation to resolve these unresolved matches in the NDNAD.
Perhaps if I summarise how I perceive 'sub-set trawling' works, just
in case I've got it totally wrong.
It starts with a 10 loci 20 allele scene of crime profile that has no
match on the NDNAD.
This consists of, say, A1,A2; B1,B2; ... J1,J2; X,Y and then 1024
permutations taking 10 at a time.
Now an interrogation of the NDNAD for 'partial matches' probably
returns and I'm totally guessing here but something like a few 16,17
or 18 allele possibles,say 4 or 5 for 15 ,10 or 12 for 14,50 or 60
for 13 ,a hundred or so for 12 and thousands for 11 and tens of
thousands for 10.
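
The partial-match interrogation summarised above can be sketched roughly as follows. This is only an illustration under stated assumptions: the profile layout (ten loci, two alleles each), the function names and any allele values fed to it are hypothetical, and nothing here is taken from how the NDNAD software actually works. It counts shared alleles locus by locus, tallies database entries by that score, and lists the ways of picking one allele from each of ten loci (2^10 = 1024, which may be where the figure of 1024 comes from).

```python
from collections import Counter
from itertools import product

# Hypothetical representation: a profile is a tuple of 10 loci,
# each locus being a pair of allele labels, e.g. ((14, 16), (9, 9), ...).


def shared_alleles(scene, candidate):
    """Count alleles in common, locus by locus (at most 2 per locus)."""
    total = 0
    for (s1, s2), (c1, c2) in zip(scene, candidate):
        remaining = [c1, c2]
        for allele in (s1, s2):
            if allele in remaining:
                remaining.remove(allele)  # a candidate allele can match only once
                total += 1
    return total


def partial_match_tally(scene, database):
    """Histogram mapping shared-allele score -> number of database profiles."""
    return Counter(shared_alleles(scene, profile) for profile in database)


def one_allele_per_locus(scene):
    """Every way of choosing one allele from each locus of the scene profile."""
    return list(product(*scene))
```

Sorting the tally from the highest score downwards would give the order in which the 'hits' in the summary above are worked through.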
Now family tree investigation (easy but time-consuming ,GRO records
trawl ,for other than own family - I know I've done it myself)
starting with the 15 or more hits going up the tree and across to
Then going to each family and requesting samples and family tree
info. If their volunteered family history does not match the
externally derived tree then immediate suspicion . If the volunteered
samples diverge from the SoCo profile then go on to the next sub-
set 'hit'. Until eventually the nephew comes under investigation and
sampling up and across the tree becomes convergent on Gafoor in this
case. 10 loci/20 allele match so start criminal proceedings but he is
still only a suspect at that stage.
My problem with such trawling goes back to when I received my own
profile under Data Protection Act - Subject Access. I am no
criminal,no criminal record, but my profile is on the NDNAD just like
crime-scene examiners ,many police etc. Although of a scientific
background the returned form showing words like
locus,Amelogenin,D21S11 etc meant absolutely nothing to me at that
time. But at the top of the table were apparently two columns
labelled X and Y. I misinterpreted this as meaning the numbers under
the X came from the X chromosome and those under the Y from the Y
chromosome. The sort of tables I normally come across that is how
they are structured. Totally wrong,bad layout ,I now know, but at the
time I was gob-smacked in that should a relative of mine be a villain
(known or not known) then I could be in the situation in effect
of 'grassing-him up'. Now I find with 'sub-set trawling' I was not
so far off the mark. The only difference being the computer crunching
and investigation time required to do it .
But in 10 years' time, with powerful computers and surname/DNA profile
data (by that time demanded from the genealogical community) cross-correlated
to the NDNAD etc. - who knows. See what's going on in
Iceland, supposedly for genetic/medical reasons.
Or put it another way how does the Gafoor nephew feel about all this
and how does the rest of his family behave to the nephew these days?
Whatever crime one does then one is punished and that should be the
end of it not this atrocious Damoclean Sword hanging over you for the
rest of your life and death, the lives of your parents,the lives of
your children,children's children ad infinitum . Previously a
criminal record related to yourself and you alone other than maybe
some stigma wrt neighbours or family say ,nothing further could be
inferred from name ,DOB,mugshot,IC1-5,facial features and friction
ridge data - totally different now.
Call it sub-set trawling if you want. It makes no difference. Why are you
amazed that the police did such a good job involving so many hours? They were
determined to solve this case. When you consider what happened before - an
easily preventable glaring miscarriage of justice - it is not at all
surprising. They want to restore public confidence in them. Very few people
believed that the real murderer could be caught. At times I was alone in
believing it could and should be done. They saw that this case was solvable and
they knew that they would be under great scrutiny this time round. Not only did
it have to be done right it had to be seen to be done correctly.
Many victims of miscarriages of justice want the real perpetrators to be caught
and punished. The Gafoor case has already caused many victims of miscarriages
of justice to give the police a chance to follow the example set in the Gafoor
case. This is precisely what the police hoped for. It explains the man hours
invested in this case - resources which were a fraction of those wasted in the
If cross checking the National DNA Database for unresolved matches means so
much to you, I suggest that you take it up with the Forensic Science Service
who are the custodian of the database and raise it with the Bar Council,
Liberty, etc. to take it up with the government. By the way what evidence do
you have that there are unresolved matches on the database and what do you mean
by that term?
I know that there was a case of a match being called on the six loci test which
was later proved wrong, but you really ought to credit the fact that it was the
ten loci test that exposed this error. To the best of my knowledge there is no
similar case involving SGM+.
You seem to want it both ways - the only way to categorically prove that DNA is
unique or disprove it would be to have the DNA profiles of every single human
who ever lived on a Database, which would then be searched for matches. If one
is found DNA would be proven not to be unique. If not it would be. Now to the
difficult part; to achieve that you would require a complete DNA Database, not
just nationally but globally. Would you support that? If not how do you propose
to test your hypothesis that Adam, Wally and others are wrong to claim that
CODIS offers unique identification of individuals? You can't just assert that
there are unresolved matches on the database as a fact without proof.
In the Gafoor case full 20 allele profiles were obtained from the crime scene.
These were obtained from several samples. There was really no doubt that these
contained the DNA of the murderer of Lynette White and that he was her real
murderer. I cannot conceive of an innocent explanation of the position and
amount of his DNA discovered at the crime scene. That was in January 2002.
Police then checked the National DNA Database. There were no direct hits.
It was therefore clear that the murderer's DNA profile was not on the database
at that time. At my request, the police had all 140 DNA databases throughout
the world checked through Interpol. That shows how determined they were to
solve this and that they were open to suggestion if they thought it could help.
This was a marked change from the attitude of previous investigations.
Unfortunately there were no direct hits. It was pretty obvious that DNA offered
the only realistic possibility of unmasking the real killer. After all the only
other possibilities were the killer, overcome by remorse confessing, or a
witness belatedly coming forward. Both were unlikely after 14 years. That left
There were only two options over the DNA: 1) wait for the killer to do
something that allowed his DNA to be stored on the database, or 2) take the innovative
approach of analysing components of the result. The second approach is what
South Wales Police did. Considering what we now know of Gafoor's character this
was the right approach to take.
It all started with DC Paul Williams noticing that one particular allele
position was only occurring once in every hundred profiles checked. That alone
narrowed down the search by 99%. He then expanded the search to eight alleles.
This narrowed down the remaining 1% further. After that the search was expanded
to 12 alleles. It is fair to say that there was more than one 12 allele partial hit
on the database. Williams then narrowed the search to the South Wales area,
believing correctly as it turned out that the murderer was a South Wales
native. After both were done (the 12 allele search, confined to the South Wales area), the
profile of the boy stood out.
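
The staged narrowing described in this account (one rare allele first, then a minimum number of shared alleles, then a region) might be sketched as below. Everything in it is an assumption made for illustration: the `Record` layout, the field names and the single `min_shared` threshold standing in for the 8 and 12 allele steps are hypothetical, and treating a profile as a set of (locus, allele) pairs ignores homozygous loci. It is not a description of the search South Wales Police actually ran.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    ref: str            # hypothetical record identifier
    region: str         # hypothetical region label
    alleles: frozenset  # (locus, allele) pairs; a homozygous locus collapses to one pair


def narrow(records, scene_alleles, rare_allele, min_shared, region):
    """Apply the three filters in turn; return the survivors of each stage."""
    # Stage 1: keep only profiles carrying the one rare allele.
    stage1 = [r for r in records if rare_allele in r.alleles]
    # Stage 2: keep those sharing at least min_shared alleles with the scene profile.
    stage2 = [r for r in stage1 if len(r.alleles & scene_alleles) >= min_shared]
    # Stage 3: confine what is left to one region.
    stage3 = [r for r in stage2 if r.region == region]
    return stage1, stage2, stage3
```

Calling it with `min_shared=8` and then again with `min_shared=12` on the survivors would mirror the two allele-count steps in the account.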
It was of course possible that this boy was not related to the murderer. At
this point Williams' work had generated a very interesting line of enquiry that
had to be checked. If after this was done, no 20 allele match was forthcoming
there would have been no arrest and other partial matches would have been
The investigation of the family tree as you put it was necessary. Family
members could have refused to give samples if they wished. Would they have been
suspected as a result - possibly, probably even, but without evidence they
could not have been compelled to provide samples. Gafoor could have prevented
his conviction by refusing to co-operate. Had he done so he would have been
suspected, but there would have been no evidence to arrest him. It was the DNA
evidence that led to the other evidence. The work of Williams was part of the
investigative process. But for Williams and the DNA evidence a particularly
brutal murderer would have been untouchable. You seem to object to the police
doing this. I still can't understand why. There is no breach of civil rights
here. The police did not frame Gafoor. They did not plant evidence. They merely
identified him as the donor of crime scene DNA and nobody but him was the
donor, a fact he concedes.
I'm not sure what you mean by family tree investigation. There was no such
thing. Police approached all male relatives of the boy and asked for buccal
swabs. All such requests were successful. The point of the tests of family
members was to exclude the innocent and to identify the donor, after which the
family history, background, etc became important. The DNA testing was the
essential first step in the evidence-gathering process.
As far as I know Gafoor's family are fine about it. They had been estranged for
The co-operation police obtained from the boy's family suggests they had no
objection, and probably could not believe that one of their own could have been
involved. They like many others in this inquiry co-operated because they had
nothing to hide. Gafoor himself may have co-operated for the reason you
suggest. He alone could not afford the suspicion or arrest then as he planned
his final exit and needed to buy time for it, but his answers aroused suspicion
and he was put under surveillance.
I still don't get your point. May I suggest that you should have more concern
for the feelings of the Cardiff Five, Lynette's family and society than Gafoor.
There was a very simple way for Gafoor to have remained at liberty. All he had
to do was not kill Lynette in the first place.
The use of DNA Databasing was not the sword of Damocles hanging over Gafoor's
head; his DNA which proved his guilt was. In the case of Gafoor and his nephew,
had either of them not committed crimes, none of this would have happened.
Speaking for myself, if my DNA could have helped to solve this crime I would
have no objections to giving it. I believe there is a need for debate over the
precise safeguards needed for DNA Databasing, but the work of DC Williams was
and remains outstanding and has no part in my concerns. I have no problem with
it. Society is better off with Gafoor in prison for a very long time. The
atrocious sword of Damocles in this case was not hanging over Gafoor's head; it
was hanging over society's collective head. For all we knew Gafoor could have
been a prolific serial killer. Thankfully he wasn't and it seems unlikely that
he would have killed again. Prior to his capture nobody could have known that.
DC Williams cut the Gordian Knot. More power to him for doing so.
The unresolved matches are inherent to such a database and
solely contained within such a database . NOT the false matches
between a new SoCo profile and a profile already within
the NDNAD or new arrestee and old SoCo profile.
The term 'unresolved match ' first emerged in the
Journal Forensic Science International 95 (1998) p30.
Concerning data in the UK DNA database as of 04 October 1996
when there were only 6311 samples from the London
area and 573 from the Cardiff area.
Direct quote from that article - pre national database.
"A small number of unresolved duplicate pairs of
profiles were present in the regional data :10 pairs
within the London region and 1 pair in Cardiff.
The most common cause of
duplicate entries is the use of aliases by suspects
who have been arrested on several occasions.
For administrative reasons ,it is not always possible
to resolve such duplicates by exhaustive
I suspect it is more than 'administrative reasons' these matches have
not been resolved. As I say it does not require police investigation,
just the cross-correlation between 2 databases. If they want to
investigate the aliases situation then that is another matter I
have no interest in.
Then more recently broadcast on
04 Dec 2002 was a documentary "DNA in the Dock" that
had been made maybe a year or more previously referring to this
same situation but by that time 300 "matches" in the
UK NDNAD ascribed to mistakes ie retested people
giving aliases but again not resolved. Even a sampling of
1 in 10 say of these matches being resolved would indicate
how prevalent the aliasing versus unrelated individuals situation
Copy of this broadcast at Livermoore Library but I cannot
access it or find a transcript of it anywhere else.
A couple of weeks before parliamentary recess I got my MP
to ask a written question of the FSS: the most fundamentally simple
question of how many of these damned matches are in the NDNAD.
So far it has not turned up in the Hansard public access internet
The DNA uniqueness data is already there, buried in the NDNAD; it does
not need any further profile-taking studies. But someone at the top
of the FSS is not disclosing it for political?/scientific? reasons - your guess
is as good as mine. Not even researching one in ten (say) of these unresolved
Thanks for more detail on the zeroing in on Gafoor,I was taking
a general case. I did not realise there
was a head start in one low allele frequency, sub 1%. Every allele in my
own personal profile is in excess of 9% frequency of occurrence.
On the family tree side of things for investigation expediency you
balance up leads from the family very much speeds up the GRO research
but doing it remotely ,taking much longer, would be more pure but
for each particular family for more than 2 or 3 false hits would
have stymied the investigation - it is very time consuming as compared to
verifying /extending leads direct from the families.
The Damoclean sword hangs over Gafoor yes but also everyone on the
(Peter Hamkin fashion) courtesy of s82 of the 2001 Criminal Justice &
including ,no doubt,(not that they realise it) the families in the
false hits in the process
of zeroing in on Gafoor.
In response to some commentaries -
Yes, we (USA) have a national DNA database, CODIS. All states except
Mississippi and Rhode Island are participating. However, thinking that
samples is a lot, look at it from another perspective.
Right here in Los Angeles County there are over 1.5 million warrants in
County system. CODIS is in its infancy and I believe headed for a lot of
jerky crawling and toddling before the smooth walk.
While DNA databasing is undoubtedly in its infancy and over one million
people being on the CODIS database may seem a small sample size in the
context of the population of the USA, it is bigger than it may seem.
The Gafoor case in Britain has shown that criminals can be identified from
family members' DNA. There is nothing to stop police from investigating
particular allele positions and searching for partial hits at fewer allele
positions than the complete set recorded on CODIS. It can therefore be
argued that the one million people on the database can be seen as
one million families on the database. Of course for prosecutions the DNA of
family members not on the database will be required.
In Britain co-operation (meaning the right to obtain samples from those
arrested) can be compelled. Samples can be taken by force if required, but
this cannot be done in the absence of evidence justifying arrest. Unless
there is other evidence justifying arrest voluntary co-operation would be
In 2001 following a notorious case (Michael Weir) the law was changed to
allow samples and results to be retained and stored for National DNA
Databasing purposes after a person has been arrested and charged even if the
person arrested was subsequently acquitted, or the charges were dropped.
This applies to fingerprinting databasing as well.
Last year the legality of this was challenged by judicial review in the case
of Marper & S v Chief Constable of South Yorkshire Police both in the High
Court and on an appeal. It had been argued inter alia that this practice
breached the terms of the European Convention on Human Rights, and the Human
Rights Act regarding privacy rights. Both courts found no breach. In fact,
there are exceptions to privacy rights within the Convention itself
including the detection and prevention of crime. Consequently, the law as
changed in 2001 does not breach the terms of the Convention. I also think it
unlikely that a challenge will succeed in the European Court of Human Rights
on these grounds due to the exemption. Satish
One might add that DNA typing methods are based on a
pool that is not only enormous in quantity, but also a
very broad cross-section of the population. I know of
no other biological studies based on such an
impressive pool of data. Even the largest medical
studies done (i.e.: the Framingham heart study) can't
hold a candle to the number of samples in the DNA
profile database compiled by the U.S. military.
Yes, we all know that just because the probability of
something happening is greater than zero (no matter
how minuscule) doesn't mean that it's never going to
happen. However, there is also a very small
probability that all the atoms in the world will all
somehow occupy the same place at exactly the same
moment, thus turning us into a black hole and
rendering this whole exchange pointless, but I'm not
going to lose any sleep over it.
Anyone can publish a paper "proving" otherwise, but
unless others can reproduce your results, a published
paper in and of itself means little. (Remember "Cold
Fusion"?) - Laura
I would like to thank all of those that spent time
working on trying
to apply equations for the birthday problem to DNA.
of your work is wasted. There are two critical
flaws in doing this.
1) The birthday problem by its nature involves a
finite set of
possibilities, and therefore a known probability
are only 365 days in the year, each of them assumed
to be as likely as
the next to be someone's birthday. So for any group
of people, you can
calculate the number of people with the same
birthday. With DNA,
there is no set number of combinations that can be
could take the set of alleles in the allelic ladder
and disregard the
fact that dozens of other alleles are possible at
each locus. Limited
to only these, one could come up with 400 billion
combinations (although this would be underestimating
by several orders
2) You have made your calculation easier by
estimating the "average"
population frequency of 1:1 billion. This is a
rather nice figure,
but it doesn't seem to take into account the
fact that all of the
DNA profiles have different frequencies. Some may
be 1:1 billion, but
others may be 1:1 sextillion. Perhaps you should
come up with a
weighted average of the billions of billions of
and their respective population frequencies.
If the calculations were correct, there would be
over 1000 pairs in
America's DNA database of over 1.5 million.
Interesting how this is
not the case. Interesting how there are not even a
hundred. Or ten.
Or even one.
Sleight of hand can make even the impossible seem
Let's not let facts get in the way of some really
smokescreening. Please, carry on with your ministry
misinformation, there are plenty of impressionable
people who also are
not interested in the truth.
"If the calculations were correct, there would be over 1000 pairs in
America's DNA database of over 1.5 million. Interesting how this is
not the case. Interesting how there are not even a hundred. Or ten.
Or even one. "
Of course this is more than blind faith coming out here. You
have the evidence to back your assertion down to the last '1'.
Where / what is your evidence for this assertion?
[ NOTE: HE NEVER DID COME BACK WITH ANY EVIDENCE ]
If true you have some remarkable technicians in charge of
the USA DNA profile database.
From the record of UK parliament (the section highlighted in red)
a third down this official source
Mr. Bob Ainsworth: The total number of profiles held on the database
at 25 March 2003 was 2,094,858. The Forensic Science Service
calculate that these profiles relate to an estimated 1,886,000
Number of profiles I'm quite confident to the last '1' as being
2,094,858 on that date. Beyond that here is the
'sleight of hand' they have to 'calculate' to
give an 'estimated' figure of 1,886,000 for 'different
To my way of viewing errors some figure between 1,885,500 and
or even range 1,884,000 and 1,888,000.
Genuine repeats of reprocessed same individual ,clerical errors
all along the way from police station to forensic science data entry,
people using aliases and (to me) the all important pairs (or more) of
different individuals with the same DNA profile. All cast aside
with 'calculate' and 'estimated'.
Split into 6 loci set and 10 loci set
How difficult is it to check 3 fields in a database for repeat
ie match on 10 (6) pairs of numbers + amelogenin,DOB,and name.
That should be a figure accurate down to the last '1'.
Then check for repeats across 6 loci and 10 loci set with same
After that how many pairs of matching profiles in each set again
a precise figure.
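
The checks listed above could look something like the following sketch. The record layout (a hashable profile, a name and a date of birth) and the function name are assumptions made for illustration; the real NDNAD fields are not known here. It gives an exact count of straightforward repeats and an exact list of profiles held under more than one identity, but it cannot say whether such a profile is an alias, a clerical error or two unrelated people.

```python
from collections import Counter, defaultdict


def duplicate_report(records):
    """records: iterable of (profile, name, dob) tuples with a hashable profile.

    Returns (exact_repeats, unresolved), where unresolved maps each profile
    held under more than one (name, dob) identity to the set of identities.
    """
    # Step 1: identical rows, i.e. same profile, same name, same date of birth.
    row_counts = Counter(records)
    exact_repeats = sum(n - 1 for n in row_counts.values())

    # Step 2: the same profile recorded under different identities.
    identities = defaultdict(set)
    for profile, name, dob in row_counts:
        identities[profile].add((name, dob))
    unresolved = {p: ids for p, ids in identities.items() if len(ids) > 1}

    return exact_repeats, unresolved
```

Both outputs are precise counts; only the classification of the `unresolved` entries needs further enquiry.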
After that it becomes more murky - resolving what constitutes the
difference in figures,ie clerical errors,aliases and unrelated
Please tell me how they have analysed in the USA down
to the last '1'.
This quote contained in a judgement by justices KAY LJ, SILBER J and
in matter of
R. v. Watters
COURT OF APPEAL (CRIMINAL DIVISION)
October 19, 2000
6 point match was found to produce two possible suspects, one of whom
had been charged despite living at the other end of the country and
had to be acquitted when it was appreciated that the DNA matched a
Source 2/3 way down the full judgement on
This puts on record that unrelated matched pairs of DNA
profiles (not alias cases) do occur within the UK NDNAD.
Less common now with 10 loci profiles than 6 but it does
not change the principle. This event only leaked out because a
scene of crime profile matched two in the NDNAD.
We can blather on about hypotheticals of 1: 1 billion, 1: 1 squillion
and it is all meaningless. The real answer lies within
the real data derived from real people but buried, untapped,
within these databases.
The best point I've heard in this discussion yet. The arguments against DNA
are equivalent to Barry Scheck's (sp) pounding of Dennis Fung in the OJ case.
"Isn't it possible, Mr. Fung?" Anything is possible but it isn't very likely
at all. In the criminal justice system we have to deal with things we can
actually grasp and understand. DNA is well founded in science, reliable, and
believed by judges, juries, and attorneys.
Minor correction: the US military DNA Repository has samples of most
everybody on active duty since about 1994 and having a sample confirmed as
in the repository is a predeployment requirement (at least in the Army.)
But these samples have NOT been typed. Typing only occurs when a set of
human remains is compared to an individual in the repository. Of course,
for a mass fatality incident with a known population (like an air crash) all
the samples believed to be in the group would be pulled and typed and then
compared with typing from remains as they came in.
I think the ball is in your court on this one. Knowing that there
are a set number of combinations makes it easier to calculate an
*exact* result, but there is no need to rely on that to calculate a
We see quoted accuracy of an unrelated DNA match as being x
million :1, or 1-100 billion: 1 for the UK NDNAD test, say. Knowing
that alone, it must presumably also be true for two individuals
selected at random from a large population. Given that, it becomes
clear that it is not surprising and indeed inevitable that a large
population must contain individuals with non-unique profiles, per
previous calculations, without needing to rely on knowing exactly
how many combinations might exist.
You have to start somewhere. AIUI, that kind of assumption of
independence is also made in the calculation of the x million:1 odds
we see quoted. In some sense there will be an 'average' figure
(perhaps not an *arithmetic* mean, maybe more like a *geometric*
mean), and if one individual has a much lower chance of having a
duplicate due to their particular makeup, then by the same token
there will be others with a much higher chance.
Reversing your logic, one way to discover experimentally the
weighted average you suggest would be to take a large database,
count how many duplicate pairs we find, and feed it back through the
probability calculation the other way.
If there are actually no duplicates, then by calculating that
average probability if there was 1 pair in the population, we can
determine that the 'average' random match chance should be less than
that result, so we still learn something.
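
Running the calculation "the other way", as suggested here, might be sketched like this. The function names are hypothetical, and the sketch assumes every pair of profiles is equally and independently likely to match, which is the same simplification the 'flat' calculation makes. It divides the number of unrelated matching pairs found by the number of possible pairs.

```python
from math import comb


def implied_match_probability(n_profiles, observed_pairs):
    """Average pairwise match probability implied by a count of unrelated pairs."""
    return observed_pairs / comb(n_profiles, 2)


def bound_if_none_found(n_profiles):
    """Probability that a single pair would imply; finding no pairs at all
    suggests the average is below this value."""
    return implied_match_probability(n_profiles, 1)
```

For a database of 1.5 million profiles, `bound_if_none_found(1_500_000)` comes out a little under 1 in a trillion, so a genuine count of zero pairs would itself be informative.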
Also, if there were actually no duplicates, that would suggest to me
that there can't be more than a certain variation in the random
match chance, as if some people were much more likely to be in an
unrelated pair you might well see them popping up. You could also
investigate that variation, if present, by seeing perhaps whether
any such pairs fell wholly within a subset of the whole population
(e.g. by race, or anything else) where the random match chance
differed from the average.
Question: Has anyone seriously looked for pairs in it?
Or has it been assumed that there can't be (it's DNA, after all!),
so there's no need to look? That seems circular logic to me, to say
that because there can't be any duplicates, any that are found must
In between that 2,094,858 total and the 1,886,000 'estimated'
individuals, there's plenty of room for there to be between 40-4000
non-unique individuals (20-2000 unrelated matching pairs) - the
number you'd expect if the 'average' random match is as quoted 1-100
billion:1 for the test used in that case.
Small enough not to be seen if you're not looking for it, yet easily
large enough for there to be many more such matches in the general
population. Remember, the number of unrelated pairs rises in
proportion to the square of the population (until it's a large
fraction of the population, anyway), so double the database and
you'll quadruple the number of unrelated pairs.
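
The arithmetic behind these figures can be checked with a few lines. This is a sketch of the 'flat' calculation only: it assumes one average match probability applied independently to every pair of profiles, and the function name is made up for the example. The expected number of unrelated matching pairs is the number of possible pairs multiplied by that probability.

```python
from math import comb


def expected_pairs(n_profiles, match_probability):
    """Expected unrelated matching pairs under a flat, independent match probability."""
    return comb(n_profiles, 2) * match_probability


if __name__ == "__main__":
    # 1,886,000 is the 'estimated' number of individuals quoted from Hansard.
    for odds in (1e9, 1e10, 1e11):
        print(f"1 in {odds:.0e}: {expected_pairs(1_886_000, 1 / odds):.1f} pairs")
    # Doubling the database roughly quadruples the expected pairs.
    print(expected_pairs(2_000_000, 1e-9) / expected_pairs(1_000_000, 1e-9))
```

With odds of 1 in a billion this gives roughly 1,800 pairs for 1,886,000 individuals and about 1,100 for a database of 1.5 million; at 1 in 100 billion the first figure drops to roughly 18.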
Whatever the actual match probability, I think it's for you to
explain why the pairs would not rise in proportion to the square of
the population, as the probability calculation shows they should, at
least until they are a large fraction of the database.
The results may seem counter-intuitive, just as it does with the
birthday problem, but are no less valid because they are not what
you expected. Even with a very low random match chance, you will
start seeing duplicates with a surprisingly small population.
Well, I'm interested in truth, and there seems to be a remarkable
willingness to make pre-judgement with this and assume the result,
essentially asserting that unrelated pairs are impossible because
they're impossible. IIRC, there was a good deal of pain before it
was admitted that the 6-loci match wasn't good enough and the
accuracy of the test was improved to construct the UK NDNAD.
I make no prejudgement, whatever the outcome. Either way, to perform
an investigation or experiment that could conceivably falsify a
currently accepted tenet seems like good science to me. It may be
that the results are consistent with the theoretically calculated
random match chance; or it may be that they indicate that it's too
low or too high.
Whatever the case may be, the numbers for the 'flat' probability
calculation raise enough issue in my mind to want to know how many
duplicates there actually are in a large database. Are there
actually zero? Tens? Hundreds? Thousands? How do you know unless you
accept there might be some number of unrelated pairs and go and look
for them? - jbaron
Still the size of the typed pool around the world must be large compared to
most medical studies. I wonder if anyone is pulling this worldwide data
together to evaluate say RFLP or STR profiling?
At the risk of being attacked personally again, I think that the PI lady and I
have been talking past each other. She has been saying, as I understood her,
that DNA profiles are not statistically unique. What I'm saying is that the
DNA of a particular non-twin person is unique, due to the fact that there are
about six billion people on earth and trillions of possible DNA combinations.
120,000 years ago there couldn't have been more than a few thousand modern
humans. Even given the rise in human population to over 6 billion, there
haven't been enough human genomes put together to make a duplication more than
a very remote possibility. My contention is that DNA is reliable, as it is
presently done. But if there is a statistical artifact that casts any doubt on
it, the number of probes or loci studied should be increased to eliminate that
I also think that the people on the defense side of the criminal justice system
just hate the thought that science has come up with identification procedures
they can't talk their way out of. I think that that is why they so vigorously
attack DNA and fingerprint evidence. - Wally
It may surprise you, but I agree with much of this. I do not know what the
reported statistics for the possibility of random matches are in the
USA, but in the UK the SGM+ system (10 loci plus amelogenin) is 1:1 billion.
I do not know whether that is an accurate representation of the statistical
reliability or not, but it is the figure routinely reported in the UK. Given
the figure reported by Wally Lind I am interested in how there could be such
a discrepancy. Do the extra genes used in CODIS offer that significant a
degree of discrimination? Alternatively either the figures reported in the
UK must be wrong, or those in the USA are. Could anybody explain how these
systems could produce such diverging random match statistics?
Regarding the use of extra loci, I wholeheartedly agree. Given that there
are suitable loci already identified I don't see why one or two extra loci
are not added to the SGM+ system as by doing so the statistical grey area
that exists in the UK would be removed.
I do however, disagree that defence lawyers hate DNA. Where would the
Innocence Project be without it? Over 100 innocent people have good cause to
thank their lucky stars for DNA in the USA at least. So do their lawyers.
DNA not only helps to convict the guilty, it helps to clear the innocent.
The Cardiff Three have good reason to be very grateful for DNA, while only
Gafoor had cause to fear it. DNA is an invaluable tool for law enforcement
AND for the defence of the innocent.
Here's an interesting way of looking at this debate--
Remember PGM (phosphoglucomutase)? Remember how this
test was used in combination with ABO? Yes, folks,
they were used to eliminate suspects. If the results
of these tests "matched" your subject's, you have
failed to eliminate the subject. Depending on which
ABO type and which PGM type you get, you could say
that you could exclude X% of the population but could
not exclude the subject. This test was (and maybe
still is) being used as one piece of evidence.
DNA typing can be seen the same way. Instead of 80%
or 90% of the population being eliminated, we can
eliminate 99.9999999999999% of the population as the
source. Once again--one piece of evidence (a really
good piece of evidence, however).
The prosecution still has to demonstrate the
significance of this one piece of evidence by
presenting other evidence and putting it together to
convince a jury beyond a reasonable doubt that the
accused is guilty.
I think there is a lot of confusion here over the
difference between "reasonable" and "shadow of a"
doubt. We are not trying to do a mathematical proof
here, rather we are determining the odds of a
proposition being true.
A person could have more than one profile placed in the CODIS databases. For
instance, if a serial rapist moved from state to state (each participating
state is the repository for its own database) each state would complete a
profile if DNA was recovered, so they could have more than one profile in
CODIS. We had a robber/rapist who had left semen at one other crime scene in a
different Minnesota community. When he was caught and convicted for a third
rape (no DNA in that case), his known profile was done and placed in our
database per state law. The computer "hit" on the two unknown cases. This meant
that he had three copies of his profile in the Minnesota database. I think.
There is no actual national database in the US. There are connections between
state and federal databases through the FBI. As I understand it, when a profile
is run in CODIS it is actually run in each participating state, since only FBI
case profiles are kept at the FBI. The question would be: when a profile is
placed in one state's database, is it automatically checked in all the CODIS
I would like to correct myself on two issues after speaking to a
mathematician on the issue of the "birthday problem".
He indicated that although there are differences, the expansion of the
birthday problem to the issue of DNA would not be a big problem. He
indicated that the big problem would be calculating the average
profile frequency. He suggested a reasonable estimate.
Some have suggested a "reasonable" estimate of 1:1 billion. I found
this significantly low. As I don't use SGM+, I didn't have a strong
basis to make this claim. I have now found that this figure is
apparently accurate. In America, we use 13 loci (but many overlap).
Since we use more loci, we commonly report statistics in the
trillions, quadrillions and sextillions.
But whatever the number (1 in a million, billion, trillion or
gazillion), it doesn't change the fact that duplicates are going to
happen, but they are not a big deal. The fact that two people in a
country have the same DNA profile doesn't change the facts of a case.
Certainly if the only evidence that exists is the DNA match, there is
no just cause to convict (arrest is another issue). If there is
corroborating evidence, that supports the claim and it is found to be
sufficient, there may be cause for conviction.
The presence of duplicates doesn't invalidate the use of a database,
it is just a factor of the population size.
We should remember that just because something happens, it doesn't
make it common (and the reverse is also true). Just because something
is rare, it doesn't mean it can't happen. - Dutra
Concerning duplicates in the database.
While there are no duplicates in a database of 2 million, just a
single hit, then you can say that maybe there will be no further hits
if extended up to the whole population of say 60m (UK).
Once you have duplicates within 2m then you could say there may well
be another 1 or 2 in the next 2 million added to the database or
maybe 50 or more duplicates in the whole population.
It is to me sheer arrogance to assume, on the first hit within
such a database, that you 'have got your man', just because there is
only one in a 2m database which is itself just a small sample of a 60
Even if two people have the same DNA profile, they don't have the same genome;
distinguishing between the two and the crime scene evidence would only require
further testing. How many actual pairings have been detected in the American
my two cents then back into lurk mode again.....
Don't forget about the NMDP...(National marrow donor program) They
find matches all the time...They use HLA & RFLP, AFLP, and are
getting more and more precise with the Typing data all the time, (the
closer the match....the better)
I once saw a person who had had a sex change operation. This person
later developed leukemia, had to have an unrelated marrow transplant
and the donor turned out to be the sex of the patient after their
operation....to simplify.... The patient was born female, had a sex
change to male and received donor marrow from a male...this person's
karyotype was from then on 46XY. Really causes a bit of a mess for
forensics then doesn't it...
Broadly speaking concur with your revised conclusions.
My congratulations for your willingness to investigate and to report
back on this honestly, as you have done.
That is essentially my understanding.
One difference is that the DNA test is 1: some large number, and
that the group size you're assessing may be much larger than 20
people - but these are differences of scale, not of principle.
The main effect that I see resulting from that is that it's more
meaningful with DNA tests/databases to talk about the expected
number of duplicates in large population than to talk about the
population size where there is a 50% chance for one or more
duplicates - but the math basically works either way.
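Both ways of stating the result can be sketched in a few lines. This is only an illustration using the standard Poisson approximation to the birthday problem; the function names are made up for the example, and 1 billion:1 is simply the quoted figure, not a verified one.

```python
import math

def prob_at_least_one_duplicate(n, match_prob):
    """Poisson approximation to the chance of one or more matching pairs among n profiles."""
    expected = n * (n - 1) / 2 * match_prob
    return 1 - math.exp(-expected)

def size_for_even_chance(match_prob):
    """Approximate group size at which a duplicate becomes more likely than not."""
    return math.sqrt(2 * math.log(2) / match_prob)

# Classic birthday case: 23 people and 365 possible birthdays.
print(round(prob_at_least_one_duplicate(23, 1 / 365), 3))

# The same calculation for a test quoted at 1 billion:1.
print(round(size_for_even_chance(1e-9)))
```

With a match probability of 1 in a billion the even-chance point comes out at a few tens of thousands of profiles, far below the size of a national database, which is why the expected number of duplicates is the more useful quantity at that scale.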
I chose 1 billion:1 as a number to investigate because that's a
figure that is not uncommonly quoted as an estimate for the accuracy
of the current UK 10-loci test. AIUI, this estimate is calculated by
a product rule multiplication of a series of probabilities, although
I'm unsure of the precise details of the derivation. I've also seen
100 billion:1 quoted, which is why I also examined that figure.
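For readers unfamiliar with the product rule, the following is a minimal sketch of the textbook version, assuming Hardy-Weinberg proportions within each locus and independence between loci. Whether the quoted UK figures are derived in exactly this way is not confirmed here, and the allele frequencies are hypothetical values chosen only for illustration.

```python
from math import prod

def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency: p*p for a homozygote, 2*p*q for a heterozygote."""
    return p * p if q is None else 2 * p * q

def profile_frequency(genotype_freqs):
    """Product rule: multiply the per-locus genotype frequencies, assuming independence."""
    return prod(genotype_freqs)

# Hypothetical allele frequencies for three loci (not published data).
loci = [
    genotype_frequency(0.2, 0.1),   # heterozygote
    genotype_frequency(0.1),        # homozygote
    genotype_frequency(0.25, 0.1),  # heterozygote
]
freq = profile_frequency(loci)
print(f"profile frequency {freq:.2e}, i.e. about 1 in {1 / freq:,.0f}")
```

Each extra locus multiplies in another factor well below one, which is why estimates shrink so quickly as loci are added, provided the independence assumption holds.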
It is itself a fact of the case, however, and I'd say it's something
that a jury needs to know - that while a DNA match is compelling
evidence, two people _may_ have the same profile and some form of
corroborative evidence is advisable.
That is why when we see cases reported, I ask what the corroborating
evidence was, and whether the DNA match was presented as being
effectively certain in its own right with or without any corroborative
If we accept that DNA evidence is not enough to gain a conviction by
itself, then it surely follows that *if* that's all that's available
charges should not be laid but investigation should proceed
until/unless there is?
Beyond that, the concern I have is particularly with UK policy to
retain profiles on the national DNA database for persons not
charged, or cleared of any crime.
(i) because of the civil liberty issue, retaining personal data from
people who are cleared of any crime, and
(ii) because of implications if there is some proportion of
unrelated matches in the general population (see below).
I agree. What concerns me is if the presence of duplicates is
denied, and the possibility not considered. See the FSS response in
If the existence of duplicates is not allowed for, a DNA match is
likely to be treated as absolute by investigators, introducing the
danger that they will see only what they are expecting to find.
If the potential existence of duplicates is admitted, then it may be
fine to use our database as we are doing; but to do a more precise
DNA test on any actual hit we find as a matter of course, or to
expect that some form of corroborative evidence must be necessary.
The particular issue I would raise with the retention of data from
persons cleared of any crime is that if you start doing that, and
you do not acknowledge that there may be a proportion of non-unique
profiles in the general population, then there is a risk that you
may be essentially randomly sampling the whole population, and the
chance of hitting an unrelated match may be determined by that
population size rather than the database size. If the random match
chance for the UK test is truly 1 billion:1, as quoted, then we
could be looking at a chance of error as high as a few percent.
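The 'few percent' figure can be reproduced with a short sketch. It assumes a UK population of 60 million, the quoted 1 billion:1 random match chance, and independence between people; the function name is illustrative.

```python
import math

def prob_someone_else_matches(population, match_prob):
    """Chance that at least one other person in the population shares a given profile."""
    # 1 - (1 - p)^(N - 1), written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1((population - 1) * math.log1p(-match_prob))

print(f"{prob_someone_else_matches(60_000_000, 1e-9):.1%}")
```

Under those assumptions the chance that a given profile has at least one unrelated match somewhere in the population is a little under 6%.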
To see what I mean, imagine that we have a scene-of-crime profile,
which is not matched to a profile already on the arrestee database.
If there happen to be *two* people in the whole population with that
profile, then the closer the addition of profiles to the database
gets to a random sample, the more it's going to be 50/50 which
person gets added first.
Well, you may say, it's more likely that the offender will get
arrested first. Maybe - but also consider the converse situation,
where someone is arrested and then released without charge, but
their profile is retained on the database. If it happens that there
is someone else with a matching profile in the general population
not yet on the database, but who then commits a crime and leaves
DNA, the first man is likely to be accused of the crime.
In either case, they may have difficulty proving their innocence,
unless they know to demand a more precise test.
The investigators won't know it's a duplicate, of course,
unless/until they happen to arrest the _other_ one - and then not if
it's rejected automatically as an "invalid" duplicate.
Indeed. And if you accept that duplicates are possible, then I'd say
that it's reasonable to want to count how many are actually
occurring in a large population, such as the UK NDNAD. Whatever the
result, we can then plug that number back into the math the other
way, and see if it comes out similar to or different from the
theoretically calculated estimate of the random match probability.
If zero such duplicates, then we can do the math as if there was 1
such pair, and infer that the true random match probability is
likely to be even higher odds against.
If greater than zero, especially if quite a respectable number, then
we can make what should be an excellent experimental measure of the
actual "average" random match probability. We can also do the same
thing with subsets of the whole population, say by ethnic group,
provided that the subset is large enough to still include some
duplicates, and see whether the "average" random match probability
is different for that group.
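Plugging an observed count back into the math 'the other way' could look like the sketch below. It assumes the same flat model, treats a count of zero as one pair, as suggested above, and uses hypothetical observed counts purely to show the arithmetic.

```python
def implied_match_probability(observed_pairs, n):
    """Random match probability implied by a count of unrelated pairs among n profiles.

    With zero pairs observed, one pair is assumed, giving a rough upper bound.
    """
    comparisons = n * (n - 1) / 2
    return max(observed_pairs, 1) / comparisons

n = 1_886_000  # estimated individuals on the database, as quoted earlier

# Hypothetical observed counts, for illustration only.
for pairs in (0, 20, 2000):
    p = implied_match_probability(pairs, n)
    print(f"{pairs:>5} pairs -> about 1 in {1 / p:,.0f}")
```

The same function can be applied to any subset of the database by passing the subset size as `n`, provided the subset still contains some duplicates.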
One thought that has occurred to me since the previous post is that
there should be enough data on the UK database to determine how it
stacks up with a simpler case, the 6-loci test, without needing to
concern ourselves about checking fingerprints or indeed anything other
than the DNA data.
When that test was in standard use, the typical calculated estimate
quoted as the random match probability was 1:37 million, for example
as described in the Easton case:
With a national database of 1,886,000 persons, and given a random
match probability of 1:37 million, then we'd expect to find 100,000
or so 6-loci not unique, or perhaps around 50,000 "pairs". You'd
also expect perhaps 500-1,000 "triples", and maybe 10 or
Then what you do is assume that the 10-loci test is genuinely
unique, and count how many 6-loci duplicates etc you have. If there
are a few tens or even a few thousand 10-loci duplicates, we'll fail
to count those because we incorrectly think it's only one person,
but we're looking for 50,000 pairs, so we'll still get a fairly
Depending on the results, I'd say:
If you found zero 6-loci matches, then I'd wonder what was going on,
because we know such matches can occur in practice, and it would be
wildly inconsistent with the theoretically calculated estimate.
If you found many fewer (say 10,000), then I'd say it was good
evidence the original estimate of 37 million:1 was conservative, and
that other estimates calculated in a similar way (e.g. the 1
billion:1 for the 10-loci test) may also be conservative.
If you found a similar number to that expected, I'd say that it was
good evidence to validate the theoretical 'product-rule'
calculation, for this and potentially for other tests.
If you found many more than the expected number, say 100,000 or
more, then I'd start to wonder if the estimated probability for
supposedly more accurate tests was doubtful.
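The expected counts quoted for the 6-loci test can be checked with a short sketch. It assumes a flat model in which every profile is equally likely, so that any k people share a profile with probability p to the power k - 1; real profile frequencies vary, which would tend to raise the higher multiples. The function name is illustrative.

```python
from math import comb

def expected_groups(n, match_prob, k):
    """Expected number of sets of k profiles that all share one profile (flat model)."""
    return comb(n, k) * match_prob ** (k - 1)

n, p = 1_886_000, 1 / 37_000_000

for k, name in ((2, "pairs"), (3, "triples"), (4, "quadruples")):
    print(f"{name}: about {expected_groups(n, p, k):,.0f}")
```

On these assumptions the results come out close to the figures given above: a little under 50,000 pairs, several hundred triples and around 10 quadruples.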
It would take a little bit of effort, but all of the data should be there in
the database, and susceptible to suitably defined query operations.
I could certainly write SQL which would produce the required
results, given the database schema, and while 2 million records is a
substantial dataset, it's perfectly within the realms of current
technology to manipulate. - jbaron
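A minimal sketch of the kind of query meant here, run against an in-memory SQLite database from Python. The table name, column names and the five sample rows are all invented for illustration; the real NDNAD schema is not known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per person, one text column per 6-loci genotype.
conn.execute(
    "CREATE TABLE profiles (person_id INTEGER PRIMARY KEY, "
    "l1 TEXT, l2 TEXT, l3 TEXT, l4 TEXT, l5 TEXT, l6 TEXT)"
)

# Invented rows: persons 1, 2 and 4 share the same 6-loci profile.
rows = [
    (1, "14,16", "6,9.3", "12,13", "20,22", "29,30", "13,15"),
    (2, "14,16", "6,9.3", "12,13", "20,22", "29,30", "13,15"),
    (3, "15,17", "7,9", "10,13", "21,24", "28,31", "12,16"),
    (4, "14,16", "6,9.3", "12,13", "20,22", "29,30", "13,15"),
    (5, "16,18", "8,9.3", "13,14", "19,23", "30,32.2", "14,17"),
]
conn.executemany("INSERT INTO profiles VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Group identical 6-loci profiles, then tally how many groups exist of each size.
query = """
SELECT g.group_size, COUNT(*) AS n_groups
FROM (SELECT COUNT(*) AS group_size
      FROM profiles
      GROUP BY l1, l2, l3, l4, l5, l6
      HAVING COUNT(*) > 1) AS g
GROUP BY g.group_size
ORDER BY g.group_size
"""
duplicate_groups = conn.execute(query).fetchall()
print(duplicate_groups)  # list of (group size, number of groups)
```

Each returned row gives a group size (2 for pairs, 3 for triples and so on) and the number of distinct profiles shared by that many people.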
Would you know whether the Amelogenin adjunct to
DNA profiles will pick up the likes of Turner
Syndrome 45,X or Amazonianism/Triple X 47,XXX ?
It is possible to tell if such a genetic anomaly existed. In the case
of someone with Turner syndrome (X,0), they should have a single peak
at amelogenin that would be about half as tall as would be expected.
With triple X (X,X,X), the single peak would be about 50% larger than
expected. This might not be detected, since the peaks at amelogenin
are generally taller than at other loci. If a person was (X, X, Y) or
some other situation, it could be more evident if one peak was
significantly different than expected.
I happen to know someone with XXX; she only
got diagnosed by a test associated with pregnancy.
I was wondering whether someone could get a
'diagnosis' by making a Data Protection application
for copy of their DNA database record.
Returning to the main thread. If the FBI database has
no matches within it then it is because nothing has changed since
1992. The following is a published letter following a published
article that was rendered flawed because of FBI weeding-out
matches within databases before releasing to academic researchers.
Only now, it is beyond being just an academic matter as people are
being falsely arrested and for all I know falsely prosecuted and
From Science, Vol 256, 26 June 1992, p1743
Patrick J. Sullivan
Title : DNA Fingerprint Matches
I am writing to comment on two aspects of the report "On the
probability of matching DNA fingerprints" by Neil J. Risch and B.
Devlin (7 Feb, p717). Risch and Devlin searched several large
databases to determine whether there were any samples with matching
patterns across a number of gene loci. They found "the probability
of a matching DNA profile between unrelated individuals to be
Last summer I was trying a Federal Bureau of Investigation (FBI)
case, Minnesota v. Johnson (1), and examined three FBI databases, C-3
(Caucasian), B-4 (black), and H-3 (Hispanic). During my examination, I
discovered 25 apparent matches. Before my examination, the existence
of these matches had been known by only a few individuals connected
with the FBI. Bruce Budowle of the FBI subsequently testified in
Minnesota v. Johnson that he was aware of these matches and that they
had been discovered when the FBI examined its database with its
computer matching program. The FBI was able to verify that most of
these matches occurred because the Texas College of Osteopathic
Medicine submitted more than one blood sample from the same
individual. One false match was the result of sample handling error.
The FBI also discovered three sets of matching samples from Florida.
These samples were from the black and Hispanic databases. The FBI was
not able to identify that the Florida matches were the result of
duplicate submissions from the same individual or of submissions from
identical twins. Budowle then asked Cellmark Diagnostics (Germantown,
Maryland) to examine the matching samples. Its probes also
yielded unclear results. The Florida matches were then deleted from
the databases,even though there was no explanation for their
The FBI again revised its databases in January 1992. The new
databases are designated C-4,B-5, and H-4. Budowle testified (2) that
all the matches have been edited out of these databases and that this
removal is justified because it is not possible for two individuals
to yield identical profiles when as many as seven probes are used. My
first point is this: Of what scientific value is a paper that seeks
to draw any conclusion from the fact there are no matches in a
database when the matches have been removed from the database before
the analysis is done? The FBI's removal of matches from its databases
before giving them to outside scientists guarantees that those
scientists' conclusions will support the FBI's "self-fulfilling
This is not an isolated practice. Budowle testified in United States
v. Yee (3) that the FBI ran its match program over its South Carolina
black database and found a large number of matches. The FBI's record-
keeping was such that it could only speculate as to the cause of
these matches. Again,the FBI removed them from its database.
The existence of individuals who match across a number of loci is not
unprecedented. Kenneth Kidd's Amerindian (Karitiana) data (4) show a
seven-probe match between two individuals ,a four probe match between
another two individuals ,and a number of three-probe matches . These
matches occurred in a database of 54 donors from one Indian village.
Despite this fact, which is well known to the FBI, the FBI chose
simply to remove apparent matches from its databases. The apparent
justification of this practice is that it eliminates the necessity
of keeping records about the source of the data. It is troubling to
think that this approach has acceptance among scientists.
My second point .... (binning problem ).....
I have to concur with all he said and that was 11 years ago - I was
not aware of the above article until last week - the situation is now
of course far,far worse. The blind faith stretches ever further.
NB The Amazonian Karitiana tribe is not really relevant here as they
are highly incestuous so not applicable to unrelated matches - the
relevance comes in when trawling across relatively closed communities
I will respond to both parts of your post separately:
As far as your friend's genetic condition is concerned, I doubt that
the reported genotype you could obtain would detail the XXX. Many
labs (it may be different in the UK) will report only the OBSERVED
genotypes. So in the case of XO, XX, XXX, or XXXXXXXXXXXX, they would
probably report merely as "X". This has something to do with the fact
that mutations can occur that prevent one (or both) allele(s) from
amplifying. In RFLP days, it was possible that a band would be so
small, it would "run off" the gel and not be observed. This means
that there may be more alleles that are not detected (although this is
I want to point out that the paper you cite deals with RFLP databases.
These are no longer in use in the United States. Similar to the
matches reported when the UK used only SGM with six markers, it is
possible to obtain matches with only three or four RFLP loci that can
be excluded when additional loci are used.
On another note, "trawling" for DNA types doesn't change the
statistics of the match. No matter whether you obtain evidence
against a person and the DNA solidifies the case or you have DNA
evidence against a person and the resulting investigation solidifies
the case, the two are equal. The fact that a person matches because
he was in a database doesn't change the likelihood that a random
person will match.
STR technology has been introduced since 1992, as Adam said, and most, if not
all, of the profiles in the CODIS ( I think it has a new name which I can't
remember) databases have been converted. If people are being falsely arrested
and convicted, I still believe it is because of faulty practice of the
profiling technology, not because there is anything wrong with the technology.
Two cases from the UK this week on what can be broadly called
misattribution of DNA 'evidence'. First from Scotland ,second from
Police outrage over demand for their DNA
by JASON ALLARDYCE
PLANS to force police to give DNA samples have sparked a rebellion
among rank-and-file officers.
It is understood all eight of Scotland's police forces are about
to demand that in future new recruits
hand over samples to be included in a national genetic database.
This would allow any body matter, such as hair or saliva, found at a
crime scene, to be compared with
the DNA records of officers, so investigations are not thrown off
course through accidental contamination by officers working there.
But rank-and-file police fear that calculating criminals with a
grudge against members of the force
could manipulate the system to damage the careers of innocent
Members of the Scottish Police Federation believe criminals could
deliberately contaminate the scene
with officers' DNA, either to implicate them in serious crimes or
give the impression that they had planted evidence.
A federation spokesman said: "A point made by many of our members is
that it is
relatively easy for anyone so minded to obtain DNA traces of a police
officer - for example from a
discarded cigarette butt - and to deliberately contaminate a locus
"Apart from the suspicion which may or may not fall on the officer,
it has the potential to diminish the
evidential value of any DNA traces of the real perpetrator of the
In the full Scotland on Sunday article the policewoman
McKie case and the disputed dermal fingerprints are on
as high resolution images - interesting viewing
Then from the criminal fraternity someone being implicated by person
or persons unknown, presumably an enemy of his.
A man accused of burgling a city home after bloody tissues found at
the scene matched his DNA profile has been cleared by a court.
Jonathan Bowskill said he had nothing to do with the burglary
at Alpha Street, Heavitree, in the early hours of November 29. A
jury at Exeter Crown Court yesterday found him not guilty. During
the trial, prosecutor David Evans said Peter Holmes went to bed and
left a window open and his wallet in his leather jacket. He got up
at 5am and went to work. He later found the tissues on the floor and
his wallet missing.
Bowskill told police although he was a heroin addict, he "didn't do
burglaries", and did not know how the tissues came to be there.
There are about 2 million DNA profiles in the UK
Forensic Science Service (FSS) NDNAD. No one would
report how many false matches are actually recorded
within that database. That is pairs of separate people who
just happen to have the same DNA profile although not
in any way related (not twins or even brothers etc).
In consequence I have simulated a large DNA profile
database and the likely figure is one such match in
2 million. Full details,including computer macros, to
repeat the experiment yourself on
This is a mathematical simulation ,randomly modelling the DNA
based on published data but all 'profiles' totally independent
of one another.
The UK NDNAD contains 2 million profiles with
a minimum of one such match
plus many more due to the inescapable fact that most people
in the UK have ancestors in common , so more chance of
shared alleles and consequential match.
(Future research - to determine what this co-ancestry factor is)
I welcome anyone to copy the macros off the dnas.htm
file and spend a few hours repeating the process to
verify. No fancy computer required ,just an ordinary pc.
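For anyone who wants a feel for the method before fetching the macros, here is a small sketch of the same kind of simulation. It is not the macro code referred to above: the function names are invented, the allele frequencies are placeholders rather than published data, and the loci and database size are cut down so that it runs in moments.

```python
import random
from collections import Counter

def simulate_profiles(n_profiles, allele_freqs, rng):
    """Draw independent profiles: two alleles per locus from that locus's frequency table."""
    profiles = []
    for _ in range(n_profiles):
        profile = []
        for freqs in allele_freqs:
            alleles = list(freqs)
            pair = rng.choices(alleles, weights=[freqs[a] for a in alleles], k=2)
            profile.append(tuple(sorted(pair)))  # the order of the two alleles is irrelevant
        profiles.append(tuple(profile))
    return profiles

def count_matching_pairs(profiles):
    """Number of pairs of profiles that are identical at every locus."""
    return sum(c * (c - 1) // 2 for c in Counter(profiles).values())

# Placeholder frequencies, NOT published data: three loci with four alleles each.
toy_locus = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
rng = random.Random(0)  # fixed seed so that a run can be repeated
profiles = simulate_profiles(2000, [toy_locus] * 3, rng)
print(count_matching_pairs(profiles))
```

Replacing the placeholder table with one frequency table per locus, and raising the number of profiles, gives the general shape of the experiment described, although run time and memory grow with the database size.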
I thought we beat this to death a couple of months ago. At the risk of being
personally attacked, again, I don't think these mathematical games mean much in
the real world of crime scene, crime lab, and court. There are no non-twins
that have the same genetic make up. If there can be matches with current
technology, its because the labs are not using enough data to compare DNA.
Anyone wrongly convicted on the basis of RFLP or STR technology can have the
lab results verified by checking more sites. Surely a 2nd confimatory round of
tests can be devised to double check questionable results. wally
If there was a general perception of a problem with
DNA profiles then what you say is true. I have yet
to see reports in the press concerning extended loci tests.
Perhaps the most disturbing thing I discovered in this
simulation were the results for 6 loci -
27,168 pair matches
1231 triples matches
if the UK FSS had allowed the 6 loci situation to
survive up to a database of 2 million.
For most of the 1990s the FSS,police,judiciary, forensic
statisticians and all their computer power considered
6 loci to be sufficient to arrest the likes of Raymond
Easton because they believed there would not be a
false match in 60 million of the UK population.
It was only Mr Easton's case in 1999 that led
to the change from 6 to 10.
I appreciate USA / CODIS uses 13 loci but on
the other hand the population of the USA is 10
times that of the UK - the increase from 10 to 13
swallowed up by the 10 fold population size
Just an observation from a "blinkered" forensic
scientist... In your simulation, you "make up the
rule" that the 10 loci tested are limited to having
only 10 alleles (even though you state that the
average number of alleles for SGM loci is 14). That
is, rare alleles get lumped into one category (0) for
your simulation. Why don't you run the simulation
with 14 alleles? I mean, I know you believe that
everyone in the world is +/- 2 alleles from the most
common alleles... but is this realistic? Perhaps the
whole world really is an anti-Lake Wobegon (e.g.
everyone is just average). It just seems to me that
with all the constraints you've put in your
analysis... it's no wonder what you get is what you
want to see.
On a personal note to the Forensic Science group: Did
you know that this is a "Peer-Reviewed" forum? I'm
glad because now, with each posting, I can add a few
more publications to my resume. Except, I forgot to
sign my name to the posting I gave Mr. Nutteing a few
months ago (it was about using fingerprints to confirm
You see, Mr. Nutteing has posted some of the
discussions we've had with him in the past. Fair
enough, the web is free (well, except for the porno).
But, this is what Mr. Nutteing thinks of his "peers",
the DNA forensic scientists: "It shows the very
dangerous blinkered mindset of forensic scientists. I
have to assume the same attitude is prevalent within
the police and the judiciary."
Here is some background info about Paul from his
cousin, if you are interested in potential motivation
for wanting to discredit DNA analysis (well, we are
all peers, right?).
As for me, I've washed my hands of this dude. And I
would suggest to everyone else to be aware of what you
wish to say to him. I'm going to follow the advice my
late grandfather once told me, "Never argue with a
blinkered person... people watching may not be able to
tell the difference."
You can't be too blinkered as you picked up the 10
instead of 14 and did further research.
The reason has probably got a bit lost in the background.
It did not emerge overnight - a number of people
contributed along the way. One thing that emerged
early on was the matter of the rare alleles. For 'Gafoor
trawling' rare alleles are very useful but in this area of false
matches they become a bit of a nuisance.
I tried finding the name the statisticians may call the following
effect - but not found (maybe original research finding but I doubt
In more general terms, what we are dealing with here
is matching of multi-modal sets. With the exception of D2S1338
in the UK, in the chosen 10 loci they tend to approximate to
normal/gaussian frequency distributions . That is some sort of peak
with tailing-off to the rarer alleles either side.
Before simulating a 10 loci database containing millions
we gradually built up by testing 6 loci , 7 loci etc. These of
course showed more matches and it is possible to analyse
the structure of such matches. First thing to note ,contrary to my
intuition, these matches did not exclusively involve the most
common alleles. They can involve medium rare alleles but never in my
preliminary simulations involving the rarest ones.
A second effect probably corresponds to a known law within statistics
but, as I say, I could not find it.
All these loci/alleles in the simulation are constrained
to agree with the published allele frequency tables. With a large
number of 6- and 7-locus matches it is possible to analyse the distribution
of the alleles found in the matches. What emerges is that what start
as multi-modal distributions, similar to normal distributions, have,
when the matches are analysed, a more broadened distribution, like an
upturned U, with a steep run-off to the tails: an increase in the
proportion of common alleles and a very noticeable reduction in the
proportion of rarer alleles.
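One candidate explanation for the shift towards common alleles, offered as an assumption rather than as the law being looked for: if two profiles are drawn independently, the chance that both carry genotype g at a locus is P(g) squared, so among matches each genotype is weighted by its squared frequency. The sketch below computes the resulting allele shares exactly for a single locus. The frequency table is invented, Hardy-Weinberg proportions are assumed, and the function names are hypothetical. It is only meant to show the enrichment of common alleles and the depletion of rare ones, not to reproduce the exact shape seen in the simulations.

```python
from itertools import combinations_with_replacement

# Hypothetical allele frequencies (allele name -> frequency), for illustration only.
FREQS = {13: 0.02, 14: 0.08, 15: 0.20, 16: 0.40, 17: 0.20, 18: 0.08, 19: 0.02}

def genotype_freqs(freqs):
    """Genotype frequencies assuming Hardy-Weinberg: p^2 or 2pq."""
    out = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        out[(a, b)] = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
    return out

def allele_share_in_matches(freqs):
    """Share of each allele among genotypes shared by two matching profiles.

    A genotype g turns up in a match with weight P(g)^2 / sum of all P(g)^2.
    Each genotype contributes half of its weight to each of its two alleles.
    """
    g = genotype_freqs(freqs)
    total = sum(p * p for p in g.values())
    share = dict.fromkeys(freqs, 0.0)
    for (a, b), p in g.items():
        w = p * p / total
        share[a] += w / 2
        share[b] += w / 2
    return share

if __name__ == "__main__":
    share = allele_share_in_matches(FREQS)
    for allele in sorted(FREQS):
        print(allele, "population:", FREQS[allele],
              "among matches:", round(share[allele], 4))
```

With this invented table the commonest allele rises from 40% of the population to over half of the alleles seen in matches, while the rarest falls by roughly a factor of ten.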
All in all I decided to simplify, to lump these rare
alleles into one where appropriate.
vWA - not required as less than 10
TH01 - not required
D8 - not required
FGA - 1.8% rare ones lumped into one
D21 - 0.5%
D18 - 2.5%
D2 - 2%
D16 - not required
D19 - 0.3%
D3 - not required
The percentages are percentages of alleles for that locus,
not percentages of all loci/alleles.
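The lumping step can be sketched as below. The array size of 10 is inferred from the "not required as less than 10" remark and is an assumption, as are the function name, the "rare" label and the example table. Any locus with more alleles than slots keeps its most common alleles, and the remainder are summed into one combined allele.

```python
def lump_rare_alleles(freqs, max_slots=10, rare_label="rare"):
    """Fit a locus into max_slots entries by merging its rarest alleles.

    freqs maps allele name -> frequency. If the locus already fits, it is
    returned unchanged ("not required"). Otherwise the max_slots - 1 most
    common alleles are kept and the rest are summed into one lumped allele.
    """
    if len(freqs) <= max_slots:
        return dict(freqs)
    ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[: max_slots - 1])
    kept[rare_label] = sum(p for _, p in ranked[max_slots - 1:])
    return kept

if __name__ == "__main__":
    # Hypothetical 4-allele locus squeezed into 3 slots, for illustration only.
    example = {"a": 0.50, "b": 0.30, "c": 0.15, "d": 0.05}
    print(lump_rare_alleles(example, max_slots=3))
```

The lumped entry's frequency is the kind of per-locus percentage listed above, for example 1.8% for FGA.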
I agree a more rigorous simulation would indeed use
a larger array space, requiring 16 for the biggest, FGA (UK
distribution), or 25 for full ethnicity coverage and a lot of zeros in
You have to make a cut-off somewhere; the tables I have only go
down to 0.1% anyway. Strictly one should include 0.01% alleles etc.
A crime-scene profile has a full set of 13 loci,
so 26 data points. On 25 there is a perfect match
with a profile on a database. The 26th allele from
the crime scene is 18 but the database has 18.2. Is
this declared a match?
Second question, concerning another crime-scene
profile of a totally unknown/unwitnessed offender.
Frequency analysis of the alleles indicates he is
50 times more likely to have ethnicity A than B.
But one pair of alleles is extremely rare in population A,
say 0.002, 0.002, but in population B it is say 0.015, 0.15.
Do you declare no result, or does one determination
override the other? As both alleles are equally rare,
it is difficult to argue mixed parentage
from the A and B populations.
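To put rough numbers on the second question, here is a worked sketch using the frequencies exactly as quoted. Several things are assumptions: the pair is treated as a heterozygote with frequency 2pq (Hardy-Weinberg), loci are treated as independent, and the "50 times" figure is taken to come from the other loci only. If the 50 already includes this locus, the combined figure below does not apply.

```python
def heterozygote_freq(p, q):
    """Expected frequency of a heterozygous genotype under Hardy-Weinberg."""
    return 2 * p * q

def locus_likelihood_ratio(freqs_a, freqs_b):
    """How much more likely the genotype is in population A than in B."""
    return heterozygote_freq(*freqs_a) / heterozygote_freq(*freqs_b)

# Allele frequencies as quoted in the question.
locus_lr = locus_likelihood_ratio((0.002, 0.002), (0.015, 0.15))

# Assumption: the 50-to-1 figure for A over B comes from the remaining loci.
other_loci_lr = 50.0
combined_lr = other_loci_lr * locus_lr

print("this locus favours B by a factor of", round(1 / locus_lr, 1))
print("combined A-versus-B ratio:", round(combined_lr, 4))
```

On those assumptions the single locus favours B by a factor of about 560, more than enough to reverse a 50-to-1 lead for A from the other loci, leaving B favoured by roughly 11 to 1 overall.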
On the first part, I don't think anyone would declare it a match
without further examination. Even after further examination I still
think they wouldn't use a term like that.
For your second question, I'm not aware of anyone in the US that uses
the population databases to determine possible racial information.
On the first part a match should not be called, but the similarity between the
crime scene profile and that of the person on the database would probably result
in further tests, certainly at the locus in question. If these confirm the
discrepancy then that person should be eliminated, but I would expect
investigators to consider close relatives, especially twins of the same sex as
the person on the database.