Problems with DNA profiles

If you found this file in an archive then use keyword "nutteingy" in a search engine to find an updated version or related pages.
Updated file March 2006

Despite official government sites linking to these files, there are still corrupt persons knocking out my sites. So that search engines can cross-link them: files no longer available on the original web hosting sites were at http://www.nutteing.50megs.com/dnay.htm , http://www.nutteing.freeisp.co.uk/dnay.htm , http://nutteing.no-frills.net/dnay.htm and http://nutteing3.no-frills.net/dnay.htm (the last two now gone due to host failure), http://www.nutteing.batcave.net/dnay.htm and http://home.graffiti.net/nutteing/dnay.htm
Articles posted to a moderated Yahoo community group for forensic science personnel, starting 16 July 2003, thread title "Problems with DNA Profiles", under my nom de guerre of Nona Revers or, as it turns up there, nonarevers.
Text in brown is my contribution - due to the nature of such dialogues the thread can get rather broken as it splits into different sub-themes at different times and involving different people.
It shows the very dangerous blinkered mindset of forensic scientists. I have to assume the same attitude is prevalent within the police and the judiciary.

Subject: Problems with DNA profiles

Problems that even Prof Sir Alec Jeffreys will not address.
DNA profiles for unrelated people are very far from being unique.

Cases from real life
1/ http://tinyurl.com/dtfe Raymond Easton of Swindon, newspaper report, 15 August 2000, or if not showing there then http://cjpa.freeservers.com/easton.htm (ditto). Seriously disabled but arrested for a cat burglary 200 miles away.

2/ http://tinyurl.com/9dzd Peter Hamkin of Liverpool, newspaper report, 10 March 2003. Arrested for murder 1000 miles away.

3/ http://213.159.10.102/germany.asp?pad=190,205,&item_id=31550 27 May 2003 - the Goettingen prisoner, newspaper report, also published in Die Welt, 24 May 2003. The supposed murderer, but he had the perfect alibi - he was in prison at the time of the murder.

Add in the Milly Dowler case, http://news.bbc.co.uk/1/hi/england/3007854.stm , probably a false match. These are all cases of two totally separate, unrelated people having the same DNA profile, and those are just the ones that get published - there are many unreported so-called 'unresolved duplicate pairs' of such matches in the UK NDNAD database.

Although DNA profiles are not necessarily unique, your statement is misleading. The weight of a "DNA profile" is dependent on the loci used. If only one locus is used, it is possible that almost everyone in the population will match the DNA profile. If a sufficient number of markers are used, the profile can approach uniqueness.

Just because one profile with a given typing system may be nearly unique, this does not mean that every profile is unique. This is particularly true with mixed samples.

I find it slightly amusing when people who don't understand the facts behind what they are saying give no explanation, but simply cite publications written by people who don't really understand the facts behind what THEY are saying.

Adam D

Unless you know otherwise, these are all full sets of loci producing full matches, so that the innocent were arrested (except the Goettingen prisoner, of course).

No other corroborative evidence was used - just a trawl of DNA databases. Remember it is happening in the UK first because they have 'joined-up' databases, now consisting of over 1.8 million such profiles. My own profile is too close to the UK caucasian norm (nowhere more than 2 alleles away from the median on each locus) to be comfortable.

You proved my point. You don't have any knowledge as to whether these are full profiles, mixtures or whatever. You make serious assumptions in your argument.

Even if you have the most common alleles at each locus, it doesn't indicate that your profile is common. I cannot give any additional information to prove this, since I'm from America and we neither use all the same loci as in SGM+ nor do I have the population study information for the UK handy.

It is true that as a database size increases it becomes possible to have a match purely by chance, but such matches can be excluded by using additional loci. This has been done in at least one of the cases you cite.

Whether derived from mixed samples or not, it was considered sufficient grounds to arrest totally innocent persons without the slightest corroboratory evidence.

My own profile, based on the 10 loci
VWA,TH01,D8,FGA,D21,D18,D2,D16,D19,D3
used in the UK FSS NDNAD structure, is (slightly altered for obvious reasons)
(17,19)(8,9.3)(13,13)(20,22)(29,29)(13,15)(18,19)(12,12)(12,14)(16,18)
The UK caucasian median profile, ie the 'average Joe', is
(17,17)(9.3,9.3)(13,14)(21,21)(29,30)(14,14)(A,B)(11,12)(14,14)(15,16)
(assuming the D2 allele 20 UK subgroup for A,B, as the frequency plot is triple valued). So my profile normalised relative to UK caucasians, ie taking the difference between each element, is
(0,2)(-2,0)(0,-1)(-1,1)(0,-1)(-1,1)(-2,-1)(1,0)(-2,0)(1,2)
The UK caucasian 'average Joe' would be all (0,0) numbers. So you see the preponderance of 0s, 1s and 2s makes me uncomfortable as far as being wrongly implicated in some future crime, whether in the UK or, Interpol-wise, somewhere else in Europe.

It was considered sufficient evidence to arrest him because it IS sufficient evidence to arrest him. Arresting a person does not mean the person must have committed the crime. It means that evidence exists that shows the person is possibly or probably implicated in the crime. If your gun is stolen and used to kill a person, you should not be totally dumbstruck when the police come calling. If your blood is at the scene of a crime because you accidentally cut yourself there prior to the crime happening, it doesn't mean you committed the crime, but the police have an obligation to investigate and possibly arrest you. If the DNA types are a million times more likely if you left the DNA than if a random person did, you should not be shocked if you are arrested, because there is sufficient evidence to arrest you. If you are CONVICTED, then there is a miscarriage of justice. Your information about your DNA types and how they compare to the "average Joe" shows a complete misunderstanding of DNA evidence. First of all, there is no "average" DNA type. I am assuming that your "average Joe" type is a combination of the most common alleles (I have no idea where you came up with this information). In any case, this would probably be present in only a few individuals in Europe. DNA typing has nothing to do with how many STR repeat units your type is away from the most common allele. Your list of 0s, 1s and 2s is complete nonsense. Before you question evidence, actually study it. You are guilty of making the same error you are concerned about. You rush to judgement before all the facts are gathered.

In your scenario I assume all guns in the USA have unique serial numbers. Then it is indeed good evidence, but that is precisely my point: DNA profiles are not unique. Take a hypothetical, a bit in the future, when the USA (following the lead of the UK, unusually) has structured a nationwide DNA profile database covering all or at least many states and counties. Suppose you in San Diego have a murder scene-of-crime DNA profile, you interrogate the new states-wide database, and you find there is an exact full-loci match to an individual in New York. Despite there being absolutely no independent corroboratory evidence, would you get the NYPD to arrest this person on the other side of the country? I would posit that in this scenario there just isn't sufficient reason to arrest someone. If further investigation, eg mobile phone cell records, credit card transactions or whatever, places this New York person in your area at the relevant time, then there is sufficient evidence.

My data on median groups within the UK is from allele frequency tables in the forensic science journal International Journal of Legal Medicine, issues (2001) 114:147-155 by L.A. Foreman and I. W. Evett and (1997) 110:5-9 by I. W. Evett et al. In forensic science terms the ideal frequency distribution of alleles would be flat, ie equal likelihood of occurrence right through the range, but alas profiles are an abstraction from biology, so the distribution is, generally speaking, Gaussian, with very specific locus/allele peaks for a given sub-population.

Table of the percentages of the caucasian population within plus or minus 2 alleles of the median for each locus in the 10 used in the UK FSS NDNAD:

Locus   Percentage
VWA     88
TH01    56
D8      84
FGA     63
D21     74
D18     71
D2      39
D16     82
D19     91
D3      84

eg the median allele for VWA is 17, with 27 percent; those with 16, 17 or 18 make up 71 percent; and those with 15, 16, 17, 18 or 19 (ie median +-2) make up 88 percent of people.

So again in forensic science terms D2S1338 is best, as 61 percent of people are outside the median +-2, and the worst is D19, with only 9 percent outside the median +-2. For people within this combined median (average Joe + or - 2) area, the odds against a false match between unrelated people with matching DNA profiles are much lower than the 100 trillion to one sort of figures normally bandied about.

This principle equally applies to the USA but of course with a different set of chosen loci to the UK SGM+ set.

Coincidentally, I came across this CBS report this weekend concerning problems with friction ridge (dermal) fingerprints: http://www.cbsnews.com/stories/2003/07/16/60minutes/main563607.shtml

This has gone on long enough. There will be a great deal of coincidence at an individual locus. There will be far less at a combination of loci. Your example of the average Joe's profile and your own misses a crucial point. The difference between the two means that no self-respecting scientist would call a match between a crime committed by somebody of that profile and yourself, even taking margins of error into account.

There are legitimate concerns regarding whether a DNA profile is unique or simply very rare. A full SGM+ profile is usually referred to as having a random match statistic of 1:1bn. Given a world population of 6bn, this suggests that there would be five other people in the world with similar DNA to your own. These five people could be anywhere in the world. It would be an incredible coincidence for one of them to be in the vicinity of the crime and yourself, but not impossible. This is why DNA evidence needs to be corroborated, but that corroboration must be tested in court. It should, in my opinion, be perfectly acceptable for police to seek an explanation from a suspect, or from a person to be eliminated, of DNA evidence suggesting that the person concerned may be responsible for that crime. At the stage of arrest no determination of guilt or innocence has been made, only that there is evidence that requires an explanation and that there may be a case to answer.

Given your references to the UK I take it that you live here. Are you aware of the case of Jeffrey Gafoor? He recently pleaded guilty to the murder of Lynette White. The 1988 murder was then the most horrific murder in Welsh history. He was caught by DNA evidence. Innovative police work regarding particular alleles resulted in his capture after a partial match on the National DNA Database indicated that a male relative was likely to be the killer.

Gafoor gave buccal swabs which were matched to the crime scene samples. Following that he attempted suicide. He was arrested solely on the basis of the DNA results. He was subsequently interviewed and admitted his guilt. Are you seriously saying that after a match was declared to Gafoor on the DNA police should not have arrested him? If you are, then I suggest you join the real world asap, as Gafoor's admissions would not have occurred without his arrest and an extremely vicious killer would not only have remained at liberty, but he could not ever have been brought to justice.

I would certainly agree that in court, if it gets that far, there needs to be corroborating evidence. In your example, the New York alibi would mean that barring incompetence, etc. the person will not be charged and convicted and I doubt that anybody on this list or in the scientific community would think they should be, so what is your point? I would suggest that after a match is called police in San Diego would be duty-bound to investigate that person. From having read Adam's posts for many months I am sure that once he established there was no corroborating evidence, he would pursue other lines of enquiry, but how do you suggest that police check whether the man was in New York or not and what is your definition of corroboratory evidence? The simplest method would be to ask him and his associates. After all, just because he lives in New York doesn't mean he can't have been in San Diego at the relevant time, does it? You suggest that there is no corroboratory evidence in your example, but you fail to indicate how corroboration would be sought. I think you have it about face. The DNA hit indicates that the person is worthy of further investigation and that an explanation of it may be required. How do you propose to obtain corroboration and/or eliminate that person? In Gafoor's case that required arrest. And for the record, Gafoor's arrest did not prove his guilt, it meant that he was a person who had to be spoken to under arrest in order to try to explain the DNA evidence. You surely would accept that police were entitled to seek an explanation of the DNA evidence from him, wouldn't you? If not, history would not and could not have been made in Cardiff Crown Court on July 4th. By the way, that case had seen a previous miscarriage of justice. Without the DNA evidence all but proving Gafoor's guilt, the whispering campaign against the original defendants would still be carrying on as we speak!
There may be some problems with DNA evidence, but please let us not throw the baby out with the bathwater.

Best Wishes Satish

I agree that it would be very logical to increase the number of loci looked at, as the statistical probability of there being a match between several people would greatly decrease. However, how many more should be added? Those who distrust DNA evidence or simply do not understand it (and its statistical importance) will probably never be satisfied with the number of loci used. There will always be that "grey area" unless every single person ever to live on the earth has their DNA tested at every single locus, and the entire universe is explored just to make sure that there are no humans or other creatures with similar DNA that happen to be living anywhere else.

Shannon

You're probably right, Shannon. Many will never be satisfied and that's OK considering the awful consequences of being wrong in a death penalty case. But as I remember we went from 6 to 13 loci with the last upgrade (RFLP to STR) in DNA technology; maybe an increase will accompany the next generation (LCN?). Anything that makes us more certain and less nervous about capital or long sentence criminal convictions. Can you imagine the nightmare of being imprisoned for someone else's crime?

Wally L

Your posts continue to show poor understanding of population genetics and the DNA testing performed. I cannot continue lengthy responses to your post, so I will try to keep it simple.

1) The USA has a DNA database comparable to the database in the UK.

2) DNA profiles are unique if sufficient loci are used (excepting identical twins).

3) The burden of proof for arresting a person is the lowest burden in the justice system. It is simply "probable cause". Being arrested does not mean you are guilty, it simply means that the law enforcement agency has a reason to believe that you are probably involved.

4) If a search of the national database in the USA leads to a match, it will probably be sufficient for an arrest. Usually additional work is done, but it probably doesn't need to be.

5) Although I don't have access to the journal that you cite, your use of the information is seriously flawed. First of all, the statistics don't have anything to do with how close to the most common type you are. In addition, you have to account for heterozygosity. You are also neglecting to take into account the fact that many of the people that have the most common allele at one locus have an uncommon allele at another locus. You have some idea that half of the people in England or Europe or wherever have the same DNA type. If you actually took the time to understand what you are looking at, you would realize that you are totally unjustified in your position. You are trying to come up with some off-base argument to justify your position, but you don't really understand what you are saying. Very few people have most of the most common alleles.

Adam
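The point that a profile built from the most common alleles is still rare can be illustrated with the standard product rule. This is only a rough sketch under stated assumptions: Hardy-Weinberg genotype frequencies, independent loci, and a hypothetical profile that is homozygous at all 10 loci for an allele of frequency 0.27 (the VWA allele 17 figure quoted in this thread, applied to every locus purely for illustration). The function names are made up for this example.

```python
from math import prod


def genotype_frequency(p, q=None):
    """Hardy-Weinberg genotype frequency: p*p for a homozygote, 2*p*q for a heterozygote."""
    if q is None:
        return p * p
    return 2 * p * q


def profile_frequency(genotypes):
    """Product rule: multiply per-locus genotype frequencies (assumes independent loci)."""
    return prod(genotype_frequency(*g) for g in genotypes)


# Hypothetical 'most common allele everywhere' profile: 10 homozygous loci.
# 0.27 is the VWA allele 17 frequency quoted in the thread; reusing it for
# all ten loci is an assumption made only for this illustration.
most_common_everywhere = [(0.27,)] * 10

freq = profile_frequency(most_common_everywhere)
print(f"profile frequency: {freq:.2e}")
print(f"roughly 1 in {1 / freq:,.0f}")
```

Under these assumptions the frequency comes out at a few parts in a million million, so a profile made up entirely of common alleles would still be expected in very few individuals. Real casework figures depend on the actual allele frequency tables and on corrections this sketch ignores.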

One quick note on the statistical differences between RFLP and STR technologies. With RFLP there were many more possible combinations at each locus than with STRs. This is part of the reason why the FBI required 13 loci for STRs vs. the six for RFLP. The stats for six RFLPs are similar to 13 STRs. But STRs are preferred for many other reasons (particularly since they require significantly less sample).

Adam

The profiles used in DNA analysis now are limited to 13 probes/loci (?) as I understand it. There are millions or billions of genetic pairings. Even given that genes are made up of more than one pairing, there must be millions of billions of possible combinations. Even with just the 13, I've heard figures of up to 9 1/2 trillion to one. Isn't the statistical uniqueness of a profile a function of how many genetic characteristics you compare? So if there were to be a question in a given case, just increase the number of comparison points. Studies could be done to validate those contingency characteristics. Tempest in a teapot.

I am not one of those who will never be satisfied. I happen to believe that it is one of the most important tools in the fight not just against crime, but also against miscarriages of justice as well. In Britain the system of SGM+ played a pivotal role in finally resolving the Lynette White case. I had been involved in that for twelve years. Every system that was used in Britain from 1988 onwards was used in that case. As such it is an excellent case to review the effectiveness of DNA testing systems over time.

To reply to your questions, Shannon, in Britain SGM+ tests at ten loci (it also tests at amelogenin which is very sensitive for the Y-chromosome). The random match statistic quoted in Britain is 1:1 billion. Whether this statistic accurately reflects the level of discrimination of that system I know not, but this is what juries hear.

In 1999 Britain went from SGM (six plus amelogenin) to SGM+. The previous system was quoted as offering a random match statistic of 1:50 million - approximately the population of the UK.

I know that there are compatible genes that could easily be added to SGM+. I would think that only one locus would need to be added in order for the random match statistic offered in evidence in Britain to be greater than the population of the world and that would satisfy me as it would be a case of the statistic offered in evidence being such that it would offer compelling evidence of inclusion until and unless an example is found of a random match occurring using this system. Were that to occur then the statistic offered would have to be wrong.

To date it has not occurred in SGM+, but were it to occur it would not prove the statistic offered to be wrong for reasons I referred to in a previous post. Within the statistic offered by SGM+ at present there is a grey area in which unscrupulous people can find refuge or a miscarriage of justice can occur. I just don't see why we have to live in this grey area when it is so easy to offer certainty, at least until proven otherwise.

In an ideal world we would have the DNA of everyone to compare with each other and test at every locus. That is not going to happen, nor should it. It would be time-consuming and outrageously expensive without improving matters significantly. It may satisfy some doubters, but it would bring everything to a grinding halt with no prospect of ever catching up with the backlog that it would create. In my view it is simply not a viable option, however desirable it might be. The random match statistic only needs to offer a level of discrimination that is greater than the population of the world as that will in effect offer exclusivity until and unless an example can be found to disprove it.

I would add that to function as effectively as it is capable of DNA Databases need to contain the DNA of all citizens of that jurisdiction and that this should occur throughout the world. I believe this to be in everyone's interests as it would (possibly) generate new lines of enquiry, especially in cold cases and it would enable wrongful convictions to be corrected and prevent visits to 'the usual suspects' where their DNA does not match that obtained from the crime scene. The Gafoor case is a good illustration of the need for this. But for excellent police work involving a profile stored on the National DNA Database the case would never have been solved, but what if Gafoor's nephew had never been arrested at all? If that were the case his DNA would never have been on the UK Database and nor would his uncle's. This would have meant that the real killer not only would have remained at liberty, but that he never could have been caught. In my opinion that is too high a price to pay, especially when it could be resolved so easily.

Hope that clarifies my position.

Best Wishes Satish

Firstly a general comment then some individual replies.

I am amazed at the complacency here. I can safely assume most people in this group, for elimination purposes, have their DNA profile on a database somewhere. Just because you work in the criminal justice system does not make you immune from becoming a "Peter Hamkin" (a previous post), perhaps 5 years down the line, and (falsely) arrested for some serious crime in another part of the country or even abroad. You, though, have enough savvy to demand an extended-loci DNA profile test to get yourself excluded, but that is no comfort to the Peter Hamkins of this world. Ironically the only persons not in this Damoclean / Kafka-esque world are the likes of the 'Goettingen Prisoner' (a previous post), safely banged-up inside at the time he would have been accused of murder.

There are hundreds of "unresolved duplicate pairs of profiles" in the UK NDNAD, first reported in the journal Forensic Science International 95 (1998) p30. Many of these would be one individual arrested and processed more than once but using aliases. Is no one concerned about these situations? It seems no one other than myself is concerned about all the other 'matches'. I don't have clearance to interrogate the relevant databases, but the check is simple. From the NDNAD find all such pairs of matches, take both names and DOBs, and interrogate the dermal fingerprint database. A match in both databases - then one individual using an alias. A mismatch in the dermal fingerprint database - then two individuals sharing the same DNA profile. Some would be clerical errors of course, but in that area you have false inclusions and false exclusions. As a scientist I find the (deliberate?) non-investigation of such anomalies abhorrent and alien to the scientific ethic.
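The cross-check proposed above can be sketched in a few lines. Everything here is hypothetical: the record layout, the identifiers and the data are invented for illustration, and no real database interface is implied. Clerical errors, which the post itself flags as a complication, are deliberately not handled.

```python
from collections import defaultdict
from itertools import combinations

ALIAS = "one individual (alias or repeat sample)"
SHARED = "two individuals sharing a profile"

# Hypothetical records: (record_id, dna_profile, fingerprint_id). Not real data.
records = [
    ("r1", "P-A", "F-1"),
    ("r2", "P-A", "F-1"),  # same profile, same fingerprints
    ("r3", "P-B", "F-2"),
    ("r4", "P-B", "F-3"),  # same profile, different fingerprints
    ("r5", "P-C", "F-4"),  # profile appears only once
]


def classify_duplicate_pairs(records):
    """Group records by DNA profile, then compare fingerprints within each pair."""
    by_profile = defaultdict(list)
    for record_id, profile, fingerprint in records:
        by_profile[profile].append((record_id, fingerprint))

    results = []
    for entries in by_profile.values():
        for (id_a, fp_a), (id_b, fp_b) in combinations(entries, 2):
            verdict = ALIAS if fp_a == fp_b else SHARED
            results.append((id_a, id_b, verdict))
    return results


for id_a, id_b, verdict in classify_duplicate_pairs(records):
    print(id_a, id_b, "->", verdict)
```

With the made-up records above, the pair r1/r2 is classed as one individual and the pair r3/r4 as two individuals sharing a profile; r5 produces no pair at all.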

Individual replies to
Satish
From the Promega site concerning SGM+ in the UK, they quote 100 billion : 1 for a random match. A DNA 'match' without any other evidence just shows that a set of numbers derived from a crime scene matches a set of numbers on a database somewhere; it is just a coincidence until there is further evidence. A DNA profile is just a snapshot of part of someone's DNA; it is not a unique identifier. I'm surprised you raise the Gafoor case - I don't know the generic term for this new technique, so I will call it sub-set trawling: 10 point matches instead of 20 (UK). I'm surprised the civil liberties lot have not been screaming their heads off. Collaring of suspects by using the DNA of their blood relatives who happen to be recorded on a DNA database, plus disturbance to the dozens or hundreds snagged along the way. How many people within a family share half the alleles? Brothers, sisters, parents, sons and daughters. No wonder the arrestee side of the NDNAD will be stopped at 3 million profiles - using 10 point trawls along with serious number crunching, in effect perhaps 20 million (one third) of the UK population are snareable. Gafoor and the victim, as far as I am aware, were related, so evidence of inclusion anyway. You could have mentioned the similar case of Joseph Kappen, implicated by DNA profile alone although deceased. The relatives of Kappen now have the stigma of a murderer in the family, but because he is dead they have no chance of vindication in court. Over 100 other men were presumably grilled or volunteered to surrender a DNA sample, now added to and irretrievable from this juggernaut database. The whole sub-set trawling exercise is supposed to be initiated again in the re-investigation of the Scottish Helen Scott / Christine Eadie murders.

To Wallyl & Shannon
I am reminded of a mathematical treatment for infinity. You give me the largest number you can think of and I can always add 1 to make it bigger. In the UK 6 loci were considered by the forensic statisticians to be perfectly adequate until the Raymond Easton (a previous post) case forced a rejig to 10 loci. However many loci are used there is still a finite chance of a false match.

To dutraa
I wish I had a rare locus/allele combination somewhere in my profile, but my rarest (D2S1338 / 18) is shared by 9 percent of UK caucasians. I would not feel vulnerable to a future false match determination if just one of them was, say, sub 1 percent frequency of occurrence.

You need to be pretty careful with the statistics for this, as it's not as simple and obvious as you suggest. The chance of duplicates occurring in a substantial-sized population is actually much greater than it might appear at first sight. It is rather similar to the 'Birthday problem', the chance of finding any two people with the same birthday in a group as small as 20 people.

I have done the calculations, and I'd tell you that in a population the size of the UK (circa 6E7), a random match probability of 1E9:1 for the DNA test means you can expect to find on the order of 3.6E6 people in the UK population whose profile is not unique, or 6% of the population.

Depending on the circumstances of how the match was located, that means the chance of latching on to an innocent could be as high as 3%, if you randomly select the innocent person (half-profile) from the 6% of those with the same profile. That is approximately a 1:33 chance within the UK population only, which is much, much greater than your '5 random matches expected in the world' (correct as far as it goes, but not the whole story) leads you to expect. Not such an 'incredible coincidence' after all! If arrests and investigations are frequently being launched with chances of error as large as a few percent, then it's likely that innocents will be accused on a regular basis, and a chance of error as large as that is certainly enough to constitute reasonable doubt by itself, requiring additional and convincing corroborating evidence to support a case.

You had also better be sure that the test really is as accurate as you suggest - if it turned out to be even somewhat less accurate, then the chance of false matches rapidly increases. Hope that helps...

BTW, I do of course mean 3,600,000 people out of the 60 million with a non-unique profile. That is to say, 1,800,000 different profile 'pairs' with one profile shared by two people. Within that 3.6E6 people, there would also be a much lesser number of triples and higher profile n-tuples, but those are second-order, third-order etc effects - the likely number of 'pairs' being the most significant and potentially surprising/disturbing result of the calculation.

Sorry if that wasn't clear - it can be difficult to express precisely without ambiguity...
jbaron

That was very, very interesting. I know it's easier said than done - any chance of posting the background maths? Even though this is just ordinary text, algebraic formulae or whatever rather than numeric, so I can plug in my own figures. Today I revisited the 'Birthday Problem' to brush up on this sort of statistics. This is someone's site about it: http://images.beggerlybend.com/puzzles/birthdays.html My maths is exceedingly rusty but I did manage to pick up a typographical error in his formula - for 36530 it should read 365^30. The first 6 elements of his table are correct anyway; I could check them on my basic calculator. It still seems intuitively wrong that 23 people gives a better than evens chance that 2 people share a birthday. I was also somewhat surprised that with 57 people you are 99 per cent certain to have 2 people sharing a birthday. Again intuitively I would have said that for 99% something over 150 people would be required. But that is precisely the problem: intuition does not square with billion : 1 sorts of probabilities.

Anyway with a bit of study I hope to be able to get my head round your maths should you be kind enough to post it here.
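For anyone wanting to check the birthday figures without a basic calculator, here is a minimal sketch of the usual calculation. It assumes 365 equally likely birthdays and ignores leap years; the function name is made up for this example.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes every day is equally likely and ignores leap years.
    """
    p_all_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


for n in (10, 23, 30, 57):
    print(n, round(p_shared_birthday(n), 4))
```

The values for 23 and 57 people should come out just over 0.5 and just over 0.99 respectively, in line with the figures quoted above.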

As a first-order approximation for the number of non-unique individuals, use the square of the size of the database divided by the odds against a single random match.

That is to say, if the size of the database (let it be N) is 1.8 million, and the odds against two profiles chosen at random matching are (let it be S) 1 billion:1, then expect the number of profiles which are not unique to be N^2 / S, or 3240 for these numbers. The number of 'pairs' is that divided by 2, or 1620 pairs.

The approximation is valid until the number of non-unique profiles starts becoming a significant proportion of the population. As the number grows, it will start to be an over-estimate (in fact, once there starts to be a significant chance of getting more than two people sharing the same profile, i.e. triples and higher n-tuples). In the context of DNA databases such as we are talking about, it's a suitable approximation because by the time we start seeing triples etc there's already a big problem with the number of duplicates! To give you an idea, using an example I quoted earlier, if we take N as 6E7 for the UK population, and leave S as 1 billion:1, then N^2 / S is 3,600,000. That's actually a slight over-estimate, as with those numbers there are likely to be a little over 100,000 triples of three people sharing the same profile, and if you make an adjustment for that you'll find that the number of pairs is actually more like 3.4 million.

But I'm sure you'll agree that the number not unique is large enough to be problematic long before we need to consider such second-order effects! jbaron
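The N^2 / S rule of thumb can be tried with other figures using the sketch below. The first function is the first-order estimate exactly as described in the post. The second is an assumed refinement that is not from the thread: it treats every profile as equally likely with probability 1/S, so that each person has N-1 potential partners and the chance of at least one match is about 1 - exp(-(N-1)/S). Real profile frequencies are far from uniform, so both are rough guides only.

```python
from math import exp


def non_unique_first_order(n, s):
    """First-order estimate from the thread: people with a non-unique profile ~ n*n/s."""
    return n * n / s


def non_unique_poisson(n, s):
    """Assumed refinement (uniform profiles): n * (1 - exp(-(n - 1) / s))."""
    return n * (1.0 - exp(-(n - 1) / s))


S = 1e9  # odds against two random profiles matching, as quoted in the thread
for label, n in (("database of 1.8 million", 1.8e6), ("UK population", 6e7)):
    people = non_unique_first_order(n, S)
    print(label)
    print("  first-order non-unique people:", round(people))
    print("  first-order pairs:", round(people / 2))
    print("  refined non-unique people:", round(non_unique_poisson(n, S)))
```

For the database-sized case the two estimates are almost identical; for the whole-population case the refined figure falls somewhat below 3.6 million, which is the direction of the over-estimate described above.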

I was expecting loads of factorials and complex sigma functions not a plain and simple N*N/S. Very impressive nevertheless. I remember a concept in maths 'reductio ad absurdum' . If there is a complex conjecture or formula that is difficult to prove then check it by entering a simplified case. If it throws up an inconsistency then the complex form is suspect. I don't consider the discrepancy between 23 and 27 in the birthday problem simplified case to be disproof.

The US/CODIS database is subject to the same statistics assuming each state uses the same loci set. The use of 13 loci reduces the number of duplicate pairs but of course does not eliminate them.

It is absolutely amazing the 'powers-that-be' can keep such a simple piece of maths and data out of the public arena (and technical arena for that matter) . The quote I have for 300 unresolved DNA profile matches in the NDNAD may relate back years before 2002. Unless there are two tv documentaries titled "DNA in the Dock" although shown December 2002 it was recorded and first shown maybe 5 years previously. So the NDNAD having many thousands of profile matches plus the aliases matches would be consistent. There must be a very firm lid to keep such info from the public domain.

The following quote from the Regina v Watters, 2000, appeal court judgement is now confirmed on the primary source, Butterworths Lexis Nexis. [Typing error in the last sentence 'acquitted who when' in the Lexis Nexis version, later amended to 'acquitted when'.]
Quote
The other evidence results from more stringent tests that have been done on the DNA material that was available in this case. That is partly as a result of a case in which a 6 point match was found to produce two possible suspects, one of whom had been charged despite living at the other end of the country and had to be acquitted who when it was appreciated that the DNA matched a second person.
End Quote
So part of the judiciary at least is aware of duplicate pairs of profiles in the NDNAD. In normal circumstances you would expect a complete monograph at least on an otherwise throw-away line.

From Hansard [ Record of proceedings of the Houses of Parliament ] 08 April 2003
Quote
Mr. Bob Ainsworth: The total number of profiles held on the [ NDNAD ] database at 25 March 2003 was 2,094,858. The Forensic Science Service calculate that these profiles relate to an estimated 1,886,000 different individuals.
End Quote
So only 2 weeks to access, process and relay this basic data. It would be nice to see how many of the 2,094,858 are repeat DNA profiles, broken down into:
a) repeated acquiring of profile from same person at different times, within the bounds of clerical error for personal details.
b) same profile, markedly different personal details but same dermal fingerprint data within error bounds.
c) same profile, markedly different personal details, different dermal fingerprints.
But that would destroy public confidence - so it will never be done.
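The arithmetic behind the Hansard figures, and the three-way breakdown asked for above, can be sketched as follows. The record fields and the exact-equality comparisons are hypothetical simplifications (real records would need a tolerance for clerical error), and nothing here reflects how the NDNAD is actually structured.

```python
from dataclasses import dataclass

TOTAL_PROFILES = 2_094_858          # Hansard figure for 25 March 2003
ESTIMATED_INDIVIDUALS = 1_886_000   # FSS estimate quoted in Hansard

# Profiles over and above one per estimated individual.
surplus_profiles = TOTAL_PROFILES - ESTIMATED_INDIVIDUALS

@dataclass
class Record:
    """Hypothetical database entry; field names are illustrative only."""
    profile: str
    personal_details: str
    fingerprint_id: str

def classify_repeat(a, b):
    """Place a pair of records into category a, b or c from the list above."""
    if a.profile != b.profile:
        return "not a repeat"
    if a.personal_details == b.personal_details:
        return "a"   # same person sampled more than once
    if a.fingerprint_id == b.fingerprint_id:
        return "b"   # different details, same fingerprints: an alias
    return "c"       # different details and fingerprints: two individuals

print(surplus_profiles)
```

On the Hansard numbers the surplus is 208,858 profiles; the question raised in the post is how that surplus divides between the three categories.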

Dear Wally L and others - I have only posted, to the best of my recollection, once previously to this group. I am definitely out of my league and not a qualified expert. However, Mr. Lind, I find your attitude towards other posters and your conduct outrageous, insufferable and inexcusable. DNA is NOT so widely accepted any longer without corroboration.

My previous post was about the subjective versus the objective approaches to both DNA and fingerprints. You agreed on the fingerprint aspect but as I recall, you disagreed on the DNA aspect. You made reference to the FBI previously comparing a match in the trillions whereby now they just say "it's a match".

It is my understanding and belief founded by the journals I have read that there is a great difference between a DNA match and a DNA profile. In the case of most major law enforcement agencies, a comparison is made to the profile, not the match. There have been hundreds of incorrectly arrested individuals based on DNA who have had to be released because of DNA.

In a time in our life when science has almost perfected forensic evidence, I find that the technicians are less concerned, in too much of a hurry and maintain a cavalier "who cares" attitude about much of the processing itself. Whether it is inadequate testing of DNA or an improper description of the year, make & model of a car, I find that liberty and freedom in this great nation of ours is far too precious to allow those in "position" to have a cavalier attitude. Perhaps it is time that the threshold be elevated for what constitutes "probable cause" for an arrest.

Although it is my understanding and belief that the conduct is more prevalent in America, other countries such as the UK are much more willing to share the truth and expose the inadequacies. Thus the release of articles such as the ones I've included below. In the USofA I have found that although we are very willing to admit that wrongly convicted innocent people have been released because of the results of DNA testing which was not available at the time of their conviction and we make certain that the news hits front page, we are not admitting publicly too often about people who have been wrongly arrested based on DNA evidence.

The Court of Appeal in the UK said that DNA statistics were found to be insufficient to convict without corroboration in a decision they returned in October, 2000.

In most criminal prosecutions where DNA evidence is utilized, the evidence serves to corroborate, in a powerful manner, other circumstances pointing to the guilt of the accused. But should DNA evidence alone be sufficient to convict when there is no corroborative evidence, except of the most generalized and non specific nature? Some high courts have said "NO".

DNA mystery in murder probe
27 May 2003 GOETTINGEN - German justice officials investigating a murder six years ago are faced with a baffling problem after a DNA sample appeared to confirm the killer. The sample prosecutors found in connection with the murder fitted the DNA profile of a 40-year-old man. But their sole suspect had the perfect alibi - he was in jail at the time.

"This is a very mysterious affair," admitted Hanover public prosecutor Thomas Klinge. The September 1997 murder of a 61-year-old woman, whose body was left on a playground in Hanover after she had been beaten about the head with a stone, had baffled police for several years. But last year specialists achieved a breakthrough when they discovered small traces of DNA material on the victim's bicycle. A check of the BKA federal police department's DNA databank confirmed it matched the profile of a suspect with a previous record of violence and sexual offences. However, the man has been held at a high-security unit at Goettingen's closed mental health hospital since the middle of 1997. Officials at the unit have confirmed that it is absolutely secure. Director Gunter Heinz said he was "100 per cent certain" the suspect could not have left and returned. Klinge said there could be no doubt about the accuracy of the DNA sample which had been tested by several institutes. Neither was there any reason to believe the evidence had somehow appeared on the bicycle after the crime. But he added: "The alibi appears to be absolutely reliable, and we have no knowledge the man has an identical twin brother."

Cleared murder accused victim of DNA blunder
Mar 10 2003
By Chris Johnson Daily Post Correspondent
A MAN arrested on DNA evidence has been cleared of murdering an Italian woman after police admitted he was the victim of an Interpol blunder. Bartender Peter Hamkin, 23, was arrested in Merseyside on an extradition warrant and hauled up before Bow Street Magistrates' Court in London. His DNA fingerprint was said to be a perfect match for the man who shot Annalisa Vincenti in Tuscany in August 2002. The killer had escaped in a Rover car and Mr Hamkin, from Litherland, was said to match the e-fit produced with the help of witnesses by the Italian Carabinieri investigators. Police arrested him while he was pulling pints behind the bar at Buckley's Pub, at Litherland, when it was crowded with regulars. Mr Hamkin protested his innocence from the outset and swore that he had never even been to Italy, let alone murdered anyone. Following a 20-day ordeal he has been informed that the Metropolitan Police have now ruled him out following a second DNA test. Mr Hamkin said: "This has been the worst three weeks of my life. I've not been able to sleep or think straight with this nightmare hanging over me. "I've been a prisoner in my own home, constantly on edge thinking the Italian police were going to arrive to take me away. "You hear about miscarriages of justice and innocent men have been hanged for murders they did not commit. "They picked me out of millions of men in Europe. "I had dozens of alibi witnesses but as far as they were concerned I was guilty because the DNA said so. "I told the police who arrested me that I have never been to Italy and I could prove it but they just did not listen. "They marched me off and threw me in a cell overnight and then stuck me in a van for eight hours to take me to London. "I felt like I was trapped in some kind of crazy film and told myself this could not be happening to me."

Mr Hamkin's DNA profile was kept on record by police after he was convicted of drink driving in 2001. He added: "It was a stupid thing to do but I never thought it would land me in court accused of murder." At Bow Street, Mr Hamkin had to find a £10,000 surety before magistrates agreed to grant him bail when he faced an extradition warrant on February 16. Mr Hamkin, who has a two-year-old daughter Alicia, is consulting his lawyer with a view to suing police for wrongful arrest and false imprisonment. His solicitor, Rex Makin, said: "The police say that a 'more refined result' from a second DNA sample that shows that he is not a match for the Italian killer. "I suspect that the original DNA sample was bungled and that has resulted in this terrible experience for Mr Hamkin. "It begs a lot of questions about the procedures surrounding the routine sampling of DNA and the conduct of police in accepting DNA file evidence." Mr Hamkin must attend Bow Street on March 25 when the extradition proceedings will be formally dropped.

Disabled man turns down payout offer
Invalid Raymond Easton has been offered £2,000 compensation by police after he was arrested for a burglary he did not commit. Parkinson's disease sufferer Mr Easton was arrested on the basis of DNA evidence which later proved false. He says the amount is an insult and has instructed his solicitor to ask for at least £500 more. He said: "It is not the amount, it is the principle of the thing and what they are offering is not enough for what I have gone through." Mr Easton, 49, of Pound Lane, Pinehurst, was arrested last year by Swindon Police for a burglary which he was supposed to have committed 200 miles away in Bolton. Despite not being able to dress or bath himself and being unable to walk more than 10 yards unassisted, Mr Easton was arrested in April and charged with a burglary during which electrical equipment worth £440 was taken. Police matched the DNA found at the crime scene to Mr Easton's, which was on file from a domestic incident four years ago that resulted in a caution. He was told the chances of it being wrong were one in 37 million. Despite this evidence against him, Mr Easton was adamant that on the day the burglary was committed he was at home, looking after his unwell 16-year-old daughter Xaena. After being arrested by Swindon police on behalf of Greater Manchester police he was kept in a cell from 9am to 4pm. Mr Easton's solicitor demanded another DNA test, which was more accurate than the first and led to the case being dropped. A spokesman for Greater Manchester Police would only confirm the force does have a compensation claim being made against it and is waiting to hear from Mr Easton.

Sue

I was amazed when I read the following piece that an American had written as an aside on a Usenet golfing usergroup.
Quote
"Making your own (golf) clubs is a lot of fun and can add a lot to your enjoyment of the game. It does not require any sort of high tech qualifications or excess of expertise. My colleagues think it is a big deal that I can fix my thermal cycler (PCR machine) and automated DNA sequencer.....it's not. They are simple electrical/mechanical devices and pulling a board or replacing a pump is no different than doing the same for a cheap radio or a dishwasher."
End Quote
You don't tinker with this sort of expensive and sensitive machine, however well intentioned. Just moving a wiring loom could upset the calibration. I won't identify that individual beyond saying that he probably works in a Mississippi lab.

There is the ongoing fiasco of the Houston crime lab's leaking roof and numerous other problems with other USA crime labs (none yet reported from Mississippi, though). As far as I know the profile databases in the USA are fragmented to state or county level, with no 'federal' combined database. False matches only start turning up with large databases, so they will occur in country-wide databases first. To balance your observation concerning Europe being more open about reporting false arrests due to DNA 'evidence': I find it a bit suspicious that I've not read any similar reports concerning systemic/procedural errors in UK forensic labs, considering I have written evidence of a dyslexic employed within the main Birmingham Forensic Science Service site.

I apologize if my views offend you. That said, a "profile" is the result of a DNA analysis procedure on a known or unknown sample. A "match" is the comparison of a known and a crime scene or other profile. There must be thousands of unresolved profiles in databases that report to CODIS, because no suspect profile has been submitted to compare them to. Is that what you meant? You are flat wrong about the acceptance of DNA. It always did have to fit the circumstances of the case. If the suspect's DNA is identified in the vagina of a rape victim, you still have to prove that the incident was rape. DNA is not only accepted, it is expected in today's courtroom. Due to the popularizing of it on TV, nearly everyone knows about it. And, like fingerprints, if you don't have DNA in your case presentation, you almost feel like explaining to the judge or jury why not. Is that what you meant? As for freedom, many innocent people have been freed from prison through DNA analysis. The people who have been wrongly convicted with the aid of DNA evidence, as in the Houston PD case, have suffered this fate because of incompetence or criminal activity, not because there is anything wrong with DNA analysis. Any technology can be screwed up. Lab accreditation and proper supervision are what is needed. Wally L

What is this guy talking about? What unsolved DNA "profiles"? Outside of twins there aren't 5 people who have the same DNA in the 120,000 years modern man has been around. It is theoretically possible in the same way that it's theoretically possible that this guy might catch a ride to Alpha Centauri on the next UFO. Even taking the 13 loci STR profile, which the FBI says has 9 1/2 trillion to 1 stats, a profile is unique among all the hominids that have ever lived. DNA is widely accepted, because it's common sense to an educated public. Wally L

Reply
I have recourse to evidence, not abuse.

The FSI article referred back to data as of 24 October 1966 and concerned just 6884 DNA profiles. Within those profiles were 11 matches. Of those, how many were cases of aliases and how many were 2 unique individuals was not resolved then and is still not resolved, although the database now holds over 1.8 million such profiles. The latest figure I have for these unresolved matches was 300 but I don't know when that interrogation of the NDNAD was made. Don't blame me if the controllers of these databases deliberately don't resolve these matches.

No, the unresolved matches refer purely and solely to the section of the NDNAD containing the profiles of arrestees (plus some others such as crime scene examiners), not the database of profiles derived from crime scenes - any matches in that database are usually just the same criminal detected on different occasions at different crimes. Because of the large numbers involved in such a database (1.8m) and the statistics relating to large numbers, there are hundreds or maybe now thousands of pairs of profiles on this part of the NDNAD that are from TWO or more SEPARATE unrelated individuals. It is confused because there must also be matches recorded from the same individuals using an alias at different times of arrest. Statistics cannot help in quantifying what proportion of the "unresolved matched pairs" are alias cases.

So, let me get this straight...

If two people match exactly in the DNA database, you would interrogate the dermal fingerprint database to see if two aliases match. End of story. The identical profile is simply a duplicate.

If, however, alias #1 doesn't match dermal fingerprints to alias #2... A HA!! DNA goes by the way of the dinosaur.

What if alias #2 never had a dermal print taken (only DNA)? Only you and OJ Simpson would still be searching for the real killer (on a golf course, as the joke goes). Furthermore, are you not now, in order to discredit the current method of DNA profiling, requiring an enormous dermal fingerprint database to show beyond a reasonable doubt that 2 individuals share the same DNA profile? The requirement of a global dermal fingerprint database to disprove DNA is abhorrent and alien to me.

Besides, don't we all know that the "average Joe" is only +/- two loops and whorls from everyone else in the world??

For the cross-correlation, replace the dermal fingerprint database with interrogation of the mugshot PNC database if you prefer (I just considered dermal more rigorous). But my point is that no one is bothered about these unresolved matches, which is utterly astonishing.

I really don't want to get dragged into a long debate here, but I think you completely misunderstand the Gafoor case. There really is no doubt that Gafoor is the real murderer. The DNA evidence is absolutely compelling, so compelling Gafoor tried to kill himself and admitted his guilt on his way to hospital, not just to police, but to ambulance staff. He has consistently maintained his guilt ever since his arrest. He even apologised to Lynette's family and the original defendants through his QC. He accepts responsibility for Lynette's murder. I can't see why you seem unable to accept the evidence and his guilty plea. If you think that the evidence is a ten allele profile then I'm afraid you are showing your ignorance of the facts; it is a ten loci profile with 20 alleles. The evidence in this case is not the 12 allele partial profile of the boy, it is the full profile of Jeffrey Gafoor. The 12 allele partial hit of the boy did nothing more than narrow the search to relatives of the boy. Your number crunching is absurd.

Do you seriously believe that a twelve allele hit would ensnare 20 million people? If so please explain how instead of narrowing down the search to a third of the population it narrowed it down to one family within which there was one person and only one person whose profile was a perfect 20 allele match to crime scene samples.

The use of his nephew's profile was to narrow down possible lines of enquiry. What on earth is wrong with that? The match was between Gafoor's own profile and several crime scene samples, both from samples discovered at the time of the murder and ones discovered during the re-investigation. There is a crucial difference between Kappen and Gafoor. Gafoor has faced trial and been found guilty beyond reasonable doubt. Kappen is deceased and will never face trial and as such is entitled to be presumed innocent. Being entitled to be presumed innocent does not mean he is innocent; it means he has not been proved guilty and never will be. What do you suggest police should do in the Kappen case - ignore the DNA evidence altogether? And while we are at it should they ignore the DNA evidence in the Gafoor case as well. And before pointing to the other evidence remember that it was the DNA evidence that resulted in the other evidence being obtained. Without it this case could not have been solved.

I don't know where you get your statistic from, but I was in court for Gafoor's guilty plea. The statistic quoted was 1:1 billion. I know that for a fact. The nephew's profile was a 12 allele out of 20 match to the crime scene profiles. Gafoor's was a full 20 allele match. You seem most concerned with protecting civil rights, but are strangely silent on the fact that the evidence has already proved Gafoor's guilt. It is overwhelming and in the absence of a full database such innovative work was the only means of solving it. Gafoor had the opportunity to challenge this evidence. He pleaded guilty instead. I don't understand what your problem is regarding the police's work in this re-investigation. For the record the relatives of the boy were asked to give voluntary swabs, which they did including Gafoor. The only person arrested over this was Gafoor AFTER the results indicated a perfect match to the crime scene profiles, because at that stage it WAS reasonable to seek an explanation from him. Had police waited Gafoor's suicide attempt would have succeeded and then no doubt you would have gleefully raised the similarity with the Kappen case. Then it would have been similar. Now there are none at all. They intended to keep him under surveillance before arrest. His suicide attempt prevented that. What do you suggest they should have done?

I notice that in your concern to avoid a miscarriage of justice you completely ignore the miscarriage of justice that did occur in this case. It was not suffered by Mr Gafoor; it was suffered by the original defendants, who were cleared by the DNA testing you condemn. The conviction of Gafoor has silenced the whispering campaign against the original defendants once and for all. Save your concern for the five men who were wrongly arrested, three of whom were wrongfully convicted, and for Lynette's family, who endured a living hell for 15 years as a result of that. They deserve your sympathy and concern. Gafoor does not.

Having worked on this case for 12 years, both to prove the innocence of the Cardiff Three and to get justice for Lynette by finding her real murderer, I can tell you that your concern over Gafoor is misplaced. Police did nothing wrong in that investigation. I wish I could say the same about the original investigation. Sadly I can't. I suggest that you take a good look at what happened in the original case and what happened in the latest one before suggesting that there may have been a miscarriage of justice. As one who knows more than most about this case, let me assure you the evidence, not just the DNA, against Gafoor is absolutely compelling. He is not the victim of a miscarriage of justice and nobody's civil rights were violated. The boy was never arrested in connection with Lynette's murder, nor was anyone who gave buccal swabs other than Jeffrey Gafoor, because Gafoor's was the only profile that matched those obtained from the crime scene.

By the way, what do you mean by Gafoor and the victim were related, so evidence of inclusion anyway? This does not make sense. They were not relatives and prior to the DNA evidence Gafoor had not occurred as a suspect. But for the innovative narrowing down of profiles of interest he never would have either. But before you claim victory remember he admits that he killed Lynette and has done so at every opportunity since confronted with the DNA evidence. Without the innovative police work he never would have and this murder would never have been solved and five entirely innocent men would continue to endure a thoroughly unjustified whispering campaign. That is the injustice of this case, not what happened to Gafoor. Please remember that but for the excellent work by the police in this case Jeffrey Gafoor would never have admitted his guilt. He had 15 years to come forward voluntarily. He failed to do so. He had no intention of ever doing so. He even watched men he knew to be innocent go to prison for his crime. Their lives and those of their families and those of Lynette's family have been wrecked by Gafoor. Society is entitled to a very long rest from him.
Satish
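The distinction drawn in the post above between a partial hit (12 of 20 alleles) and a full 20-allele match over ten loci can be illustrated with a small counting sketch. The locus names and allele values below are invented for illustration and are not taken from any real profile or test kit; the function name is likewise hypothetical, and only three loci are shown to keep the example short.

```python
from collections import Counter

def shared_alleles(profile_a, profile_b):
    """Count alleles shared between two profiles, locus by locus.

    Each profile maps a locus name to a pair of alleles. A homozygous
    pair such as (9, 9) only counts twice if the other profile also
    carries that allele twice.
    """
    total = 0
    for locus in profile_a.keys() & profile_b.keys():
        overlap = Counter(profile_a[locus]) & Counter(profile_b[locus])
        total += sum(overlap.values())
    return total

# Invented example data: three loci, so a full match is 6 alleles.
scene = {"locus1": (14, 16), "locus2": (9, 9), "locus3": (21, 23)}
relative = {"locus1": (14, 15), "locus2": (9, 11), "locus3": (20, 22)}

print(shared_alleles(scene, relative))  # partial hit
print(shared_alleles(scene, scene))     # full match
```

With ten loci the same count would run from 0 to 20; a partial hit of the kind discussed in the thread is a score well short of the maximum, used to suggest a line of enquiry rather than to identify anyone.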

Commiserations for your email server problem [His server was firing off multiples of the same email]. A year ago, as part of an 80-person Cc group, I was on the receiving end, like all 80, of the same repeated original email, twice an hour, continuously for a week. A simple filter blocked the immediate problem at my end, but it seems the problem was a server in a small analytical laboratory that had closed for its annual holiday. The intended recipient there had a full mailbox and an autoredirect to someone else on the same server, who also had a redirect back to the first person.

Anyway, back to Gafoor. So the ends justify the means, do they? I'm quite confident Gafoor is the guilty party; that is not the problem. I was relying on memory that he was somehow related to the victim, so I was wrong there. Up until his confession (better to ambulance staff than the notorious jailbird confessions) he was just the victim of coincidence of SoCo sample profile matching his own set of numbers. I would be grateful if you could tell me what the accepted generic name is for what I call sub-set trawling. If subsets linking a target to nephews are included, then 3 million NDNAD profiles would amply make the whole UK population ensnareable. How many also-rans from the database trawl, other than the nephew, had a 10 or more allele 'match' with the SoCo sample(s)?

If you must rely on DNA profiles only, with no other corroboration, then at least repeat using Powerplex or mitochondrial (if the SoCo sample is amenable) or some other set of loci tests on both samples, is what I suggest, but it still leaves the attribution problem (below).

Now concerning the vindicated 3 and other miscarriages of justice, I have no problem with the use of DNA profiles as evidence of exclusion. It is with evidence of inclusion, in a system that is perceived to carry the concept of uniqueness when it does not, that the problems start. Why in a DNA database trawl do they always arrest the first person who matches? It just so happens that no SoCo sample has matched one of the unresolved pairs in the NDNAD. Similarly, for identifying dead bodies there is no problem (other than lab cross-contamination/errors): the DNA sample came from that dead body, with no possibility of misattribution, as is always a possibility with SoCo samples, i.e. planted to implicate (say). Are SoCOs trained to determine the difference between a glove (say) left at the crime scene by a burglar and worn by the burglar, as distinct from a burglar dropping a glove previously owned by his enemy or some totally innocent person? I think not. Without independent corroboration DNA profiles can be worse than worthless.

I still don't understand what your problem with the Gafoor case is. It is self evident that there will be more partial hits at fewer loci. The point in the Gafoor case is that the searching of the database for some of the alleles allowed numerous people to be eliminated. It really is quite simple. The similarity of crime scene profiles to that of the boy indicated that his male relatives were people whose DNA profiles needed checking and elimination. All of them including Gafoor voluntarily gave their buccal swabs.

I have absolutely no problem with the police work that happened in this case. The trawling as you put it helped the police to generate a line of enquiry and eliminate many other lines of enquiry. I have no problem with it at all in this case. The problem in your world is that without this innovative police work Gafoor would never have emerged as a suspect. I really do not understand what your problem is. The boy was never a suspect. There were other partial hits. So what? I repeat the partial hit to the boy was only used to narrow down the potential search. It developed a line of enquiry that resulted in the conviction of a particularly brutal murderer. The evidence in this case is Gafoor's profile, which he gave voluntarily. He has admitted his guilt to the ambulance staff; he volunteered it; it was not solicited. He admitted it to police in formal interviews; he admitted it to his lawyers and instructed them that he wanted to plead guilty and he admitted it in court. In short from the moment he realised that police had caught up with him he admitted his crime. He attempted to fabricate an explanation for the finding of his DNA at the crime scene. He knew his DNA was there. He tried to claim that he had sex with Lynette a week before her murder and asked if it was semen. The question wasn't answered, but it alerted police that they had almost certainly found their man as the DNA obtained from the flat had come from extensive bloodstaining. His DNA was found on several samples, both ones discovered during the original enquiry and in the recent re-investigation. His blood lay beside the victim's on several samples. There is no way that this could be explained innocently. Once Gafoor was confronted with this evidence he admitted his guilt. Please note that his admission to the ambulance staff came after a serious suicide attempt. He told them: "Just for the record I did kill Lynette White. I sincerely hope to die."
This was evidence and compelling evidence at that.

Now to return to your question, what the police did in relation to the National DNA Database was nothing short of excellent police work. It narrowed down possible lines of enquiry. You seem to completely miss the fact that its sole purpose was to narrow down lines of enquiry; it was not and was never intended to be evidence in its own right. The evidence was Gafoor's own profile. Please get your facts straight, Gafoor's profile was not on the National DNA Database at that time, nor would it have been. The boy's was and was considered similar enough to the crime scene profiles to suggest that male members of the boy's family needed to be conclusively linked to those profiles or eliminated. Had the police tested all the relatives of the other partial hits they would have been eliminated as the only 20 allele match to crime scene profiles was that of Gafoor himself.

To sum up, I not only have no problem with the police's innovative search of the National DNA Database for partial alleles, I tip my hat to them for doing so. They did an excellent job and I hope that other forces follow suit. I notice you miss the wider picture here. Innocent people have been cleared by DNA testing in a number of cases. They can instruct their lawyers to investigate such possibilities in their own cases. After all, what better way to prove your innocence than to be able to say, Mr or Miss X is the real murderer.

SGM+ tests ARE corroborated. Samples are tested and a corroboration test is also conducted. Results are reported if both results corroborate each other. Please get your facts straight. Mitochondrial DNA is useful in cases of hair shafts or bone; it is ludicrous to attempt on blood as it uses more DNA for less discrimination. I take it you are aware that conventional blood grouping corroborated the results. In the original case there was good blood grouping evidence and very poor quality DNA results. As DNA testing and amplification techniques improved the quality of the DNA evidence improved to the point that it became overwhelming. The innovative use of the Database enabled this case to be solved as it developed THE line of enquiry that unmasked the murderer. It is a technique that can be used in other cases, both to help convict the guilty and potentially to clear the innocent. It is NOT evidence, it is an INVESTIGATIVE tool, and a damn good one too. Have you ever thought that the reason the crime scene profiles did not match any profile on the database was because the donor of the crime scene samples, Jeffrey Gafoor, was a person whose DNA profile was not on the Database, so there never would be a full match on the Database? If you want independent corroboration of DNA, what do you define as corroboration? And why are you complaining about this case? There was corroboration of the DNA - compelling corroboration of compelling DNA.

Once again I have no problem with what the police did in the second investigation of this crime. It is not the end justifying the means; it is pure and simple excellent police work. Please understand that there is a difference between the investigative process (the search of the Database) and the evidence presented in court (Gafoor's profile, admissions and guilty plea). The DNA evidence in this case was corroborated. Call the next case!

Sorry, I should have added that your point regarding the ensnaring of 3m based on 10 alleles is fundamentally wrong. The evidence of inclusion that resulted in conviction was a 20 allele profile. Nobody could be ensnared without a 20 allele match, so unless you are claiming that 3 million people coincidentally share Gafoor's profile, 3m people in the UK most certainly are not ensnareable.

I am not sure of the generic term for what was done, but call it subset or allele trawling if you like. I prefer to call it innovative investigation of particular alleles. While there were 12 allele partial hits the relevance of the boy's profile was the particular ones that the partial hits were obtained at and the similarity to crime scene samples - similar enough to arouse interest, but nowhere near similar enough to suggest that the boy had a case to answer. In fact the boy's alibi is compelling as he hadn't even been born when Lynette was murdered! But as I have said before the boy's profile merely alerted police to the possibility that a male relative was likely to be the killer. They still had much investigative work to do. Gafoor established himself as the prime suspect as a result of his blatant attempt to fabricate an explanation for the discovery of his DNA at the crime scene. That was prior to him giving a voluntary buccal swab. As a result of that lie he was put under surveillance and set about obtaining paracetamol. He took a massive overdose, hoping to die. The surveillance resulted in his life being saved when police knocked his door down once the DNA profiles obtained from his swabs were shown to match the DNA profiles obtained from the crime scene. And I repeat NOBODY was ever going to be arrested and charged over this case without a full 20 allele match - 12 simply would not cut it, so the number of matches at 12 alleles is irrelevant, except in eliminating those not of interest and narrowing down those who were of interest. I still say, this was excellent police work. I say that as someone who was scathing of the failures in the original investigation. The reinvestigation was a different story - an example of exemplary police work.

Best Wishes
Satish

One fly in the ointment for the Gafoor technique / sub-set trawling is the number of children not genetically fathered by the person credited with the fathering. See the study by Elliot Elias Philipp, which found that a minimum of 30% could not have been the father. This came from a random population being researched for medical/genetic reasons, and this astonishing statistic was thrown up as a side matter. There is no reason to assume it is not generally applicable. This study was from the 1950s and we've had the 60s etc. since. Source: Law and Ethics of A.I.D and Embryo Transfer, Ciba Foundation Symposium 1973, pub. Elsevier-Excerpta Medica, N. Holland.

I am not opposed to corroboration of DNA evidence, but remember there was corroboration here, and corroboration is only required in the courts, not during the investigative process. The DNA evidence in Gafoor's case established that he had a case to answer. The evidence as a whole proved his guilt, none of which would have been obtained without the DNA evidence establishing that Gafoor had some questions to answer.

Dear Mr. Lind - Since I said, "DNA is NOT so widely accepted any longer without corroboration", and then you said, "It IS accepted without corroboration", it is very obvious that it is a futile effort to attempt to share anything with you because you are too closed minded.

Obviously you did not read the Supreme Court decision from the UK which was included in the post. Sorry but the world consists of more than just the U S of A.
Sue S

Two problems with your post as I see it. First of all there is a national database in the US that combines almost all of the state databases (I think one or two states haven't joined yet). It currently has well over a million profiles in it.

Secondly, I have some concerns about your statements from your Mississippi source. I highly doubt that the person was working in a forensic laboratory, but probably in a biotech lab. Even if he/she did work in a crime lab, I doubt the tinkering would cause any significant problems, since the thermal cycler would be calibrated before use.

I've not really studied the American situation so I stand corrected on that. The golfing profiler could well be in an academic environment I only researched to a possible Mississippi context and no further as to his identity.

Returning to the main thread this is a secondary reference to the 2000 UK appeal court decision concerning admissibility of DNA profiles without independent corroboration that someone in the thread alluded to. http://www.forensic-evidence.com/site/EVID/DNA_Watters.html I'm trying to find a more robust primary reference.

Note the following passage from that decision
Quote
The other evidence results from more stringent tests that have been done on the DNA material that was available in this case. That is partly as a result of a case in which a 6 point match was found to produce two possible suspects, one of whom had been charged despite living at the other end of the country and had to be acquitted when it was appreciated that the DNA matched a second person.
End Quote
This is another documented example of 'unresolved duplicate pairs', resolved in this particular case and not due to the use of aliases. It harks back to the 6 loci database situation when there were perhaps fewer than 100,000 profiles. Balancing the increase from 6 to 10 loci is the increase of the database up to 2 million profiles, so there are probably as many unrelated matches occurring each year now as in the 6 loci days. By the way I don't like the term 'duplicate pairs' as it might suggest 4.

Firstly I would prefer that 'sub-set trawling' was named after the person who first promulgated the concept, with the name of the first investigator. Secondly I am amazed at the amount of man hours that must have been spent to track down Gafoor when no one will put in a few man hours to do a procedurally very easy 2 database cross-correlation to resolve these unresolved matches in the NDNAD.

Perhaps if I summarise how I perceive 'sub-set trawling' works, just in case I've got it totally wrong. It starts with a 10 loci, 20 allele scene of crime profile that has no match on the NDNAD. This consists of, say, A1,A2; B1,B2; ... J1,J2 plus X,Y, and then 1024 permutations taking 10 at a time. Now an interrogation of the NDNAD for 'partial matches' probably returns, and I'm totally guessing here, something like a few 16, 17 or 18 allele possibles, say 4 or 5 for 15, 10 or 12 for 14, 50 or 60 for 13, a hundred or so for 12, thousands for 11 and tens of thousands for 10. Now comes family tree investigation (an easy but time-consuming GRO records trawl for other than your own family - I know, I've done it myself), starting with the 15 or more hits and going up the tree and across to siblings. Then going to each family and requesting samples and family tree info. If their volunteered family history does not match the externally derived tree then there is immediate suspicion. If the volunteered samples diverge from the SoCo profile then go on to the next sub-set 'hit'. Until eventually the nephew comes under investigation and sampling up and across the tree becomes convergent on Gafoor in this case. A 10 loci/20 allele match, so start criminal proceedings, but he is still only a suspect at that stage.
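As a rough illustration of where the figure of 1024 could come from, here is a minimal sketch that assumes the sub-sets are formed by taking exactly one allele from each of the 10 loci (2 choices at each locus, so 2**10). That reading is an assumption; the allele labels and the helper name one_allele_per_locus are hypothetical, and a locus where both alleles are the same would produce repeated sub-sets.

```python
from itertools import product

# Hypothetical 10-locus profile: two distinct alleles per locus (labels are illustrative only).
LOCI = "ABCDEFGHIJ"
profile = {locus: (f"{locus}1", f"{locus}2") for locus in LOCI}


def one_allele_per_locus(profile):
    """Yield every 10-allele sub-set that takes exactly one allele from each locus."""
    loci = sorted(profile)
    for choice in product(*(profile[locus] for locus in loci)):
        yield dict(zip(loci, choice))


subsets = list(one_allele_per_locus(profile))
print(len(subsets))  # 2**10 = 1024
```

Choosing any 10 of the 20 alleles without the one-per-locus restriction would give a much larger count, so the 1024 figure only fits the restricted reading.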

My problem with such trawling goes back to when I received my own profile under Data Protection Act - Subject Access. I am no criminal, with no criminal record, but my profile is on the NDNAD just like crime-scene examiners, many police etc. Although I am of a scientific background, the returned form showing words like locus, Amelogenin, D21S11 etc. meant absolutely nothing to me at that time. But at the top of the table were apparently two columns labelled X and Y. I misinterpreted this as meaning the numbers under the X came from the X chromosome and those under the Y from the Y chromosome. In the sort of tables I normally come across, that is how they are structured. Totally wrong, bad layout, I now know, but at the time I was gob-smacked in that should a relative of mine be a villain (known or not known) then I could be in the situation, in effect, of 'grassing him up'. Now I find with 'sub-set trawling' I was not so far off the mark. The only difference is the computer crunching and investigation time required to do it. But in 10 years' time, with powerful computers and surname/DNA profile data, by that time demanded from the genealogical community, cross-correlated to the NDNAD etc. - who knows. See what's going on in Iceland, supposedly for genetic/medical reasons. Or, to put it another way, how does the Gafoor nephew feel about all this and how does the rest of his family behave to the nephew these days? Whatever crime one does, one is punished and that should be the end of it, not this atrocious Damoclean Sword hanging over you for the rest of your life and death, the lives of your parents, the lives of your children, children's children ad infinitum. Previously a criminal record related to yourself and you alone, other than maybe some stigma wrt neighbours or family, say; nothing further could be inferred from name, DOB, mugshot, IC1-5, facial features and friction ridge data - totally different now.

Call it sub-set trawling if you want. It makes no difference. Why are you amazed that the police did such a good job involving so many hours. They were determined to solve this case. When you consider what happened before - an easily preventable glaring miscarriage of justice - it is not at all surprising. They want to restore public confidence in them. Very few people believed that the real murderer could be caught. At times I was alone in believing it could and should be done. They saw that this case was solvable and they knew that they would be under great scrutiny this time round. Not only did it have to be done right it had to be seen to be done correctly.

Many victims of miscarriages of justice want the real perpetrators to be caught and punished. The Gafoor case has already caused many victims of miscarriages of justice to give the police a chance to follow the example set in the Gafoor case. This is precisely what the police hoped for. It explains the man hours invested in this case - resources which were a fraction of those wasted in the original investigation.

If cross checking the National DNA Database for unresolved matches means so much to you, I suggest that you take it up with the Forensic Science Service who are the custodian of the database and raise it with the Bar Council, Liberty, etc. to take it up with the government. By the way what evidence do you have that there are unresolved matches on the database and what do you mean by that term? I know that there was a case of a match being called on the six loci test which was later proved wrong, but you really ought to credit the fact that it was the ten loci test that exposed this error. To the best of my knowledge there is no similar case involving SGM+. You seem to want it both ways - the only way to categorically prove that DNA is unique or disprove it would be to have the DNA profiles of every single human who ever lived on a Database, which would then be searched for matches. If one is found DNA would be proven not to be unique. If not it would be. Now to the difficult part; to achieve that you would require a complete DNA Database, not just nationally but globally. Would you support that? If not how do you propose to test your hypothesis that Adam, Wally and others are wrong to claim that CODIS offers unique identification of individuals? You can't just assert that there are unresolved matches on the database as a fact without proof. In the Gafoor case full 20 allele profiles were obtained from the crime scene. These were obtained from several samples. There was really no doubt that these contained the DNA of the murderer of Lynette White and that he was her real murderer. I cannot conceive of an innocent explanation of the position and amount of his DNA discovered at the crime scene. That was in January 2002.

Police then checked the National DNA Database. There were no direct hits. It was therefore clear that the murderer's DNA profile was not on the database at that time. At my request, the police had all 140 DNA databases throughout the world checked through Interpol. That shows how determined they were to solve this and that they were open to suggestion if they thought it could help. This was a marked change from the attitude of previous investigations.

Unfortunately there were no direct hits. It was pretty obvious that DNA offered the only realistic possibility of unmasking the real killer. After all the only other possibilities were the killer, overcome by remorse confessing, or a witness belatedly coming forward. Both were unlikely after 14 years. That left DNA.

There were only two options over the DNA: 1) wait for the killer to do something that allowed his DNA to be stored on the database, or 2) take the innovative approach of analysing components of the result. The second approach is what South Wales Police did. Considering what we now know of Gafoor's character this was the right approach to take.

It all started with DC Paul Williams noticing that one particular allele position was only occurring once in every hundred profiles checked. That alone narrowed down the search by 99%. He then expanded the search to eight alleles. This narrowed down the remaining 1% further. After that the search was expanded to 12 alleles. It is fair to say that there was more than one 12 allele partial hit on the database. Williams then narrowed the search to the South Wales area, believing, correctly as it turned out, that the murderer was a South Wales native. After both steps were done, with the 12 allele search confined to the South Wales area, the profile of the boy stood out.
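The staged narrowing described above can be sketched roughly as follows. This is only a toy model under stated assumptions: the locus names, allele numbers, regions, record layout and the functions shared_alleles and staged_search are all hypothetical, only three loci are used to keep it short, and nothing here reflects how the NDNAD software actually works.

```python
def shared_alleles(scene, candidate):
    """Count alleles shared locus by locus (0, 1 or 2 per locus)."""
    total = 0
    for locus, scene_pair in scene.items():
        remaining = list(candidate.get(locus, ()))
        for allele in scene_pair:
            if allele in remaining:
                remaining.remove(allele)  # each candidate allele may match only once
                total += 1
    return total


def staged_search(scene, database, rare, min_shared, region):
    """Keep records carrying the rare allele, sharing enough alleles, in the given region."""
    rare_locus, rare_allele = rare
    hits = []
    for record in database:
        if rare_allele not in record["profile"].get(rare_locus, ()):
            continue  # stage 1: the rare allele alone removes most profiles
        if shared_alleles(scene, record["profile"]) < min_shared:
            continue  # stage 2: require a minimum number of shared alleles
        if record["region"] != region:
            continue  # stage 3: restrict to one geographic area
        hits.append(record["id"])
    return hits


# Entirely made-up data for illustration.
scene = {"locus1": (14, 27), "locus2": (9, 11), "locus3": (16, 16)}
database = [
    {"id": "r1", "region": "South Wales", "profile": {"locus1": (14, 27), "locus2": (9, 12), "locus3": (16, 18)}},
    {"id": "r2", "region": "London", "profile": {"locus1": (15, 27), "locus2": (9, 11), "locus3": (16, 16)}},
    {"id": "r3", "region": "South Wales", "profile": {"locus1": (14, 15), "locus2": (9, 11), "locus3": (16, 16)}},
    {"id": "r4", "region": "South Wales", "profile": {"locus1": (27, 30), "locus2": (8, 10), "locus3": (17, 18)}},
]
print(staged_search(scene, database, rare=("locus1", 27), min_shared=4, region="South Wales"))
```

A hit from such a search is only a lead for further enquiry; as the post says, nothing short of a full match on a fresh sample would count as evidence.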

It was of course possible that this boy was not related to the murderer. At this point Williams' work had generated a very interesting line of enquiry that had to be checked. If after this was done, no 20 allele match was forthcoming there would have been no arrest and other partial matches would have been investigated. The investigation of the family tree as you put it was necessary. Family members could have refused to give samples if they wished. Would they have been suspected as a result - possibly, probably even, but without evidence they could not have been compelled to provide samples. Gafoor could have prevented his conviction by refusing to co-operate. Had he done so he would have been suspected, but there would have been no evidence to arrest him. It was the DNA evidence that led to the other evidence. The work of Williams was part of the investigative process. But for Williams and the DNA evidence a particularly brutal murderer would have been untouchable. You seem to object to the police doing this. I still can't understand why. There is no breach of civil rights here. The police did not frame Gafoor. They did not plant evidence. They merely identified him as the donor of crime scene DNA and nobody but him was the donor, a fact he concedes. I'm not sure what you mean by family tree investigation. There was no such thing. Police approached all male relatives of the boy and asked for buccal swabs. All such requests were successful. The point of the tests of family members was to exclude the innocent and to identify the donor, after which the family history, background, etc became important. The DNA testing was the essential first step in the evidence-gathering process.

As far as I know Gafoor's family are fine about it. They had been estranged for years. The co-operation police obtained from the boy's family suggests they had no objection and probably could not believe that one of their own could have been involved. They, like many others in this inquiry, co-operated because they had nothing to hide. Gafoor himself may have co-operated for the reason you suggest. He alone could not afford the suspicion or arrest then as he planned his final exit and needed to buy time for it, but his answers aroused suspicion and he was put under surveillance.

I still don't get your point. May I suggest that you should have more concern for the feelings of the Cardiff Five, Lynette's family and society than Gafoor. There was a very simple way for Gafoor to have remained at liberty. All he had to do was not kill Lynette in the first place. The use of DNA Databasing was not the sword of Damocles hanging over Gafoor's head; his DNA which proved his guilt was. In the case of Gafoor and his nephew, had either of them not committed crimes, none of this would have happened. Speaking for myself, if my DNA could have helped to solve this crime I would have no objections to giving it. I believe there is a need for debate over the precise safeguards needed for DNA Databasing, but the work of DC Williams was and remains outstanding and has no part in my concerns. I have no problem with it. Society is better off with Gafoor in prison for a very long time. The atrocious sword of Damocles in this case was not hanging over Gafoor's head; it was hanging over society's collective head. For all we knew Gafoor could have been a prolific serial killer. Thankfully he wasn't and it seems unlikely that he would have killed again. Prior to his capture nobody could have known that. DC Williams cut the Gordian Knot. More power to him for doing so. Best Wishes Satish

The unresolved matches are inherent to such a database and solely contained within such a database . NOT the false matches between a new SoCo profile and a profile already within the NDNAD or new arrestee and old SoCo profile.

The term 'unresolved match ' first emerged in the Journal Forensic Science International 95 (1998) p30. Concerning data in the UK DNA database as of 04 October 1996 when there were only 6311 samples from the London area and 573 from the Cardiff area. Direct quote from that article - pre national database.
"A small number of unresolved duplicate pairs of profiles were present in the regional data :10 pairs within the London region and 1 pair in Cardiff. The most common cause of duplicate entries is the use of aliases by suspects who have been arrested on several occasions. For administrative reasons ,it is not always possible to resolve such duplicates by exhaustive police investigation."
End Quote
I suspect it is more than 'administrative reasons' these matches have not been resolved. As I say it does not require police investigation, just the cross-correlation between 2 databases. If they want to investigate the aliases situation then that is another matter I have no interest in. Then more recently broadcast on 04 Dec 2002 was a documentary "DNA in the Dock" that had been made maybe a year or more previously referring to this same situation but by that time 300 "matches" in the UK NDNAD ascribed to mistakes ie retested people giving aliases but again not resolved. Even a sampling of 1 in 10 say of these matches being resolved would indicate how prevalent the aliasing versus unrelated individuals situation was. Copy of this broadcast at Livermoore Library but I cannot access it or find a transcript of it anywhere else.

A couple of weeks before parliamentary recess I got my MP to ask a written question of the FSS the most fundamentally simple initial question of how many of these damned matches are in the NDNAD. So far it has not turned up in the Hansard public access internet search facility.

The DNA uniqueness data is already there buried in the NDNAD it does not need any further profile taking studies. But someone at the top of the FSS is not disclosing it for political?/scientific? reasons - your guess is as good as mine. Not even researching one in ten (say) of these unresolved matches.

Thanks for more detail on the zeroing in on Gafoor; I was taking a general case. I did not realise there was a head start in one low-frequency allele, sub 1%. Every allele in my own personal profile is in excess of 9% frequency of occurrence. On the family tree side of things, for investigative expediency you have to balance things up: leads from the family very much speed up the GRO research, whereas doing it remotely, taking much longer, would be more pure. But doing it remotely for each particular family for more than 2 or 3 false hits would probably have stymied the investigation - it is very time consuming compared to verifying/extending leads direct from the families.

The Damoclean sword hangs over Gafoor yes but also everyone on the NDNAD (Peter Hamkin fashion) courtesy of s82 of the 2001 Criminal Justice & Police Act including ,no doubt,(not that they realise it) the families in the false hits in the process of zeroing in on Gafoor.

In response to some commentaries - Yes, we (USA) have a national DNA database, CODIS. All states except Mississippi and Rhode Island are participating. However, thinking that one million samples is a lot, look at it from another perspective.

Right here in Los Angeles County there are over 1.5 million warrants in the County system. CODIS is in its infancy and I believe headed for a lot of jerky crawling and toddling before the smooth walk.

Sincerely yours, Sue

Hi Sue, While DNA databasing is undoubtedly in its infancy and over one million people being on the CODIS database may seem a small sample size in the context of the population of the USA, it is bigger than it may seem.

The Gafoor case in Britain has shown that criminals can be identified from family members' DNA. There is nothing to stop police from investigating particular allele positions and searching for partial hits at fewer allele positions than the complete set recorded on CODIS. It can therefore be argued that the database holds not so much one million people as one million families. Of course for prosecutions the DNA of family members not on the database will be required.

In Britain co-operation (meaning the right to obtain samples from those arrested) can be compelled. Samples can be taken by force if required, but this cannot be done in the absence of evidence justifying arrest. Unless there is other evidence justifying arrest voluntary co-operation would be required.

In 2001 following a notorious case (Michael Weir) the law was changed to allow samples and results to be retained and stored for National DNA Databasing purposes after a person has been arrested and charged even if the person arrested was subsequently acquitted, or the charges were dropped. This applies to fingerprinting databasing as well.

Last year the legality of this was challenged by judicial review in the case of Marper & S v Chief Constable of South Yorkshire Police both in the High Court and on an appeal. It had been argued inter alia that this practice breached the terms of the European Convention of Human Rights, and the Human Rights Act regarding privacy rights. Both courts found no breach. In fact, there are exceptions to privacy rights within the Convention itself including the detection and prevention of crime. Consequently, the law as changed in 2001 does not breach the terms of the Convention. I also think it unlikely that a challenge will succeed in the European Court of Human Rights on these grounds due to the exemption. Satish

One might add that DNA typing methods are based on a pool that is not only enormous in quantity, but also a very broad cross-section of the population. I know of no other biological studies based on such an impressive pool of data. Even the largest medical studies done (e.g. the Framingham heart study) can't hold a candle to the number of samples in the DNA profile database compiled by the U.S. military. Yes, we all know that just because the probability of something happening is greater than zero (no matter how minuscule) doesn't mean that it's never going to happen. However, there is also a very small probability that all the atoms in the world will all somehow occupy the same place at exactly the same moment, thus turning us into a black hole and rendering this whole exchange pointless, but I'm not going to lose any sleep over it. Anyone can publish a paper "proving" otherwise, but unless others can reproduce your results, a published paper in and of itself means little. (Remember "Cold Fusion"?) - Laura

--- dutraa I would like to thank all of those that spent time working on trying to apply equations for the birthday problem to DNA. Unfortunately all of your work is wasted. There are two critical flaws in doing this. 1) The birthday problem by its nature involves a finite set of possibilities, and therefore a known probability distribution. There are only 365 days in the year, each of them assumed to be as likely as the next to be someone's birthday. So for any group of people, you can calculate the number of people with the same birthday. With DNA, there is no set number of combinations that can be determined. One could take the set of alleles in the allelic ladder and disregard the fact that dozens of other alleles are possible at each locus. Limited to only these, one could come up with 400 billion possible combinations (although this would be underestimating by several orders of magnitude). 2) You have made your calculation easier by estimating the "average" population frequency of 1:1 billion. This is a rather nice figure, but it doesn't seem to take into account the fact that all of the DNA profiles have different frequencies. Some may be 1:1 billion, but others may be 1:1 sextillion. Perhaps you should come up with a weighted average of the billions of billions of possible combinations and their respective population frequencies. If the calculations were correct, there would be over 1000 pairs in America's DNA database of over 1.5 million. Interesting how this is not the case. Interesting how there are not even a hundred. Or ten. Or even one. Sleight of hand can make even the impossible seem reasonable. Let's not let facts get in the way of some really great smokescreening. Please, carry on with your ministry of misinformation, there are plenty of impressionable people who also are not interested in the truth.
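For reference, the "over 1000 pairs" figure quoted in that post is what the simple pair-counting estimate gives for 1.5 million profiles at an assumed average match chance of 1 in 1 billion. A minimal sketch, assuming independent profiles and a single average match probability (both are simplifications, and the function name expected_pairs is just illustrative):

```python
from math import comb


def expected_pairs(n_profiles, match_probability):
    """Expected number of matching pairs: (number of distinct pairs) x (chance a pair matches)."""
    return comb(n_profiles, 2) * match_probability


# 1.5 million profiles, assumed average random match chance of 1 in 1 billion.
print(expected_pairs(1_500_000, 1e-9))  # roughly 1125
```

Whether that many pairs are actually present is an empirical question about the database, which this arithmetic cannot settle.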

Your statement "If the calculations were correct, there would be over 1000 pairs in America's DNA database of over 1.5 million. Interesting how this is not the case. Interesting how there are not even a hundred. Or ten. Or even one. "

Of course this is more than blind faith coming out here. You have the evidence to back your assertion down to the last '1'. Where / what is your evidence for this assertion. ?

[ NOTE: HE NEVER DID COME BACK WITH ANY EVIDENCE ]

If true you have some remarkable technicians in charge of the USA DNA profile database. From the record of UK parliament (the section highlighted in red) about a third down this official source http://www.publications.parliament.uk/cgi-bin/ukparl_hl?DB=ukparl&STEMMER=en&WORDS=bob+ainsworth+profil+calcul+&COLOUR=Red&STYLE=s&URL=/pa/cm200203/cmhansrd/cm030408/text/30408w25.htm#30408w25.html_spnew2

Quote
Mr. Bob Ainsworth: The total number of profiles held on the database at 25 March 2003 was 2,094,858. The Forensic Science Service calculate that these profiles relate to an estimated 1,886,000 different individuals.
End Quote

Number of profiles I'm quite confident to the last '1' as being 2,094,858 on that date. Beyond that here is the 'sleight of hand' they have to 'calculate' to give an 'estimated' figure of 1,886,000 for 'different individuals', whatever that means. To my way of viewing errors that is some figure between 1,885,500 and 1,886,500 or even in the range 1,884,000 to 1,888,000. Genuine repeats of the same individual reprocessed, clerical errors all along the way from police station to forensic science data entry, people using aliases and (to me) the all-important pairs (or more) of different individuals with the same DNA profile. All cast aside with 'calculate' and 'estimated'.

Resolution: split into a 6 loci set and a 10 loci set. How difficult is it to check 3 fields in a database for repeat profiles, i.e. match on 10 (6) pairs of numbers + amelogenin, DOB and name? That should be a figure accurate down to the last '1'. Then check for repeats across the 6 loci and 10 loci sets with the same DOB and name. After that, how many pairs of matching profiles there are in each set again should be a precise figure. After that it becomes more murky - resolving what constitutes the difference in figures, i.e. clerical errors, aliases and unrelated individuals.
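The first counting step proposed above can be sketched in a few lines. This is a minimal illustration, assuming each record is a dict with name, dob and a hashable profile tuple; the field names, the sample records and the function classify_duplicates are hypothetical and do not describe the real NDNAD schema.

```python
from collections import defaultdict


def classify_duplicates(records):
    """Group records by identical profile, then split groups by whether name and DOB agree."""
    groups = defaultdict(list)
    for record in records:
        groups[record["profile"]].append(record)

    same_identity = 0  # repeat samples logged under the same name and DOB
    unresolved = 0     # identical profile, differing name or DOB: alias, clerical error or unrelated people
    for group in groups.values():
        if len(group) < 2:
            continue
        identities = {(r["name"], r["dob"]) for r in group}
        if len(identities) == 1:
            same_identity += 1
        else:
            unresolved += 1
    return {"same_identity": same_identity, "unresolved": unresolved}


# Made-up records; each profile is a tuple of allele pairs plus the amelogenin result.
P1 = ((14, 27), (9, 11), "XY")
P2 = ((15, 16), (8, 12), "XY")
P3 = ((13, 13), (10, 11), "XX")
records = [
    {"name": "A. Smith", "dob": "1970-01-01", "profile": P1},
    {"name": "A. Smith", "dob": "1970-01-01", "profile": P1},
    {"name": "B. Jones", "dob": "1982-05-17", "profile": P2},
    {"name": "C. Brown", "dob": "1965-11-30", "profile": P2},
    {"name": "D. White", "dob": "1990-03-03", "profile": P3},
]
print(classify_duplicates(records))
```

The sketch only produces the two exact counts; deciding which unresolved groups are aliases, clerical errors or genuinely unrelated people is the murky part that needs outside records. Comparing 6 loci profiles against 10 loci profiles would need an extra step that matches on the shared loci only.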

Please tell me how they have analysed in the USA down to the last '1'.

This quote contained in a judgement by justices KAY LJ, SILBER J and JUDGE MELLOR in matter of
R. v. Watters
COURT OF APPEAL (CRIMINAL DIVISION)
October 19, 2000
London

Quote
6 point match was found to produce two possible suspects, one of whom had been charged despite living at the other end of the country and had to be acquitted when it was appreciated that the DNA matched a second person
End Quote
Source 2/3 way down the full judgement on http://www.forensic-evidence.com/site/EVID/DNA_Watters.html

This puts on record that unrelated matched pairs of DNA profiles (not alias cases) do occur within the UK NDNAD. Less common now with 10 loci profiles than 6 but it does not change the principle. This event only leaked out because a scene of crime profile matched two in the NDNAD.

We can blather on about hypotheticals of 1: 1 billion, 1: 1 squillion or whatever and it is all meaningless. The real answer lies within the real data derived from real people but buried, untapped, within these databases.

The best point I've heard in this discussion yet. The arguments against DNA are equivalent to Barry Scheck's pounding of Dennis Fung in the OJ case. "Isn't it possible, Mr. Fung?" Anything is possible but it isn't very likely at all. In the criminal justice system we have to deal with things we can actually grasp and understand. DNA is well founded in science, reliable, and believed by judges, juries, and attorneys.

Minor correction: the US military DNA Repository has samples of most everybody on active duty since about 1994 and having a sample confirmed as in the repository is a predeployment requirement (at least in the Army.) But these samples have NOT been typed. Typing only occurs when a set of human remains is compared to an individual in the repository. Of course, for a mass fatality incident with a known population (like an air crash) all the samples believed to be in the group would be pulled and typed and then compared with typing from remains as they came in.

I think the ball is in your court on this one. Knowing that there are a set number of combinations makes it easier to calculate an *exact* result, but there is no need to rely on that to calculate a good approximation.

We see quoted accuracy of an unrelated DNA match as being x million :1, or 1-100 billion: 1 for the UK NDNAD test, say. Knowing that alone, it must presumably also be true for two individuals selected at random from a large population. Given that, it becomes clear that it is not surprising and indeed inevitable that a large population must contain individuals with non-unique profiles, per previous calculations, without needing to rely on knowing exactly how many combinations might exist.

You have to start somewhere. AIUI, that kind of assumption of independence is also made in the calculation of the x million:1 odds we see quoted. In some sense there will be an 'average' figure (perhaps not an *arithmetic* mean, maybe more like a *geometric* mean), and if one individual has a much lower chance of having a duplicate due to their particular makeup, then by the same token there will be others with a much higher chance.
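To make the point about the 'average' concrete: if profile i occurs with frequency f_i and two people are drawn independently, the chance they share a profile is the sum of the squared frequencies. A toy sketch with made-up frequencies over just four profiles (the numbers and the function name pair_match_probability are purely illustrative):

```python
def pair_match_probability(frequencies):
    """Chance that two independently drawn people share a profile: sum of f_i squared."""
    return sum(f * f for f in frequencies)


uniform = [0.25, 0.25, 0.25, 0.25]  # every profile equally common
skewed = [0.7, 0.1, 0.1, 0.1]       # one common profile, three rarer ones

print(pair_match_probability(uniform))
print(pair_match_probability(skewed))
```

With the same number of possible profiles, the uneven distribution gives the higher overall match chance, because the common profiles dominate the sum. Rare profiles do not cancel that out.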

Reversing your logic, one way to discover experimentally the weighted average you suggest would be to take a large database, count how many duplicate pairs we find, and feed it back through the probability calculation the other way.

If there are actually no duplicates, then by calculating that average probability if there was 1 pair in the population, we can determine that the 'average' random match chance should be less than that result, so we still learn something.
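Running the calculation backwards, as suggested, might look like the sketch below. It assumes independent profiles and uses the 1,886,000 'different individuals' figure quoted elsewhere in this thread; the function name implied_match_probability is hypothetical, and the one-pair value is only a rough ceiling, not a formal statistical bound.

```python
from math import comb


def implied_match_probability(n_profiles, observed_pairs):
    """Average pairwise match chance implied by an observed count of unrelated matching pairs."""
    return observed_pairs / comb(n_profiles, 2)


# If a database of this size truly held no unrelated pairs, the value for a
# single pair gives a rough ceiling on the average random match chance.
ceiling = implied_match_probability(1_886_000, 1)
print(ceiling)
print(1 / ceiling)  # the same figure expressed as "1 in ..."
```

Any real count of resolved unrelated pairs could be fed through the same function to compare against the quoted 1 in 1 billion to 1 in 100 billion range.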

Also, if there were actually no duplicates, that would suggest to me that there can't be more than a certain variation in the random match chance, as if some people were much more likely to be in an unrelated pair you might well see them popping up. You could also investigate that variation, if present, by seeing perhaps whether any such pairs fell wholly within a subset of the whole population (e.g. by race, or anything else) where the random match chance differed from the average.

Question: Has anyone seriously looked for pairs in it? Or has it been assumed that there can't be (it's DNA, after all!), so there's no need to look? That seems circular logic to me, to say that because there can't be any duplicates, any that are found must be error...?!

In between that 2,094,858 total and the 1,886,000 'estimated' individuals, there's plenty of room for there to be between 40-4000 non-unique individuals (20-2000 unrelated matching pairs) - the number you'd expect if the 'average' random match is as quoted 1-100 billion:1 for the test used in that case.

Small enough not to be seen if you're not looking for it, yet easily large enough for there to be many more such matches in the general population. Remember, the number of unrelated pairs rises in proportion to the square of the population (until it's a large fraction of the population, anyway), so double the database and you'll quadruple the number of unrelated pairs.
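
As a rough check on the figures above, here is a minimal sketch of the 'flat' calculation. It assumes every pair of unrelated profiles matches independently with the same quoted chance; the function name is purely illustrative and is not taken from any forensic software.

```python
def expected_pairs(n_profiles: int, odds_against: float) -> float:
    """Expected number of unrelated matching pairs among n_profiles.

    Assumption: each of the n*(n-1)/2 possible pairs matches
    independently with probability 1/odds_against.
    """
    possible_pairs = n_profiles * (n_profiles - 1) / 2
    return possible_pairs / odds_against


if __name__ == "__main__":
    database = 1_886_000  # estimated individuals quoted in the thread
    for odds in (1e9, 1e11):  # 1 billion:1 and 100 billion:1
        print(f"odds {odds:.0e}: {expected_pairs(database, odds):.0f} pairs")
    # Doubling the database roughly quadruples the expected pairs.
    ratio = expected_pairs(2 * database, 1e9) / expected_pairs(database, 1e9)
    print(f"doubling ratio: {ratio:.4f}")
```

With these inputs the expected count falls roughly in the 20 to 2000 pair range quoted above, and the doubling ratio comes out very close to 4.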

Whatever the actual match probability, I think it's for you to explain why the pairs would not rise in proportion to the square of the population, as the probability calculation shows they should, at least until they are a large fraction of the database.

The results may seem counter-intuitive, just as they do with the birthday problem, but are no less valid because they are not what you expected. Even with a very low random match chance, you will start seeing duplicates with a surprisingly small population.

Well, I'm interested in truth, and there seems to be a remarkable willingness to make pre-judgement with this and assume the result, essentially asserting that unrelated pairs are impossible because they're impossible. IIRC, there was a good deal of pain before it was admitted that the 6-loci match wasn't good enough and the accuracy of the test was improved to construct the UK NDNAD.

I make no prejudgement, whatever the outcome. Either way, to perform an investigation or experiment that could conceivably falsify a currently accepted tenet seems like good science to me. It may be that the results are consistent with the theoretically calculated random match chance; or it may be that they indicate that it's too low or too high.

Whatever the case may be, the numbers for the 'flat' probability calculation raise enough issue in my mind to want to know how many duplicates there actually are in a large database. Are there actually zero? Tens? Hundreds? Thousands? How do you know unless you accept there might be some number of unrelated pairs and go and look for them? - jbaron

Still the size of the typed pool around the world must be large compared to most medical studies. I wonder if anyone is pulling this worldwide data together to evaluate say RFLP or STR profiling? At the risk of being attacked personally again, I think that the PI lady and I have been talking past each other. She has been saying, as I understood her, that DNA profiles are not statistically unique. What I'm saying is that the DNA of a particular non-twin person is unique, due to the fact that there are about six billion people on earth and trillions of possible DNA combinations. 120,000 years ago there couldn't have been more than a few thousand modern humans. Even given the rise in human population to over 6 billion, there haven't been enough human genomes put together to make a duplication more than a very remote possibility. My contention is that DNA is reliable, as it is presently done. But if there is a statistical artifact that casts any doubt on it, the number of probes or loci studied should be increased to eliminate that doubt. I also think that the people on the defense side of the criminal justice system just hate the thought that science has come up with identification procedures they can't talk their way out of. I think that that is why they so vigorously attack DNA and fingerprint evidence. - Wally

It may surprise you, but I agree with much of this. I do not know what the reported statistics for the possibility of random matches are in the USA, but in the UK the figure for the SGM+ system (10 loci plus amelogenin) is 1:1 billion.

I do not know whether that is an accurate representation of the statistical reliability or not, but it is the figure routinely reported in the UK. Given the figure reported by Wally Lind I am interested in how there could be such a discrepancy. Do the extra genes used in CODIS offer that significant a degree of discrimination? Alternatively either the figures reported in the UK must be wrong, or those in the USA are. Could anybody explain how these systems could produce such diverging random match statistics?

Regarding the use of extra loci, I wholeheartedly agree. Given that there are suitable loci already identified I don't see why one or two extra loci are not added to the SGM+ system as by doing so the statistical grey area that exists in the UK would be removed.

I do however, disagree that defence lawyers hate DNA. Where would the Innocence Project be without it? Over 100 innocent people have good cause to thank their lucky stars for DNA in the USA at least. So do their lawyers. DNA not only helps to convict the guilty, it helps to clear the innocent. The Cardiff Three have good reason to be very grateful for DNA, while only Gafoor had cause to fear it. DNA is an invaluable tool for law enforcement AND for the defence of the innocent. Best Wishes Satish

Here's an interesting way of looking at this debate-- Remember PGM (phosphoglucomutase)? Remember how this test was used in combination with ABO? Yes, folks, they were used to eliminate suspects. If the results of these tests "matched" your subject's, you have failed to eliminate the subject. Depending on which ABO type and which PGM type you get, you could say that you could exclude X% of the population but could not exclude the subject. This test was (and maybe still is) being used as one piece of evidence.

DNA typing can be seen the same way. Instead of 80% or 90% of the population being eliminated, we can eliminate 99.9999999999999% of the population as the source. Once again--one piece of evidence (a really good piece of evidence, however).

The prosecution still has to demonstrate the significance of this one piece of evidence by presenting other evidence and putting it together to convince a jury beyond a reasonable doubt that the accused is guilty.

I think there is a lot of confusion here over the difference between "reasonable" and "shadow of a" doubt. We are not trying to do a mathematical proof here, rather we are determining the odds of a proposition being true. - Laura

A person could have more than one profile placed in the CODIS databases. For instance, if a serial rapist moved from state to state (each participating state is the repository for its own database) each state would complete a profile if DNA was recovered, so they could have more than one profile in CODIS. We had a robber/rapist who had left semen at one other crime scene in a different Minnesota community. When he was caught and convicted for a third rape (no DNA in that case), his known profile was done and placed in our database per state law. The computer "hit" on the two unknown cases. This meant that he had three copies of his profile in the Minnesota database. I think.

There is no actual national database in the US. There are connections between state and federal databases through the FBI. As I understand it, when a profile is run in CODIS it is actually run in each participating state, since only FBI case profiles are kept at the FBI. The question would be: when a profile is placed in one state's database, is it automatically checked in all the CODIS databases? - Wally

I would like to correct myself on two issues after speaking to a mathematician on the issue of the "birthday problem". He indicated that although there are differences, the expansion of the birthday problem to the issue of DNA would not be a big problem. He indicated that the big problem would be calculating the average profile frequency. He suggested a reasonable estimate. Some have suggested a "reasonable" estimate of 1:1 billion. I found this significantly low. As I don't use SGM+, I didn't have a strong basis to make this claim. I have now found that this figure is apparently accurate. In America, we use 13 loci (but many overlap). Since we use more loci, we commonly report statistics in the trillions, quadrillions and sextillions.

But whatever the number (1 in a million, billion, trillion or gazillion), it doesn't change the fact that duplicates are going to happen, but they are not a big deal. The fact that two people in a country have the same DNA profile doesn't change the facts of a case. Certainly if the only evidence that exists is the DNA match, there is no just cause to convict (arrest is another issue). If there is corroborating evidence, that supports the claim and it is found to be sufficient, there may be cause for conviction.

The presence of duplicates doesn't invalidate the use of a database, it is just a factor of the population size.

We should remember that just because something happens, it doesn't make it common (and the reverse is also true). Just because something is rare, it doesn't mean it can't happen. - Dutra

Concerning duplicates in the database.

While there are no duplicates in a database of 2 million, just a single hit, then you can say that maybe there will be no further hits if extended up to the whole population of say 60m (UK). Once you have duplicates within 2m then you could say there may well be another 1 or 2 in the next 2 million added to the database, or maybe 50 or more duplicates in the whole population. It is to me sheer arrogance to assume, on the first hit within such a database, that you 'have got your man' just because there is only one in a 2m database which is itself just a small sample of a 60 million population.

Even if two people have the same DNA profile, they don't have the same genome; distinguishing between the two and the crime scene evidence would only require further testing. How many actual pairings have been detected in the American databases? Wally

my two cents then back into lurk mode again..... Don't forget about the NMDP...(National Marrow Donor Program) They find matches all the time...They use HLA & RFLP, AFLP, and are getting more and more precise with the typing data all the time (the closer the match....the better)
I once saw a person who had had a sex change operation. This person later developed leukemia, had to have an unrelated marrow transplant, and the donor turned out to be the sex of the patient after their operation....to simplify.... The patient was born female, had a sex change to male and received donor marrow from a male...this person's karyotype was from then on 46XY. Really causes a bit of a mess for forensics then doesn't it... Best Regards, Daphne F.

Broadly speaking concur with your revised conclusions. My congratulations for your willingness to investigate and to report back on this honestly, as you have done.

That is essentially my understanding. One difference is that the DNA test is 1: some large number, and that the group size you're assessing may be much larger than 20 people - but these are differences of scale, not of principle.

The main effect that I see resulting from that is that it's more meaningful with DNA tests/databases to talk about the expected number of duplicates in large population than to talk about the population size where there is a 50% chance for one or more duplicates - but the math basically works either way.
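
To illustrate the two ways of stating the result, the sketch below computes both the chance of at least one duplicate in a group and the group size at which that chance reaches 50%. It uses the standard birthday-problem approximation and assumes independent pairs with one flat match chance; the function names are illustrative only.

```python
import math


def prob_any_duplicate(n_profiles: int, odds_against: float) -> float:
    """Approximate chance of at least one matching pair among n_profiles.

    Uses the Poisson (birthday problem) approximation 1 - exp(-pairs * p),
    assuming independent pairs with p = 1/odds_against.
    """
    pairs = n_profiles * (n_profiles - 1) / 2
    return -math.expm1(-pairs / odds_against)


def size_for_even_chance(odds_against: float) -> float:
    """Approximate group size at which a duplicate is 50% likely."""
    return math.sqrt(2 * math.log(2) * odds_against)


if __name__ == "__main__":
    for odds in (1e9, 1e11):
        n50 = size_for_even_chance(odds)
        print(f"odds {odds:.0e}: 50% point near {n50:,.0f} profiles")
    print(f"2 million profiles at 1e9: {prob_any_duplicate(2_000_000, 1e9):.6f}")
```

For databases in the millions the 50% point is passed long before the database is full, which is why the expected number of duplicates is the more informative figure at that scale.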

I chose 1 billion:1 as a number to investigate because that's a figure that is not uncommonly quoted as an estimate for the accuracy of the current UK 10-loci test. AIUI, this estimate is calculated by a product rule multiplication of a series of probabilities, although I'm unsure of the precise details of the derivation. I've also seen 100 billion:1 quoted, which is why I also examined that figure.

It is itself a fact of the case, however, and I'd say it's something that a jury needs to know - that while a DNA match is compelling evidence, two people _may_ have the same profile and some form of corroborative evidence is advisable.

That is why when we see cases reported, I ask what the corroborating evidence was, and whether the DNA match was presented as being effectively certain in its own right with or without any corroborative evidence.

If we accept that DNA evidence is not enough to gain a conviction by itself, then it surely follows that *if* that's all that's available charges should not be laid but investigation should proceed until/unless there is?

Beyond that, the concern I have is particularly with UK policy to retain profiles on the national DNA database for persons not charged, or cleared of any crime.

(i) because of the civil liberty issue, retaining personal data from people who are cleared of any crime, and

(ii) because of implications if there is some proportion of unrelated matches in the general population (see below).

I agree. What concerns me is if the presence of duplicates is denied, and the possibility not considered. See the FSS response in this article: http://news.bbc.co.uk/1/hi/in_depth/sci_tech/2002/leicester_2002/2252782.stm

If the existence of duplicates is not allowed for, a DNA match is likely to be treated as absolute by investigators, introducing the danger that they will see only what they are expecting to find.

If the potential existence of duplicates is admitted, then it may be fine to use our database as we are doing; but to do a more precise DNA test on any actual hit we find as a matter of course, or to expect that some form of corroborative evidence must be necessary.

The particular issue I would raise with the retention of data from persons cleared of any crime is that if you start doing that, and you do not acknowledge that there may be a proportion of non-unique profiles in the general population, then there is a risk that you may be essentially randomly sampling the whole population, and the chance of hitting an unrelated match may be determined by that population size rather than the database size. If the random match chance for the UK test is truly 1 billion:1, as quoted, then we could be looking at a chance of error as high as a few percent.

To see what I mean, imagine that we have a scene-of-crime profile, which is not matched to a profile already on the arrestee database.

If there happen to be *two* people in the whole population with that profile, then the closer the addition of profiles to the database gets to a random sample, the more it's going to be 50/50 which person gets added first.

Well, you may say, it's more likely that the offender will get arrested first. Maybe - but also consider the converse situation, where someone is arrested and then released without charge, but their profile is retained on the database. If it happens that there is someone else with a matching profile in the general population not yet on the database, but who then commits a crime and leaves DNA, the first man is likely to be accused of the crime.

In either case, they may have difficulty proving their innocence, unless they know to demand a more precise test.

The investigators won't know it's a duplicate, of course, unless/until they happen to arrest the _other_ one - and then not if it's rejected automatically as an "invalid" duplicate.
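
The 'few percent' figure mentioned earlier in this post can be checked with a short sketch. It assumes a UK population of about 60 million (the round figure used elsewhere in this thread) and a flat, independent match chance per unrelated person; the function name is illustrative.

```python
import math


def prob_someone_else_matches(population: int, odds_against: float) -> float:
    """Chance that at least one OTHER person in the population shares a
    given profile, assuming each person matches independently with
    probability 1/odds_against."""
    others = population - 1
    # 1 - (1 - p)^others, computed stably for very small p
    return -math.expm1(others * math.log1p(-1.0 / odds_against))


if __name__ == "__main__":
    uk_population = 60_000_000  # assumption: round figure from the thread
    for odds in (1e9, 1e11):
        chance = prob_someone_else_matches(uk_population, odds)
        print(f"odds {odds:.0e}: {chance:.2%} chance of another match")
```

At 1 billion:1 this comes out at a little under 6%, and at 100 billion:1 at well under 0.1%, so the size of the risk depends heavily on which quoted figure is closer to the truth.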

Indeed. And if you accept that duplicates are possible, then I'd say that it's reasonable to want to count how many are actually occurring in a large population, such as the UK NDNAD. Whatever the result, we can then plug that number back into the math the other way, and see if it comes out similar to or different from the theoretically calculated estimate of the random match probability.

If zero such duplicates, then we can do the math as if there was 1 such pair, and infer that the true random match probability is likely to be even higher odds against.

If greater than zero, especially if quite a respectable number, then we can make what should be an excellent experimental measure of the actual "average" random match probability. We can also do the same thing with subsets of the whole population, say by ethnic group, provided that the subset is large enough to still include some duplicates, and see whether the "average" random match probability is different for that group.
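
Running the calculation 'the other way', as described above, can be sketched as follows. This is only a crude point estimate under the same flat, independent-pairs assumption; when no duplicates are observed it substitutes one pair and the result should then be read as a lower bound on the odds against a match. The function name is hypothetical.

```python
def implied_odds_against(n_profiles: int, observed_pairs: int) -> float:
    """Estimate the 'average' odds against a random match from a count of
    unrelated duplicate pairs found in a database.

    Assumption: pairs match independently with one common probability.
    If observed_pairs is 0, the calculation is done as if 1 pair had been
    found, so the true odds are likely higher than the value returned.
    """
    possible_pairs = n_profiles * (n_profiles - 1) / 2
    return possible_pairs / max(observed_pairs, 1)


if __name__ == "__main__":
    database = 1_886_000
    for found in (0, 20, 2000):
        odds = implied_odds_against(database, found)
        note = " (lower bound)" if found == 0 else ""
        print(f"{found} pairs found -> about {odds:.2e}:1{note}")
```

The same function can be applied to a subset of the database, such as one ethnic group, provided the subset is large enough to contain some duplicates.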

One thought that has occurred to me since the previous post is that there should be enough data on the UK database to determine how it stacks up with a simpler case, the 6-loci test, without needing to concern ourselves about checking fingerprints or indeed anything other than the DNA data.

When that test was in standard use, the typical calculated estimate quoted as the random match probability was 1:37 million, for example as described in the Easton case: http://www.thisiswiltshire.co.uk/wiltshire/archive/2000/08/15/swindon_news10ZM.html

With a national database of 1,886,000 persons, and given a random match probability of 1:37 million, then we'd expect to find 100,000 or so 6-loci profiles that are not unique, or perhaps around 50,000 "pairs". You'd also expect perhaps 500-1,000 "triples", and maybe 10 or so "quadruples".
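
These expectations can be reproduced with a short sketch. It assumes that any group of k unrelated people all share one profile with probability (1/odds)^(k-1), which is the same flat, independent approximation used throughout; the function name is illustrative.

```python
import math


def expected_groups(n_profiles: int, odds_against: float, k: int) -> float:
    """Expected number of k-person groups that all share one profile.

    Assumption: a group of k unrelated people all match with probability
    (1/odds_against) ** (k - 1). Larger groups are also counted inside
    the smaller ones (a triple contains three pairs).
    """
    return math.comb(n_profiles, k) / odds_against ** (k - 1)


if __name__ == "__main__":
    database = 1_886_000
    odds = 37e6  # 1:37 million, the quoted 6-loci figure
    for k, label in ((2, "pairs"), (3, "triples"), (4, "quadruples")):
        print(f"{label}: {expected_groups(database, odds, k):,.0f}")
```

Under these assumptions the counts land close to the figures quoted above: a little under 50,000 pairs, several hundred triples and around 10 quadruples.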

Then what you do is assume that the 10-loci test is genuinely unique, and count how many 6-loci duplicates etc you have. If there are a few tens or even a few thousand 10-loci duplicates, we'll fail to count those because we incorrectly think it's only one person, but we're looking for 50,000 pairs, so we'll still get a fairly close result.

Depending on the results, I'd say:

If you found zero 6-loci matches, then I'd wonder what was going on, because we know such matches can occur in practice, and it would be wildly inconsistent with the theoretically calculated estimate.

If you found many fewer (say 10,000), then I'd say it was good evidence the original estimate of 37 million:1 was conservative, and that other estimates calculated in a similar way (e.g. the 1 billion:1 for the 10-loci test) may also be conservative.

If you found a similar number to that expected, I'd say that it was good evidence to validate the theoretical 'product-rule' calculation, for this and potentially for other tests.

If you found many more than the expected number, say 100,000 or more, then I'd start to wonder if the estimated probability for supposedly more accurate tests was doubtful.

Take a little bit of effort, but all of the data should be there in the database, and susceptible to suitably defined query operations. I could certainly write SQL which would produce the required results, given the database schema, and while 2 million records is a substantial dataset, it's perfectly within the realms of current technology to manipulate. - jbaron
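
As an illustration of the kind of query meant here, the sketch below runs SQL through Python's built-in sqlite3 module. The real NDNAD schema is not known to me, so the table `profiles` and its columns `sample_id`, `sgm6` and `sgmplus10` are entirely hypothetical, and each profile is assumed to be stored as a single text string. Following the approach described above, distinct 10-loci profiles are treated as distinct people.

```python
import sqlite3

# Hypothetical schema: one row per sample, with the 6-loci and 10-loci
# profiles each stored as one text string.
QUERY = """
SELECT g.group_size, COUNT(*) AS n_groups
FROM (
    SELECT sgm6, COUNT(DISTINCT sgmplus10) AS group_size
    FROM profiles
    GROUP BY sgm6
    HAVING COUNT(DISTINCT sgmplus10) > 1
) AS g
GROUP BY g.group_size
ORDER BY g.group_size
"""


def count_six_locus_groups(conn: sqlite3.Connection) -> dict:
    """Map group size (2 = pair, 3 = triple, ...) to number of groups."""
    return dict(conn.execute(QUERY).fetchall())


def demo() -> dict:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE profiles ("
        "sample_id INTEGER PRIMARY KEY, sgm6 TEXT, sgmplus10 TEXT)"
    )
    rows = [
        (1, "A", "A-1"), (2, "A", "A-2"),                   # a 6-loci pair
        (3, "B", "B-1"), (4, "B", "B-1"),                   # same person twice
        (5, "C", "C-1"), (6, "C", "C-2"), (7, "C", "C-3"),  # a triple
    ]
    conn.executemany("INSERT INTO profiles VALUES (?, ?, ?)", rows)
    return count_six_locus_groups(conn)


if __name__ == "__main__":
    print(demo())
```

Repeat samples from one person (same 10-loci profile) are deliberately not counted as a match, which is the point of using COUNT(DISTINCT ...) rather than a plain row count.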

Would you know whether the Amelogenin adjunct to DNA profiles will pick up the likes of Turner Syndrome 45,X or Amazonianism/Triple X 47,XXX ?

It is possible to tell if such a genetic anomaly existed. In the case of someone with Turner syndrome (X,0), they should have a single peak at amelogenin that would be about half as tall as would be expected.

With triple X (X,X,X), the single peak would be about 50% larger than expected. This might not be detected, since the peaks at amelogenin are generally taller than at other loci. If a person was (X, X, Y) or some other situation, it could be more evident if one peak was significantly different than expected. Adam

I happen to know someone with XXX ,she only got diagnosed by a test associated with pregnancy. I was wondering whether someone could get a 'diagnosis' by making a Data Protection application for copy of their DNA database record.

Returning to the main thread. If the FBI database has no matches within it then it is because nothing has changed since 1992. The following is a published letter responding to a published article that was rendered flawed because of the FBI weeding out matches within databases before releasing them to academic researchers. Only now it is beyond being just an academic matter, as people are now being falsely arrested and for all I know falsely prosecuted and imprisoned.

From Science, Vol 256, 26 June 1992, p1743
Author
Patrick J. Sullivan
Senior Attorney
Minneapolis

Title : DNA Fingerprint Matches
Quote

I am writing to comment on two aspects of the report "On the probability of matching DNA fingerprints" by Neil J. Risch and B. Devlin (7 Feb, p717). Risch and Devlin searched several large databases to determine whether there were any samples with matching patterns across a number of gene loci. They found "the probability of a matching DNA profile between unrelated individuals to be vanishingly small....."

Last summer I was trying a Federal Bureau of Investigation (FBI) case, Minnesota v. Johnson (1), and examined three FBI databases, C-3 (Caucasian), B-4 (black), and H-3 (Hispanic). During my examination, I discovered 25 apparent matches. Before my examination, the existence of these matches had been known by only a few individuals connected with the FBI. Bruce Budowle of the FBI subsequently testified in Minnesota v. Johnson that he was aware of these matches and that they had been discovered when the FBI examined its database with its computer matching program. The FBI was able to verify that most of these matches occurred because the Texas College of Osteopathic Medicine submitted more than one blood sample from the same individual. One false match was the result of sample handling error. The FBI also discovered three sets of matching samples from Florida. These samples were from the black and Hispanic databases. The FBI was not able to identify that the Florida matches were the result of duplicate submissions from the same individual or of submissions from identical twins. Budowle then asked Cellmark Diagnostics (Germantown, Maryland) to examine the matching samples. Its probes also yielded unclear results. The Florida matches were then deleted from the databases, even though there was no explanation for their occurrence.

The FBI again revised its databases in January 1992. The new databases are designated C-4,B-5, and H-4. Budowle testified (2) that all the matches have been edited out of these databases and that this removal is justified because it is not possible for two individuals to yield identical profiles when as many as seven probes are used. My first point is this: Of what scientific value is a paper that seeks to draw any conclusion from the fact there are no matches in a database when the matches have been removed from the database before the analysis is done? The FBI's removal of matches from its databases before giving them to outside scientists guarantees that those scientists' conclusions will support the FBI's "self-fulfilling prophecy."

This is not an isolated practice. Budowle testified in United States v. Yee (3) that the FBI ran its match program over its South Carolina black database and found a large number of matches. The FBI's record-keeping was such that it could only speculate as to the cause of these matches. Again, the FBI removed them from its database. The existence of individuals who match across a number of loci is not unprecedented. Kenneth Kidd's Amerindian (Karitiana) data (4) show a seven-probe match between two individuals, a four-probe match between another two individuals, and a number of three-probe matches. These matches occurred in a database of 54 donors from one Indian village. Despite this fact, which is well known to the FBI, the FBI chose simply to remove apparent matches from its databases. The apparent justification of this practice is that it eliminates the necessity of keeping records about the source of the data. It is troubling to think that this approach has acceptance among scientists.
My second point .... (binning problem ).....
End Quote

I have to concur with all he said and that was 11 years ago - I was not aware of the above article until last week - the situation is now of course far, far worse. The blind faith stretches ever further. NB The Amazonian Karitiana tribe is not really relevant here as they are highly incestuous so not applicable to unrelated matches - the relevance comes in when trawling across relatively closed communities for suspects.

I will respond to both parts of your post separately:

As far as your friend's genetic condition is concerned, I doubt that the reported genotype you could obtain would detail the XXX. Many labs (it may be different in the UK) will report only the OBSERVED genotypes. So in the case of XO, XX, XXX, or XXXXXXXXXXXX, they would probably report merely as "X". This has something to do with the fact that mutations can occur that prevent one (or both) allele(s) from amplifying. In RFLP days, it was possible that a band would be so small, it would "run off" the gel and not be observed. This means that there may be more alleles that are not detected (although this is unlikely).

Adam

I want to point out that the paper you cite deals with RFLP databases. These are no longer in use in the United States. Similar to the matches reported when the UK used only SGM with six markers, it is possible to obtain matches with only three or four RFLP loci that can be excluded when additional loci are used. On another note, "trawling" for DNA types doesn't change the statistics of the match. No matter whether you obtain evidence against a person and the DNA solidifies the case or you have DNA evidence against a person and the resulting investigation solidifies the case, the two are equal. The fact that a person matches because he was in a database doesn't change the likelihood that a random person will match.

Adam

STR technology has been introduced since 1992, as Adam said, and most, if not all, of the profiles in the CODIS ( I think it has a new name which I can't remember) databases have been converted. If people are being falsely arrested and convicted, I still believe it is because of faulty practice of the profiling technology, not because there is anything wrong with the technology.

dutraa

Two cases from the UK this week on what can be broadly called misattribution of DNA 'evidence'. First from Scotland ,second from Devon.
From
http://www.scotlandonsunday.com/scotland.cfm?id=902562003
Partial Quote
Police outrage over demand for their DNA

by JASON ALLARDYCE

PLANS to force police to give DNA samples have sparked a rebellion among rank-and-file officers. It is understood all eight of Scotland's police forces are about to demand that in future new recruits hand over samples to be included in a national genetic database. This would allow any body matter, such as hair or saliva, found at a crime scene, to be compared with the DNA records of officers, so investigations are not thrown off course through accidental contamination by officers working there. But rank-and-file police fear that calculating criminals with a grudge against members of the force could manipulate the system to damage the careers of innocent officers. Members of the Scottish Police Federation believe criminals could deliberately contaminate the scene with officers' DNA, either to implicate them in serious crimes or to give the impression that they had planted evidence. A federation spokesman said: "A point made by many of our members is that it is relatively easy for anyone so minded to obtain DNA traces of a police officer - for example from a discarded cigarette butt - and to deliberately contaminate a locus with it. "Apart from the suspicion which may or may not fall on the officer, it has the potential to diminish the evidential value of any DNA traces of the real perpetrator of the crime."
End Quote

The full Scotland on Sunday article covers the policewoman McKie case; the disputed dermal fingerprints are on http://onin.com/fp/problemidents.html#second_case as high resolution images - interesting viewing

Then from the criminal fraternity someone being implicated by person or persons unknown, presumably an enemy of his.
http://www.thisisexeter.co.uk/displayNode.jsp?nodeId=101955&command=displayContent&sourceNode=99871&contentPK=6705317
Quote
A man accused of burgling a city home after bloody tissues found at the scene matched his DNA profile has been cleared by a court. Jonathan Bowskill said he had nothing to do with the burglary at Alpha Street, Heavitree, in the early hours of November 29. A jury at Exeter Crown Court yesterday found him not guilty. During the trial, prosecutor David Evans said Peter Holmes went to bed and left a window open and his wallet in his leather jacket. He got up at 5am and went to work. He later found the tissues on the floor and his wallet missing. Bowskill told police although he was a heroin addict, he "didn't do burglaries", and did not know how the tissues came to be there.
End Quote

There are about 2 million DNA profiles in the UK Forensic Science Service (FSS) NDNAD. No one would report how many false matches are actually recorded within that database, that is, pairs of separate people who just happen to have the same DNA profile although not in any way related (not twins or even brothers etc). In consequence I have simulated a large DNA profile database and the likely figure is one such match in 2 million. Full details, including computer macros, to repeat the experiment yourself are on http://www.nutteing2.freeservers.com/dnas.htm This is a mathematical simulation, randomly modelling the DNA profiles, based on published data but with all 'profiles' totally independent of one another. The UK NDNAD contains 2 million profiles with a minimum of one such match, plus many more due to the inescapable fact that most people in the UK have ancestors in common, so more chance of shared alleles and consequential match. (Future research - to determine what this co-ancestry factor is.) I welcome anyone to copy the macros off the dnas.htm file and spend a few hours repeating the process to verify. No fancy computer required, just an ordinary pc.
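
For readers who want a feel for this kind of experiment without the original macros, here is a much smaller Python sketch. It is not the simulation described above: the allele frequencies are made-up placeholders rather than published data, only a few loci and a few thousand profiles are used so that it runs in seconds, and all names are illustrative. Each locus is modelled as an unordered pair of alleles drawn independently.

```python
import random
from collections import Counter
from itertools import combinations

# Hypothetical allele frequencies for one locus (NOT published data),
# peaked in the middle with rarer alleles either side.
FREQS = [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05]


def locus_match_probability(freqs):
    """Chance two independent people share a genotype at one locus."""
    homozygous = sum(p ** 4 for p in freqs)
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygous + heterozygous


def simulate_matching_pairs(n_profiles, n_loci, freqs, seed=0):
    """Generate random profiles and count pairs with identical profiles."""
    rng = random.Random(seed)
    alleles = range(len(freqs))
    counts = Counter()
    for _ in range(n_profiles):
        profile = tuple(
            tuple(sorted(rng.choices(alleles, weights=freqs, k=2)))
            for _ in range(n_loci)
        )
        counts[profile] += 1
    # A profile held by c people contributes c*(c-1)/2 matching pairs.
    return sum(c * (c - 1) // 2 for c in counts.values())


if __name__ == "__main__":
    n, loci = 2000, 3
    expected = n * (n - 1) / 2 * locus_match_probability(FREQS) ** loci
    found = simulate_matching_pairs(n, loci, FREQS)
    print(f"product-rule expectation: {expected:.0f} pairs, simulated: {found}")
```

Comparing the simulated count with the product-rule expectation is the same cross-check discussed earlier in the thread, just at toy scale.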

I thought we beat this to death a couple of months ago. At the risk of being personally attacked, again, I don't think these mathematical games mean much in the real world of crime scene, crime lab, and court. There are no non-twins that have the same genetic make up. If there can be matches with current technology, it's because the labs are not using enough data to compare DNA. Anyone wrongly convicted on the basis of RFLP or STR technology can have the lab results verified by checking more sites. Surely a 2nd confirmatory round of tests can be devised to double check questionable results.
wally

If there was a general perception of a problem with DNA profiles then what you say is true. I have yet to see reports in the press concerning extended loci tests. Perhaps the most disturbing thing I discovered in this simulation was the set of results for 6 loci -
27,168 pair matches
1231 triples matches
110 quadruples
14 quintuples,
if the UK FSS had allowed the 6 loci situation to survive up to a database of 2 million.

For most of the 1990s the FSS,police,judiciary, forensic statisticians and all their computer power considered 6 loci to be sufficient to arrest the likes of Raymond Easton because they believed there would not be a false match in 60 million of the UK population. It was only Mr Easton's case in 1999 that led to the change from 6 to 10. I appreciate USA / CODIS uses 13 loci but on the other hand the population of the USA is 10 times that of the UK - the increase from 10 to 13 swallowed up by the 10 fold population size

Just an observation from a "blinkered" forensic scientist... In your simulation, you "make up the rule" that the 10 loci tested are limited to having only 10 alleles (even though you state that the average number of alleles for SGM loci is 14). That is, rare alleles get lumped into one category (0) for your simulation. Why don't you run the simulation with 14 alleles? I mean, I know you believe that everyone in the world is +/- 2 alleles from the most common alleles... but is this realistic? Perhaps the whole world really is an anti-Lake Wobegon (e.g. everyone is just average). It just seems to me that with all the constraints you've put in your analysis... it's no wonder what you get is what you want to see.

On a personal note to the Forensic Science group: Did you know that this is a "Peer-Reviewed" forum? I'm glad because now, with each posting, I can add a few more publications to my resume. Except, I forgot to sign my name to the posting I gave Mr. Nutteing a few months ago (it was about using fingerprints to confirm DNA matches).

You see, Mr. Nutteing has posted some of the discussions we've had with him in the past. Fair enough, the web is free (well, except for the porno). But, this is what Mr. Nutteing thinks of his "peers", the DNA forensic scientists: "It shows the very dangerous blinkered mindset of forensic scientists. I have to assume the same attitude is prevalent within the police and the judiciary."
http://www.nutteing2.freeservers.com/dnay.htm
Here is some background info about Paul from his cousin, if you are interested in potential motivation for wanting to discredit DNA analysis (well, we are all peers, right?).
http://homepages.tcp.co.uk/~diverse/intro.htm
As for me, I've washed my hands of this dude. And I would suggest to everyone else to be aware of what you wish to say to him. I'm going to follow the advice my late grandfather once told me, "Never argue with a blinkered person... people watching may not be able to tell the difference."
Mike

You can't be too blinkered, as you picked up the 10 instead of 14 and did further research. The reason has probably got a bit lost in the background. It did not emerge overnight - a number of people contributed along the way.
One thing that emerged early on was the matter of the rare alleles. For 'Gaffoor trawling' rare alleles are very useful, but in this area of false matches they become a bit of a nuisance. I tried to find the name statisticians may give to the following effect but could not (maybe an original research finding, but I doubt it). In more general terms, what we are dealing with here is matching of multi-modal sets. With the exception of D2S1338 in the UK, the allele frequencies at the chosen 10 loci tend to approximate to normal/Gaussian distributions, that is, some sort of peak tailing off to the rarer alleles either side.
Before simulating a 10 loci database containing millions we built up gradually by testing 6 loci, 7 loci etc. These of course showed more matches, and it is possible to analyse the structure of such matches. The first thing to note, contrary to my intuition, is that these matches did not exclusively involve the most common alleles. They can involve medium-rare alleles but never, in my preliminary simulations, the rarest ones.
A second effect is probably in the form of a law within statistics but, as I say, I could not find it. All the loci/alleles in the simulation are constrained to agree with the published allele frequency tables. With a large number of 6 and 7 loci matches it is possible to analyse the distribution of the alleles found in the matches. What start as multi-modal distributions, similar to normal distributions, emerge from the analysis of matches as more broadened distributions, like an upturned U, with a steep run-off to the tails: an increase in the proportion of common alleles and a very noticeable reduction in the proportion of rarer alleles.
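
The comparison of allele distributions described above could be tallied along these lines. This is only a sketch under assumptions: a profile is taken to be a tuple of (allele, allele) pairs, one per locus, and the function names are hypothetical rather than taken from the original macros.

```python
from collections import Counter


def allele_shares(profiles, locus):
    """Fraction of each allele seen at one locus across the given profiles."""
    counts = Counter()
    for profile in profiles:
        counts.update(profile[locus])  # adds both alleles of the pair
    total = sum(counts.values())
    if total == 0:
        return {}
    return {allele: n / total for allele, n in sorted(counts.items())}


def matched_profiles(profiles):
    """Keep only profiles that occur more than once in the database."""
    seen = Counter(profiles)
    return [p for p in profiles if seen[p] > 1]


def compare_shares(profiles, locus):
    """Return (shares over the whole database, shares among matches only)."""
    return (
        allele_shares(profiles, locus),
        allele_shares(matched_profiles(profiles), locus),
    )
```

Setting the two results side by side for each locus shows whether the rarer alleles are under-represented among the matching profiles relative to the database as a whole.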
All in all I decided to simplify, to lump these rare alleles into one where appropriate.
vWA - not required as less than 10
THO1 - not required
D8 - not required
FGA - 1.8% rare ones lumped into one
D21 - 0.5%
D18 - 2.5%
D2 - 2%
D16 - not required
D19 - 0.3%
D3 - not required
The percentages are percentages of alleles for that locus, not percentages of all loci/alleles.
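
As an illustration of the lumping step, here is a minimal sketch. It assumes the rule is to keep the most frequent alleles and fold the remainder into a single bin labelled 0 so that no locus has more than 10 categories; the exact rule used in the simulation may differ. The table below is invented for the example and is not a published frequency table for any locus.

```python
def lump_rare(table, max_bins=10, rare_label=0):
    """Fold the least frequent alleles into one bin so at most max_bins remain.

    Assumed rule: keep the (max_bins - 1) most frequent alleles and sum
    the rest under rare_label. Tables already small enough are unchanged.
    """
    if len(table) <= max_bins:
        return dict(table)
    ranked = sorted(table.items(), key=lambda item: item[1], reverse=True)
    kept = dict(ranked[: max_bins - 1])
    kept[rare_label] = sum(freq for _, freq in ranked[max_bins - 1:])
    return kept


# Hypothetical 12-allele locus (made-up frequencies summing to 1).
EXAMPLE_TABLE = {
    17: 0.002, 18: 0.010, 19: 0.060, 20: 0.140, 21: 0.180, 22: 0.200,
    23: 0.150, 24: 0.130, 25: 0.080, 26: 0.030, 27: 0.012, 28: 0.006,
}

lumped = lump_rare(EXAMPLE_TABLE)
```

Here the three rarest alleles (17, 18 and 28) end up in bin 0 and the other nine keep their own labels, so the total frequency is unchanged.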

I agree a more rigorous simulation would indeed use a larger array space - requiring 16 for the biggest, FGA (UK Caucasian distribution), or 25 for full ethnicity coverage, with a lot of zeros in the remainder. You have to make a cut-off somewhere; the tables I have only go down to 0.1% anyway, and strictly one should include 0.01% alleles etc.

A crime-scene profile has a full set of 13 loci, so 26 data points. On 25 there is a perfect match with a profile on a database. The 26th allele from the crime scene is 18 but the database has 18.2 - is this declared a match? Second question, concerning another crime-scene profile of a totally unknown/unwitnessed offender. Frequency analysis of the alleles indicates he is 50 times more likely to have ethnicity A than B. But one pair of alleles is extremely rare in population A, say 0.002,0.002, but in population B it is say 0.015,0.15. Do you declare no result, or does one determination override the other? As it is equally rare for both alleles, it is difficult to argue mixed parentage from the A and B populations.

On the first part, I don't think anyone would declare it a match without further examination. Even after further examination I still think they wouldn't use a term like that.

For your second question, I'm not aware of anyone in the US that uses the population databases to determine possible racial information.
Adam

On the first part a match should not be called, but the similarity between the crime scene profile and that of the person on the database would probably result in further tests, certainly at the locus in question. If these confirm the discrepancy then that person should be eliminated, but I would expect investigators to consider close relatives, especially twins of the same sex as the person on the database.
Best Wishes
Satish
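
The kind of check described in these replies, where a profile agreeing everywhere except one allele is flagged for further testing rather than called a match, might be sketched as follows. The profile layout, locus names and function names are assumptions for illustration only. Alleles are held as strings so that 18 and 18.2 are never treated as equal.

```python
def normalise(genotype):
    """Order the two alleles at a locus so (a, b) and (b, a) compare equal."""
    return tuple(sorted(genotype, key=float))


def discrepancies(crime_scene, database_entry):
    """Return the loci at which two profiles differ, in sorted order."""
    loci = sorted(set(crime_scene) | set(database_entry))
    return [
        locus for locus in loci
        if locus not in crime_scene
        or locus not in database_entry
        or normalise(crime_scene[locus]) != normalise(database_entry[locus])
    ]


# Hypothetical three-locus profiles (a real comparison would cover all loci).
crime = {"locusA": ("15", "16"), "locusB": ("18", "21"), "locusC": ("9", "9.3")}
entry = {"locusA": ("16", "15"), "locusB": ("18.2", "21"), "locusC": ("9", "9.3")}

mismatched = discrepancies(crime, entry)
```

In this example only locusB is reported, which is the situation raised in the question: everything agrees except a single 18 versus 18.2 allele, so the locus would be re-tested before any conclusion is drawn.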

Email Paul Nutteing by removing 4 of the 5 dots
or email Paul Nutteing ,remove all but one dot
Or a message on usenet group uk.legal has got to me recently a couple of times.
A simulation of a large DNA profile database
A simulation of DNA profile 'families'
A simulation of DNA profile families with consanguinity
A simulation of DNA profile 'families' for 6 generations
dnas.htm revisited with all alleles represented
dnas.htm revisited for >8 percent allele frequency subset (similar ancestry )
Simulation of Taiwanese Tao and Rukai populations to explore the effect of within and without ancestral clusters
Basques autochthonous DNA profiles simulation, 9 loci
Australian Capital Caucasian 9 loci simulation
Australian Capital Caucasian 9 loci simulation, >= 5% allele frequency
CODIS, 13 Loci Caucasian Simulation
Automating the macros
Other match scenarios
'Peer review' of some of the dnapr.htm material
Continuation of sparring on http://groups.yahoo.com/group/forensic-science/messages with forensic 'scientists' using my pseudonym of Nona Revers or nonarevers
'Peer review' of some of the dnapr.htm material , part3

Background