DNA profiling (also called DNA fingerprinting) is the process of determining an individual's DNA characteristics. DNA analysis intended to identify a species, rather than an individual, is called DNA barcoding.
DNA profiling is a forensic technique in criminal investigations, comparing criminal suspects' profiles to DNA evidence so as to assess the likelihood of their involvement in the crime. It is also used in parentage testing, to establish immigration eligibility, and in genealogical and medical research. DNA profiling has also been used in the study of animal and plant populations in the fields of zoology, botany, and agriculture.
Starting in the 1980s scientific advances allowed the use of DNA as a material for the identification of an individual. The first patent covering the direct use of DNA variation for forensics was filed by Jeffrey Glassberg in 1983, based upon work he had done while at Rockefeller University in 1981. In the United Kingdom, Geneticist Sir Alec Jeffreys independently developed a DNA profiling process beginning in late 1984 while working in the Department of Genetics at the University of Leicester.
The process, developed by Jeffreys in conjunction with Peter Gill and Dave Werrett of the Forensic Science Service (FSS), was first used forensically in the solving of the murder of two teenagers who had been raped and murdered in Narborough, Leicestershire in 1983 and 1986. In the murder inquiry, led by Detective David Baker, the DNA contained within blood samples obtained voluntarily from around 5,000 local men who willingly assisted Leicestershire Constabulary with the investigation, resulted in the exoneration of a man who had confessed to one of the crimes, and the subsequent conviction of Colin Pitchfork. Pitchfork, a local bakery employee, had coerced his coworker Ian Kelly to stand in for him when providing a blood sample; Kelly then used a forged passport to impersonate Pitchfork. Another coworker reported the deception to the police. Pitchfork was arrested, and his blood was sent to Jeffrey's lab for processing and profile development. Pitchfork's profile matched that of DNA left by the murderer which confirmed Pitchfork's presence at both crime scenes; he pleaded guilty to both murders.
Although 99.9% of human DNA sequences are the same in every person, enough of the DNA is different that it is possible to distinguish one individual from another, unless they are monozygotic (identical) twins. DNA profiling uses repetitive sequences that are highly variable, called variable number tandem repeats (VNTRs), in particular short tandem repeats (STRs), also known as microsatellites, and minisatellites. VNTR loci are similar between closely related individuals, but are so variable that unrelated individuals are unlikely to have the same VNTRs.
The process, developed by Glassberg and independently by Jeffreys, begins with a sample of an individual's DNA (typically called a "reference sample"). Reference samples are usually collected through a buccal swab. When this is unavailable (for example, when a court order is needed but unobtainable) other methods may be needed to collect a sample of blood, saliva, semen, vaginal lubrication, or other fluid or tissue from personal use items (for example, a toothbrush, razor) or from stored samples (for example, banked sperm or biopsy tissue). Samples obtained from blood relatives can indicate an individual's profile, as could previous profiled human remains. A reference sample is then analyzed to create the individual's DNA profile using one of the techniques discussed below. The DNA profile is then compared against another sample to determine whether there is a genetic match.
When a sample such as blood or saliva is obtained, the DNA is only a small part of what is present in the sample. Before the DNA can be analyzed, it must be extracted from the cells and purified. There are many ways this can be accomplished, but all methods follow the same basic procedure. The cell and nuclear membranes need to be broken up to allow the DNA to be free in solution. Once the DNA is free, it can be separated from all other cellular components. After the DNA has been separated in solution, the remaining cellular debris can then be removed from the solution and discarded, leaving only DNA. The most common methods of DNA extraction include organic extraction (also called phenol chloroform extraction), Chelex extraction, and solid phase extraction. Differential extraction is a modified version of extraction in which DNA from two different types of cells can be separated from each other before being purified from the solution. Each method of extraction works well in the laboratory, but analysts typically selects their preferred method based on factors such as the cost, the time involved, the quantity of DNA yielded, and the quality of DNA yielded. After the DNA is extracted from the sample, it can be analyzed, whether it is by RFLP analysis or quantification and PCR analysis.
The first methods for finding out genetics used for DNA profiling involved RFLP analysis. DNA is collected from cells and cut into small pieces using a restriction enzyme (a restriction digest). This generates DNA fragments of differing sizes as a consequence of variations between DNA sequences of different individuals. The fragments are then separated on the basis of size using gel electrophoresis.
The separated fragments are then transferred on to a nitrocellulose or nylon filter; this procedure is called a Southern blot. The DNA fragments within the blot are permanently fixed to the filter, and the DNA strands are denatured. Radiolabeled probe molecules are then added that are complementary to sequences in the genome that contain repeat sequences. These repeat sequences tend to vary in length among different individuals and are called variable number tandem repeat sequences or VNTRs. The probe molecules hybridize to DNA fragments containing the repeat sequences and excess probe molecules are washed away. The blot is then exposed to an X-ray film. Fragments of DNA that have bound to the probe molecules appear as fluorescent bands on the film.
The Southern blot technique requires large amounts of non-degraded sample DNA. Also, Alec Jeffrey's original multilocus RFLP technique looked at many minisatellite loci at the same time, increasing the observed variability, but making it hard to discern individual alleles (and thereby precluding paternity testing). These early techniques have been supplanted by PCR-based assays.
Developed by Kary Mullis in 1983, a process was reported by which specific portions of the sample DNA can be amplified almost indefinitely (Saiki et al. 1985, 1985). The process, polymerase chain reaction (PCR), mimics the biological process of DNA replication, but confines it to specific DNA sequences of interest. With the invention of the PCR technique, DNA profiling took huge strides forward in both discriminating power and the ability to recover information from very small (or degraded) starting samples.
PCR greatly amplifies the amounts of a specific region of DNA. In the PCR process, the DNA sample is denatured into the separate individual polynucleotide strands through heating. Two oligonucleotide DNA primers are used to hybridize to two corresponding nearby sites on opposite DNA strands in such a fashion that the normal enzymatic extension of the active terminal of each primer (that is, the 3' end) leads toward the other primer. PCR uses replication enzymes that are tolerant of high temperatures, such as the thermostable Taq polymerase. In this fashion, two new copies of the sequence of interest are generated. Repeated denaturation, hybridization, and extension in this fashion produce an exponentially growing number of copies of the DNA of interest. Instruments that perform thermal cycling are readily available from commercial sources. This process can produce a million-fold or greater amplification of the desired region in 2 hours or less.
Early assays such as the HLA-DQ alpha reverse dot blot strips grew to be very popular owing to their ease of use, and the speed with which a result could be obtained. However, they were not as discriminating as RFLP analysis. It was also difficult to determine a DNA profile for mixed samples, such as a vaginal swab from a sexual assault victim.
However, the PCR method was readily adaptable for analyzing VNTR, in particular STR loci. In recent years, research in human DNA quantitation has focused on new "real-time" quantitative PCR (qPCR) techniques. Quantitative PCR methods enable automated, precise, and high-throughput measurements. Inter-laboratory studies have demonstrated the importance of human DNA quantitation on achieving reliable interpretation of STR typing and obtaining consistent results across laboratories.
The system of DNA profiling used today is based on polymerase chain reaction (PCR) and uses simple sequences or short tandem repeats (STR). This method uses highly polymorphic regions that have short repeated sequences of DNA (the most common is 4 bases repeated, but there are other lengths in use, including 3 and 5 bases). Because unrelated people almost certainly have different numbers of repeat units, STRs can be used to discriminate between unrelated individuals. These STR loci (locations on a chromosome) are targeted with sequence-specific primers and amplified using PCR. The DNA fragments that result are then separated and detected using electrophoresis. There are two common methods of separation and detection, capillary electrophoresis (CE) and gel electrophoresis.
Each STR is polymorphic, but the number of alleles is very small. Typically each STR allele will be shared by around 5-20% of individuals. The power of STR analysis derives from inspecting multiple STR loci simultaneously. The pattern of alleles can identify an individual quite accurately. Thus STR analysis provides an excellent identification tool. The more STR regions that are tested in an individual the more discriminating the test becomes.
From country to country, different STR-based DNA-profiling systems are in use. In North America, systems that amplify the CODIS 20 core loci are almost universal, whereas in the United Kingdom the DNA-17 17 loci system (which is compatible with The National DNA Database) is in use, and Australia uses 18 core markers. Whichever system is used, many of the STR regions used are the same. These DNA-profiling systems are based on multiplex reactions, whereby many STR regions will be tested at the same time.
The true power of STR analysis is in its statistical power of discrimination. Because the 20 loci that are currently used for discrimination in CODIS are independently assorted (having a certain number of repeats at one locus does not change the likelihood of having any number of repeats at any other locus), the product rule for probabilities can be applied. This means that, if someone has the DNA type of ABC, where the three loci were independent, then the probability of that individual having that DNA type is the probability of having type A times the probability of having type B times the probability of having type C. This has resulted in the ability to generate match probabilities of 1 in a quintillion (1x1018) or more. However, DNA database searches showed much more frequent than expected false DNA profile matches. Moreover, since there are about 12 million monozygotic twins on Earth, the theoretical probability is not accurate.
In practice, the risk of contaminated-matching is much greater than matching a distant relative, such as contamination of a sample from nearby objects, or from left-over cells transferred from a prior test. The risk is greater for matching the most common person in the samples: Everything collected from, or in contact with, a victim is a major source of contamination for any other samples brought into a lab. For that reason, multiple control-samples are typically tested in order to ensure that they stayed clean, when prepared during the same period as the actual test samples. Unexpected matches (or variations) in several control-samples indicates a high probability of contamination for the actual test samples. In a relationship test, the full DNA profiles should differ (except for twins), to prove that a person was not actually matched as being related to their own DNA in another sample.
Another technique, AFLP, or amplified fragment length polymorphism was also put into practice during the early 1990s. This technique was also faster than RFLP analysis and used PCR to amplify DNA samples. It relied on variable number tandem repeat (VNTR) polymorphisms to distinguish various alleles, which were separated on a polyacrylamide gel using an allelic ladder (as opposed to a molecular weight ladder). Bands could be visualized by silver staining the gel. One popular focus for fingerprinting was the D1S80 locus. As with all PCR based methods, highly degraded DNA or very small amounts of DNA may cause allelic dropout (causing a mistake in thinking a heterozygote is a homozygote) or other stochastic effects. In addition, because the analysis is done on a gel, very high number repeats may bunch together at the top of the gel, making it difficult to resolve. AmpFLP analysis can be highly automated, and allows for easy creation of phylogenetic trees based on comparing individual samples of DNA. Due to its relatively low cost and ease of set-up and operation, AmpFLP remains popular in lower income countries.
During conception, the father's sperm cell and the mother's egg cell, each containing half the amount of DNA found in other body cells, meet and fuse to form a fertilized egg, called a zygote. The zygote contains a complete set of DNA molecules, a unique combination of DNA from both parents. This zygote divides and multiplies into an embryo and later, a full human being.
At each stage of development, all the cells forming the body contain the same DNA--half from the father and half from the mother. This fact allows the relationship testing to use all types of all samples including loose cells from the cheeks collected using buccal swabs, blood or other types of samples.
There are predictable inheritance patterns at certain locations (called loci) in the human genome, which have been found to be useful in determining identity and biological relationships. These loci contain specific DNA markers that scientists use to identify individuals. In a routine DNA paternity test, the markers used are short tandem repeats (STRs), short pieces of DNA that occur in highly differential repeat patterns among individuals.
Each person's DNA contains two copies of these markers--one copy inherited from the father and one from the mother. Within a population, the markers at each person's DNA location could differ in length and sometimes sequence, depending on the markers inherited from the parents.
The combination of marker sizes found in each person makes up their unique genetic profile. When determining the relationship between two individuals, their genetic profiles are compared to see if they share the same inheritance patterns at a statistically conclusive rate.
For example, the following sample report from this commercial DNA paternity testing laboratory Universal Genetics signifies how relatedness between parents and child is identified on those special markers:
|DNA marker||Mother||Child||Alleged father|
|D21S11||28, 30||28, 31.2||29, 31.2|
|D7S820||9, 10||10, 11||11, 12|
|TH01||6, 9.3||9, 9.3||8, 9|
|D13S317||10, 12||12, 13||11, 13|
|D19S433||14, 16.2||14, 15||14.2, 15|
The partial results indicate that the child and the alleged father's DNA match among these five markers. The complete test results show this correlation on 16 markers between the child and the tested man to enable a conclusion to be drawn as to whether or not the man is the biological father.
Each marker is assigned with a Paternity Index (PI), which is a statistical measure of how powerfully a match at a particular marker indicates paternity. The PI of each marker is multiplied with each other to generate the Combined Paternity Index (CPI), which indicates the overall probability of an individual being the biological father of the tested child relative to a randomly selected man from the entire population of the same race. The CPI is then converted into a Probability of Paternity showing the degree of relatedness between the alleged father and child.
The DNA test report in other family relationship tests, such as grandparentage and siblingship tests, is similar to a paternity test report. Instead of the Combined Paternity Index, a different value, such as a Siblingship Index, is reported.
The report shows the genetic profiles of each tested person. If there are markers shared among the tested individuals, the probability of biological relationship is calculated to determine how likely the tested individuals share the same markers due to a blood relationship.
Recent innovations have included the creation of primers targeting polymorphic regions on the Y-chromosome (Y-STR), which allows resolution of a mixed DNA sample from a male and female or cases in which a differential extraction is not possible. Y-chromosomes are paternally inherited, so Y-STR analysis can help in the identification of paternally related males. Y-STR analysis was performed in the Jefferson-Hemings controversy to determine if Thomas Jefferson had sired a son with one of his slaves.
The analysis of the Y-chromosome yields weaker results than autosomal chromosome analysis with regard to individual identification. The Y male sex-determining chromosome, as it is inherited only by males from their fathers, is almost identical along the paternal line. On the other hand, the Y-STR haplotype provides powerful genealogical information as a patrilinear relationship can be traced back over many generations.
Furthermore, due to the paternal inheritance, Y-haplotypes provide information about the genetic ancestry of the male population. To investigate this population history, and to provide estimates for haplotype frequencies in criminal casework, the "Y haplotype reference database (YHRD)" has been created in 2000 as an online resource. It currently comprises more than 300,000 minimal (8 locus) haplotypes from world-wide populations.
For highly degraded samples, it is sometimes impossible to get a complete profile of the 13 CODIS STRs. In these situations, mitochondrial DNA (mtDNA) is sometimes typed due to there being many copies of mtDNA in a cell, while there may only be 1-2 copies of the nuclear DNA. Forensic scientists amplify the HV1 and HV2 regions of the mtDNA, and then sequence each region and compare single-nucleotide differences to a reference. Because mtDNA is maternally inherited, directly linked maternal relatives can be used as match references, such as one's maternal grandmother's daughter's son. In general, a difference of two or more nucleotides is considered to be an exclusion. Heteroplasmy and poly-C differences may throw off straight sequence comparisons, so some expertise on the part of the analyst is required. mtDNA is useful in determining clear identities, such as those of missing people when a maternally linked relative can be found. mtDNA testing was used in determining that Anna Anderson was not the Russian princess she had claimed to be, Anastasia Romanov.
When people think of DNA analysis they often think about shows like NCIS or CSI, which portray DNA samples coming into a lab and then instantly analyzed, followed by pulling up a picture of the suspect within minutes. The true reality, however, is quite different and perfect DNA samples are often not collected from the scene of a crime. Homicide victims are frequently left exposed to harsh conditions before they are found and objects used to commit crimes have often been handled by more than one person. The two most prevalent issues that forensic scientists encounter when analyzing DNA samples are degraded samples and DNA mixtures.
In the real world DNA labs often have to deal with DNA samples that are less than ideal. DNA samples taken from crime scenes are often degraded, which means that the DNA has started to break down into smaller fragments. Victims of homicides might not be discovered right away, and in the case of a mass casualty event it could be hard to get DNA samples before the DNA has been exposed to degradation elements.
Degradation or fragmentation of DNA at crime scenes can occur because of a number of reasons, with environmental exposure often being the most common cause. Biological samples that have been exposed to the environment can get degraded by water and enzymes called nucleases. Nucleases essentially 'chew' up the DNA into fragments over time and are found everywhere in nature.
Before modern PCR methods existed it was almost impossible to analyze degraded DNA samples. Methods like restriction fragment length polymorphism or RFLP Restriction fragment length polymorphism, which was the first technique used for DNA analysis in forensic science, required high molecular weight DNA in the sample in order to get reliable data. High molecular weight DNA however is something that is lacking in degraded samples, as the DNA is too fragmented to accurately carry out RFLP. It wasn't until modern day PCR techniques were invented that analysis of degraded DNA samples were able to be carried out Polymerase chain reaction. Multiplex PCR in particular made it possible to isolate and amplify the small fragments of DNA still left in degraded samples. When multiplex PCR methods are compared to the older methods like RFLP a vast difference can be seen. Multiplex PCR can theoretically amplify less than 1 ng of DNA, while RFLP had to have a least 100 ng of DNA in order to carry out an analysis.
In terms of a forensic approach to a degraded DNA sample, STR loci STR analysis are often amplified using PCR-based methods. Though STR loci are amplified with greater probability of success with degraded DNA, there is still the possibility that larger STR loci will fail to amplify, and therefore, would likely yield a partial profile, which results in reduced statistical weight of association in the event of a match.
In instances where DNA samples are degraded, like in the case of intense fires or if all that remains are bone fragments, standard STR testing on these samples can be inadequate. When standard STR testing is done on highly degraded samples the larger STR loci often drop out, and only partial DNA profiles are obtained. While partial DNA profiles can be a powerful tool, the random match probabilities will be larger than if a full profile was obtained. One method that has been developed in order to analyse degraded DNA samples is to use miniSTR technology. In this new approach, primers are specially designed to bind closer to the STR region. In normal STR testing the primers will bind to longer sequences that contain the STR region within the segment. MiniSTR analysis however will just target the STR location, and this results in a DNA product that is much smaller.
By placing the primers closer to the actual STR regions, there is a higher chance that successful amplification of this region will occur. Successful amplification of these STR regions can now occur and more complete DNA profiles can be obtained. The success that smaller PCR products produce a higher success rate with highly degraded samples was first reported in 1995, when miniSTR technology was used to identify victims of the Waco fire. In this case the fire at destroyed the DNA samples so badly that normal STR testing did not result in a positive ID on some of the victims.
Mixtures are another common issue that forensic scientists face when they are analyzing unknown or questionable DNA samples. A mixture is defined as a DNA sample that contains two or more individual contributors. This can often occur when a DNA sample is swabbed from an item that is handled by more than one person or when a sample contains both the victim and assailants' DNA. The presence of more than one individual in a DNA sample can make it challenging to detect individual profiles, and interpretation of mixtures should only be done by highly trained individuals. Mixtures that contain two or three individuals can be interpreted, though it will be difficult. Mixtures that contain four or more individuals are much too convoluted to get individual profiles. One common scenario in which a mixture is often obtained is in the case of sexual assault. A sample may be collected that contains material from the victim, the victim's consensual sexual partners, and the perpetrator(s).
As detection methods in DNA profiling advance, forensic scientists are seeing more DNA samples that contain mixtures, as even the smallest contributor is now able to be detected by modern tests. The ease in which forensic scientists have in interpenetrating DNA mixtures largely depends on the ratio of DNA present from each individual, the genotype combinations, and total amount of DNA amplified. The DNA ratio is often the most important aspect to look at in determining whether a mixture can be interpreted. For example, in the case where a DNA sample had two contributors, it would be easy to interpret individual profiles if the ratio of DNA contributed by one person was much higher than the second person. When a sample has three or more contributors, it becomes extremely difficult to determine individual profiles. Fortunately, advancements in probabilistic genotyping could make this sort of determination possible in the future. Probabilistic genotyping uses complex computer software to run through thousands of mathematical computations in order to produce statistical likelihoods of individual genotypes found in a mixture. Probabilistic genotyping software that are often used in labs today include STRmix and TrueAllele.
An early application of a DNA database was the compilation of a Mitochondrial DNA Concordance, prepared by Kevin W. P. Miller and John L. Dawson at the University of Cambridge from 1996 to 1999 from data collected as part of Miller's PhD thesis. There are now several DNA databases in existence around the world. Some are private, but most of the largest databases are government-controlled. The United States maintains the largest DNA database, with the Combined DNA Index System (CODIS) holding over 13 million records as of May 2018. The United Kingdom maintains the National DNA Database (NDNAD), which is of similar size, despite the UK's smaller population. The size of this database, and its rate of growth, are giving concern to civil liberties groups in the UK, where police have wide-ranging powers to take samples and retain them even in the event of acquittal. The Conservative-Liberal Democrat coalition partially addressed these concerns with part 1 of the Protection of Freedoms Act 2012, under which DNA samples must be deleted if suspects are acquitted or not charged, except in relation to certain (mostly serious and/or sexual) offenses. Public discourse around the introduction of advanced forensic techniques (such as genetic genealogy using public genealogy databases and DNA phenotyping approaches) has been limited, disjointed, unfocused, and raises issues of privacy and consent that may warrant the establishment of additional legal protections.
The U.S. Patriot Act of the United States provides a means for the U.S. government to get DNA samples from suspected terrorists. DNA information from crimes is collected and deposited into the CODIS database, which is maintained by the FBI. CODIS enables law enforcement officials to test DNA samples from crimes for matches within the database, providing a means of finding specific biological profiles associated with collected DNA evidence.
When a match is made from a national DNA databank to link a crime scene to an offender having provided a DNA sample to a database, that link is often referred to as a cold hit. A cold hit is of value in referring the police agency to a specific suspect but is of less evidential value than a DNA match made from outside the DNA Databank.
FBI agents cannot legally store DNA of a person not convicted of a crime. DNA collected from a suspect not later convicted must be disposed of and not entered into the database. In 1998, a man residing in the UK was arrested on accusation of burglary. His DNA was taken and tested, and he was later released. Nine months later, this man's DNA was accidentally and illegally entered in the DNA database. New DNA is automatically compared to the DNA found at cold cases and, in this case, this man was found to be a match to DNA found at a rape and assault case one year earlier. The government then prosecuted him for these crimes. During the trial the DNA match was requested to be removed from the evidence because it had been illegally entered into the database. The request was carried out. The DNA of the perpetrator, collected from victims of rape, can be stored for years until a match is found. In 2014, to address this problem, Congress extended a bill that helps states deal with "a backlog" of evidence.
As DNA profiling became a key piece of evidence in the court, defense lawyers based their arguments on statistical reasoning. For example: Given a match that had a 1 in 5 million probability of occurring by chance, the lawyer would argue that this meant that in a country of say 60 million people there were 12 people who would also match the profile. This was then translated to a 1 in 12 chance of the suspect's being the guilty one. This argument is not sound unless the suspect was drawn at random from the population of the country. In fact, a jury should consider how likely it is that an individual matching the genetic profile would also have been a suspect in the case for other reasons. Also, different DNA analysis processes can reduce the amount of DNA recovery if the procedures are not properly done. Therefore, the number of times a piece of evidence is sampled can diminish the DNA collection efficiency. Another spurious statistical argument is based on the false assumption that a 1 in 5 million probability of a match automatically translates into a 1 in 5 million probability of innocence and is known as the prosecutor's fallacy.
When using RFLP, the theoretical risk of a coincidental match is 1 in 100 billion (100,000,000,000), although the practical risk is actually 1 in 1000 because monozygotic twins are 0.2% of the human population. Moreover, the rate of laboratory error is almost certainly higher than this, and often actual laboratory procedures do not reflect the theory under which the coincidence probabilities were computed. For example, the coincidence probabilities may be calculated based on the probabilities that markers in two samples have bands in precisely the same location, but a laboratory worker may conclude that similar--but not precisely identical--band patterns result from identical genetic samples with some imperfection in the agarose gel. However, in this case, the laboratory worker increases the coincidence risk by expanding the criteria for declaring a match. Recent studies have quoted relatively high error rates, which may be cause for concern. In the early days of genetic fingerprinting, the necessary population data to accurately compute a match probability was sometimes unavailable. Between 1992 and 1996, arbitrary low ceilings were controversially put on match probabilities used in RFLP analysis rather than the higher theoretically computed ones. Today, RFLP has become widely disused due to the advent of more discriminating, sensitive and easier technologies.
Since 1998, the DNA profiling system supported by The National DNA Database in the UK is the SGM+ DNA profiling system that includes 10 STR regions and a sex-indicating test. STRs do not suffer from such subjectivity and provide similar power of discrimination (1 in 1013 for unrelated individuals if using a full SGM+ profile). Figures of this magnitude are not considered to be statistically supportable by scientists in the UK; for unrelated individuals with full matching DNA profiles a match probability of 1 in a billion is considered statistically supportable. However, with any DNA technique, the cautious juror should not convict on genetic fingerprint evidence alone if other factors raise doubt. Contamination with other evidence (secondary transfer) is a key source of incorrect DNA profiles and raising doubts as to whether a sample has been adulterated is a favorite defense technique. More rarely, chimerism is one such instance where the lack of a genetic match may unfairly exclude a suspect.
It is possible to use DNA profiling as evidence of genetic relationship, although such evidence varies in strength from weak to positive. Testing that shows no relationship is absolutely certain. Further, while almost all individuals have a single and distinct set of genes, ultra-rare individuals, known as "chimeras", have at least two different sets of genes. There have been two cases of DNA profiling that falsely suggested that a mother was unrelated to her children. This happens when two eggs are fertilized at the same time and fuse together to create one individual instead of twins.
In one case, a criminal planted fake DNA evidence in his own body: John Schneeberger raped one of his sedated patients in 1992 and left semen on her underwear. Police drew what they believed to be Schneeberger's blood and compared its DNA against the crime scene semen DNA on three occasions, never showing a match. It turned out that he had surgically inserted a Penrose drain into his arm and filled it with foreign blood and anticoagulants.
The functional analysis of genes and their coding sequences (open reading frames [ORFs]) typically requires that each ORF be expressed, the encoded protein purified, antibodies produced, phenotypes examined, intracellular localization determined, and interactions with other proteins sought. In a study conducted by the life science company Nucleix and published in the journal Forensic Science International, scientists found that an in vitro synthesized sample of DNA matching any desired genetic profile can be constructed using standard molecular biology techniques without obtaining any actual tissue from that person. Nucleix claims they can also prove the difference between non-altered DNA and any that was synthesized.
In the case of the Phantom of Heilbronn, police detectives found DNA traces from the same woman on various crime scenes in Austria, Germany, and France--among them murders, burglaries and robberies. Only after the DNA of the "woman" matched the DNA sampled from the burned body of a male asylum seeker in France did detectives begin to have serious doubts about the DNA evidence. It was eventually discovered that DNA traces were already present on the cotton swabs used to collect the samples at the crime scene, and the swabs had all been produced at the same factory in Austria. The company's product specification said that the swabs were guaranteed to be sterile, but not DNA-free.
Familial DNA searching (sometimes referred to as "familial DNA" or "familial DNA database searching") is the practice of creating new investigative leads in cases where DNA evidence found at the scene of a crime (forensic profile) strongly resembles that of an existing DNA profile (offender profile) in a state DNA database but there is not an exact match. After all other leads have been exhausted, investigators may use specially developed software to compare the forensic profile to all profiles taken from a state's DNA database to generate a list of those offenders already in the database who are most likely to be a very close relative of the individual whose DNA is in the forensic profile. To eliminate the majority of this list when the forensic DNA is a man's, crime lab technicians conduct Y-STR analysis. Using standard investigative techniques, authorities are then able to build a family tree. The family tree is populated from information gathered from public records and criminal justice records. Investigators rule out family members' involvement in the crime by finding excluding factors such as sex, living out of state or being incarcerated when the crime was committed. They may also use other leads from the case, such as witness or victim statements, to identify a suspect. Once a suspect has been identified, investigators seek to legally obtain a DNA sample from the suspect. This suspect DNA profile is then compared to the sample found at the crime scene to definitively identify the suspect as the source of the crime scene DNA.
Familial DNA database searching was first used in an investigation leading to the conviction of Jeffrey Gafoor of the murder of Lynette White in the United Kingdom on 4 July 2003. DNA evidence was matched to Gafoor's nephew, who at 14 years old had not been born at the time of the murder in 1988. It was used again in 2004 to find a man who threw a brick from a motorway bridge and hit a lorry driver, killing him. DNA found on the brick matched that found at the scene of a car theft earlier in the day, but there were no good matches on the national DNA database. A wider search found a partial match to an individual; on being questioned, this man revealed he had a brother, Craig Harman, who lived very close to the original crime scene. Harman voluntarily submitted a DNA sample, and confessed when it matched the sample from the brick. Currently, familial DNA database searching is not conducted on a national level in the United States, where states determine how and when to conduct familial searches. The first familial DNA search with a subsequent conviction in the United States was conducted in Denver, Colorado, in 2008, using software developed under the leadership of Denver District Attorney Mitch Morrissey and Denver Police Department Crime Lab Director Gregg LaBerge. California was the first state to implement a policy for familial searching under then Attorney General, now Governor, Jerry Brown. In his role as consultant to the Familial Search Working Group of the California Department of Justice, former Alameda County Prosecutor Rock Harmon is widely considered to have been the catalyst in the adoption of familial search technology in California. The technique was used to catch the Los Angeles serial killer known as the "Grim Sleeper" in 2010. It wasn't a witness or informant that tipped off law enforcement to the identity of the "Grim Sleeper" serial killer, who had eluded police for more than two decades, but DNA from the suspect's own son. The suspect's son had been arrested and convicted in a felony weapons charge and swabbed for DNA the year before. When his DNA was entered into the database of convicted felons, detectives were alerted to a partial match to evidence found at the "Grim Sleeper" crime scenes. David Franklin Jr., also known as the Grim Sleeper, was charged with ten counts of murder and one count of attempted murder. More recently, familial DNA led to the arrest of 21-year-old Elvis Garcia on charges of sexual assault and false imprisonment of a woman in Santa Cruz in 2008. In March 2011 Virginia Governor Bob McDonnell announced that Virginia would begin using familial DNA searches. Other states are expected to follow.
At a press conference in Virginia on March 7, 2011, regarding the East Coast Rapist, Prince William County prosecutor Paul Ebert and Fairfax County Police Detective John Kelly said the case would have been solved years ago if Virginia had used familial DNA searching. Aaron Thomas, the suspected East Coast Rapist, was arrested in connection with the rape of 17 women from Virginia to Rhode Island, but familial DNA was not used in the case.
Critics of familial DNA database searches argue that the technique is an invasion of an individual's 4th Amendment rights. Privacy advocates are petitioning for DNA database restrictions, arguing that the only fair way to search for possible DNA matches to relatives of offenders or arrestees would be to have a population-wide DNA database. Some scholars have pointed out that the privacy concerns surrounding familial searching are similar in some respects to other police search techniques, and most have concluded that the practice is constitutional. The Ninth Circuit Court of Appeals in United States v. Pool (vacated as moot) suggested that this practice is somewhat analogous to a witness looking at a photograph of one person and stating that it looked like the perpetrator, which leads law enforcement to show the witness photos of similar looking individuals, one of whom is identified as the perpetrator. Regardless of whether familial DNA searching was the method used to identify the suspect, authorities always conduct a normal DNA test to match the suspect's DNA with that of the DNA left at the crime scene.
Critics also claim that racial profiling could occur on account of familial DNA testing. In the United States, the conviction rates of racial minorities are much higher than that of the overall population. It is unclear whether this is due to discrimination from police officers and the courts, as opposed to a simple higher rate of offence among minorities. Arrest-based databases, which are found in the majority of the United States, lead to an even greater level of racial discrimination. An arrest, as opposed to conviction, relies much more heavily on police discretion.
For instance, investigators with Denver District Attorney's Office successfully identified a suspect in a property theft case using a familial DNA search. In this example, the suspect's blood left at the scene of the crime strongly resembled that of a current Colorado Department of Corrections prisoner. Using publicly available records, the investigators created a family tree. They then eliminated all the family members who were incarcerated at the time of the offense, as well as all of the females (the crime scene DNA profile was that of a male). Investigators obtained a court order to collect the suspect's DNA, but the suspect actually volunteered to come to a police station and give a DNA sample. After providing the sample, the suspect walked free without further interrogation or detainment. Later confronted with an exact match to the forensic profile, the suspect pleaded guilty to criminal trespass at the first court date and was sentenced to two years probation.
In Italy a familiar DNA search has been done to solve the case of the murder of Yara Gambirasio whose body was found in the bush[clarification needed] three months after her disappearance. A DNA trace was found on the underwear of the murdered teenage near and a DNA sample was requested from a person who lived near the municipality of Brembate di Sopra and a common male ancestor was found in the DNA sample of a young man not involved in the murder. After a long investigation the father of the supposed killer was identified as Giuseppe Guerinoni, a deceased man, but his two sons born from his wife were not related to the DNA samples found on the body of Yara. After three and a half years the DNA found on the underwear of the deceased girl was matched with Massimo Giuseppe Bossetti who was arrested and accused of the murder of the 13-year-old girl. In the summer of 2016 Bossetti was found guilty and sentenced to life by the Corte d'assise of Bergamo.
Partial DNA matches are the result of moderate stringency CODIS searches that produce a potential match that shares at least one allele at every locus. Partial matching does not involve the use of familial search software, such as those used in the UK and United States, or additional Y-STR analysis, and therefore often misses sibling relationships. Partial matching has been used to identify suspects in several cases in the UK and United States, and has also been used as a tool to exonerate the falsely accused. Darryl Hunt was wrongly convicted in connection with the rape and murder of a young woman in 1984 in North Carolina. Hunt was exonerated in 2004 when a DNA database search produced a remarkably close match between a convicted felon and the forensic profile from the case. The partial match led investigators to the felon's brother, Willard E. Brown, who confessed to the crime when confronted by police. A judge then signed an order to dismiss the case against Hunt. In Italy, partial matching has been used in the controversial murder of Yara Gambirasio, a child found dead about a month after her presumed kidnapping. In this case, the partial match has been used as the only incriminating element against the defendant, Massimo Bossetti, who has been subsequently condemned for the murder (waiting appeal by the Italian Supreme Court).
In the United States, it has been accepted, courts often ruling that there is no expectation of privacy, citing California v. Greenwood (1988), in which the Supreme Court held that the Fourth Amendment does not prohibit the warrantless search and seizure of garbage left for collection outside the curtilage of a home. Critics of this practice underline that this analogy ignores that "most people have no idea that they risk surrendering their genetic identity to the police by, for instance, failing to destroy a used coffee cup. Moreover, even if they do realize it, there is no way to avoid abandoning one's DNA in public."
In the UK, the Human Tissue Act 2004 prohibits private individuals from covertly collecting biological samples (hair, fingernails, etc.) for DNA analysis, but exempts medical and criminal investigations from the prohibition.
Evidence from an expert who has compared DNA samples must be accompanied by evidence as to the sources of the samples and the procedures for obtaining the DNA profiles. The judge must ensure that the jury must understand the significance of DNA matches and mismatches in the profiles. The judge must also ensure that the jury does not confuse the match probability (the probability that a person that is chosen at random has a matching DNA profile to the sample from the scene) with the probability that a person with matching DNA committed the crime. In 1996 R v. Doheny Phillips LJ gave this example of a summing up, which should be carefully tailored to the particular facts in each case:
Members of the Jury, if you accept the scientific evidence called by the Crown, this indicates that there are probably only four or five white males in the United Kingdom from whom that semen stain could have come. The Defendant is one of them. If that is the position, the decision you have to reach, on all the evidence, is whether you are sure that it was the Defendant who left that stain or whether it is possible that it was one of that other small group of men who share the same DNA characteristics.
Juries should weigh up conflicting and corroborative evidence, using their own common sense and not by using mathematical formulae, such as Bayes' theorem, so as to avoid "confusion, misunderstanding and misjudgment".
In R v Bates, Moore-Bick LJ said:
We can see no reason why partial profile DNA evidence should not be admissible provided that the jury are made aware of its inherent limitations and are given a sufficient explanation to enable them to evaluate it. There may be cases where the match probability in relation to all the samples tested is so great that the judge would consider its probative value to be minimal and decide to exclude the evidence in the exercise of his discretion, but this gives rise to no new question of principle and can be left for decision on a case by case basis. However, the fact that there exists in the case of all partial profile evidence the possibility that a "missing" allele might exculpate the accused altogether does not provide sufficient grounds for rejecting such evidence. In many there is a possibility (at least in theory) that evidence that would assist the accused and perhaps even exculpate him altogether exists, but that does not provide grounds for excluding relevant evidence that is available and otherwise admissible, though it does make it important to ensure that the jury are given sufficient information to enable them to evaluate that evidence properly.
There are state laws on DNA profiling in all 50 states of the United States. Detailed information on database laws in each state can be found at the National Conference of State Legislatures website.
In August 2009, scientists in Israel raised serious doubts concerning the use of DNA by law enforcement as the ultimate method of identification. In a paper published in the journal Forensic Science International: Genetics, the Israeli researchers demonstrated that it is possible to manufacture DNA in a laboratory, thus falsifying DNA evidence. The scientists fabricated saliva and blood samples, which originally contained DNA from a person other than the supposed donor of the blood and saliva.
The researchers also showed that, using a DNA database, it is possible to take information from a profile and manufacture DNA to match it, and that this can be done without access to any actual DNA from the person whose DNA they are duplicating. The synthetic DNA oligos required for the procedure are common in molecular laboratories.
The New York Times quoted the lead author, Daniel Frumkin, saying, "You can just engineer a crime scene ... any biology undergraduate could perform this". Frumkin perfected a test that can differentiate real DNA samples from fake ones. His test detects epigenetic modifications, in particular, DNA methylation. Seventy percent of the DNA in any human genome is methylated, meaning it contains methyl group modifications within a CpG dinucleotide context. Methylation at the promoter region is associated with gene silencing. The synthetic DNA lacks this epigenetic modification, which allows the test to distinguish manufactured DNA from genuine DNA.
It is unknown how many police departments, if any, currently use the test. No police lab has publicly announced that it is using the new test to verify DNA results.
DNA testing is used to establish the right of succession to British titles.