
Using Imaging to Identify Deceit: Scientific and Ethical Questions

Chapter 7: Neuroscience-Based Lie Detection: The Need for Regulation

Authors
Emilio Bizzi, Steven E. Hyman, Marcus E. Raichle, Nancy Kanwisher, Elizabeth Anya Phelps, Stephen J. Morse, Walter Sinnott-Armstrong, Jed S. Rakoff, and Henry T. Greely

Henry T. Greely

“I swear I didn’t do it.”
“The check is in the mail.”
“The article is almost done.”

In our lives and in our legal system we often are vitally interested in whether someone is telling us the truth. Over the years, humans have used reputation, body language, oaths, and even torture as lie detectors. In the twentieth century, polygraphs and truth serum made bids for widespread use. The twenty-first century is confronting yet another kind of lie detection, one based on neuroscience and particularly on functional magnetic resonance imaging (fMRI).

The possibility of effective lie detection raises a host of legal and ethical questions. Evidentiary rules on scientific evidence, on probative value compared with prejudicial effect, and, possibly, rules on character evidence would be brought into play. Constitutional issues would be raised under at least the Fourth, Fifth, and Sixth Amendments, as well as, perhaps, under a First Amendment claim about a protected freedom of thought. Four U.S. Supreme Court justices have already stated their view that even a perfectly effective lie detector should not be admissible in court because it would unduly infringe the province of the jury. And ethicist Paul Wolpe has argued that this kind of intervention raises an entirely novel, and deeply unsettling, ethical issue about privacy within one’s own skull.1

These issues are fascinating and the temptation is strong to pursue them, but we must not forget a crucial first question: does neuroscience-based lie detection work and, if so, how well? This question has taken on particular urgency as two, and possibly three, companies are already marketing fMRI-based lie detection services in the United States. The deeper implications of effective lie detection are important but may prove a dangerous distraction from the preliminary question of effectiveness. Their exploration may even lead some readers to infer that neuroscience-based lie detection is ready for use.

It is not. And nonresearch use of neuroscience-based lie detection should not be allowed until it has been proven safe and effective.2 This essay will review briefly the state of the science concerning fMRI-based lie detection; it will then describe the limited extent of regulation of this technology and will end by arguing for a premarket approval system for regulating neuroscience-based lie detection, similar to that used by the Food and Drug Administration to regulate new drugs.

NEUROSCIENCE-BASED LIE DETECTION: THE SCIENCE

Arguably all lie detection, like all human cognitive behavior, has its roots in neuroscience, but the term neuroscience-based lie detection describes newer methods of lie detection that try to detect deception based on information about activity in a subject’s brain.

The most common and commonly used lie detector, the polygraph, does not directly measure activity in the subject’s brain. From its invention around 1920, the polygraph has measured physiological indications that are associated with the mental state of anxiety: blood pressure, heart rate, breathing rate, and galvanic skin response (sweating). When a subject shows higher levels of these indicators, the polygraph examiner may infer that the subject is anxious and further that the subject is lying. Typically, the examiner asks a subject a series of yes or no questions while his physiological responses are being monitored by the device. The questions may include irrelevant questions and emotionally charged, “probable lie” control questions as well as relevant questions. An irrelevant question might be “Is today Tuesday?” A probable lie question would be “Have you ever stolen anything?” a question the subject might well be tempted to answer “no,” even though it is thought unlikely anyone could truthfully deny ever having stolen anything. Another approach, the so-called guilty knowledge test, asks the subject questions about, for example, a crime scene the subject denies having seen. A subject who shows a stronger physiological reaction to a correct statement about the crime scene than to an incorrect statement may be viewed as lying about his or her lack of knowledge.

The result of a polygraph examination combines the physiological results gathered by the machine with the examiner’s assessment of the subject to draw a conclusion about whether the subject answered particular questions honestly. The problems lie in the strength (or weakness) of the connection between the physiological responses and anxiety, on the one hand, and between anxiety and deception, on the other. Only if both connections are powerful can one argue that the physiological reactions are strong evidence of deception.

A 2003 National Academy of Sciences (NAS) report analyzed polygraphy in detail. The NAS found little rigorous scientific assessment of polygraph accuracy but concluded that in good settings it was substantially better than chance—and substantially less than perfect. The NAS further noted that subjects could take plausible countermeasures to lower the device’s accuracy even further. The NAS advised against the use of polygraphs for personnel screening.

Researchers are now working on at least five methods to produce a newer generation of lie detection devices, devices that measure aspects of the brain itself rather than the physiological responses associated with anxiety. One approach uses electroencephalography to look for a so-called P300 wave in the brain’s electrical activity. This signal, seen about 300 milliseconds after a stimulus, is said to be a sign that the person (or the person’s brain) recognizes the stimulus. A second method uses near-infrared laser spectroscopy to scatter a laser beam off the outer layers of the subject’s brain and then to correlate the resulting patterns with deception. Proponents of a third method, periorbital thermography, claim to be able to detect deception by measuring an increase in the subject’s temperature around the eyes, allegedly as a result of increased blood flow to the prefrontal regions. A fourth approach analyzes fleeting “facial micro-expressions” and is more like the polygraph in that it seeks assertedly involuntary and uncontrollable body reactions correlated with deception rather than looking at direct measures of aspects of the brain. That these new methods will work is far from clear. Almost no peer-reviewed literature exists for any of them.

The most advanced neuroscience-based method for lie detection is fMRI. As described in more detail by Marcus Raichle elsewhere in this volume, fMRI uses magnetism to measure the ratio of oxygenated to deoxygenated hemoglobin in particular areas of the brain, three-dimensional regions referred to as “voxels.” The Blood Oxygenation Level Dependence (BOLD) hypothesis holds that a higher ratio of oxygenated to deoxygenated blood in a particular voxel correlates to higher energy consumption in that region a few seconds earlier. An fMRI scan can document how these ratios change as subjects perceive, act, or think different things while being scanned. Then sophisticated statistical packages look at the changes in thousands of voxels to find correlations between these blood oxygen ratios and what was happening to the subject several seconds earlier.
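The voxel-wise analysis described above can be illustrated with a toy computation. Everything here is simulated and greatly simplified (a single on/off task regressor, a fixed lag, plain Pearson correlation); real fMRI analyses rely on dedicated statistical packages and far more elaborate modeling of the hemodynamic response.

```python
import numpy as np

# Toy sketch of voxel-wise fMRI analysis: correlate each voxel's simulated
# BOLD time series with a task on/off regressor, shifted to reflect the
# few-seconds delay between neural activity and the blood-oxygen response.
rng = np.random.default_rng(0)
n_scans = 120                        # time points (one per scan)
task = np.tile([0] * 10 + [1] * 10, 6)  # alternating rest/task blocks
delay = 3                            # assumed hemodynamic lag, in scans
regressor = np.roll(task, delay)     # shift the regressor by the lag

# Simulate 1,000 voxels; voxel 0 responds to the task, the rest are noise.
voxels = rng.normal(size=(1000, n_scans))
voxels[0] += 2.0 * regressor

# Pearson correlation between each voxel's time series and the regressor.
r = np.array([np.corrcoef(v, regressor)[0, 1] for v in voxels])

print("voxel 0 correlation:", round(float(r[0]), 2))
print("strongest voxel:", int(np.argmax(r)))
```

In this simulation, only the voxel carrying the planted signal correlates strongly with the task, which is the logic behind reporting that particular brain regions "activate" during a scanned behavior.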

Since the development of fMRI in the early 1990s, many thousands of peer-reviewed papers have been published using the technique to associate particular patterns of blood flow (and, hence, under the BOLD hypothesis, brain activity) with different mental activities. Some of these associations have been entirely plausible and have been adopted in neurosurgery; for example, using fMRI to locate precisely the location of a particular region of a patient’s brain in order to guide the surgeon’s scalpel. Other claimed connections are more surprising, like using fMRI to locate brain regions responsible for passionate, romantic love or for a nun’s feeling of mystical union with God. Still other published associations are about lie detection.

As of March 2007, at least twelve peer-reviewed articles from eight different laboratories had been published on fMRI-based lie detection.3 The different laboratories used different experimental designs (sometimes the same laboratory used different designs in different publications), but each claimed to find statistically significant correlations between deception and certain patterns of brain activity. This apparent scientific support for fMRI-based lie detection becomes much weaker on examination. This body of work has at least six major flaws.

First, almost none of the work has been replicated. One of the laboratories, Andrew Kozel’s, replicated at least one of its own studies, and two of Daniel Langleben’s published studies are quite similar, though not identical.4 None of the other laboratories has replicated, at least in the published literature, its own studies. More important, none of the studies has been replicated by other labs. Replication is always important in science; it is particularly important with a new and complex technology like fMRI, where anything from the details of the experimental design, the method of subject selection, or the technical aspects of the individual MRI machine on any particular day can make a great difference.

Second, although the studies all find associations between deception and activation or deactivation in some brain regions, they often disagree among themselves in what brain regions are associated with deception. This sometimes happens within one laboratory: Langleben’s first two studies differed substantially in what regions correlated with deception.5

Third, only three of the twelve studies dealt with predicting deceptiveness by individuals.6 The other studies concluded that on average particular regions in the (pooled) brains of the subjects were statistically significantly likely to be activated (high ratio of oxygenated to deoxygenated hemoglobin) or deactivated (low ratio) when the subjects were lying. These group averages tell you nothing useful about the individuals being tested. A group of National Football League place kickers and defensive linemen could, on average, weigh 200 pounds when no single individual was within 80 pounds of that amount. The lie-detection results are not likely to be that stark, but before we can assess whether the method might be useful, we have to know how accurate it is in detecting deception by individuals—its specificity (lack of false positives) and sensitivity (lack of false negatives) are crucial. Only one of the Kozel articles and two of the Langleben articles discuss the accuracy of individual results.
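The individual-level accuracy measures at issue here can be made concrete with a small sketch. The counts below are hypothetical, chosen only to show how sensitivity and specificity are computed and why both matter:

```python
# A minimal sketch of the individual-level accuracy measures the text calls
# for. The subject counts are hypothetical, not taken from any of the studies.

def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = lies correctly flagged; specificity = truths correctly cleared."""
    sensitivity = tp / (tp + fn)   # 1 - false-negative rate
    specificity = tn / (tn + fp)   # 1 - false-positive rate
    return sensitivity, specificity

# Hypothetical test of 20 lying and 20 truthful subjects:
# 18 lies detected (2 missed), 15 truths cleared (5 falsely flagged).
sens, spec = sensitivity_specificity(tp=18, fn=2, tn=15, fp=5)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A method can look impressive on sensitivity alone while its specificity, and thus its rate of falsely branding truthful people as liars, remains unacceptably low; group-average activation results report neither number.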

Next is the question of the individuals tested. The largest of these studies involved thirty-one subjects;7 more of them looked at ten to fifteen. The two Langleben studies that looked at individual results were based on four subjects. For the most part, the subjects were disconcertingly homogeneous—young, healthy, and almost all right-handed. Langleben’s studies, in particular, were limited to young, healthy, right-handed undergraduates at the University of Pennsylvania who were not using drugs. How well these results project to the rest of the population is unknown.

A fifth major problem is the artificiality of the experimental designs. People are recruited and give their informed consent to participate in a study of fMRI and deception. Typically, they are told to lie about something. In Langleben’s three studies they were told to lie when they saw a particular playing card projected on the screen inside the scanner. In Kozel’s work, perhaps the least artificial of the experiments, subjects were told to take either a ring or a watch from a room and then to say, in the scanner, that they had not taken either object. Note how different this is from a criminal suspect telling the police that he had not taken part in a drug deal, or, for that matter, from a dinner guest praising an overcooked dish. The experimental subjects are following orders to lie where nothing more rides on the outcome than (in some cases) a promised $50 bonus if they successfully deceive the researchers. We just do not know how well these methods would work in settings similar to those where lie detection would, in practice, be used.

Finally, and perhaps most worryingly, as with the polygraph, countermeasures could make fMRI-based lie detection ineffective against trained liars. And countermeasures are easy with fMRI. One can ruin a scan by movement of the head or sometimes of just the tongue. Or, more subtly, as the scanner is detecting patterns of blood flow associated with brain activity, one can add additional brain activity. What happens to these results if the subject, when answering, is also reciting to himself the multiplication tables? We have no idea.

The few published papers that have looked at individuals have claimed accuracy rates of about 70 to around 90 percent in detecting lies. These results—not substantially different, by the way, from reported results with the polygraph—must be taken with a grain of salt. We just do not know how reliably accurate fMRI-based lie detection will be with diverse subjects in realistic settings, with or without countermeasures. For now, at least, based on the peer-reviewed literature, the scientific verdict on fMRI-based lie detection seems clear: interesting but not proven.

NEUROSCIENCE-BASED LIE DETECTION: THE LAW

In spite of this lack of convincing proof of efficacy, at least two companies in the United States—No Lie MRI and Cephos Corp—are offering fMRI-based lie detection services. They can do this because of the near absence of regulation of lie detection in the United States.

In general, the use of lie detectors is legal in the United States. The polygraph is used thousands of times each week for security screenings, in criminal investigations, and as part of the conditions of release for sex offenders. The device can be used even more broadly, subject to almost no regulation, with one major exception: employers.

The federal Employee Polygraph Protection Act (EPPA) of 1988 forbids most employers from forcing job applicants and most employees to take lie-detector tests and from using the results of such tests. As a result, no American can today legally face, as I did, a polygraph test when applying, at age twenty-one, for a job as a bartender at a pizza parlor. Some employers are granted exceptions, notably governments and some national security and criminal-investigation contractors. The act’s definition of lie detection is broad (although No Lie MRI has frivolously argued that EPPA does not apply to fMRI). The act also exempts the use of polygraphs (not other forms of lie detection) on employees in some kinds of employer investigations, subject only to some broad rights for the employees. About half the states have passed their own versions of this act, applying it to most or all of their state and local employees. Some states have extended protections against lie detection to a few other situations, including in connection with insurance claims, welfare applications, or credit reports. (A few states have required the use of lie detection in some settings, such as investigations of police officers.)

Almost half the states have a licensing scheme for polygraph examiners. A few of these statutes may effectively prohibit fMRI-based lie detection because they prohibit lie detection except by licensed examiners and provide only for licensing polygraph examiners, not fMRI examiners. No state, however, has yet explicitly regulated neuroscience-based lie detection.

One site for the possible use of lie-detection technology is particularly sensitive—the courtroom. Thus far, fMRI-based lie detection has not been admitted into evidence in court. The courts will apply their own tests in making such decisions. However, the eighty-plus years of litigation over courtroom uses of polygraph evidence might provide some useful lessons.

The polygraph is never admissible in U.S. courtrooms—except when it is. Those exceptions are few but not trivial. In state courts in New Mexico, polygraph evidence is presumptively admissible. In every other American state and in the federal courts, polygraph evidence is generally not admissible. Some jurisdictions will allow it to be introduced to impeach a witness’s credibility. Others will allow its use if both parties have agreed, before the test was taken, that it should be admitted. (This willingness to allow polygraph to be admitted by the parties’ stipulation has always puzzled me; should judges allow the jury to hear, as scientific evidence, the results of palm reading or the Magic Eight Ball if the parties stipulated to it?) At least one federal court has ruled that a defendant undergoing a sentencing hearing where the death penalty may be imposed is entitled to use polygraph evidence to try to mitigate his sentence.8

U.S. courts have rejected the polygraph on the grounds that it is not acceptable scientific evidence. For many years federal and state courts used as the test of admissibility of scientific evidence a standard taken from the 1923 case Frye v. United States (293 F. 1013 [DC Cir 1923]), which involved one of the precursors to the polygraph. Frye, which required proof that the method was generally accepted in the scientific community, was replaced in the federal courts (and in many state courts) by a similar but more complicated test taken from the 1993 U.S. Supreme Court case, Daubert v. Merrell Dow Pharmaceuticals, Inc. (509 U.S. 579 [1993]). Both cases fundamentally require a finding that the evidence is scientifically sound, usually based on testimony from experts. Under both Frye and Daubert, American courts (except in New Mexico) have uniformly found that the polygraph has not been proven sufficiently reliable to be admitted into evidence. Evidence from fMRI-based lie detection will face the same hurdles. The party seeking to introduce it at a trial will have to convince the judge that it meets the Frye or Daubert standard.

The U.S. Supreme Court confronted an interesting variation on this question in 1998 in a case called United States v. Scheffer (523 U.S. 303 [1998]). Airman Scheffer took part in an undercover drug investigation on an Air Force base. As part of the investigation, the military police gave him regular polygraph and urine tests to make sure he was not misusing the drugs himself. He consistently passed the polygraph tests but eventually failed the urine test, leading to his charge and conviction by court-martial.

Unlike the general Federal Rules of Evidence, the Military Rules of Evidence expressly forbid the admission of any polygraph evidence. Airman Scheffer, arguing that if the polygraph were good enough for the military police, it should be good enough for the court-martial, claimed that this rule, Rule 707, violated his Sixth Amendment right to present evidence in his own defense. The U.S. Court of Military Appeals agreed, but the U.S. Supreme Court did not and reversed. The Court, in an opinion written by Justice Thomas, held that the unreliability of the polygraph justified Rule 707, as did the potential for confusion, prejudice, and delay when using the polygraph.

Justice Thomas, joined by only three other justices (and so not creating a precedent), also wrote that even if the polygraph were extremely reliable, it could not be introduced in court, at least in jury trials. This, he said, was because it too greatly undercut “the jury’s core function of making credibility determinations in criminal trials.”

Scheffer is a useful reminder that lie detection, whether by polygraph, fMRI, or any other technical method, will have to face not only limits on scientific evidence but other concerns. Under Federal Rule of Evidence 403 (and equivalent state rules), the admission of any evidence is subject to the court’s determination that its probative value outweighs its costs in prejudice, confusion, or time. Given the possible damning effect on the jury of a fancy high-tech conclusion that a witness is a liar, Rule 403 might well hold back all but the most accurate lie detection. Other rules involving character testimony might also come into play, particularly if a witness wants to introduce lie-detection evidence to prove that he or she is telling the truth. In Canada, for example, polygraph evidence is excluded not because it is unreliable but because it violates an old common law evidentiary rule against “oath helping” (R. v. Béland, 2 S.C.R. 398 [1987]). While nonjudicial use of fMRI-based lie detection is almost unregulated, the courtroom use of fMRI-based lie detection will face special difficulties. The judicial system should be the model for the rest of society. We should not allow any uses of fMRI-based (or other neuroscience-based) lie detection until it is proven sufficiently safe and effective.

A TRULY MODEST PROPOSAL: PREMARKET REGULATION OF LIE DETECTION

Effective lie detection could transform society, particularly the legal system. Although fMRI-based lie detection is clearly not ready for nonresearch uses today, I am genuinely agnostic about its value in ten years (or twenty years, or even five years). It seems plausible to me that some patterns of brain activation will prove to be powerfully effective at distinguishing truth from lies, at least in some situations and with some people. (The potential for undetectable countermeasures is responsible for much of my uncertainty about the future power of neuroscience-based lie detection.)

Of course, “transform” does not have a normative direction—society could be transformed in ways good, bad, or (most likely) mixed. Should we develop effective lie detection, we will need to decide how, and under what circumstances, we want it to be usable, in effect rethinking EPPA in hundreds of nonemployment settings. And we will need to consider how our constitutional rights do and should constrain the use of lie detection. This kind of thinking and regulation will be essential to maximizing the benefits and minimizing the harms of effective lie detection.

But ineffective lie detection has only harms, mitigated by no benefits. As a first step, before consideration of particular uses, we should forbid the nonresearch use of unproven neuroscience-based lie detection. (Similarly, lie detection not based on neuroscience, including specifically the polygraph, should be forbidden. However, polygraphy is likely too well established to make its uprooting politically feasible.)

This step is not radical. We require that new drugs, biologics, and medical devices be proven “safe and effective” to the satisfaction of the federal Food and Drug Administration before they may legally be used outside of (regulated) research. Just as unsafe or ineffective drugs can damage bodies, unsafe or ineffective lie detection can damage lives—the lives of those unjustly treated as a result of inaccurate tests as well as the lives of those harmed because a real villain passed the lie detector. Our society can allow false, exaggerated, or misleading claims and implied claims for many marketed products or services, from brands of beer to used cars to “star naming” companies, because the products, though they may not do much good, are unlikely to do much harm. Lie detection is not benign, but is, instead, potentially quite dangerous and should be regulated as such.

Of course, just calling for regulation through premarket approval leaves a host of questions unsettled. What level of government should regulate these tests? What agency should do the assessments of safety and efficacy? How would one define safety or efficacy in this context? Should we require the equivalent of clinical trials and, if so, with how many of what kinds of people? How effective is effective enough? Should lie-detection companies, or fMRI-based lie-detection examiners, be licensed? And who will pay for all this testing and regulation? As always, the devil is truly in the details.

I have thoughts on the answers to all those questions (see Greely and Illes 2007), based largely on the Food and Drug Administration, but this is not the place to go into them. The important point is the need for some kind of premarket approval process to keep out unproven lie-detection technologies. Thanks to No Lie MRI and Cephos, the time to develop such a regulatory process is yesterday.

CONCLUSION

Lie detection is just one of the many ways in which the revolution in neuroscience seems likely to change our world. Nothing is as important to us, as humans, as our brains. Further and more-detailed knowledge about how those brains work—properly and improperly—is coming and will necessarily change our medicine, our law, our families, and our day-to-day lives. We cannot anticipate all the benefits or all the risks this revolution will bring us, but we can be alert for examples as—or, better, just before—they arise and then do our best to use them in ways that will make our world better, not worse.

Neuroscience-based lie detection could be our first test. If it works, it will force us to rethink a host of questions, mainly revolving around privacy. But if it does not work, or until it is proven to work, it still poses challenges—challenges we must accept. Premarket approval regulation of neuroscience-based lie detection would be a good start.

ENDNOTES

1. Paul R. Wolpe, Kenneth R. Foster, and Daniel D. Langleben, “Emerging Neurotechnologies for Lie-Detection: Promises and Perils,” American Journal of Bioethics 5 (2) (2005): 39–49.

2. This argument is made at great length in Henry T. Greely and Judy Illes, “Neuroscience-Based Lie Detection: The Urgent Need for Regulation,” American Journal of Law & Medicine 33 (2007): 377–431.

3. Sean A. Spence et al., “Behavioral and Functional Anatomical Correlates of Deception in Humans,” Brain Imaging Neuroreport (2001): 2849; Tatia M. C. Lee et al., “Lie Detection by Functional Magnetic Resonance Imaging,” Human Brain Mapping 15 (2002): 157 (the manuscript was received by the journal two months before Spence’s earlier-published article and, in that sense, may describe the earliest of these experiments); Daniel D. Langleben et al., “Brain Activity During Simulated Deception: An Event-Related Functional Magnetic Resonance Study,” Neuroimage 15 (2002): 727; G. Ganis et al., “Neural Correlates of Different Types of Deception: An fMRI Investigation,” Cerebral Cortex 13 (2003): 830; F. Andrew Kozel et al., “A Pilot Study of Functional Magnetic Resonance Imaging Brain Correlates of Deception in Healthy Young Men,” Journal of Neuropsychiatry & Clinical Neuroscience 16 (2004): 295; F. Andrew Kozel, Tamara M. Padgett, and Mark S. George, “Brief Communications: A Replication Study of the Neural Correlates of Deception,” Behavioral Neuroscience 118 (2004): 852; F. Andrew Kozel et al., “Detecting Deception Using Functional Magnetic Imaging,” Biological Psychiatry 58 (2005): 605; Daniel D. Langleben et al., “Telling Truth from Lie in Individual Subjects with Fast Event-Related fMRI,” Human Brain Mapping 26 (2005): 262; C. Davatzikos et al., “Classifying Spatial Patterns of Brain Activity with Machine Learning Methods: Application to Lie Detection,” Neuroimage 28 (2005): 663; Tatia M. C. Lee et al., “Neural Correlates of Feigned Memory Impairment,” Neuroimage 28 (2005): 305; Jennifer Maria Nunez et al., “Intentional False Responding Shares Neural Substrates with Response Conflict and Cognitive Control,” Neuroimage 25 (2005): 267; Feroze B. Mohamed et al., “Brain Mapping of Deception and Truth Telling about an Ecologically Valid Situation: Functional MR Imaging and Polygraph Investigation–Initial Experience,” Radiology 238 (2006): 679.

4. Kozel, Padgett, and George, “Brief Communications: A Replication Study of the Neural Correlates of Deception,” replicates Kozel et al., “A Pilot Study of Functional Magnetic Resonance Imaging Brain Correlates of Deception in Healthy Young Men.” Langleben’s first study, Langleben et al., “Brain Activity During Simulated Deception: An Event-Related Functional Magnetic Resonance Study,” is closely mirrored by his second and third, Langleben et al., “Telling Truth from Lie in Individual Subjects with Fast Event-Related fMRI,” and Davatzikos et al., “Classifying Spatial Patterns of Brain Activity with Machine Learning Methods: Application to Lie Detection.”

5. Compare Langleben et al., “Brain Activity During Simulated Deception: An Event-Related Functional Magnetic Resonance Study,” with Langleben et al., “Telling Truth from Lie in Individual Subjects with Fast Event-Related fMRI.”

6. Langleben et al., “Telling Truth from Lie in Individual Subjects with Fast Event-Related fMRI”; Davatzikos et al., “Classifying Spatial Patterns of Brain Activity with Machine Learning Methods: Application to Lie Detection”; and Kozel et al., “Detecting Deception Using Functional Magnetic Imaging.”

7. Kozel et al., “Detecting Deception Using Functional Magnetic Imaging.”

8. Rupe v. Wood, 93 F.3d 1434 (9th Cir. 1996). See also Height v. State, 604 S.E.2d 796 (Ga. 2004).

REFERENCES

Greely, H. T. 2005. Premarket approval regulation for lie detection: An idea whose time may be coming. American Journal of Bioethics 5 (2): 50–52.

Greely, H. T., and J. Illes. 2007. Neuroscience-based lie detection: The urgent need for regulation. American Journal of Law and Medicine 33: 377–431.

National Academy of Sciences. 2003. The polygraph and lie detection. Washington, DC: National Research Council.

Wolpe, P. R., K. R. Foster, and D. D. Langleben. 2005. Emerging neurotechnologies for lie-detection: Promises and perils. American Journal of Bioethics 5 (2): 39–49.