- Pain Research, Nuffield Department of Anaesthetics, University of Oxford, Oxford Radcliffe Hospital, Oxford, UK
- Correspondence to: Professor H J McQuay, Pain Research, Nuffield Department of Anaesthetics, University of Oxford, Oxford Radcliffe Hospital, The Churchill, Oxford OX3 7LJ, UK
- Received 27 May 2004
- Accepted 14 July 2004
Placebo, used here to mean an inert treatment given as if it was a real treatment, means lots of different things to different people. This article begins with the technical use of placebos in clinical trials and the extent of the placebo response, then turns to the mechanism (“How does the placebo work?”), and ends with the ethics of placebo in research and in everyday practice.
Placebo means lots of different things to different people, which leads to endless confusion. To try and pick a way through this minefield we begin by talking about the technical use of placebos in clinical trials, and the extent of the placebo response, then about the mechanism—“How does the placebo work?”—and last about the ethics of placebo in the contexts of research and in everyday practice. Placebo is used here to mean an inert treatment, given as if it was a real treatment. The word first entered the English language through St Jerome’s Latin version (the Vulgate) of the Septuagint: “Placebo Domino in regione vivorum”. Jerome’s verse was used in the Vespers of the Office for the Dead. The verse began with the word placebo, so in the 13th century placebo became the name of that service. Some people attended the service and sang the Placebo, hoping to be rewarded by a dead person’s relatives, or the relatives paid priests to sing the Placebo on their behalf. Placebo came then to mean a sycophant. How it acquired its current meaning is not known.1
To set the scene, ask yourself how you would design a trial to answer the question of whether or not arthroscopy is useful in knee osteoarthritis. Moseley et al randomised patients to arthroscopy or to placebo surgery, which involved three incisions and anaesthesia.2 The justification for placebo surgery is that there is no other way to control for the possibility that the act of surgery itself, rather than what it achieves, produces the pain relief.3 This trial shows some of the problems around placebo to be discussed below: why placebo may be the best or indeed the only way to design trials that can answer questions credibly, how much risk is acceptable for placebo patients, albeit in a non-life threatening context, and the confusion between the ethics of clinical research and the ethics of clinical care. Interestingly, the knee arthroscopy trial, like its famous predecessor that used placebo surgery to investigate the efficacy of internal mammary artery ligation for angina,4 showed no benefit from the procedure.
In the early days of margarine there was an advertising slogan that asked if you could tell Stork (a margarine brand) from butter. The manufacturers assumed that your ability to distinguish tastes was working well, but that they had made the tastes so similar that even with fully functioning taste discrimination you would not be able to tell the two apart. If your taste discrimination was poor or absent then you would say, incorrectly, the two spreads were indistinguishable. Saying the two were the same when they were not, because your test was technically insensitive, is precisely the reason that placebo is so important for clinical trials.
If you imagine a very simple clinical trial, a comparison of treatment A with treatment B for relief of pain after dental extraction, then you would need an outcome measure like pain relief. Let us say that at the end of the trial both treatments seem to work, but with exactly the same amount of pain relief (fig 1). Does this result mean that the two treatments were genuinely equally good analgesics, or that your pain relief outcome measure was too insensitive to pick up a real difference between the two, or indeed that neither was any good? We have no way of knowing unless we change the trial design to incorporate an index of sensitivity of the measures.
Such an index of sensitivity can be a third group of patients, randomised to receive placebo rather than treatments A or B (fig 2). If both treatment A and treatment B do well, and placebo does badly, then we can be more confident that A and B are both effective treatments, because placebo was ineffective, and that there is minimal difference between A and B’s performance in the trial.
This use of placebo as a “negative control” as the index of sensitivity is, and rightly so, pervasive in explanatory clinical trials designed to establish the efficacy of new treatments. An alternative to the negative control is the positive control, treatment X, in which a known effective treatment is given, ideally with one group of patients given a high dose of X and another drug group given a low dose (fig 3). The index of sensitivity criterion will be met if low dose X does badly compared with high dose X. While this positive control method reduces the ethical concerns about use of placebo, to be discussed later, it is also a fudge, in that low dose X needs to be a minimally effective dose for the sensitivity index to work, which then means that that patient group is receiving a less effective treatment.
An example in the pain field would be using paracetamol 500 mg as low dose X and 1000 mg as high dose X. Few trials have shown good separation at these doses. Those that do not manage to separate 500 mg from 1000 mg then have no index of sensitivity.
PLACEBO RESPONSE OR PLACEBO EFFECT
This is perhaps one of the most difficult of all topics, especially with subjective outcomes such as pain or depression. If we were discussing a topic like myocardial infarction and our outcome measure was death, then we might be reasonably sure that a placebo would have no effect on the outcome. But with subjective outcomes like pain, we might guess that patients would feel better after placebo, and consequently have less pain, if the doctor or nurse was nice to them, or appeared authoritative, or if the placebo was given as a big red capsule instead of a tiny white pill, or as an injection and not a tablet. Whatever we think, proving that any or all of these influences had an effect would be difficult because very large trials would be necessary to show any effect independent of random chance.
It’s all very complicated, and made more so by the difficulty in proving that “negativity”, or “interaction”, or “expectation” contribute anything at all to the actual perception of pain as it is measured. We don’t help ourselves by using lax, if understandable, shorthand. When we want to discuss the effect that we observe when patients are given a placebo, we call it the “placebo effect” or placebo response. Immediately that can be retranslated as “the effect caused by placebo”. Indirectly, of course, administration of placebo can and does result in an effect, for instance resulting in analgesia in a pain study. The pitfall is that we jump to a simplistic causal connection, and then in turn jump to conclusions about the mechanism by which this happens. Ideas about how placebo has its effect will be discussed later. At this point just note that it is a creek where we don’t have a paddle.
Common misconceptions about the response to placebo
There are a number of misconceptions about placebo, and they are worth examining because they are highly instructive.
Misconception 1: for every intervention, a fixed fraction of the population, usually a third, responds to placebo, whatever the outcome
This just is not so. Table 2 lists rates of response with placebo in a number of acute and chronic pain conditions.
Table 1 summarises effects that we might expect to find in various control groups.5
There is a wide range of response, from 7% for the response of freedom from pain two hours after migraine pain of moderate or severe intensity, to 49% of patients with painful diabetic neuropathy saying their pain is at least much better after eight weeks of treatment with a topical placebo. For some of the chronic painful conditions we have pitifully little information and the estimates may just be wrong, but even where we have large amounts of data there is wide variation. Why are we surprised? Would we not expect a large response if we do nothing in strains and sprains over a week, when a fair proportion are going to get better anyway? People with a migraine will rarely have it next week, at least not the same episode.
The real issue here is how hard the outcome is to achieve. If we use migraine as the exemplar, there are four outcomes for pain relief from migraine that we can consider:
- No pain or mild pain at two hours
- No pain at two hours
- No pain or mild pain at two hours and pain not returned and no analgesics over 24 hours
- No pain at two hours and pain not returned and no analgesics over 24 hours
We might guess that no pain is harder to achieve than mild pain, and that “no return of pain plus no additional analgesics over 24 hours” is more difficult to achieve than outcomes that only look at the first two hours. Not surprisingly, we find that 35%–40% of people can obtain no pain or mild pain with placebo at two hours, but that only about 10% have no pain. Fewer people have a favourable outcome with placebo over 24 hours than over two hours (fig 4). No surprise here then.
Misconception 2: the placebo response is a fixed fraction (about a third) of the maximum effect of treatment—the bigger the treatment effect the bigger the placebo response
This idea came from an analysis of five randomised trials in acute pain.6 The relation that Evans found, treatment effect and placebo response moving in the same direction, was an artefact because he used mean values of skewed TOTPAR (total pain relief) data. The relation could not be shown when median values were used.7
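The difference between mean and median on skewed data can be seen in a small sketch. The scores below are invented purely for illustration (they are not TOTPAR data from any of the cited trials), and `summarise` is a hypothetical helper: most patients report little relief, while a few report a lot.

```python
import statistics

def summarise(scores):
    """Return the mean and median of a list of pain relief scores."""
    return {"mean": statistics.mean(scores), "median": statistics.median(scores)}

# Invented, deliberately skewed scores: many low values, two high ones.
hypothetical_scores = [0, 0, 0, 1, 1, 2, 2, 3, 14, 17]

summary = summarise(hypothetical_scores)
print(summary)
```

In this invented sample the two high scorers pull the mean well above the value reported by a typical patient, whereas the median stays close to it. That is the kind of distortion that can arise when means of skewed data are compared across trials.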
Of course it may be the case that when the outcome is easier to achieve, both the response to placebo and the response to treatment are likely to be higher. Figure 5 is a replot of figure 4 with the addition of the response from rizatriptan 10 mg. There is a general relation based on degree of difficulty of the outcome, but it is not a fixed fraction.
Misconception 3: the more invasive the method of delivering a treatment, the higher will be the response to placebo—injection will give bigger response than tablets
Again this is not so. Table 3 shows that while we have comparatively small numbers of patients given intramuscular placebo, the proportion of patients having at least 50% pain relief is no bigger than with oral administration of placebo tablets. The same complete lack of any difference can be seen in an analysis of responses to placebo with different routes of administration in migraine (table 4). Although injected placebo has been claimed to give higher response rates than oral placebo in migraine,8 this was based on an analysis of a limited dataset.
Misconception 4: randomisation of different numbers of patients to active and placebo can affect the response to placebo
This final misconception again comes from the migraine field. The claim was that randomisation to different proportions of active treatment and placebo could affect the response to placebo. This was on the basis of one trial in which 16 patients were randomised to active treatment for every one randomised to placebo. Fifty six patients were randomised to placebo, and the response rate at two hours for no pain or mild pain, or just no pain, was high (fig 6). The answer, though, was that with 56 patients the 95% confidence interval of the response rate included that of the overall response rate from all randomisation schedules.
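The confidence interval argument can be sketched numerically. The example below uses the Wilson score interval, which is one common choice and not necessarily the method used in the original analysis. The responder count (28 of 56) and the overall rate (40%) are assumptions chosen for illustration; the actual values are in figure 6 and are not reproduced here.

```python
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * ((p * (1 - p) / n + z**2 / (4 * n**2)) ** 0.5) / denom
    return centre - half_width, centre + half_width

# Hypothetical inputs: 28 responders among 56 placebo patients,
# compared with an assumed overall placebo response rate of 40%.
low, high = wilson_interval(28, 56)
overall_rate = 0.40
print(f"95% CI: {low:.2f} to {high:.2f}")
print("Overall rate inside interval:", low <= overall_rate <= high)
```

With only 56 patients the interval is wide, so an apparently high response rate can still be compatible with the overall rate.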
WHERE DOES THIS LEAVE US?
The most important point to appreciate is that the response to placebo can vary hugely. While these examples all came from pain studies, we could have used data from early postoperative nausea and vomiting after general surgery (range 0% to 50% patients vomiting),9 or after squint correction, where with essentially the same operation and anaesthetic technique the range was 18% to 88%,10 or in trials of surfactant in respiratory distress syndrome, where the range of the placebo response was 24% to 69%,11 or in depression trials (13% to 52%).12 This range of response has led people to look for explanations within the trials, such as kind nurses compared with unkind, or even flawed double blinding.
The reality is that random chance alone can produce this range of response,13 and the first line of defence against such variability lies in the size of the dataset. An unimaginably large dataset would still include the range, but would allow us to estimate the “true” underlying placebo response in that setting. In table 2, with 12 000 patients, the proportion of patients achieving at least 50% pain relief with placebo in postoperative pain is 18%. If we do small trials, with just 100 patients, then random chance means that the proportion showing at least 50% pain relief with placebo could be anywhere from 0% to 50%, which obviously could have a significant impact on how we view the performance of the active drug or drugs in that trial.
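How the play of chance shrinks with sample size can be sketched with a simple binomial model. This is a rough illustration under stated assumptions: every trial shares the same true response rate of 18% (the pooled figure quoted above), `n` is the size of the placebo arm, and nothing but chance varies between trials. It reports the central 95% of outcomes rather than the extremes, so it will be narrower than the full spread seen across many real trials. The function names are hypothetical.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability of k responders out of n, computed in log space."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def central_range(n, p, coverage=0.95):
    """Observed proportions bracketing the central `coverage` of outcomes."""
    tail = (1 - coverage) / 2
    cumulative = 0.0
    low = None
    high = n
    for k in range(n + 1):
        cumulative += binom_pmf(k, n, p)
        if low is None and cumulative >= tail:
            low = k
        if cumulative >= 1 - tail:
            high = k
            break
    return low / n, high / n

TRUE_RATE = 0.18  # assumed true placebo response rate
for arm_size in (20, 50, 100, 1000, 12000):
    lo, hi = central_range(arm_size, TRUE_RATE)
    print(f"n={arm_size:>5}: {lo:.1%} to {hi:.1%}")
```

The range printed for the smallest arms is many times wider than the range for the largest, which is the sense in which dataset size is the first line of defence.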
Once you understand that the response can vary, and that random chance is the most important factor underlying the variability that makes small studies particularly vulnerable, then that minimises the need to look for other explanations, such as the kind compared with unkind nurses. The fact that much time and effort has been spent on exploring such spurious influences shows just how easy it is in thinking about placebo to be navigating the creek without a paddle. The other trap has been the “soundbites” about fixed fractions of responders and about the extent of the response. The statements that one third of people respond and respond at one third of the maximum are common parlance. The information above shows that both statements are wrong. It takes a long time to debunk widely held beliefs.
HOW DOES PLACEBO WORK?
We once ran a trial of oral morphine compared with placebo in an experimental pain model, where you had to keep your arm in a bucket of icy water for as long as you could stand it. As subjects in the trial we knew that the treatment was either morphine or placebo. One of us had the treatment on one occasion, and then was constipated for a week, a very unusual event, and hence was absolutely convinced that the treatment had been morphine. It turned out that that was the placebo day. The shame of being a placebo responder. The mystery here is how did the placebo cause constipation? In the absence of a better explanation it has to be the belief that one had received morphine that resulted in the unusual constipation. Even if you accept that belief results in the effect of the placebo there is a long chain of biological events between belief and the effect that we do not understand.
Most of the experiments designed to tease out the mechanism of the placebo effect are small and complicated, and their results do not stand the test of time. Taking the analgesic effect of placebo as an example, 20 years ago researchers had the clever idea of antagonising the analgesic action of placebo with the opioid antagonist naloxone. The hypothesis was that if the analgesic action of placebo was mediated through the opioid system then it would be reversed by giving naloxone. The researchers duly reported that they found such a reversal, but subsequent studies have not replicated their findings. More recently the brain geographers with their magnificent imaging devices have shown that when placebo produces an analgesic effect the same parts of the brain light up as light up when a known analgesic is given.14,15 There is a common theme to these results, which is that to have its effect placebo uses the same biological system as would an active drug, whether the effect is analgesia or constipation. We are still left with the overarching question of how an “inert” intervention activates the system.
The curious among us take this further, and look for differences among people, and between people and animals, in susceptibility to placebo. Once again experiments tackling these questions tend to be small and hence subject to our old friend the random play of chance, so that all results need to be taken with a large pinch of salt. Beecher and colleagues used an observational study design to see if they could identify the people who did respond to placebo. What they found was that responders tended to be older, women more than men, church attending but not necessarily God believing, and with great faith in doctors and nurses. What you would expect really, if belief is the overarching switch to turn on a response to placebo.
Implicit in the idea that we can identify the believers is the principle that the belief does not waver. One way to check whether or not it is fixed is to rechallenge the same person with placebo. What you might expect in such a wolf, wolf paradigm is that if not much really happened the first time, even though you believed it would, then your belief will wane with successive use of placebo—your response will wane. This was tested in women with dysmenorrhoea. Some brave souls received placebo on successive menstrual cycles, and the analgesic effect of the placebo decreased with successive use of the placebo16 (fig 7).
The idea that we differ in our faith in doctors or in other markers of trust and belief is not revolutionary. We know for example that there are differences in our susceptibility to hypnosis or acupuncture, and folklore has it that one third of horses or dogs are hard to help with acupuncture, so this is not just a human issue. The fact that our belief can be context dependent, and that our response to placebo can therefore vary with context is not surprising, but it still does not explain how belief throws the switch to cause a response to placebo.
Professor Pat Wall did provide a testable idea about the placebo mechanism.17 He started with the link between the placebo response and the patient’s expectation, and the fact that part of the response of a patient to any treatment relates to the expectation of a beneficial effect. He argued that sensory events are analysed in terms of the appropriate motor responses. For pain this would be first to remove the stimulus, second to change posture to limit further injury and optimise recovery, and third to seek safety, relief and cure. “If the patient’s experience has taught them that a particular action is followed by relief, then they respond if they think the action has occurred. In this scheme of thinking, the placebo is not a stimulus but an appropriate action”, acting to terminate or cancel the pain sensation.
ETHICS OF PLACEBO
In clinical trials
Most of us react negatively to the idea of using placebo, because “it appears to violate the fundamental ethical principles of beneficence and nonmaleficence”.3 But if we are going to advance in medicine we need trials of our interventions. For some interventions a placebo treatment may be necessary if the trial is to answer the question posed. The arthroscopy trial mentioned at the beginning, for instance, could not have answered the question without a placebo surgery group of patients. The collision, if it is a collision, is between the ethics of our clinical care in general with the ethics of clinical research.
In clinical research trials are unethical if their design means that they cannot answer the question. In figure 1 we could not tell if treatments A or B were both effective or both ineffective, because there was no index of internal sensitivity. Adding a placebo group (fig 2) provides that index, and then we would be able to tell whether treatments A or B were both effective (as in fig 2) or both ineffective. This is the justification for using placebo in trials.
A little common sense is necessary. Nobody in their right mind is going to advocate using placebo in a life threatening condition. Two obvious examples would be antibiotic trials in septicaemia, or chemotherapy. In such contexts trial design that shows one drug is as good as another (fig 1), called equivalence trials, can be very difficult to interpret.18 One way round this dilemma is to add the test treatment to existing treatment, as happens in trials in epilepsy. Decisions about the legitimacy of placebo in a particular setting are not always as black and white as this makes it sound. If you know that statins reduce long term cardiac risk then is it or is it not legitimate to use a placebo group in a trial of the efficacy of a new statin drug on long term cardiac mortality? If you decide that it is not legitimate, then how do you design an equivalence trial, old statin compared with new statin, which does not run into the figure 1 problem? This takes you into a controversial area,18 controversial both because of the design problems of equivalence trials and because cholesterol lowering is taken as a proxy of the long term cardiac mortality.
In the circumstance where it is legitimate to use placebo, it is important to understand that being randomised to placebo does not condemn the patient to long term suffering. The patient is free to withdraw from the trial at any point, and then receive normal treatment. Patients may be given rescue (or “escape”) medication if the trial treatment is inadequate, and that is usually done for oral medication from the first hour onwards, because it takes an hour for the medicine to get into the system. Obviously if everyone in the placebo group drops out one hour after treatment, or needs to take rescue medication, and none of the active treatment patients drop out or need rescue, then you have your answer.
Ethical objections to the use of placebo in clinical trials are “based on the requirements to minimize risks, limit the level of risks that are not offset by the potential benefits to participants, and obtain informed consent”, and do not, in the view of Horng and Miller, support an absolute prohibition against the use of placebo when its use is methodologically necessary to answer clinically important questions.3 This justification of placebo as the index of internal sensitivity, as in our figure 1 problem with drug trials, will persist unless and until we could use a “gold standard” active treatment as our comparator, as in figure 3. Imagine that we were certain that whenever and wherever amitriptyline was used it would always produce x units of antidepressant activity. Then we could use it as the positive control in our design. If the amitriptyline did not produce x units of improvement in the trial then we would know that the trial was faulty, a “method failure”, just as we do now if gold standard does not beat placebo. The problem is that we do not have any certainty about the x units of antidepressant activity. We know that x varies widely in trials.18,12 Perhaps the way forward is to use large pooled datasets to produce a more robust estimate, analogous to the estimate of 18% of 12 000 placebo patients achieving at least 50% postoperative pain relief (table 2).
In everyday practice
The ethical issues around placebo in everyday practice are different from those in clinical research. First there should be a distinction between knowingly and unknowingly treating a patient with a placebo intervention. Knowingly doing so is deceit, which is hard to condone. Unknowing might be the use of a drug or procedure for which there is no good evidence of efficacy. The fact of the procedure itself, the fact of the prescription written, may produce improvement, particularly in a self limited disorder. Many of us do this, given that there is minimal risk, in our everyday practice. Indeed the cynic would argue that much of complementary medicine is based on this principle. We should acknowledge too that any placebo effect is added to the “genuine” pharmacological effect each time we prescribe.