Phase 3 clinical trials are the most expensive part of drug development and the most important hurdle to regulatory approval. More than 95% of phase 3 trials in Alzheimer’s disease fail. The main reason is a poor foundation, namely, inadequate or misleading phase 2 trials. Dr. Donald Berry, Founder of Berry Consultants and Founder and Professor of the Department of Biostatistics at the University of Texas MD Anderson Cancer Center, and his son Dr. Scott Berry, President of Berry Consultants, are committed to turning the tide. Their revolutionary methodology involves adaptive clinical trial design within a Bayesian statistical approach.
Drs. Berry applied their art and science in consultation with Eisai Pharmaceuticals, the Japanese pharmaceutical company that developed Leqembi, aka lecanemab. They designed a phase 2 trial in Alzheimer disease that adaptively identified Leqembi’s optimal dose regimen, shifted focus to that regimen, and led to regulatory approval. With the phase 2 trial still in follow-up, Eisai initiated a phase 3 trial that perfectly replicated the phase 2 trial results in a larger patient population.
In 2006 the FDA’s Critical Path Initiative encouraged the use of Bayesian and other adaptive designs. FDA’s goal was building a stronger foundation for phase 3, including having a better assessment of which drugs should go into phase 3. Eisai approached Drs. Berry seeking to more effectively and more efficiently build a phase 2 trial for Leqembi in Alzheimer disease. The Berry design was revolutionary. They built a trial that would be run by an automaton and that adapted the future course of the trial to the results observed from patients being treated in the trial. It’s artificial intelligence. The algorithm’s adaptations are tailored to the trial’s goals, such as identifying the most effective dose of a drug and deciding whether to initiate a phase 3 trial.
The Leqembi trial design had a precedent in type 2 diabetes. As part of the FDA’s Critical Path Initiative Drs. Berry had designed a landmark Bayesian adaptive trial, called AWARD-5, for Eli Lilly’s drug Trulicity, a GLP-1 agonist aka dulaglutide. In the first stage of the AWARD-5 trial, patients were adaptively randomized to control therapy or one of seven doses of Trulicity. Drs. Berry taught the AWARD-5 algorithm to find the drug’s two most effective doses, both accurately and economically. The algorithm had to decide whether and when the results were sufficient to drop the other doses and proceed seamlessly into fixed randomization with the two selected doses and control. They also taught the algorithm how to determine the overall trial sample size using Bayesian predictive probabilities. The algorithm performed perfectly in the trial; the doses selected demonstrated substantial efficacy and were marketed for type 2 diabetes, making Trulicity Eli Lilly’s most successful drug.
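To make the idea of a Bayesian predictive probability concrete, here is a minimal sketch. It is not the AWARD-5 algorithm, whose details are not given here. It assumes a simplified binary outcome (responder or not) and a Beta prior, and the function name `predictive_probability` and its parameters are hypothetical. It computes the probability, given the data so far, that the trial will reach a required number of responders by its maximum sample size. An adaptive design can use a quantity of this kind to decide whether enrolling more patients is worthwhile.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def predictive_probability(successes, n_so_far, n_max, needed, a=1.0, b=1.0):
    """P(total responders at n_max >= needed | data so far).

    Assumes a binary outcome and a Beta(a, b) prior on the response rate
    (illustrative choices, not taken from AWARD-5). Future responders then
    follow a beta-binomial distribution. Suitable for modest sample sizes.
    """
    a_post = a + successes
    b_post = b + n_so_far - successes
    remaining = n_max - n_so_far
    prob = 0.0
    for k in range(remaining + 1):
        if successes + k >= needed:
            # Beta-binomial probability of k responders among the
            # remaining patients, given the current posterior.
            log_ratio = (log_beta(a_post + k, b_post + remaining - k)
                         - log_beta(a_post, b_post))
            prob += comb(remaining, k) * exp(log_ratio)
    return prob


if __name__ == "__main__":
    # 12 responders in the first 20 patients; 30 needed out of 50.
    print(predictive_probability(12, 20, 50, 30))
```

A design might stop for futility when this probability is very low, or stop enrolling early when it is very high. The thresholds would be chosen and checked by simulation.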
Another innovation of AWARD-5 was that the endpoint used was a clinical utility index. One of the index’s four components was HbA1c, the standard regulatory endpoint for type 2 diabetes. Another component was weight loss. Although Lilly did not submit AWARD-5 for marketing approval for that indication, the trial had demonstrated dramatic weight loss, especially at the two doses that were greater than the two doses chosen for diabetes. So AWARD-5 heralded the eventual weight loss indication of Trulicity and other GLP-1 agonists.
The algorithms that ran the AWARD-5 and Leqembi trials are complicated. Extensive simulations are necessary to evaluate each design’s operating characteristics, including its false-positive rate and the trial’s ability to achieve its objectives. Simulations are also involved at the design stage to test and refine the algorithm’s performance. Independent monitoring committees may be unblinded to the results but they cannot modify the design.
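The role of simulation can be illustrated with a deliberately simple sketch. It assumes a fixed two-arm trial with a normally distributed outcome and a one-sided z-test, which is far simpler than the adaptive algorithms described above. All names and default values (`simulate_trial`, `success_rate`, the sample size, the effect sizes) are hypothetical. The point is only the mechanism: simulate many trials under a known truth and count how often the design declares success.

```python
import random
from statistics import NormalDist, mean


def simulate_trial(effect, n_per_arm, rng, sd=1.0, alpha=0.025):
    """Simulate one two-arm trial; return True if it declares success.

    Success means a one-sided z-test (known sd) rejects at level alpha.
    """
    control = [rng.gauss(0.0, sd) for _ in range(n_per_arm)]
    treated = [rng.gauss(effect, sd) for _ in range(n_per_arm)]
    standard_error = sd * (2.0 / n_per_arm) ** 0.5
    z = (mean(treated) - mean(control)) / standard_error
    return z > NormalDist().inv_cdf(1 - alpha)


def success_rate(effect, n_per_arm=100, n_sims=2000, seed=1):
    """Fraction of simulated trials that declare success."""
    rng = random.Random(seed)
    wins = sum(simulate_trial(effect, n_per_arm, rng) for _ in range(n_sims))
    return wins / n_sims


if __name__ == "__main__":
    # With no true effect, the success rate estimates the false-positive rate.
    print("false-positive rate:", success_rate(0.0))
    # With a true effect, the success rate estimates the power.
    print("power at effect 0.5:", success_rate(0.5))
```

For an adaptive design, the same loop would run the full algorithm, interim analyses included, inside each simulated trial.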
The phase 2 Leqembi trial considered five dose regimens and placebo. The first 196 patients were randomized with probability 1/7 to each of the five Leqembi regimens and 2/7 to placebo. A traditional approach would have continued with this randomization throughout the trial. The Berry design switched gears after the 196th patient and again at each of the 13 subsequent interim analyses, at which the data were updated and the assignment probabilities were revised. The goal was to get as much information as possible about the efficacy of the dose that would eventually be used in a phase 3 trial. The proportion of patients assigned to placebo mirrored the proportion assigned to the currently most likely phase 3 dose.
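The allocation scheme can be sketched as follows. The burn-in probabilities come from the description above. The update rule is an assumption made for illustration: it weights each regimen by its current probability of being the dose taken to phase 3 and gives placebo the same weight as the leading regimen. The trial’s actual rule is not spelled out here, and the names `initial_allocation` and `adaptive_allocation` are hypothetical.

```python
from fractions import Fraction


def initial_allocation(n_regimens=5):
    """Burn-in allocation: 1/7 to each active regimen and 2/7 to placebo."""
    total = n_regimens + 2
    alloc = {f"regimen_{i + 1}": Fraction(1, total) for i in range(n_regimens)}
    alloc["placebo"] = Fraction(2, total)
    return alloc


def adaptive_allocation(prob_phase3_dose):
    """Hypothetical update rule applied at an interim analysis.

    prob_phase3_dose maps each regimen to its current probability of being
    the dose carried into phase 3. Placebo receives the same weight as the
    leading regimen, and the weights are normalized to sum to 1.
    """
    weights = dict(prob_phase3_dose)
    weights["placebo"] = max(prob_phase3_dose.values())
    total = sum(weights.values())
    return {arm: w / total for arm, w in weights.items()}


if __name__ == "__main__":
    print(initial_allocation())
    # Made-up interim probabilities, for illustration only.
    interim = {"regimen_1": 0.1, "regimen_2": 0.1, "regimen_3": 0.1,
               "regimen_4": 0.2, "regimen_5": 0.5}
    print(adaptive_allocation(interim))
```

Under this rule the leading regimen and placebo always receive equal shares, so the comparison of greatest interest gets the most patients.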
The two highest doses performed well early in the trial, leading to greater sample sizes for these doses: 253 and 161 patients, together 48% of the 856 patients in the trial. Only 51 and 52 patients (12% of the total) were assigned to the two lowest doses. Placebo was assigned to 247 patients, or 29% of the total. This emphasis meant the trial was more informative about the dose regimens of greatest interest.
The trial’s design elegantly handled the greater than 30% rate of missingness typical of Alzheimer trials. A standard approach for imputing missing data is “last observation carried forward” or LOCF. This approach would have concluded a 13% reduction in disease progression in comparison with placebo. The company would not likely have run a phase 3 trial with such a weak signal, and an effective drug might have been shelved. Using multiple imputation in the Bayesian approach led them to conclude a 27% reduction in disease progression, which is clinically meaningful. It persuaded Eisai to initiate a phase 3 trial. The phase 3 trial showed the same 27% reduction, and led to FDA approval.
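A toy sketch shows why LOCF can mislead in a progressive disease. It does not reproduce the trial’s Bayesian multiple imputation. As a stand-in, it contrasts LOCF with a simple extrapolation of each patient’s own pre-dropout trend, and the function names and scores are hypothetical. When a patient who is getting worse drops out, LOCF freezes the score at the last visit, so the patient appears to stop declining.

```python
def locf(scores):
    """Fill each missing visit (None) with the last observed value."""
    filled, last = [], None
    for score in scores:
        if score is not None:
            last = score
        filled.append(last)
    return filled


def extrapolate_trend(scores):
    """Fill missing visits by continuing the patient's observed trend.

    Illustrative stand-in only: uses the average change per visit between
    the first and last observed scores. Assumes at least one observed score
    and is intended for dropout (missing visits at the end).
    """
    observed = [(i, s) for i, s in enumerate(scores) if s is not None]
    (first_i, first_s), (last_i, last_s) = observed[0], observed[-1]
    slope = (last_s - first_s) / (last_i - first_i) if last_i > first_i else 0.0
    return [s if s is not None else last_s + slope * (i - last_i)
            for i, s in enumerate(scores)]


if __name__ == "__main__":
    # Higher score = worse. The patient declines steadily, then drops out.
    visits = [0.0, 1.0, 2.0, None, None]
    print(locf(visits))
    print(extrapolate_trend(visits))
```

If placebo patients drop out mainly because they are declining, freezing their scores makes placebo look better than it is and shrinks the apparent treatment effect, which is consistent with the weaker LOCF estimate described above.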
The prospectively defined Bayesian model did so much better than the old-fashioned LOCF because it was trained to consider each patient’s pattern of data and data missingness. Especially important are patterns of disease course before a patient dropped out. Dropouts of placebo patients were largely caused by the treatment not working. The algorithm identified very different patterns of missingness in patients assigned to the highest dose. This was partly due to the beneficial treatment effect. But the major reason for missingness on the highest dose was mandated by an outside force unrelated to results in the trial. Namely, partway through the trial a regulatory authority outside the U.S. required that the highest dose no longer be used for an important subset of patients, those who were APOE4 carriers. Further, this authority mandated that protocol therapy stop for APOE4 carriers who had been receiving the highest dose for less than 6 months. This circumstance led to very high rates of missing data at this dose.
There is another advantage of running an efficient, informative, adaptive phase 2 trial. The sample size of a phase 3 trial is based on the estimated efficacy of the experimental treatment, typically from a phase 2 trial. Because the phase 2 estimate turned out to match the phase 3 result, Eisai was able to design a phase 3 trial whose sample size was neither too big, which saved time and resources, nor too small, so the trial was adequately powered. It showed the statistical significance that was the basis for the drug’s marketing approval.
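The dependence of sample size on the estimated effect can be made concrete with the standard textbook formula for comparing two means, n per arm = 2 (z_{1-alpha/2} + z_{power})^2 (sd / delta)^2. This is a generic sketch, not the calculation Eisai used. The function name `n_per_arm` and the example numbers are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(delta, sd, alpha=0.05, power=0.9):
    """Patients per arm for a two-sided, two-sample z-test.

    delta: assumed true difference in means (for example, the phase 2 estimate)
    sd:    assumed standard deviation of the outcome
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


if __name__ == "__main__":
    print(n_per_arm(delta=0.5, sd=1.0))   # assumed effect of 0.5 sd
    print(n_per_arm(delta=0.25, sd=1.0))  # half that effect
```

Because the required sample size scales with 1/delta^2, halving the assumed effect roughly quadruples the trial. An overestimated phase 2 effect therefore leads to an underpowered phase 3 trial, and an underestimated effect leads to one that is larger and slower than necessary.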
Drs. Berry have clearly demonstrated great advantages for designing clinical trials that adapt to the actual ongoing trial results. The concept applies to other types of clinical trials as well.