Journal Club #1: COVID Vaccine Safety and Efficacy
How stats methods can affect interpretation of results
We are starting off with a juicy one. With each of these “journal clubs” I will start by providing the article citation, where it can be found, and whether the article is open access or behind a paywall. I wish I could simply link a PDF of each of these articles, but they are subject to copyright protection, and getting sued or taken offline for infringement would defeat the purpose of writing these posts. I will focus primarily on open access articles because this aligns with the goal of teaching people how to critically read medical literature.
This article describes the original study upon which the FDA depended for the approval of the Pfizer-BioNTech vaccine. It’s worth defining one word that I’ll be using repeatedly throughout this post: efficacy. Efficacy in a study like this is how well the vaccine (or treatment) works at achieving the primary endpoint (in this case preventing COVID) under ideal circumstances. That is different from effectiveness, which is how well it works under real-world conditions. It may sound like semantics, but it is an important distinction when discussing drug trials.
Citation
Polack et al. Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine. N Engl J Med 2020;383:2603-15. DOI: 10.1056/NEJMoa2034577
The full article is available at NEJM.org and is open access (free to read for anyone).
Abstract
This article has a good example of a well-formulated abstract. It starts by very briefly laying the groundwork for why the study was done and then clearly describes the methods (we’ll get into these later). From the results, you can see this was a large study and you get a clear idea of how they intend to interpret their results. The conclusion is a clear restatement of their results. The authors also note that the study was funded by Pfizer and BioNTech.
Introduction
This introduction section is exactly what you would expect from an article in the New England Journal of Medicine (NEJM). It is well written and has numerous important citations. For the interested parties, the citations in this section include a paper describing their phase 1 study and the formulation of the Pfizer-BioNTech mRNA vaccine. A phase 1 study looks at the toxicity of a drug at various doses as well as how it is metabolized by the body. A phase 2 study examines whether a drug has efficacy in a small group, and a phase 3 study is a larger, randomized trial. This paper combines some phase 2 and phase 3 data, which is somewhat atypical. However, given the circumstances and the rapidity with which these vaccines were developed, it is not unreasonable. The authors don’t have a clear hypothesis statement at the end of the introduction, but they do clearly state what the paper is reporting.
Methods
As I stated in the introduction to medical literature, this is where the money is. This is a randomized, placebo-controlled trial. That means that participants were randomly assigned to receive either an injection of the real vaccine or an injection of a placebo (in this case an equal volume of saline). Furthermore, both the person administering the shot and the recipient were blinded to which one was given. This design is the best way to assess whether a drug (or vaccine) prevents disease when no other treatments exist. If a vaccine against COVID had already existed at the time of this study, they would have compared their vaccine to the available one.
Inclusion/Exclusion criteria
The inclusion criteria are mostly there. The authors state that they include people age 16 or older with “stable” chronic disease. They don’t define here what “stable” means (although they may in another article or supplement). Those who were known to have had COVID and those with compromised immune systems were excluded. For an efficacy study of a vaccine meant to trigger an immune response, it is reasonable to exclude these patients, although a separate study of just these patients would be justified if the vaccine is determined to be effective.
Safety and Efficacy Endpoints
The authors clearly define their endpoints, including the timing of evaluation for vaccine safety. It is interesting to note that only a subset of participants was required to report all symptoms for the 7 days after each dose (8,183 of the 37,706 with safety data available). Anyone else could report, but they had to initiate the contact. This reporting could be done until 1 month for mild events and 6 months for serious adverse events. This is interesting because typically it would be the responsibility of the group conducting the trial of a new drug to proactively monitor participants for serious adverse events. There is a reason for that: those participating in a trial may not consider something a “serious” event, or may even conclude that it is unrelated to the trial and thus not report it. In my opinion, this dependence on the participants to report events opens the study to criticism regarding its safety claims. One final thing to note: this paper doesn’t actually include adverse events out to 6 months. Instead, it only reports to 14 weeks. It is surprising that a study of such importance, one that aims to establish the safety and efficacy of a new class of vaccine, would cut its safety data collection period shorter than was prespecified.
The efficacy definitions also warrant scrutiny. The primary endpoint was infection by the SARS-CoV-2 virus (the virus that causes COVID) more than 7 days after the second vaccine dose. They define infection as having one of a constellation of symptoms PLUS a positive PCR-based test. This is interesting to me given the fervor of those telling people that asymptomatic people spread COVID, and given that the case definition used by most agencies was simply a positive PCR test. It would have been interesting to see a comparison of asymptomatic infection rates. This is pertinent given that they had a secondary endpoint of severe infection, which suggests that they may have suspected some partial benefit even if a person receiving the vaccine got COVID.
Statistical Methods
I suspect that most people reading this article would make it to this part without too much trouble. Unfortunately, this is where most readers’ eyes glaze over, if they bother to read the statistical methods section at all. Try to resist that temptation, because this section sets up how you interpret the results. For the safety endpoints, they do a basic description of the rates of certain reactions. They also provide 95% confidence intervals, which tell you the expected range of these rates in a wider or different population. These stats are common and relatively straightforward; the efficacy stats are not.
The study designers decided to use incidence rate ratios (IRR) to analyze their primary efficacy endpoint. The incidence rate ratio is calculated as the number of COVID cases per 1000 person-years in the vaccine group divided by the number of COVID cases per 1000 person-years in the placebo group. The final vaccine efficacy is then calculated as 100 × (1 − IRR) and reported as a percent efficacy. There are justifications for using this method. A study like this, aimed at seeing if a vaccine might be able to halt an ongoing pandemic, will need to take advantage of the full duration of observation for early participants to increase the overall amount of data available to detect a difference between the vaccine and placebo. For example, if you enroll a person in July 2020, you would want to keep collecting their data through the end of the study. However, you wouldn’t want to compare their infection rate over 6 months of observation to that of a person who only had 2 months of observation. The standardized infection rate per 1000 person-years controls for that. However, when we get to the results, you will see that this method is not without its flaws. I won’t get into the other methods discussed because they are more technical and aren’t important to the conclusion that came out of this study.
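To make the arithmetic concrete, here is a minimal sketch of the calculation just described. The case counts (8 and 162) are from the paper; the person-time figure is a made-up round number for illustration, with both arms given equal observation time (in the actual trial the arms differed slightly).

```python
def incidence_rate(cases, person_years, per=1000):
    """Cases per `per` person-years of observation."""
    return cases / person_years * per

def vaccine_efficacy(cases_vax, py_vax, cases_placebo, py_placebo):
    """Efficacy as 100 * (1 - IRR), where IRR is the incidence rate ratio."""
    irr = (incidence_rate(cases_vax, py_vax)
           / incidence_rate(cases_placebo, py_placebo))
    return 100 * (1 - irr)

# 8 vaccine cases vs. 162 placebo cases; 2,200 person-years per arm is an
# assumed value -- with equal person-time it cancels out of the ratio anyway.
ve = vaccine_efficacy(8, 2200, 162, 2200)
print(f"Vaccine efficacy: {ve:.1f}%")  # 95.1% under this assumption; the paper reports 95.0%
```

Note that with equal person-time the formula collapses to 100 × (1 − 8/162); the person-years only matter when participants were observed for different lengths of time, which is exactly why the authors used them.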
Results
We finally made it to the section we’ve all been waiting for. First off, they describe the group enrolled. This was a massive study of more than 43,000 people. Besides being impressive in its scope, the size of the study is important in another way. The larger a study, the more likely you are to find differences between groups that are statistically significant. I’ll explain what I mean. In any study, the results come with the 95% confidence intervals I described earlier. If the confidence intervals for the drug and placebo groups overlap, you cannot confidently conclude that there is a difference between the groups. So how does size come into play? The larger the group, the narrower the confidence intervals. This can result in very small differences being statistically significant even if they don’t seem clinically significant.
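A quick sketch of why size narrows confidence intervals, using the standard normal-approximation (Wald) interval for a proportion. The event rate and sample sizes below are made up for demonstration, not taken from the study.

```python
import math

def wald_ci(p, n, z=1.96):
    """Approximate 95% confidence interval for a proportion p observed in n participants."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return (p - half_width, p + half_width)

rate = 0.27  # a hypothetical 27% event rate
for n in (400, 40_000):
    lo, hi = wald_ci(rate, n)
    print(f"n = {n:>6}: 27% (95% CI {lo:.1%} to {hi:.1%})")
```

A 100-fold increase in sample size shrinks the interval 10-fold (its width scales with 1/√n), which is how a huge trial can make even a tiny between-group difference statistically significant.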
The authors provide a nice flow diagram of participants in the study as well as the demographics of those who participated (Table 1). On the safety front, there was, not surprisingly, a larger number of local reactions in the vaccine group than in the placebo group. This was also true for systemic reactions such as fever, particularly after the second dose. There were more overall adverse events reported in the vaccine group (27%) vs the placebo group (12%). There were 2 deaths in the vaccine group and 4 deaths in the placebo group, all of which were considered unrelated. Interestingly, no one enrolled in the study died of COVID during the reported study period.
The efficacy results are where I think this study and its interpretation get interesting. There were only 170 cases of COVID among the 36,523 participants with no evidence of prior infection (8 in the vaccine group and 162 in the placebo group). Put another way, only 0.47% of participants got COVID during the study period. Using the authors’ efficacy endpoint, they found the vaccine to have 95% efficacy. Using methods we skipped, they then state a >99.99% probability that the true vaccine efficacy is greater than 30%.
This is where the previous discussion of study size and choice of statistics becomes important. The large study size narrows the confidence intervals and allows even small differences to become significant. Also, by using an incidence rate ratio, the authors were able to make small numbers appear more significant. They could have looked at what percentage of participants got symptomatic COVID in each group (0.04% in the vaccine group, 0.88% in the placebo group) or how much the vaccine lowered that risk (an absolute rate reduction of 0.84%). While not exactly apples to apples due to the different observation periods of participants in the study arms, this method provides a clearer view of the vaccine efficacy in the study participants. Another option, one that would have made their results sound even more impressive than 95% efficacy, would be to report the ratio of the infection rates, dividing the rate in the placebo group by the rate in the vaccine group. That framing would have shown that the vaccine reduced symptomatic infection roughly 20-fold. You can see how the same results can be described in different ways to sound more or less impressive.
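The different framings can be computed side by side from the same counts. The per-arm case numbers (8 and 162) come from the study; the per-arm denominators are my assumed near-equal split of the 36,523 efficacy population, and treating a simple case-count ratio as a stand-in for the incidence rate ratio is my simplification.

```python
def framings(cases_vax, n_vax, cases_placebo, n_placebo):
    """Express the same two infection rates three different ways."""
    rate_vax = cases_vax / n_vax
    rate_placebo = cases_placebo / n_placebo
    return {
        "absolute rate reduction (%)": 100 * (rate_placebo - rate_vax),
        "fold reduction (placebo/vaccine)": rate_placebo / rate_vax,
        "efficacy-style (%)": 100 * (1 - rate_vax / rate_placebo),
    }

# Assumed arm sizes splitting the 36,523 efficacy participants roughly in half.
for name, value in framings(8, 18_198, 162, 18_325).items():
    print(f"{name}: {value:.2f}")
```

The same trial yields “0.84 percentage points”, “roughly 20-fold”, and “about 95%” depending on the framing. (Dividing the rounded percentages 0.88/0.04 gives about 22; the unrounded counts give closer to 20, which is why rounding matters when quoting ratios of small numbers.)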
Discussion
The discussion section of this paper is where the authors’ bias shows. They start with the statement that we have all heard again and again: that the vaccine is safe and effective. They describe the different subgroups and the timing of when that protection begins. None of this is contradicted by their data, but it does overstate their results. They use the word effective without real-world data over meaningful time frames in most of the population. They say safe, and then are forced later in the discussion to admit that they didn’t have enough observations, or a long enough observation period, to detect rare serious adverse events. Overall, the authors paint a rosy picture while paying lip service to the limitations.
My Interpretation
With all of the critiques I’ve given, I do think that the study is well designed. The period in which it was enrolling turned out to be a lull in infections, which could partially explain the low overall infection rates. The data also show a decrease in symptomatic infections and serious infections. My main issues are the overstatement of efficacy and safety by the authors and subsequently by the media and the medical community. This data was used by many to justify mandates because the vaccine “prevented infection”. What they didn’t say was that the authors used a different definition than everyone else was using, specifically symptomatic infection. For all of the hysteria regarding asymptomatic spread (which started months before this study began enrolling), not including an asymptomatic-positive endpoint is glaring. I can buy using this study to justify pushing for vaccination of high-risk individuals given the benefits against serious disease, but I do not see it justifying mandates for young, healthy people. It absolutely does not justify any differential treatment of vaccinated vs. unvaccinated individuals when it comes to requirements for masking, social distancing, or eating in a restaurant.
I hope that you have found this post and the process of reading this article interesting. As articles are suggested to me or as I come across interesting studies, I will do more. As always, your feedback is welcomed.