Pfizer’s edge in the COVID-19 vaccine race: Data Science

SEP 08, 2021

A U.S. Army soldier immunizes a woman with the Pfizer COVID-19 vaccine in North Miami, Florida.

A year in the life of the data scientists who helped bring Pfizer's COVID-19 vaccine to the public in record time

Pfizer dominated news headlines and family dinner conversations last December when it became the first company to bring a COVID-19 vaccine to the U.S. market. The pharma giant accomplished the feat in record time: less than a year after the disease was first identified.

Integral to that effort was the work of Pfizer's informatics and digital technology team for its vaccine R&D business. Led by Frank DePierro, this group of researchers crunched and chronicled all of the clinical trial data that led to a green light from the U.S. Food and Drug Administration (FDA), and a safeguard for millions of people. What did it take to make that happen? IEEE Spectrum spoke with DePierro last week via video call to find out. The transcript below has been edited for length and clarity.

Frank DePierro, Pfizer

IEEE Spectrum: What's it like to have the eyes of the world on your work?

Frank DePierro: It's been wild knowing how close my team and I are to the success of this vaccine. I would read articles with people speculating about the data or science, watch news clips with talking heads or experts, or overhear people discussing rumors while out and about, and the whole time I would be biting my tongue, thinking about how wrong or right each claim was. Being so close to the process but also having to hold so much back, even from close family and friends, made it a tense year.

Spectrum: That must have been exasperating.

DePierro: The spring and summer of 2020 were the most difficult time of my life. I was at home working remotely, under intense pressure to deliver and run the team. I had to balance working the longest hours of my career with helping a third grader with school and keeping a three-year-old entertained, because my wife was busy on the front lines as a healthcare worker. I think people forget, especially in the news cycles, that we are human too, juggling all the same things while also trying to advance science.

Spectrum: What does your team do for Pfizer?

DePierro: My team supports clinical trial data. When blood and other kinds of samples are brought to the lab, each sample has to be received, tracked, divided up, and analyzed with complex robotics and instruments, and then statistical analysis is performed and final data generated. So the tools that track all of those samples and generate the data—that's my team. Then we report it out to the FDA.
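
To make the workflow DePierro describes concrete, here is a minimal, hypothetical sketch of a sample moving through the stages he lists (received, divided up, analyzed, reported). The `Stage` and `Sample` names and the strictly linear stage order are assumptions made for illustration; this is not Pfizer's or LabWare's actual data model.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Hypothetical lifecycle stages, taken from the steps named in the interview."""
    RECEIVED = 1
    ALIQUOTED = 2  # "divided up" into smaller portions
    ANALYZED = 3
    REPORTED = 4


@dataclass
class Sample:
    """A tracked clinical sample with a simple audit trail (illustrative only)."""
    sample_id: str
    stage: Stage = Stage.RECEIVED
    history: list = field(default_factory=list)

    def __post_init__(self):
        # Record the starting stage so the audit trail is complete.
        self.history.append(self.stage)

    def advance(self) -> Stage:
        """Move the sample to the next stage, refusing to go past the last one."""
        if self.stage is Stage.REPORTED:
            raise ValueError(f"{self.sample_id} has already been reported")
        self.stage = Stage(self.stage.value + 1)
        self.history.append(self.stage)
        return self.stage


sample = Sample("S-001")  # hypothetical sample identifier
sample.advance()          # RECEIVED -> ALIQUOTED
sample.advance()          # ALIQUOTED -> ANALYZED
print(sample.sample_id, [s.name for s in sample.history])
```

A real tracking system would also record who handled each sample and when, and would let one sample fan out into many aliquots; the sketch keeps only the stage-by-stage tracking idea.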

Spectrum: How many samples from COVID-19 vaccine trials have you processed?

DePierro: The short answer is: a lot! In the last year and a half we logged more clinical samples for COVID than all other vaccine programs combined since 2014.

Spectrum: What kinds of assays or tests are conducted on the samples?

DePierro: It depends on what kind of clinical trial we're running. In one test, we introduce a live virus to a blood sample to see how the blood reacts. If the virus is neutralized in the blood, that tells us that the person has built up immunity. We also do PCR, which is the same technology used ubiquitously now in COVID diagnostics.

Spectrum: What are some of the informatics tools that you use?

DePierro: The main behemoth behind all of what we do is called LIMS, which is a Laboratory Information Management System. This is the broad system that enables us to track samples coming in, collect the data, and aggregate it. These systems are off-the-shelf but highly customizable, and a lot of vendors offer them. The one we happen to use for our vaccine trials is by a company called LabWare. There are certain aspects where we just check boxes to configure, but there are other aspects where we're going in and writing complex subroutines and code using a proprietary language called LIMS Basic, which is very similar to Java or BASIC.

Then we have other tools that enable us to connect the instruments, robotics and statistical modeling. We're heavy users of SAS. That's the bread and butter of many of the algorithms we write that statistically analyze the clinical data to generate final results. We also use R for statistical analysis. And we have many complex instruments and robotics that have their own proprietary applications that need to be configured and have to communicate effectively with our LIMS or other applications.