In recent years the word “science” has been used more often, by more people, than at any other time in our lives. We, and those reading this, work in a technical field that requires (or at least should require) fluency in the language of science. The challenges in the world and in our field of work have provided an opportunity to reconsider science both in “daily life” and as it applies to the pharma/biopharma/medical device industries.
We suspect that everyone reading this feels comfortable using the word science and may even use that word in their work frequently. There was a time when we just believed that almost everyone in our field shared a common understanding of the method and philosophy of science. However, considering the last twenty or so years, we are no longer sure that there is a common understanding of the scientific method we once took to be universal.
Discussion of how we consider, apply, and interpret the rules of science is fundamental, and sometimes in life and education revisiting fundamentals is necessary. We think the time has come and therefore offer this brief communication in hopes of rekindling or perhaps even re-awakening an interest in discussing and debating science again. This is right and proper because vigorous debate is the fuel on which science runs. The history of science is replete with debates that became so heated that rivalries developed between or among the participants. We believe this is as it should be: science is important and things that are important create passionate viewpoints. Scientific history also informs us that the loudest, most authoritative, or even most officially endorsed scientist can be wrong.
Some of the greatest science of the 20th century was contributed by Albert Einstein while he was working as a patent clerk in Switzerland, without an academic post and largely unknown. Great science may come from unexpected sources, and its lasting power relates only to the quality of the work, not to the identity of the scientist. A scientific position should be credible not because of who proposed it or because a person holds a prestigious title. It is credible because it is rigorously reviewed, and a scientific consensus is reached. Significantly, even if credible it remains challengeable, and scientific philosophy requires that it be challenged. An excellent brief review of the philosophy of the scientific method can be found on the internet in the Stanford Encyclopedia of Philosophy; this review comes replete with an extensive bibliography for those interested in further reading.1
So, we’ll start with the most fundamental question: what is science? The Oxford English Dictionary2 states that science is “The intellectual and practical activity encompassing the systematic study of the structure and behaviour of the physical and natural world through thought and experimentation.” There are several keywords in this definition that require discussion. First, according to this brief definition, science is both intellectual and practical. That is important because, in our industry and others, science is applied both intellectually and practically.
Science is also defined as an activity; science is something that one engages in – it is neither static nor passive. It is said in this definition to be a systematic form of study which implies science requires a fixed plan, system, or method according to which the activity is conducted. The definition suggests that the structure and behaviour of the physical and natural world are the objectives of science and that the exploration or study of these features of the world is done using the activities of thought and experimentation.
This implies that the conduct of science as an activity requires an understanding of the scientific method which in turn implies training. It states that the activity requires thought and thinking is an activity that we assert draws from knowledge and experience. The fact that science is done systematically requires that it follows a plan or system and that it cannot be done unless one has been trained in the systematic principles by which it is performed. Finally, one must have enough knowledge to call upon the ability to design relevant experiments, which itself requires an understanding of a scientific tool kit applicable to the experimental task at hand.
This definition is the description of a profoundly rigorous and active pursuit. One does not visualize a casual activity in reading this definition, but instead imagines a mighty effort and quite possibly an endless one because absent in the Oxford definition of science is an endpoint. Science in this definition does not have a finish line. As applied to any subject of study, the result of science is not a finite conclusion. There is always the necessity of ongoing thought and experimentation.
Since the definition of science makes clear that it is a process, activity, or system, it is not surprising that those who do science (the people we know as scientists) have over the centuries evolved a general scientific method. The graphic shown in Figure 1 is a simplified depiction of the “scientific method” borrowed from Wikipedia.
This chart depicts the scientific method as a six-step process starting with an observation or the posing of a question. Research at the library or on the web is then conducted to determine what studies in the field may already have been done that are germane to the observation or question. Arising from a melding of the scientist’s knowledge, their topical research and perhaps input from other scientists is a hypothesis. Although we believe most readers are familiar with the word hypothesis, we offer below a definition of that term.
The word hypothesis comes from Greek and came into common use in the fields of science and philosophy in the early 1500s. It is defined as “a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation.” So, it must be emphasized that a hypothesis is the start of the experimental phase of the work, not the end of the work. Since it is the beginning rather than the end of the scientific process, a scientist does not treat their hypothesis as a done deal. In other words, if you thought up a possible requirement for testing a product you would not be ready to suggest its implementation based on an untested hypothesis (or even a tested one, but more on that later).
Two examples of hypotheses almost all of us are presently aware of:
- SARS-CoV-2 evolved naturally through infection of a wild mammalian species and was transmitted to humans in a meat market from a live animal or its meat infecting one or more humans who then spread it widely.
- SARS-CoV-2 arose from some type of gain of function research done in a laboratory and was released by humans in some manner into a community from which it spread across the planet.
We chose these hypotheses not only because they are well-known but also because they are controversial. You, our reader, may hold a strong and perhaps even unbreakable belief in one of them. If you do hold a strong opinion, we ask you to consider whether it is truly factual or rather what you would like it to be. The correct scientific answer is that they are both hypotheses and therefore suppositions which await further study by the methods shown downstream of the hypothesis in Figure 1.
We may not like reading that more factual data is required to discover the origins of COVID-19, but if we consider the situation fully, we realize that proof of a scientific fact should be hard. Millions of global citizens have died, and scientists should want to know how and why that happened. It is important to prevent future pandemics. Convenience is not the goal; factual reality is.
The bar in science must not be set by philosophical preference but by rigorous adherence to analysis so that we have sufficient data upon which to pass an objective judgment. Science done correctly should not be concerned with favoured outcomes. It must be only about a quest for the truth. This, of course, means that any mixture of science and politics is combustible because those two types of human activity are often in conflict.
A phrase that may be coming to your mind right now is “settled science.” Take another look at Figure 1. That graphic represents science as a circular activity in which you return to the possibility of formulating another hypothesis. An initial hypothesis may not be proven by testing and analysis but perhaps experimentation points to an alternate hypothesis. This could lead to a refinement of the initial hypothesis or the formulation of a new one.
We agree with those who suggest that the idea of “settled science” is a myth. We are always learning, which leads to refinement of our understanding. There is an important difference between theory and hypothesis. A theory is tested and well-substantiated, and the substantiating data is not only statistically valid but also reproducible. In science we expect multiple laboratories to confirm the correctness and reproducibility of an experimental result. Again, remember that a hypothesis awaits confirmation, whereas a theory is confirmed to the best of our ability. However, over the course of time new facts may lead to a new hypothesis which, if proven, will lead to a changed understanding of the theory. As an example, in the biological sciences, an entire generation of students was taught that the central dogma of biology was that DNA goes to RNA, which goes to protein. Howard Temin and David Baltimore discovered that some viruses have only RNA as their genetic material and include a double-stranded DNA step in their replicative cycle, which effectively reverses the transcription process.3 This proved to be a landmark in the history of biology and paved the way for the entire field of recombinant genetics. Temin and Baltimore quite justifiably shared a Nobel Prize. Later, other scientists discovered that proteins could be modified post-translationally into functional pieces.4
In this case, the central dogma wasn’t wrong, but it just wasn’t the whole story.5 Life was more complicated than originally accepted, and the science was far from settled. We have Darwin’s theory, not Darwin’s law, because questions exist, such as gaps in the fossil record. Palaeontology sometimes fills those gaps. Those of you who are microbiologists know that genomics has resulted in the re-classification of organisms as we learn more about their evolutionary relatedness. The point here is that science is always changing because we are always learning. This is a very good thing because many of the modern biologics our firms produce would not have been possible without these advancements. Thanks be to those who did not consider science settled!
How Well Do We Apply the Scientific Method in Healthcare?
Historically, we think the global pharmaceutical and biopharmaceutical industries have an enviable record of safety and efficacy. No industry or discipline is perfect and there have been lapses in quality, to be sure, but thankfully those have been extremely rare. However, our generally solid performance in terms of providing products which are generally safe does not mean we shouldn’t question our approaches and those required of us by national and international regulatory policies.
As a test of the application of the scientific method, consider one of the most emphasized and scrutinized analyses done in support of aseptic processing. Environmental monitoring (EM) has grown into an exercise around which whole departments have been built and special software has been developed to tabulate data, and one which regulatory inspectors always carefully review and sometimes find wanting.
We must start by asking the most fundamental question: “What is the scientific question environmental monitoring was implemented to answer?” The history is a bit murky. We recall discussions with now long-retired industry microbiologists who remembered doing some surface sampling and settling plates in the 1950s. We know it was done on an as-needed basis in clinical settings,4 but the implementation of hygienic controls was found to be more effective.5 We also know that EM took on an importance in our industry that is far more significant than in any other healthcare discipline.6-9
A pertinent question to ask is whether there has ever been a scientific hypothesis written or tested on EM regarding its analytical utility as it relates to aseptic processing. Was any hypothesis evaluated independently before the practice was introduced? To date, we have yet to find a formal scientific inquiry into the capabilities of EM methods regarding the analytical tasks it has been assigned, that is, as a predictor of product contamination. Our discussions with those retired industry microbiologists led us to suspect that when EM was introduced, there was no hypothesis, and no experimental evaluation was required because it was implemented only as a general assessment of factory hygiene. EM was implemented only to provide a general and largely qualitative assessment of the usefulness of disinfection procedures, ventilation systems and operator gowns in providing a hygienic aseptic environment. This is a somewhat more extensive role than EM once played in hospitals, and in that setting, no correlation was observed between EM results and infection rates. General recovery targets may have existed, but alert and action levels didn’t, and it is worth remembering that the word asepsis literally means without infection rather than sterility.
In the late 1970s, when process validation came to the forefront, both EM and media fill testing took on added significance. By the mid-1980s regulators first began to consider EM to be a process control activity and the emphasis placed on EM increased accordingly. It was in this era that alert and action levels were implemented, and the industry went along rather enthusiastically. Although never evaluated for correlation to outcome, this seemed to many like a logical extension of the general hygienic assessments we had been performing. The implementation of alert and action levels did not initially take on a process control function and was not part of batch record review and product lot release. Also, the numerical gap between alert and action levels was often great enough to fit microbiology’s status as a quantitatively logarithmic science.
By the late 1980s, however, some regulators had different ideas and suggested that alert and action levels should be considered control points and the data arising from EM should be plotted and trends analyzed.10 These control points were soon interpreted by inspectors to be forms of process control and EM became an unstandardized product release assay. Once it was assumed EM could serve as a measure of sterility assurance it soon morphed into a kind of secondary sterility test. More recently EM’s use expanded into a tool for analysis of risk. Often people we speak to in industry accept this evolution of EM without protest, considering it emblematic of an effort we as a regulated industry must undertake.
We suspect it may come as a shock to younger scientists and engineers who began their work in the industry in the 21st century, but the reality is that none of these assumptions or suppositions about the functionality of EM was ever the subject of experimental evaluation. Nor did we know from the use of EM in other applications that EM had the accuracy, precision, reproducibility, or sensitivity to perform the roles in sterility assurance determination and risk assessment it had been assigned. Many different types of air samplers were used, with different sample volumes, recovery rates and aseptic intervention requirements. Surface sampling could be performed by different types of devices, from wet or dry swabs to different types of contact plates. So, in the case of “Which came first, the chicken or the egg?”, did we implement EM with a strong scientific understanding of its capabilities, or did we merely assume it had all the necessary capabilities and move forward on that basis?
We are open to alternative explanations for the evolution of EM from a general hygienic assessment without correlation to product safety into, effectively, a kind of non-process secondary sterility test which, if done well enough, assures sterility. Without regard to how it has evolved, however, the real capability of EM to measure product risk on a lot-by-lot basis has never been challenged. This is not to say that there haven’t been comparative studies done on air sampler media incubation times or other selected properties of EM programs. We think, after many years of review, that none of these studies has been adequate to truly evaluate EM as a general practice, and certainly it has never been evaluated for its ability to perform the analytical roles assigned to it in aseptic processing quality assurance.
Given our long tenure in the industry and our familiarity with how and why EM is utilized, we think the following hypotheses for the assessment of EM need to be challenged experimentally and if unsupportable either revised or eliminated.
- EM sampling methods for active air are equivalent and can be used interchangeably without the need for standardization.
- The limit of detection of EM is one cell of any species likely to be found in a classified clean room. This means, among other things, that the SCDM media (sometimes with additives) commonly utilized in EM will grow any microorganism.
- EM data correlates directly to media fill outcomes as measured by contamination rates.
- EM data is useful in defining process control parameters.
- EM methods can differentiate between one (1) CFU and three (3) CFU making the distinction between alert and action level analytically and statistically valid.
- EM results can be used to determine precisely where aseptic processing risk resides in a process and therefore to infer the existence of or loss of sterility assurance.
- EM done on gowned personnel yields results which enable accurate and reliable means of determining if an operator is well enough trained or has a performance deficiency requiring retraining.
- EM is suitable for the microbiological classification of clean rooms.
- Repeat samples taken at a defined location are a valid way to enhance trend analysis when an alert or action level excursion is observed.
We could suggest several more hypotheses but the nine listed above are enough to start a scientific reconsideration of EM. We believe the regulations given in documents such as EU Annex 1 or the FDA’s aseptic processing guideline imply that all these hypotheses are confirmed by data and analysis. As far as we have been able to determine, none of them have ever been confirmed, although some have been tested indirectly and were disproven. In fact, in our view as scientists and engineers, it should have been a requirement that the full capability of EM was comprehensively evaluated experimentally before it was implemented as a regulatory requirement.
Unfortunately, there was prima facie evidence that crucial EM capability expectations were invalid before they were implemented. We have collectively reviewed thousands of EM results for media fill tests and have never been able to establish any kind of correlation between EM findings and media fill test success or failure. We have reviewed media fills in which multiple alert- or even action-level EM events occurred yet no containers were contaminated. Conversely, we have reviewed media fills in which no contamination at all was recovered by EM and yet one or more media fill containers in the test were contaminated. It is assumed in some cases that an EM recovery on a non-product contact surface is predictive of product contamination risk, yet there is no evidence at all that this is in fact true.
It was known when EM was implemented that soybean casein digest medium (“SCDM”, also known as trypticase soy agar, “TSA”) could not grow all organisms released from humans, nor could it recover all microbes transported into a classified clean room by another vector. This obviates any possibility of concrete evidence regarding sterility assurance even if a very loose definition of that attribute is applied.
It was also known prior to EM’s implementation that the limits of detection (LOD) of the methods used in EM were much higher than one cell. Growth promotion testing of media was typically done using inoculations of 10-100 CFU. There have been experiments claiming an LOD of less than five CFU, but these studies were conducted with cultures that had been adapted to the media prior to the study.
We do not find scientific support for the supposition that EM has sufficient precision to reliably detect a difference between an alert level of one CFU and an action level of two or three CFU. This single requirement causes countless excursion reports, deviation reports, and risk analyses to be conducted. All these studies are statistically inconclusive. As our late colleague Scott Sutton pointed out in 2011, “Virtually all clean room action levels as set by regulations are deep in the noise range of the plate count method.”11 As experienced industry scientists Russell Madsen and David Hussong pointed out in 2004,12 at a mean count of one CFU the % standard deviation is 100%; at a count of three CFU it is 58%; at 10 CFU it is 32%. Therefore, there is no analytical or statistical reliability in attempting to make a qualitative or quantitative differentiation between any EM results in the order of magnitude between one and ten CFU. Stated slightly differently, it is confirmation that microbiology is a logarithmic science. It almost goes without saying that the common practice of setting alert and action levels based on a Gaussian or Poisson distribution of recovery data is wrong because the method variability is too high to allow the requisite precision. Put another way, the inherent error associated with a method with 30-50% variability at the commonly applied alert and action levels in our critical aseptic areas makes it impossible to calculate a meaningful average. This is particularly evident when we consider that 99% of the samples show no growth at all and hence produce only null data.
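The counting statistics Madsen and Hussong cite follow directly from the Poisson model, under which the standard deviation of a count with mean λ is √λ, so the relative standard deviation is 1/√λ. A minimal Python sketch (an illustration of the statistical point, not a description of any regulatory method) reproduces their figures:

```python
import math

def percent_rsd(mean_cfu: float) -> float:
    """Relative standard deviation (%) of a Poisson-distributed
    plate count with the given mean, i.e. 100 / sqrt(mean)."""
    return 100.0 / math.sqrt(mean_cfu)

for cfu in (1, 3, 10):
    print(f"mean count {cfu:2d} CFU -> %RSD = {percent_rsd(cfu):.0f}%")
# mean count  1 CFU -> %RSD = 100%
# mean count  3 CFU -> %RSD = 58%
# mean count 10 CFU -> %RSD = 32%
```

With 100% relative error at a mean of one CFU, the distributions of counts at an alert level of one and an action level of three overlap heavily, which is the statistical substance of Sutton's "noise range" remark.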
We believe that proving a negative absolute in any scientific endeavour is impossible. Even if we assumed that we could apply EM without standardization and were able to rely on a single growth medium, it would be impossible to prove that zero microorganisms would be present, even if none were detected. To make that claim would require sampling all the air that passed through the room with a limit of detection of one colony, which is not the same as one cell although it is often treated as a cell count. We’d also have to sample every internal surface with the same limit of detection. Testing for no growth, which is quite a bit less rigorous than establishing sterility, would require both a perfect assay and an unlimited sample size.
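The impossibility of "proving zero" from limited samples can be illustrated with the same Poisson model. The concentration and sample volume below are illustrative assumptions chosen only to make the arithmetic concrete, not measured or recommended values:

```python
import math

def p_detect(conc_cfu_per_m3: float, volume_m3: float) -> float:
    """Probability that one air sample recovers at least one organism,
    assuming organisms are Poisson-distributed at the given true
    concentration and the sampler captures the stated volume."""
    return 1.0 - math.exp(-conc_cfu_per_m3 * volume_m3)

# Suppose the true level is 0.1 CFU/m3 (assumed) and a single active
# air sample draws 1 m3: the sample is blank ~90% of the time.
print(f"P(detect in one sample) = {p_detect(0.1, 1.0):.2f}")   # 0.10

# Even ten independent 1 m3 samples all come back blank about
# a third of the time, despite contamination truly being present.
print(f"P(all 10 samples blank) = {math.exp(-0.1 * 10):.2f}")  # 0.37
```

A string of zero-growth results is therefore entirely consistent with a non-zero level of contamination; only sampling essentially all of the air and surfaces with a perfect method could establish "zero", which is the point made above.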
There may be gowning circumstances in which EM could detect differences related to employee practices. However, it is very unlikely that a differentiation could be made relating to the potential impact on an aseptic process. The value of EM in personnel evaluation in cleanrooms cannot be asserted without rigorous controlled experiments. Also, there is ample published scientific research indicating that under normal working circumstances, humans wearing new aseptic gowns release 1000 CFU or more of detectable contamination into their environment per hour.13 Because EM recovery on personnel monitoring plates in aseptic processing is most often zero, this is likely evidence that the methods used for EM on personnel underestimate the actual levels of human-released contamination. Yet, since these products have an excellent record regarding infection control, we can presume that current conditions, regardless of the actual microbial contribution from gowned personnel, are fit for purpose.
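The gap between published source strengths and typical personnel-monitoring results can be made concrete with back-of-the-envelope arithmetic. The shedding rate comes from reference 13; the session length is an assumption made only for illustration:

```python
# Illustrative arithmetic: published source strength vs. typical recovery.
shed_per_hour = 1000   # CFU released per hour by a gowned operator (ref. 13)
session_hours = 2      # assumed time spent in the cleanroom

released = shed_per_hour * session_hours
print(f"organisms released during session: ~{released} CFU")   # ~2000 CFU

# Personnel plates most often read zero; even a single recovered CFU
# would represent a vanishingly small fraction of what was released.
print(f"capture fraction if 1 CFU recovered: {1 / released:.2%}")  # 0.05%
```

Under these assumed numbers, the monitoring method would be seeing well under a tenth of a percent of the contamination actually released, which is why routine zero results say little about the true personnel contribution.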
If any scientist or group of scientists has experimental data to bring to bear on these hypotheses, we would appreciate seeing them. In possession of such data, we would then know much more about what we are doing and perhaps even be able to hypothesize what we should be doing and how that should be accomplished. However, based on what we currently know of the science, our assessment of the way EM has evolved tells a story of the evolution of a discipline in the absence of rigorous experimental study. EM consists of a series of unproven hypotheses, many of which were not even discussed prior to EM evolving into the significant consumer of resources it has become. We think we should know more in concrete scientific terms about something that costs the global industry millions of dollars each week.
Of course, there is a bigger question here based on the scientific method and the need to know how well anything we do works. We propose that the formal introduction of standards of practice must come only after rigorous scientific experimental study of the proposal. No regulation should emerge and then continue to evolve and expand in the absence of experimental confirmation at every step of the process.
What is the real takeaway from the evaluation of EM presented in this article? We are concerned that EM was not properly evaluated using well-designed and controlled experimentation before it was introduced as a regulatory expectation. Also, we believe that the expansion which has occurred in the scope of EM has likewise occurred without rigorous scientific review. We believe EM was stitched together, based on regulatory inspectional opinions, into a patchwork of prescriptive cGMP expectations. We have even seen EM applied to non-aseptic processes without any evaluation of its scientific value or appropriateness. Given that no single assumption in the myriad of cGMP practices which fall under the EM umbrella was ever subject to scientific experimentation either before or after implementation, we find sweeping conclusions made from limited, imprecise and highly variable data unpersuasive. Sadly, some of the defining concepts of EM go against common microbiological and general scientific or engineering knowledge. This should, of course, be rectified because, as the CDC observed, EM “is an expensive and time-consuming process that is complicated by many variables in protocol, analysis and interpretation.”14
We have learned some things from the conduct of EM, or at least from microbiological studies done in clean rooms. EM has helped confirm that the primary source of microbial contamination in aseptic cleanrooms is personnel. We know from studies that microorganisms detectable by EM do not enter aseptic environments through HEPA filters which are certified to be integral. We also have found that minor HEPA filter leaks are not detectable by EM and the pattern of detected contamination is not impacted by minor filter defects. Experimentation would be necessary to determine the magnitude of a leak that might contribute detectable contamination, which as previously emphasized is not proof of sterility.
We have also learned from experience that selecting sample locations using a grid, as is often done in certifying clean rooms for particulate air quality, is not valid in microbiological EM. This is because contamination is closely associated with personnel and their movement. The fact that HEPA filters are very effective at preventing microbial contamination from outside the facility, and the knowledge that the most significant source of contamination is gowned personnel, have helped us realize that grid sampling was a valueless endeavour even though it is a common feature of so-called EM process qualification (“EMPQ”) activities. This approach is also sometimes used in risk analysis activities but in our experience is ineffective. We also have learned that repeating a sample at a location at which an alert or action level result was observed is of little or no value. Such a determination can only be made post-incubation, which may be three to seven days after the original sample was taken. Given typical ISO 5 or ISO 7 air exchange rates, thousands of air changes will have occurred between the time the original and remedial samples were taken. These are then effectively two completely different sample conditions which are impossible to duplicate.
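The "thousands of air changes" point is simple arithmetic. The air-change rate and incubation delay below are assumed, illustrative values (classified rooms commonly run on the order of tens of air changes per hour, and incubation takes three to seven days):

```python
# Illustrative: how many complete air changes separate an original
# EM sample from any "repeat" sample taken after incubation.
air_changes_per_hour = 40   # assumed rate for a classified room
incubation_days = 5         # assumed delay before the excursion is known

total_air_changes = air_changes_per_hour * 24 * incubation_days
print(f"air changes between samples: {total_air_changes}")   # 4800
```

By the time the excursion is even known, the room air has been completely replaced thousands of times, so the repeat sample measures an entirely different condition.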
This article is not really about EM. EM is used only as a fitting example of how the scientific method was ignored, in a highly emphasized quality program, when establishing regulations and cGMP practice. We assert that all proposed analytical standards and processes, and regulations relying on them, must be subjected to rigorous evaluation using the scientific method prior to implementation. No analytical or performance standard should appear as a fully formed requirement without rigorous experimental evaluation. We believe the need for rigorous study applies to compendial standards as well. Compendial standards must be objectively proved to be capable of assessing the quality attribute they are hypothetically expected to evaluate.
The application of the scientific method in all our processes and regulations must be evaluated rigorously and without bias. We believe no hypothesis should be rejected for inconvenience or accepted merely because regulatory authorities support it. If the practice fails to yield the expected or desired results, it should be rejected. Only if it passes the rigour of scientific evaluation should it be embraced.
References
- 1. https://plato.stanford.edu/entries/scientific-method/. Accessed 9/7/2022
- 2. Oxford English Dictionary (2022) http://oed.com/
- 3. John M. Coffin (2021) 50th Anniversary of the discovery of reverse transcriptase, Molecular Biology of the Cell 32(2): 91-97.
- 4. Charles Stewart Roberts, MD. (2012) Comments on Darwinism, in Proceedings of the Baylor University Medical Center https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3246855/
- 5. M. Van den Ende and M. Spooner. (1941) Reduction of dust-borne bacteria in a ward by treating floor and bedclothes. Lancet 241: 400-401.
- 6. P.W. Smith, K. Watkins and A. Hewlett. (2012) Infection control through the ages; American Journal of Infection Control (40) pp 35-42.
- 7. Scott Sutton, (2010) The Environmental Monitoring Program in a GMP Environment; Journal of GXP Compliance Vol 14 (3) pp 22-29.
- 8. FDA (2004) Guidance for Industry: Sterile Drug Products Produced by Aseptic Processing-Current Good Manufacturing Process.
- 9. PDA Technical Report Number 13. (2001) Fundamentals of an Environmental Monitoring Program
- 10. EU GMP Annex 1: Manufacture of Sterile Medicinal Products – revision 2004.
- 11. Scott Sutton (2011) Accuracy of Plate Counts in Microbiology Topics, Journal of Validation Technology pp 42-46.
- 12. D. Hussong and RE Madsen, (2004) Analysis of Environmental Microbiology Data from Cleanroom Samples; Pharm. Tech. Aseptic Processing Issue, pp 10-15.
- 13. Ljungqvist, B. and B Reinmüller. (2021) People as a Contamination Source in Pharmaceutical Cleanrooms- Source Strengths and Calculated Concentrations of Airborne Contaminants; PDA Journal of Pharma Sci and Technol. 75(2): pp 119-127.
- 14. The Centers for Disease Control and Prevention (CDC). Guidelines for Environmental Infection Control in Healthcare Facilities, Background F. Environmental Sampling (2002) https://www.cdc.gov/infectioncontrol/guidelines/environmental/background/sampling.html
Author Details
James E. Akers, PhD, President- Akers Kennedy & Associates, Inc.; James Agalloco, President- Agalloco & Associates; Phil DeSantis, Principal- DeSantis Consulting Associates; Russell E. Madsen- Retired Pharmaceutical Executive
Jim Agalloco is a pharmaceutical manufacturing expert consultant with 50+ years of experience. He has assisted more than 200 firms with validation, sterilization, aseptic processing, and compliance. He has co-edited 5 texts; written 40+ chapters; published more than 170 papers and lectured extensively.
Phil DeSantis is a pharmaceutical consultant, specializing in Pharmaceutical Engineering and Compliance. Phil retired in 2011 as Senior Director, Engineering Compliance for Global Engineering Service at Merck (formerly Schering-Plough) located in Whitehouse Station, NJ. He continues to work on standards and practices for all facility and equipment-related capital projects and site operations, as well as in the areas of sterilization and contamination control.
James E. Akers, PhD, is President of Akers Kennedy & Associates, Inc., located in Kansas City, MO. Dr. Akers has over 25 years experience in the Pharmaceutical industry and has worked at various director level positions within the industry and for the last decade as a consultant. Dr. Akers served as President of the PDA from 1991 to 1993 and as a member of the PDA Board of Directors from 1986-1999. Currently, he is Chairman of the USP Committee of Experts Microbiology and Sterility Assurance, as well co-chairman of the PDA Isolator Technology Task Force, Aseptic Processing Task force and member of several recent program committees.
Russell E. Madsen is a retired pharmaceutical executive with over 50 years of industrial, technical and regulatory experience. He is a member of ASTM E55 Executive Committee and has authored or co-authored over 90 publications.
Publication Details
This article appeared in American Pharmaceutical Review: Vol. 27, No. 6, Sept/Oct 2024, Pages 34-39.