Developing Guidance on Conducting Scientifically Sound Pharmacoepidemiologic Safety Studies Using Large Electronic Healthcare Data Sets

Re: Docket No. FDA-2008-N-0234: Developing Guidance on Conducting Scientifically Sound Pharmacoepidemiologic Safety Studies Using Large Electronic Healthcare Data Sets; Public Workshop; Request for Comments

Dear Sir/Madam:

The Biotechnology Industry Organization (BIO) thanks the Food and Drug Administration (FDA) for the opportunity to submit comments on Developing Guidance on Conducting Scientifically Sound Pharmacoepidemiologic Safety Studies Using Large Electronic Healthcare Data Sets. BIO represents more than 1,200 biotechnology companies, academic institutions, state biotechnology centers and related organizations across the United States and in more than 30 other nations. BIO members are involved in the research and development of innovative healthcare, agricultural, industrial and environmental biotechnology products, thereby expanding the boundaries of science to benefit humanity by providing better healthcare, enhanced agriculture, and a cleaner and safer environment.

BIO supports the Agency’s efforts to fully utilize new electronic health care data sources for post-approval pharmacoepidemiologic studies. These new data resources offer great promise to revolutionize the practice of pharmacovigilance with more timely and cost-effective methods for conducting Phase IV studies, but great care must be taken to minimize the potential for biases. BIO echoes the sentiment expressed at the May 7th public workshop that standards and best practices must be developed to address key issues such as data quality, database validation, study design and validation, data access, governance, personnel qualifications, and regulatory actions. However, BIO’s comments primarily focus on the overall criteria that should be used for selecting database analysis to address a key post-approval drug safety question, rather than employing more traditional approaches, such as observational studies or controlled clinical trials. The comments also examine issues relating to the validation of data sources and the burden of evidence required to take action based on observational data.


The Food and Drug Administration Amendments Act of 2007 (FDAAA, P.L. 110-85) provided FDA with new tools and authorities to address key drug safety questions, as well as guidance on selecting appropriate study methodologies. For example, Congress provided FDA with new authority to “require a responsible person for a drug to conduct a postapproval study or studies of the drug, or a postapproval clinical trial or trials of the drug, on the basis of scientific data deemed appropriate by the Secretary.” (21 USC 355 (o)) We interpret “study” to mean an observational epidemiological study, such as a case-control study, cohort study, or patient registry, and “clinical trial” to mean an experimental/interventional study (e.g., a randomized, controlled clinical trial). In addition, the legislation requires the Agency to establish a postmarket risk identification and analysis system based on large automated healthcare databases, such as claims data or electronic health records, to allow the Agency to engage in data mining for potential drug safety signals. (21 USC 355 (k)(3))

To clarify how these two new Agency tools for post-market surveillance will work in tandem, Congress established a series of triggers to help FDA determine whether a study or a trial is appropriate, depending on the scope and relative burden of the post-market surveillance that is necessary. First, Congress required that before FDA can mandate a postapproval study, the Agency must determine that regular reporting and analyses conducted through the new active postmarket risk identification and analysis system will be insufficient to assess a known serious risk related to the use of the drug, to assess signals of serious risk related to the use of the drug, or to identify an unexpected serious risk when available data indicates the potential for a serious risk. (21 USC 355 (o)(3)(D)(i)) Second, FDA cannot mandate a postapproval clinical trial unless the Agency determines that a postapproval study or studies will be insufficient to meet those same purposes. (21 USC 355 (o)(3)(D)(ii))

Taken together, these provisions establish a decision process to help FDA, in consultation with the sponsor, determine which type of study is appropriate and necessary to answer a pending drug safety question. The process is designed to identify the most efficient, and least burdensome, means of investigating the safety question. First, FDA should determine if the answer can be ascertained through analysis of the active postmarket risk identification and analysis system. If not, then epidemiological studies, or further epidemiologic research, should be utilized. However, we believe that there may be steps short of a full epidemiological study that are appropriate following FDA assessment of a safety signal through the active postmarket risk identification and analysis system. For example, it may be appropriate for the sponsor to conduct epidemiological research using existing databases, instead of or prior to a determination that an epidemiological study such as a cohort study or case-control study is needed. Only if epidemiological studies prove insufficient to answer the question at hand should clinical trials be commissioned, as a last resort.


However, at this time there are no clear standards regarding what criteria should be weighed to determine how to evaluate a drug safety signal or to answer a research question. To help further explore the instances when database analysis may be preferable to observational studies or clinical trials, we offer the following points:

Pharmacoepidemiologic safety studies using large electronic healthcare data sets (‘database studies’) may be preferable when:

·         Results are Needed Promptly: Effective database analysis can return results in a more expedited manner than observational studies or clinical trials and may be preferable in instances where insight on a drug safety question is needed promptly.

·         Other Forms of Observational Studies or Clinical Trials Would Be Prohibitively Large: Database analysis is effective in instances where the adverse event is relatively rare and would require study of a prohibitively large patient sample in a clinical trial or studies. We recognize that follow-up clinical trials may be useful in certain instances to confirm safety signals initially detected through database analysis or observational study, or to eliminate suspected confounding by indication. However, the usefulness of a follow-up study is dependent on the mechanism of action of the product, the disease population, the safety event of interest, and other factors. FDA should consider all of these factors carefully before requesting a clinical trial for the purpose of confirming the presence or absence of an association between a product and a safety event.

·         Ethical Considerations Exclude Patient Trials or Studies: In other instances, database analysis may be preferable when certain endpoints or exposures in an interventional (or even a prospective observational) study might raise ethical issues.

·         Product is in Wide Distribution: Database analysis is most effective when the product in question is in wide distribution and there has been sufficient exposure in the population covered by the database to provide statistical power to detect meaningful differences between appropriate comparison groups. Observational studies may be preferable for products that are administered to small patient populations. For example, orphan products may not be ideal candidates for analysis of large datasets and can be more easily studied through registries (or small, targeted studies) because registries identify and generally actively recruit those populations.

·         Adverse Event is Associated with Long Latency: Retrospective database analysis can more effectively determine associations between drug exposure and an adverse event with a long latency period, assuming that the databases have long-term follow-up data. Prospective observational studies and clinical trials are better suited for identifying short-term adverse events.

·         Exposures of Interest can be Clearly Distinguished in the Database: A prerequisite for database analysis is the ability to accurately identify the exposures of interest. In situations where a drug is available over the counter or is used in a specialized treatment setting, and thus is not captured in the electronic database, other investigative approaches will be needed.

·         All Analytical Items Required for the Study Question are Represented in the Available Healthcare Database: Compared to a randomized clinical trial, greater levels of patient information are required for proper unbiased analyses. All or many of the variables that have been demonstrated as representing risk factors for the endpoint of interest should be available in the database to assess risk. Drug use timing and dosing should be recorded at a level of specificity to properly denote at-risk populations. Endpoints of interest should be captured in the scope of the healthcare databases.

·         Events of Interest are Well Defined Medical Entities: Most electronic data systems currently use ICD-9-CM. The limited ability of the ICD-9 system to identify the indication and/or the outcome of interest restricts its utility in specific settings. For example, some diverse and distinct conditions may be grouped under a single ICD-9 rubric, and ICD-9 codes for cancers do not identify the biomarkers that may define an indication. Additionally, manifestations of hypersensitivity, a common safety issue with biological products, are poorly defined, may be difficult to identify using ICD-9 coding, and may not in fact be recorded as a final diagnosis.


BIO notes, however, that the above criteria for selecting database analysis rather than other study methods are fully contingent upon the availability of quality, validated data sources. In order to validate data sources in a standardized manner for study of any product, BIO suggests that FDA’s guidance require: 1) a description of the commonly used electronic healthcare data sets (to be written by the owner of the data sets); 2) a proposed system or process that can help to ensure the data are accurate and received in a timely manner; and 3) validated algorithms for common outcomes of interest (similar to the creation of standardized MedDRA queries (SMQs) for the AERS database). These validated algorithms would be created by the owners of the data sets, and all researchers would use the same algorithm for assessing a given issue within that data set in order to facilitate comparisons across studies. Additionally, the data sources themselves should be held to a clearly articulated common standard of quality (such as the vendor quantifying false-positive and false-negative rates for, at a minimum, Designated Medical Events relative to medical records or claims profiling).

BIO also notes that it is important to distinguish between electronic medical records (EMR) and claims data. Electronic claims data are designed to facilitate insurance payments and are subject to potential distortions based on their primary function. EMRs are primarily designed to support patient care but have limitations in terms of standardization of terminology and recording practices. It is important to use the most appropriate database to minimize limitations, and the rationale should be provided for whatever type of database is selected. The choice of studies and databases should maximize the quality and completeness of the required information on drug usage, the covariates needed to adjust for imbalances, comprehensive measurement of endpoints, and statistical power to detect clinically meaningful effects (or confidence interval narrowness).

BIO also notes that many biologic products are administered in specialty settings, such as outpatient clinics and hospitals, and this may have an impact on selecting an appropriate data source for pharmacoepidemiological safety studies on biologics. BIO would be pleased to work with the FDA and other key stakeholder groups, such as the International Society for Pharmacoepidemiology (ISPE), to further evaluate the most appropriate data sources and methodologies for database analysis specifically relating to biologic products.


BIO also suggests that the FDA guidance should examine what constitutes the burden of evidence in observational research using electronic healthcare databases. In other words, when does FDA or a sponsor have enough confidence in a finding generated through observational research to warrant additional follow-up or regulatory action? For product approval, the standard for demonstration of efficacy is two well-controlled clinical studies, but the standard for safety findings in observational studies is yet to be defined. For example, would an observational study be confirmed if the findings are replicated: 1) in multiple databases using the same definitions, or 2) utilizing different methods or internal validation (i.e., randomly selected validation cohorts) in the same database? How would conflicting results be regarded, within or between databases? We suggest that FDA consider the variables, or axes of information, necessary to validate an observational study, including whether the analysis relied upon multiple methods, multiple databases, and/or multiple definitions over time.

Analytic methods that are used should be fully evaluated in the public sphere and subject to peer review and evaluation. This transparency should extend to the full analytic protocol such that precise definitions of all variables and covariates are presented.

We also recognize the need to better understand the “natural history of a signal.” For example, what are the rules of monitoring in these data and at what point does a potential signal cross a confidence threshold such that it is no longer just a transient phenomenon?

Finally, BIO notes that there have been a number of discussions on this topic that may be useful to the Agency to help inform the guidance regarding the weight of evidence generated through pharmacoepidemiological study, outcomes research, and other relevant disciplines, as each has its own set of unique issues. i, ii, iii, iv


Though BIO supports the development of guidance on safety studies using large electronic healthcare databases, we note that FDA has three years to issue a final guidance, consistent with the PDUFA IV performance goals. We encourage FDA to shorten this timeline, and issue a final guidance before three years have passed. If this is not possible, we would encourage the Agency to consider what preliminary advice the Agency will offer to medical reviewers, academia, and industry in the interim period as these types of database analyses become increasingly common.


BIO appreciates this opportunity to comment on Developing Guidance on Conducting Scientifically Sound Pharmacoepidemiologic Safety Studies Using Large Electronic Healthcare Data Sets. We would be pleased to provide further input or clarification of our comments, as needed.


/s/ Andrew J. Emmett

Director for Science and Regulatory Affairs

Biotechnology Industry Organization
