Perspective | Christoph Eberle, PhD, Melvin Lye

A Balanced Look at Promises and Hurdles of Virtual Control Groups

A primer on the use of virtual control groups in both the clinical and preclinical setting

Virtual control groups (VCGs), an established concept in clinical trials, are now being vetted in the preclinical setting as a potentially efficient, ethical, and cost-effective alternative to traditional animal controls. What exactly are VCGs? What advantages do they offer, and where do they fall short? And are they universally applicable across therapeutic areas, or more suited to some applications than others?

At its core, a virtual control group is a control cohort that is not enrolled in the same trial but is constructed retrospectively from pre-existing data (historical trial records, real-world data, or legacy databases). In the clinical setting, outcomes from these virtual (untreated or standard-of-care) patients are compared with those of trial participants receiving the experimental therapy.

In nonclinical animal toxicity studies, VCGs have been proposed to replace or reduce the number of living control animals by leveraging previously collected control data under similar experimental conditions. Thus, VCGs are not a fundamentally new type of trial, but rather a design strategy: instead of contemporaneously enrolling a control arm, researchers borrow existing data to form a comparator. They are virtual in the sense that the subjects already existed, and the data had already been generated.

Over the past two years, several companies have been piloting VCGs as replacements for animal controls, including Charles River. In partnership with several pharmaceutical companies, Charles River is leveraging years of legacy data, loosely defined as groups of data points created from historical control datasets associated with past studies, and seeing how these virtual controls stand up to their live animal counterparts in toxicology studies. Thus far, the pilot data from the drug companies and Charles River’s own internal data look promising. Simultaneously, Charles River Discovery scientists are exploring the use of VCGs in certain cancer studies.
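The idea of forming a comparator from previously collected control data "under similar experimental conditions" can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `ControlRecord` fields, the matching criteria (strain, sex, vehicle, age within a tolerance), and the sample data are hypothetical and do not describe any particular company's database or selection procedure.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlRecord:
    """One historical control animal (hypothetical schema)."""
    study_id: str
    strain: str
    sex: str
    age_weeks: int
    vehicle: str
    body_weight_g: float


def build_virtual_control_group(records, *, strain, sex, age_weeks, vehicle,
                                size, age_tolerance_weeks=1, seed=0):
    """Draw `size` historical control animals matching the planned study conditions."""
    eligible = [
        r for r in records
        if r.strain == strain
        and r.sex == sex
        and r.vehicle == vehicle
        and abs(r.age_weeks - age_weeks) <= age_tolerance_weeks
    ]
    if len(eligible) < size:
        raise ValueError(
            f"only {len(eligible)} matching historical controls, need {size}"
        )
    # A fixed seed keeps the draw reproducible and auditable.
    return random.Random(seed).sample(eligible, size)


# Hypothetical legacy data, for illustration only.
historical = [
    ControlRecord("S-001", "Wistar", "M", 8, "saline", 251.0),
    ControlRecord("S-001", "Wistar", "M", 8, "saline", 247.5),
    ControlRecord("S-002", "Wistar", "M", 9, "saline", 262.3),
    ControlRecord("S-003", "Wistar", "F", 8, "saline", 198.4),
    ControlRecord("S-004", "Sprague Dawley", "M", 8, "saline", 270.1),
    ControlRecord("S-005", "Wistar", "M", 12, "saline", 330.8),
]

vcg = build_virtual_control_group(
    historical, strain="Wistar", sex="M", age_weeks=8, vehicle="saline", size=3
)
```

In this sketch the function refuses to build a group when too few matching records exist, which mirrors the point that a VCG is only as good as the pool of comparable historical data behind it.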

Through our 4Rs mission of replacement, reduction, refinement, and responsibility, virtual control groups are driving innovation within the nonclinical phase of drug development.

Driving Innovation Through VCGs
With a depth of nonclinical experience and our dynamic biotechnology network, we are fusing toxicology science with machine learning for a powerful alternative. Modality and model knowledge are leveraged to statistically and biologically qualify and analyze data on targeted studies for this virtual control group option.

Why We Should Include Virtual Control Groups

There are compelling reasons to adopt VCGs or, more broadly, external control arms in clinical and preclinical research, especially when traditional control arms pose ethical, practical, or logistical difficulties:

  • Ethical and patient-centric advantages: Because all participants receive the experimental therapy, no one is assigned to a placebo or no-treatment arm. This can be especially important in serious or life-threatening conditions (e.g., rare diseases, oncology) or in settings where withholding effective therapy is unethical.
  • Resource efficiency and cost/time savings: Enrolling large control cohorts can be expensive, time-consuming, and logistically challenging. Using an existing dataset avoids these burdens.
  • Improved feasibility in small or rare disease populations: When trial populations are very limited, VCGs (or more generally, external controls) may be one of the few realistic options for evaluating efficacy or safety.
  • Flexibility in early-phase or single-arm trials: VCGs can lend evaluative power to phase 1/2 trials, open-label extensions, or expanded-access programs, where the primary aim may be safety, but early efficacy signals are still useful.
  • Lower animal numbers in preclinical studies: In nonclinical toxicology, adoption of VCGs can meaningfully reduce the number of living animals used, in line with ethical and regulatory incentives. Some estimates point to reductions of around 25%.
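One way to see where a reduction of roughly this size can come from is a short arithmetic sketch. It assumes a study design with one control group and three dose groups of equal size, and the group size of 10 is hypothetical. Both are illustrative assumptions, not figures taken from this article.

```python
def animals_saved(group_sizes, control_index=0):
    """Return (animals saved, fraction of total) if the control group is virtual."""
    total = sum(group_sizes)
    saved = group_sizes[control_index]
    return saved, saved / total


# Assumed design: 1 control group + 3 dose groups, 10 animals each.
saved, fraction = animals_saved([10, 10, 10, 10])
print(f"{saved} of 40 animals ({fraction:.0%})")  # prints "10 of 40 animals (25%)"
```

With unequal group sizes, or when only part of the control group is replaced, the fraction changes accordingly.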

Nevertheless, evidence from both preclinical and clinical contexts indicates limitations, making VCGs a complement rather than a wholesale replacement for traditional control groups. Key concerns include:

  • Risk of bias and non-randomized comparisons: Because VCGs rely on historical or external data, the comparison between treated and control subjects may not be randomized. This opens the door to selection bias, confounding, and other systematic differences (in diagnosis, baseline characteristics, co-interventions, standards of care, data collection methods) between groups, all of which can distort effect estimates. Indeed, a classic review comparing historical controls with randomized concurrent controls concluded that survival and relapse-free survival outcomes often diverged substantially. In 42% of matched control groups, the difference exceeded 10 percentage points; in 5%, it exceeded 30 percentage points. Such findings cast doubt on the assumption that historical controls reliably approximate what would have happened in a contemporaneous, randomized control arm.
  • Lack of blinding and potential for observer, patient, or analytic bias: Externally controlled or virtual arm trials are often unblinded (or unable to be properly blinded), which can influence outcome assessment, reporting, or analysis. Such biases are difficult to eliminate or fully adjust for.
  • Differences in outcome definitions, data quality, and measurement methods: Especially in real-world data or older trials, how outcomes were measured (e.g., lab assays, diagnostic criteria, follow-up schedule) might differ from the current trial. Such inconsistencies can lead to information bias or misclassification. In preclinical toxicity studies, this problem is illustrated vividly. A recent proof-of-concept (POC) study using VCGs in rats found that while many pathology endpoints matched between VCG and the concurrent control group (CCG), clinical pathology parameters often did not, raising concerns about sensitivity, detection of subtle effects, and reproducibility.
  • Non-equivalence to concurrent controls, especially when endpoints are subtle or quantitative: Some important endpoints (especially sensitive laboratory-based or subclinical measures) cannot always be reproduced reliably when historical data are substituted for actual concurrent controls. Moreover, success with VCGs in one context (e.g., simple binary outcomes) does not guarantee success for all endpoints, especially nuanced ones, which limits generalizability.
  • Regulatory and methodological challenges: Despite growing interest, regulatory agencies remain cautious. For example, when an external control group is used, regulators emphasize that outcomes must be well defined, reliable, and comparable across datasets. For preclinical implementation, success also depends heavily on the availability of large, well-structured, harmonized historical control datasets. Finally, advanced statistical methods are often necessary to mitigate biases and account for variability, but these add complexity and may not fully compensate for unmeasured confounding.
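A simple way to make the comparability concern concrete is to check, endpoint by endpoint, how far a virtual control group sits from a concurrent one. The Python sketch below uses the standardized mean difference, a common balance diagnostic. The endpoint names, the values, and the 0.5 flagging threshold are hypothetical choices for illustration. This is not the analysis used in the proof-of-concept study mentioned above.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(vcg_values, ccg_values):
    """(mean(VCG) - mean(CCG)) divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(vcg_values) + variance(ccg_values)) / 2)
    if pooled_sd == 0:
        raise ValueError("both groups have zero variance")
    return (mean(vcg_values) - mean(ccg_values)) / pooled_sd


def flag_divergent_endpoints(vcg, ccg, threshold=0.5):
    """Names of shared endpoints whose |SMD| exceeds the (assumed) threshold."""
    return sorted(
        name for name in vcg
        if name in ccg
        and abs(standardized_mean_difference(vcg[name], ccg[name])) > threshold
    )


# Hypothetical measurements: virtual vs. concurrent control animals.
vcg = {
    "body_weight_g": [250, 255, 260, 245, 250],
    "alt_u_per_l": [30, 32, 35, 31, 33],
}
ccg = {
    "body_weight_g": [252, 254, 258, 247, 251],
    "alt_u_per_l": [40, 42, 45, 41, 44],
}

divergent = flag_divergent_endpoints(vcg, ccg)  # → ['alt_u_per_l']
```

Here the body-weight endpoint agrees closely between groups while the made-up clinical pathology value does not. A check like this only detects measured differences and cannot rule out unmeasured confounding.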

In theory, VCGs are methodologically agnostic: the concept can apply to any therapeutic area, provided that high-quality, relevant, and comparable historical or real-world data exist and that outcomes can be rigorously defined and measured. In practice, however, VCGs are not equally reliable across all therapeutic areas or endpoints. Their suitability depends heavily on the disease area, endpoint type, data quality, and methodological rigor. The use-case matrix below summarizes where VCGs are well suited, promising, or risky across different contexts:

| VCG Suitability | Context / Feature | Rationale |
|---|---|---|
| High | Rare diseases / orphan indications | Small patient populations make concurrent controls impractical. Ethical pressures favor all patients receiving experimental therapy. |
| High | Life-threatening conditions / ethically sensitive trials | Avoids placebo/no-treatment arms. Allows early access to experimental therapy. |
| Moderate to High | Early-phase / single-arm trials (Phase 1/2) | Can provide comparative context for safety or preliminary efficacy signals. Endpoints often surrogate or intermediate. |
| High | Objective hard endpoints (e.g., overall survival, mortality, measured biomarker changes) | Less prone to measurement bias. Historical data are more likely to be comparable. |
| Low to Moderate | Soft or subjective endpoints such as patient-reported outcomes | High risk of reporting bias. Historical comparators can differ in analytical methods. |
| Low | Long-term outcomes with changing standard-of-care | Historical data may not reflect contemporary treatment patterns. Risk of confounding. |
| Moderate to High | Preclinical toxicology / animal studies | Reduces the number of animals. Works well when robust historical control data exist under standardized conditions. |
| Low | Endpoints sensitive to experimental conditions or environment | Limitations from differences in experimental conditions, measurement protocols, or animal strains. |
| Moderate | Highly heterogeneous populations / real-world data sources | Statistical methods (propensity score matching, Bayesian models) can help, but unmeasured confounding remains a concern. |
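Propensity score matching, named in the matrix above, can be sketched in a few lines of Python. The sketch assumes the propensity scores have already been estimated elsewhere (for example, by a logistic regression on baseline covariates) and shows only the matching step: greedy 1:1 nearest-neighbor matching without replacement, within a caliper. The subject IDs, scores, and the 0.05 caliper are hypothetical.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on pre-estimated propensity scores.

    `treated` and `controls` map subject IDs to scores in [0, 1].
    Returns a list of (treated_id, control_id) pairs; each control is used once.
    """
    available = dict(controls)
    pairs = []
    # Process treated subjects from highest to lowest score (a common heuristic).
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        # Accept the nearest control only if it lies within the caliper.
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores: trial participants vs. external (historical) patients.
treated = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
controls = {"H1": 0.60, "H2": 0.33, "H3": 0.10, "H4": 0.70}

pairs = greedy_match(treated, controls)  # → [('T1', 'H1'), ('T2', 'H2')]
```

In this example T3 stays unmatched because no external patient has a score within the caliper, which illustrates how matching can shrink the usable sample. Matching balances only the covariates that went into the score, so unmeasured confounding remains.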

Virtual Control Groups: A Versatile Toolbox When Used Prudently

Virtual control groups are not a methodological shortcut that will universally replace traditional randomized, concurrent control arms. Rather, they are a complementary tool that can reduce concurrent control requirements while offering genuine benefits in terms of ethics, efficiency, and feasibility, especially in certain high-need contexts such as rare diseases, early-phase trials, or preclinical safety studies. VCGs are most reliable when endpoints are objective, high-quality historical data are available, and patient populations are small or subject to ethical constraints. VCGs are riskier for subjective or environment-sensitive endpoints.

Their non-randomized, retrospective nature introduces bias, measurement heterogeneity, and uncertainty in interpretation. Particularly in nonclinical toxicology, some endpoints simply do not translate well from historical data, limiting reliability. Moreover, practical and regulatory hurdles remain. Large, well-curated historical datasets are often required, and advanced statistical methods are needed to mitigate bias. Therefore, VCGs should not be viewed as a “one-size-fits-all” solution, but as a nuanced option within New Approach Methodologies (NAMs). When deployed with careful attention to data quality, endpoint selection, and analytical transparency, VCGs can meaningfully accelerate research. However, they require rigorous justification, clear reporting, and, where possible, validation against traditional methods.


References:

1.    Steger-Hartmann T, Kreuchwig A, Vaas L, et al. Introducing the concept of virtual control groups into preclinical toxicology testing. ALTEX, 2020, 37: 343-349. doi: 10.14573/altex.2001311.
2.    Evans SR. Fundamentals of clinical trial design. J Exp Stroke Transl Med, 2010, 3: 19-27. doi: 10.6030/1939-067x-3.1.19.
3.    U.S. Food and Drug Administration. Guidance for Industry: E10 Choice of Control Group and Related Issues in Clinical Trials. 2001. https://www.fda.gov/media/71349/download.
4.    Seeger JD, Davis KJ, Iannacone MR, et al. Methods for external control groups for single arm trials or long-term uncontrolled extensions to randomized clinical trials. Pharmacoepidemiol Drug Saf, 2020, 29: 1382-1392. doi: 10.1002/pds.5141.
5.    Diehl LF, Perry DJ. A comparison of randomized concurrent control groups with matched historical control groups: are historical controls valid? J Clin Oncol, 1986, 4: 1114-1120. doi: 10.1200/jco.1986.4.7.1114.
6.    Andaya R, Sullivan R, Pourmohamad T, et al. A proof-of-concept rat toxicity study highlights the potential utility and challenges of virtual control groups. ALTEX, 2024, 41: 647-659. doi: 10.14573/altex.2404201.
7.    Golden E, Allen D, Amberg A, et al. Toward implementing virtual control groups in nonclinical safety studies. ALTEX, 2024, 41: 282-301. doi: 10.14573/altex.2310041.

 

Bench + Bytes is a column written by Charles River Scientist Christoph Eberle, PhD, and Melvin Lye, Senior Director, Scientific Affairs and Product at Curiox Biosystems. It is hosted by Eureka, Charles River's scientific blog.