METHODOLOGY

How it works

ApprovalAlpha estimates the probability that a drug application will receive FDA approval by combining published clinical success rates with real-time signals from public data sources.

DATA SOURCES

ClinicalTrials.gov — Active trials, enrollment pace, and phase advancement history for every program in the pipeline.

SEC EDGAR — Financial filings provide cash runway and insider transaction patterns. 8-K filings are scanned for FDA correspondence including advisory committee outcomes and designation grants.

FDA openFDA API — Drug application records, approval history, and regulatory track record at the company level.
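
As an illustration of what a company-level lookup against openFDA might look like, the sketch below builds a query URL for the public Drugs@FDA endpoint. It only constructs the URL and sends no request. The helper name `build_openfda_query` and the choice to search on the `sponsor_name` field are assumptions made for this example; the source does not describe the actual queries ApprovalAlpha issues.

```python
from urllib.parse import urlencode

# Public openFDA Drugs@FDA endpoint.
OPENFDA_DRUGSFDA = "https://api.fda.gov/drug/drugsfda.json"


def build_openfda_query(sponsor: str, limit: int = 5) -> str:
    """Return a Drugs@FDA query URL for one sponsor (hypothetical helper).

    Assumption: records are matched on the `sponsor_name` field.
    No network request is made here.
    """
    params = {"search": f'sponsor_name:"{sponsor}"', "limit": limit}
    return f"{OPENFDA_DRUGSFDA}?{urlencode(params)}"


if __name__ == "__main__":
    # "ACME" is a placeholder sponsor name, not a real company in the pipeline.
    print(build_openfda_query("ACME", limit=3))
```

The returned URL can be passed to any HTTP client; rate limits and API keys are left out of this sketch.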

MODEL

The model is a proprietary multi-factor probability model anchored to published phase × indication success rate tables. Each program is scored independently across a set of clinical, regulatory, and financial factors, and the program scores are then combined into a portfolio-level probability.
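
The actual model is proprietary, so the following is only a minimal sketch of one common way to anchor a probability to a base rate and then adjust it with factors. It assumes the adjustments are additive in log-odds space and that the portfolio-level figure is the probability of at least one approval across independent programs. Both assumptions, all function names, and all numeric values are hypothetical and are not taken from the source.

```python
import math


def logit(p: float) -> float:
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def program_probability(base_rate: float, adjustments: dict[str, float]) -> float:
    """Shift a base success rate by per-factor log-odds adjustments.

    Assumption: factors combine additively in log-odds space.
    """
    return sigmoid(logit(base_rate) + sum(adjustments.values()))


def at_least_one_approval(probabilities: list[float]) -> float:
    """Probability that at least one program is approved.

    Assumption: program outcomes are independent.
    """
    return 1.0 - math.prod(1.0 - p for p in probabilities)


if __name__ == "__main__":
    # Illustrative values only; not real success rates or factor weights.
    p = program_probability(0.50, {"regulatory": 0.4, "financial": -0.1})
    print(round(p, 3))
    print(at_least_one_approval([0.5, 0.5]))
```

Working in log-odds keeps the adjusted value strictly between 0 and 1 regardless of how many factors are applied, which is why it is a frequent choice for base-rate adjustment.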

The model is validated against a historical dataset of FDA decisions using a time-based holdout. Output is a calibrated probability estimate with a confidence interval, not a qualitative rating.

VALIDATION

The model is evaluated on a time-based holdout: it is calibrated on decisions from 2013–2020 and tested on outcomes from 2021–2024. The historical dataset spans 257 FDA decisions from 2013–2024, covering both approvals and complete response letters.

AUC-ROC: 0.701 (vs. 0.50 random baseline)
Accuracy: 75.8% (at 0.50 threshold)
Test Events: 91 (2021–2024 holdout)
Historical Dataset: 257 (FDA decisions, 2013–2024)
Training Window: 2013–2020 (calibration period)
Data Sources: 3 (per ticker, fetched live)
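
To make the evaluation setup concrete, here is a minimal sketch of a time-based split plus the two reported metrics, written in plain Python. The record layout (`decision_date`, `approved`, `score`) and the function names are assumptions for illustration, and the toy data is invented; none of it comes from the actual dataset.

```python
from datetime import date


def time_split(records, cutoff):
    """Split records into (train, test) by decision date; test is on/after cutoff."""
    train = [r for r in records if r["decision_date"] < cutoff]
    test = [r for r in records if r["decision_date"] >= cutoff]
    return train, test


def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.50):
    """Share of cases where (score >= threshold) matches the outcome."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


if __name__ == "__main__":
    # Invented toy records: approved=1 is an approval, 0 a complete response letter.
    records = [
        {"decision_date": date(2019, 5, 1), "approved": 1, "score": 0.8},
        {"decision_date": date(2021, 3, 1), "approved": 1, "score": 0.9},
        {"decision_date": date(2022, 7, 1), "approved": 0, "score": 0.7},
        {"decision_date": date(2023, 2, 1), "approved": 1, "score": 0.6},
        {"decision_date": date(2024, 1, 1), "approved": 0, "score": 0.2},
    ]
    train, test = time_split(records, cutoff=date(2021, 1, 1))
    y = [r["approved"] for r in test]
    s = [r["score"] for r in test]
    print(len(train), len(test), auc_roc(y, s), accuracy(y, s))
```

As a sanity check on the reported figures, 75.8% accuracy on 91 test events is consistent with 69 correct calls (69 / 91 ≈ 0.758); that count is an arithmetic inference, not a number stated in the source.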

LIMITATIONS

Manufacturing and CMC-related complete response letters are structurally undetectable from public pre-PDUFA data and represent the primary source of model error. All outputs are for research purposes only and do not constitute investment advice.
