ApprovalAlpha — FDA Approval Probability Engine

Data

Data is collected live at scoring time and is point-in-time safe: no feature uses information that was unavailable on the catalyst date.

Source	Data
ClinicalTrials.gov	Trial phases, enrollment pace, primary endpoints, p-values
SEC EDGAR	XBRL financials, Form 4 insider trades, 8-K filings
openFDA	Designations, AdComm votes, CRL history, prior approvals
CTO Dataset	14,700 Phase 3 trial outcome labels (Gao et al. 2024)
PubMed	Published Phase 3 trial results for endpoint verification
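
The point-in-time rule can be sketched as a filter on each record's public availability date. The record layout (`available_on`, `source`, `field`) and the function name are hypothetical, chosen for illustration only; the actual collection pipeline is not described here.

```python
from datetime import date

def point_in_time_view(records, catalyst_date):
    """Keep only records that were public on or before the catalyst date."""
    return [r for r in records if r["available_on"] <= catalyst_date]

# Hypothetical records for one drug; field names and dates are illustrative.
records = [
    {"source": "ClinicalTrials.gov", "field": "primary_p_value", "available_on": date(2023, 3, 1)},
    {"source": "SEC EDGAR", "field": "form4_insider_buy", "available_on": date(2023, 5, 10)},
    {"source": "openFDA", "field": "approval_letter", "available_on": date(2023, 8, 20)},
]

visible = point_in_time_view(records, catalyst_date=date(2023, 6, 30))
print([r["field"] for r in visible])
```

Filtering on the availability date, rather than on the date an event occurred, is what keeps later-published information out of a historical scoring.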

Model

Each drug starts from a historical phase × indication base rate (BIO/IQVIA). Factors in four signal categories shift the probability up or down in log-odds space. An isotonic regression map, fit on a rolling window of events from 2018 onward, then corrects systematic overconfidence at the high end.

Category	# Factors
Clinical	12
Regulatory	11
Financial	3
Non-linear interactions (SHAP)	3
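
The two steps described above (additive shifts in log-odds space, then an isotonic correction) can be sketched in plain Python. This is a minimal illustration, not the production model: the factor names, shift sizes, and toy calibration data are invented, and the isotonic fit uses the standard pool-adjacent-violators procedure as an assumed stand-in for whatever fitting routine the engine uses.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def adjust_in_log_odds(base_rate, shifts):
    """Start from a base rate and add each factor's shift in log-odds space."""
    return sigmoid(logit(base_rate) + sum(shifts.values()))

def fit_isotonic(raw_probs, outcomes):
    """Pool-adjacent-violators: fit a non-decreasing step map from raw to calibrated probability."""
    blocks = []  # each block: [sum of outcomes, count, largest raw probability in block]
    for p, y in sorted(zip(raw_probs, outcomes)):
        blocks.append([y, 1, p])
        # Merge backwards while neighbouring block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]

def calibrate(step_map, raw_prob):
    """Return the value of the first block whose upper edge covers raw_prob."""
    for upper, value in step_map:
        if raw_prob <= upper:
            return value
    return step_map[-1][1]

# Hypothetical factor shifts (log-odds units) applied to a hypothetical base rate.
shifts = {"primary_endpoint_met": 0.9, "prior_crl": -0.6, "short_cash_runway": -0.2}
raw = adjust_in_log_odds(0.55, shifts)

# Toy calibration set: raw predictions that are overconfident at the high end.
raw_probs = [0.2, 0.4, 0.6, 0.8, 0.9, 0.95]
outcomes = [0, 0, 1, 1, 0, 1]
step_map = fit_isotonic(raw_probs, outcomes)
print(round(raw, 3), round(calibrate(step_map, 0.9), 3))
```

Working in log-odds keeps every adjusted probability strictly between 0 and 1, and the isotonic step can only pool or reorder nothing: it preserves the ranking of raw scores up to ties.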

Validation

The model is trained on ~1,000 historical public NDA/BLA events through 2020 and evaluated out-of-time on ~500 events from 2021 to the present.

Metric	Value
AUC-ROC (out-of-time, n=512)	0.841
AUC-ROC, small- and mid-cap subset (mega-pharma excluded)	0.81
Brier score	0.117
Expected Calibration Error	2.1%
Accuracy (optimal threshold)	83%

The headline AUC is buoyed by mega-pharma events (PFE, JNJ, NVS) that are trivially predictable from sponsor track record. On the small- and mid-cap subset, where prediction is hard and the use case lives, the AUC is 0.81.
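
For readers who want to reproduce the metrics in the table, the definitions can be sketched in plain Python. The toy predictions, labels, and the `is_mega` flag are invented for illustration, and the 10-bin equal-width ECE is an assumption; the binning actually used for the reported figure is not stated.

```python
def auc_roc(probs, labels):
    """Chance that a random positive outranks a random negative (ties count half)."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier_score(probs, labels):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)

def expected_calibration_error(probs, labels, n_bins=10):
    """Size-weighted average of |mean prediction - observed rate| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece = 0.0
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            rate = sum(y for _, y in b) / len(b)
            ece += len(b) / len(probs) * abs(mean_p - rate)
    return ece

# Toy out-of-time set; is_mega marks hypothetical mega-pharma sponsors.
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
is_mega = [True, False, False, False, True, False]

overall_auc = auc_roc(probs, labels)
subset = [(p, y) for p, y, m in zip(probs, labels, is_mega) if not m]
subset_auc = auc_roc([p for p, _ in subset], [y for _, y in subset])
print(overall_auc, subset_auc, brier_score(probs, labels), expected_calibration_error(probs, labels))
```

In this toy set the two mega-pharma events are the easiest to rank, so dropping them lowers the AUC, which mirrors the gap between the headline and subset figures.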

Benchmarks

All models use the same training data and are evaluated on the same out-of-time public test set.

Model	AUC-ROC
ApprovalAlpha	0.841
Lo et al. 2019 method (reproduced)	0.807
Phase × indication base rate only	0.667

Designations Are Not a Free Lift

A propensity-matched analysis on the training data shows that, after controlling for base rate, trial results, sponsor history, endpoint type, and mechanism of action, Priority Review is the only FDA designation with a positive estimated causal effect. Breakthrough Therapy, Fast Track, and Orphan Drug act as markers of clinical difficulty: FDA grants them to drugs whose path is inherently harder. The model uses the causal-adjusted coefficients, not the naive correlations.
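
The difference between a naive correlation and a matched estimate can be sketched with nearest-neighbour matching on a propensity score. Everything here is an assumption for illustration: the propensity values would come from a model of designation given the covariates listed above (not shown), the caliper of 0.05 is arbitrary, and the toy numbers are invented, not results from the training data.

```python
def naive_difference(units):
    """Raw difference in approval rate between designated and non-designated drugs."""
    treated = [u["approved"] for u in units if u["designated"]]
    control = [u["approved"] for u in units if not u["designated"]]
    return sum(treated) / len(treated) - sum(control) / len(control)

def matched_effect(units, caliper=0.05):
    """Mean outcome gap between each designated drug and its nearest-propensity control."""
    treated = [u for u in units if u["designated"]]
    controls = [u for u in units if not u["designated"]]
    diffs = []
    for t in treated:
        nearest = min(controls, key=lambda c: abs(c["propensity"] - t["propensity"]))
        # Discard pairs that are too far apart to be comparable.
        if abs(nearest["propensity"] - t["propensity"]) <= caliper:
            diffs.append(t["approved"] - nearest["approved"])
    return sum(diffs) / len(diffs) if diffs else None

# Invented toy data: designated drugs cluster at high propensity (harder programs).
units = [
    {"designated": True, "propensity": 0.80, "approved": 0},
    {"designated": True, "propensity": 0.70, "approved": 1},
    {"designated": True, "propensity": 0.76, "approved": 0},
    {"designated": False, "propensity": 0.78, "approved": 0},
    {"designated": False, "propensity": 0.72, "approved": 0},
    {"designated": False, "propensity": 0.10, "approved": 1},
    {"designated": False, "propensity": 0.15, "approved": 1},
    {"designated": False, "propensity": 0.20, "approved": 1},
]

print(naive_difference(units), matched_effect(units))
```

In the toy data the naive difference is negative because designated drugs are compared against easy programs; once each is compared with a control of similar difficulty, the sign changes. Matching is done with replacement here, so one control can serve several designated drugs.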

Dossier Outputs

Beyond the headline probability, every scoring emits five structured outputs. Each is a deterministic post-hoc transform of the model output, with no LLMs and no extra training.

Output	What it is
Top drivers	Per-factor pp impact on this specific prediction.
Comparables	Five most-similar historical events (cosine similarity weighted by L1 coefficient magnitude) with realised outcomes.
Sensitivity	What the prediction becomes if any one factor flips toward the opposite class's typical value.
Subscores	Clinical / Regulatory / Sponsor decomposition. Describes which feature bucket pulls probability down, not which CRL category will occur.
Probability history	Sparkline of prior scorings for the same drug as new data lands (8-Ks, AdComm decisions, dilution events).
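
Two of these outputs lend themselves to a short sketch. The code below assumes that a driver's percentage-point impact is the change in probability when that factor's log-odds shift is removed, and that the comparables weighting multiplies each feature's contribution to the cosine by the absolute value of its coefficient. Both are assumed readings of the table, and all names and numbers are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def driver_impacts_pp(base_rate, shifts):
    """Percentage-point change in the prediction attributable to each factor (leave-one-out)."""
    z_full = logit(base_rate) + sum(shifts.values())
    p_full = sigmoid(z_full)
    return {name: 100.0 * (p_full - sigmoid(z_full - s)) for name, s in shifts.items()}

def weighted_cosine(a, b, weights):
    """Cosine similarity where each feature counts in proportion to its weight."""
    num = sum(w * x * y for w, x, y in zip(weights, a, b))
    norm_a = math.sqrt(sum(w * x * x for w, x in zip(weights, a)))
    norm_b = math.sqrt(sum(w * y * y for w, y in zip(weights, b)))
    return num / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_comparables(query, history, weights, k=5):
    """Return the k most similar historical events as (similarity, event) pairs."""
    scored = [(weighted_cosine(query, h["features"], weights), h) for h in history]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical factor shifts and coefficient magnitudes.
shifts = {"primary_endpoint_met": 0.9, "prior_crl": -0.6, "short_cash_runway": -0.2}
impacts = driver_impacts_pp(0.55, shifts)
for name, pp in sorted(impacts.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name}: {pp:+.1f} pp")

weights = [0.9, 0.6, 0.2]
query = [1.0, 0.0, 1.0]
history = [
    {"event": "EVENT_A", "features": [1.0, 0.0, 1.0], "approved": 1},
    {"event": "EVENT_B", "features": [0.0, 1.0, 0.0], "approved": 0},
    {"event": "EVENT_C", "features": [1.0, 1.0, 0.0], "approved": 1},
]
comparables = top_comparables(query, history, weights, k=2)
for sim, event in comparables:
    print(event["event"], round(sim, 3), event["approved"])
```

Because the transform from log-odds to probability is non-linear, leave-one-out impacts need not sum exactly to the gap between the base rate and the final prediction.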

Disclosures

Model outputs reflect public information only; non-public FDA review material is not observable.
Historical accuracy does not guarantee future performance.
Not investment advice. Research tool, not a buy/sell signal.

References

Lo, A.W. et al. (2019). Machine learning with statistical imputation for predicting drug approvals. Harvard Data Science Review.
Siah, K.W. et al. (2021). Predicting drug approvals: the Novartis data science and AI challenge. Patterns.
Wong, C.H. et al. (2018). Estimation of clinical trial success rates and related parameters. Biostatistics.
Gao, Z. et al. (2024). CTO: Clinical Trial Outcome prediction dataset. NeurIPS Datasets & Benchmarks.
BIO/IQVIA/QLS Advisors (2021). Clinical Development Success Rates 2011–2020.

Built by Sean Koth, finance student at Fordham University's Gabelli School of Business.
