Introducing Topos-1

Today we're releasing the first technical report on Topos-1, our all-atom foundation model for intrinsically disordered proteins (IDPs). Topos-1 generates physically realistic conformational ensembles for flexible proteins and does so at a speed that makes ensemble-level modeling practical in day-to-day research and drug discovery.

Artificial intelligence is transforming biomedicine through advances in protein structure prediction, but many diseases remain out of reach because the proteins involved are too dynamic for current tools.

Why static structures break down

Biomolecules are dynamic, but most experimental measurements resolve only the static elements of their structure.

Many proteins are entirely disordered or contain regions with a high degree of intrinsic disorder. They do not adopt a single well-defined state, so the most widely used experimental characterization methods cannot resolve them.

Estimates suggest that roughly one-third of the human proteome lacks a single well-defined structure. This matters because many essential interactions, including regulation, signaling, transcriptional control, and phase behavior, depend on flexible motifs and state-dependent recognition.

A foundation model for conformational ensembles

Topos-1 is an all-atom generative model that produces ensembles of physically realistic conformations for disordered proteins and regions. Instead of returning one structure, Topos-1 can generate tens of thousands of conformers in minutes, at full atomic resolution.

Topos-1 is designed to produce complete conformations suitable for downstream physical analysis and structure-based workflows. Both the model design and the training pipeline are aimed at capturing the statistics of highly flexible proteins, which are sparsely represented in the classical structural databases that enabled the current wave of AlphaFold-style structure predictors. Training draws on large-scale physics-based simulations and internal experimental measurements designed specifically for disordered proteins.

Validated against experiment

We evaluate Topos-1 on both global and local experimental observables.

Global ensemble properties (SAXS and radius of gyration)

We curated a large literature dataset of SAXS-derived radius of gyration (Rg) measurements for IDPs.

Across 104 IDPs, Topos-1 reduces normalized Rg error by at least 40% compared to leading structure prediction models including BioEmu, Chai-1, Boltz-2x, and AlphaFold-2.
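For readers who want to run a similar comparison on their own ensembles, here is a minimal sketch of computing an ensemble-level Rg from atomic coordinates and a relative error against a SAXS-derived value. It is illustrative only. The function names are hypothetical, coordinates are plain (x, y, z) tuples, and the error definition |Rg_pred - Rg_exp| / Rg_exp is an assumption; the normalization actually used is defined in the technical report, not here. Averaging Rg squared over conformers before taking the square root is a common convention when comparing with SAXS.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Rg of a single conformer. coords: list of (x, y, z) tuples."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal weights if no masses given
    total = sum(masses)
    # Mass-weighted center of the conformer.
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total
              for k in range(3)]
    # Mass-weighted mean squared distance from the center.
    sq = sum(m * sum((c[k] - center[k]) ** 2 for k in range(3))
             for m, c in zip(masses, coords))
    return math.sqrt(sq / total)

def ensemble_rg(ensemble, masses=None):
    """Root-mean-square Rg over all conformers in the ensemble."""
    squared = [radius_of_gyration(conf, masses) ** 2 for conf in ensemble]
    return math.sqrt(sum(squared) / len(squared))

def normalized_rg_error(rg_pred, rg_exp):
    """Relative error (assumed definition): |pred - exp| / exp."""
    if rg_exp <= 0:
        raise ValueError("experimental Rg must be positive")
    return abs(rg_pred - rg_exp) / rg_exp
```

A per-protein error computed this way can then be averaged over a benchmark set to compare models.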

Local conformational statistics (NMR chemical shifts)

We also benchmarked NMR chemical shift agreement, which probes local backbone geometry.

Across a set of IDPs with high-quality NMR data, Topos-1 achieves at least 30% lower error compared to leading all-atom structure prediction models including Boltz-2x, AlphaFold-2, and Chai-1.
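As a rough illustration of how chemical shift agreement can be scored, the sketch below averages per-conformer predicted shifts over an ensemble and computes a root-mean-square error against measured values. The use of RMSE is an assumption, as are the function names. The per-conformer shifts are taken as given inputs from some external shift predictor, which is not shown.

```python
import math

def ensemble_average_shifts(per_conformer_shifts):
    """Average predicted shifts over conformers.

    per_conformer_shifts: list of lists, one inner list of per-residue
    shifts (in ppm) for each conformer; all inner lists have equal length.
    """
    n_conf = len(per_conformer_shifts)
    n_res = len(per_conformer_shifts[0])
    return [sum(conf[i] for conf in per_conformer_shifts) / n_conf
            for i in range(n_res)]

def shift_rmse(predicted, measured):
    """RMSE over residues with a measurement (None marks a missing value)."""
    pairs = [(p, m) for p, m in zip(predicted, measured) if m is not None]
    if not pairs:
        raise ValueError("no residues with measured shifts")
    return math.sqrt(sum((p - m) ** 2 for p, m in pairs) / len(pairs))
```

Averaging before comparing matters for disordered proteins: for a rapidly interconverting ensemble, the measured shift reflects the population average rather than any single conformer.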

Together, these results show that Topos-1 improves predictions for both the large-scale shape of the ensemble and the local geometry that controls recognition, binding, and function.

Consistent with molecular dynamics, but at a fraction of the time

Molecular dynamics (MD) remains the most general way to generate IDP ensembles, but it’s too slow and expensive to be practical for rational drug design.

In the report, we compare Topos-1 to reference MD ensembles across 270 IDPs and evaluate multiple ensemble metrics, including Rg, per-residue RMSF, and secondary-structure propensities. Topos-1 is the top-performing all-atom model across these observables, outperforming general-purpose structure predictors and prior IDP-focused methods.
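For reference, root-mean-square fluctuation (RMSF) measures how far each atom strays from its mean position across an ensemble. The sketch below is a minimal version under stated assumptions: the function name is hypothetical, conformers are lists of (x, y, z) tuples with identical atom ordering, and the conformers are assumed to be already superimposed on a common frame. The alignment choice is not trivial for disordered proteins, and this is not necessarily the protocol used in the report.

```python
import math

def rmsf(ensemble):
    """Per-atom RMSF over an ensemble of pre-aligned conformers."""
    n_conf = len(ensemble)
    n_atoms = len(ensemble[0])
    result = []
    for i in range(n_atoms):
        # Mean position of atom i across all conformers.
        mean = [sum(conf[i][k] for conf in ensemble) / n_conf
                for k in range(3)]
        # Mean squared deviation of atom i from its mean position.
        msd = sum(sum((conf[i][k] - mean[k]) ** 2 for k in range(3))
                  for conf in ensemble) / n_conf
        result.append(math.sqrt(msd))
    return result
```

Comparing the RMSF profile of a generated ensemble with that of a reference MD ensemble, residue by residue, gives one view of whether flexibility is distributed along the chain in the same way.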

Our estimates suggest Topos-1 can generate ensembles roughly 1,000x faster than atomistic MD in explicit solvent.

Case studies

Parkinson’s disease: α-synuclein

α-synuclein is a canonical IDP central to Parkinson’s disease biology. In our case study, Topos-1 generates ensembles that closely match both experimental secondary-structure propensities and high-quality MD reference data (including long-timescale simulations). We also analyze finer-grained geometric agreement (e.g., residue-level dihedral statistics), where Topos-1 shows significantly superior performance compared to baseline models.

Prostate cancer: androgen receptor N-terminal domain (AR NTD)

The androgen receptor NTD is a clinically validated driver of aggressive prostate cancer and a notoriously difficult intrinsically disordered target.

In the report, we use Topos-1 ensembles as the structural substrate for ligand evaluation and show that predicted rankings align with experimentally measured potencies from cell-based assays. Notably, even without prior information about binding sites, the Topos-1–based pipeline achieves meaningful correlation and ranking power against experiment (e.g., a representative run reports Pearson r ≈ 0.58 and Spearman ρ ≈ 0.64), and repeated runs with different random seeds yield consistent rankings.
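The two correlation measures quoted above answer slightly different questions: Pearson r tests for a linear relationship between predicted scores and measured potencies, while Spearman ρ compares only their rank order. A minimal stdlib sketch of both follows. The function names are hypothetical, and nothing here reproduces the report's ligand-evaluation pipeline or its data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(vx * vy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

For a ranking task such as prioritizing compounds, Spearman ρ is often the more relevant number, because it is unaffected by any monotonic rescaling of the predicted scores.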

The technical report contains the full set of benchmarks, evaluation protocols, and case-study details, along with evidence for power-law scaling with additional data and model capacity.

Topos-1 is our first major milestone toward a broader goal of designing new therapeutic strategies that address so-called “undruggable” diseases.

[Animation: Topos-1 generates conformational ensembles that capture the dynamic behavior of intrinsically disordered proteins.]

Read the complete technical details in our full report.