Cryptic Binding Pockets: How Conformational Sampling Changes Druggability Predictions

2024-12-05 Tomás Vidal — Computational Chemistry Lead 11 min read

The published crystal structure shows one conformation. The therapeutically relevant binding event may involve a pocket that only opens transiently during protein dynamics. Conformational sampling methods now make these cryptic pockets accessible for structure-guided campaigns.

What a Crystal Structure Does and Does Not Tell You

X-ray crystallography captures a low-energy conformation of a protein as stabilized by a specific crystal packing environment. For well-behaved, conformationally rigid binding sites (the ATP-binding cleft of a kinase in the DFG-in conformation, the active site of a serine protease), this conformation closely approximates the bioactive state that ligands exploit. For proteins with significant inherent flexibility, or with allosteric sites that open conditionally, the crystal structure may represent a closed or inactive state that does not reflect the pocket geometry available to a small molecule in solution.

This is not a failure of crystallography. It is a physical reality of protein dynamics: the crystal lattice forces and the cryo-cooling conditions used to reduce radiation damage both disfavor conformational heterogeneity. The population-weighted ensemble of conformations that a protein explores at physiological temperature in solution is partially collapsed onto a single low-energy state in the deposited structure. Cryo-EM structures are somewhat more informative about conformational heterogeneity — multi-particle 3D classification can separate distinct conformational populations — but resolving minor states below 15-20% population frequency remains challenging even with state-of-the-art data collection.

Cryptic binding pockets are defined as pockets that are absent or occluded in the apo crystal structure but become accessible during protein dynamics. Published analyses of MD simulation ensembles suggest that 30-40% of surveyed protein surfaces contain cryptic pocket events when simulations are run for sufficient timescales. The fraction that are druggable — deep enough, enclosed enough, and with sufficient physicochemical complementarity to bind small molecules with sub-micromolar affinity — is smaller, but includes therapeutically relevant examples across kinase, protease, and intrinsically disordered target classes.

Why Cryptic Pockets Matter for SBDD Campaigns

The canonical example from the published literature is the allosteric site on BCR-ABL kinase exploited by asciminib. The myristoyl binding pocket on the C-terminal lobe of the ABL kinase domain is occluded in most published crystal structures of the kinase in its active conformation. It opens in a specific inactive conformation associated with myristoyl-mediated auto-regulation. A structure-based drug design (SBDD) screen against the canonical ATP-binding site of ABL would not have discovered asciminib, because the pocket simply was not visible in the standard structural input. The discovery required a combination of NMR fragment screening and targeted crystallography under conditions that captured the open allosteric state.

A second well-documented case: c-Src kinase has a cryptic pocket in the αC-helix region that opens transiently in MD simulations of the DFG-out conformation. Published Markov State Model analysis of the Src conformational ensemble identified this pocket's open state as comprising approximately 8-12% of the sampled population under physiological simulation conditions: visible in MD trajectories, but not represented in any PDB structure available at the time of the analysis. Fragment docking against the open-state pocket geometry identified chemotypes that subsequent selectivity profiling confirmed prefer this allosteric vector over the ATP site.
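To put an open-state population of roughly 8-12% in energetic terms, the sketch below converts a population into a free-energy cost and an apparent affinity shift. It assumes a simple two-state (open/closed) model in which the ligand binds only the open state; that model, the 310 K temperature, and the function names are illustrative assumptions, not part of the published Src analysis.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)

def open_state_penalty(p_open, temperature_k=310.0):
    """Free energy (kcal/mol) of the open state relative to the closed state,
    assuming a two-state equilibrium: dG = -RT * ln(p_open / p_closed)."""
    if not 0.0 < p_open < 1.0:
        raise ValueError("p_open must be strictly between 0 and 1")
    return -R_KCAL * temperature_k * math.log(p_open / (1.0 - p_open))

def apparent_kd(kd_open, p_open):
    """Apparent Kd if the ligand binds only the open state
    (pure conformational selection): Kd_app = Kd_open / p_open."""
    return kd_open / p_open

for p in (0.08, 0.10, 0.12):
    print(f"p_open={p:.2f}  dG_open={open_state_penalty(p):.2f} kcal/mol  "
          f"Kd shift x{apparent_kd(1.0, p):.1f}")
```

Under these assumptions, a 10% open-state population costs a little over 1 kcal/mol and weakens apparent affinity about tenfold relative to binding a permanently open pocket.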

For programs against historically difficult targets — PPI interfaces, transcription factors, constitutively active mutant oncoproteins — the question of whether cryptic pockets exist is frequently the most important structural biology question a discovery program can ask. Treating a protein as undruggable based solely on its apo structure is precisely the error that a conformational ensemble analysis can correct or confirm at relatively low cost compared to a full HTS campaign against a blind target.

Conformational Sampling Methods

Standard MD simulations at nanosecond timescales are adequate for sampling backbone fluctuations and side-chain rotamers near the binding site. They are not adequate for sampling large-amplitude conformational events — loop openings, helix displacements, domain rotations — that generate cryptic pockets. These events occur on microsecond to millisecond timescales, outside what is computationally tractable for routine project work with standard GROMACS or AMBER MD.

Accelerated MD (aMD) adds a boost potential, typically on the dihedral terms, the total potential energy, or both, that raises energy wells and so lowers effective barriers, accelerating conformational sampling by two to four orders of magnitude at a modest accuracy cost for relative population weights. Gaussian accelerated MD (GaMD) uses a harmonic boost whose values follow a near-Gaussian distribution, which makes reweighting by cumulant expansion more tractable. Neither method requires pre-defined reaction coordinates. Both require careful reweighting to map sampled conformations back to the unperturbed ensemble; without it, boosted populations are qualitative, not quantitative.
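A minimal sketch of the boost and reweighting step follows. It uses the standard aMD boost form, dV = (E - V)^2 / (alpha + E - V) for V < E, the harmonic GaMD form, and plain exponential reweighting of each frame by exp(beta * dV). The energies, threshold, and alpha values are made up for illustration, and the function names are hypothetical rather than any MD package's API.

```python
import math

KB_KCAL = 0.0019872  # Boltzmann constant in kcal/(mol*K)

def amd_boost(v, e_threshold, alpha):
    """aMD boost: (E - V)^2 / (alpha + E - V) when V < E, otherwise 0."""
    if v >= e_threshold:
        return 0.0
    gap = e_threshold - v
    return gap * gap / (alpha + gap)

def gamd_boost(v, e_threshold, k):
    """GaMD boost: harmonic, 0.5 * k * (E - V)^2 when V < E, otherwise 0."""
    if v >= e_threshold:
        return 0.0
    return 0.5 * k * (e_threshold - v) ** 2

def reweighted_populations(state_labels, boosts, temperature_k=300.0):
    """Exponential reweighting: weight each boosted frame by exp(beta * dV)."""
    beta = 1.0 / (KB_KCAL * temperature_k)
    max_boost = max(boosts)  # shift exponents for numerical stability
    totals = {}
    for state, dv in zip(state_labels, boosts):
        totals[state] = totals.get(state, 0.0) + math.exp(beta * (dv - max_boost))
    norm = sum(totals.values())
    return {state: w / norm for state, w in totals.items()}

# Hypothetical frames: the closed state sits lower in energy, so it gets more boost.
labels = ["closed"] * 5 + ["open"] * 5
energies = [-102.0] * 5 + [-100.5] * 5  # kcal/mol, illustrative values only
boosts = [amd_boost(v, e_threshold=-100.0, alpha=4.0) for v in energies]
populations = reweighted_populations(labels, boosts)
print(populations)
```

In this toy case the boosted trajectory visits both states equally, but reweighting shifts the estimate back toward the closed state. Exponential averaging becomes noisy when boosts are large, which is the practical motivation for cumulant-expansion reweighting in GaMD.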

Mixed-solvent MD (cosolvent MD, MixMD) probes potential binding sites by running simulations with low concentrations of small organic probe molecules — isopropyl alcohol, acetonitrile, benzene, fragment-sized heterocycles — dissolved in the explicit solvent around the protein. Sites that accumulate probe density above background are candidates for cryptic pocket opening. FTMAP provides a computationally cheaper solvent mapping approach without full MD trajectories. Published results on MDM2, KRAS G12C, and Hsp90 N-terminal domain all demonstrated probe accumulation at cryptic or allosteric sites before those pockets were confirmed by crystallography.
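The probe-accumulation idea can be sketched as a simple occupancy grid. The example bins probe-atom positions into cubic voxels and flags voxels whose counts exceed a uniform-density expectation by a chosen factor. It assumes frames are already aligned to a common reference and that coordinates fall inside the box; the 2 Å spacing, the fourfold cutoff, and the function name are illustrative assumptions. Production MixMD analyses normalize against the bulk probe density and exclude protein volume, which this sketch does not.

```python
from collections import Counter

def probe_hotspots(probe_coords, box_lengths, spacing=2.0, fold_over_bulk=4.0):
    """Return {voxel_index: enrichment} for voxels whose probe count is at least
    `fold_over_bulk` times the count expected from a uniform distribution."""
    counts = Counter(
        tuple(int(c // spacing) for c in xyz) for xyz in probe_coords
    )
    n_voxels = 1
    for length in box_lengths:
        n_voxels *= max(1, int(length // spacing))
    expected = len(probe_coords) / n_voxels  # uniform expectation per voxel
    return {
        voxel: n / expected
        for voxel, n in counts.items()
        if n / expected >= fold_over_bulk
    }

# Synthetic check: one probe per voxel as "bulk", plus 20 extra probes at one site.
bulk = [
    (x + 1.0, y + 1.0, z + 1.0)
    for x in range(0, 10, 2)
    for y in range(0, 10, 2)
    for z in range(0, 10, 2)
]
site = [(3.0, 3.0, 3.0)] * 20
hotspots = probe_hotspots(bulk + site, box_lengths=(10.0, 10.0, 10.0))
print(hotspots)
```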

Markov State Models (MSMs) offer a rigorous statistical framework for analyzing large aggregate MD trajectory datasets. By clustering trajectory snapshots into conformational microstates and estimating transition probabilities between states at a chosen lag time, MSMs can identify metastable conformational populations at timescales inaccessible to any single trajectory. The PyEMMA and deeptime libraries provide accessible implementations. The limitation is data requirements: well-converged MSMs for a new target typically require hundreds of microseconds of aggregate simulation time, making MSM construction a project in itself rather than a screening add-on.
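The core MSM arithmetic can be illustrated without any library. The sketch below counts transitions in a discretized trajectory at a fixed lag, row-normalizes the counts into a transition matrix, and finds the stationary distribution by power iteration. It is a naive estimator on a toy two-state trajectory; it does not reproduce the reversible maximum-likelihood or Bayesian estimators that PyEMMA and deeptime implement, and the function names are hypothetical.

```python
def estimate_msm(dtraj, n_states, lag=1):
    """Row-normalized transition matrix from a discrete trajectory at lag `lag`."""
    counts = [[0.0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1.0
    tmatrix = []
    for state, row in enumerate(counts):
        total = sum(row)
        if total == 0.0:
            raise ValueError(f"state {state} has no outgoing transitions")
        tmatrix.append([c / total for c in row])
    return tmatrix

def stationary_distribution(tmatrix, n_iter=10000, tol=1e-12):
    """Stationary distribution via power iteration: repeat pi <- pi * T."""
    n = len(tmatrix)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [sum(pi[i] * tmatrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

# Toy trajectory: state 0 = closed, state 1 = open (visited one frame in ten).
dtraj = ([0] * 9 + [1]) * 50
tmatrix = estimate_msm(dtraj, n_states=2, lag=1)
pi = stationary_distribution(tmatrix)
print(tmatrix, pi)
```

On real data the lag time has to be validated (for example with implied-timescale plots) before the stationary populations are trusted.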

From Conformational Ensemble to Docking-Ready Pocket

Identifying candidate cryptic pocket events in an MD ensemble is the first step; converting those to a docking-ready model is the second. The challenge is that pocket geometry in a transient open state may only be well-resolved in a minority of trajectory frames, and those frames must be selected, clustered, and prepared for docking in a way that preserves the conformational context.

A practical workflow: cluster the MD trajectory by the RMSD of pocket-lining residues (not global backbone RMSD, which is dominated by distant fluctuations). Select representative frames from clusters with pocket volume above approximately 200 Å³ for fragment-sized ligands. Prepare each representative frame by assigning protonation states with PROPKA at pH 7.4 and applying restrained minimization in the OPLS4 or AMBER force fields (the Protein Preparation Wizard in the Schrödinger Suite handles this efficiently). Then define the docking grid around the newly opened pocket volume using SiteMap site-point detection.
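The clustering and frame-selection steps of this workflow could be sketched as follows. The example assumes pocket-lining atom coordinates have already been extracted and superposed, and that a per-frame pocket volume has been computed by some external tool. The leader-clustering algorithm, the RMSD cutoff, and the choice of the member closest to the cluster mean volume as representative are all illustrative assumptions; the protonation, minimization, and grid steps are not covered.

```python
import math

def rmsd(a, b):
    """RMSD between two equally sized, pre-aligned coordinate sets."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def leader_cluster(frames, cutoff):
    """Assign each frame to the first cluster leader within `cutoff` (in Å);
    frames that match no leader start a new cluster."""
    leaders, members = [], []
    for idx, coords in enumerate(frames):
        for c, leader_idx in enumerate(leaders):
            if rmsd(coords, frames[leader_idx]) <= cutoff:
                members[c].append(idx)
                break
        else:
            leaders.append(idx)
            members.append([idx])
    return members

def select_open_representatives(frames, volumes, cutoff=1.0, min_volume=200.0):
    """For clusters whose mean pocket volume is >= min_volume (Å^3), return the
    index of the member whose volume is closest to the cluster mean."""
    reps = []
    for cluster in leader_cluster(frames, cutoff):
        mean_volume = sum(volumes[i] for i in cluster) / len(cluster)
        if mean_volume >= min_volume:
            reps.append(min(cluster, key=lambda i: abs(volumes[i] - mean_volume)))
    return reps

# Hypothetical two-atom "pockets": frames 0, 1, 3 are closed; 2, 4, 5 are open.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.9, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (4.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (4.1, 0.0, 0.0)],
]
volumes = [80.0, 90.0, 240.0, 85.0, 275.0, 250.0]
print(select_open_representatives(frames, volumes))
```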

Ensemble docking, in which a fragment library is screened against multiple representative receptor conformations, is more expensive than single-structure docking but returns richer information. A compound that docks well in only 1 of 5 conformational states occupies a transiently accessible pocket. This may still be therapeutically relevant if the target's open-state occupancy is sufficient for pharmacological effect, but it signals a larger conformational-selection penalty (the free-energy cost of binding a sparsely populated state) than a compound that docks consistently across multiple states incurs. Cross-docking performance, meaning how well poses from one receptor conformation transfer to related conformations, provides an early estimate of binding mode stability and transferability before synthesis.
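One way to tabulate ensemble docking output along these lines is sketched below. For each compound it records how many receptor states give a score past a cutoff and how much of the conformational population those states represent. The score cutoff, the state populations, and the compound names are invented for illustration; docking scores are assumed to be "more negative is better".

```python
def summarize_ensemble_docking(scores, state_populations, hit_cutoff=-6.0):
    """scores: {compound: [score per receptor state]}.
    Returns, per compound, the number of states hit, the best score, and the
    summed population of the states in which the compound scores <= hit_cutoff."""
    summary = {}
    for compound, per_state in scores.items():
        hits = [s <= hit_cutoff for s in per_state]
        summary[compound] = {
            "states_hit": sum(hits),
            "best_score": min(per_state),
            "accessible_population": sum(
                p for p, h in zip(state_populations, hits) if h
            ),
        }
    return summary

# Hypothetical five-state ensemble with populations from clustering or an MSM.
state_populations = [0.50, 0.20, 0.15, 0.10, 0.05]
scores = {
    "frag_A": [-3.1, -3.4, -2.9, -7.2, -4.0],  # docks well only in state 4
    "frag_B": [-6.5, -6.8, -6.1, -7.0, -5.2],  # docks well in four states
}
summary = summarize_ensemble_docking(scores, state_populations)
for name, row in summary.items():
    print(name, row)
```

Here frag_A reaches only about 10% of the population despite having the best single score, which is the kind of trade-off the states-hit count alone would hide.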

Druggability Prediction Under Conformational Uncertainty

Standard druggability prediction tools — SiteMap Dscore, fpocket score, DoGSiteScorer — are calibrated on static crystal structures. Their performance on dynamically opened cryptic pockets is less reliable, because the tools were not designed for transient geometries. A pocket that scores as undruggable in the apo crystal structure may score as marginally druggable in a handful of MD frames and clearly druggable in a small fraction of fully open-state frames. Interpreting this distribution requires judgment that the scalar Dscore does not provide alone.
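Rather than reading a single score, the per-frame distribution can be summarized directly. The sketch below takes one druggability score per MD frame and reports the median, the 90th-percentile frame, and the fraction of frames in each class. The 0.8 and 1.0 class boundaries are placeholder assumptions for illustration, not the calibrated cutoffs of SiteMap, fpocket, or DoGSiteScorer, and the function name is hypothetical.

```python
import statistics

def druggability_profile(frame_scores, difficult=0.8, druggable=1.0):
    """Summarize per-frame druggability scores as a distribution."""
    n = len(frame_scores)
    if n == 0:
        raise ValueError("frame_scores must not be empty")
    ordered = sorted(frame_scores)
    return {
        "median": statistics.median(ordered),
        "p90": ordered[min(n - 1, (9 * n) // 10)],  # simple nearest-rank percentile
        "frac_undruggable": sum(s < difficult for s in ordered) / n,
        "frac_marginal": sum(difficult <= s < druggable for s in ordered) / n,
        "frac_druggable": sum(s >= druggable for s in ordered) / n,
    }

# Hypothetical trajectory: mostly closed, some marginal frames, a few fully open.
frame_scores = [0.5] * 70 + [0.9] * 20 + [1.1] * 10
profile = druggability_profile(frame_scores)
print(profile)
```

A median in the undruggable range alongside a druggable 90th percentile is the signature of a transient pocket, and it is exactly the case where a single scalar taken from the apo structure would mislead.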

We're not claiming that MD-based cryptic pocket analysis guarantees a druggable pocket exists. The majority of computationally identified cryptic pockets do not produce viable drug programs. What the analysis does is change the probability estimate: from "this target is undruggable (no visible pocket in the apo structure)" to "this target has conditional pocket accessibility, and these fragment chemotypes are predicted to engage the open state." That is a qualitatively different starting point for a discovery program decision than treating the protein as structurally intractable.

Experimental validation remains necessary. SPR experiments with fragment libraries against the full-length protein — not a truncated construct crystallized under conditions that stabilize the closed conformation — provide the ground truth for whether the MD-predicted pocket is real and bindable. HDX-MS (hydrogen-deuterium exchange mass spectrometry) can confirm whether a compound perturbs conformational dynamics at the predicted allosteric region. These experiments are more accessible at the CRO level than they were five years ago and should be planned from the start of any cryptic pocket campaign.

The KRAS Lesson

KRAS G12C provides the defining public case study in cryptic pocket exploitation at drug-approval scale. The switch-II pocket (S-IIP) targeted by sotorasib and adagrasib is invisible in the GDP-bound KRAS crystal structure. It opens transiently during nucleotide cycling. Fragment screening by the Shokat laboratory using a thiol-reactive covalent fragment panel identified the pocket biochemically before any structural model of it existed. Co-crystal structures (PDB 6OIM and related depositions) confirmed the S-IIP geometry that now underpins multiple approved molecules and a broad clinical pipeline.

The KRAS discovery sequence (biochemical fragment hit, then structural characterization of the pocket the fragment revealed) is the inverse of standard SBDD: structure came last, not first. It demonstrates that cryptic pocket campaigns can begin without a structural model when the screening format is designed for covalent fragment detection. But once the pocket was characterized, SBDD principles took over: the co-crystal structure of the S-IIP in its open state drove the evolution from a fragment Kd of ~100 μM to the potent, selective inhibitors that entered clinical development.

For programs where conventional SBDD has stalled on pocket accessibility, conformational dynamics analysis is a tractable next step. The computational methods are now standard tools, not frontier methodology. For how this workflow is applied in our structure-guided campaigns, see the Capabilities page and the conformational sampling methodology described on our Science page.

Related reading: Docking Scores vs FEP discusses how binding mode uncertainty at different structural confidence levels affects method selection for hit triage.