Small-Molecule Screening at Specialty CROs: A Practical Overview


At some point every seed or Series-A biotech faces the same uncomfortable decision: keep small-molecule virtual screening in-house or hand it to a specialty CRO. It sounds like a straightforward make-vs-buy call. In our experience, it usually isn't. The choice depends on factors that aren't obvious until you've sat inside a 12-person discovery team watching eleven weeks evaporate on a single docking campaign.

When In-House Makes Sense — and When It Doesn't

Small biotech teams often assume that owning the pipeline is cheaper. Sometimes that's true. If your CTO or a postdoc hire already has Glide or OpenEye licenses, knows the workflow cold, and you're running fewer than three target campaigns per year, the unit economics can tip toward internal execution. We've seen teams do this well. But the calculation breaks down fast.

Most 10-20 FTE discovery organizations don't have a dedicated cheminformatics hire. They have a computational biologist who runs docking as a side function on top of MD simulations, protein prep, and internal data management. When a new target needs a full SBVS campaign, that person is already 70% allocated elsewhere. The campaign gets deprioritized. Timeline slips. Or worse, it gets rushed and the short-list quality suffers.

Specialty CROs exist precisely to fill that gap. The right question isn't "can we do this in-house?" It's "what is the actual cost of the eleven-week wait?" When your Series-A runway is 18 months and you need validated hits before the next data package, waiting 11 weeks for a short-list you could have had in four is not a neutral decision.

How to Scope a Campaign Correctly

This is where most outsourcing relationships go wrong. Vague scoping leads to vague outputs, so be specific.

Before you send an RFP or hop on a scoping call, have these four things defined:

  • Target structure quality. Is this a crystal structure from PDB, an AlphaFold2 model, or a homology model? The answer affects which docking protocols are appropriate and what confidence level you can reasonably place on binding pose predictions. A CRO that doesn't ask this question up front is a red flag.
  • Binding site definition. Co-crystal ligand, mutagenesis data, or pure predictive? Ambiguous binding sites require multiple grid placements and should add time and cost to any honest proposal. If the proposal has a flat-rate price regardless of binding site confidence, be skeptical.
  • Library composition and source. Are you docking against an internal collection, a vendor library like Enamine REAL, or a fragment set? Library size and structural diversity directly affect run time and false-positive rate. Get the exact SMILES count and format requirements in writing before signing.
  • Required deliverables. A ranked list of 50 compounds with docking scores is not a deliverable. A ranked short-list with predicted binding poses, ADMET flags, PAINS alerts, synthetic accessibility scores, and a Benchling-compatible data package — that's a deliverable. Write it into the SOW.
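Before that scoping call, it's also worth sanity-checking your own library file. Below is a minimal, stdlib-only sketch; it assumes a hypothetical whitespace-separated file with the SMILES string in the first column, and it only catches gross formatting problems (real chemical validation needs a toolkit like RDKit):

```python
import re
from collections import Counter

# Sanity check a SMILES library before it goes into an RFP: count entries,
# flag duplicate structures, and catch lines containing characters that can
# never appear in a SMILES string. This is a formatting audit, not a
# chemical-validity check.
SMILES_CHARS = re.compile(r"^[A-Za-z0-9@+\-\[\]\(\)=#$%/\\.:*]+$")

def audit_library(lines):
    smiles = [ln.split()[0] for ln in lines if ln.strip()]  # first column = SMILES
    counts = Counter(smiles)
    return {
        "total": len(smiles),
        "unique": len(counts),
        "duplicates": {s: n for s, n in counts.items() if n > 1},
        "malformed": [s for s in smiles if not SMILES_CHARS.match(s)],
    }

library = [
    "CC(=O)Oc1ccccc1C(=O)O CHEMBL25",  # aspirin, with an ID column
    "c1ccccc1 BENZENE",
    "c1ccccc1 BENZENE_DUP",            # duplicate structure
    "??? BAD_ROW",                     # characters that can't be SMILES
]
report = audit_library(library)
print(report["total"], report["unique"], report["malformed"])
```

Getting the exact count and duplicate rate in writing up front also prevents a later dispute about whether the CRO screened the library you thought you sent.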

We routinely see proposals that describe the process in detail but stay vague on output format. Output format is what determines whether your medicinal chemist can act on the results immediately or needs another two weeks of data reformatting.

What Good Deliverables Actually Look Like

Here's what a solid short-list package should contain. Not aspirationally, but as a minimum requirement you put in the contract.

  • Ranked compound list (20-50 compounds): manageable for medicinal chemistry review, and far shorter than a raw hit-list from docking alone
  • Docking scores (SP and XP where applicable): enables score-based filtering and comparison across campaigns
  • Predicted binding poses (SDF or Maestro format): required for medicinal chemistry interpretation and scaffold comparison
  • ADMET flag annotations: filters out poor DMPK candidates before synthesis, saving 2-4 weeks of wet-lab attrition
  • PAINS substructure alerts: eliminates reactive/promiscuous compounds that generate biochemical false positives
  • Synthetic accessibility scores: prioritizes candidates your synthetic chemistry team can actually make
  • Campaign report (attrition funnel, score distributions): documents the screening logic; required for IND-enabling audit trails
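To make the format point concrete, here is a sketch of what "act on the results immediately" looks like when the schema is agreed up front. The column names (compound_id, xp_score, pains_alert, admet_flags, sa_score) are a hypothetical schema for illustration, not a standard:

```python
import csv
import io

# Sketch: triage a delivered short-list CSV into "review now" vs "park".
# The schema below is hypothetical -- the point of writing format into the
# SOW is that you know these column names before the data arrives.
RAW = """compound_id,xp_score,pains_alert,admet_flags,sa_score
CPD-001,-9.8,none,,3.1
CPD-002,-9.1,quinone_A,,2.8
CPD-003,-8.7,none,high_logP,4.0
CPD-004,-8.2,none,,6.5
"""

def triage(csv_text, xp_cutoff=-8.5, sa_max=6.0):
    keep, park = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        clean = (
            row["pains_alert"] == "none"
            and row["admet_flags"] == ""                # no DMPK red flags
            and float(row["xp_score"]) <= xp_cutoff     # more negative = better
            and float(row["sa_score"]) <= sa_max        # synthetically tractable
        )
        (keep if clean else park).append(row["compound_id"])
    return keep, park

keep, park = triage(RAW)
print(keep)  # only CPD-001 survives PAINS, ADMET, score, and SA filters
```

If the deliverable arrives in this shape, filtering takes minutes. If it arrives as a PDF of ranked names, it takes a fortnight.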

About 40% of the proposals we've reviewed omit PAINS filtering entirely. That's not a minor oversight. Pan-assay interference compounds are a persistent source of false positives in biochemical assays. A CRO returning a short-list without running at least the 480 Baell-Holloway substructure filters has handed you compounds that may look active in your first-pass screen for reasons that have nothing to do with binding to your target. You'll find out two months into your hit-to-lead program.
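You can also verify a delivered short-list yourself: RDKit ships the Baell-Holloway PAINS definitions as a built-in filter catalog. A minimal sketch, assuming RDKit is installed:

```python
from rdkit import Chem
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

# Load the built-in PAINS definitions (families A, B, and C together
# cover the 480 Baell-Holloway substructure filters).
params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS)
catalog = FilterCatalog(params)

def pains_alerts(smiles):
    """Return the names of any PAINS filters the molecule matches."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return ["unparseable_smiles"]
    return [entry.GetDescription() for entry in catalog.GetMatches(mol)]

print(pains_alerts("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin: no PAINS alerts
```

Running this over a delivered short-list takes minutes and tells you immediately whether the CRO's QC pipeline did what the proposal claimed.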

Red Flags in CRO Proposals

Not all specialty CROs are the same. Some are just running Glide on cloud instances, with only a junior computational chemist glancing at the output before it ships. Others have tight QC pipelines, named scientists with pharma backgrounds, and audit-ready documentation. The proposal language usually tells you which one you're dealing with.

Watch for these specific warning signs:

  • Black-box scoring. If the CRO cannot describe their re-ranking model, its training data, and its failure modes, you have no basis for trusting the final rank order. "Proprietary AI" with no scientific rationale is a red flag. Ask for a white paper or methodology summary. A serious operation will have one.
  • No PAINS filtering mentioned. Described above. Non-negotiable.
  • No ADMET at the screening stage. Some CROs position ADMET scoring as a separate, optional add-on service. That's a revenue-maximization move, not a scientific one. ADMET flags applied at the short-listing stage directly affect which 50 compounds make the cut. Moving it to a separate downstream step reduces short-list quality.
  • Deliverable described as a "hit list" without output format specification. See the table above. If the contract doesn't specify format and schema, you're negotiating data reformatting costs later.
  • No documentation of protein prep steps. AlphaFold2 models in particular require protonation state assignment, binding-site validation, and grid generation before docking. A CRO that skips or abbreviates these steps will produce docking results that are numerically plausible but scientifically unreliable.
  • Timeline under 48 hours for a 1M+ compound library. Legitimate GPU-accelerated Glide SP docking on a 1 million compound library takes 8-24 hours of active compute time. Add target prep, ADMET annotation, PAINS filtering, report generation, and QC review, and any honest estimate is 3-5 business days at minimum. Under 48 hours for a full campaign is a number that doesn't add up.
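That last red flag is easy to sanity-check yourself. Here is a back-of-envelope sketch using the figures above (1M ligands, 8-24 hours of active docking compute); the per-step paddings are illustrative assumptions, not benchmarks:

```python
# Back-of-envelope timeline check. Docking figures come from the estimate
# above (1M ligands, 8-24h of GPU compute); the non-docking step durations
# are illustrative assumptions, not measured benchmarks.
ligands = 1_000_000
docking_hours = (8, 24)  # low / high estimate for GPU-accelerated SP docking

# Implied docking throughput at each bound
for hours in docking_hours:
    print(f"{hours}h of compute -> {ligands / (hours * 3600):.0f} ligands/s")

# Everything that is NOT docking still has to happen
other_steps_hours = {
    "target prep + grid generation": 8,
    "PAINS + ADMET annotation": 4,
    "QC review": 8,
    "report generation": 8,
}

total_low = docking_hours[0] + sum(other_steps_hours.values())
total_high = docking_hours[1] + sum(other_steps_hours.values())
print(f"total working hours: {total_low}-{total_high}")
# 36-52 working hours is roughly 4.5-6.5 eight-hour days, before any
# queueing or client communication -- which is why a sub-48-hour quote
# for a full campaign doesn't add up.
```

Even with generous assumptions, the non-compute steps alone push an honest campaign past the two-day mark.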

The In-House vs Outsource Decision, Simplified

Here's the framework we use when advising early-stage biotech teams:

Outsource if: you have fewer than 2 cheminformatics FTEs, you're running more than 2 target campaigns per year, or your current timeline from target structure to wet-lab synthesis is longer than 6 weeks. At that point, a well-scoped CRO engagement paying $25,000-$50,000 per target campaign will return measurably better data faster than stretching an overloaded internal resource.

Keep in-house if: you have a dedicated computational chemistry team with valid Glide and OpenEye licenses, an established QC process, and a validated internal pipeline that already delivers short-lists in under 5 business days. That's a real capability worth protecting.

The middle ground — teams that have some internal capability but no clean QC process or documentation — is the hardest case. That's where we most often see 11-week timelines and short-lists where 30% of candidates are PAINS compounds nobody caught. Not because the team isn't smart. Because the pipeline was built incrementally, never formalized, and is running on one person's institutional knowledge.

A Final Word on Audit Trails

If your discovery program is on a path toward IND-enabling studies, documentation isn't bureaucratic overhead. Every docking parameter, model version, and filter setting that shaped your short-list is part of the scientific rationale that a regulatory reviewer may want to understand. This applies even at the hit identification stage. A CRO that delivers results without a campaign report documenting the screening logic is saving themselves time and charging you for incomplete work. Insist on the report. Every time.

Running a structure-based virtual screening campaign and want to scope it properly? Talk to the Moleculepath team about what a complete deliverable package looks like for your target.

Related Articles

  • PAINS Compounds in Hit Identification: Detection and Handling
  • Early ADMET Filters and Their Effect on Late-Stage Attrition
  • Hit Identification Timelines at Seed-Stage Biotechs: What to Expect