Drug discovery is no longer a single-modality game. What once looked like a world dominated by small molecules has evolved into a landscape where antisense oligos, siRNA, peptides, antibodies, and other modalities coexist in the same pipelines.

It’s an exciting change. Multimodal drug discovery opens new therapeutic doors and expands the set of biological questions scientists can explore. But with this evolution comes a quieter, often underestimated challenge: data infrastructure that wasn’t built for it.

Often, data lives in different systems, structures aren’t aligned, and scientists have to jump between tools to stitch insights together manually. As science becomes more multimodal, the cracks in legacy systems widen.

Why traditional systems struggle

The uncomfortable truth is that most scientific data systems weren’t built with multimodality in mind.

Historically, platforms tended to grow along one of two paths:

  • Chemistry-first systems later adapted to biologics, or
  • Biologics platforms patched to handle small molecules

On paper, these platforms now “support multiple formats.” In practice, bolted-on support often means:

  • Separate registries for different entity types
  • Hard-coded schemas that can’t evolve easily
  • Inconsistent identifiers and metadata fields
  • Slow cross-modal queries
  • Reliance on manual exports for analysis
  • Difficulty linking assay results across entity classes

It works - until complexity rises. Then teams find themselves juggling spreadsheets, external scripts, and multi-system workflows just to answer basic scientific questions.

This is an architecture problem, not a feature gap. When a system is designed around a single entity class and extended later, the issues listed above tend to follow.

What “true multimodal” data management means

Multimodality isn’t achieved by adding another bullet to the product feature list. It requires a different way of thinking about research data.

Modern multimodal data systems share a few foundational design principles:

1) One flexible entity model. A shared schema capable of representing small molecules, sequences, proteins, and other entity types without forcing rigid categories. No separate silos.

2) Consistent identifiers. Entities, batches, and variants should connect seamlessly - across modalities and across time.

3) Unified assay context. In multimodal environments, everything connects back to biological evidence. Assay data should be included from the start, not bolted on.

4) Controlled-vocabulary readiness. Not because it’s fashionable, but because consistent terminology future-proofs data reuse.

5) Transparent and extensible architecture. Systems must evolve as science evolves. That means APIs, scriptability, and modular extensibility - not black-box upgrades.
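
To make the first few principles concrete, here is a minimal Python sketch of a single entity model with consistent identifiers, a controlled vocabulary of entity types, and assay results attached to the same registry. It is an illustration only, not grit's data model or API: the names Entity, AssayResult, Registry, and ENTITY_TYPES, the ID format, and all values are assumptions invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

# Hypothetical controlled vocabulary of entity types (illustrative only).
ENTITY_TYPES = {"small_molecule", "sirna", "antibody", "peptide"}


@dataclass
class Entity:
    """One record shape for every modality; type-specific data goes in properties."""
    entity_id: str
    entity_type: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class AssayResult:
    """An assay measurement linked to an entity by its identifier."""
    entity_id: str
    assay: str
    value: float
    unit: str


class Registry:
    """A single in-memory registry shared by all entity types."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}
        self.results: list[AssayResult] = []

    def register(self, entity: Entity) -> None:
        # Enforce the vocabulary and identifier uniqueness at registration time.
        if entity.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {entity.entity_type!r}")
        if entity.entity_id in self.entities:
            raise ValueError(f"duplicate identifier: {entity.entity_id!r}")
        self.entities[entity.entity_id] = entity

    def add_result(self, result: AssayResult) -> None:
        # Results may only reference entities that are already registered.
        if result.entity_id not in self.entities:
            raise KeyError(f"unregistered entity: {result.entity_id!r}")
        self.results.append(result)

    def results_for(self, assay: str) -> list[tuple[Entity, AssayResult]]:
        # A cross-modality query: every entity type tested in the given assay.
        return [(self.entities[r.entity_id], r)
                for r in self.results if r.assay == assay]


# Made-up example data, for illustration only.
registry = Registry()
registry.register(Entity("ENT-0001", "small_molecule", {"smiles": "CCO"}))
registry.register(Entity("ENT-0002", "sirna", {"sequence": "AUGGCUAGCUAGCUAGCUAGC"}))
registry.add_result(AssayResult("ENT-0001", "target_knockdown", 40.0, "%"))
registry.add_result(AssayResult("ENT-0002", "target_knockdown", 85.0, "%"))

hits = registry.results_for("target_knockdown")
for entity, result in hits:
    print(entity.entity_id, entity.entity_type, result.value, result.unit)
```

Because both modalities share one record shape and one identifier space, the final query needs no per-modality branching. Adding a new modality in this sketch means adding a term to the vocabulary rather than a new table or module.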

How we are solving this with grit

Since modern science rarely sits inside a single modality, grit was designed from the ground up to support multi-entity, cross-assay research without forcing rigid categories or separate systems.

Instead of splitting chemistry and biologics into different modules, grit uses a flexible data store that can represent small molecules, oligos, antibodies, peptides and emerging modalities as they appear in pipelines.

Assays, batches, and results all live in the same datastore, so cross-modality questions don’t require stitching data between systems.

This matters because the real bottleneck in multimodal R&D isn’t the modality registration - it’s connecting experimental evidence to the entities being explored and doing so consistently over time.

With grit, teams can:

  • Store all modalities alongside pre-clinical assay data (in vitro, in vivo, DMPK, toxicology, SEND formats)
  • Configure entity types without engineering cycles
  • Apply user-defined vocabularies that enforce structure without rigidity
  • Filter assays and compounds across modalities, biological context, and metadata
  • Track lineage across molecule → batch → assay → results

The result is a data layer where new modalities don’t break the model, updates don’t depend on vendor releases, and SAR exploration spans the full spectrum of molecular types and biological evidence.
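
As a rough illustration of what lineage tracking in a single datastore can look like, the sketch below uses Python's built-in sqlite3 module to follow the molecule → batch → assay → results chain with one query. The table names, column names, identifiers, and values are all hypothetical and chosen for this example; they do not describe grit's actual schema.

```python
import sqlite3

# In-memory database with a hypothetical three-table schema (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    entity_id   TEXT PRIMARY KEY,
    entity_type TEXT NOT NULL
);
CREATE TABLE batch (
    batch_id  TEXT PRIMARY KEY,
    entity_id TEXT NOT NULL REFERENCES entity(entity_id)
);
CREATE TABLE result (
    batch_id TEXT NOT NULL REFERENCES batch(batch_id),
    assay    TEXT NOT NULL,
    value    REAL NOT NULL,
    unit     TEXT NOT NULL
);
""")

# Made-up example data spanning two modalities.
conn.executemany("INSERT INTO entity VALUES (?, ?)",
                 [("ENT-1", "small_molecule"), ("ENT-2", "sirna")])
conn.executemany("INSERT INTO batch VALUES (?, ?)",
                 [("B-1", "ENT-1"), ("B-2", "ENT-2"), ("B-3", "ENT-2")])
conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?)",
                 [("B-1", "target_knockdown", 40.0, "%"),
                  ("B-2", "target_knockdown", 85.0, "%"),
                  ("B-3", "cell_viability", 92.0, "%")])


def lineage_for_assay(connection, assay):
    """Return (entity_id, entity_type, batch_id, value, unit) rows for one assay."""
    query = """
        SELECT e.entity_id, e.entity_type, b.batch_id, r.value, r.unit
        FROM result AS r
        JOIN batch  AS b ON b.batch_id  = r.batch_id
        JOIN entity AS e ON e.entity_id = b.entity_id
        WHERE r.assay = ?
        ORDER BY r.value DESC
    """
    return connection.execute(query, (assay,)).fetchall()


rows = lineage_for_assay(conn, "target_knockdown")
for row in rows:
    print(row)
```

Since entities, batches, and results sit in the same store, the cross-modality comparison is a plain join rather than an export-and-merge step across systems.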

Why this matters for discovery teams

Early drug discovery moves fast. The last thing a research team needs is a system that forces data workflows to follow software boundaries instead of scientific direction.

A multimodal-ready foundation gives scientists and data managers a tangible advantage:

  • Faster answers to cross-modality questions
  • Cleaner data histories and traceability
  • Reusable knowledge across programs
  • Easier collaboration with CROs and academic partners
  • Fewer manual exports and spreadsheet detours

It also reduces long-term technical debt. Teams aren’t forced to re-platform the moment a new modality enters the portfolio, or maintain parallel systems just to track different molecule classes.

As discovery becomes increasingly hybrid, the data layer must be flexible first, capable of representing science as it is practiced - not as legacy systems expect it to be.

That’s what grit was built for: a research data foundation that matches the reality of modern drug discovery, and has room for whatever comes next.