Drug discovery is no longer a single-modality game. What once looked like a world dominated by small molecules has evolved into a landscape where antisense oligonucleotides, siRNA, peptides, antibodies, and other modalities coexist in the same pipelines.
It’s an exciting change. Multimodal drug discovery opens new therapeutic doors and expands the set of biological questions scientists can explore. But with this evolution comes a quieter, often underestimated challenge: data infrastructure that wasn’t built for it.
Often, data lives in different systems, structures aren’t aligned, and scientists have to jump between tools to stitch insights together manually. As science becomes more multimodal, the cracks in legacy systems widen.
The uncomfortable truth is that most scientific data systems weren’t built with multimodality in mind.
Historically, platforms tended to grow along one of two paths:
On paper, these platforms now “support multiple formats.” In practice, bolted-on support often means:
It works - until complexity rises. Then teams find themselves juggling spreadsheets, external scripts, and multi-system workflows just to answer basic scientific questions.
This is an architecture problem. When a system is designed around a single entity class and only later extended to others, these issues follow from the design itself.
Multimodality isn’t reached just by adding an extra bullet to the product feature list. It requires a different way of thinking about research data.
The following foundational design principles are shared across modern multimodal data systems:
1) One flexible entity model. A shared schema capable of representing small molecules, sequences, proteins, and other entity types without forcing rigid categories. No separate silos.
2) Consistent identifiers. Entities, batches, and variants should connect seamlessly - across modalities and across time.
3) Unified assay context. In multimodal environments, everything connects back to biological evidence. Assay data should be included from the start, not bolted on.
4) Controlled-vocabulary readiness. Not because it’s fashionable, but because consistent terminology future-proofs data reuse.
5) Transparent and extensible architecture. Systems must evolve as science evolves. That means APIs, scriptability, and modular extensibility - not black-box upgrades.
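To make the first four principles concrete, here is a minimal Python sketch of what a shared entity model could look like. It is an illustration under stated assumptions, not grit's actual schema or API: the names `Registry`, `Entity`, `Batch`, `AssayResult`, the `ENT-000001` identifier format, and the tiny endpoint vocabulary are all hypothetical.

```python
import itertools
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class Modality(Enum):
    SMALL_MOLECULE = "small_molecule"
    OLIGO = "oligo"
    PEPTIDE = "peptide"
    ANTIBODY = "antibody"


# Hypothetical controlled vocabulary for assay endpoints (principle 4).
ENDPOINT_VOCAB = {"IC50", "EC50", "Kd"}


@dataclass
class Entity:
    """One record type for every modality (principle 1)."""
    entity_id: str
    modality: Modality
    name: str
    # Modality-specific representation (SMILES, sequence, ...) lives here.
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Batch:
    batch_id: str
    entity_id: str


@dataclass
class AssayResult:
    batch_id: str
    assay: str
    endpoint: str
    value: float
    unit: str


class Registry:
    """In-memory stand-in for a shared datastore (assumption for this sketch)."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self.entities: Dict[str, Entity] = {}
        self.batches: Dict[str, Batch] = {}
        self.results: List[AssayResult] = []

    def register(self, modality: Modality, name: str, **attributes: Any) -> Entity:
        # Same identifier scheme regardless of modality (principle 2).
        entity = Entity(f"ENT-{next(self._next_id):06d}", modality, name, dict(attributes))
        self.entities[entity.entity_id] = entity
        return entity

    def add_batch(self, entity_id: str) -> Batch:
        if entity_id not in self.entities:
            raise KeyError(f"unknown entity: {entity_id}")
        n = sum(b.entity_id == entity_id for b in self.batches.values()) + 1
        batch = Batch(f"{entity_id}-B{n:02d}", entity_id)
        self.batches[batch.batch_id] = batch
        return batch

    def record(self, batch_id: str, assay: str, endpoint: str,
               value: float, unit: str) -> AssayResult:
        # Assay data attaches to batches from the start (principle 3).
        if batch_id not in self.batches:
            raise KeyError(f"unknown batch: {batch_id}")
        if endpoint not in ENDPOINT_VOCAB:
            raise ValueError(f"endpoint not in controlled vocabulary: {endpoint}")
        result = AssayResult(batch_id, assay, endpoint, value, unit)
        self.results.append(result)
        return result

    def results_for(self, entity_id: str) -> List[AssayResult]:
        return [r for r in self.results
                if self.batches[r.batch_id].entity_id == entity_id]


if __name__ == "__main__":
    reg = Registry()
    mol = reg.register(Modality.SMALL_MOLECULE, "aspirin",
                       smiles="CC(=O)OC1=CC=CC=C1C(=O)O")
    # Placeholder sequence, not a real therapeutic.
    oligo = reg.register(Modality.OLIGO, "demo-siRNA",
                         sequence="AUGGCUAGCUAGCUAGCUAGC")
    batch = reg.add_batch(oligo.entity_id)
    reg.record(batch.batch_id, "demo-assay", "IC50", 4.5, "nM")
    print(reg.results_for(oligo.entity_id))
```

The design choice worth noticing is that the modality is a field on the entity rather than a separate table or module, so adding a new modality means adding an enum value and some attributes, not a new silo.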
Since modern science rarely sits inside a single modality, grit was designed from the ground up to support multi-entity, cross-assay research without forcing rigid categories or separate systems.
Instead of splitting chemistry and biologics into different modules, grit uses a flexible data store that can represent small molecules, oligos, antibodies, peptides and emerging modalities as they appear in pipelines.
Assays, batches, and results all live in the same datastore, so cross-modality questions don’t require stitching data between systems.
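As a rough illustration of why a single datastore matters, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The three-table layout, the table and column names, and the demo values are assumptions made up for this example; they are not grit's schema or real experimental data. The point is only that one query can rank entities of different modalities against the same assay endpoint.

```python
import sqlite3


def build_demo_store() -> sqlite3.Connection:
    """Create an in-memory store with hypothetical entities, batches and results."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entity (
            entity_id TEXT PRIMARY KEY,
            modality  TEXT NOT NULL,
            name      TEXT NOT NULL
        );
        CREATE TABLE batch (
            batch_id  TEXT PRIMARY KEY,
            entity_id TEXT NOT NULL REFERENCES entity(entity_id)
        );
        CREATE TABLE result (
            batch_id TEXT NOT NULL REFERENCES batch(batch_id),
            assay    TEXT NOT NULL,
            endpoint TEXT NOT NULL,
            value    REAL NOT NULL,
            unit     TEXT NOT NULL
        );
    """)
    # Made-up demo rows: three modalities sharing one identifier scheme.
    conn.executemany("INSERT INTO entity VALUES (?, ?, ?)", [
        ("ENT-1", "small_molecule", "cmpd-A"),
        ("ENT-2", "oligo", "siRNA-B"),
        ("ENT-3", "antibody", "mAb-C"),
    ])
    conn.executemany("INSERT INTO batch VALUES (?, ?)", [
        ("ENT-1-B01", "ENT-1"),
        ("ENT-1-B02", "ENT-1"),
        ("ENT-2-B01", "ENT-2"),
        ("ENT-3-B01", "ENT-3"),
    ])
    conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?, ?)", [
        ("ENT-1-B01", "target-X cell assay", "IC50", 120.0, "nM"),
        ("ENT-1-B02", "target-X cell assay", "IC50", 95.0, "nM"),
        ("ENT-2-B01", "target-X cell assay", "IC50", 4.5, "nM"),
        ("ENT-3-B01", "target-X cell assay", "IC50", 30.0, "nM"),
        ("ENT-3-B01", "target-X binding", "Kd", 2.0, "nM"),
    ])
    return conn


def best_per_entity(conn: sqlite3.Connection, assay: str, endpoint: str):
    """Return (name, modality, lowest value, unit) per entity, most potent first."""
    return conn.execute(
        """
        SELECT e.name, e.modality, MIN(r.value) AS best_value, r.unit
        FROM result AS r
        JOIN batch  AS b ON b.batch_id = r.batch_id
        JOIN entity AS e ON e.entity_id = b.entity_id
        WHERE r.assay = ? AND r.endpoint = ?
        GROUP BY e.entity_id, e.name, e.modality, r.unit
        ORDER BY best_value
        """,
        (assay, endpoint),
    ).fetchall()


if __name__ == "__main__":
    store = build_demo_store()
    for row in best_per_entity(store, "target-X cell assay", "IC50"):
        print(row)
```

Because entities, batches, and results sit side by side, the cross-modality comparison is a pair of joins rather than an export from one system and an import into another.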
This matters because the real bottleneck in multimodal R&D isn’t registering a new modality - it’s connecting experimental evidence to the entities being explored, and doing so consistently over time.
With grit, teams can:
The result is a data layer where new modalities don’t break the model, updates don’t depend on vendor releases, and SAR exploration spans the full spectrum of molecular types and biological evidence.
Early drug discovery moves fast. The last thing a research team needs is a system that forces data workflows to follow software boundaries instead of scientific direction.
A multimodal-ready foundation gives scientists and data managers a tangible advantage:
It also reduces long-term technical debt. Teams aren’t forced to re-platform the moment a new modality enters the portfolio, or maintain parallel systems just to track different molecule classes.
As discovery becomes increasingly hybrid, the data layer must be flexible first, capable of representing science as it is practiced - not as legacy systems expect it to be.
That’s what grit was built for: a research data foundation that matches the reality of modern drug discovery, and has room for whatever comes next.