Most people can see the value of a central database with FAIR data that can be accessed across departments and organisations. It creates the best foundation for an efficient drug discovery process and ultimately increases the chances of getting new medicines to market faster.
But while it is easy to agree on the end goal, the path to get there is not always so simple.
A fundamental challenge is that the lab scientists who produce the data have little incentive to put much effort into making it FAIR and uploading it correctly to a centralised platform. They are not the ones who will work with the data going forward and who need it to be easy to analyse across datasets. Once they have their IC50/ED50 value in place, they are, for good reason, busy moving on to their core task - running the next experiment and creating new data.
Data managers, data scientists and project managers, on the other hand, have the greatest need for a FAIR data store so they can maximise the value of the data.
Build a bridge that's easy to cross
Between these two objectives lies the upload process, and this is where the dilemma becomes apparent. Lab scientists want to spend as little time on the task as possible. Data managers need quality data, which places demands on the data that is uploaded - demands that take time to fulfil.
The key is to build a bridge that is as easy to cross as possible while not compromising on data quality.
In many systems today, the bridge is anything but easy to cross. Often the process looks something like this:
- Lab scientist uploads the compounds used in the experiment.
- Error: external compound numbers already exist; they must be unique.
- The upload is cancelled and the lab scientist corrects the external compound numbers.
- They attempt to upload again.
- New error: there is text in the number columns.
- The upload is cancelled again and the lab scientist corrects the data.
- Another upload attempt.
- And so on... and by now they've had enough.
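The frustration in the loop above comes from validation that aborts on the first error, so every fix costs a fresh upload attempt. A minimal sketch of the alternative - collecting every problem in one pass so the scientist can fix them all at once (field names and rules here are purely illustrative, not grit's actual implementation):

```python
# Illustrative sketch only: collect ALL validation errors in one pass,
# instead of aborting the upload on the first one.

def validate_rows(rows, seen_numbers):
    """Return a list of every validation error, rather than raising on the first."""
    errors = []
    for i, row in enumerate(rows):
        number = row["compound_number"]
        if number in seen_numbers:
            errors.append(f"row {i}: external compound number '{number}' already exists")
        # A number column must hold a numeric value, not free text.
        if not str(row["ic50"]).replace(".", "", 1).isdigit():
            errors.append(f"row {i}: non-numeric value in IC50 column: {row['ic50']!r}")
    return errors

rows = [
    {"compound_number": "CMP-001", "ic50": "12.5"},
    {"compound_number": "CMP-002", "ic50": "n/a"},  # duplicate number AND text in a number column
]
print(validate_rows(rows, seen_numbers={"CMP-002"}))
```

With this shape, one upload attempt reports both problems, instead of revealing them one failed attempt at a time.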
A smoother upload experience
In the new release of our scientific data management platform, grit, we wanted to make this process smoother. So instead of having to start from scratch, you'll be guided through the necessary corrections during the process. This way, lab scientists can spend less time preparing data and complete the upload more efficiently.
For example, if you upload 100 compounds and the structures of 5 of them already exist, you will be prompted to define how to handle them. Should they be uploaded as new compounds, skipped, or handled in some other way?
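In code terms, that kind of per-batch decision might look like the following sketch, where duplicates are split out according to the choice the user makes instead of cancelling the whole upload (the function, field names and policies are hypothetical, not grit's API):

```python
# Hypothetical sketch: resolve duplicate structures with an explicit decision,
# rather than cancelling the entire batch.

def resolve_duplicates(compounds, existing_structures, decision):
    """Split an upload into (to_upload, skipped) under a chosen policy.

    decision: "new"  - register duplicates as new compounds anyway
              "skip" - leave duplicates out of this upload
    """
    to_upload, skipped = [], []
    for compound in compounds:
        if compound["structure"] in existing_structures and decision == "skip":
            skipped.append(compound)
        else:
            to_upload.append(compound)
    return to_upload, skipped

batch = [{"id": i, "structure": s} for i, s in enumerate(["c1ccccc1", "CCO", "CCN"])]
to_upload, skipped = resolve_duplicates(batch, existing_structures={"CCO"}, decision="skip")
```

Here the 95 novel compounds go straight through, and only the 5 conflicting ones need a decision.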
It's not a question of lowering the bar for data quality - quite the opposite. We use controlled vocabularies to ensure that we only receive accepted values, and hence data of the right quality, also at the metadata level. This requires data to be not only correct but also compatible, and to ease that task we have optimised the upload process to make it easier to deliver data in the right format.
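A controlled-vocabulary check of this kind can be sketched in a few lines: only values from an agreed list are accepted, so downstream analysis never has to reconcile variants like "Mouse" and "mouse" (the vocabulary and field names below are invented for illustration):

```python
# Illustrative sketch of a controlled-vocabulary check at the metadata level.
# Only pre-agreed values pass; anything else is flagged before upload.

CONTROLLED_VOCAB = {
    "species": {"human", "mouse", "rat"},
    "assay_type": {"binding", "functional"},
}

def check_metadata(metadata):
    """Return the names of metadata fields whose values fall outside the vocabulary."""
    return [
        field for field, value in metadata.items()
        if field in CONTROLLED_VOCAB and value not in CONTROLLED_VOCAB[field]
    ]

print(check_metadata({"species": "Mouse", "assay_type": "binding"}))  # flags "species"
```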
In this way, we try to meet the needs of the busy lab researcher and offer a better way to deliver data, while giving the data manager the right foundation to build and maintain a high-quality data management platform with FAIR-enough data.