In this insightful exchange, we are privileged to have John, a distinguished expert with over 40 years of experience in the scientific and informatics domains, particularly within the pharmaceutical industry. John’s extensive background in analytical chemistry, ranging from NMR and chromatography to chemical structure representation, coupled with his impressive track record in informatics project management, positions him as a key voice in the discussion about purification workflows and data management solutions in the pharmaceutical industry.
Could you walk us through the different steps in a purification workflow in the pharma industry?
For low-throughput purification, requests are made individually, usually with some supporting detail from the originating chemist. This can be by email or via a submission system such as a LIMS. High-throughput workflows generally use an in-house submission system linked to an electronic lab notebook (ELN). In the simplest case, the purification lab may receive a CSV file containing details of the samples in the request. More advanced workflows may have an in-house database that allows the purification scientists to retrieve the submission information as the samples arrive.
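As a rough sketch of that simplest entry point, the snippet below reads a request CSV and checks it has the expected fields. The column names (sample_id, notebook_ref, target_mass_mg, smiles) are invented for the example; real submission files vary by site and LIMS.

```python
import csv
from pathlib import Path

# Hypothetical column layout; real submission files differ by site and LIMS.
REQUIRED_FIELDS = {"sample_id", "notebook_ref", "target_mass_mg", "smiles"}

def load_submission(path: Path) -> list[dict]:
    """Read a purification request CSV and validate the expected fields."""
    with path.open(newline="") as fh:
        rows = list(csv.DictReader(fh))
    missing = REQUIRED_FIELDS - set(rows[0].keys()) if rows else REQUIRED_FIELDS
    if missing:
        raise ValueError(f"Submission file is missing fields: {sorted(missing)}")
    return rows

# samples = load_submission(Path("request_2024_001.csv"))
```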
Once the sample and request have been paired, the separation scientist will typically run several “scout” LCMS analyses to establish whether the desired material is present at a suitable level and which purification method would work best. Several column and mobile-phase types may be tested and evaluated in the course of this process. The retention time of the peak under the selected “best method” is then often used to choose a “focused gradient” for the mobile-phase composition of the prep run.
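To make the “focused gradient” idea concrete, here is an illustrative calculation that narrows the mobile-phase window around the composition at which the peak eluted. It assumes a linear scout gradient and a ±10 %B window; the actual rule any given lab or software applies will differ.

```python
def focused_gradient(rt_min: float, run_len_min: float,
                     start_b: float, end_b: float,
                     window: float = 10.0) -> tuple[float, float]:
    """Estimate a narrow %B window centred on the %B at elution.
    Assumes a linear scout gradient from start_b to end_b over run_len_min;
    the +/- window and the linearity assumption are illustrative only."""
    frac = min(max(rt_min / run_len_min, 0.0), 1.0)
    b_at_elution = start_b + frac * (end_b - start_b)
    return max(start_b, b_at_elution - window), min(end_b, b_at_elution + window)

# e.g. a peak at 1.8 min on a 3 min, 5->95 %B scout run:
# focused_gradient(1.8, 3.0, 5.0, 95.0) -> (49.0, 69.0)
```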
Once the method has been selected, the purification will take place and fractions will be collected and evaluated to decide which ones will be combined to continue through the process. There may be a single product isolated or multiple products, depending on the workflow and the submitter’s needs.
The combined fractions are then evaporated and weighed before undergoing final QC by LCMS and an orthogonal analytical method (often NMR).
How big is the data obtained during a purification workflow and how is it managed?
The data generated at each step of the process is stored in a repository, and a summary report is returned to the submitter along with key information such as purity and retention time. This finds its way back into the ELN either automatically or manually, depending on the degree of integration between the processes.
A fully joined-up system may also link to compound management systems and the corporate chemical structure registry. Each of these systems has its own requirements for data and metadata from the process. Note that chemical structure is a key ingredient in the process and must be supported by the software!
Data volumes depend on demand. An analytical system may generate a 5 MB dataset every 2-3 minutes, and there will be at least two scout datasets and one QC dataset for each sample. A prep dataset may be more like 30 MB per sample. A purification factory may purify up to 20,000 samples per year.
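Those figures imply a substantial but manageable annual volume. A quick back-of-envelope calculation, using only the numbers above:

```python
# Back-of-envelope annual volume from the figures quoted above
# (two scout + one QC analytical datasets, plus one prep dataset per sample).
SCOUT_QC_MB = 5          # per analytical dataset
PREP_MB = 30             # per prep dataset
DATASETS_PER_SAMPLE = 3  # at least two scout + one QC
SAMPLES_PER_YEAR = 20_000

per_sample_mb = DATASETS_PER_SAMPLE * SCOUT_QC_MB + PREP_MB   # 45 MB
annual_gb = SAMPLES_PER_YEAR * per_sample_mb / 1024           # ~879 GB
print(f"~{annual_gb:.0f} GB/year before metadata and reports")
```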
What are the common challenges that labs encounter when managing this data? And how can a software solution effectively address these issues?
A number of different systems generate and consume data as part of the process. These are often commercial systems, and moving information between them requires transformation, which can be challenging. There is also the practical consideration of keeping track of such large amounts of data and metadata, not to mention the physical samples and fractions. With rigour, it is possible for people to manage this manually with well-defined processes and documentation; however, this approach is highly error-prone, and software can make a huge difference in managing the data, the process, and the inventory of physical entities.
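A small illustration of the transformation problem is mapping one system’s field names onto another’s schema. The field names here are invented for the example, not taken from any particular vendor.

```python
# Illustrative field mapping between two systems; all names are invented.
LIMS_TO_ELN = {
    "SampleID": "sample_id",
    "RetTime": "retention_time_min",
    "PurityPct": "purity_percent",
}

def translate(record: dict, mapping: dict = LIMS_TO_ELN) -> dict:
    """Rename fields from the source system's schema to the target's,
    dropping anything the target does not recognise."""
    return {dst: record[src] for src, dst in mapping.items() if src in record}

# translate({"SampleID": "S-123", "RetTime": 1.82, "PurityPct": 97.4})
# -> {"sample_id": "S-123", "retention_time_min": 1.82, "purity_percent": 97.4}
```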
Can you share some insights on how the Mnova automation solution can help manage these processes?
Many decisions are made based on the analysis of data coming from LCMS and NMR systems. The choice of method, the fractions to choose, and the quality check at the end of the process are all based on the processing and analysis of analytical data. Mnova has algorithms to do this at its core. Combined with the automation capabilities of Mgears, these analyses are carried out automatically and the results presented back to the separation scientists to allow them to make their final decisions.
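As a toy illustration of the kind of rule such automated analyses feed into, the sketch below flags fractions whose main peak matches the target mass and clears a purity threshold. The field names and thresholds are invented for the example and are not Mnova’s actual criteria, which are configured per workflow.

```python
def pick_fractions(fractions: list[dict], target_mz: float,
                   mz_tol: float = 0.5, min_purity: float = 85.0) -> list[dict]:
    """Keep fractions whose main peak m/z matches the target within
    tolerance and whose purity clears the threshold. Illustrative only."""
    return [f for f in fractions
            if abs(f["mz"] - target_mz) <= mz_tol and f["purity"] >= min_purity]

# pick_fractions([{"mz": 342.2, "purity": 97.1}, {"mz": 198.4, "purity": 62.0}],
#                target_mz=342.1)
# -> [{"mz": 342.2, "purity": 97.1}]
```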
Integration is key. How does Mnova connect with existing informatics infrastructure and lab instruments?
We should probably be clear that Mnova doesn’t handle all interactions itself. In some ways, this is the power of the Mestrelab approach, in that the Mestrelab software and automation tools can fit into an existing informatics infrastructure seamlessly. For example, the instrument vendor may have a set of automation tools that work very well with their instrument, and it would be pointless reinventing these. Mnova and Mgears can be configured to sit between different IT systems, taking outputs from one and creating inputs to another.
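A minimal sketch of that “sitting between systems” role might look like the polling loop below. The folder paths are invented, and a real deployment would rely on the vendor’s automation hooks and Mgears configuration rather than hand-written glue like this.

```python
import json
import time
from pathlib import Path

# Hypothetical locations: an instrument output folder and a downstream
# system's inbox. Real paths would be agreed with the site's IT team.
SRC = Path("/data/lcms/completed")
DST = Path("/data/eln/inbox")

def relay_results(poll_s: float = 30.0) -> None:
    """Poll one system's output folder and drop a small manifest into the
    next system's inbox: a stand-in for the glue role that a configured
    automation layer plays between IT systems."""
    seen: set[Path] = set()
    while True:
        for run_dir in SRC.glob("*.d"):  # e.g. vendor run folders ending in .d
            if run_dir not in seen:
                manifest = {"run": run_dir.name, "status": "ready"}
                (DST / f"{run_dir.stem}.json").write_text(json.dumps(manifest))
                seen.add(run_dir)
        time.sleep(poll_s)
```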
How do Mnova solutions accommodate the increasing demands on data workflows?
Data processing is seldom the bottleneck for these processes. Acquiring data and performing purification take a finite amount of time, during which it is entirely possible to process all the data. That said, it is important to mention that Mnova supports other applications and can be set to run complementary analyses, to verify structures or determine the physico-chemical properties of an analyte, for instance, so there is no need to use other software. Additionally, the design of the system allows for multiple Mnova servers to share the work, should that ever become necessary.
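To illustrate the idea of sharing work across servers, here is a simple round-robin dispatch over a hypothetical pool; the real deployment details belong to the Mnova/Mgears configuration rather than user code.

```python
from itertools import cycle

# Hypothetical server pool names, invented for the example.
SERVERS = ["mnova-01", "mnova-02", "mnova-03"]

def dispatch(datasets: list[str]) -> dict[str, list[str]]:
    """Round-robin datasets across processing servers: a simple model of
    how load could be shared if throughput ever demanded it."""
    assignment: dict[str, list[str]] = {s: [] for s in SERVERS}
    for server, ds in zip(cycle(SERVERS), datasets):
        assignment[server].append(ds)
    return assignment

# dispatch(["runA", "runB", "runC", "runD"])
# -> {"mnova-01": ["runA", "runD"], "mnova-02": ["runB"], "mnova-03": ["runC"]}
```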
Interoperability is often a concern when dealing with different systems. How does the Mnova solution ensure smooth data exchange across various lab departments?
Traditionally, groups tend to purchase the same brand of analytical equipment to get around this problem. This is obviously a limitation, and Mnova’s vendor independence makes it possible to adopt multiple vendors’ systems whilst maintaining the same workflows.
Finally, what benefits can labs expect to see after successfully implementing a global data workflow solution?
Using a common workflow solution can help minimise local differences in practice, which in turn allows results to be analysed independently of where they were generated. Once we are comparing like with like, we can identify differences in outcomes and use these to learn and optimise local systems further.
Learn more about Mnova Gears and our solutions for the purification workflow.
Contact us for general information or sales inquiries.