Six Steps to Developing a Strategy in AI and Machine Learning in Life Sciences

With technology moving into a digital era, the push toward artificial intelligence (AI) and machine learning (ML) is stronger than ever. We see AI and ML embedded in our everyday lives, from smartphone facial recognition to autonomous driving. While AI and ML have become staples of the modern world, there are still challenges in applying them to health care and scientific studies. Take cell biology, for example. AI can be used to tag different features in images of cells, or even in photos of people, but a cell image is rich in context, and more is needed to preserve its integrity and account for its nuanced features in research, patient care and drug development. Determining the right course of action with AI and ML can be complex. To simplify this process, consider these points before developing an AI strategy in the life sciences field.

Identify the use case for artificial intelligence.

There are a few ways to design an AI strategy, and deciding which route to take largely depends on what you are hoping to accomplish. For exploratory research, AI can provide tools to process large amounts of data and help scientists navigate that data efficiently toward new discoveries. For well-defined clinical tests such as chest X-ray analysis or Pap smear tests, AI can dramatically improve accuracy and productivity while reducing subjectivity and error.

Define the requirements at the start.

Despite all of the promise of AI technology, there are limitations and obstacles. It’s crucial to define the hypothesis and the validation strategy up front. Understand that data should be collected first and conclusions drawn afterward. Without robust validation, current limitations, including bias and lack of transparency, can easily lead researchers to a biased conclusion.

Define the acceptance criteria.

There has been much concern about the black-box nature of algorithms and the challenges of interpretability. Based on the requirements, it is important to define acceptance criteria that measure the performance of the algorithm, specifically its accuracy, robustness, learning efficiency and adaptation, and computing capacity.
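As a rough illustration of what written-down acceptance criteria might look like, the sketch below checks a binary classifier's predictions against thresholds fixed in advance. It covers only the accuracy side of the criteria above. The `AcceptanceCriteria` class, the choice of sensitivity and specificity as supporting metrics, the threshold values, and the example labels are all assumptions made for illustration, not values from any particular test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriteria:
    """Hypothetical thresholds agreed on before any results are seen."""
    min_accuracy: float
    min_sensitivity: float
    min_specificity: float


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for labels coded as 0 or 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def meets_criteria(metrics, criteria):
    """Return True only if every metric reaches its predefined threshold."""
    return (
        metrics["accuracy"] >= criteria.min_accuracy
        and metrics["sensitivity"] >= criteria.min_sensitivity
        and metrics["specificity"] >= criteria.min_specificity
    )


# Made-up labels: 1 = positive finding, 0 = negative finding.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

criteria = AcceptanceCriteria(min_accuracy=0.75,
                              min_sensitivity=0.70,
                              min_specificity=0.80)
metrics = evaluate(y_true, y_pred)
passed = meets_criteria(metrics, criteria)
print(metrics, "PASS" if passed else "FAIL")
```

Fixing the thresholds in code before the evaluation runs makes it harder to adjust them after the fact to fit the results. Robustness, learning efficiency, and computing capacity would need their own measurements, which this sketch does not attempt.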

Start with a small set with a known outcome.

Make sure the data is reproducible and aligns with what is known in science. Systematic debugging and robust validation with both synthetic data and real data with known outcomes are required, especially when an AI algorithm is used in clinical practice.
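One way to picture validation on synthetic data with a known outcome is the minimal sketch below. It generates two well-separated groups of made-up "intensity" values whose labels are known by construction, then confirms that a simple threshold rule recovers those labels and that a second run gives the same result. The function names, the Gaussian parameters, and the threshold are hypothetical and stand in for whatever algorithm and data are actually under test.

```python
import random


def make_synthetic_intensities(n_per_class, seed=0):
    """Generate (value, label) pairs whose labels are known by construction."""
    rng = random.Random(seed)  # fixed seed so the data set is reproducible
    samples = []
    for _ in range(n_per_class):
        samples.append((rng.gauss(0.0, 1.0), 0))   # label 0: low intensity
        samples.append((rng.gauss(10.0, 1.0), 1))  # label 1: high intensity
    return samples


def threshold_classifier(value, threshold=5.0):
    """Stand-in for the algorithm under test: label by a fixed cutoff."""
    return 1 if value > threshold else 0


def accuracy_on(samples, classify):
    """Fraction of samples whose predicted label matches the known label."""
    correct = sum(1 for value, label in samples if classify(value) == label)
    return correct / len(samples)


synthetic = make_synthetic_intensities(100, seed=42)
acc_first = accuracy_on(synthetic, threshold_classifier)

# Regenerate with the same seed to confirm the result is reproducible.
acc_second = accuracy_on(make_synthetic_intensities(100, seed=42),
                         threshold_classifier)

print(f"accuracy on synthetic data: {acc_first:.3f}")
print("reproducible:", acc_first == acc_second)
```

If an algorithm cannot recover an outcome this easy, there is little point in moving on to real data. The same harness can then be pointed at a small real data set with known outcomes.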

Align domain experts.

While good data is needed to produce optimal results, it’s also important to leverage the knowledge of subject matter experts to ensure the accuracy of the data used.

Separate the training and testing data to avoid bias.

One of the goals of an AI algorithm is to learn a classifier with good generalization. It is crucial to measure the performance of the algorithm on test data that has not been used to train the algorithm.
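A simple hold-out split is one way to keep test data away from training. The sketch below shuffles a list of sample identifiers with a fixed seed, sets aside a fraction for testing, and checks that the two sets do not overlap. The helper `split_train_test`, the 20% test fraction, and the sample names are assumptions for illustration.

```python
import random


def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle items reproducibly and return (train, test) with no overlap."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(items)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical sample identifiers standing in for images or specimens.
sample_ids = [f"sample_{i:03d}" for i in range(50)]
train_ids, test_ids = split_train_test(sample_ids, test_fraction=0.2, seed=7)

# The test set must never share a sample with the training set.
overlap = set(train_ids) & set(test_ids)
print(len(train_ids), "training samples,", len(test_ids), "test samples,",
      len(overlap), "shared")
```

The split is done once, before any training, and the test identifiers are then left untouched until the final evaluation.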

Jiyun Byun, PhD

Jiyun Byun, PhD, is Senior Manager, Computer Vision Research at Epic Sciences, where she develops imaging algorithms for the image processing, analysis, and classification pipeline, from image capture to patient stratification. Recently she developed the AR-V7 Imaging Algorithm (ARIA) for CTC classification and AR-V7 nuclear localization for the Epic Sciences AR-V7 liquid biopsy test, available exclusively through Genomic Health as the Oncotype DX AR-V7 Nucleus Detect test.

Jiyun Byun earned her PhD at the University of California, Santa Barbara. She came through UCSB’s Vision Research Lab and Mayachitra, where she developed and commercialized a powerful image processing suite whose customers included Olympus and DARPA. She has deep experience across the entire image processing and pattern recognition pipeline, spanning several imaging modalities and applications, including fluorescence imaging, content-based search and retrieval, scene understanding, and object classification.