Poster Presentation Clinical Oncology Society of Australia Annual Scientific Meeting 2022

AI-assisted clinical trial recruitment using an open-source Natural Language Processing workflow (#244)

Helen Kavnoudias 1,2, Christopher Berry 2, Theo Christian 2, Amy McKimm 2, Lachlan MacBean 2, Adil Zia 1,2, Adam Morris 1, William Librata 2, Dominic Buensalido 2, Joanna Batstone 1, Meng Law 1,2, Anne Woollett 2, Stephen Jane 1,2, Helena Teede 1
  1. Monash University, Melbourne, Vic, Australia
  2. Alfred Health, Melbourne, Vic, Australia

Aims

To evaluate the open-source Natural Language Processing (NLP) tool ‘CogStack’, in conjunction with Machine Learning (ML), for oncology clinical trial recruitment. Different approaches were explored and compared with existing methods.

Methods

A short-list of potential trial patients with hypervascular tumours was recorded over five weeks using existing targeted manual (human) and semi-automated methods. CogStack’s Named-Entity-Recognition model was used to review the Electronic Medical Records (EMRs) of the 21,050 patient presentations during that period and to identify biomedical concepts from the Unified Medical Language System that corresponded to the trial’s inclusion/exclusion criteria. The concepts extracted for each patient were used as input to an ensemble of ML models. Model 1 ranked each patient by similarity to six synthetic patients, one for each cancer of interest. The 100 most similar patients each day were short-listed, and the next ML model, a classifier, made a binary prediction of patient suitability. The EMRs of the CogStack short-listed patients were manually reviewed for suitability.
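The similarity-ranking step of Model 1 can be illustrated with a minimal sketch. The abstract does not state which similarity measure or patient representation was used, so the cosine similarity over concept counts below is an assumption, as are the function names (`cosine`, `shortlist`), the placeholder concept identifiers (`CUI_A`, `CUI_B`, `CUI_C`) and the toy patients. It is not the authors' implementation.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse concept-count vectors."""
    dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def shortlist(patients, synthetic_profiles, k=100):
    """Rank patients by their best similarity to any synthetic profile
    and return the identifiers of the top k (ties broken by identifier)."""
    profile_vectors = [Counter(p) for p in synthetic_profiles.values()]
    scored = []
    for patient_id, concepts in patients.items():
        vector = Counter(concepts)
        best = max(cosine(vector, pv) for pv in profile_vectors)
        scored.append((best, patient_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [patient_id for _, patient_id in scored[:k]]


# Hypothetical data: placeholder concept identifiers, not real UMLS codes.
synthetic_profiles = {"cancer_x": ["CUI_A", "CUI_B"]}
patients = {
    "p1": ["CUI_A", "CUI_B"],  # matches the profile exactly
    "p2": ["CUI_C"],           # shares no concepts with the profile
    "p3": ["CUI_A"],           # partial match
}

top_two = shortlist(patients, synthetic_profiles, k=2)
print(top_two)
```

In this sketch each synthetic patient acts as a template for one cancer of interest, so no labelled training data are needed for the ranking step. The short-list would then be passed to the classifier and to manual review.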

Results

The test case was a complex commercial trial with extensive inclusion/exclusion criteria that covered multiple cancer types. The existing method yielded 12/25 suitable patients (precision@k = 0.48), identified the first suitable patient in 84.85 hours, and had an average review time of 5.33 minutes per patient.

Model 1 short-listed 137 unique patients, of whom 43 were suitable (precision@k = 0.31). It identified the first suitable patient in 2.28 hours (97.31% faster), with an average review time of 3.08 minutes per patient (42.19% faster). The classification model yielded a weighted-average F1-score of 55% (precision@k = 0.60).
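The arithmetic behind the reported figures can be reproduced with a short sketch. It assumes that precision@k is the number of suitable patients divided by the number short-listed, and that "faster" means the relative reduction against the existing method. The helper names `precision_at_k` and `percent_faster` are illustrative only.

```python
def precision_at_k(suitable, shortlisted):
    """Fraction of short-listed patients who were judged suitable."""
    return suitable / shortlisted


def percent_faster(baseline, new):
    """Relative time reduction of `new` against `baseline`, in percent."""
    return 100.0 * (baseline - new) / baseline


existing_precision = precision_at_k(12, 25)    # existing method
model_1_precision = precision_at_k(43, 137)    # Model 1 short-list
first_patient_gain = percent_faster(84.85, 2.28)  # hours to first suitable patient
review_time_gain = percent_faster(5.33, 3.08)     # minutes of review per patient

print(f"{existing_precision:.2f} {model_1_precision:.2f}")
print(f"{first_patient_gain:.2f}% {review_time_gain:.2f}%")
```

Recomputing from the rounded times quoted above gives a review-time reduction of roughly 42.2%, marginally different from the reported 42.19%. The reported value was presumably derived from unrounded times.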

Conclusions

This CogStack patient recruitment test case employed a trial-specific solution. Model 1 achieved precision comparable to that of established methods without requiring training data. CogStack/ML short-listed more suitable patients because it assessed all hospital presentations, and the time saving was significant. The classification model results demonstrate that, with sufficient training data, CogStack/ML can outperform established methods. The benefits are ‘always-on’ prospective patient identification and reduced time to trial.

  1. Jackson R, Kartoglu I, Stringer C, Gorrell G, Roberts A, Song X, Wu H, Agrawal A, Lui K, Groza T, Lewsley D. CogStack-experiences of deploying integrated information retrieval and extraction services in a large National Health Service Foundation Trust hospital. BMC medical informatics and decision making. 2018 Dec;18(1):1-3.