By Monica Nandagopal, senior analyst, Beroe Inc.
Clinical trials play a crucial role in driving medical innovation but frequently struggle to enroll diverse and representative patient populations. Combining real-world data (RWD) with advanced machine learning (ML) models offers a powerful way to optimize patient recruitment. This article examines the essential factors that define meaningful patient cohorts, reviews important sources of RWD, and highlights ML techniques that improve patient identification and recruitment efficiency. Case studies from the industry showcase successful implementations of RWD and AI-powered tools that accelerate trial timelines and enhance inclusivity. The analysis emphasizes the importance of integrating multiple data sources alongside customized ML algorithms to overcome recruitment challenges, lower costs, and produce clinically generalizable results. This integrated approach is poised to become the future standard in clinical research, enabling faster and more equitable delivery of effective therapies to a wider patient population.1
Well-chosen cohorts improve trial validity, reduce timeline delays, and help meet regulatory expectations for real-world applicability. Data from articles published in 2019 showed that, compared to manual screening methods, automated eligibility prescreening reduced patient screening time by 30%-35%, increased the number of candidates who matched relevant trial criteria by 15%, and increased the number of patients approached and consented into the trial by 10%.2
A relevant patient population is necessary to generate statistically meaningful results that confirm treatment safety and efficacy. To ensure generalizable outcomes, trial populations should reflect real-world diversity in age, gender, ethnicity, and clinical features. Meaningful cohorts include underrepresented groups (e.g., elderly, minorities, those with comorbidities), which addresses equity and improves treatment relevance. Properly defined cohorts also reduce screen failure rates and enable timely enrollment, accelerating study completion and market access.
Patient Cohort Selection:
The major factors that affect patient selection are clinical, geographical, patient awareness, and demographic parameters. Evaluating each parameter and assigning it an appropriate weight can help researchers determine a patient's suitability for a trial. Pharmaceutical companies can apply these weights to identify the most important factors when selecting the right patients for a specific therapeutic area. For instance, when selecting RWD for rare diseases, clinical eligibility and access parameters should carry a higher weight, since these factors exert a greater influence on the success of clinical trials.3
Sources: Beroe Analysis
The above are indicative weights that can be adopted while evaluating patients for clinical trials, and they can be modified based on disease severity and patient access. Combining multiple RWD sources can give a better understanding of the patient’s status and clinical morbidity.
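As a minimal sketch of how such a weighting could work, the snippet below scores a candidate patient as a weighted average of the four parameters named earlier. The weight values, the 0-to-1 parameter scores, and the function name `suitability_score` are hypothetical placeholders chosen for illustration; they are not the indicative weights from the Beroe analysis.

```python
# Hypothetical weights for a rare-disease setting (illustration only).
# They are NOT the article's indicative weights; clinical and geographical
# (access) parameters are simply given more weight, as the text suggests.
RARE_DISEASE_WEIGHTS = {
    "clinical": 0.40,
    "geographical": 0.30,
    "awareness": 0.15,
    "demographic": 0.15,
}


def suitability_score(scores, weights):
    """Return the weighted average of per-parameter scores (each in [0, 1])."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[name] * scores[name] for name in weights) / total_weight


# Hypothetical patient: strong clinical fit, moderate access to a trial site.
patient = {"clinical": 0.9, "geographical": 0.5, "awareness": 0.4, "demographic": 0.7}
score = suitability_score(patient, RARE_DISEASE_WEIGHTS)
print(f"Suitability score: {score:.3f}")
```

Dividing by the total weight means the weights do not have to sum to exactly 1, so they can be adjusted per therapeutic area without renormalizing by hand.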
Traditional recruitment is often slow, costly, and inefficient, meaning trials often struggle to enroll enough eligible and diverse patients. RWD helps by providing a richer, broader view of patient populations that better reflect real-world demographics and clinical realities. Combining and harmonizing these sources allows for a comprehensive picture of disease prevalence, patient characteristics, and physician treatment patterns.3 RWD sources include:
The analysis below describes the RWD landscape under various parameters, such as region and therapeutic area.
Source: Global Data
The above charts show that oncology is the leading therapeutic area using RWD and real-world evidence (RWE), accounting for 34% of studies that incorporate these elements. Trials in the central nervous system field represent 12% of studies using RWD/RWE, followed by cardiovascular indications, which account for 10%.4
China leads in the geographic distribution of RWE trials, accounting for 30% of the global total. It is followed by Italy at 10%, Germany at 10%, the United States at 9%, and Japan at 8%.4
The higher concentration of RWD/RWE trials in certain geographies is influenced by factors such as regulatory environments, data availability and quality, healthcare system maturity, and local adoption of innovative trial methodologies. Combining multiple RWD sources such as electronic health records (EHRs), claims, registries, pharmacy, wearables, and patient-generated data enhances patient identification precision and recruitment efficiency. This multi-source integration is particularly valuable in oncology and complex therapeutic areas where patient heterogeneity and dynamic disease status are common.5-8
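To make the idea of multi-source integration concrete, here is a minimal sketch that merges records from three sources into one profile per patient and then applies a simple prescreening rule. The field names, the diagnosis code `DX_A`, the thresholds, and the sample records are all hypothetical; real EHR, claims, and registry feeds would first need de-identification, record linkage, and terminology harmonization.

```python
from collections import defaultdict

# Hypothetical, simplified records keyed by a shared patient_id.
ehr = [
    {"patient_id": "P1", "age": 67, "diagnosis_code": "DX_A"},
    {"patient_id": "P2", "age": 54, "diagnosis_code": "DX_A"},
]
claims = [
    {"patient_id": "P1", "prior_lines_of_therapy": 1},
    {"patient_id": "P2", "prior_lines_of_therapy": 3},
]
registry = [
    {"patient_id": "P1", "performance_status": 1},
]


def merge_sources(*sources):
    """Combine records that share a patient_id into one profile per patient."""
    profiles = defaultdict(dict)
    for source in sources:
        for record in source:
            profiles[record["patient_id"]].update(record)
    return dict(profiles)


def is_candidate(profile):
    """Illustrative criteria; a missing field means eligibility is unconfirmed."""
    return (
        profile.get("diagnosis_code") == "DX_A"
        and profile.get("prior_lines_of_therapy", 99) <= 2
        and profile.get("performance_status", 99) <= 1
    )


profiles = merge_sources(ehr, claims, registry)
candidates = sorted(pid for pid, profile in profiles.items() if is_candidate(profile))
print(candidates)
```

In this sketch, no single source is enough to confirm eligibility: the diagnosis comes from the EHR, treatment history from claims, and performance status from the registry.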
Useful combinations of data sources include:
Using appropriate ML models for identifying patterns can help determine which patients are most likely to benefit from or meet the criteria for specific trials. The integration of EHRs takes this a step further by allowing real-time access to comprehensive patient data, such as medical history, diagnoses, and treatments.9-11
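As one hedged illustration of how an ML model might rank patients by their likelihood of meeting trial criteria, the sketch below fits a small logistic regression with plain stochastic gradient descent. Logistic regression is only one possible model choice and is not presented here as the method used in the cited studies; the two features, the training labels, and the patient identifiers are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Estimated probability that a patient with features x is eligible."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def train_logistic(features, labels, learning_rate=0.5, epochs=500):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = predict(weights, bias, x) - y
            weights = [w - learning_rate * error * v for w, v in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical training data: [scaled age, scaled biomarker level];
# label 1 means the patient passed screening in a past trial.
train_x = [[0.2, 0.9], [0.3, 0.8], [0.4, 0.7], [0.7, 0.2], [0.8, 0.3], [0.9, 0.1]]
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(train_x, train_y)

# Rank new (hypothetical) patients from most to least likely to be eligible.
new_patients = {"A": [0.25, 0.85], "B": [0.85, 0.15]}
ranked = sorted(
    new_patients,
    key=lambda pid: predict(weights, bias, new_patients[pid]),
    reverse=True,
)
print(ranked)
```

A ranking like this would typically be used to prioritize which patient records a study coordinator reviews first, not to make enrollment decisions automatically.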
Different model algorithms are used for different purposes. Below are a few examples:
Sources: Secondary articles and Beroe Analysis
Pharma companies have started to use RWD and ML models to improve the quality of patient selection in clinical trials. Some real-world examples include:
Sources: Press releases
Pharma companies are actively adopting RWD combined with advanced ML models to optimize patient recruitment in clinical trials. For those interested in adopting this approach:
With these integrations, clinical trials can achieve faster, more cost-effective recruitment of relevant and diverse patient cohorts, ultimately reducing study timelines and improving the generalizability of trial outcomes. This approach supports the broader goal of bringing safe and effective therapies to patients more rapidly and equitably.15,16
References:
About The Author:
Monica Nandagopal is a category research analyst with over six years of experience in market research and consulting. Her insights have supported top pharma companies’ strategic decisions on supplier outsourcing, category management, and planning. In the past year, she engaged in more than 10 market sourcing studies, five supplier data visualizations, and multiple quick, reactive analyses across clientele for global and regional requirements.