Drug safety is, understandably, a prime concern for pharma organizations, regulators, health authorities and patients alike. While there is always risk associated with any medication or treatment, the aim is always to understand any risk so that it can be handled or mitigated appropriately.
The holy grail is, then, how can we predict risk effectively? This is a huge focus of many research initiatives and is being addressed at many levels: drug target, molecule, patient, population. With the recent flourishing of AI/ML, we’ve seen a blossoming of models to enable risk prediction.
FDA research demonstrates use of NLP for adverse events to feed machine learning
There is ongoing work at the FDA to develop models that use post-market safety data to predict adverse events (AEs) for new drugs coming on the market. Several recent papers use a combination of AI/ML tools, including NLP, ensemble models and classification algorithms. Some of this research builds upon earlier pilot work. That pilot study of six drugs demonstrated that pharmacological target AE profiles, based on marketed drugs, can be used to predict unlabelled adverse events for a new drug at the time of approval.
Addition of chemical similarity and mechanism of action
Daluwatte et al (2020) take this research forward by using additional features in their ML model, including structural similarity, target similarity and time on market, as well as profiles of adverse events from FDA drug labels and from literature. These features were used to train a Naïve Bayes classifier, with 10,000 bootstrapping steps. This approach predicts 53 serious adverse events with high positive predictive values where well-characterized target-event relationships exist. However, adverse events that may be idiosyncratic or related to secondary target effects were not so well predicted.
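To make the idea concrete, here is a minimal sketch of a bootstrapped Naïve Bayes classifier in plain Python. It is not the authors' implementation: the three binary features, the toy training rows, the Laplace smoothing and all function names are illustrative assumptions, and it uses 200 resamples rather than the 10,000 reported in the paper.

```python
import math
import random


def train_bernoulli_nb(rows, labels, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing (alpha)."""
    n_features = len(rows[0])
    model = {}
    for cls in (0, 1):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        prior = (len(cls_rows) + alpha) / (len(rows) + 2 * alpha)
        # P(feature j = 1 | class), smoothed so it is never exactly 0 or 1
        probs = [
            (sum(r[j] for r in cls_rows) + alpha) / (len(cls_rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[cls] = (prior, probs)
    return model


def predict_proba(model, row):
    """Return P(class = 1 | row) under the fitted model."""
    log_scores = {}
    for cls, (prior, probs) in model.items():
        score = math.log(prior)
        for x, p in zip(row, probs):
            score += math.log(p if x else 1.0 - p)
        log_scores[cls] = score
    top = max(log_scores.values())  # subtract the max for numerical stability
    e0 = math.exp(log_scores[0] - top)
    e1 = math.exp(log_scores[1] - top)
    return e1 / (e0 + e1)


def bootstrap_predict(rows, labels, new_row, n_boot=200, seed=0):
    """Average P(AE) for new_row over models trained on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(rows)
    total = 0.0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        model = train_bernoulli_nb([rows[i] for i in idx],
                                   [labels[i] for i in idx])
        total += predict_proba(model, new_row)
    return total / n_boot


# Hypothetical features per (new drug, comparator, AE) example:
# [structurally_similar, shares_target, long_time_on_market]
rows = [[1, 1, 1], [1, 1, 0], [0, 1, 1], [1, 0, 1],
        [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 1, 0]]
labels = [1, 1, 1, 0, 0, 0, 0, 1]  # 1 = AE later observed (toy data)

p_high = bootstrap_predict(rows, labels, [1, 1, 1])
p_low = bootstrap_predict(rows, labels, [0, 0, 0])
print(f"similar drug: {p_high:.2f}, dissimilar drug: {p_low:.2f}")
```

Averaging over resamples gives a more stable probability estimate than a single fit, which matters when the training set is as small as the number of marketed drugs sharing a target.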
Adverse event extraction from a range of data sources feeds Target-Adverse Event profile predictions
The second recent paper from the FDA is also a great addition to this body of research. Schotland et al (2021) used data from three key sources and extracted features for Target-Adverse Event profiles (TAEs). These features were fed into an ensemble machine learning model. In essence, the study draws on AE reports, peer-reviewed literature, and FDA drug labels to collect the AEs reported for particular drugs. Each drug is then linked to its drug target, which enables a level of risk prediction for new drugs targeting the same protein.
The drug target AE profiles were built from: (1) FAERS reports, generated using a bioinformatics tool, EFFECT (Molecular Health, GmbH); (2) FDA drug labels, with the safety content text-mined using Linguamatics NLP; and (3) full text literature, abstracts and conference proceedings from EMBASE curation. All the adverse events were mapped to MedDRA. See Figure for overall workflow.
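The drug-to-target linkage described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the drug names, target names, records and the `min_sources` voting rule are hypothetical, and the simple threshold stands in for the ensemble ML model used in the paper.

```python
from collections import defaultdict

# Hypothetical mapping from marketed drugs to their pharmacological targets
drug_targets = {
    "drug_a": ["TARGET_1"],
    "drug_b": ["TARGET_1", "TARGET_2"],
    "drug_c": ["TARGET_2"],
}

# Hypothetical (source, drug, AE term) records; the terms stand in for
# MedDRA-coded AEs from FAERS reports, drug labels and literature
ae_records = [
    ("faers", "drug_a", "Nausea"),
    ("label", "drug_a", "Nausea"),
    ("label", "drug_b", "Hepatotoxicity"),
    ("literature", "drug_b", "Nausea"),
    ("faers", "drug_c", "Rash"),
]


def build_target_ae_profiles(drug_targets, ae_records):
    """Roll drug-level AEs up to targets, tracking which sources support each."""
    profiles = defaultdict(lambda: defaultdict(set))
    for source, drug, ae_term in ae_records:
        for target in drug_targets.get(drug, []):
            profiles[target][ae_term].add(source)
    return profiles


def candidate_aes(new_drug_targets, profiles, min_sources=2):
    """List AEs supported by at least min_sources sources for any shared target."""
    hits = set()
    for target in new_drug_targets:
        for ae_term, sources in profiles.get(target, {}).items():
            if len(sources) >= min_sources:
                hits.add(ae_term)
    return sorted(hits)


profiles = build_target_ae_profiles(drug_targets, ae_records)
print(candidate_aes(["TARGET_1"], profiles))
```

For a new drug acting on `TARGET_1`, the sketch flags only the AE that is supported by more than one source across the marketed drugs sharing that target.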
The ensemble ML model was able to predict postmarket AEs with good overall performance, yielding an F-score (F1) of 0.71. The model identified future safety label changes for further review and has the potential to speed safety assessments by FDA evaluators. This approach may improve post-market pharmacovigilance by focusing resources on predicted AEs, and can potentially be applied at any stage of drug development.
NLP enables high recall, precision
Figure: Workflow used by Schotland et al (2021). Linguamatics NLP was used to generate the Target-Adverse Event profiles (step 3) across FDA drug labels, mapping the AEs to MedDRA. The authors performed an analysis of the Linguamatics NLP text-mining query. AE recall was increased with linguistic strategies (e.g. morphological variants, spelling correction, and matching across conjunctions), and precision was increased by using linguistic context and document regions. The final query used for this study had a recall of 0.98, a precision of 0.94, and an F1 score of 0.96 when tested on the 20 random drugs from this study that were used to train the query. When tested on 20 different random drugs from this study, the final query had a recall of 0.91, a precision of 0.90, and an F1 score of 0.90 (Schotland et al, 2021).
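As a quick check on how these figures relate, the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes it from the reported precision and recall values; the function name is our own, not something from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Drugs used to develop the query: precision 0.94, recall 0.98
print(round(f1_score(0.94, 0.98), 2))  # 0.96

# Held-out drugs: precision 0.90, recall 0.91
print(round(f1_score(0.90, 0.91), 2))  # 0.9
```

Both results agree with the reported F1 scores of 0.96 and 0.90.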
Improving drug safety with AI/ML
As with many areas of drug discovery, development, and delivery of therapeutics in the clinic, AI/ML technologies are enabling us to get a better understanding of drug safety. Investment in AI and ML across the healthcare industry is expected to grow to $8 billion by 2022. This investment points to an increase in the quantity and quality of drugs on the market, which, in parallel, will call for an improved understanding of safety for both existing drugs and new drug therapies. It’s exciting to see the potential these technologies have to enhance our decision-making processes around drug safety, and to support more efficient drug development and safety monitoring.