Penn Medicine, Philadelphia: Researchers Aim to Use AI to Predict Rare Diseases


What if your own medical record could figure out that you are at risk of developing a rare disease so you could receive a diagnosis months or even years earlier than you might otherwise, allowing doctors to get started on important treatment sooner? That’s what a team of researchers co-led by faculty at the Perelman School of Medicine at the University of Pennsylvania and the University of Florida College of Medicine will explore with the help of a $4.7 million grant from the National Institutes of Health (NIH).

For the next four years, researchers will work to develop a set of algorithms powered by machine learning, a form of artificial intelligence (AI), to identify which patients are at risk of five different types of vasculitis and two different types of spondyloarthritis. These predictions, derived from information already available in patients’ electronic health records, could greatly increase the chance of patients being diagnosed sooner.

The effort to develop this prediction method, called “PANDA: Predictive Analytics via Networked Distributed Algorithms for multi-system diseases,” will be led by three principal investigators: Yong Chen, PhD, a professor of Biostatistics, and Peter A. Merkel, MD, MPH, chief of Rheumatology and a professor of Medicine and Epidemiology, both at Penn; and Jiang Bian, PhD, chief data scientist of the University of Florida Health system and a professor in Health Outcomes & Biomedical Informatics at the University of Florida College of Medicine.

“This is an exciting step forward, building on our current PDA framework, from clinical evidence generation toward AI-informed interventions in clinical decision-making,” Chen said. “Despite the clear need to reduce the dangerous and costly delays in diagnosis, individual clinicians, especially in primary care, face important challenges.”

Chen used one of the forms of vasculitis under study, granulomatosis with polyangiitis (GPA), as an example of the promise the PANDA system holds. GPA involves inflammation of many organs and can be very severe or even fatal. Mortality rates for patients with this condition remain high in the first year after diagnosis, and the correct diagnosis of this type of vasculitis, and all the other types, can be delayed by months or even years.

“An earlier diagnosis of any of the types of vasculitis and spondyloarthritis we’re working on leads to a much better prognosis and better clinical outcomes,” Merkel said. “Even if we determine that a patient has just a 10 percent likelihood of developing one of these diseases, that is a much higher chance of a rare problem, and clinicians can keep that in mind and make better decisions for their patients.”

Clinicians and their patients face several challenges in diagnosis: rare diseases can masquerade as more common ones, clinicians may lack access to a patient’s data or to the other clinicians the patient works with, and many are simply unfamiliar with extremely uncommon conditions. An algorithm that automatically scans known information to identify the possibility of a disease like GPA could be lifesaving.

“The increasing availability of real-world data, such as electronic health records collected through routine care, provides a golden opportunity to generate real-world evidence to inform clinical decision-making,” Bian said. “Nevertheless, to leverage these large collections of real-world data, which are often distributed across multiple sites, novel distributed algorithms like PANDA are much needed.”

The researchers’ plan is to pull data through Patient-Centered Clinical Research Networks (PCORnet), a national database including information from different health systems, adding up to more than 27 million patients. De-identified data from these patients, including lab test results, comorbid conditions, past treatments, and other commonly available information, will be used to create the algorithms. Once the algorithms are built, the researchers will test each one’s predictive power across 10-plus health systems. Following these tests, the methods the team develops will be shared and made available to apply to other diseases.

Because machine learning algorithms are, as the name implies, designed to “learn” and refine themselves as they’re used and fed more data, it’s possible that PANDA will continuously improve and become more helpful as time passes. “The proposed machine learning algorithms will adaptively update their key parameters as more data are made available,” said Chen. “We plan to evaluate these machine learning algorithms periodically to ensure they meet our pre-specified standards and can evolve positively over time.”
