Université de Montréal experts use AI to explore potential zoonotic diseases
The COVID-19 pandemic highlighted the importance of closely monitoring viruses that could infect humans. In the early stages of the pandemic, Timothée Poisot and his colleagues were already developing an algorithm for predicting mammal-virus interactions.
“We had been working on this project from the first few months of 2020, before the pandemic took off,” said Poisot, a professor in the Department of Biological Sciences at Université de Montréal and a member of the Viral Emergence Research Initiative (VERENA), an international research institute based in Washington, D.C.
Just over three years later, Poisot and colleagues have published the results of thousands of hours of calculations and validation in the journal Patterns, a premium open access journal from Cell Press.
Making better predictions
Poisot belongs to a multidisciplinary research team that hopes to make better predictions of interactions between mammals and viruses in general. When certain conditions are met, the passage of viruses from one species to another can ultimately lead to the emergence of a zoonosis – which the World Health Organization defines as “an infectious disease that has jumped from a non-human animal to humans.”
According to Poisot, “the basic problem is that we are only aware of between one and two per cent of the interactions between viruses and mammals. The networks are scattered and there are few interactions, which are concentrated in just a few species.”
Trying to sample all interactions manually would be an enormous task, especially since there are thousands of species of mammals and an even larger number of viruses, which yields millions of possible mammal-virus combinations.
Poisot and his colleagues therefore sought to develop a new algorithm using machine learning, as a way of formulating hypotheses which would then serve to identify which host-virus interactions to explore further.
“We want to know which species of virus is likely to infect which species of mammal, so we can establish which interactions are most likely to occur,” said Poisot, who spent several thousand hours with his team creating and refining the algorithm.
‘Out-of-date names and errors’
“Some of the data sets we had were older: they contained out-of-date names for particular species, or they had errors because the data had been entered by hand,” Poisot said.
The first job was to clean up and standardize the data – a very time-consuming task. Once he and his colleagues had created the algorithm, they had to refine it. “One of the advantages of our algorithm is that you don’t need a lot of information to use it,” Poisot said.
Deploying existing models to make predictions requires a lot of information on taxonomy, phylogenetic structure, sampling data, and more. To get around that requirement, the algorithm developed by Poisot and his team represents the system as a network of interactions between viruses and mammals, which the algorithm must then complete.
“The algorithm takes the network we already know, and projects it into a new space, a bit like shadow theatre: it casts light on interactions in a new way,” said Poisot. “This, in turn, allows us to make predictions.”
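To give a rough idea of what "projecting a network into a new space" can mean, here is a minimal sketch in Python. It is an illustration only and not necessarily the team's method: it assumes the known interactions are stored as a 0/1 host-by-virus matrix and uses a truncated singular value decomposition as the projection. The function names, the `rank` parameter and the toy data are all hypothetical.

```python
import numpy as np


def complete_network(adjacency: np.ndarray, rank: int) -> np.ndarray:
    """Score every host-virus pair from a low-rank view of the known network.

    Assumption: a truncated SVD stands in for the "new space" in which
    the known network is projected.
    """
    u, s, vt = np.linalg.svd(adjacency.astype(float), full_matrices=False)
    # Keep only the first `rank` dimensions, then rebuild the matrix from them.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def candidate_interactions(adjacency: np.ndarray, rank: int, top_k: int):
    """Return the `top_k` unrecorded pairs with the highest scores."""
    scores = complete_network(adjacency, rank)
    # Only pairs with no recorded interaction are candidates.
    unknown = np.argwhere(adjacency == 0)
    order = np.argsort(-scores[unknown[:, 0], unknown[:, 1]])
    return [(int(h), int(v)) for h, v in unknown[order[:top_k]]]


# Made-up toy network: rows are hosts, columns are viruses.
# Host 2 shares two viruses with hosts 0 and 1, but has no recorded
# interaction with virus 2.
known = np.array([
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
])

print(candidate_interactions(known, rank=2, top_k=1))  # [(2, 2)]
```

In this toy case the highest-scoring unrecorded pair is host 2 with virus 2, because host 2 otherwise resembles hosts 0 and 1. A pair flagged this way is a hypothesis to check in the field or in the literature, not a confirmed interaction.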
Even so, making these predictions required 10,000 hours of computing on Calcul Québec’s computers. Once the results had been compared with known interactions, the model had identified 80,000 new potential interactions between viruses and hosts.
“After that,” said Poisot, “the main task was to determine the level of confidence we had in the model’s ability to make predictions.” The model had to be validated statistically, which in itself required the researchers to publish an article on how to validate models built from very incomplete data.
Monitoring 20 key viruses
The research team then selected 20 key viruses worth monitoring, since they have the potential to jump the species barrier and infect humans. The team also identified “hot” regions where resources should be focused. “We had a lot of discussions on the team, because at first some of the results seemed strange to us,” said Poisot.
One of the viruses that came to light was ectromelia virus, which causes mousepox, a disease of mice related to smallpox. “We were sceptical, but when we searched the literature, we found there had been cases in humans,” said Poisot.
One of the important results of this research project is the rediscovery of specific zoonotic viruses, which had already been the subject of scattered publications, but which had never been linked in databases.
Another innovative aspect of the research is mapping the results to better understand virus-mammal interactions on a global scale. “Our model makes spatial predictions, but more precisely, the model indicates specifically in which group of mammals and in which location certain types of virus are likely to be found,” said Poisot.
Two regions to explore
The team has identified two geographic regions to explore. First, the Amazon basin in South America, where the interactions between hosts and viruses are more distinctive than elsewhere and where new interactions are more likely to be observed. And second, Central Africa, where new hosts have been found that are potential carriers of zoonotic viruses.
“We are really shifting the places where we need to go and study mammals to discover new viruses,” Poisot explained. These two regions should therefore be of interest to virologists who wish to understand the diversification of host-virus systems and the zoonotic risk they represent for humans, he added.
The next step for Poisot and his research colleagues is to make the information easily accessible and user-friendly for partners in the field. “We want to make it easier for stakeholders to adopt our model. We now know which species to monitor, where and for what type of virus,” he said.
In the end, he believes, this research project could prove essential to preventing a future pandemic.