As the COVID-19 pandemic surged, the World Health Organization and the United Nations issued a stark warning: an “infodemic” of online rumors and fake news relating to COVID-19 was impeding public health efforts and causing unnecessary deaths. “Misinformation costs lives,” the organizations warned. “Without the appropriate trust and correct information … the virus will continue to thrive.”
In a bid to solve that problem, researchers at the Stevens Institute of Technology are developing a scalable solution: an AI tool capable of detecting “fake news” relating to COVID-19, and automatically flagging misleading news reports and social-media posts. “During the pandemic, things grew incredibly polarized,” explained K.P. Subbalakshmi, AI expert at the Stevens Institute for Artificial Intelligence and a professor of electrical and computer engineering. “We urgently need new tools to help people find information they can trust.”
To develop an algorithm capable of detecting COVID-19 misinformation, Subbalakshmi first worked with Stevens graduate students Mingxuan Chen and Xingqiao Chu to gather around 2,600 news articles about COVID-19 vaccines, drawn from 80 different publishers over the course of 15 months. The team then cross-referenced the articles against reputable media-rating websites and labeled each article as either credible or untrustworthy.
Next, the team gathered over 24,000 Twitter posts that mentioned the indexed news reports, and developed a “stance detection” algorithm capable of determining whether a tweet was supportive or dismissive of the article in question. “In the past, researchers have assumed that if you tweet about a news article, then you’re agreeing with its position. But that’s not necessarily the case — you could be saying ‘Can you believe this nonsense!?’” Subbalakshmi said. “Using stance detection gives us a much richer perspective, and helps us detect fake news much more effectively.”
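To make the difference concrete, here is a minimal sketch of how stance labels change the picture. It is not the Stevens team's algorithm: it assumes each tweet has already been tagged with a stance, and the function names, the "supportive" and "dismissive" labels, and the sample tweets are all hypothetical.

```python
from collections import Counter

def naive_support(tweets):
    """Older assumption: every tweet that mentions an article endorses it."""
    return len(tweets)

def stance_aware_support(tweets):
    """Net support in [-1, 1]: supportive minus dismissive, over both."""
    counts = Counter(stance for _, stance in tweets)
    total = counts["supportive"] + counts["dismissive"]
    if total == 0:
        return 0.0
    return (counts["supportive"] - counts["dismissive"]) / total

# Hypothetical tweets about one article, each paired with an assumed stance label.
tweets = [
    ("Great summary of the trial data", "supportive"),
    ("Can you believe this nonsense!?", "dismissive"),
    ("Total garbage, do not share", "dismissive"),
    ("Worth reading", "supportive"),
    ("This is flat-out wrong", "dismissive"),
]

print(naive_support(tweets))         # counts all five tweets as endorsements
print(stance_aware_support(tweets))  # negative: more readers dismiss than support
```

Under the naive count the article looks popular; once stance is taken into account, the same five tweets show that most readers are pushing back.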
Using their labeled datasets, the Stevens team trained and tested a new AI architecture designed to detect subtle linguistic cues that distinguish real reports from fake news. That’s a powerful approach because it doesn’t require the AI system to audit the factual content of a text, or keep track of evolving public health messaging; instead, the algorithm detects stylistic fingerprints that correspond to trustworthy or untrustworthy texts.
“It’s possible to take any written sentence and turn it into a data point — a vector in N-dimensional space — that represents the author’s use of language,” explained Subbalakshmi. “Our algorithm examines those data points to decide if an article is more or less likely to be fake news.”
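As a rough illustration of the idea, the sketch below turns a sentence into a three-number vector using hand-picked style features. This is an assumption made for the sake of example: the article does not say which features the Stevens architecture uses, and a trained model would typically learn far richer representations. The function name `style_vector` and the sample sentences are hypothetical.

```python
import re

def style_vector(sentence):
    """Map a sentence to a small vector of hand-picked style features."""
    words = re.findall(r"[A-Za-z']+", sentence)
    n_words = max(len(words), 1)  # avoid dividing by zero on empty input
    return [
        sentence.count("!") / n_words,                              # exclamation marks per word
        sum(w.isupper() and len(w) > 1 for w in words) / n_words,   # share of ALL-CAPS words
        sum(len(w) for w in words) / n_words,                       # mean word length
    ]

calm = style_vector("The trial enrolled 30,000 adult volunteers.")
loud = style_vector("SHOCKING!!! They are HIDING the truth!")

print(calm)
print(loud)
```

The bombastic sentence scores higher on the first two dimensions, which is the kind of signal a classifier could pick up on without checking any facts.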
More bombastic or emotional language, for instance, often correlates with bogus claims, Subbalakshmi explained. Other factors, such as the time of publication, the length of an article, and even the number of authors, can also be used by an AI algorithm to help determine an article’s trustworthiness. These statistics are provided with the team’s newly curated dataset. The team’s baseline architecture is able to detect fake news with about 88% accuracy, significantly better than most previous AI tools for detecting fake news.
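For readers unfamiliar with the metric, accuracy is simply the share of articles a model labels correctly. The sketch below shows the arithmetic on eight made-up articles; the labels and predictions are invented for illustration and are not drawn from the Stevens dataset or its results.

```python
def accuracy(predicted, actual):
    """Fraction of items whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical labels for eight articles (not real study data).
actual = ["credible", "untrustworthy", "untrustworthy", "credible",
          "untrustworthy", "credible", "credible", "untrustworthy"]
predicted = ["credible", "untrustworthy", "credible", "credible",
             "untrustworthy", "credible", "credible", "untrustworthy"]

print(accuracy(predicted, actual))  # 7 of 8 correct
```

Accuracy alone can mislead when one label is much more common than the other, which is one reason researchers usually report it alongside other measures.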
That’s an impressive breakthrough, especially given that the data was collected and analyzed almost in real time, Subbalakshmi said. Still, much more work is needed to create tools that are powerful and rigorous enough to be deployed in the real world. “We’ve created a very accurate algorithm for detecting misinformation,” Subbalakshmi said. “But our real contribution in this work is the dataset itself. We’re hoping other researchers will take this forward, and use it to help them better understand fake news.”
One key area for further research: using images and videos embedded in the indexed news articles and social-media posts to augment fake-news detection. “So far, we’ve focused on text,” Subbalakshmi said. “But news and tweets contain all kinds of media, and we need to digest all of that in order to figure out what’s fake and what’s not.”
Working with short texts such as social media posts presents a challenge, but Subbalakshmi’s team has already developed AI tools that can identify deceptive tweets and tweets that spout fake news and conspiracy theories. Bringing bot-detection algorithms and linguistic analysis together could enable the creation of more powerful and scalable AI tools, Subbalakshmi said.
With the Surgeon General now calling for the development of AI tools to help crack down on COVID-19 misinformation, such solutions are urgently needed. Still, Subbalakshmi warned, there’s a long way to go. Fake news is insidious, she explained, and the people and groups who spread false rumors online are working hard to avoid detection and develop new tools of their own.
“Each time we take a step forward, bad actors are able to learn from our methods and build something even more sophisticated,” she said. “It’s a constant battle — the trick is just to stay a few steps ahead.”