University of Southern California: USC at the AAAI ’22 conference: 3D object recognition, teaching robots to think on their own, finding information in any language
At the 36th AAAI Conference on Artificial Intelligence, held online from Feb. 22 to March 1, 2022, USC researchers will present 13 papers spanning a variety of topics, including language learning, 3D object recognition, thinking robots, and data management.
Run by the largest professional organization in the field, the AAAI conference aims to promote research in artificial intelligence and scientific exchange among AI researchers, practitioners, scientists, and engineers in affiliated disciplines.
This year, USC computer science professor Sven Koenig serves as AAAI conference committee chair. He explains that “the AAAI Conference on Artificial Intelligence is one of two top conferences on all areas of artificial intelligence (AI). The many events at AAAI bring the whole AI community together, although only virtually this year.”
Adds Koenig: “Despite the pandemic, the conference continued to grow. It attracted more than 9,000 paper submissions and had a paper acceptance rate of only 15.2 percent, compared to last year’s 21 percent. USC, with several AI-related centers in the Viterbi School of Engineering, is an AI powerhouse that has a strong presence at AAAI every year due to the many USC researchers who work on AI methods and applications across many of the different subareas of AI.”
Pedro Szekely, USC research professor and director of the AI division of USC’s Information Sciences Institute (ISI), notes that “the Association for the Advancement of Artificial Intelligence is the premier organization for the AI community. The annual AAAI conference is the most prestigious conference in AI, covering all sub-disciplines in AI and providing a forum for researchers in different sub-communities to interact and exchange ideas. Several ISI researchers have been recognized in the prestigious AAAI Fellows program, including Research Professor of Computer Science and Spatial Sciences and Principal Scientist Yolanda Gil, and our executive director Craig Knoblock.” This year, USC Research Professor and Principal Scientist at ISI Kristina Lerman will also be inducted as an AAAI Fellow for significant contributions to the field of network science and the application of AI to computational social science. This honor recognizes Lerman’s influential research in AI to understand human behavior.
In addition to the numerous papers presented at this year’s conference, USC will also participate in a tutorial on Recent Advances in Multi-Agent Path Finding led by Jiaoyang Li (USC), Daniel Harabor, Sven Koenig (USC) and Ariel Felner. USC will also hold a workshop on AI for Decision Optimization with Bistra Dilkina (USC) and Segev Wasserkrug. USC Research Professor and Director of Knowledge Technologies at the Information Sciences Institute Yolanda Gil, past president of AAAI, will introduce the Presidential Address and present several awards, including the second $1 million award for AI for the benefit of humanity, honoring individuals whose work in the field has had a transformative impact on society.
Research spotlights, AAAI 2022
Teaching robots to think like humans
“Talking to robots someday would be cool, but we first need to understand what the robot understands, and the robot needs to understand what it thinks we understand,” says Jesse Thomason, one of the authors of TEACh: Task-driven Embodied Agents that Chat and assistant professor of computer science at USC. TEACh was introduced to study how agents, or robots, connect an understanding of natural human language to the visual world while simultaneously using conversation to communicate and pursue their own goals.
There is a “commander” (one who understands the chore) and “follower” (one who carries out the chore) involved in the over 3,000 simulated interactions: the commander will communicate the desired task by speaking to the follower, who will then interact with its surrounding environment to complete a household task or chore. The follower asks questions to gain more information on how to perform certain tasks that range from making coffee to preparing breakfast.
This research allows machine learning models to handle new levels of responsiveness, such as asking what and where the kitchen is rather than simply giving up when a task is unclear. The goal is for these robots to learn from the interactions they have with us, as humans, and to adapt to our unique natural language. “I’d like to see a future where my Roomba stops before it runs through a big pile of cat litter and pings me with a message like ‘Did you actually want me to vacuum this area right now?’ if it figures I am not aware of that big pile yet,” said Thomason.
Detecting occluded shapes for 3D objects
Driving safety often depends on a car’s ability to detect an object in its surroundings, whether it be a pedestrian, a cyclist, another vehicle, or any other approaching object. One barrier in current vehicle technology that prevents complete and comprehensive detection is occluded, or obstructed, objects.
While most modern car technology contains features that help cars safely dodge objects, none fully addresses the challenge of occluded object perception and detection, which is necessary for enhanced driver and environment safety. In their paper Behind the Curtain, Qiangeng Xu and his colleagues enumerate several sources of confusion for a vehicle, one being the lack of perception of occluded objects. To mitigate potential accidents or hazards on the road, this research offers a solution to this fundamental artificial intelligence challenge of object perception. It is the first three-dimensional object detector that illustrates the purpose and benefit of learning occluded shapes for improving driving safety.
Even with a partial understanding of the shape, Behind the Curtain functions to create a complete picture of the obstructed object using well-defined probability models. The new technology aims to reduce the number of car accidents and improve the efficiency and safety of self-driving vehicles.
“The car can stop before an accident occurs because our technology can detect objects earlier, faster, and more accurately,” said Xu.
Identifying critical information in many languages
This research from USC’s Information Sciences Institute’s Steven C. Fincke, Shantanu Agarwal, Scott Miller, and Elizabeth Boschee explores the question: Can you better exploit the strengths of a multilingual language model by specifying your immediate question when requesting a representation of your input text?
The team connects this research to everyday life. “Texts such as newspaper articles contain information about events such as natural disasters, protests, or terror attacks. To make this information easily accessible to humans (who cannot read thousands of articles a day), we train a model to detect these events and determine critical information about them (who, when, where, etc.). We use English examples, but the key is that we ask it to learn from those English examples and then apply what it has learned on foreign language data.”
The authors demonstrated that a language model trained on English texts, such as websites and newspaper articles, could extract knowledge from foreign-language text because both share the same model. “We break down our queries into small bits, such as Who arrived? in an article about a US President visiting a foreign country. Then we ask this question of a language-agnostic model, producing state-of-the-art performance on Arabic text with a system that has only seen English examples.” The key is using those small bits: by addressing each point, the model can find similar information in other languages.
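The "small bits" idea can be sketched in code. The snippet below is a hypothetical illustration, not the paper's implementation: an event template is broken into one tiny role query per argument, so the same queries can later be posed to a shared multilingual encoder over text in any language. The event name, role names, and query wordings are illustrative assumptions.

```python
# Hypothetical sketch of decomposing an event into small role queries.
# The template below is illustrative; it is not taken from the paper.
ARRIVAL_TEMPLATE = {
    "agent": "Who arrived?",
    "destination": "Where did they arrive?",
    "time": "When did they arrive?",
}

def build_queries(template):
    """Return one (role, query) pair per argument role of the event.

    Each small query could then be scored against spans of text in any
    language covered by a multilingual encoder, even though the queries
    themselves were only ever seen with English training examples.
    """
    return sorted(template.items())
```

Because each query targets a single piece of information, the model never has to match a whole complex event description at once; it only has to answer one small question at a time.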
Gathering a lot of data while keeping it safe
Machine learning is playing an ever-more important role in the functioning of modern society. This process relies on neural networks which, in turn, rely on huge data sets. Among these networks, Graph Neural Networks (GNNs) are the first-choice methods for their unparalleled ability to learn state-of-the-art representations from graph-structured data.
The problem, however, is that centralizing the amount of data GNNs require is prohibitive because of serious privacy concerns on the part of the individuals whose data is being collected.
In this paper, USC Viterbi Professor Salman Avestimehr and his co-authors propose the use of a tool called SpreadGNN – a novel training network that uses federated learning combined with GNN. Federated learning is a process in which data is siloed safely at numerous edge locations instead of centrally under the control of one entity. The team’s results show that SpreadGNN outperforms GNN models trained over a central server-dependent federated learning system, even in constrained topologies.
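The core federated learning idea, stripped of the GNN machinery, can be shown with a toy model. The sketch below is a minimal federated-averaging loop on a one-parameter linear model; it is an illustration of the general technique, not the SpreadGNN algorithm, and the learning rate and data are made up.

```python
def local_update(weights, data, lr=0.1):
    """One gradient step on a client's private data (toy 1-D linear model y = w*x).

    The raw (x, y) pairs never leave the client; only the updated weight does.
    """
    grad = sum((weights * x - y) * x for x, y in data) / len(data)
    return weights - lr * grad

def federated_round(global_w, client_datasets):
    """One round: every client trains locally, then the server averages weights."""
    local_ws = [local_update(global_w, d) for d in client_datasets]
    return sum(local_ws) / len(local_ws)  # federated averaging
```

In a centralized setup, the server would see every (x, y) pair; here it only ever sees per-client weights, which is what makes the approach attractive when the underlying data is sensitive. SpreadGNN goes further by removing the central server entirely, but the privacy intuition is the same.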
Handling large data sets without draining resources
As the number of cloud-assisted AI services has grown rapidly, designing systems that can better cope with this surge has become a challenge for engineers. Traditionally, the approach to solving this problem is a process known as “replication”, which assigns the same prediction task to multiple workers. This process, while somewhat reliable, is inefficient and incurs heavy resource costs. In response, many researchers have proposed a learning-based approach called the “parity model” (ParM). However, this model is not without its own challenges: it can be more efficient than replication but only works with a small number of queries.
In response, Avestimehr and his colleagues have proposed a new approach called “Approximate Coded Inference” (ApproxIFER). Their approach does not require training parity models and can be more readily applied to different data domains and model architectures. In other words, it is able to handle large data sets while remaining efficient and not draining resources. The team’s extensive experiments on a large number of datasets and model architectures also show accuracy improvements of up to 58 percent over the parity-model approaches.
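The intuition behind coded inference can be shown with a toy linear model, where the idea works exactly. The sketch below is an illustrative assumption, not the ApproxIFER algorithm (which handles general, nonlinear models approximately): instead of replicating both queries, the server sends one extra "parity" query on the sum of the inputs, and any single slow or failed worker's answer can be recovered from the other two.

```python
def predict(w, x):
    """Toy linear model: prediction is a dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def coded_inference(w, x1, x2, straggler=0):
    """Serve two queries with one parity query instead of full replication.

    For a linear model f, f(x1) + f(x2) == f(x1 + x2), so if one worker's
    response is missing (straggler = 1 or 2), it is recovered by subtracting
    the surviving response from the parity response.
    """
    parity = predict(w, [a + b for a, b in zip(x1, x2)])
    p1, p2 = predict(w, x1), predict(w, x2)
    if straggler == 1:
        p1 = parity - p2   # reconstruct the missing first prediction
    elif straggler == 2:
        p2 = parity - p1   # reconstruct the missing second prediction
    return p1, p2
```

Compared with replication, this serves two queries with three workers instead of four, which is where the efficiency gain over replication comes from; the research challenge ApproxIFER addresses is making this style of redundancy work beyond the simple linear case.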