École polytechnique: Science and video games conference: AI and cooperation

Hanabi, a board game where players must cooperate to share incomplete information, is considered the new challenge for artificial intelligence. Antoine Bauza, Vicky Kalogeiton and Pierre Beri discussed this board game during the “Science and Video Games” Chair conference.



Since the work of Google DeepMind researchers in 2019, Hanabi has been described as “the new frontier of artificial intelligence research”. Antoine Bauza, the board game designer who created Hanabi in 2010, Vicky Kalogeiton, a researcher in the GeoViC team at the École Polytechnique Computer Science Laboratory (LIX*), and Pierre Beri, an accomplished player with nearly 7,000 games of Hanabi under his belt, shared their views on the game and its subtleties, both for humans and for AIs. The discussion, in which David Louapre, Ubisoft’s scientific director, also took part, was held during the second annual conference of the “Science and Video Games” Chair, supported by Ubisoft and hosted by the Leprince-Ringuet laboratory for particle physics and astrophysics (LLR*).

Hanabi is a cooperative board game with incomplete information: each player sees the other players’ cards but not their own. Players coordinate through a limited number of clues, so what each player knows about their own cards is limited too. A clue can be interpreted in several ways, so passing information well requires finesse and an understanding of both the game and the other players’ intentions. The game relies on “theory of mind”: a player assumes that the others reason as they do and will adapt to their way of playing.
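The information structure described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the names `Card`, `Table`, `view` and `give_clue` are hypothetical, the starting count of 8 clue tokens is assumed from the standard rules, and scoring, discards and mistakes are left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Card:
    color: str
    rank: int


@dataclass
class Table:
    hands: Dict[str, List[Card]]
    clue_tokens: int = 8  # assumption: standard starting number of clue tokens

    def view(self, player: str) -> Dict[str, List[Card]]:
        """Return what `player` can see: every hand except their own."""
        return {name: hand for name, hand in self.hands.items() if name != player}

    def give_clue(self, target: str, color: Optional[str] = None,
                  rank: Optional[int] = None) -> List[int]:
        """Spend one clue token; return the positions in `target`'s hand that match."""
        if (color is None) == (rank is None):
            raise ValueError("a clue names exactly one color or one rank")
        if self.clue_tokens == 0:
            raise RuntimeError("no clue tokens left")
        self.clue_tokens -= 1
        return [i for i, c in enumerate(self.hands[target])
                if (c.color == color if color is not None else c.rank == rank)]


# Hypothetical two-player deal, for illustration only.
table = Table(hands={
    "alice": [Card("red", 1), Card("blue", 3)],
    "bob": [Card("red", 2), Card("red", 5), Card("green", 1)],
})
visible_to_alice = table.view("alice")          # contains only Bob's hand
touched = table.give_clue("bob", color="red")   # positions of Bob's red cards
```

In this sketch a clue tells Bob only which positions are red. Whether he should play or keep those cards is left to interpretation, which is where the finesse described above comes in.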

Winner of the prestigious “Spiel des Jahres” award in 2013, Hanabi poses multiple challenges for AI researchers. Antoine Bauza’s game requires anticipating the behaviour of other players in order to coordinate with them. Vicky Kalogeiton, an expert in artificial intelligence for video understanding, explains that this way of playing runs counter to the most common learning technique for AI.

This technique, “reinforcement learning”, involves building the rules of a game into the design of an AI and having it play a large number of games against itself. Game-playing machines have a long record of success against humans, most notably in 1997 with the historic victory of Deep Blue, a supercomputer that defeated world chess champion Garry Kasparov. More recently, reinforcement learning enabled “AlphaStar”, a multi-agent reinforcement learning algorithm, to reach the level of the best StarCraft II players. This video game is known as one of the most difficult professional e-sports, and its raw complexity makes it relevant to real-world problems.

Since the work of Google DeepMind researchers, more than 250 research papers have been published on Hanabi, which remains a challenge for artificial intelligence. Coordinating with human players and adapting to their actions turns the reinforcement learning paradigm on its head and pushes AI researchers to extend the limits of their algorithms.
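A toy Python sketch can suggest why playing with unfamiliar partners is hard for an AI trained only against itself. It is not Hanabi and not a learning algorithm: the two-card signalling game, the two hand-written conventions and every name in it are hypothetical, chosen only to show that two teams can each be perfect on their own and still fail together.

```python
CARDS = ("red", "blue")


def score(speaker, listener):
    """Fraction of cards the listener identifies from the speaker's signal."""
    hits = sum(listener[speaker[card]] == card for card in CARDS)
    return hits / len(CARDS)


# Two conventions that self-play could equally well settle on (hypothetical).
team_1 = {"speaker": {"red": "A", "blue": "B"},
          "listener": {"A": "red", "B": "blue"}}
team_2 = {"speaker": {"red": "B", "blue": "A"},
          "listener": {"B": "red", "A": "blue"}}

# Each team paired with itself, then a speaker paired with a stranger.
self_play = [score(t["speaker"], t["listener"]) for t in (team_1, team_2)]
cross_play = score(team_1["speaker"], team_2["listener"])
```

Here each team scores perfectly with itself, while the mixed pair gets every card wrong, because nothing in the rules says which signal should mean which card.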