Please report any queries concerning the funding data grouped in the sections named "Externally Awarded" or "Internally Disbursed" (shown on the profile page) to
your Research Finance Administrator. You can find your Research Finance Administrator at https://www.ucl.ac.uk/finance/research/rs-contacts.php by entering your department.
Please report any queries concerning the student data shown on the profile page to:
Email: portico-services@ucl.ac.uk
Help Desk: http://www.ucl.ac.uk/ras/portico/helpdesk
Publication Detail
Alchemy: A benchmark and analysis toolkit for meta-reinforcement
learning agents
Publication Type: Journal article
Authors: Wang JX, King M, Porcel N, Kurth-Nelson Z, Zhu T, Deck C, Choy P, Cassin M, Reynolds M, Song F, Buttimore G, Reichert DP, Rabinowitz N, Matthey L, Hassabis D, Lerchner A, Botvinick M
Publication date: 04/02/2021
Keywords: cs.LG, cs.AI
Notes: Published in Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 2021
Abstract
There has been rapidly growing interest in meta-learning as a method for
increasing the flexibility and sample efficiency of reinforcement learning. One
problem in this area of research, however, has been a scarcity of adequate
benchmark tasks. In general, the structure underlying past benchmarks has
either been too simple to be inherently interesting, or too ill-defined to
support principled analysis. In the present work, we introduce a new benchmark
for meta-RL research, emphasizing transparency and potential for in-depth
analysis as well as structural richness. Alchemy is a 3D video game,
implemented in Unity, which involves a latent causal structure that is
resampled procedurally from episode to episode, affording structure learning,
online inference, hypothesis testing and action sequencing based on abstract
domain knowledge. We evaluate a pair of powerful RL agents on Alchemy and
present an in-depth analysis of one of these agents. Results clearly indicate a
frank and specific failure of meta-learning, providing validation for Alchemy
as a challenging benchmark for meta-RL. Concurrent with this report, we are
releasing Alchemy as a public resource, together with a suite of analysis tools
and sample agent trajectories.