Generalizing with neural networks on mazes
Knutson, Brandon A.
Date Issued
2025-04
Abstract
Generalizing from training data is a key issue in machine learning, especially since in practice the test data is often drawn from a different distribution. Unlike traditional fixed-depth networks such as multilayer perceptrons, recurrent and implicit neural networks are particularly suited to out-of-distribution generalization because their variable depth allows them to scale up computation at test time to solve harder problems. We quantify the generalization ability of these models in the context of solving mazes, where we can easily generate out-of-distribution shifts while retaining a ground-truth solution. We show that a model trained on mazes without loops fails to generalize to mazes with loops; instead, the model emulates the 'dead-end filling' algorithm. Finally, we show that diversifying the training data only slightly, by adding loops to relatively few mazes, dramatically increases overall generalization, indicating a switch in the underlying algorithm that the model learns to emulate.
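For reference, below is a minimal sketch of the 'dead-end filling' maze-solving strategy named in the abstract, which fails on mazes containing loops because loops have no dead ends to fill. The grid encoding (0 = wall, 1 = open) and the start/goal handling are illustrative assumptions, not the thesis's actual implementation.

```python
def dead_end_fill(grid, start, goal):
    """Repeatedly wall off open cells with a single open neighbor (dead ends),
    never filling the start or goal; the cells left open contain the solution."""
    grid = [row[:] for row in grid]          # work on a copy
    rows, cols = len(grid), len(grid[0])

    def open_neighbors(r, c):
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        return [(r + dr, c + dc) for dr, dc in steps
                if 0 <= r + dr < rows and 0 <= c + dc < cols
                and grid[r + dr][c + dc] == 1]

    changed = True
    while changed:
        changed = False
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1 and (r, c) not in (start, goal) \
                        and len(open_neighbors(r, c)) <= 1:
                    grid[r][c] = 0           # fill the dead end
                    changed = True
    return grid                              # remaining open cells form the path


# Tiny hypothetical example: a 5x5 maze with one dead-end branch that gets filled.
maze = [
    [1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
solved = dead_end_fill(maze, start=(0, 0), goal=(2, 4))
```

On a loop-free ("perfect") maze this procedure leaves exactly the start-to-goal path, which is consistent with the abstract's observation that a model trained only on such mazes can succeed by emulating it yet fail once loops are introduced.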
Rights
Copyright of the original work is retained by the author.