The Evolution of Benchmarks in AI Research
The classical paradigm of supervised learning involves partitioning data into training, validation, and test sets. Beyond such splits, the community has adopted shared benchmark tasks, such as the bAbI tasks proposed by Facebook AI Research (FAIR), to test machines' ability to reason and access working memory.
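The partitioning step can be illustrated with a minimal sketch. The function name `split_dataset`, the 80/10/10 proportions, and the fixed seed are assumptions chosen for illustration and do not come from any particular benchmark.

```python
import random


def split_dataset(items, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle items and partition them into train, validation, and test lists.

    Hypothetical helper: the default fractions and seed are illustrative only.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(items)
    # A seeded generator makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)

    # The three slices are disjoint and together cover every item.
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

The validation set is typically used for model selection and tuning, while the test set is held back for a final evaluation.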