1Cademy - SCAN Tasks for Evaluating Compositional Generalization

Learn Before

Compositional Generalization in LLMs

Dataset

SCAN Tasks for Evaluating Compositional Generalization

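SCAN (Simplified version of the CommAI Navigation tasks) is a benchmark used to measure an LLM's capacity for compositional generalization. Each task requires the model to translate a simplified natural-language command into the corresponding sequence of actions; for example, the command "jump twice" maps to two consecutive jump actions. Because commands are built from a small set of primitives, modifiers, and conjunctions, a test split can contain combinations the model never saw during training.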
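
To make the command-to-action mapping concrete, here is a minimal Python sketch of an interpreter for SCAN-style commands. It is an illustration written for this page, not the official dataset generator: the function names `interpret` and `interpret_phrase` are hypothetical, and the sketch assumes the commonly described SCAN conventions (action tokens such as `I_JUMP` and `I_TURN_LEFT`, at most one `and`/`after` per command, and `after` reversing execution order).

```python
# Illustrative interpreter for SCAN-style commands (a sketch, not the official generator).
PRIMITIVES = {"walk": "I_WALK", "look": "I_LOOK", "run": "I_RUN", "jump": "I_JUMP"}
TURNS = {"left": "I_TURN_LEFT", "right": "I_TURN_RIGHT"}
REPEATS = {"twice": 2, "thrice": 3}


def interpret_phrase(words):
    """Interpret one phrase with no conjunction, e.g. ['jump', 'around', 'left', 'twice']."""
    words = list(words)
    repeat = 1
    if words and words[-1] in REPEATS:
        repeat = REPEATS[words.pop()]  # trailing 'twice'/'thrice' repeats the whole phrase
    if not words:
        raise ValueError("empty phrase")
    verb, rest = words[0], words[1:]
    if verb != "turn" and verb not in PRIMITIVES:
        raise ValueError(f"unknown verb: {verb}")
    # 'turn' contributes no action of its own; it only rotates the agent.
    action = [] if verb == "turn" else [PRIMITIVES[verb]]
    if not rest:
        if verb == "turn":
            raise ValueError("'turn' needs a direction")
        actions = action
    elif len(rest) == 1:
        actions = [TURNS[rest[0]]] + action  # 'walk left' -> turn, then walk
    elif len(rest) == 2:
        modifier, direction = rest
        turn = TURNS[direction]
        if modifier == "opposite":
            actions = [turn, turn] + action  # turn twice, then act once
        elif modifier == "around":
            actions = ([turn] + action) * 4  # turn and act at each quarter turn
        else:
            raise ValueError(f"unknown modifier: {modifier}")
    else:
        raise ValueError("phrase too long")
    return actions * repeat


def interpret(command):
    """Translate a full command, handling at most one 'and' or 'after'."""
    words = command.split()
    for conj in ("and", "after"):
        if conj in words:
            i = words.index(conj)
            first = interpret_phrase(words[:i])
            second = interpret_phrase(words[i + 1:])
            # 'x after y' means y is executed first.
            return first + second if conj == "and" else second + first
    return interpret_phrase(words)


print(interpret("jump twice"))
print(interpret("run after walk left"))
```

A compositional-generalization split in this spirit would, for instance, show a model "jump" only in isolation during training and then test it on commands such as "jump around left", which the rules above resolve without ever having seen that combination.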

Updated 2025-10-04
