About EASI

EASI is a holistic evaluation framework for assessing Multimodal Large Language Models on spatial intelligence. It provides comprehensive coverage across six core dimensions of spatial reasoning, grounded in cognitive science.


Taxonomy of Spatial Capabilities

Six core dimensions of spatial intelligence, derived from cognitive science, that structure the EASI evaluation framework.

(a) MM: Metric Measurement
Inferring 3D dimensions such as depth and length from 2D observations.

(b) MR: Mental Reconstruction
Constructing complete 3D structure from limited viewpoints.

(c) SR: Spatial Relations
Understanding relative positions and orientations of objects.

(d) PT: Perspective-taking
Reasoning about scenes across different viewpoints.

(e) DA: Deformation & Assembly
Understanding structural changes, folding, and assembly.

(f) CR: Comprehensive Reasoning
Multi-stage spatial reasoning combining multiple capabilities.

EASI-8 Benchmarks

Eight curated benchmarks providing comprehensive coverage of spatial reasoning tasks.

Benchmark      Metric  Focus
VSI-Bench      Acc.    Visual-Spatial Intelligence
MMSI-Bench     Acc.    Multi-Modal Spatial Intelligence
MindCube-Tiny  Acc.    Mental Rotation & Cube Folding
ViewSpatial    Acc.    View-based Spatial Reasoning
SITE           CAA     Spatial Intelligence in Text & Environment
BLINK          Acc.    Spatial Perception from Images
3DSRBench      Acc.    3D Spatial Reasoning
EmbSpatial     Acc.    Embodied Spatial Understanding
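
The eight benchmarks and their reported metrics listed above can be held in a small registry when collecting results. The sketch below is illustrative only: the names `Benchmark`, `EASI_8`, and `macro_average` are hypothetical, and the unweighted mean is an assumption made for the example, not EASI's documented aggregation rule. Note that SITE reports CAA rather than accuracy, so averaging it with the other scores may not be meaningful without further normalization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    """One EASI-8 entry: name, reported metric, and focus area."""
    name: str
    metric: str
    focus: str


# Entries copied from the benchmark list above.
EASI_8 = [
    Benchmark("VSI-Bench", "Acc.", "Visual-Spatial Intelligence"),
    Benchmark("MMSI-Bench", "Acc.", "Multi-Modal Spatial Intelligence"),
    Benchmark("MindCube-Tiny", "Acc.", "Mental Rotation & Cube Folding"),
    Benchmark("ViewSpatial", "Acc.", "View-based Spatial Reasoning"),
    Benchmark("SITE", "CAA", "Spatial Intelligence in Text & Environment"),
    Benchmark("BLINK", "Acc.", "Spatial Perception from Images"),
    Benchmark("3DSRBench", "Acc.", "3D Spatial Reasoning"),
    Benchmark("EmbSpatial", "Acc.", "Embodied Spatial Understanding"),
]


def macro_average(scores: dict[str, float]) -> float:
    """Unweighted mean over all eight benchmarks (an assumed aggregation).

    Raises KeyError if any benchmark is missing, so that a partial
    result set is never silently averaged.
    """
    missing = [b.name for b in EASI_8 if b.name not in scores]
    if missing:
        raise KeyError(f"missing scores for: {', '.join(missing)}")
    return sum(scores[b.name] for b in EASI_8) / len(EASI_8)
```

A caller would pass a mapping from benchmark name to score, for example `macro_average({b.name: 50.0 for b in EASI_8})`. Rejecting incomplete score sets is a deliberate choice here, since a mean over fewer benchmarks is not comparable to one over all eight.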

Citation

@article{easi2025,
  title={Holistic Evaluation of Multimodal LLMs on Spatial Intelligence},
  author={Cai, Zhongang and Wang, Yubo and Sun, Qingping and Wang, Ruisi and Gu, Chenyang and Yin, Wanqi and Lin, Zhiqian and Yang, Zhitao and Wei, Chen and Shi, Xuanke and Deng, Kewang and Han, Xiaoyang and Chen, Zukai and Li, Jiaqi and Fan, Xiangyu and Deng, Hanming and Lu, Lewei and Li, Bo and Liu, Ziwei and Wang, Quan and Lin, Dahua and Yang, Lei},
  journal={arXiv preprint arXiv:2508.13142},
  year={2025}
}
