API Reference
Programmatic access to the EASI Leaderboard — fetch results, submit evaluations, and authenticate via HuggingFace OAuth.
| Endpoint | Method | Auth | Description |
|---|---|---|---|
| `/api/auth/callback` | GET | None | HuggingFace OAuth callback |
| `/api/leaderboard` | GET | None (public) | Fetch latest leaderboard data |
| `/api/submit` | POST | Bearer token | Submit evaluation (browser flow, Vercel Blob) |
| `/api/submit-with-file` | POST | Bearer token | Submit evaluation (scripts, inline zip) |
| `/api/blob-upload` | POST | Bearer token | Vercel Blob token exchange (browser only) |
Authentication
Authentication uses HuggingFace OAuth 2.0 (authorization code flow). The API uses Bearer token authentication — the same approach works for both the web form and programmatic scripts.
Browser Flow
- Client redirects to the HuggingFace authorize URL with `client_id`, `redirect_uri`, `scope=openid profile`
- User authorizes on HuggingFace
- HuggingFace redirects to `/api/auth/callback?code=...`
- Server exchanges the code for an access token and fetches user info
- Server redirects to `/submit?hf_user=<base64url>` with user data + access token
- Client stores the token in `localStorage` and sends it as `Authorization: Bearer` on submit
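Step 1 of this flow can be sketched as follows; the client ID and redirect URI are placeholders, and `https://huggingface.co/oauth/authorize` is the standard HuggingFace OAuth authorize URL (an assumption, since this page only names the parameters):

```python
from urllib.parse import urlencode

# Placeholder values -- substitute your app's registered OAuth credentials.
CLIENT_ID = "your-client-id"
REDIRECT_URI = "https://easi.lmms-lab.com/api/auth/callback"

def build_authorize_url(state: str) -> str:
    """Build the HuggingFace authorize URL for the browser redirect (step 1)."""
    params = {
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": "openid profile",
        "response_type": "code",  # authorization code flow
        "state": state,           # CSRF protection, echoed back to the callback
    }
    return "https://huggingface.co/oauth/authorize?" + urlencode(params)
```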
Script Flow
- Obtain an HF token — either a regular API token (`hf_*`) from `huggingface.co/settings/tokens` or an OAuth token
- Call `POST /api/submit-with-file/` with `Authorization: Bearer <token>` and the zip file as multipart
- Server verifies identity via HuggingFace (`/oauth/userinfo` for OAuth tokens, `/api/whoami` for regular tokens)
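A minimal sketch of the client side of this flow, building the auth header (the token-shape check is a local convenience, not something the API requires up front):

```python
SUBMIT_URL = "https://easi.lmms-lab.com/api/submit-with-file/"

def submit_headers(token: str) -> dict:
    """Build the auth header for the script flow.

    Both regular (hf_*) and OAuth (hf_oauth_*) tokens share the hf_ prefix,
    so a cheap sanity check catches obvious mistakes before any network call.
    """
    if not token.startswith("hf_"):
        raise ValueError("expected a HuggingFace token (hf_* or hf_oauth_*)")
    return {"Authorization": f"Bearer {token}"}
```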
Token Details
- Accepted tokens: both regular HF API tokens (`hf_*`) and OAuth tokens (`hf_oauth_*`)
- What it proves: the identity of the user (username, profile)
- What it does NOT grant: write access to the EASI dataset repository
- For scripts: use a regular HF API token from `huggingface.co/settings/tokens` — no browser OAuth flow needed
- Expiry: determined by HuggingFace. Submit returns 401 when expired.
OAuth Callback
Not called directly — HuggingFace redirects here after user authorizes. Exchanges code for token with retry (2 attempts, 10s timeout).
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| `code` | string | Authorization code from HuggingFace |
| `state` | string | CSRF state parameter |
Error Codes
On failure, redirects to `/submit?auth_error=<code>&detail=<message>`.
| Code | Cause |
|---|---|
| `missing_code` | No `code` parameter in callback URL |
| `server_config` | Missing `HF_CLIENT_ID` or `HF_CLIENT_SECRET` |
| `token_exchange` | HuggingFace rejected the authorization code |
| `userinfo` | Failed to fetch user profile |
| `unknown` | Unexpected server error |
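Client code that lands back on `/submit` can recover the error like this (a stdlib sketch; `parse_auth_error` is a hypothetical helper, not part of the API):

```python
from urllib.parse import urlparse, parse_qs

def parse_auth_error(redirect_url: str):
    """Extract (code, detail) from a /submit?auth_error=...&detail=... redirect.

    Returns None when the redirect carried no error.
    """
    qs = parse_qs(urlparse(redirect_url).query)
    if "auth_error" not in qs:
        return None
    return qs["auth_error"][0], qs.get("detail", [""])[0]
```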
Leaderboard Data
Fetches the latest leaderboard data from the private HuggingFace dataset repo. The server uses its own token internally — no authentication required from the caller. Results are cached in memory for 5 minutes.
Behavior
- Lists files in `leaderboard/versions/` (with retry: 3 attempts, exponential backoff)
- Picks the latest by filename timestamp (e.g., `bench_20260214T040553.json`)
- Concurrently fetches the leaderboard JSON and `capability_map.json` (taxonomy mapping)
- Transforms to `ModelEntry[]` format
- Returns cached data if less than 5 minutes old
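The "latest by filename timestamp" step relies on `bench_YYYYMMDDTHHMMSS.json` names sorting lexicographically in time order, so a plain `max()` suffices. A minimal sketch (`latest_version` is a hypothetical helper):

```python
def latest_version(filenames):
    """Pick the newest leaderboard snapshot by its embedded timestamp.

    bench_YYYYMMDDTHHMMSS.json names sort lexicographically in time order,
    so max() on the filename string is enough.
    """
    snapshots = [f for f in filenames if f.startswith("bench_") and f.endswith(".json")]
    return max(snapshots) if snapshots else None
```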
Success Response
```json
{
  "data": [
    {
      "name": "qwen2.5_vl_3b_instruct",
      "type": "instruction",
      "precision": "bfloat16",
      "scores": {
        "vsi_bench": 27.0,
        "mmsi_bench": 28.6,
        "site": 33.14
      }
    }
  ],
  "lastUpdated": "2026-02-14T04:05:53Z",
  "capabilityMap": {
    "vsi_bench": {
      "object_counting": ["cr"],
      "object_abs_distance": ["mm"]
    },
    "mmsi_bench": {
      "msr_accuracy": ["cr"]
    }
  }
}
```

Taxonomy Map (capabilityMap)
Maps each benchmark's sub-scores to spatial taxonomy categories. Fetched from `capability_map.json` in the HF dataset repo. Structure: `Record<benchmarkId, Record<subScoreKey, taxonomyLabels[]>>`.
- Each sub-score maps to an array of taxonomy labels (e.g., `["mm"]`, `["sr", "mm"]`)
- Empty array `[]` means no taxonomy mapping for that sub-score
- Only benchmarks with mappings appear in the map
- Taxonomy labels are dynamically derived from the data
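As an illustration, a benchmark's sub-scores can be folded into per-taxonomy averages using its `capabilityMap` entry (a sketch; `scores_by_taxonomy` is a hypothetical helper, not part of the API):

```python
def scores_by_taxonomy(sub_scores, capability_map):
    """Average a benchmark's sub-scores per taxonomy label.

    Sub-scores that are None (not evaluated) or that map to an empty
    label list are skipped.
    """
    buckets = {}  # label -> list of contributing scores
    for key, labels in capability_map.items():
        score = sub_scores.get(key)
        if score is None:
            continue
        for label in labels:
            buckets.setdefault(label, []).append(score)
    return {label: sum(v) / len(v) for label, v in buckets.items()}
```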
Error Response
Status 502:
```json
{
  "error": "Failed to load leaderboard data. Please try again later."
}
```

Submit Evaluation (Browser)
Used by the web form. Accepts a JSON payload with a Vercel Blob URL referencing the uploaded zip file. The zip is uploaded to Vercel Blob first (via /api/blob-upload/), then this endpoint fetches it, validates contents, and uploads to the HuggingFace dataset repository.
For scripts and programmatic access, use /api/submit-with-file/ instead (see below).
Submit Evaluation (Scripts)
Script-friendly endpoint for programmatic submissions. Accepts multipart/form-data with the JSON payload and zip file inline. Supports both regular HF API tokens (hf_*) and OAuth tokens — no browser flow needed.
Zip file size limit: 4.5 MB (Vercel serverless payload limit). For larger files, use the web form which uploads via Vercel Blob with no size limit.
Rate Limiting
Sliding window: 5 submissions per 2-hour window per user. Each timestamp expires independently. Tracked in-memory by server-verified username.
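A minimal in-memory sketch of this sliding window (illustrative only; the server's actual implementation may differ):

```python
import time

WINDOW_SECONDS = 2 * 60 * 60  # 2-hour window
MAX_SUBMISSIONS = 5

_history = {}  # username -> list of submission timestamps

def allow_submission(username, now=None):
    """Sliding-window limiter: each timestamp expires independently
    once it falls out of the 2-hour window."""
    now = time.time() if now is None else now
    recent = [t for t in _history.get(username, []) if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_SUBMISSIONS:
        _history[username] = recent
        return False  # would be a 429 response
    recent.append(now)
    _history[username] = recent
    return True
```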
Request Format
multipart/form-data with two parts:
| Part | Type | Required | Description |
|---|---|---|---|
| `payload` | JSON string | Yes | Submission metadata and scores (see fields below) |
| `zipFile` | File (`.zip`) | Yes | Zip archive containing evaluation result files (max 4.5 MB) |
Payload Fields
| Field | Type | Required | Description |
|---|---|---|---|
| `modelName` | string | Yes | HuggingFace model ID (`org/model`) |
| `modelType` | string | Yes | `pretrained` \| `finetuned` \| `instruction` \| `rl` |
| `precision` | string | Yes | `bfloat16` \| `float16` \| `float32` \| `int8` |
| `revision` | string | No | Model revision. Defaults to `"main"` |
| `weightType` | string | No | `Original` \| `Delta` \| `Adapter` |
| `baseModel` | string | Conditional | Required for Delta/Adapter weights |
| `backend` | string | Yes | `vlmevalkit` \| `lmmseval` \| `others` |
| `scores` | object | Yes | Benchmark ID → number or `null` |
| `subScores` | object | No | Benchmark ID → `{ sub_key → number or null }` |
| `remarks` | string | No | Free-text notes |
Zip File Requirements
The zip file must contain your raw evaluation result files for independent verification.
- Max size: 20 MB (max decompressed: 100 MB)
- Allowed file types: .json, .jsonl, .csv, .tsv, .txt, .log, .yaml, .yml, .xml, .md, .html, .pdf, .py, .sh, .png, .jpg, .jpeg, .gif, .svg, .parquet, .arrow, .npy, .pkl
- Security: Files with disallowed extensions or path traversal patterns are rejected
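To catch rejections before uploading, the same checks can be pre-flighted client-side. A sketch using the stdlib `zipfile` module (`check_zip` mirrors the rules above but is an assumption, not the server's code):

```python
import zipfile

ALLOWED_EXTENSIONS = {
    ".json", ".jsonl", ".csv", ".tsv", ".txt", ".log", ".yaml", ".yml",
    ".xml", ".md", ".html", ".pdf", ".py", ".sh", ".png", ".jpg", ".jpeg",
    ".gif", ".svg", ".parquet", ".arrow", ".npy", ".pkl",
}

def check_zip(path, max_decompressed=100 * 1024 * 1024):
    """Pre-flight the server's checks: path traversal, disallowed
    extensions, and total decompressed size. Returns a list of problems."""
    problems = []
    with zipfile.ZipFile(path) as zf:
        total = 0
        for info in zf.infolist():
            name = info.filename
            if name.endswith("/"):
                continue  # directory entry
            if name.startswith("/") or ".." in name.split("/"):
                problems.append(f"unsafe path: {name}")
            ext = "." + name.rsplit(".", 1)[-1].lower() if "." in name else ""
            if ext not in ALLOWED_EXTENSIONS:
                problems.append(f"disallowed type: {name}")
            total += info.file_size
        if total > max_decompressed:
            problems.append("decompressed size exceeds limit")
    return problems
```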
Sub-Scores
Each benchmark has a set of sub-score keys. The subScores field is optional — only include benchmarks where at least one sub-score is filled. Unfilled sub-scores within an included benchmark should be null. See the benchmark sub-score keys in the submit form for the full mapping.
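Building a compliant `subScores` object then amounts to dropping benchmarks where nothing is filled. A sketch (`prune_sub_scores` is a hypothetical helper):

```python
def prune_sub_scores(sub_scores):
    """Keep only benchmarks where at least one sub-score is filled.

    Unfilled keys within a kept benchmark stay as None, which
    serializes to null in the JSON payload.
    """
    return {
        bench: subs
        for bench, subs in sub_scores.items()
        if any(v is not None for v in subs.values())
    }
```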
Benchmark IDs
EASI-8 (core):
| ID | Name | Metric |
|---|---|---|
| `vsi_bench` | VSI-Bench | Acc. |
| `mmsi_bench` | MMSI-Bench | Acc. |
| `mindcube_tiny` | MindCube-Tiny | Acc. |
| `viewspatial` | ViewSpatial | Acc. |
| `site` | SITE | CAA |
| `blink` | BLINK | Acc. |
| `3dsrbench` | 3DSRBench | Acc. |
| `embspatial` | EmbSpatial | Acc. |
Additional:
| ID | Name | Metric |
|---|---|---|
| `mmsi_video_bench` | MMSI-Video-Bench | Acc. |
| `omnispatial_(manual_cot)` | OmniSpatial (Manual CoT) | Acc. |
| `spar_bench` | SPAR-Bench | Acc. |
| `vsi_debiased` | VSI-Debiased | Acc. |
A `null` value means not evaluated. A `0` value means evaluated with a score of zero.
Validation Pipeline
| Step | Check | Status |
|---|---|---|
| 1 | HF_UPLOAD_TOKEN configured | 500 |
| 2 | Bearer token valid (OAuth or regular HF token) | 401 |
| 3 | Rate limit not exceeded | 429 |
| 4 | Valid multipart form data | 400 |
| 5 | Payload JSON parseable | 400 |
| 6 | Zip file present | 400 |
| 7 | Zip valid (magic bytes, size, decompressed size, contents) | 400 |
| 8 | modelName matches org/model format | 400 |
| 9 | modelType non-empty | 400 |
| 10 | precision non-empty | 400 |
| 11 | Model exists on HuggingFace | 400 |
| 12 | Model has a license | 400 |
| 13 | Base model valid (Delta/Adapter) | 400 |
| 14 | Upload to HF repo via LFS succeeds | 500 |
Success Response
{ "success": true }Error Messages
| Status | Message |
|---|---|
| 500 | Server configuration error. Please contact the maintainers. |
| 401 | Your session has expired. Please sign in with HuggingFace again. |
| 429 | You've reached the submission limit (5 per 2 hours). |
| 400 | Invalid request. Expected multipart form data. |
| 400 | Missing submission payload. |
| 400 | Evaluation results zip file is required. |
| 400 | File is not a valid ZIP archive. |
| 400 | ZIP file exceeds 20 MB limit. |
| 400 | ZIP decompressed size exceeds 100 MB limit. |
| 400 | ZIP contains invalid file paths. |
| 400 | ZIP contains disallowed file types: {list} |
| 400 | A valid model name in the format 'organization/model-name' is required. |
| 400 | Model "{name}" was not found on HuggingFace. |
| 400 | Model "{name}" does not have a license set. |
| 400 | Base model is required when using {type} weights. |
| 500 | Failed to upload your submission to the repository. |
Usage Examples
These examples use /api/submit-with-file/ for script-based submissions. You can use a regular HuggingFace API token (from huggingface.co/settings/tokens) — no OAuth browser flow needed.
Python
```python
import requests
import json

# Use a regular HF API token (hf_*) — no OAuth needed
# Get one from: https://huggingface.co/settings/tokens
HF_TOKEN = "hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

payload = {
    "modelName": "org/model-name",
    "modelType": "instruction",  # pretrained | finetuned | instruction | rl
    "precision": "bfloat16",     # bfloat16 | float16 | float32 | int8
    "revision": "main",
    "weightType": "Original",    # Original | Delta | Adapter
    "baseModel": "",             # required for Delta/Adapter
    "backend": "vlmevalkit",     # vlmevalkit | lmmseval | others
    "scores": {
        "vsi_bench": 27.0,
        "mmsi_bench": 28.6,
        "blink": None,  # null = not evaluated
        "site": 0,      # 0 = evaluated with a score of zero
    },
    "subScores": {  # optional
        "vsi_bench": {
            "obj_appearance_order_accuracy": 25.3,
            "object_abs_distance": 18.7,
            "object_counting": None,
        },
    },
    "remarks": "Submitted via Python script",
}

# Submit with zip file (max 4.5 MB)
with open("eval_results.zip", "rb") as f:
    response = requests.post(
        "https://easi.lmms-lab.com/api/submit-with-file/",
        headers={"Authorization": f"Bearer {HF_TOKEN}"},
        files={"zipFile": ("eval_results.zip", f, "application/zip")},
        data={"payload": json.dumps(payload)},
    )

result = response.json()
if result.get("success"):
    print("Submission successful!")
else:
    print(f"Error ({response.status_code}): {result.get('error')}")
```

cURL
```bash
# Use a regular HF API token — no OAuth needed
curl -X POST https://easi.lmms-lab.com/api/submit-with-file/ \
  -H "Authorization: Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -F 'payload={
    "modelName": "org/model-name",
    "modelType": "instruction",
    "precision": "bfloat16",
    "revision": "main",
    "weightType": "Original",
    "baseModel": "",
    "backend": "vlmevalkit",
    "scores": {"vsi_bench": 27.0, "mmsi_bench": 28.6},
    "subScores": {"vsi_bench": {"object_counting": 30.2}},
    "remarks": "Submitted via cURL"
  }' \
  -F "zipFile=@eval_results.zip"

# Response: {"success": true}
```

Notes
- Zip size limit: 4.5 MB for script submissions. For larger files, use the web form at `/submit`
- Token type: both regular HF API tokens (`hf_*`) and OAuth tokens (`hf_oauth_*`) are accepted
- Rate limit: 5 submissions per 2-hour window per user
- Scores: use `null` for benchmarks not evaluated, `0` for a zero score