API Reference
Programmatic access to the EASI Leaderboard — fetch results, submit evaluations, and authenticate via HuggingFace OAuth.
| Endpoint | Method | Auth | Description |
|---|---|---|---|
| `/api/auth/callback` | GET | None | HuggingFace OAuth callback |
| `/api/leaderboard` | GET | None (public) | Fetch latest leaderboard data |
| `/api/submit` | POST | Bearer token | Submit evaluation (browser flow, Vercel Blob) |
| `/api/submit-with-file` | POST | Bearer token | Submit evaluation (scripts, inline zip) |
| `/api/blob-upload` | POST | Bearer token | Vercel Blob token exchange (browser only) |
Authentication
Authentication uses HuggingFace OAuth 2.0 (authorization code flow). The API uses Bearer token authentication — the same approach works for both the web form and programmatic scripts.
Browser Flow
- Client redirects to the HuggingFace authorize URL with `client_id`, `redirect_uri`, `scope=openid profile`
- User authorizes on HuggingFace
- HuggingFace redirects to `/api/auth/callback?code=...`
- Server exchanges the code for an access token and fetches user info
- Server redirects to `/submit?hf_user=<base64url>` with user data + access token
- Client stores the token in `localStorage` and sends it as `Authorization: Bearer` on submit
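Step 1 of this flow can be sketched as follows; the client ID and redirect URI are placeholders, and `https://huggingface.co/oauth/authorize` is the standard HuggingFace OAuth authorize URL (an assumption, since this page only names the parameters):

```python
from urllib.parse import urlencode

# Placeholder values -- substitute your app's registered OAuth credentials.
CLIENT_ID = "your-client-id"
REDIRECT_URI = "https://easi.lmms-lab.com/api/auth/callback"

def build_authorize_url(state: str) -> str:
    """Build the HuggingFace authorize URL for the browser redirect (step 1)."""
    params = {
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": "openid profile",
        "response_type": "code",  # authorization code flow
        "state": state,           # CSRF protection, echoed back to the callback
    }
    return "https://huggingface.co/oauth/authorize?" + urlencode(params)
```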
Script Flow
- Obtain an HF token — either a regular API token (`hf_*`) from `huggingface.co/settings/tokens` or an OAuth token
- Call `POST /api/submit-with-file/` with `Authorization: Bearer <token>` and the zip file as multipart
- Server verifies identity via HuggingFace (`/oauth/userinfo` for OAuth tokens, `/api/whoami` for regular tokens)
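A minimal sketch of the client side of this flow, building the auth header (the token-shape check is a local convenience, not something the API requires up front):

```python
SUBMIT_URL = "https://easi.lmms-lab.com/api/submit-with-file/"

def submit_headers(token: str) -> dict:
    """Build the auth header for the script flow.

    Both regular (hf_*) and OAuth (hf_oauth_*) tokens share the hf_ prefix,
    so a cheap sanity check catches obvious mistakes before any network call.
    """
    if not token.startswith("hf_"):
        raise ValueError("expected a HuggingFace token (hf_* or hf_oauth_*)")
    return {"Authorization": f"Bearer {token}"}
```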
Token Details
- Accepted tokens: both regular HF API tokens (`hf_*`) and OAuth tokens (`hf_oauth_*`)
- What it proves: the identity of the user (username, profile)
- What it does NOT grant: write access to the EASI dataset repository
- For scripts: use a regular HF API token from `huggingface.co/settings/tokens` — no browser OAuth flow needed
- Expiry: determined by HuggingFace. Submit returns 401 when expired.
OAuth Callback
Not called directly — HuggingFace redirects here after user authorizes. Exchanges code for token with retry (2 attempts, 10s timeout).
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| `code` | string | Authorization code from HuggingFace |
| `state` | string | CSRF state parameter |
Error Codes
On failure, redirects to `/submit?auth_error=<code>&detail=<message>`.
| Code | Cause |
|---|---|
| `missing_code` | No `code` parameter in callback URL |
| `server_config` | Missing `HF_CLIENT_ID` or `HF_CLIENT_SECRET` |
| `token_exchange` | HuggingFace rejected the authorization code |
| `userinfo` | Failed to fetch user profile |
| `unknown` | Unexpected server error |
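Client code that lands back on `/submit` can recover the error like this (a stdlib sketch; `parse_auth_error` is a hypothetical helper, not part of the API):

```python
from urllib.parse import urlparse, parse_qs

def parse_auth_error(redirect_url: str):
    """Extract (code, detail) from a /submit?auth_error=...&detail=... redirect.

    Returns None when the redirect carried no error.
    """
    qs = parse_qs(urlparse(redirect_url).query)
    if "auth_error" not in qs:
        return None
    return qs["auth_error"][0], qs.get("detail", [""])[0]
```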
Leaderboard Data
Fetches the latest leaderboard data from the private HuggingFace dataset repo. The server uses its own token internally — no authentication required from the caller. Results are cached in memory for 5 minutes.
Behavior
- Lists files in `leaderboard/versions/` (with retry: 3 attempts, exponential backoff)
- Picks the latest by filename timestamp (e.g., `bench_20260214T040553.json`)
- Concurrently fetches the leaderboard JSON and `capability_map.json` (taxonomy mapping)
- Transforms to `ModelEntry[]` format
- Returns cached data if less than 5 minutes old
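The "latest by filename timestamp" step relies on `bench_YYYYMMDDTHHMMSS.json` names sorting lexicographically in time order, so a plain `max()` suffices. A minimal sketch (`latest_version` is a hypothetical helper):

```python
def latest_version(filenames):
    """Pick the newest leaderboard snapshot by its embedded timestamp.

    bench_YYYYMMDDTHHMMSS.json names sort lexicographically in time order,
    so max() on the filename string is enough.
    """
    snapshots = [f for f in filenames if f.startswith("bench_") and f.endswith(".json")]
    return max(snapshots) if snapshots else None
```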
Success Response
```json
{
  "data": [
    {
      "name": "qwen2.5_vl_3b_instruct",
      "type": "instruction",
      "precision": "bfloat16",
      "scores": {
        "vsi_bench": 27.0,
        "mmsi_bench": 28.6,
        "site": 33.14
      }
    }
  ],
  "lastUpdated": "2026-02-14T04:05:53Z",
  "capabilityMap": {
    "vsi_bench": {
      "object_counting": ["cr"],
      "object_abs_distance": ["mm"]
    },
    "mmsi_bench": {
      "msr_accuracy": ["cr"]
    }
  }
}
```

Taxonomy Map (capabilityMap)
Maps each benchmark's sub-scores to spatial taxonomy categories. Fetched from `capability_map.json` in the HF dataset repo. Structure: `Record<benchmarkId, Record<subScoreKey, taxonomyLabels[]>>`.
- Each sub-score maps to an array of taxonomy labels (e.g., `["mm"]`, `["sr", "mm"]`)
- Empty array `[]` means no taxonomy mapping for that sub-score
- Only benchmarks with mappings appear in the map
- Taxonomy labels are dynamically derived from the data
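As an illustration, a benchmark's sub-scores can be folded into per-taxonomy averages using its `capabilityMap` entry (a sketch; `scores_by_taxonomy` is a hypothetical helper, not part of the API):

```python
def scores_by_taxonomy(sub_scores, capability_map):
    """Average a benchmark's sub-scores per taxonomy label.

    Sub-scores that are None (not evaluated) or that map to an empty
    label list are skipped.
    """
    buckets = {}  # label -> list of contributing scores
    for key, labels in capability_map.items():
        score = sub_scores.get(key)
        if score is None:
            continue
        for label in labels:
            buckets.setdefault(label, []).append(score)
    return {label: sum(v) / len(v) for label, v in buckets.items()}
```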
Error Response
Status 502:
```json
{
  "error": "Failed to load leaderboard data. Please try again later."
}
```

Submit Evaluation (Browser)
Used by the web form. Accepts a JSON payload with a Vercel Blob URL referencing the uploaded zip file. The zip is uploaded to Vercel Blob first (via /api/blob-upload/), then this endpoint fetches it, validates contents, and uploads to the HuggingFace dataset repository.
For scripts and programmatic access, use /api/submit-with-file/ instead (see below).
Submit Evaluation (Scripts)
Script-friendly endpoint for programmatic submissions. Accepts multipart/form-data with the JSON payload and zip file inline. Supports both regular HF API tokens (hf_*) and OAuth tokens — no browser flow needed.
Zip file size limit: 4.5 MB (Vercel serverless payload limit). For larger files, use the web form which uploads via Vercel Blob with no size limit.
Rate Limiting
Sliding window: 5 submissions per 2-hour window per user. Each timestamp expires independently. Tracked in-memory by server-verified username.
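A minimal in-memory sketch of this sliding window (illustrative only; the server's actual implementation may differ):

```python
import time

WINDOW_SECONDS = 2 * 60 * 60  # 2-hour window
MAX_SUBMISSIONS = 5

_history = {}  # username -> list of submission timestamps

def allow_submission(username, now=None):
    """Sliding-window limiter: each timestamp expires independently
    once it falls out of the 2-hour window."""
    now = time.time() if now is None else now
    recent = [t for t in _history.get(username, []) if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_SUBMISSIONS:
        _history[username] = recent
        return False  # would be a 429 response
    recent.append(now)
    _history[username] = recent
    return True
```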
Request Format
multipart/form-data with two parts:
| Part | Type | Required | Description |
|---|---|---|---|
| `payload` | JSON string | Yes | Submission metadata and scores (see fields below) |
| `zipFile` | File (`.zip`) | Yes | Zip archive containing evaluation result files (max 4.5 MB) |
Payload Fields
| Field | Type | Required | Description |
|---|---|---|---|
| `modelName` | string | Yes | HuggingFace model ID (`org/model`) |
| `modelType` | string | Yes | `pretrained` \| `finetuned` \| `instruction` \| `rl` |
| `precision` | string | Yes | `bfloat16` \| `float16` \| `float32` \| `int8` |
| `revision` | string | No | Model revision. Defaults to `"main"` |
| `weightType` | string | No | `Original` \| `Delta` \| `Adapter` |
| `baseModel` | string | Conditional | Required for Delta/Adapter weights |
| `backend` | string | Yes | `vlmevalkit` \| `lmmseval` \| `others` |
| `scores` | object | Yes | Benchmark ID → number or `null` |
| `subScores` | object | No | Benchmark ID → `{ sub_key → number or null }` |
| `remarks` | string | No | Free-text notes |
Zip File Requirements
The zip file must contain your raw evaluation result files for independent verification.
- Max size: 20 MB (max decompressed: 100 MB)
- Allowed file types: .json, .jsonl, .csv, .tsv, .txt, .log, .yaml, .yml, .xml, .md, .html, .pdf, .py, .sh, .png, .jpg, .jpeg, .gif, .svg, .parquet, .arrow, .npy, .pkl
- Security: Files with disallowed extensions or path traversal patterns are rejected
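To catch rejections before uploading, the same checks can be pre-flighted client-side. A sketch using the stdlib `zipfile` module (`check_zip` mirrors the rules above but is an assumption, not the server's code):

```python
import zipfile

ALLOWED_EXTENSIONS = {
    ".json", ".jsonl", ".csv", ".tsv", ".txt", ".log", ".yaml", ".yml",
    ".xml", ".md", ".html", ".pdf", ".py", ".sh", ".png", ".jpg", ".jpeg",
    ".gif", ".svg", ".parquet", ".arrow", ".npy", ".pkl",
}

def check_zip(path, max_decompressed=100 * 1024 * 1024):
    """Pre-flight the server's checks: path traversal, disallowed
    extensions, and total decompressed size. Returns a list of problems."""
    problems = []
    with zipfile.ZipFile(path) as zf:
        total = 0
        for info in zf.infolist():
            name = info.filename
            if name.endswith("/"):
                continue  # directory entry
            if name.startswith("/") or ".." in name.split("/"):
                problems.append(f"unsafe path: {name}")
            ext = "." + name.rsplit(".", 1)[-1].lower() if "." in name else ""
            if ext not in ALLOWED_EXTENSIONS:
                problems.append(f"disallowed type: {name}")
            total += info.file_size
        if total > max_decompressed:
            problems.append("decompressed size exceeds limit")
    return problems
```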
Sub-Scores
Each benchmark has a set of sub-score keys. The subScores field is optional — only include benchmarks where at least one sub-score is filled. Unfilled sub-scores within an included benchmark should be null. See the benchmark sub-score keys in the submit form for the full mapping.
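Building a compliant `subScores` object then amounts to dropping benchmarks where nothing is filled. A sketch (`prune_sub_scores` is a hypothetical helper):

```python
def prune_sub_scores(sub_scores):
    """Keep only benchmarks where at least one sub-score is filled.

    Unfilled keys within a kept benchmark stay as None, which
    serializes to null in the JSON payload.
    """
    return {
        bench: subs
        for bench, subs in sub_scores.items()
        if any(v is not None for v in subs.values())
    }
```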
Benchmark IDs
EASI-8 (core):
| ID | Name | Metric |
|---|---|---|
| `vsi_bench` | VSI-Bench | Acc. |
| `mmsi_bench` | MMSI-Bench | Acc. |
| `mindcube_tiny` | MindCube-Tiny | Acc. |
| `viewspatial` | ViewSpatial | Acc. |
| `site` | SITE | CAA |
| `blink` | BLINK | Acc. |
| `3dsrbench` | 3DSRBench | Acc. |
| `embspatial` | EmbSpatial | Acc. |
Additional:
| ID | Name | Metric |
|---|---|---|
| `mmsi_video_bench` | MMSI-Video-Bench | Acc. |
| `omnispatial_(manual_cot)` | OmniSpatial (Manual CoT) | Acc. |
| `spar_bench` | SPAR-Bench | Acc. |
| `vsi_debiased` | VSI-Debiased | Acc. |
A `null` value means not evaluated. A `0` value means evaluated with a score of zero.
Validation Pipeline
| Step | Check | Status |
|---|---|---|
| 1 | HF_UPLOAD_TOKEN configured | 500 |
| 2 | Bearer token valid (OAuth or regular HF token) | 401 |
| 3 | Rate limit not exceeded | 429 |
| 4 | Valid multipart form data | 400 |
| 5 | Payload JSON parseable | 400 |
| 6 | Zip file present | 400 |
| 7 | Zip valid (magic bytes, size, decompressed size, contents) | 400 |
| 8 | modelName matches org/model format | 400 |
| 9 | modelType non-empty | 400 |
| 10 | precision non-empty | 400 |
| 11 | Model exists on HuggingFace | 400 |
| 12 | Model has a license | 400 |
| 13 | Base model valid (Delta/Adapter) | 400 |
| 14 | Upload to HF repo via LFS succeeds | 500 |
Success Response
{ "success": true }Error Messages
| Status | Message |
|---|---|
| 500 | Server configuration error. Please contact the maintainers. |
| 401 | Your session has expired. Please sign in with HuggingFace again. |
| 429 | You've reached the submission limit (5 per 2 hours). |
| 400 | Invalid request. Expected multipart form data. |
| 400 | Missing submission payload. |
| 400 | Evaluation results zip file is required. |
| 400 | File is not a valid ZIP archive. |
| 400 | ZIP file exceeds 20 MB limit. |
| 400 | ZIP decompressed size exceeds 100 MB limit. |
| 400 | ZIP contains invalid file paths. |
| 400 | ZIP contains disallowed file types: {list} |
| 400 | A valid model name in the format 'organization/model-name' is required. |
| 400 | Model "{name}" was not found on HuggingFace. |
| 400 | Model "{name}" does not have a license set. |
| 400 | Base model is required when using {type} weights. |
| 500 | Failed to upload your submission to the repository. |
Usage Examples
These examples use /api/submit-with-file/ for script-based submissions. You can use a regular HuggingFace API token (from huggingface.co/settings/tokens) — no OAuth browser flow needed.
Python
```python
import requests
import json

# Use a regular HF API token (hf_*) — no OAuth needed
# Get one from: https://huggingface.co/settings/tokens
HF_TOKEN = "hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

payload = {
    "modelName": "org/model-name",
    "modelType": "instruction",  # pretrained | finetuned | instruction | rl
    "precision": "bfloat16",     # bfloat16 | float16 | float32 | int8
    "revision": "main",
    "weightType": "Original",    # Original | Delta | Adapter
    "baseModel": "",             # required for Delta/Adapter
    "backend": "vlmevalkit",     # vlmevalkit | lmmseval | others
    "scores": {
        "vsi_bench": 27.0,
        "mmsi_bench": 28.6,
        "blink": None,  # null = not evaluated
        "site": 0,      # 0 = evaluated with a score of zero
    },
    "subScores": {  # optional
        "vsi_bench": {
            "obj_appearance_order_accuracy": 25.3,
            "object_abs_distance": 18.7,
            "object_counting": None,
        },
    },
    "remarks": "Submitted via Python script",
}

# Submit with zip file (max 4.5 MB)
with open("eval_results.zip", "rb") as f:
    response = requests.post(
        "https://easi.lmms-lab.com/api/submit-with-file/",
        headers={"Authorization": f"Bearer {HF_TOKEN}"},
        files={"zipFile": ("eval_results.zip", f, "application/zip")},
        data={"payload": json.dumps(payload)},
    )

result = response.json()
if result.get("success"):
    print("Submission successful!")
else:
    print(f"Error ({response.status_code}): {result.get('error')}")
```

cURL
```bash
# Use a regular HF API token — no OAuth needed
curl -X POST https://easi.lmms-lab.com/api/submit-with-file/ \
  -H "Authorization: Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -F 'payload={
    "modelName": "org/model-name",
    "modelType": "instruction",
    "precision": "bfloat16",
    "revision": "main",
    "weightType": "Original",
    "baseModel": "",
    "backend": "vlmevalkit",
    "scores": {"vsi_bench": 27.0, "mmsi_bench": 28.6},
    "subScores": {"vsi_bench": {"object_counting": 30.2}},
    "remarks": "Submitted via cURL"
  }' \
  -F "zipFile=@eval_results.zip"

# Response: {"success": true}
```

Notes
- Zip size limit: 4.5 MB for script submissions. For larger files, use the web form at `/submit`
- Token type: both regular HF API tokens (`hf_*`) and OAuth tokens (`hf_oauth_*`) are accepted
- Rate limit: 5 submissions per 2-hour window per user
- Scores: use `null` for benchmarks not evaluated, `0` for a zero score