API Reference

Programmatic access to the EASI Leaderboard — fetch results, submit evaluations, and authenticate via HuggingFace OAuth.

| Endpoint | Method | Auth | Description |
| --- | --- | --- | --- |
| `/api/auth/callback` | GET | None | HuggingFace OAuth callback |
| `/api/leaderboard` | GET | None (public) | Fetch latest leaderboard data |
| `/api/submit` | POST | Bearer token | Submit evaluation (browser flow, Vercel Blob) |
| `/api/submit-with-file` | POST | Bearer token | Submit evaluation (scripts, inline zip) |
| `/api/blob-upload` | POST | Bearer token | Vercel Blob token exchange (browser only) |

Authentication

Authentication uses HuggingFace OAuth 2.0 (authorization code flow). The API uses Bearer token authentication — the same approach works for both the web form and programmatic scripts.

Browser Flow

  1. Client redirects to HuggingFace authorize URL with client_id, redirect_uri, scope=openid profile
  2. User authorizes on HuggingFace
  3. HuggingFace redirects to /api/auth/callback?code=...
  4. Server exchanges code for access token, fetches user info
  5. Server redirects to /submit?hf_user=<base64url> with user data + access token
  6. Client stores token in localStorage and sends as Authorization: Bearer on submit
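
The first step above can be sketched in Python. The HuggingFace authorize endpoint and the parameter names follow the flow described here; `CLIENT_ID` and the redirect URI are placeholder values you would replace with your own OAuth app registration, not constants of this API:

```python
from urllib.parse import urlencode

# Placeholder values — substitute your registered OAuth app's settings.
CLIENT_ID = "your-client-id"
REDIRECT_URI = "https://easi.lmms-lab.com/api/auth/callback"

def build_authorize_url(state: str) -> str:
    """Construct the HuggingFace OAuth authorize URL (step 1 above)."""
    params = {
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": "openid profile",
        "response_type": "code",
        "state": state,  # CSRF protection; echoed back to the callback
    }
    return "https://huggingface.co/oauth/authorize?" + urlencode(params)

print(build_authorize_url("random-csrf-token"))
```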

Script Flow

  1. Obtain an HF token — either a regular API token (hf_*) from huggingface.co/settings/tokens or an OAuth token
  2. Call POST /api/submit-with-file/ with Authorization: Bearer <token> and the zip file as multipart
  3. Server verifies identity via HuggingFace (/oauth/userinfo for OAuth tokens, /api/whoami for regular tokens)
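
As an illustration of step 3, a helper that picks the verification endpoint by token prefix (an illustrative sketch of the routing described above, not actual server code):

```python
HF_BASE = "https://huggingface.co"

def verification_endpoint(token: str) -> str:
    """Return the HuggingFace endpoint used to verify a token's identity:
    OAuth tokens via /oauth/userinfo, regular API tokens via /api/whoami."""
    if token.startswith("hf_oauth_"):
        return f"{HF_BASE}/oauth/userinfo"
    return f"{HF_BASE}/api/whoami"
```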

Token Details

  • Accepted tokens: Both regular HF API tokens (hf_*) and OAuth tokens (hf_oauth_*)
  • What it proves: The identity of the user (username, profile)
  • What it does NOT grant: Write access to the EASI dataset repository
  • For scripts: Use a regular HF API token from huggingface.co/settings/tokens — no browser OAuth flow needed
  • Expiry: Determined by HuggingFace. Submit returns 401 when expired.

OAuth Callback

GET /api/auth/callback

Not called directly — HuggingFace redirects here after user authorizes. Exchanges code for token with retry (2 attempts, 10s timeout).

Query Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `code` | string | Authorization code from HuggingFace |
| `state` | string | CSRF state parameter |

Error Codes

On failure, redirects to /submit?auth_error=<code>&detail=<message>.

| Code | Cause |
| --- | --- |
| `missing_code` | No `code` parameter in callback URL |
| `server_config` | Missing `HF_CLIENT_ID` or `HF_CLIENT_SECRET` |
| `token_exchange` | HuggingFace rejected the authorization code |
| `userinfo` | Failed to fetch user profile |
| `unknown` | Unexpected server error |
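
If you drive this flow programmatically (for example in integration tests), the failure redirect can be parsed with the standard library. This helper is illustrative, not part of the API:

```python
from urllib.parse import urlparse, parse_qs

def extract_auth_error(redirect_url: str):
    """Pull the auth_error code and detail out of the failure redirect,
    or return None when the redirect carries no error."""
    qs = parse_qs(urlparse(redirect_url).query)
    if "auth_error" not in qs:
        return None
    return qs["auth_error"][0], qs.get("detail", [""])[0]

print(extract_auth_error(
    "https://easi.lmms-lab.com/submit?auth_error=token_exchange&detail=invalid%20code"
))
```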

Leaderboard Data

GET /api/leaderboard (public)

Fetches the latest leaderboard data from the private HuggingFace dataset repo. The server uses its own token internally — no authentication required from the caller. Results are cached in memory for 5 minutes.

Behavior

  1. Lists files in leaderboard/versions/ (with retry: 3 attempts, exponential backoff)
  2. Picks the latest by filename timestamp (e.g., bench_20260214T040553.json)
  3. Concurrently fetches leaderboard JSON and capability_map.json (taxonomy mapping)
  4. Transforms to ModelEntry[] format
  5. Returns cached data if less than 5 minutes old
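
The selection in step 2 can be sketched as follows. The server's actual implementation is not published, so treat this only as an illustration of the filename-timestamp ordering (timestamps in `bench_YYYYMMDDTHHMMSS.json` sort lexicographically):

```python
import re

def latest_version_file(filenames):
    """Pick the newest leaderboard snapshot by the timestamp embedded
    in the filename, e.g. bench_20260214T040553.json."""
    pattern = re.compile(r"bench_(\d{8}T\d{6})\.json$")
    dated = [(m.group(1), name) for name in filenames
             if (m := pattern.search(name))]
    return max(dated)[1] if dated else None

files = [
    "leaderboard/versions/bench_20260101T120000.json",
    "leaderboard/versions/bench_20260214T040553.json",
]
print(latest_version_file(files))
```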

Success Response

```json
{
  "data": [
    {
      "name": "qwen2.5_vl_3b_instruct",
      "type": "instruction",
      "precision": "bfloat16",
      "scores": {
        "vsi_bench": 27.0,
        "mmsi_bench": 28.6,
        "site": 33.14
      }
    }
  ],
  "lastUpdated": "2026-02-14T04:05:53Z",
  "capabilityMap": {
    "vsi_bench": {
      "object_counting": ["cr"],
      "object_abs_distance": ["mm"]
    },
    "mmsi_bench": {
      "msr_accuracy": ["cr"]
    }
  }
}
```

Taxonomy Map (capabilityMap)

Maps each benchmark's sub-scores to spatial taxonomy categories. Fetched from capability_map.json in the HF dataset repo. Structure: Record<benchmarkId, Record<subScoreKey, taxonomyLabels[]>>.

  • Each sub-score maps to an array of taxonomy labels (e.g., ["mm"], ["sr", "mm"])
  • Empty array [] means no taxonomy mapping for that sub-score
  • Only benchmarks with mappings appear in the map
  • Taxonomy labels are dynamically derived from the data
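
A consumer of the map can collect the taxonomy labels a benchmark touches by walking its sub-score entries. A minimal sketch, using the sample `capabilityMap` from the response above:

```python
def taxonomy_labels_for(capability_map: dict, benchmark: str) -> set:
    """Collect every taxonomy label used by a benchmark's sub-scores."""
    labels = set()
    for sub_labels in capability_map.get(benchmark, {}).values():
        labels.update(sub_labels)  # empty lists contribute nothing
    return labels

capability_map = {
    "vsi_bench": {"object_counting": ["cr"], "object_abs_distance": ["mm"]},
    "mmsi_bench": {"msr_accuracy": ["cr"]},
}
print(taxonomy_labels_for(capability_map, "vsi_bench"))  # {'cr', 'mm'}
```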

Error Response

Status 502:

```json
{
  "error": "Failed to load leaderboard data. Please try again later."
}
```

Submit Evaluation (Browser)

POST /api/submit (Bearer token)

Used by the web form. Accepts a JSON payload with a Vercel Blob URL referencing the uploaded zip file. The zip is uploaded to Vercel Blob first (via /api/blob-upload/), then this endpoint fetches it, validates contents, and uploads to the HuggingFace dataset repository.

For scripts and programmatic access, use /api/submit-with-file/ instead (see below).

Submit Evaluation (Scripts)

POST /api/submit-with-file (Bearer token)

Script-friendly endpoint for programmatic submissions. Accepts multipart/form-data with the JSON payload and zip file inline. Supports both regular HF API tokens (hf_*) and OAuth tokens — no browser flow needed.

Zip file size limit: 4.5 MB (Vercel serverless payload limit). For larger files, use the web form, which uploads via Vercel Blob and is not subject to the 4.5 MB request limit (the 20 MB zip cap described below still applies).

Rate Limiting

Sliding window: 5 submissions per 2-hour window per user. Each timestamp expires independently. Tracked in-memory by server-verified username.
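
The server's in-memory implementation is not published; the sketch below illustrates the same sliding-window semantics, where each timestamp expires independently rather than the whole window resetting at once:

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 2 * 60 * 60  # 2-hour sliding window
MAX_SUBMISSIONS = 5

_timestamps = defaultdict(list)  # username -> submission times

def allow_submission(username: str, now: float = None) -> bool:
    """Return True and record the attempt if the user is under the limit."""
    now = time.time() if now is None else now
    # Drop timestamps that have aged out of the window.
    recent = [t for t in _timestamps[username] if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_SUBMISSIONS:
        _timestamps[username] = recent
        return False
    recent.append(now)
    _timestamps[username] = recent
    return True
```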

Request Format

multipart/form-data with two parts:

| Part | Type | Required | Description |
| --- | --- | --- | --- |
| `payload` | JSON string | Yes | Submission metadata and scores (see fields below) |
| `zipFile` | File (.zip) | Yes | Zip archive containing evaluation result files (max 4.5 MB) |

Payload Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `modelName` | string | Yes | HuggingFace model ID (`org/model`) |
| `modelType` | string | Yes | `pretrained` \| `finetuned` \| `instruction` \| `rl` |
| `precision` | string | Yes | `bfloat16` \| `float16` \| `float32` \| `int8` |
| `revision` | string | No | Model revision. Defaults to `"main"` |
| `weightType` | string | No | `Original` \| `Delta` \| `Adapter` |
| `baseModel` | string | Conditional | Required for Delta/Adapter weights |
| `backend` | string | Yes | `vlmevalkit` \| `lmmseval` \| `others` |
| `scores` | object | Yes | Benchmark ID → number or null |
| `subScores` | object | No | Benchmark ID → { sub_key → number or null } |
| `remarks` | string | No | Free-text notes |

Zip File Requirements

The zip file must contain your raw evaluation result files for independent verification.

  • Max size: 20 MB (max decompressed: 100 MB); submissions via /api/submit-with-file/ are further capped at 4.5 MB by the Vercel payload limit
  • Allowed file types: .json, .jsonl, .csv, .tsv, .txt, .log, .yaml, .yml, .xml, .md, .html, .pdf, .py, .sh, .png, .jpg, .jpeg, .gif, .svg, .parquet, .arrow, .npy, .pkl
  • Security: Files with disallowed extensions or path traversal patterns are rejected

Sub-Scores

Each benchmark has a set of sub-score keys. The subScores field is optional — only include benchmarks where at least one sub-score is filled. Unfilled sub-scores within an included benchmark should be null. See the benchmark sub-score keys in the submit form for the full mapping.

Benchmark IDs

EASI-8 (core):

| ID | Name | Metric |
| --- | --- | --- |
| `vsi_bench` | VSI-Bench | Acc. |
| `mmsi_bench` | MMSI-Bench | Acc. |
| `mindcube_tiny` | MindCube-Tiny | Acc. |
| `viewspatial` | ViewSpatial | Acc. |
| `site` | SITE | CAA |
| `blink` | BLINK | Acc. |
| `3dsrbench` | 3DSRBench | Acc. |
| `embspatial` | EmbSpatial | Acc. |

Additional:

| ID | Name | Metric |
| --- | --- | --- |
| `mmsi_video_bench` | MMSI-Video-Bench | Acc. |
| `omnispatial_(manual_cot)` | OmniSpatial (Manual CoT) | Acc. |
| `spar_bench` | SPAR-Bench | Acc. |
| `vsi_debiased` | VSI-Debiased | Acc. |

A null value means not evaluated. A 0 value means evaluated with a score of zero.
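
This distinction matters when aggregating: a null entry must be skipped, while a 0 counts toward the average. A minimal illustration (not part of the API itself):

```python
def mean_of_evaluated(scores: dict):
    """Average only benchmarks that were evaluated; None (null) entries
    are skipped, but a 0 score counts toward the mean."""
    evaluated = [v for v in scores.values() if v is not None]
    return sum(evaluated) / len(evaluated) if evaluated else None

scores = {"vsi_bench": 27.0, "mmsi_bench": 28.6, "blink": None, "site": 0}
print(mean_of_evaluated(scores))  # (27.0 + 28.6 + 0) / 3
```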

Validation Pipeline

| Step | Check | Status on failure |
| --- | --- | --- |
| 1 | `HF_UPLOAD_TOKEN` configured | 500 |
| 2 | Bearer token valid (OAuth or regular HF token) | 401 |
| 3 | Rate limit not exceeded | 429 |
| 4 | Valid multipart form data | 400 |
| 5 | Payload JSON parseable | 400 |
| 6 | Zip file present | 400 |
| 7 | Zip valid (magic bytes, size, decompressed size, contents) | 400 |
| 8 | `modelName` matches `org/model` format | 400 |
| 9 | `modelType` non-empty | 400 |
| 10 | `precision` non-empty | 400 |
| 11 | Model exists on HuggingFace | 400 |
| 12 | Model has a license | 400 |
| 13 | Base model valid (Delta/Adapter) | 400 |
| 14 | Upload to HF repo via LFS succeeds | 500 |
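
Step 8 can be pre-checked on the client before uploading anything. The server's exact pattern is not published, so the regex below is an assumed approximation of the `org/model` format:

```python
import re

# Assumed approximation of the server's model-name check, not the actual regex.
MODEL_NAME_RE = re.compile(r"^[\w.-]+/[\w.-]+$")

def looks_like_model_id(name: str) -> bool:
    """Rough client-side pre-check of the org/model format (step 8)."""
    return bool(MODEL_NAME_RE.fullmatch(name))
```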

Success Response

```json
{ "success": true }
```

Error Messages

| Status | Message |
| --- | --- |
| 500 | Server configuration error. Please contact the maintainers. |
| 401 | Your session has expired. Please sign in with HuggingFace again. |
| 429 | You've reached the submission limit (5 per 2 hours). |
| 400 | Invalid request. Expected multipart form data. |
| 400 | Missing submission payload. |
| 400 | Evaluation results zip file is required. |
| 400 | File is not a valid ZIP archive. |
| 400 | ZIP file exceeds 20 MB limit. |
| 400 | ZIP decompressed size exceeds 100 MB limit. |
| 400 | ZIP contains invalid file paths. |
| 400 | ZIP contains disallowed file types: {list} |
| 400 | A valid model name in the format 'organization/model-name' is required. |
| 400 | Model "{name}" was not found on HuggingFace. |
| 400 | Model "{name}" does not have a license set. |
| 400 | Base model is required when using {type} weights. |
| 500 | Failed to upload your submission to the repository. |

Usage Examples

These examples use /api/submit-with-file/ for script-based submissions. You can use a regular HuggingFace API token (from huggingface.co/settings/tokens) — no OAuth browser flow needed.

Python

```python
import requests
import json

# Use a regular HF API token (hf_*) — no OAuth needed
# Get one from: https://huggingface.co/settings/tokens
HF_TOKEN = "hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

payload = {
    "modelName": "org/model-name",
    "modelType": "instruction",       # pretrained | finetuned | instruction | rl
    "precision": "bfloat16",          # bfloat16 | float16 | float32 | int8
    "revision": "main",
    "weightType": "Original",         # Original | Delta | Adapter
    "baseModel": "",                  # required for Delta/Adapter
    "backend": "vlmevalkit",          # vlmevalkit | lmmseval | others
    "scores": {
        "vsi_bench": 27.0,
        "mmsi_bench": 28.6,
        "blink": None,                # null = not evaluated
        "site": 0,                    # 0 = evaluated with score of zero
    },
    "subScores": {                    # optional
        "vsi_bench": {
            "obj_appearance_order_accuracy": 25.3,
            "object_abs_distance": 18.7,
            "object_counting": None,
        },
    },
    "remarks": "Submitted via Python script",
}

# Submit with zip file (max 4.5 MB); the context manager closes the file handle
with open("eval_results.zip", "rb") as f:
    response = requests.post(
        "https://easi.lmms-lab.com/api/submit-with-file/",
        headers={"Authorization": f"Bearer {HF_TOKEN}"},
        files={"zipFile": ("eval_results.zip", f, "application/zip")},
        data={"payload": json.dumps(payload)},
    )

result = response.json()
if result.get("success"):
    print("Submission successful!")
else:
    print(f"Error ({response.status_code}): {result.get('error')}")
```

cURL

```bash
# Use a regular HF API token — no OAuth needed
curl -X POST https://easi.lmms-lab.com/api/submit-with-file/ \
  -H "Authorization: Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" \
  -F 'payload={
    "modelName": "org/model-name",
    "modelType": "instruction",
    "precision": "bfloat16",
    "revision": "main",
    "weightType": "Original",
    "baseModel": "",
    "backend": "vlmevalkit",
    "scores": {"vsi_bench": 27.0, "mmsi_bench": 28.6},
    "subScores": {"vsi_bench": {"object_counting": 30.2}},
    "remarks": "Submitted via cURL"
  }' \
  -F "zipFile=@eval_results.zip"

# Response: {"success": true}
```

Notes

  • Zip size limit: 4.5 MB for script submissions. For larger files, use the web form at /submit
  • Token type: Both regular HF API tokens (hf_*) and OAuth tokens (hf_oauth_*) are accepted
  • Rate limit: 5 submissions per 2-hour window per user
  • Scores: Use null for benchmarks not evaluated, 0 for zero score