
Jobs & batches.

For everything north of 10,000 URLs — or when your code can't hold a long-lived connection — submit a batch job. Hypedata runs it on its own schedule, calls a webhook (or you poll), and ships you a downloadable result file.

Endpoint: POST /v1/jobs
Max URLs / job: 1,000,000
Output: NDJSON · CSV · Parquet
Delivery: Webhook or signed URL
Contents
  01 Create a job
  02 Input formats
  03 Poll status
  04 Output formats
  05 List · cancel · retry
  06 Limits

01 Create a job

POST https://api.hypedata.io/v1/jobs
{
  "name": "nightly-catalog-2026-05-12",
  "input": { "upload_id": "upl_8K2nB7" },     // or "urls": [...] inline
  "defaults": {
    "render": true,
    "proxy_type": "residential",
    "extract": { "name": "string", "price": "number" }
  },
  "concurrency": 32,
  "output": { "format": "ndjson", "gzip": true },
  "webhook": "https://your-app.com/hooks/jobs"
}
202 Accepted · queued
{
  "id": "job_3F2D1A77B0E1",
  "status": "queued",
  "urls_total": 128400,
  "eta_s": 2700,
  "created_at": "2026-05-12T22:00:14Z"
}
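The request body above can be assembled in a few lines of client code. A minimal Python sketch, using the same field values as the example; the `build_job_payload` helper is illustrative, not part of any SDK:

```python
import json

API_BASE = "https://api.hypedata.io/v1"  # endpoint from the docs above

def build_job_payload(name, upload_id, extract, concurrency=32,
                      output_format="ndjson", webhook=None):
    """Assemble the POST /v1/jobs body shown above.

    `upload_id` comes from a prior POST /v1/uploads; for small ad-hoc
    batches you would pass inline "urls" instead.
    """
    payload = {
        "name": name,
        "input": {"upload_id": upload_id},
        "defaults": {
            "render": True,
            "proxy_type": "residential",
            "extract": extract,
        },
        "concurrency": concurrency,
        "output": {"format": output_format, "gzip": True},
    }
    if webhook:
        payload["webhook"] = webhook
    return payload

payload = build_job_payload(
    "nightly-catalog-2026-05-12",
    upload_id="upl_8K2nB7",
    extract={"name": "string", "price": "number"},
    webhook="https://your-app.com/hooks/jobs",
)
print(json.dumps(payload, indent=2))
```

POST the resulting dict as JSON with your HTTP client of choice; the 202 response carries the `job_…` id you will use for polling.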

02 Input formats

Three ways to deliver URLs:

  • Inline. "urls": ["https://…", …] up to 1,000 URLs. Great for ad-hoc runs.
  • Upload. POST /v1/uploads with an NDJSON or CSV file (1 GB max), then pass the returned upload_id.
  • S3 / GCS. "input": { "s3_uri": "s3://bucket/path.ndjson", "role_arn": "…" }. We assume your role and stream the file.

Per-URL overrides are supported — supply each line as a JSON object with at minimum "url", and any subset of Scrape parameters to override defaults.
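An NDJSON input file with per-URL overrides might look like this. A sketch only: the override keys below (`render`, `proxy_type`) mirror the Scrape parameters used in `defaults` above, and the URLs are placeholders:

```python
import json

# Each input line is a JSON object with at least "url"; any other keys
# override the job's "defaults" for that URL alone.
rows = [
    {"url": "https://example.com/a"},                            # job defaults
    {"url": "https://example.com/b", "render": False},           # skip rendering
    {"url": "https://example.com/c", "proxy_type": "datacenter"} # cheaper proxy
]

# One JSON object per line, newline-delimited — ready for POST /v1/uploads.
ndjson = "\n".join(json.dumps(r) for r in rows)
print(ndjson)
```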

03 Poll status

GET /v1/jobs/{id}
{
  "id": "job_3F2D1A77B0E1",
  "status": "running",        // queued | running | completed | cancelled | failed
  "urls_total": 128400,
  "urls_done": 48217,
  "urls_errored": 312,
  "credits_used": 293482,
  "eta_s": 1820,
  "download_url": null,         // present once status=completed
  "download_url_expires_at": null
}

Prefer the job.completed webhook over polling — it's more accurate, lower-latency, and free.
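If you do poll, a loop like the following is enough. This sketch injects the HTTP call as a plain `fetch` callable (an assumption for testability, not an SDK interface); in real code `fetch` would issue the GET /v1/jobs/{id} above and return the parsed body:

```python
import time

def wait_for_job(fetch, poll_s=15, sleep=time.sleep):
    """Poll until the job reaches a terminal status.

    `fetch` returns the parsed status JSON shown above; `sleep` is
    injectable so the loop can be exercised without real delays.
    """
    terminal = {"completed", "cancelled", "failed"}
    while True:
        job = fetch()
        if job["status"] in terminal:
            return job
        sleep(poll_s)

# Stub standing in for successive API responses:
responses = iter([
    {"status": "running", "urls_done": 48217},
    {"status": "completed", "download_url": "https://…"},
])
done = wait_for_job(lambda: next(responses), sleep=lambda s: None)
print(done["status"])   # completed
```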

04 Output formats

  • ndjson (default) — one JSON line per URL. Streaming-friendly.
  • csv — flat CSV with the extracted fields as columns. Requires an extract schema in defaults.
  • parquet — Apache Parquet, compressed (zstd by default). Same column rules as CSV.

The download URL is a signed S3 link valid for 24 hours by default (configurable up to 30 days). Failed URLs are included in the output with "status": "error".
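Consuming a gzipped NDJSON result is a streaming one-liner per row. A sketch with a simulated file; the exact row shape beyond `"status": "error"` (here `data` and `error` keys) is an assumption, so adapt it to your extract schema:

```python
import gzip
import io
import json

# Simulated gzipped NDJSON result; real files come from download_url.
raw = b"\n".join(json.dumps(r).encode() for r in [
    {"url": "https://example.com/a", "status": "ok", "data": {"price": 9.99}},
    {"url": "https://example.com/b", "status": "error", "error": "timeout"},
])
blob = gzip.compress(raw)

# Stream line by line, splitting successes from failures.
ok, errored = [], []
with gzip.open(io.BytesIO(blob), "rt") as fh:
    for line in fh:
        row = json.loads(line)
        (errored if row["status"] == "error" else ok).append(row)

print(len(ok), len(errored))   # 1 1
```

The errored rows are exactly what a retry job (below) would re-run.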

05 List · cancel · retry

GET /v1/jobs?status=running&limit=20
DELETE /v1/jobs/{id}
POST /v1/jobs/{id}/retry

Cancellation is graceful — in-flight URLs finish, queued ones are skipped, the partial output becomes available. Retry produces a new job containing only the URLs that errored in the original.
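Whether a completed job is worth retrying can be decided from the status payload alone. A hypothetical heuristic (the 0.1% threshold is an arbitrary example, not a product recommendation):

```python
def should_retry(job, max_error_rate=0.001):
    """Retry a completed job only if a nontrivial share of URLs errored.

    Uses the urls_total / urls_errored fields from GET /v1/jobs/{id}.
    """
    if job["status"] != "completed" or job["urls_errored"] == 0:
        return False
    return job["urls_errored"] / job["urls_total"] > max_error_rate

# 312 errors out of 128,400 URLs ≈ 0.24% — above the 0.1% threshold.
print(should_retry({"status": "completed",
                    "urls_total": 128400,
                    "urls_errored": 312}))   # True
```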

06 Limits

  • 1,000,000 URLs per job. Need more? Chain jobs from a webhook.
  • Maximum job lifetime: 24 hours. Jobs still running at the deadline are auto-cancelled; everything completed up to that point is retained in the output.
  • Concurrency per job: 256 (subject to plan concurrency cap).
  • Maximum result file size: 50 GB (gzipped). Larger jobs are split into multi-part files.
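Chaining jobs past the 1,000,000-URL cap means submitting the next chunk from your job.completed handler. A sketch of that handler logic; the webhook payload shape (`event`, `job`) and the part-numbered naming scheme are assumptions, so match them to the actual body documented on the Webhooks page:

```python
def on_webhook(payload, submit_job, chunks):
    """When chunk N completes, submit chunk N+1.

    `submit_job(name, urls)` stands in for a real POST /v1/jobs call;
    `chunks` is the full list of URL batches, one job each.
    """
    if payload.get("event") != "job.completed":
        return None
    part = int(payload["job"]["name"].rsplit("-", 1)[1])  # "catalog-part-1" -> 1
    if part < len(chunks):                                # more chunks remain
        return submit_job(f"catalog-part-{part + 1}", chunks[part])
    return None

submitted = []
on_webhook(
    {"event": "job.completed", "job": {"name": "catalog-part-1"}},
    submit_job=lambda name, urls: submitted.append((name, urls)) or name,
    chunks=[["https://example.com/1"], ["https://example.com/2"]],
)
print(submitted[0][0])   # catalog-part-2
```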
hypedata. SHERIDAN, WY · EST. 2024
HYPELABS, LLC · v2.4.0
© 2026 HYPELABS, LLC · EIN 35-2851293 · SHERIDAN, WY