01 JavaScript rendering
Rendering is 5× more expensive than a plain fetch and 2–3 seconds slower. Turn it on only when you have to.
How to decide
- Fetch without rendering: curl "https://api.hypedata.io/v1/scrape?url=…"
- Search the HTML for the data you need. If it's there, you're done.
- If the page is mostly empty divs and lazy <script> bundles, enable render: true.
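The three steps above can be sketched as a fetch-then-fallback helper. This is a minimal sketch under assumptions, not Hypedata's client: scrape is an injected function standing in for whatever client call you use, the { html } response shape is assumed, and marker is a hypothetical string you expect to see once the data is present.

```javascript
// Sketch: try the cheap fetch first, and re-request with render: true
// only when the expected data is missing from the raw HTML.
// Assumption: `scrape` resolves to an object with an `html` string.
async function fetchWithRenderFallback(scrape, url, marker) {
  const plain = await scrape({ url });
  if (typeof plain.html === "string" && plain.html.includes(marker)) {
    return { rendered: false, html: plain.html };
  }
  // The data was not in the raw HTML, so pay for a rendered fetch.
  const rendered = await scrape({ url, render: true });
  return { rendered: true, html: rendered.html };
}
```

Recording the rendered flag per hostname lets you skip the first fetch on later requests to sites you already know need rendering.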
Wait strategies
For SPAs, using wait_for with a known DOM element is much more reliable than tuning wait_until:
{ "url": "…", "render": true, "wait_for": "[data-testid=product-price]" }
Lazy content
Many sites paginate or "show more" on scroll. The script parameter lets you trigger it before the parser sees the page:
{
"url": "https://news.example/feed",
"render": true,
"script": "for(let i=0;i<10;i++){window.scrollTo(0,document.body.scrollHeight); await new Promise(r=>setTimeout(r,800));}"
}
02 Proxies & geo
The proxy tier decision tree, in order:
- datacenter: try this first. Free, fast.
- Getting 403 or captcha walls? Add proxy_type: "residential". ~+1 credit per request, but success rates jump on retailers and travel.
- Still blocked? Try proxy_type: "mobile". Reserved for the hardest sites: bank self-service portals, ticketing, geo-strict streaming.
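The decision tree above can be sketched as an escalation loop. This is a sketch under assumptions: scrape is an injected stand-in for your client call, passing "datacenter" as an explicit proxy_type value is assumed, and the default block check only looks at an assumed status field for HTTP 403. Captcha walls need a site-specific check that you supply as isBlocked.

```javascript
// Tiers in the order given by the decision tree above.
const PROXY_TIERS = ["datacenter", "residential", "mobile"];

// Assumption: the response exposes an HTTP `status`. Captcha detection is
// site-specific, so pass your own predicate for that.
const looksBlocked = (res) => res.status === 403;

async function scrapeWithEscalation(scrape, request, isBlocked = looksBlocked) {
  let response;
  for (const proxy_type of PROXY_TIERS) {
    response = await scrape({ ...request, proxy_type });
    if (!isBlocked(response)) {
      return { proxy_type, blocked: false, response };
    }
  }
  // Every tier was blocked; hand back the last response for inspection.
  const lastTier = PROXY_TIERS[PROXY_TIERS.length - 1];
  return { proxy_type: lastTier, blocked: true, response };
}
```

Caching the tier that worked per hostname avoids paying for failed cheaper attempts on every request.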
Pin geo only if the target serves different content per region. Pinning costs nothing extra but limits the IP pool, which slightly increases the chance of getting flagged by anti-bot heuristics.
03 Sessions & cookies
Sessions are how you scrape behind a login or any flow that depends on prior requests.
// 1. authenticate
await hd.scrape({
  url: "https://target/login",
  method: "POST",
  body: "email=e@x&password=pw",
  headers: { "Content-Type": "application/x-www-form-urlencoded" },
  session: "session-42"
});

// 2. ride the auth state
const page = await hd.scrape({
  url: "https://target/account/orders",
  session: "session-42"
});
The session ID is opaque to Hypedata, but keep it secret if you don't want a teammate to inherit the same auth state. Sessions live for 10 minutes after last use.
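Because anyone who knows the ID inherits the auth state, a guessable name like "session-42" is best kept to examples. The sketch below generates an unguessable ID and tracks the 10-minute idle window locally, so you know when to log in again. The createSession helper and the "s-" prefix are hypothetical, and it assumes any string is accepted as a session ID.

```javascript
const { randomUUID } = require("node:crypto");

// From the text above: sessions live 10 minutes after last use.
const SESSION_TTL_MS = 10 * 60 * 1000;

// `now` is injectable so the idle window can be tested with a fake clock.
function createSession(now = Date.now) {
  const id = `s-${randomUUID()}`;
  let lastUsed = null;
  return {
    id,
    // Call after every request that used this session.
    touch() { lastUsed = now(); },
    // Local estimate only; the server remains the source of truth.
    isLikelyAlive() {
      return lastUsed !== null && now() - lastUsed < SESSION_TTL_MS;
    },
  };
}
```

Call touch() after each scrape that passes the session, and re-run the login request whenever isLikelyAlive() returns false.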
04 Pagination & crawl loops
Three common shapes:
- Page-number. ?page=N. Easy: increment until you get an empty page or a 404.
- Cursor. Each response contains a nextCursor. Follow it until empty.
- Infinite scroll. Use the script parameter with a scroll loop (see Rendering).
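The first two shapes can be sketched as small loops. This is a sketch under assumptions: fetchPage is a hypothetical function you write around your scrape call, and it is assumed to resolve to { status, items } for page-number sites and { items, nextCursor } for cursor sites. The maxPages guard is an addition that prevents a runaway loop if a site never returns an empty page.

```javascript
// Page-number shape: increment until an empty page or a 404.
async function crawlByPageNumber(fetchPage, maxPages = 1000) {
  const items = [];
  for (let page = 1; page <= maxPages; page++) {
    const res = await fetchPage(page);
    const pageItems = res.items ?? [];
    if (res.status === 404 || pageItems.length === 0) break;
    items.push(...pageItems);
  }
  return items;
}

// Cursor shape: follow nextCursor until it comes back empty.
async function crawlByCursor(fetchPage, maxPages = 1000) {
  const items = [];
  let cursor = null; // null requests the first page
  for (let i = 0; i < maxPages; i++) {
    const res = await fetchPage(cursor);
    items.push(...(res.items ?? []));
    cursor = res.nextCursor;
    if (!cursor) break;
  }
  return items;
}
```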
For large catalogs, use the Stream API rather than a sequential loop — you'll go ~16× faster and Hypedata handles backpressure for you. For 10,000+ URLs, use the Jobs API.
Always dedupe URLs before submitting. The cheapest scrape is the one you didn't need to do.
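A minimal dedupe sketch using the standard URL class. It treats two URLs as the same page when they differ only by fragment, host casing, or query-parameter order. That last rule is an assumption about the target site, so drop the sort if parameter order matters there.

```javascript
// Normalize a URL so trivially different spellings compare equal.
function normalizeUrl(raw) {
  const url = new URL(raw); // lowercases the host as part of parsing
  url.hash = "";            // fragments never reach the server
  url.searchParams.sort();  // assumption: parameter order is irrelevant
  return url.href;
}

// Return the unique normalized URLs, preserving first-seen order.
function dedupeUrls(urls) {
  return [...new Set(urls.map(normalizeUrl))];
}
```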
05 Anti-bot & CAPTCHAs
Hypedata's stealth: true default defeats most consumer-grade anti-bot systems. For the hard cases:
- Cloudflare Turnstile / I'm Under Attack. Add proxy_type: "residential" + render: true. Usually enough.
- DataDome. Residential + a sticky session (so the device fingerprint persists across the challenge/response).
- PerimeterX / HUMAN. Mobile proxy + low concurrency. Slow but reliable.
- hCaptcha / reCAPTCHA Enterprise. Out of scope — we don't bypass interactive captchas. Contact us if you have a licensed integration.
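The supported cases above can be transcribed into a lookup table. This is a sketch, not a Hypedata feature: the vendor keys and the optionsFor helper are hypothetical, maxConcurrency is a client-side hint rather than a request field, and reading "low concurrency" as 1 is an assumption. You still need to work out which vendor protects the site.

```javascript
// Request options per anti-bot vendor, transcribed from the list above.
const PLAYBOOK = {
  cloudflare: { request: { proxy_type: "residential", render: true } },
  datadome: { request: { proxy_type: "residential" }, stickySession: true },
  perimeterx: { request: { proxy_type: "mobile" }, maxConcurrency: 1 },
};

function optionsFor(vendor, url, sessionId) {
  if (!Object.prototype.hasOwnProperty.call(PLAYBOOK, vendor)) {
    throw new Error(`No playbook for vendor: ${vendor}`);
  }
  const entry = PLAYBOOK[vendor];
  const request = { url, ...entry.request };
  // A sticky session keeps the fingerprint stable across the challenge.
  if (entry.stickySession) request.session = sessionId;
  return { request, maxConcurrency: entry.maxConcurrency ?? null };
}
```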
If you keep hitting walls on the same target, email support@hypelabs.llc with a sample trace_id. We often have pre-built playbooks we can enable for your workspace.
06 Cost optimization
Five levers, ranked by typical impact:
- Disable rendering when you can. ~80% of pages don't need it.
- Use cache: "hit" for warm reads. Free, near-zero latency. Combine with a TTL on the write side.
- Let plan-cache pay for itself. Reuse the same extract schema across many URLs of the same hostname; after the first call, AI parser drops to 3 credits.
- Set html: false when you only need data. Saves bandwidth, not credits, but reduces your storage cost.
- Batch with the Stream API. Same credits, way less wall time.
The dashboard's Insights tab surfaces the cheapest opportunity — usually "you've rendered 12,400 pages that didn't need it this week".
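To put a number on that example, here is a rough estimate of what unneeded rendering costs. It reads "5× more expensive" from the rendering section as five times the price of a plain fetch. The plain-fetch price is not stated in this guide, so the 1-credit default is an assumption; pass your plan's real figure.

```javascript
// From the rendering section: a rendered fetch costs 5× a plain fetch.
const RENDER_MULTIPLIER = 5;

// Credits spent on rendering beyond what plain fetches would have cost.
// Assumption: plainFetchCredits defaults to 1; substitute your plan's price.
function wastedRenderCredits(unneededRenders, plainFetchCredits = 1) {
  return unneededRenders * (RENDER_MULTIPLIER - 1) * plainFetchCredits;
}

// 12,400 unneeded renders × 4 extra credits each = 49,600 credits.
console.log(wastedRenderCredits(12400)); // 49600 under the 1-credit assumption
```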