DevOps & Platform
10 min read · March 11, 2026

Cloudflare Browser Rendering Crawl API: Build a Scalable Web Crawler Without Managing Browsers

Modern websites rely heavily on JavaScript frameworks. Cloudflare Browser Rendering's new /crawl endpoint (Open Beta) lets you crawl entire websites with a single API call, returning HTML, Markdown, or structured JSON.

Ajeet Yadav
Platform & Cloud Engineer

Modern websites rely heavily on JavaScript frameworks like React, Vue, and Next.js. Because of this, traditional crawlers that only fetch raw HTML often fail to capture the real content users see.

To solve this problem, Cloudflare Browser Rendering provides a headless browser environment running on Cloudflare’s global edge network. Developers can use it to render JavaScript-heavy pages and extract data using simple REST APIs.

Latest Update: The New /crawl Endpoint (Open Beta)

As of March 2026, Cloudflare offers a /crawl endpoint that lets you crawl an entire website with a single API call. Unlike previous methods, where you had to manage the crawling logic manually, this endpoint automates discovery, rendering, and content extraction.


What is Cloudflare Browser Rendering?

Cloudflare Browser Rendering is a platform that allows developers to run headless Chromium browsers at the edge.

Instead of maintaining your own browser infrastructure using tools like Puppeteer, Playwright, or Selenium, you can use Cloudflare’s managed browser environment to:

  • Render dynamic webpages
  • Extract content (HTML, Markdown, or JSON)
  • Capture screenshots
  • Generate PDFs
  • Crawl entire domains automatically

The browsers run inside Cloudflare’s edge network, meaning they are globally distributed and highly scalable.
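As a concrete sketch of how the managed environment is driven, the helper below assembles a request to the single-page /content endpoint. The account ID and API token are placeholders, and the body shown carries only the required `url` field; consult the official API reference for the full set of options.

```python
API_BASE = "https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering"

def build_content_request(account_id: str, target_url: str) -> tuple[str, dict]:
    """Return the endpoint URL and JSON body for a single-page /content render."""
    endpoint = API_BASE.format(account_id=account_id) + "/content"
    return endpoint, {"url": target_url}
```

Send the resulting URL and body with any HTTP client, adding an `Authorization: Bearer <API_TOKEN>` header, and the response contains the fully rendered HTML.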


Why Traditional Crawlers Fail on Modern Websites

Traditional crawlers only fetch the initial HTML response. However, modern frameworks like React, Next.js, and Vue.js often render content after JavaScript execution.

Example of raw HTML response:

```html
<div id="root"></div>
```

Cloudflare Browser Rendering solves this by loading pages in a real browser environment, executing JavaScript, and returning the fully rendered DOM.


Cloudflare Crawl API Overview

Cloudflare provides multiple REST API endpoints for building crawlers:

| Endpoint | Purpose |
| --- | --- |
| `/crawl` | **New!** Crawl an entire site asynchronously (Open Beta) |
| `/content` | Get fully rendered HTML for a single page |
| `/links` | Extract all links from a single page |
| `/scrape` | Extract structured data from a single page |
| `/screenshot` | Capture webpage screenshots |
| `/pdf` | Generate PDFs from pages |

Deep Dive: The New /crawl Endpoint

The new /crawl endpoint is a game-changer for data pipelines and AI model training. It runs asynchronously, meaning you submit a crawl job and poll for results.

Key Features of /crawl

  • Multiple Output Formats: Return content as HTML, Markdown (perfect for LLMs), or structured JSON (powered by Workers AI).
  • Control Scope: Configure crawl depth, page limits, and use wildcards to include/exclude specific paths.
  • Incremental Crawling: Use modifiedSince and maxAge to skip unchanged pages, significantly reducing costs.
  • Static Mode: Set render: false to fetch static assets without spinning up a browser instance.
  • Smart Discovery: Automatically discovers URLs from sitemaps, recursive links, or both.

Example: Initiating a Crawl Job

```bash
# Initiate a crawl
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl' \
  -H 'Authorization: Bearer <API_TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{ "url": "https://example.com/", "depth": 2 }'

# Check results using the job ID
curl -X GET 'https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl/{job_id}' \
  -H 'Authorization: Bearer <API_TOKEN>'
```
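Because the endpoint is asynchronous, client code typically submits the job and then polls until it finishes. The loop below keeps the HTTP layer injectable so it can be tested without network access; the terminal status values ("completed", "failed") are illustrative assumptions, not confirmed response fields.

```python
import time
from typing import Callable

def poll_crawl_job(
    job_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 5.0,
    max_attempts: int = 60,
) -> dict:
    """Poll a crawl job until it reaches a terminal state or we give up.

    fetch_status is any callable that GETs .../crawl/{job_id} and returns
    the decoded JSON; injecting it keeps the loop itself easy to test.
    """
    for attempt in range(max_attempts):
        result = fetch_status(job_id)
        # "completed"/"failed" are assumed terminal states for illustration.
        if result.get("status") in ("completed", "failed"):
            return result
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"crawl job {job_id} did not finish in time")
```

A fixed polling interval is the simplest choice; for long crawls you may prefer exponential backoff to reduce request volume against your account's rate limits.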

Building a Web Crawler Using Cloudflare

A typical crawler architecture using Cloudflare services now looks simpler than ever:

  1. Seed URL → Cloudflare Browser Rendering (/crawl)
  2. Crawl Job → Asynchronous Processing at the Edge
  3. Extraction → Markdown/JSON output automatically generated
  4. Storage → Results stored in R2 or KV via Workers

| Component | Role |
| --- | --- |
| Browser Rendering (`/crawl`) | Automated discovery and rendering |
| Cloudflare Workers | Storage and processing logic |
| R2 or KV | Highly available data storage |
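On the storage side of this pipeline, crawl results keyed by page URL map naturally onto R2 or KV object keys. Below is one way to derive a stable, path-like key from a crawled URL; the naming scheme is my own convention, not something prescribed by Cloudflare.

```python
from urllib.parse import urlparse

def object_key_for(page_url: str, fmt: str = "md") -> str:
    """Derive an object key like 'example.com/docs/intro.md' from a page URL."""
    parsed = urlparse(page_url)
    path = parsed.path.strip("/") or "index"  # site root becomes 'index'
    return f"{parsed.netloc}/{path}.{fmt}"
```

Prefixing keys with the hostname keeps multiple crawled sites cleanly separated in one bucket, and the format suffix lets HTML, Markdown, and JSON outputs of the same page coexist.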

Limitations to Consider

  • No Custom IP Rotation: Requests originate from Cloudflare IP ranges.
  • Consumption Units: Asynchronous crawls consume computation time; monitor usage to optimize costs.
  • Open Beta: Features are evolving; check the official changelog for updates.

Practical Use Cases

  • AI Data Pipelines: Seamlessly convert entire websites to Markdown for RAG (Retrieval-Augmented Generation) applications.
  • SEO Monitoring: Track site-wide metadata changes and broken links without custom scraper logic.
  • Content Archiving: Automatically generate PDFs or HTML snapshots of entire documentation sites.
  • Security Audits: Scan all rendered pages for exposed sensitive data or CSP violations.

Final Thoughts

With the introduction of the native /crawl endpoint, Cloudflare Browser Rendering has moved from a "managed browser" to a "managed crawler." For teams building AI tools, monitoring systems, or search engines, this update removes the complexity of managing crawl logic and infrastructure, allowing you to focus on the data itself.

Related Topics

Cloudflare · Browser Rendering · Web Crawling · Headless Browser · DevOps · Automation · Crawl API
