any domain's URL
Optional
options: { default=5 - seconds to wait before aborting a request that has not been retrieved
default=3 - maximum number of redirects to follow
default=true - check the response for bot detection messages
default=true - set the referer to Google
default=0 - index of the user agent to use: [Googlebot, default Chrome]
default=false - use corsproxy.io (in frontend JS), which works about 60% of the time
default=false - use a proxy URL
default=false - check robots.txt rules }
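One way to handle these defaults is to keep them in a single frozen object and merge caller overrides on top. The sketch below is only an illustration: the option names (`timeout`, `redirects`, `checkBotDetection`, and so on) are hypothetical, since the names are not shown in the list, and only the default values are taken from it.

```javascript
// Hypothetical option names; only the default values come from the list above.
const DEFAULT_OPTIONS = Object.freeze({
  timeout: 5,              // seconds before the request is aborted
  redirects: 3,            // maximum redirects to follow
  checkBotDetection: true, // scan the response for bot detection messages
  useGoogleReferer: true,  // send Google as the referer
  userAgentIndex: 0,       // 0 = Googlebot, 1 = default Chrome
  useCorsProxy: false,     // route through corsproxy.io (frontend JS)
  proxy: false,            // proxy URL, or false for none
  checkRobots: false,      // check robots.txt rules
});

// Merge caller overrides over the defaults without mutating either object.
function resolveOptions(overrides = {}) {
  return { ...DEFAULT_OPTIONS, ...overrides };
}
```

Calling `resolveOptions({ timeout: 10 })` would keep every other default and change only the timeout.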
Tardigrade the Web Crawler
Uses the Fetch API and checks for bot detection. Scrapes any domain's URL to get its HTML, JSON, or ArrayBuffer.
Scraping internet pages is a free speech right globally.
Features: timeout, redirect limit, default UA, Google as referer, and bot detection checking.
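The features listed above can be sketched roughly as follows. This is a minimal illustration, not the crawler's actual code: the function names, the user agent strings, and the bot detection phrases are all assumptions. It targets Node 18+, where the built-in `fetch` lets a script set the `User-Agent` and `Referer` headers and exposes the `Location` header when `redirect` is `"manual"`; browsers restrict both.

```javascript
// Assumed user agent strings and bot detection phrases; illustrative only.
const USER_AGENTS = [
  "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
];
const BOT_PHRASES = ["captcha", "verify you are human", "access denied"];

// True when the page text contains one of the assumed bot detection phrases.
function looksBlocked(html) {
  const text = html.toLowerCase();
  return BOT_PHRASES.some((phrase) => text.includes(phrase));
}

// Build request headers from the (hypothetical) option names.
function buildHeaders({ userAgentIndex = 0, useGoogleReferer = true } = {}) {
  const headers = { "User-Agent": USER_AGENTS[userAgentIndex] };
  if (useGoogleReferer) headers["Referer"] = "https://www.google.com/";
  return headers;
}

// Fetch a page with one overall timeout and a manual redirect limit.
async function fetchPage(url, { timeout = 5, redirects = 3, ...rest } = {}) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeout * 1000);
  try {
    let current = url;
    for (let hop = 0; hop <= redirects; hop++) {
      const res = await fetch(current, {
        headers: buildHeaders(rest),
        redirect: "manual",
        signal: controller.signal,
      });
      const location = res.headers.get("location");
      if (res.status >= 300 && res.status < 400 && location) {
        current = new URL(location, current).href; // resolve relative redirects
        continue;
      }
      const html = await res.text();
      return { url: current, status: res.status, html, blocked: looksBlocked(html) };
    }
    throw new Error(`More than ${redirects} redirects`);
  } finally {
    clearTimeout(timer);
  }
}
```

Following redirects by hand keeps the limit explicit, because the Fetch API itself has no option for a maximum redirect count.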
If the fetch method does not return the needed HTML, use the Docker proxy as a backup.
Set up a Docker container running a NodeJS server API that renders the page with puppeteer. The rendered DOM includes all HTML loaded by secondary in-page API requests after the initial page request, and the setup supports user login and cookie storage.
Bypass Cloudflare bot check: A webpage proxy that sends requests through Chromium (puppeteer) - can be used to bypass Cloudflare's anti-bot check (the cookie ID JavaScript method).
Send your request to the server on port 3000 and pass your URL in the "url" query string parameter, like this:
http://localhost:3000/?url=https://example.org
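A request like the one above could be built in code as sketched here. The helper names `buildProxyUrl` and `fetchViaProxy` are hypothetical, and the sketch assumes the proxy decodes the "url" parameter in the usual way, so the target is percent-encoded to keep its own query string intact.

```javascript
// Assumed proxy location; matches the example URL above.
const PROXY_BASE = "http://localhost:3000/";

// Build the proxy request URL for a target page.
function buildProxyUrl(targetUrl, base = PROXY_BASE) {
  const proxy = new URL(base);
  // searchParams.set percent-encodes the target so its own "?" and "&" survive.
  proxy.searchParams.set("url", targetUrl);
  return proxy.href;
}

// Fetch the rendered HTML through the proxy (requires the container to be running).
async function fetchViaProxy(targetUrl, base = PROXY_BASE) {
  const res = await fetch(buildProxyUrl(targetUrl, base));
  if (!res.ok) throw new Error(`Proxy returned HTTP ${res.status}`);
  return res.text();
}
```

For example, `buildProxyUrl("https://example.org")` yields `http://localhost:3000/?url=https%3A%2F%2Fexample.org`, the encoded form of the URL shown above.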