Rendering JavaScript
JavaScript rendering is disabled by default, which keeps scraping fast. Sometimes, however, you do need to wait for JavaScript to run and for background requests to load additional data dynamically.
To enable JavaScript rendering, add the render_js=true query parameter:
JavaScript rendering
import requests

payload = {
    "api_key": "[your API key]",
    "url": "https://example.com",
    "render_js": True,
}
response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)
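To see what the resulting GET request looks like on the wire, the query string can be assembled with the standard library alone. This is only a sketch: `build_request_url` and `API_ENDPOINT` are hypothetical names introduced for illustration, and the flag is written in lowercase to match the render_js=true form given above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://scraping.narf.ai/api/v1/"


def build_request_url(api_key: str, url: str, render_js: bool = False) -> str:
    """Return the full GET URL for a scraping request (hypothetical helper)."""
    params = {"api_key": api_key, "url": url}
    if render_js:
        # Only add the flag when rendering is requested; it is off by default.
        params["render_js"] = "true"
    # urlencode percent-encodes the target URL so it survives as one parameter.
    return API_ENDPOINT + "?" + urlencode(params)


print(build_request_url("[your API key]", "https://example.com", render_js=True))
```

Percent-encoding the target URL matters when it carries its own query string, since otherwise its parameters would be read as parameters of the API call.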
For all requested URLs, without exceptions, we render the content using our cluster of real browsers customized and specialized to bypass anti-bot detection systems like Cloudflare, DataDome, Kasada, Akamai, Imperva, etc. You do not have to enable JavaScript rendering to use a real browser for your request.
You do not pay anything extra for requests with JavaScript rendering enabled. We strive to make our pricing clear and the cost of each request is the same regardless of which features of the API you use.
In Scraping Fish, enabling JavaScript rendering means that we wait until there have been no network connections for at least 500 ms. By default, this wait times out after 5 seconds to prevent waiting indefinitely if the target website uses polling or keeps a WebSocket connection open.
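The waiting rule can be illustrated with a toy model. The sketch below is not Scraping Fish's implementation: it assumes network activity can be reduced to a list of instantaneous timestamps (milliseconds since page load), whereas a real browser tracks open connections. `render_wait_ms` is a hypothetical function name.

```python
def render_wait_ms(activity_ms, idle_ms=500, timeout_ms=5000):
    """Return how long the wait lasts for the given network activity timestamps."""
    last_activity = 0
    for t in sorted(activity_ms):
        if t - last_activity >= idle_ms:
            # A full idle window elapsed before this activity, so waiting ended.
            break
        last_activity = t
    # Waiting ends one idle window after the last activity, capped by the timeout.
    return min(last_activity + idle_ms, timeout_ms)


print(render_wait_ms([]))                       # 500: page with no background requests
print(render_wait_ms([100, 300, 700]))          # 1200: short burst of requests
print(render_wait_ms(range(0, 10000, 400)))     # 5000: polling never goes idle
```

The last call shows why the timeout exists: a page that polls every 400 ms never produces a 500 ms quiet window, so only the 5 second cap ends the wait.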
For most URLs, the resulting HTML will look exactly the same with and without JS rendering. Enable JS rendering when you want to let the website load additional data dynamically.
The request with JS rendering enabled is likely to take slightly more time.
JavaScript rendering timeout
To change the default timeout for JS rendering, include the render_js_timeout_ms query parameter.
It is ignored when JS rendering is disabled.
The example below sets it to 15 seconds.
JavaScript rendering with custom timeout
import requests

payload = {
    "api_key": "[your API key]",
    "url": "https://example.com",
    "render_js": True,
    "render_js_timeout_ms": 15000,
}
response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)
If the specified timeout for JS rendering is larger than the single trial timeout and the target website runs a lot of background requests, most of your requests are likely to fail because they time out. Read more on how to specify the trial timeout in the Timeouts guide.
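One way to guard against this on the client side is to compare the two values before sending anything. The sketch below assumes you already know the trial timeout that applies to your requests (see the Timeouts guide); `js_payload` is a hypothetical helper, `trial_timeout_ms` is only a local argument that is not added to the payload, and the 30000 ms value is an assumed figure for illustration.

```python
def js_payload(api_key, url, render_js_timeout_ms, trial_timeout_ms):
    """Build a JS rendering payload, rejecting timeouts that cannot fit in a trial."""
    if render_js_timeout_ms > trial_timeout_ms:
        raise ValueError(
            f"render_js_timeout_ms={render_js_timeout_ms} exceeds the trial "
            f"timeout of {trial_timeout_ms} ms; requests would likely time out"
        )
    return {
        "api_key": api_key,
        "url": url,
        "render_js": True,
        "render_js_timeout_ms": render_js_timeout_ms,
    }


# Assumed trial timeout of 30 seconds, chosen only for this example.
payload = js_payload("[your API key]", "https://example.com", 15000, trial_timeout_ms=30000)
print(payload["render_js_timeout_ms"])
```

The returned dictionary can be passed as params to requests.get in the same way as in the example above.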
Request without a browser
Even with JavaScript rendering disabled, the URL you requested to scrape is still opened in a browser.
To avoid using a browser for your request, specify the browser_type=none query parameter.
This way, you can send a request directly to a third-party API or a website's hidden API. A browser adds extra headers and might affect your request in other unexpected ways:
Request without browser
import requests

payload = {
    "api_key": "[your API key]",
    "url": "https://httpbin.org/get",
    "browser_type": "none",
}
response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)
You can also use other HTTP methods with browser_type=none.
Read more on this in the guide for POST/PUT requests.