Timeouts
Scraping Fish API allows you to set two types of timeouts:
- For the entire request, including JS scenario execution time, with the total_timeout_ms parameter.
- For one trial of loading the website with the trial_timeout_ms parameter.
Timeouts are specified in milliseconds, and you can set both total_timeout_ms and trial_timeout_ms for the same request.
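As a rough sketch of combining both parameters, the helper below builds the query parameters for one request. The function name build_params and the client-side check for positive integers are assumptions made for illustration; the accepted value range is not stated here, so treat the check as a convenience rather than the API's own validation.

```python
from urllib.parse import urlencode

def build_params(api_key, url, total_timeout_ms=None, trial_timeout_ms=None):
    """Build query parameters, adding only the timeouts that were given.

    Hypothetical helper: the positive-integer check is a client-side
    assumption, not documented API behavior.
    """
    params = {"api_key": api_key, "url": url}
    timeouts = {
        "total_timeout_ms": total_timeout_ms,
        "trial_timeout_ms": trial_timeout_ms,
    }
    for name, value in timeouts.items():
        if value is None:
            continue  # leave the parameter out so the API default applies
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive number of milliseconds")
        params[name] = value
    return params

params = build_params(
    "[your API key]",
    "https://www.example.com",
    total_timeout_ms=60_000,
    trial_timeout_ms=15_000,
)
query = urlencode(params)
print(query)
```

The returned dictionary can be passed as params to requests.get in the same way as the payloads in the other examples on this page.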
Total timeout
The total request timeout is set to 90,000 ms (90 seconds) by default. This value is approximate: the actual timeout can occur within a 1,000 ms margin of it. If you have a complex JS scenario and need more time, increase the total_timeout_ms parameter as in the example below.
To simulate a long JS scenario, we go to example.com and simply wait for 100 seconds. To make sure that we have enough time for the website to load and then to execute our dummy JS scenario, we set total_timeout_ms to 110,000 ms.
import json

import requests

payload = {
    "api_key": "[your API key]",
    "url": "https://www.example.com",
    # Dummy JS scenario: a single step that waits for 100 seconds.
    "js_scenario": json.dumps(
        {"steps": [{"wait": 100_000}]}
    ),
    # Page load plus the 100-second wait must fit within this limit.
    "total_timeout_ms": 110_000,
}
response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)
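The 110,000 ms value above is the 100,000 ms wait plus 10,000 ms of headroom for loading the page. A minimal sketch of that arithmetic follows; the helper name total_timeout_for_scenario and the load_budget_ms default are assumptions for illustration, and only wait steps are counted, so any other step types would need their own allowance.

```python
import json

def total_timeout_for_scenario(js_scenario, load_budget_ms=10_000):
    """Sum the scenario's wait steps and add headroom for loading the page.

    Hypothetical helper: steps without a "wait" key contribute 0 ms here,
    which underestimates scenarios that do real work in other steps.
    """
    wait_ms = sum(step.get("wait", 0) for step in js_scenario["steps"])
    return wait_ms + load_budget_ms

scenario = {"steps": [{"wait": 100_000}]}
payload = {
    "api_key": "[your API key]",
    "url": "https://www.example.com",
    "js_scenario": json.dumps(scenario),
    "total_timeout_ms": total_timeout_for_scenario(scenario),
}
print(payload["total_timeout_ms"])  # 110000
```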
Single trial timeout
In addition to the total request timeout, you can set a timeout for one trial of loading the website using the trial_timeout_ms query parameter. It defaults to 30,000 ms (30 seconds) and does not cover the JS scenario, which is executed after the website is loaded. If a trial fails for any reason, including exceeding the trial timeout, Scraping Fish attempts to load the website again until it succeeds or until the total request timeout interrupts it.
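To make the interaction between the two limits concrete, here is a simplified client-side model of the retry behavior described above. It is not Scraping Fish's implementation: it assumes a trial can only fail by running out of time, ignores any overhead between trials, and the function name simulate_trials and the sample load times are made up for illustration.

```python
def simulate_trials(load_times_ms, trial_timeout_ms=30_000, total_timeout_ms=90_000):
    """Model retries for hypothetical page load durations.

    Returns (succeeded, elapsed_ms, trials_used). Simplified assumption:
    a trial fails only when the load takes longer than its time budget.
    """
    elapsed = 0
    for trial_number, load_ms in enumerate(load_times_ms, start=1):
        remaining = total_timeout_ms - elapsed
        if remaining <= 0:
            # The total request timeout interrupts further retries.
            return False, elapsed, trial_number - 1
        budget = min(trial_timeout_ms, remaining)
        if load_ms <= budget:
            return True, elapsed + load_ms, trial_number
        elapsed += budget  # the trial timed out, so its whole budget was spent
    return False, elapsed, len(load_times_ms)

# Two slow attempts time out after 30 s each, the third loads in 5 s.
print(simulate_trials([45_000, 45_000, 5_000]))  # (True, 65000, 3)

# Every attempt is too slow: three trials fit into the 90 s total timeout.
print(simulate_trials([45_000] * 5))  # (False, 90000, 3)
```

Under these assumptions, the default settings leave room for three full trials before the total request timeout ends the request.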
It can be useful to adjust the trial timeout for a website that is expected to take a very long time to load.
In the example below, we expect example.com to load very quickly, so we force Scraping Fish API to retry if a trial does not load the website within 10 seconds.
import requests

payload = {
    "api_key": "[your API key]",
    "url": "https://www.example.com",
    "trial_timeout_ms": 10_000,
}
response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)
If you set total_timeout_ms to a value smaller than trial_timeout_ms, trial_timeout_ms will be reduced to the value of total_timeout_ms.
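The rule above amounts to taking the smaller of the two values. The sketch below mirrors it on the client side so you can see which trial timeout a given combination would end up with. The function name is hypothetical, and the defaults are the 90,000 ms and 30,000 ms values stated on this page; whether the reduction also applies when trial_timeout_ms is left at its default is an assumption here.

```python
DEFAULT_TOTAL_TIMEOUT_MS = 90_000
DEFAULT_TRIAL_TIMEOUT_MS = 30_000

def effective_trial_timeout_ms(
    total_timeout_ms=DEFAULT_TOTAL_TIMEOUT_MS,
    trial_timeout_ms=DEFAULT_TRIAL_TIMEOUT_MS,
):
    """Return the trial timeout after it is capped by the total timeout.

    Client-side illustration of the documented rule, not the service's code.
    """
    return min(trial_timeout_ms, total_timeout_ms)

print(effective_trial_timeout_ms(total_timeout_ms=20_000))  # 20000
print(effective_trial_timeout_ms(total_timeout_ms=120_000, trial_timeout_ms=45_000))  # 45000
```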