Response HTTP headers
Scraping Fish returns its own headers but preserves all the headers from the original website as well.
The original headers are returned with an Sf- prefix.
Example
If the website you want to scrape responds with the following headers:
Content-Type: text/html; charset=UTF-8
Content-Length: 1256
Server: ECS (dcb/7F84)
then the Scraping Fish API will respond with the following headers:
...ScrapingFish headers...
Content-Type: text/html; charset=UTF-8
Sf-Content-Type: text/html; charset=UTF-8
Sf-Content-Length: 1256
Sf-Server: ECS (dcb/7F84)
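The prefix convention above can be undone on the client side. The following is a minimal sketch, not part of the Scraping Fish API: the helper name split_headers and the sample dictionary are hypothetical, and it assumes that every header whose name starts with Sf- (compared case-insensitively) comes from the original website.

```python
SF_PREFIX = "Sf-"  # prefix described in this section


def split_headers(headers):
    """Split a header mapping into (original website headers, remaining headers).

    Hypothetical helper: original headers are returned with the prefix removed.
    """
    original, remaining = {}, {}
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower().startswith(SF_PREFIX.lower()):
            original[name[len(SF_PREFIX):]] = value
        else:
            remaining[name] = value
    return original, remaining


# Sample values copied from the example above.
sample = {
    "Content-Type": "text/html; charset=UTF-8",
    "Sf-Content-Type": "text/html; charset=UTF-8",
    "Sf-Content-Length": "1256",
    "Sf-Server": "ECS (dcb/7F84)",
}
original, remaining = split_headers(sample)
print(original)
```

With a real response, the same function could be called as split_headers(response.headers). Note that the Sf-Original-Status-Code header described under Original status code also carries the Sf- prefix, so this sketch would report it as Original-Status-Code.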
Resolved URL
Scraping Fish returns the final resolved URL after any redirects.
This information can be found in the Resolved-Url header of the API response.
The Resolved-Url header is included in every response; no additional parameter is needed to enable it.
import requests

payload = {
    "api_key": "[your API key]",
    "url": "http://google.com",
}
response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.headers)
Response headers:
Content-Type: text/html; charset=utf-8
...
Resolved-Url: https://www.google.com/?gws_rd=ssl
...
Content-Length: 39
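One possible use of the Resolved-Url header is to detect whether the requested URL was redirected. The sketch below is an illustration only: the function names resolved_url and was_redirected are hypothetical, the sample headers are copied from the response above, and the comparison is a plain string check that does not normalize URLs.

```python
def resolved_url(headers):
    """Return the Resolved-Url header value, or None if it is absent."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "resolved-url":
            return value
    return None


def was_redirected(requested_url, headers):
    """Return True if the final URL differs from the requested one.

    Assumption: a plain string comparison is good enough; URLs that
    differ only by a trailing slash would count as a redirect here.
    """
    final_url = resolved_url(headers)
    return final_url is not None and final_url != requested_url


# Sample values copied from the response shown above.
sample_headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Resolved-Url": "https://www.google.com/?gws_rd=ssl",
}
redirected = was_redirected("http://google.com", sample_headers)
print(redirected)
```

With the requests library, response.headers is already case-insensitive, so response.headers.get("Resolved-Url") would be enough for the lookup itself.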
Original status code
Scraping Fish offers the ability to forward the original status code from the website.
You can enable this feature by setting the forward_original_status query parameter to true.
When forward_original_status is enabled, the original status code of the website is available in the Sf-Original-Status-Code header of the response.
When this feature is enabled, every request is considered successful and is deducted from your pack of API requests, regardless of the status code.
The following example demonstrates how to access the original status code using the https://httpbin.org/status/202 endpoint, which returns a 202 status code.
With forward_original_status set to true, the Sf-Original-Status-Code header in the response will be 202.
import requests

payload = {
    "api_key": "[your API key]",
    "url": "https://httpbin.org/status/202",
    "forward_original_status": True,
}
response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.headers)
Response headers:
Content-Type: text/html; charset=utf-8
...
Sf-Original-Status-Code: 202
...
Content-Length: 39
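Header values arrive as strings, so the forwarded status code usually needs to be converted before it can be compared numerically. The following is a minimal sketch with a hypothetical helper named original_status; it assumes the header value is a plain decimal integer, as in the example above.

```python
def original_status(headers):
    """Return the forwarded status code as an int, or None if the header is absent."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "sf-original-status-code":
            return int(value)
    return None


# Sample values copied from the response shown above.
sample_headers = {
    "Content-Type": "text/html; charset=utf-8",
    "Sf-Original-Status-Code": "202",
}
status = original_status(sample_headers)
print(status)
```

A None result would indicate that forward_original_status was not enabled for the request, or that the header was missing for another reason.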