Response HTTP headers

Scraping Fish returns its own headers and also preserves all headers from the original website. Each original header is returned with an Sf- prefix. The Content-Type header of the API response is set to the original response's Content-Type, if one is available.

Example

If the website you want to scrape responds with the following headers:

Content-Type: text/html; charset=UTF-8
Content-Length: 1256
Server: ECS (dcb/7F84)

then the Scraping Fish API will respond with these headers:

...ScrapingFish headers...

Content-Type: text/html; charset=UTF-8
Sf-Content-Type: text/html; charset=UTF-8
Sf-Content-Length: 1256
Sf-Server: ECS (dcb/7F84)
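The prefixing rule above can be reversed on the client side to recover the website's own headers. The following is a minimal sketch, not part of the Scraping Fish API: the helper name original_headers and the SF_PREFIX constant are assumptions made for illustration, and the sample dictionary simply repeats the example headers so that no network request is needed.

```python
from typing import Dict, Mapping

# Prefix that marks headers copied from the target website (per the text above).
SF_PREFIX = "Sf-"


def original_headers(api_headers: Mapping[str, str]) -> Dict[str, str]:
    """Return the target website's headers with the Sf- prefix stripped.

    Hypothetical helper for illustration; it is not provided by Scraping Fish.
    """
    prefix = SF_PREFIX.lower()
    return {
        name[len(SF_PREFIX):]: value
        for name, value in api_headers.items()
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower().startswith(prefix)
    }


# Sample data copied from the example above; no request is made here.
sample = {
    "Content-Type": "text/html; charset=UTF-8",
    "Sf-Content-Type": "text/html; charset=UTF-8",
    "Sf-Content-Length": "1256",
    "Sf-Server": "ECS (dcb/7F84)",
}
print(original_headers(sample))
```

With a real response, response.headers from requests can be passed in directly. Note that any other header starting with Sf-, such as Sf-Original-Status-Code described below, would also be picked up by this simple prefix check.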

Resolved URL

Scraping Fish returns the final resolved URL after any redirects. You can find it in the Resolved-Url header of the API response. The header is included in every response, so no additional parameter is needed to enable it.

GET /api/v1/

import requests

payload = {
  "api_key": "[your API key]",
  "url": "http://google.com",
}

response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.headers)

Response headers:

Content-Type: text/html; charset=utf-8
...
Resolved-Url: https://www.google.com/?gws_rd=ssl
...
Content-Length: 39
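To use the resolved URL programmatically rather than printing every header, the value can be read from the headers and compared with the URL that was requested. This is a rough sketch under stated assumptions: the function names resolved_url and was_redirected are hypothetical, and the comparison is a plain string check that does not normalize URLs (for example, trailing slashes).

```python
from typing import Mapping, Optional


def resolved_url(headers: Mapping[str, str]) -> Optional[str]:
    """Return the Resolved-Url header value, or None if it is absent.

    Hypothetical helper for illustration; works with any header mapping.
    """
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "resolved-url":
            return value
    return None


def was_redirected(requested_url: str, headers: Mapping[str, str]) -> bool:
    """Report whether the resolved URL differs from the requested one.

    Uses a naive string comparison with no URL normalization.
    """
    final_url = resolved_url(headers)
    return final_url is not None and final_url != requested_url


# Sample headers copied from the response shown above; no request is made here.
sample_headers = {"Resolved-Url": "https://www.google.com/?gws_rd=ssl"}
print(resolved_url(sample_headers))
print(was_redirected("http://google.com", sample_headers))
```

In the request example above, response.headers could be passed as the headers argument.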

Original status code

Scraping Fish can forward the original status code returned by the website. To enable this feature, set the forward_original_status query parameter to true.

When forward_original_status is enabled, the original status code of the website will be available in the Sf-Original-Status-Code header of the response.

The following example demonstrates how to access the original status code using the https://httpbin.org/status/202 endpoint, which returns a 202 status code. With forward_original_status set to true, the Sf-Original-Status-Code header in the response will be 202.

GET /api/v1/

import requests

payload = {
  "api_key": "[your API key]",
  "url": "https://httpbin.org/status/202",
  # Send the lowercase string: requests would serialize Python's True as "True"
  "forward_original_status": "true",
}

response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.headers)

Response headers:

Content-Type: text/html; charset=utf-8
...
Sf-Original-Status-Code: 202
...
Content-Length: 39
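Header values arrive as strings, so the forwarded status code usually needs converting before it can be compared with numeric codes. The sketch below shows one way to do that; the function name original_status_code is an assumption for illustration, and returning None for a missing or non-numeric value is a design choice of this example rather than documented API behavior.

```python
from typing import Mapping, Optional


def original_status_code(headers: Mapping[str, str]) -> Optional[int]:
    """Return Sf-Original-Status-Code as an int, or None if missing or invalid.

    Hypothetical helper for illustration; works with any header mapping.
    """
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "sf-original-status-code":
            try:
                return int(value)
            except ValueError:
                # The value was not a number; treat it as unavailable.
                return None
    return None


# Sample headers copied from the response shown above; no request is made here.
sample_headers = {"Sf-Original-Status-Code": "202"}
status = original_status_code(sample_headers)
if status is not None and 200 <= status < 300:
    print("The website responded successfully with status", status)
```

In the request example above, response.headers could be passed as the headers argument.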