Scraping URLs containing query parameters

If the URL you want to scrape contains query parameters, you need to URL-encode it first. Otherwise, our API cannot tell the difference between query parameters intended for the Scraping Fish API call and query parameters you want to pass to the target web page.

GET
/api/v1/
import requests

payload = {
  "api_key": "[your API key]",
  # Target URL with its own query parameters; requests URL-encodes it
  # automatically because it is passed via the `params` argument.
  "url": "https://example.com?example=param&second=parameter",
}

response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)

URL Encoding

When using the requests Python package or axios in Node.js, parameters provided in the params argument are automatically URL-encoded, as in the example above. In this case, you should not encode the parameters yourself because that would result in double encoding and you would get an error response from the API stating that the URL you provided is malformed.

You only have to apply URL-encoding when you construct the URL manually with template strings.
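For example, here is a minimal sketch of constructing the request URL manually in Python, assuming the standard-library urllib.parse.quote function and a placeholder API key:

Python

import urllib.parse

import requests

api_key = "[your API key]"
target_url = "https://example.com?example=param&second=parameter"

# When building the URL manually, the target URL must be percent-encoded,
# otherwise its query parameters would be mixed into the API call's own parameters.
request_url = (
    "https://scraping.narf.ai/api/v1/"
    f"?api_key={api_key}"
    f"&url={urllib.parse.quote(target_url, safe='')}"
)

response = requests.get(request_url)
print(response.content)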

When using cURL, however, you need to encode URLs containing query parameters. With cURL, it is recommended to always encode your URL, whether it contains query parameters or not.

cURL

curl -G --data-urlencode 'url=https://example.com?example=param&second=parameter' \
'https://scraping.narf.ai/api/v1/?api_key=[your API key]'
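For reference, --data-urlencode percent-encodes the value after the first = sign, so the url parameter actually sent to the API looks like this:

url=https%3A%2F%2Fexample.com%3Fexample%3Dparam%26second%3Dparameter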

For reference, these are the standard URL-encoding functions in a few popular programming languages:

- Python: urllib.parse.quote
- JavaScript / Node.js: encodeURIComponent
- PHP: rawurlencode
- Ruby: CGI.escape
- Java: java.net.URLEncoder.encode
- Go: url.QueryEscape (net/url package)

Encoding of other parameters

When using cURL (or any HTTP client that does not automatically URL-encode parameters), you should also encode other API parameters that contain non-ASCII or reserved characters. For example, when passing a JS Scenario, its JSON value needs to be encoded too.

cURL

curl -G --data-urlencode 'url=https://example.com' \
--data-urlencode 'js_scenario={"steps": [{"wait": 1000}, {"click_and_wait_for_navigation": "p > a"}]}' \
'https://scraping.narf.ai/api/v1/?api_key=[your API key]'
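With the requests Python package, the same request does not need any manual encoding, because both url and js_scenario are URL-encoded automatically when passed via params. A minimal sketch, using json.dumps to serialize the scenario:

Python

import json

import requests

payload = {
  "api_key": "[your API key]",
  "url": "https://example.com",
  # The JS Scenario is passed as a JSON string; requests URL-encodes it automatically.
  "js_scenario": json.dumps({
    "steps": [
      {"wait": 1000},
      {"click_and_wait_for_navigation": "p > a"},
    ]
  }),
}

response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)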