JavaScript scenario
Scraping Fish API allows you to specify a series of steps to execute once the page is loaded. You can use it, for example, to click a button or fill in a form.
Steps are passed as JSON in the js_scenario query parameter.
Remember to URL-encode this parameter, as in the example below.
Example
The example scenario below waits for 1 second (1000 ms) once the page is loaded, clicks the element matched by the p > a CSS selector, and waits for navigation to complete:
- Python
- NodeJS
- cURL
import requests
import json
payload = {
"api_key": "[your API key]",
"url": "https://example.com",
"js_scenario": json.dumps(
{"steps": [{"wait": 1000}, {"click_and_wait_for_navigation": "p > a"}]}
),
}
response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)
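const axios = require("axios");

const payload = {
  api_key: "[your API key]",
  url: "https://example.com",
  // The scenario is serialized to a JSON string; axios URL-encodes query params.
  js_scenario: JSON.stringify({
    steps: [{ wait: 1000 }, { click_and_wait_for_navigation: "p > a" }],
  }),
};

// `await` is not allowed at the top level of a CommonJS module, so wrap the call.
async function main() {
  const response = await axios.get("https://scraping.narf.ai/api/v1/", { params: payload });
  console.log(response.data);
}

main().catch(console.error);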
curl -G --data-urlencode 'url=https://example.com' \
--data-urlencode 'js_scenario={"steps": [{"wait": 1000}, {"click_and_wait_for_navigation": "p > a"}]}' \
'https://scraping.narf.ai/api/v1/?api_key=[your API key]'
If the action causes navigation to another URL, it will be charged as a separate request.
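To see what the encoded js_scenario parameter looks like, the query string can be built locally without sending a request. The following is a minimal sketch using only Python's standard library; it reuses the scenario from the example above and does not contact the API.

```python
import json
from urllib.parse import parse_qs, urlencode

scenario = {"steps": [{"wait": 1000}, {"click_and_wait_for_navigation": "p > a"}]}

params = {
    "api_key": "[your API key]",
    "url": "https://example.com",
    # The scenario must be a JSON string before it is URL-encoded.
    "js_scenario": json.dumps(scenario),
}

# urlencode percent-encodes braces, quotes, and other reserved characters.
query = urlencode(params)
print("https://scraping.narf.ai/api/v1/?" + query)

# Round trip: decoding the query string recovers the original scenario.
decoded = json.loads(parse_qs(query)["js_scenario"][0])
```

Libraries such as requests and axios perform this encoding automatically when the scenario is passed through their params option, which is why the examples above only serialize it to JSON.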
Steps
"steps"
is a list of dictionaries that define the actions to execute in a JS scenario.
Each dictionary has a single key, the name of the action to perform, and the value is that action's argument.
For example:
{
"steps": [
{
"wait_for": "#button-id"
},
{
"select": {
"selector": "#select-id",
"options": "value1"
}
},
{
"click": "#button-id"
}
]
}
This scenario first waits until the "#button-id" element is available, then selects the "value1" option from the select element (drop-down list) with the #select-id id, and finally clicks the button.
In the following section, we provide all available predefined actions which you can use as steps in a JS scenario.
If you need to execute custom JavaScript code, see the Custom JavaScript evaluation section.
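Because each step is a dictionary with exactly one key, a scenario can be checked locally before it is sent. The sketch below is illustrative only: check_steps is a hypothetical helper, not part of the Scraping Fish API, and it verifies only the shape described above, not the action names or their arguments.

```python
import json


def check_steps(scenario):
    """Raise ValueError unless every step is a dict with exactly one key."""
    steps = scenario.get("steps")
    if not isinstance(steps, list):
        raise ValueError('"steps" must be a list')
    for index, step in enumerate(steps):
        if not isinstance(step, dict) or len(step) != 1:
            raise ValueError(f"step {index} must be a dict with exactly one key")
    return scenario


scenario = check_steps({
    "steps": [
        {"wait_for": "#button-id"},
        {"select": {"selector": "#select-id", "options": "value1"}},
        {"click": "#button-id"},
    ]
})

# The validated scenario is serialized for the js_scenario query parameter.
js_scenario = json.dumps(scenario)
```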
Available actions
Click
"click"
action clicks an element specified by a selector. Example:
{
"steps": [
{"click": "#a-button"}
]
}
The argument for this action must be a string and a valid selector.
If clicking an element navigates to another page (e.g. by submitting a form), use "click_and_wait_for_navigation", which waits until the navigation is completed.
If you get a 500 or timeout error with this action, you likely need to wait for the button to appear first (e.g. with "wait_for").
Click and wait for navigation
"click_and_wait_for_navigation"
action clicks an element specified by a selector and waits for the navigation to complete. Example:
{
"steps": [
{"click_and_wait_for_navigation": "#a-button"}
]
}
The argument for this action must be a string and a valid selector.
If clicking the specified element doesn't trigger navigation, use the plain "click" action instead; otherwise the request will time out while waiting for a navigation that never happens.
If you get a 500 or timeout error with this action, you likely need to wait for the button to appear first (e.g. with "wait_for").
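The advice to wait for a button before clicking it can be expressed as a pair of steps. The sketch below assumes a hypothetical helper, click_when_ready, that is not part of the API; it only combines the documented "wait_for", "click" and "click_and_wait_for_navigation" actions.

```python
def click_when_ready(selector, navigates=False):
    """Return steps that wait for an element and then click it.

    Set navigates=True when the click is expected to load another page.
    """
    action = "click_and_wait_for_navigation" if navigates else "click"
    return [{"wait_for": selector}, {action: selector}]


scenario = {
    "steps": click_when_ready("#open-menu") + click_when_ready("#submit", navigates=True)
}
```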
Input
"input"
action fills in given values to the input elements specified by selectors. Example:
{
"steps": [
{
"input": {
"#input1": "value1",
"#input2": "value2"
}
}
]
}
The argument for this action step must be a dictionary mapping from input selectors to values you want to fill in.
If the order of filling in the inputs matters in your use case, specify each input field as a separate input step. For example, if #input2 should be filled before #input1:
{
"steps": [
{
"input": {
"#input2": "value2"
}
},
{
"input": {
"#input1": "value1"
}
}
]
}
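When the fill order matters, generating one "input" step per field keeps the order explicit. The sketch below uses a hypothetical helper, ordered_input_steps, which is not part of the API. It also shows why the steps must be separate objects: Python's JSON parser keeps only the last value of a repeated key. How the API's own parser treats duplicate keys is not stated here, so this is shown only as a reason to avoid them.

```python
import json


def ordered_input_steps(fields):
    """Return one "input" step per (selector, value) pair, in the given order."""
    return [{"input": {selector: value}} for selector, value in fields]


scenario = {"steps": ordered_input_steps([("#input2", "value2"), ("#input1", "value1")])}
js_scenario = json.dumps(scenario)

# Two "input" keys inside ONE object collapse into a single action when parsed
# by Python's json module, so the first field would be lost.
collapsed = json.loads('{"input": {"#input2": "value2"}, "input": {"#input1": "value1"}}')
```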
Select
"select"
action selects option(s) from a given <select>
element (drop-down list). Example:
{
"steps": [
{
"select": {
"selector": "#select1",
"options": "1"
}
}
]
}
The argument for this action must be a dictionary with "selector", the selector used to find the desired <select> element, and "options" (a string or an array), the option or options to select.
Selecting multiple options is supported by using an array instead of a string:
{
"steps": [
{
"select": {
"selector": "#select1",
"options": ["1", "2"]
}
}
]
}
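Since "options" accepts either a string or an array, a small builder can cover both forms. This is a sketch under the assumption that option values are always strings, as in the examples above; select_step is a hypothetical name, not an API function.

```python
def select_step(selector, options):
    """Build a "select" step for one option (str) or several (list or tuple of str)."""
    if isinstance(options, str):
        chosen = options
    elif isinstance(options, (list, tuple)) and all(isinstance(o, str) for o in options):
        chosen = list(options)
    else:
        raise TypeError("options must be a string or a sequence of strings")
    return {"select": {"selector": selector, "options": chosen}}


single = select_step("#select1", "1")
multiple = select_step("#select1", ["1", "2"])
```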
Set localStorage values
"set_local_storage"
action sets key/value pairs in localStorage. Example:
{
"steps": [
{
"set_local_storage": {
"key1": "value1",
"key2": "value2"
}
}
]
}
Scroll
"scroll"
action scrolls the web page vertically by a given number of pixels. Example:
{
"steps": [
{"scroll": 1000}
]
}
The argument for this action must be a number.
Wait for timeout
"wait"
action waits for a fixed amount of time, specified in milliseconds. Example:
{
"steps": [
{"wait": 1000}
]
}
The argument for this action must be a number.
Wait for an element
"wait_for"
action waits for an element specified by a selector to become visible (default) or attached. Example:
{
"steps": [
{"wait_for": "#some-button"}
]
}
The argument for this action must be either a string containing a valid selector or an object with "selector" and "state" keys, where "state" is one of "visible" or "attached".
If "state" is set to "visible" (default), the element you want to wait for must have a non-empty bounding box (i.e. no "display: none") and no "visibility: hidden".
If you want to wait for an element to be present in the DOM (but not necessarily visible), use "state": "attached":
{
"steps": [
{
"wait_for": {
"selector": "#some-button",
"state": "attached"
}
}
]
}
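The two accepted argument forms can be produced by one builder that falls back to the short string form for the default state. The helper name wait_for_step is hypothetical; the sketch relies only on the "selector" and "state" keys and the two state values described above.

```python
VALID_STATES = ("visible", "attached")


def wait_for_step(selector, state="visible"):
    """Build a "wait_for" step, using the short form for the default state."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}")
    if state == "visible":
        # "visible" is the default, so the selector string alone is enough.
        return {"wait_for": selector}
    return {"wait_for": {"selector": selector, "state": state}}


visible_step = wait_for_step("#some-button")
attached_step = wait_for_step("#some-button", state="attached")
```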
Wait for one of multiple elements
"wait_for_any"
action waits for any of the specified elements to become visible (default) or attached. Example:
{
"steps": [
{
"wait_for_any": [
{
"selector": "#some-button",
"state": "attached"
},
{
"selector": "#some-other-button",
"state": "visible"
}
]
}
]
}
If you need to wait for any of the specified elements to be visible, you can use a simpler form and only provide selectors:
{
"steps": [
{
"wait_for_any": ["#some-button", "#some-other-button"]
}
]
}
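The same choice between the short and the full form applies to "wait_for_any". The sketch below uses a hypothetical helper, wait_for_any_step, that takes (selector, state) pairs. Whether strings and objects may be mixed in one array is not stated above, so the helper never mixes them.

```python
def wait_for_any_step(targets):
    """Build a "wait_for_any" step from (selector, state) pairs."""
    for _, state in targets:
        if state not in ("visible", "attached"):
            raise ValueError(f"unsupported state: {state!r}")
    if all(state == "visible" for _, state in targets):
        # Every target uses the default state, so plain selectors suffice.
        return {"wait_for_any": [selector for selector, _ in targets]}
    return {
        "wait_for_any": [
            {"selector": selector, "state": state} for selector, state in targets
        ]
    }


simple = wait_for_any_step([("#some-button", "visible"), ("#some-other-button", "visible")])
full = wait_for_any_step([("#some-button", "attached"), ("#some-other-button", "visible")])
```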
Custom JavaScript evaluation
If the predefined actions we provide don't fit your needs and you want to evaluate custom JavaScript, there's a special evaluate
action which you can use to execute arbitrary JavaScript code:
{
"steps": [
{
"evaluate": "console.log('Hello from Scraping Fish!')"
}
]
}
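When the JavaScript for "evaluate" is assembled from Python values, quoting is the usual source of mistakes. The following sketch passes the value through json.dumps so that it becomes a correctly escaped JavaScript string literal. The helper name scroll_into_view_step is hypothetical, and the script relies on the standard DOM methods document.querySelector and scrollIntoView; nothing here is specific to the Scraping Fish API beyond the "evaluate" key.

```python
import json


def scroll_into_view_step(selector):
    """Build an "evaluate" step that scrolls the matched element into view."""
    # json.dumps adds the surrounding quotes and escapes any quotes inside.
    script = f"document.querySelector({json.dumps(selector)}).scrollIntoView();"
    return {"evaluate": script}


step = scroll_into_view_step('a[href="/next"]')
print(step["evaluate"])
```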
Timeout
All the steps from your JavaScript scenario must complete within 90 seconds, otherwise the request will time out.
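A rough local check can catch scenarios whose fixed waits alone already exceed this limit. This is only a lower bound, since page loads, navigation and element waits take additional time that cannot be known in advance. The helper names are hypothetical.

```python
TIMEOUT_MS = 90_000  # the 90-second limit stated above, in milliseconds


def fixed_wait_ms(scenario):
    """Sum the durations of all fixed "wait" steps, in milliseconds."""
    return sum(step["wait"] for step in scenario["steps"] if "wait" in step)


def fits_time_budget(scenario):
    """Return False when the fixed waits alone already reach the limit."""
    return fixed_wait_ms(scenario) < TIMEOUT_MS


scenario = {"steps": [{"wait": 60000}, {"click": "#a-button"}, {"wait": 45000}]}
print(fixed_wait_ms(scenario), fits_time_budget(scenario))
```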