JS Scenario

In this guide, we will look at how to use JS Scenario to perform actions on the scraped website.

Scraping Fish API allows you to specify a series of steps to execute once the page is loaded. You can use it, for example, to click a button or fill in a form. The steps to perform are passed as JSON in the js_scenario query parameter.

Example

To give you an idea of how you can use this feature, here is an example scenario which, once the page is loaded, waits for 1 second (1000 ms), clicks the element matched by the p > a CSS selector, and waits for navigation to complete.

  • Name
    steps
    Type
    array
    Description

    An array of objects which define the actions to execute. Steps are executed in sequence. In this example we use:

    • wait - waits for a given amount of milliseconds.
    • click_and_wait_for_navigation - clicks an element specified by the given selector and waits for navigation to complete.
  • Name
    url
    Type
    string
    Description

    URL to navigate to. This is a standard parameter required even if you don't perform any actions.

  • Name
    api_key
    Type
    string
    Description

    Your Scraping Fish API key. Required to authenticate your requests.

Execute JS Scenario

GET
/api/v1/
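import requests
import json

# The scenario is serialized to a JSON string and sent as the js_scenario query parameter.
payload = {
  "api_key": "[your API key]",
  "url": "https://example.com",
  "js_scenario": json.dumps({
    "steps": [
      {"wait": 1000},  # wait for 1 second (1000 ms)
      {"click_and_wait_for_navigation": "p > a"}  # click the link, then wait for navigation
    ]
  })
}

response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)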

Response


<!doctype html>
<html>
<head>
<title>Example Domains</title>

</head>
<body>

</body>
</html>
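Because the scenario travels as a query parameter, its JSON has to be URL-encoded. The requests library does this for you when you pass params; the sketch below shows the same thing with only the standard library, so you can inspect the resulting URL. The helper name build_request_url is hypothetical and not part of the API.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

API_ENDPOINT = "https://scraping.narf.ai/api/v1/"


def build_request_url(api_key, url, steps):
    """Hypothetical helper: returns the full GET URL for a JS scenario request."""
    params = {
        "api_key": api_key,
        "url": url,
        # The scenario is a JSON string; urlencode percent-encodes it.
        "js_scenario": json.dumps({"steps": steps}),
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


request_url = build_request_url(
    "[your API key]",
    "https://example.com",
    [{"wait": 1000}, {"click_and_wait_for_navigation": "p > a"}],
)

# Decoding the query string again recovers the original scenario.
decoded = parse_qs(urlsplit(request_url).query)
print(json.loads(decoded["js_scenario"][0]))
```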

Steps

steps is a list of objects, each of which defines an action to execute in a JS scenario. Each object has exactly one key: the key is the name of the action to perform and the value is its argument. For example:

{
  "steps": [
    {
      "wait_for": "#button-id"
    },
    {
      "select": {
        "selector": "#select-id",
        "options": "value1"
      }
    },
    {
      "click": "#button-id"
    }
  ]
}

This scenario first waits until the #button-id element is available, then selects the value1 option from the select element (drop-down list) with the id select-id, and finally clicks the button.

The following section lists all the predefined actions which you can use as steps in a JS scenario.

If you need to execute custom JavaScript code, use the evaluate action.
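Since every step must be an object with a single key naming the action, it can be handy to check a scenario locally before sending it. The following is a minimal sketch of such a check. The function validate_scenario is hypothetical and runs only on the client; it is not something the API provides, and the action names are simply copied from this guide.

```python
import json

# Action names taken from the "Available actions" list in this guide.
KNOWN_ACTIONS = {
    "click", "click_if_exists", "click_and_wait_for_navigation",
    "input", "select", "set_local_storage", "scroll",
    "wait", "wait_for", "wait_for_any", "evaluate",
}


def validate_scenario(scenario):
    """Hypothetical client-side check; returns a list of problems found."""
    steps = scenario.get("steps")
    if not isinstance(steps, list):
        return ["'steps' must be a list"]
    problems = []
    for i, step in enumerate(steps):
        if not isinstance(step, dict) or len(step) != 1:
            problems.append(f"step {i}: expected an object with exactly one key")
            continue
        (action,) = step  # the single key is the action name
        if action not in KNOWN_ACTIONS:
            problems.append(f"step {i}: unknown action {action!r}")
    return problems


scenario = {
    "steps": [
        {"wait_for": "#button-id"},
        {"select": {"selector": "#select-id", "options": "value1"}},
        {"click": "#button-id"},
    ]
}

problems = validate_scenario(scenario)
if not problems:
    js_scenario = json.dumps(scenario)  # ready to pass as the js_scenario parameter
```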

Available actions

  • Name
    click
    Type
    string
    Description
    Clicks an element specified by a selector.
  • Name
    click_if_exists
    Type
    string
    Description
    Clicks an element specified by a selector, but only if the element exists; otherwise this step is skipped. It can be useful if you want to close a cookie banner or other modal window which does not appear every time.
  • Name
    click_and_wait_for_navigation
    Type
    string
    Description
    Clicks an element specified by a selector and waits for the navigation to complete.
  • Name
    input
    Type
    object
    Description
    Fills the given values into the input elements specified by selectors. The argument is an object mapping selectors to the desired input values. If the order in which the inputs are filled matters in your use case, specify each input field as a separate input action. You can optionally specify an option to "humanize" an input action. If set, actual key press events are sent. This is usually only necessary if the page handles keyboard events differently from ordinary input.
  • Name
    select
    Type
    object
    Description
    Selects option(s) from a given <select> element (drop-down list). The argument for this action must be an object with "selector" specifying the selector to find a desired <select> element and "options" (string or array) specifying the options. Selecting multiple options is supported by using an array instead of a string.
  • Name
    set_local_storage
    Type
    object
    Description
    Sets key/value pairs in localStorage. Each key of the provided object is set in localStorage with its corresponding value.
  • Name
    scroll
    Type
    integer
    Description
    Scrolls the web page vertically by a given number of pixels.
  • Name
    wait
    Type
    integer | object
    Description
    Waits for a given amount of time, specified in milliseconds. The argument for this action must be either a number (fixed wait) or a config object with min_ms and max_ms values, in which case the wait time is randomized within that range.
  • Name
    wait_for
    Type
    string | object
    Description
    Waits for an element specified by a selector to become visible (default) or attached. The argument for this action must be either a string containing a valid selector or an object with "selector" and "state" keys, where "state" is one of "visible" or "attached". If "state" is set to "visible" (default), the element you want to wait for must have a non-empty bounding box (i.e. no "display: none") and no "visibility: hidden". If you want to wait for an element to be present in the DOM (but not necessarily visible), use "state": "attached".
  • Name
    wait_for_any
    Type
    array[string | object]
    Description
    Waits for any of the specified elements to become visible (default) or attached. If you only need to wait for any of the elements to become visible, you can use the simpler form and provide just the selectors.
  • Name
    evaluate
    Type
    string
    Description
    Executes arbitrary JavaScript code. Use this special action if the predefined actions we provide don't fit your needs and you want to evaluate custom JavaScript.

Click

{
  "steps": [
    {"click": "#a-button"}
  ]
}

Click if exists

{
  "steps": [
    {"click_if_exists": "#a-button"}
  ]
}

Click and wait for navigation

{
  "steps": [
    {"click_and_wait_for_navigation": "#a-button"}
  ]
}

Input

{
  "steps": [
    {
      "input": {
        "#input1": "value1",
        "#input2": "value2"
      }
    }
  ]
}
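When the fill order matters, the description above suggests using one input action per field. A small sketch of how that could be generated from an ordered list of fields follows; the helper ordered_input_steps is hypothetical and only builds the scenario object.

```python
import json


def ordered_input_steps(fields):
    """Hypothetical helper: one input step per (selector, value) pair, in order."""
    return [{"input": {selector: value}} for selector, value in fields]


scenario = {
    "steps": ordered_input_steps([
        ("#input1", "value1"),  # filled first
        ("#input2", "value2"),  # filled second
    ])
}

js_scenario = json.dumps(scenario)
print(js_scenario)
```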

Select

{
  "steps": [
    {
      "select": {
        "selector": "#select1",
        "options": "1"
      }
    }
  ]
}
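To select several options at once, "options" can be an array instead of a string. The sketch below builds such a step in Python; the helper select_step and the option values "1" and "2" are illustrative assumptions.

```python
import json


def select_step(selector, options):
    """Hypothetical helper: options is one value (str) or several (list of str)."""
    if not isinstance(options, (str, list)):
        raise TypeError("options must be a string or a list of strings")
    return {"select": {"selector": selector, "options": options}}


# Selecting multiple options from the same <select> element.
scenario = {"steps": [select_step("#select1", ["1", "2"])]}

js_scenario = json.dumps(scenario)
print(js_scenario)
```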

Set localStorage values

{
  "steps": [
    {
      "set_local_storage": {
        "key1": "value1",
        "key2": "value2"
      }
    }
  ]
}

Scroll

{
  "steps": [
    {"scroll": 1000}
  ]
}

Wait for timeout

{
  "steps": [
    {"wait": 1000}
  ]
}
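The wait action also accepts a config object with min_ms and max_ms to randomize the wait time. A minimal sketch of building such a step is shown here; the helper random_wait_step and its range check are assumptions made for illustration and run on the client only.

```python
import json


def random_wait_step(min_ms, max_ms):
    """Hypothetical helper: wait step with a randomized duration in [min_ms, max_ms]."""
    if not 0 <= min_ms <= max_ms:
        raise ValueError("expected 0 <= min_ms <= max_ms")
    return {"wait": {"min_ms": min_ms, "max_ms": max_ms}}


# Wait somewhere between 0.5 and 1.5 seconds.
scenario = {"steps": [random_wait_step(500, 1500)]}

js_scenario = json.dumps(scenario)
print(js_scenario)
```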

Wait for selector

{
  "steps": [
    {"wait_for": "#some-button"}
  ]
}
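To wait for an element that only needs to be present in the DOM, the object form with "state": "attached" can be used. The sketch below picks the string form for the default state and the object form otherwise; the helper wait_for_step is hypothetical.

```python
import json


def wait_for_step(selector, state="visible"):
    """Hypothetical helper: builds a wait_for step for the given state."""
    if state not in ("visible", "attached"):
        raise ValueError('state must be "visible" or "attached"')
    if state == "visible":
        # "visible" is the default, so the plain selector string is enough.
        return {"wait_for": selector}
    return {"wait_for": {"selector": selector, "state": state}}


scenario = {"steps": [wait_for_step("#some-button", state="attached")]}

js_scenario = json.dumps(scenario)
print(js_scenario)
```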

Wait for any

{
  "steps": [
    {
      "wait_for_any": ["#some-button", "#some-other-button"]
    }
  ]
}
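Since the argument type is array[string | object], plain selectors and config objects can presumably be mixed. The sketch below assumes the object entries take the same "selector" and "state" keys as wait_for; that shape is an assumption, not something this guide states explicitly.

```python
import json

scenario = {
    "steps": [
        {
            "wait_for_any": [
                # Simple form: wait for this element to become visible.
                "#some-button",
                # Assumed object form, mirroring wait_for: present in the DOM is enough.
                {"selector": "#some-other-button", "state": "attached"},
            ]
        }
    ]
}

js_scenario = json.dumps(scenario)
print(js_scenario)
```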

Custom JavaScript evaluation

{
  "steps": [
    {
      "evaluate": "console.log('Hello from Scraping Fish!')"
    }
  ]
}
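Longer snippets contain quotes and newlines, which must be escaped inside the JSON string. Letting json.dumps handle that is less error-prone than escaping by hand. In this sketch the #cookie-banner selector and the snippet itself are made up for illustration.

```python
import json

# Hypothetical multi-line snippet; the selector is an assumption.
custom_js = """
const banner = document.querySelector("#cookie-banner");
if (banner) { banner.remove(); }
"""

scenario = {"steps": [{"evaluate": custom_js.strip()}]}

# json.dumps escapes the double quotes and newlines in the JavaScript source.
js_scenario = json.dumps(scenario)
print(js_scenario)
```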

Timeout

All the steps from your JavaScript scenario must complete within a single trial timeout; otherwise, the request will time out.
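Explicit wait steps count towards that limit, so it can help to add them up before sending a scenario. The sketch below computes a lower bound on the time a scenario needs from its wait steps alone. The helper minimum_wait_ms is hypothetical, and ASSUMED_TIMEOUT_MS is a placeholder, not the actual timeout value, which this guide does not state.

```python
def minimum_wait_ms(steps):
    """Hypothetical helper: sums fixed waits and the min_ms of randomized waits."""
    total = 0
    for step in steps:
        wait = step.get("wait")
        if isinstance(wait, dict):
            total += wait["min_ms"]  # randomized wait: count its lower bound
        elif isinstance(wait, (int, float)):
            total += wait  # fixed wait
    return total


steps = [
    {"wait": 1000},
    {"click": "#a-button"},
    {"wait": {"min_ms": 500, "max_ms": 1500}},
]

ASSUMED_TIMEOUT_MS = 30_000  # placeholder value, not taken from the API documentation

lower_bound = minimum_wait_ms(steps)
if lower_bound >= ASSUMED_TIMEOUT_MS:
    print("Scenario waits alone exceed the assumed timeout")
```

Clicks, navigation, and selector waits take additional time, so the real duration is always higher than this lower bound.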