JS Scenario

In this guide, we will look at how to use JS Scenario to perform activities on the scraped website.

Scraping Fish API allows you to specify a series of steps to execute once the page is loaded. You can use it, for example, to click a button or fill in a form. The steps are passed as JSON in the js_scenario query parameter.

Example

To give you an idea of how you can use this feature, here is an example scenario which, once the page is loaded, waits for 1 second (1000 ms), clicks the element matched by the p > a CSS selector, and waits for navigation to complete.

Execute example JS Scenario

GET
/api/v1/
import requests
import json

payload = {
  "api_key": "[your API key]",
  "url": "https://example.com",
  "js_scenario": json.dumps({
    "steps": [
      {"wait": 1000},
      {"click_and_wait_for_navigation": "p > a"}
    ]
  })
}

response = requests.get("https://scraping.narf.ai/api/v1/", params=payload)
print(response.content)

Response


<!doctype html>
<html>
<head>
	<title>Example Domains</title>

</head>
<body>

</body>
</html>

Steps

The top-level key of the JS Scenario JSON must be steps. Its value is an array of objects that define the actions to execute in sequence. Each object has exactly one key, the name of the action to perform, and the value of that key is the action's argument. For example:

{
  "steps": [
    {
      "wait_for": "#button-id"
    },
    {
      "select": {
        "selector": "#select-id",
        "options": "value1"
      }
    },
    {
      "click": "#button-id"
    }
  ]
}

This scenario first waits until the #button-id element is available, then selects the value1 option from the <select> element (drop-down list) with the #select-id id, and finally clicks the button.
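
The structure described above can be sketched in Python as follows. This is a minimal illustration, not part of the Scraping Fish API: build_js_scenario is a hypothetical helper name, and the check it performs (each step is an object with exactly one key) only mirrors the rule stated in this section.

```python
import json


def build_js_scenario(steps):
    """Serialize steps into the JSON string passed as js_scenario.

    Hypothetical helper: it only checks that every step is an object
    with exactly one key (the action name), as described above.
    """
    for index, step in enumerate(steps):
        if not isinstance(step, dict) or len(step) != 1:
            raise ValueError(f"step {index} must be an object with exactly one key")
    return json.dumps({"steps": steps})


scenario = build_js_scenario([
    {"wait_for": "#button-id"},
    {"select": {"selector": "#select-id", "options": "value1"}},
    {"click": "#button-id"},
])

# Query parameters in the same shape as the request example earlier in this guide.
params = {
    "api_key": "[your API key]",
    "url": "https://example.com",
    "js_scenario": scenario,
}
```

Validating locally before sending the request can surface a malformed step without spending an API call.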

The following section lists all predefined actions which you can use as steps in a JS Scenario.

Available actions

  • Name
    click
    Type
    string
    Description

    Clicks an element specified by a selector.

    Click

    {
      "steps": [
        {"click": "#a-button"}
      ]
    }
    
  • Name
    click_if_exists
    Type
    string
    Description

    Clicks an element specified by a selector only if the element exists; otherwise this step is skipped. It can be useful if you want to close a cookie banner or other modal window which does not appear every time.

    Click if exists

    {
      "steps": [
        {"click_if_exists": "#a-button"}
      ]
    }
    
  • Name
    click_and_wait_for_navigation
    Type
    string
    Description

    Clicks an element specified by a selector and waits for the navigation to complete.

    Click and wait for navigation

    {
      "steps": [
        {"click_and_wait_for_navigation": "#a-button"}
      ]
    }
    
  • Name
    input
    Type
    object
    Description

    Fills the input elements specified by selectors with the given values. The argument is an object mapping selectors to the desired input values. If the order in which the inputs are filled matters in your use case, specify each input field as a separate input action. You can optionally specify an option to "humanize" an input action. If set, actual key press events are sent. This should only be necessary if the page handles keyboard events differently from regular input.

    Input

    {
      "steps": [
        {
          "input": {
            "#input1": "value1",
            "#input2": "value2"
          }
        }
      ]
    }
    
  • Name
    select
    Type
    object
    Description

    Selects option(s) from a given <select> element (drop-down list). The argument for this action must be an object with "selector" specifying the selector to find a desired <select> element and "options" (string or array) specifying the options. Selecting multiple options is supported by using an array instead of a string.

    Select

    {
      "steps": [
        {
          "select": {
            "selector": "#select1",
            "options": "1"
          }
        }
      ]
    }
    
  • Name
    set_local_storage
    Type
    object
    Description

    Sets key/value pairs in localStorage. Each key of the provided object is set in localStorage with its corresponding value.

    Set localStorage values

    {
      "steps": [
        {
          "set_local_storage": {
            "key1": "value1",
            "key2": "value2"
          }
        }
      ]
    }
    
  • Name
    scroll
    Type
    integer
    Description

    Scrolls the web page vertically by a given number of pixels.

    Scroll

    {
      "steps": [
        {"scroll": 1000}
      ]
    }
    
  • Name
    wait
    Type
    integer | object
    Description

    Waits for a fixed amount of time, specified in milliseconds. The argument for this action must be either a number or a config object. To randomize the wait time, pass a config object with min_ms and max_ms values that define the range.

    Wait for timeout

    {
      "steps": [
        {"wait": 1000}
      ]
    }
    
  • Name
    wait_for
    Type
    string | object
    Description

    Waits for an element specified by a selector to become visible (default) or attached. The argument for this action must be either a string containing a valid selector or an object with "selector" and "state" keys, where "state" is one of "visible" or "attached". If "state" is set to "visible" (default), the element you want to wait for must have a non-empty bounding box (i.e. no "display: none") and no "visibility: hidden". If you want to wait for an element to be present in the DOM (but not necessarily visible), use "state": "attached".

    Wait for selector

    {
      "steps": [
        {"wait_for": "#some-button"}
      ]
    }
    
  • Name
    wait_for_any
    Type
    array[string | object]
    Description

    Waits for any of the specified elements to become visible (default) or attached. If you only need to wait for any of the specified elements to become visible, you can use the simpler form and provide just the selectors.

    Wait for any

    {
      "steps": [
        {
          "wait_for_any": ["#some-button", "#some-other-button"]
        }
      ]
    }
    
  • Name
    evaluate
    Type
    string
    Description

    If the predefined actions we provide don't fit your needs and you want to evaluate custom JavaScript, this is a special action which you can use to execute arbitrary JavaScript code.

    Custom JavaScript evaluation

    {
      "steps": [
        {
          "evaluate": "console.log('Hello from Scraping Fish!')"
        }
      ]
    }
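
Several actions above accept an object or array argument that the short examples do not show. The sketch below collects those forms in one scenario, using only the keys named in the descriptions ("selector", "options", "min_ms", "max_ms", "state"). The selectors and values are placeholders, and the "humanize" option is left out because its exact key is not given here.

```python
import json

steps = [
    # Separate input actions, so the fields are filled in this order.
    {"input": {"#input1": "value1"}},
    {"input": {"#input2": "value2"}},
    # An array of options selects multiple values from one <select> element.
    {"select": {"selector": "#select1", "options": ["1", "2"]}},
    # Random wait: a duration between min_ms and max_ms milliseconds.
    {"wait": {"min_ms": 500, "max_ms": 1500}},
    # Wait for the element to be present in the DOM, visible or not.
    {"wait_for": {"selector": "#some-button", "state": "attached"}},
]

# JSON string to pass as the js_scenario query parameter.
js_scenario = json.dumps({"steps": steps})
print(js_scenario)
```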
    

Timeout

All the steps from your JavaScript scenario must complete within a single trial timeout; otherwise, the request will time out.
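
One rough way to catch scenarios that cannot fit is to add up the wait steps before sending the request. The sketch below is only an illustration: worst_case_wait_ms is a hypothetical helper, and budget_ms is a placeholder value, not the actual trial timeout. The result is a lower bound, because clicks, navigation, and wait_for steps also take time.

```python
def worst_case_wait_ms(steps):
    """Sum the longest possible duration of all wait steps, in milliseconds."""
    total = 0
    for step in steps:
        wait = step.get("wait")
        if isinstance(wait, dict):
            # Random wait: count the upper end of the range.
            total += wait["max_ms"]
        elif isinstance(wait, (int, float)):
            total += wait
    return total


steps = [
    {"wait": 1000},
    {"click": "#a-button"},
    {"wait": {"min_ms": 500, "max_ms": 1500}},
]

budget_ms = 30_000  # placeholder: substitute the timeout that applies to your requests
total_wait_ms = worst_case_wait_ms(steps)
if total_wait_ms > budget_ms:
    raise ValueError(f"wait steps alone take up to {total_wait_ms} ms")
```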