
Introducing JavaScript scenario execution

Paweł Kobojek
Mateusz Buda

Execute JavaScript steps on scraped website

Today, we are pleased to introduce a much-awaited feature: JavaScript scenario execution! Many of our customers have asked for a way to interact with the scraped website by clicking buttons, filling out forms, selecting <select> options, and so on. This is now possible with Scraping Fish API, without compromising our commitment to keeping the product as simple to use as possible.

Why do I need it?

In certain web scraping situations you might want to not only load a website but also, e.g., select an option, wait for data to be loaded and click a button which is only enabled after performing some action. Or maybe you need to fill out a form before the data you desire is available. In all these cases JavaScript scenario execution is your friend.

What is a JavaScript scenario?

A JavaScript scenario is a series of steps which are executed after the initial page is loaded, one after another. Possible actions include clicking on an element specified by a selector, waiting for an element to appear or filling out an input field. For a full list and detailed explanation, please refer to our documentation. For example:

{
  "steps": [
    {"scroll": 1000},
    {"wait_for": "#element-id"},
    {"select": {
      "selector": "#the-select-element",
      "options": ["option1", "option2"]
    }},
    {"click_and_wait_for_navigation": "#button-id"}
  ]
}

You pass the steps as a URL-encoded JSON object in the js_scenario query parameter.
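As a minimal sketch of that encoding step, the snippet below serializes the example scenario to JSON and percent-encodes it with Python's standard library. The helper name encode_scenario is our own choice for illustration and is not part of the API; the compact JSON separators are likewise just a way to keep the query string short.

```python
import json
from urllib.parse import quote_plus, unquote_plus


def encode_scenario(scenario: dict) -> str:
    """Serialize a scenario to compact JSON and percent-encode it.

    Hypothetical helper for illustration; the result is meant to be used
    as the value of the js_scenario query parameter.
    """
    compact_json = json.dumps(scenario, separators=(",", ":"))
    return quote_plus(compact_json)


scenario = {
    "steps": [
        {"scroll": 1000},
        {"wait_for": "#element-id"},
        {"select": {
            "selector": "#the-select-element",
            "options": ["option1", "option2"],
        }},
        {"click_and_wait_for_navigation": "#button-id"},
    ]
}

encoded = encode_scenario(scenario)
print(encoded)

# Decoding the value gives back the original scenario unchanged.
assert json.loads(unquote_plus(encoded)) == scenario
```

If you pass the parameters to requests via its params argument instead of building the query string by hand, requests performs the percent-encoding itself, so you would hand it the plain json.dumps output rather than the already encoded string.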

Real life example

For a quick showcase of what you can do with this new feature, we will scrape a single room from booking.com. Please note that this code works as of the day of this post's publication and may break if the website changes. For the sake of example, we will check the price of an arbitrary hostel. To find out the exact price of the room, as well as how much it would cost to cancel the reservation, we need to:

  1. Navigate to the hostel's URL.
  2. Select how many beds we need.
  3. Click a button and wait for navigation so that the desired information is available.

These steps are depicted below.

Thanks to the new feature, all this can be done by invoking a single request to Scraping Fish API.

import requests
import json
import datetime
from urllib.parse import quote_plus

checkin_date = datetime.date.today() + datetime.timedelta(days=30)
checkout_date = checkin_date + datetime.timedelta(days=1)

checkin = checkin_date.strftime("%Y-%m-%d")
checkout = checkout_date.strftime("%Y-%m-%d")

url = f"https://www.booking.com/hotel/pl/la-guitarra-hostel-gdansk.html?aid=304142&label=gen173nr-1FCAEoggI46AdIM1gEaLYBiAEBmAEeuAEHyAEP2AEB6AEB-AELiAIBqAIDuAKb8ZKUBsACAdICJDJmNDliMzMxLWIzYjEtNDQyNi1iNTdjLTljODIxNjg1M2Y3YdgCBuACAQ&sid=85334a8861a557bc4c65d6eabd08e1fe&all_sr_blocks=32277903_118755301_0_0_&checkin={checkin}&checkout={checkout}&dest_id=-501400&dest_type=city&group_adults=1&group_children=0&hapos=1&highlighted_blocks=32277903_118755301_0_0_0&hpos=1&lang=en-us&matching_block_id=32277903_118755301_0_0_0&no_rooms=1&req_adults=1&req_children=0&room1=A&sb_price_type=total&soz=1&sr_order=popularity&sr_pri_blocks=32277903_118755301_0_0_0__12452&srepoch=1652865196&srpvid=42cc40d5eaa90626&type=total&ucfs=1&lang_click=other&cdl=pl&lang_changed=1"

scenario = {
    "steps": [
        {"select": {"selector": "#hprt_nos_select_32277903_118755301_0_0_0", "options": "1"}},
        {"click": ".txp-bui-main-pp"},
        {"wait_for": ".bp-card--cancellation-schedule"}
        # You can also replace the two lines above with {"click_and_wait_for_navigation": ".txp-bui-main-pp"}
    ]
}

# Percent-encode both values so they can be embedded safely in the query string
url = quote_plus(url)
scenario = quote_plus(json.dumps(scenario))

api_key = "[your api key]"
response = requests.get(f"https://scraping.narf.ai/api/v1/?api_key={api_key}&url={url}&js_scenario={scenario}")

print(response.content) # extract whatever info you want from this

And that's all you need. Simple as always.
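If you then want to pull a specific piece of the returned HTML, here is a minimal sketch that uses only Python's standard library to collect the text inside elements with a given CSS class, such as the bp-card--cancellation-schedule class the scenario waits for. The ClassTextExtractor and extract_text names are hypothetical, the sample markup is made up for illustration and is not Booking.com's real HTML, and the depth tracking assumes reasonably well-formed markup in which non-void elements are explicitly closed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect depth tracking.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class ClassTextExtractor(HTMLParser):
    """Collects the text inside every element carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0    # greater than 0 while inside a matching element
        self.chunks = []  # non-empty text fragments found so far

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.css_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, css_class):
    """Return the text of all elements with css_class, joined by spaces."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Made-up sample markup, standing in for response.text from the request above.
sample = (
    '<div class="bp-card--cancellation-schedule">'
    "<p>Free cancellation<br>until check-in</p></div><p>other</p>"
)
print(extract_text(sample, "bp-card--cancellation-schedule"))
```

For anything beyond a quick check, a dedicated HTML parsing library with CSS selector support is likely a more robust choice than hand-rolled depth tracking.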

Try it yourself!

This was a quick showcase of how you can use JavaScript scenario execution to extract the data your business needs. The possibilities are endless!

Want to apply our new feature to your use case? You can start using Scraping Fish API for just $2!