How Is Web Scraping Used to Scrape Websites with Infinite Scrolling?

February 4, 2022

For example, suppose you want to scrape 100 Flipkart products from every category, but a simple page request only returns the top 15 products on the page. Flipkart uses infinite scrolling, which removes the need for pagination parameters (such as ?page=2, ?page=3) in the URL. If the site had such parameters, we could put the page value into a "while" loop and increase it on each pass, as shown below.

page_count = 0
while page_count < 5:
    # Build the URL for the current page number
    url = "http://example.com/?page=%d" % page_count
    # scraping code...
    page_count += 1

So, let's go back to infinite scrolling.

Infinite scrolling is usually implemented with Ajax. As you scroll, the page sends a background request to a URL, and the additional products shown on the same page are loaded from that URL.

To find that URL:

  • Launch Google Chrome and navigate to the page.
  • Then open the console, right-click inside it, and choose "Log XMLHttpRequests".
  • Refresh the page and scroll slowly. Each time new products are loaded, the console shows a URL on a line that begins with "XHR finished loading: GET." You can click on these URLs. Flipkart issues several kinds of such requests. The one you want begins with "flipkart.com/lc/pr/pv1/spotList1/spot1/productList?p=blahblahblah&lots of crap."
  • When you left-click the URL, the request is highlighted in the Network tab of Chrome's developer tools. You can then copy that link or open it in a new window.

When you open the link in a new window, you'll find a page with only about 15 to 20 products on it.


Analyze the URL; it contains a GET parameter called ?start= followed by a number.
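One way to change that parameter from code is sketched below with Python's standard urllib.parse module. The helper name with_start and the example.com URL are assumptions made for illustration; they are not the real Flipkart endpoint.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_start(url, start):
    """Return `url` with its `start` query parameter set to `start`."""
    parts = urlsplit(url)
    # Keep every other parameter untouched and in its original order
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "start"
    ]
    params.append(("start", str(start)))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical Ajax URL used only for illustration
ajax_url = "http://example.com/productList?p=abc&start=40"
print(with_start(ajax_url, 0))
# http://example.com/productList?p=abc&start=0
```

Rebuilding the query string this way avoids editing the URL by hand, which matters when the URL carries many other parameters.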

  • Then, for the first 20 products, change the number to 0; for the next 20 products, set it to 20. If there are 15 products per page, set the number to 0, 15, 30, and so on (this assumes the parameter is a zero-based offset, so check it against the responses you get). Iterate over this URL in a while loop, as demonstrated earlier, and you're done.
  • When you right-click and view the page source of that URL, you'll notice an <img> tag with a data-src="" attribute; this is your product picture. This is simply an example from Flipkart.com. Different websites may have different Ajax URLs and different GET parameters.
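The two steps above can be sketched with the standard library alone. This is a minimal example, assuming that start is a zero-based offset and that each product image sits in an <img> tag with a data-src attribute. The names DataSrcParser, extract_image_urls, and start_offsets are made up for illustration, and the sample HTML stands in for a real response.

```python
from html.parser import HTMLParser


class DataSrcParser(HTMLParser):
    """Collect the data-src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("data-src")
            if src:
                self.image_urls.append(src)


def extract_image_urls(html):
    """Return all data-src values found in the given HTML text."""
    parser = DataSrcParser()
    parser.feed(html)
    return parser.image_urls


def start_offsets(per_page, pages):
    """Offsets 0, per_page, 2 * per_page, ... for the start parameter."""
    return [page * per_page for page in range(pages)]


# Stand-in for the HTML returned by one Ajax URL
sample_html = '<div><img data-src="http://example.com/a.jpg"><img src="logo.png"></div>'
print(extract_image_urls(sample_html))  # ['http://example.com/a.jpg']
print(start_offsets(15, 3))             # [0, 15, 30]
```

In a real run, each offset would be placed into the Ajax URL, the page fetched, and the response text passed to extract_image_urls.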

Some websites' Ajax URLs return JSON responses instead of HTML. If you find one of these, you don't need to scrape any markup; simply read the JSON response as you would with any other JSON API.
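As a rough illustration, a JSON response can be read with Python's json module. The response body and the "products", "title", and "image" keys below are hypothetical; a real site will use its own structure, which you can inspect in the Network tab.

```python
import json

# Stand-in for the body of a JSON Ajax response (hypothetical structure)
sample_body = (
    '{"products": ['
    '{"title": "Phone", "image": "http://example.com/p.jpg"}, '
    '{"title": "Case", "image": "http://example.com/c.jpg"}'
    ']}'
)


def product_titles(body):
    """Parse a JSON response body and return the product titles in it."""
    data = json.loads(body)
    # Fall back to an empty list if the assumed "products" key is missing
    return [item["title"] for item in data.get("products", [])]


print(product_titles(sample_body))  # ['Phone', 'Case']
```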

If you are looking for a web scraping service for websites that use infinite scrolling, contact Scraping Intelligence.
