What do we mean by "infinite scroll"?
On some websites, instead of links to page 1, page 2, page 3 and so on, more items appear as you scroll down the page.
What's the problem?
Infinite scroll can be a tricky one, because the URL generally remains static (it doesn't change, even when you're on a different page).
Websites also handle this in structurally different ways, so it's not always possible to get around.
But... here are some tips and tricks to hack the URL.
How can you get around this?
There are a few ways to find the URL pattern a website really uses to paginate.
Using the example:
First, before you scroll down, right-click>inspect and click on the Network tab.
Then, clear any existing activity by hitting the clear button next to the red circle on the left-hand side.
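Scroll down until more items appear and look at the Request URL of the new "xhr" action. The ?pg=4 at the end is the URL parameter that corresponds to the page number.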
If you go directly to this URL it skips straight to those items - now we know how the website really paginates!
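If you'd rather generate the list of page URLs than type them out, here is a minimal Python sketch. It assumes the page number lives in a query parameter called pg, as in the example above. The example.com address is only a placeholder for the Request URL you copied from the Network tab, and page_urls is a made-up helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_urls(request_url, param="pg", first=1, last=5):
    """Return copies of request_url with `param` set to each page number."""
    parts = urlsplit(request_url)
    # Keep every other query parameter exactly as the site sent it.
    other_params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    urls = []
    for page in range(first, last + 1):
        new_query = urlencode(other_params + [(param, str(page))])
        urls.append(urlunsplit(parts._replace(query=new_query)))
    return urls


# Placeholder URL standing in for the one copied from the Network tab.
for url in page_urls("https://www.example.com/products?pg=4", last=3):
    print(url)
```

Change first and last to cover as many pages as the site has. If a page number returns no items, you have probably gone past the last page.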
So create your Extractor to:
As I mentioned, websites are built in many different ways. Here is another example.
Beginning with the same methodology:
Right-click>inspect and click on the Network tab.
Clear any existing activity by hitting the clear button next to the red circle on the left-hand side.
Scroll down on the page until more items appear.
Look for an action with the type "xhr".
This time the Request URL looks like it points to some kind of PHP script that uses a form to make a POST request. You don't need to understand the details, just that we can't use this URL.
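To make the difference between the two examples concrete, here is a rough Python sketch that checks whether a request seen in the Network tab carries a page number in its URL. The parameter names in PAGE_HINTS are guesses at common conventions, not a definitive list, and both URLs are placeholders rather than real sites.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed names that sites commonly use for a page number; extend as needed.
PAGE_HINTS = ("pg", "page", "p", "offset", "start")


def reusable_page_param(method, request_url):
    """Return the page-like query parameter of a GET request, or None."""
    if method.upper() != "GET":
        # A form POST typically sends its data in the request body,
        # so there is nothing in the URL to change.
        return None
    for name, value in parse_qsl(urlsplit(request_url).query):
        if name.lower() in PAGE_HINTS and value.isdigit():
            return name
    return None


# Placeholder requests: the first is like the ?pg=4 example, the second like the PHP script.
print(reusable_page_param("GET", "https://www.example.com/products?pg=4"))
print(reusable_page_param("POST", "https://www.example.com/ajax/load.php"))
```

A result of None means the Request URL on its own won't get you to the next set of items, which is why the next step looks in the HTML instead.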
Right-click on one of the items that appears after scrolling down and click inspect. This time, look at the Elements tab rather than the Network tab.
This should take you to the location of that item in the HTML. You're looking for anything that says page, pagination, scroll, scrolling, lazyload and so on. (Alternatively, you can use command + f to search directly for these terms.)
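If you have saved the page's HTML, the same search can be sketched in Python with the standard library's html.parser. This is only an illustration: the keyword list mirrors the terms above, and the sample HTML, the PaginationHintFinder class and the find_pagination_hints helper are all made up for the example.

```python
from html.parser import HTMLParser

# The terms suggested above; "scroll" also matches "scrolling".
KEYWORDS = ("page", "pagination", "scroll", "lazyload")


class PaginationHintFinder(HTMLParser):
    """Collect (tag, attribute, value) triples that mention a keyword."""

    def __init__(self):
        super().__init__()
        self.hints = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            text = f"{name}={value or ''}".lower()
            if any(word in text for word in KEYWORDS):
                self.hints.append((tag, name, value))


def find_pagination_hints(html):
    finder = PaginationHintFinder()
    finder.feed(html)
    return finder.hints


# Made-up snippet standing in for the HTML around a lazily loaded item.
sample = '<div class="items" data-next-page="/blog/page/2/"><a href="/about">About</a></div>'
print(find_pagination_hints(sample))
```

Any value that looks like a path or URL, such as /blog/page/2/ in this sample, is a good candidate for the real pagination pattern.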
Create your Extractor to:
and add the URLs: