... it gives me errors 426, 401 for some websites especially HTTPS.
426 is "Upgrade Required": the destination server refuses to serve the request over the current protocol and wants the client to switch (typically to TLS/HTTPS or a newer protocol version).
401 is "Unauthorized": the server requires authentication credentials.
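As a quick illustration of telling those two failures apart in your own code, here is a minimal sketch (the helper names and remedy strings are mine, not from any particular library):

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# Map the status codes discussed above to a likely remedy.
REMEDIES = {
    426: "Upgrade Required: retry over HTTPS / a newer protocol version",
    401: "Unauthorized: supply credentials (e.g. Basic auth or a session cookie)",
}

def explain_status(code: int) -> str:
    """Return a human-readable hint for a failing HTTP status code."""
    return REMEDIES.get(code, f"HTTP {code}: see RFC 9110 for its semantics")

def fetch(url: str) -> bytes:
    """Fetch a page, translating 426/401 into readable errors."""
    try:
        with urlopen(url) as resp:
            return resp.read()
    except HTTPError as err:
        raise RuntimeError(explain_status(err.code)) from err
```

For a 426 the quickest fix is often just changing `http://` to `https://` in the URL before retrying.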
You may want to try another scraper, but since cloud-based means using someone else's server resources, they are likely to ask for some $ at some point. There is a list of "cloud-based web scraping solutions" at:
You can find some free tiers there. Note, though, that cloud scraper prices can be steep compared with hosting your own.
The most efficient way is, of course, running your own web scraper on a server under your control, even if the hosting/server fee itself is free. The important part is having the ability to modify the scraper to account for HTML changes, and staying flexible in how you interact with the destination server as time goes by.
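To show how little code "your own scraper" needs, here is a sketch using only Python's standard-library `html.parser` (the tag/class names are hypothetical examples): when the site's HTML changes, you only adjust `TARGET_TAG`/`TARGET_CLASS`, which is exactly the flexibility a hosted black box denies you.

```python
from html.parser import HTMLParser

class PriceScraper(HTMLParser):
    """Collect the text of <span class="price"> elements.

    TARGET_TAG / TARGET_CLASS are the only things to edit when the
    destination site's markup changes.
    """
    TARGET_TAG = "span"
    TARGET_CLASS = "price"

    def __init__(self):
        super().__init__()
        self._capture = False
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag == self.TARGET_TAG and ("class", self.TARGET_CLASS) in attrs:
            self._capture = True

    def handle_data(self, data):
        if self._capture:
            self.results.append(data.strip())
            self._capture = False

html = '<div><span class="price">$9.99</span><span>skip me</span></div>'
scraper = PriceScraper()
scraper.feed(html)
print(scraper.results)  # -> ['$9.99']
```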
How frequently do you need to check? (daily, hourly, other)
When you say "monitor a webpage for changes", do you mean any change on the page, or changes to a certain portion of its contents?
...You can actually get away with using only the "Last-Modified" header in the first case.
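A minimal sketch of that approach (function names are mine; be aware that some servers omit the Last-Modified header or report it inaccurately, in which case you fall back to scraping):

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen

def last_modified_changed(previous: str, current: str) -> bool:
    """Compare two Last-Modified header values as timestamps."""
    return parsedate_to_datetime(current) > parsedate_to_datetime(previous)

def fetch_last_modified(url: str) -> str:
    """Issue a cheap HEAD request and return the Last-Modified header."""
    req = Request(url, method="HEAD")
    with urlopen(req) as resp:
        return resp.headers.get("Last-Modified", "")

# Offline demonstration with two example header values:
old = "Wed, 01 May 2024 10:00:00 GMT"
new = "Thu, 02 May 2024 10:00:00 GMT"
print(last_modified_changed(old, new))  # -> True
```

A HEAD request transfers no body, so polling this way is far cheaper than re-downloading the whole page on every check.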
If it has to do with monitoring specific contents, then yes, a regular web scraper is due.
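For the specific-contents case, one common trick (sketched below with hypothetical landmark strings) is to hash only the slice of the page you care about, so churn elsewhere, like rotating ads, does not trigger false alarms:

```python
import hashlib

def portion_fingerprint(page_html: str, start: str, end: str) -> str:
    """Hash only the slice between two landmark strings.

    start/end are markers around the content being watched; changes
    anywhere else on the page are ignored.
    """
    i = page_html.index(start) + len(start)
    j = page_html.index(end, i)
    return hashlib.sha256(page_html[i:j].encode()).hexdigest()

page_v1 = "<header>ad 123</header><main>price: $5</main>"
page_v2 = "<header>ad 456</header><main>price: $5</main>"
page_v3 = "<header>ad 456</header><main>price: $6</main>"

fp = lambda p: portion_fingerprint(p, "<main>", "</main>")
print(fp(page_v1) == fp(page_v2))  # -> True  (only the ad changed)
print(fp(page_v1) == fp(page_v3))  # -> False (the watched portion changed)
```

Storing just the fingerprint between runs keeps the monitor's state tiny: one hash per watched page.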
BTW, I'm coding a "Webpage to Address book" program right now, so I may be able to help: its first stage, parsing data from the web, is essentially scraping.