Topic: navigate webpage results automatically and save them  (Read 13720 times)

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
navigate webpage results automatically and save them
« on: August 27, 2008, 10:20 PM »
hello

first, I am talking about webpages that have 1, 2, 3, next etc. links (like google results)
I need a 'bot' that will click 'next' on a webpage, go to the next page, save it, then click next again, save that page, and so on

is there anything like this?

thanks

PS: any browser will do, I have no specific preference

jgpaiva

  • Global Moderator
  • Joined in 2006
  • Posts: 4,727
Re: navigate webpage results automatically and save them
« Reply #1 on: August 28, 2008, 03:19 PM »
I think you can use HTTrack for that. Just pass it the page's address and configure it to download all the pages linked from that one at a depth of '1'.
If the page has other links not related to the search, they will be downloaded too, but I suppose you could delete those manually. I think HTTrack can ignore domains, so if those other pages are all in the same domain (the domain of the original page), you could just ignore that one and you'd get only the interesting pages ;)

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
Re: navigate webpage results automatically and save them
« Reply #2 on: August 28, 2008, 05:43 PM »
the problem is that I need to do this inside the web browser, because the website needs authentication, which is not easy to achieve in webpage offline downloaders (it is not webpages in http://user:[email protected] format, but it requires web form authentication)

4wd

  • Supporting Member
  • Joined in 2006
  • Posts: 5,643
Re: navigate webpage results automatically and save them
« Reply #3 on: August 28, 2008, 07:50 PM »
the problem is that I need to do this inside the web browser, because the website needs authentication, which is not easy to achieve in webpage offline downloaders (it is not webpages in http://user:[email protected] format, but it requires web form authentication)

Sounds like a job for GreaseMonkey, AutoIt, or AutoHotkey, but unless you're willing to provide some details I don't think anyone will be able to help:

e.g.

GreaseMonkey - you would need to provide access to the site so that someone can write a userscript that performs the actions you want.
AutoIt/AutoHotkey - you might get away with providing a screenshot of the site as a reference for mouse movements/actions and/or key input.

I think these are the most likely automated options barring a dedicated program.

If the website is using a form for verification then it most likely sets a cookie and you could use a website downloader that can use the cookie.
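
A rough sketch of the cookie idea, in Python rather than any tool named in this thread. It assumes the site recognises a logged-in user by a single session cookie that you copy out of the browser after logging in through the form. The cookie name `sessionid`, its value, and the example.com URL are placeholders, not details of the actual site.

```python
import urllib.request


def build_request(url, cookie_name, cookie_value):
    """Return a Request that carries the browser's session cookie."""
    # Assumption: one cookie is enough for the site to treat us as logged in.
    headers = {"Cookie": f"{cookie_name}={cookie_value}"}
    return urllib.request.Request(url, headers=headers)


def fetch(url, cookie_name, cookie_value):
    """Download one page as bytes, sending the session cookie with the request."""
    request = build_request(url, cookie_name, cookie_value)
    with urllib.request.urlopen(request) as response:
        return response.read()


# Hypothetical values; copy the real cookie from the browser after logging in.
example = build_request("http://example.com/results?start=10", "sessionid", "abc123")
```

Calling `fetch(...)` with the real URL and cookie would return the page bytes. Sites that also check other cookies or headers would need those added to `headers` as well.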

Try Firefox with DownThemAll! - it can supposedly download all links on a page.
« Last Edit: August 28, 2008, 08:02 PM by 4wd »

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
Re: navigate webpage results automatically and save them
« Reply #4 on: August 28, 2008, 11:49 PM »
unless you're willing to provide some details I don't think anyone will be able to help

let's say I search in google for 'something' and it returns a webpage that displays the google results, where at the bottom there is 1, 2, 3, 4, next

each of the numbered google results pages has a url like this:
http://www.google.co...mething&start=10
http://www.google.co...mething&start=20
etc

what I want to do is to save the google results webpage (the one with the numbers at the bottom), then click to go to the next google results webpage, save, go to next, save, etc (in other words I need to save all the webpages of the above mentioned urls)

all the above must be done within the web browser, because the website needs me to first authenticate via a web form


lanux128

  • Global Moderator
  • Joined in 2005
  • Posts: 6,277
Re: navigate webpage results automatically and save them
« Reply #5 on: August 29, 2008, 01:29 AM »
you can use Repagination to combine all the pages into one and then save. just a thought. :)

sc_08-08-29_001.png
https://addons.mozil...S/firefox/addon/2099

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
Re: navigate webpage results automatically and save them
« Reply #6 on: August 29, 2008, 05:51 AM »
very interesting!

you can do miracles with JavaScript and Greasemonkey, but unfortunately it's hard to code and there are not many JS developers

I will test it asap, thanks

sri

  • Honorary Member
  • Joined in 2006
  • Posts: 689

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
Re: navigate webpage results automatically and save them
« Reply #8 on: August 29, 2008, 07:49 AM »
it works, but for the 400+ webpages results that I need to save... it will crash the browser

a web navigation automation script or bot would be the ultimate solution

is there any?

lanux128

  • Global Moderator
  • Joined in 2005
  • Posts: 6,277
Re: navigate webpage results automatically and save them
« Reply #9 on: August 29, 2008, 09:25 AM »
it works, but for the 400+ webpages results that I need to save... it will crash the browser

wow, that is a lot of pages. :) there is one other add-on that i have in my bookmarks but haven't tried it before.

SC_2008-08-29_001.png
https://addons.mozil.../firefox/addon/3262/

lanux128

  • Global Moderator
  • Joined in 2005
  • Posts: 6,277
Re: navigate webpage results automatically and save them
« Reply #10 on: August 29, 2008, 09:43 AM »
i totally forgot about this - iMacros for Firefox. :)

SC_2008-08-29_002.png
http://www.iopus.com/imacros/firefox/

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
Re: navigate webpage results automatically and save them
« Reply #11 on: August 31, 2008, 09:41 PM »
unfortunately macros won't work, because when I try to save each webpage of the results, the filename is the same every time

is there a way to auto-rename them?
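
One way to avoid the clash, sketched in Python rather than in any macro tool: number the saved files so each page gets its own name. The `results` prefix and `.html` extension are arbitrary choices for illustration.

```python
from pathlib import Path


def unique_name(directory, base="results", ext=".html"):
    """Return the first path like results_001.html that does not exist yet."""
    directory = Path(directory)
    n = 1
    while True:
        candidate = directory / f"{base}_{n:03d}{ext}"
        if not candidate.exists():
            return candidate
        n += 1


def save_page(directory, content):
    """Write page bytes to a fresh numbered file and return its path."""
    path = unique_name(directory)
    path.write_bytes(content)
    return path
```

Each call to `save_page` picks the next free number, so saving 400+ pages into one folder never overwrites an earlier file.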

cmpm

  • Charter Member
  • Joined in 2006
  • Posts: 2,026
Re: navigate webpage results automatically and save them
« Reply #12 on: August 31, 2008, 10:51 PM »
I'd think you would have to use Foxmarks to sync your bookmarks.
Then go to your Foxmarks web site where all your links are and work with them from there.
Of course you also need Firefox, which I guess you have.

Would the addon 'DownThemAll!' work?

Or you can use a download manager and the addon 'Copy all Links'.

Copy and paste the links into the manager, whichever one works with Firefox (there are a few). Which one to choose would depend on the options you need.

mwang

  • Supporting Member
  • Joined in 2007
  • Posts: 205
Re: navigate webpage results automatically and save them
« Reply #13 on: September 01, 2008, 01:10 AM »
Scrapbook (https://addons.mozilla.org/en-US/firefox/addon/427), maybe?

If the sequential pages have some sort of numbering rule in their URL (most do, I think), then you could copy the starting URL, duplicate it as many times as required in an editor, change the number in each URL as required (with 400+ items, I would probably do this step in Excel or something similar), and ask Scrapbook to download them all into a folder.
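
Instead of an editor or Excel, the list could also be generated with a few lines of Python. This is only a sketch: the example.com template and the step of 10 are assumptions modelled on the `start=10`, `start=20` pattern mentioned earlier in the thread.

```python
def build_urls(template, start, step, count):
    """Fill the numeric part of a URL template for each results page."""
    return [template.format(start + i * step) for i in range(count)]


# Hypothetical template; the {} marks where the page offset goes.
urls = build_urls("http://example.com/search?q=something&start={}", 0, 10, 3)
print("\n".join(urls))
```

The printed lines can then be pasted into Scrapbook (or any downloader that accepts a URL list).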

I did a small test with one of the long threads on this forum:
scrapbook.png

If you can't or don't want to produce the URLs in advance, you can still do it with Scrapbook, but this time with the help of a Scrapbook add-on called AutoSave (http://amb.vis.ne.jp/mozilla/scrapbook/addons.php?lang=en#AutoSave) and iMacros, mentioned above, or something similar. I didn't try this approach though.

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
Re: navigate webpage results automatically and save them
« Reply #14 on: September 01, 2008, 11:43 PM »
thanks

these are interesting, but I wonder if it is possible for the program to know when the webpage is 100% loaded and only then save it (so that no incomplete webpages get saved)

mwang

  • Supporting Member
  • Joined in 2007
  • Posts: 205
Re: navigate webpage results automatically and save them
« Reply #15 on: September 02, 2008, 12:31 AM »
If you use the first method I mentioned (giving Scrapbook a list of URLs to save), it saves the web pages in the background, meaning it doesn't load the pages into Firefox. There's a small pop up showing the progress:
progress.png

It saves one page at a time, with a small delay (a couple of seconds) in between, so it won't overwhelm the server. You may safely ignore the progress dialog (which would take some time if you give it a long list) and continue to use Firefox.

When it's done, the progress dialog goes away and another small message box pops up from the lower-right corner telling you "capture completed".
complete.png

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
Re: navigate webpage results automatically and save them
« Reply #16 on: September 08, 2008, 04:23 AM »
what if the next results page has a url that is not shown? e.g. if you get there by clicking a button and the new url is not displayed? then I cannot build the list of urls

is there any javascript bot that can auto-browse following specific commands, wait for pages to load and then save them?
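
Not a JavaScript bot, but the loop being asked for can be sketched in Python under one big assumption: the 'next' control is an ordinary link with an `href`. A button driven by script, with no visible URL, would still need a browser-side tool. `fetch` and `save` are hypothetical callables supplied by the caller; `fetch` is assumed to block until the whole page has been downloaded, so nothing is saved half-loaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkFinder(HTMLParser):
    """Record the href of the first <a> whose text is 'next' (case-insensitive)."""

    def __init__(self):
        super().__init__()
        self.next_href = None
        self._current_href = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")

    def handle_data(self, data):
        if (self._current_href and self.next_href is None
                and data.strip().lower() == "next"):
            self.next_href = self._current_href

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_href = None


def crawl(start_url, fetch, save, max_pages=500):
    """Save each page, then follow its 'next' link until there is none."""
    url, count = start_url, 0
    while url and count < max_pages:
        html = fetch(url)  # assumed to return the complete page as text
        save(url, html)
        count += 1
        finder = NextLinkFinder()
        finder.feed(html)
        # Resolve relative links against the page they were found on.
        url = urljoin(url, finder.next_href) if finder.next_href else None
    return count
```

The `max_pages` cap is only a safety net against a site whose last page links back to the first.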

mwang

  • Supporting Member
  • Joined in 2007
  • Posts: 205
Re: navigate webpage results automatically and save them
« Reply #17 on: September 08, 2008, 07:42 PM »
The auto-saving part can be taken care of by Scrapbook (with the AutoSave plugin), as I mentioned above. There are other extensions that do this as well.

As to the auto-clicking part, you'll probably need the help of iMacros (also mentioned above) or something like that. I've never tried it though, so can't help you there.

Paul Keith

  • Member
  • Joined in 2008
  • Posts: 1,989
Re: navigate webpage results automatically and save them
« Reply #18 on: September 09, 2008, 12:27 AM »
Just out of curiosity, why do you need 400+ Google results pages?

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
Re: navigate webpage results automatically and save them
« Reply #19 on: September 09, 2008, 05:24 AM »
Just out of curiosity, why do you need 400+ Google results pages?

it's not about google results, google was just for example

Paul Keith

  • Member
  • Joined in 2008
  • Posts: 1,989
Re: navigate webpage results automatically and save them
« Reply #20 on: September 09, 2008, 09:09 AM »
Oh ok. Thanks for clarifying that.