DonationCoder.com Software > Post New Requests Here

Transform list of url links to copies of the actual web pages

IainB:
I have searched the Internet for years to find something that makes a decent copy of web pages, for archive/library reference purposes.
The best I had found was the Firefox add-on Scrapbook.
More recently, I found Zotero.
Nothing else seems to come close.
They are both very good indeed.
Then I discovered that they both use the same engine: WebPageDump
(Details partially copied below and with just the download file embedded hyperlinks.)
introduction
WebPageDump is a Firefox extension which allows you to save local copies of pages from the Web. It sounds simple, but it's not. The standard "Save page" function of web browsers fails with most web pages and also web site downloaders don't work in a satisfactory manner. These shortcomings were a serious problem for our research.

Each web page is saved in an automatically named subdirectory, making it easy to create whole (shareable) web page collections. It is built upon the Scrapbook extension and enhances its capabilities regarding HTML entities, charsets and command-line/batch functionality, improving the visual exactness of the local copy. ...
...
using
WebPageDump can be used simply with the "WebPageDump" Entry inside the Firefox "Tools" menu. Hence the actual web page will be saved inside a WPD named subdirectory after selecting the destination directory. This mode is going to be the "normal" mode for most web page collecting applications.

For batch processing the following options can be used through the Firefox command-line. These command-line options are mainly present for WebPageDump testing purposes but may be useful for some special applications. Be sure that a single batch command has ended before proceeding with another one. ...
...
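For anyone who just wants the "list of URLs in, one folder per page out" idea without Firefox, here is a rough Python sketch. It is not WebPageDump's method and does not use its command-line options (those are elided above). It only saves the raw HTML, so images, CSS and scripts are not captured. The names subdir_name, fetch_bytes and save_pages, and the folder naming scheme, are my own assumptions for illustration.

```python
import re
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def subdir_name(url: str, index: int) -> str:
    """Build a filesystem-safe folder name from a URL (hypothetical naming scheme)."""
    parts = urlparse(url)
    raw = f"{parts.netloc}{parts.path}"
    # Replace every run of unsafe characters with a single underscore.
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", raw).strip("_")
    return f"{index:03d}_{safe[:60] or 'page'}"


def fetch_bytes(url: str) -> bytes:
    """Download the raw HTML of one page; linked assets are NOT fetched."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def save_pages(urls, dest, fetch=fetch_bytes):
    """Save each URL into its own subdirectory of dest, one page at a time."""
    dest = Path(dest)
    saved = []
    for index, url in enumerate(urls, start=1):
        folder = dest / subdir_name(url, index)
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / "index.html"
        # Each download completes before the next one begins.
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved
```

To use it with a plain text file of links (one per line), something like save_pages(Path("urls.txt").read_text().split(), "archive") should do. The fetch parameter is only there so the download step can be swapped out or tested offline.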
downloads
WebPageDump v0.3 (beta) firefox extension
WebPageDump v0.3 (beta) source code
The extension is provided under the terms of the Mozilla Public License. If you want to install WebPageDump you will either have to manually allow extension installations from this url or save the xpi file with "save as". See changes.txt for the version information.
Tested web pages (~68 MB)
Because of copyright issues we have removed the package of test web pages. But we will make them available for serious scientific research. They were downloaded and modified with WebPageDump using the SmartCache Java Proxy.
_____________________

--- End quote ---

nkormanik:
Excellent suggestions all.  Thank you!

Though I will definitely keep your code and try it out, the solution I did the little task with was a Firefox extension called Shelve:

https://addons.mozilla.org/en-US/firefox/addon/shelve/

That did the trick.

Thanks again!
