Any ideas for a save website crawler for offline reading?

NigelH:
Can you access any of the material using Firefox or Chrome?
Zotero has pretty good capabilities for archiving web page content.
An IE connector for the standalone version is apparently in development.

Store anything.

Zotero collects all your research in a single, searchable interface. You can add PDFs, images, audio and video files, snapshots of web pages, and really anything else. Zotero automatically indexes the full-text content of your library, enabling you to find exactly what you're looking for with just a few keystrokes.
--- End quote ---

cyberdiva:
Web Research seems very interesting... how does it store the websites?  Local Website Archive stores the HTML, so you don't need LWA to view the articles you've downloaded.  Does Web Research use a proprietary format?
-kfitting (April 04, 2012, 10:53 AM)
--- End quote ---
To be honest, it has been a year or two since I last used Web Research, and I do not have it on the computer I bought this year.  I was never all that concerned about whether it used a proprietary format, since I was interested only in retrieving and consulting information I had saved, not exporting it.  I just went onto the Web Research website, and here's what it says about exporting documents:

Web Research offers various methods to export documents and folders:

    Export as files in original format
    Export as "Single File Web Page" (mht format)
    Export as album (chm format)
    Export as a Document Package
    Create a Web Page Presentation
    Copy the Web Research address of a document
    Print documents
    Transfer to Microsoft Word
    Programming interface (API)
    External linking via Web Research protocol handler

Perhaps I'm not understanding the above correctly, but a couple of the items seem to suggest that the material is not saved in a proprietary format.  I think you'd be best off writing to the company to find out for sure.

Carol Haynes:
Can you access any of the material using Firefox or Chrome?
Zotero has pretty good capabilities for archiving web page content.
An IE connector for the standalone version is apparently in development.

Store anything.

Zotero collects all your research in a single, searchable interface. You can add PDFs, images, audio and video files, snapshots of web pages, and really anything else. Zotero automatically indexes the full-text content of your library, enabling you to find exactly what you're looking for with just a few keystrokes.
--- End quote ---
-NigelH (April 04, 2012, 07:13 PM)
--- End quote ---

I can access the site using Firefox and Chrome but it is all a bit screwed up and doesn't work properly. If I am not using IE the site actually pops up and says it only works with IE and there are issues with any other browser.

I'm not sure, but I think part of the problem is that the page types are not .html - they are .cfm, and the site uses ColdFusion.

I have tried loads of downloaders/web spiders/archivers etc. now and none seem to be able to get past the login page. Really frustrating.
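For anyone who wants to script around the login problem, here is a minimal sketch using only the Python standard library. It assumes the site uses an ordinary form-based login with a session cookie, which may not be true here. The URL and the field names (`username`, `password`) are hypothetical and would have to be read from the real login form.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session_opener():
    """Return (opener, jar); the opener stores cookies and resends them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, fields):
    """Build a POST request carrying the login form fields."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(login_url, data=data, method="POST")


if __name__ == "__main__":
    # Hypothetical URL and field names; take the real ones from the login form.
    opener, jar = make_session_opener()
    request = build_login_request(
        "https://example.com/login",
        {"username": "me", "password": "secret"},
    )
    # opener.open(request) would submit the form; later opener.open(page_url)
    # calls would reuse whatever session cookie the server set in jar.
    print(request.get_method(), request.full_url)
```

The idea is that the same opener is used for the login and for every later page fetch, so the session cookie travels with each request. A downloader that fetches each page in a fresh session lands back on the login page every time.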

I just upgraded Surfulater, since I used to use it, but even its browser extension only saves a link to the page, and that link just goes to the login page - it can't save the contents!

To add to the problem, the website is frame-based, and quite a few of the links open content in several frames at once. None of the downloaders I have tried seem to like frames too much!
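On the frames side, a frameset page holds almost no content itself; the real pages are whatever each frame's `src` points at. As a rough sketch (standard library only, with made-up file names and URLs in the usage lines), this lists the frame URLs in a page so that each one can be fetched and saved separately:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameSourceParser(HTMLParser):
    """Collect the src of every <frame> and <iframe> as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in ("frame", "iframe"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(urljoin(self.base_url, src))


def frame_sources(html, base_url):
    """Return the absolute URLs of all frames found in html."""
    parser = FrameSourceParser(base_url)
    parser.feed(html)
    return parser.sources


if __name__ == "__main__":
    # Hypothetical frameset and base URL, for illustration only.
    page = '<frameset cols="20%,80%"><frame src="menu.html"><frame src="/docs/main.html"></frameset>'
    for url in frame_sources(page, "https://example.com/app/index.html"):
        print(url)
```

A crawler would call `frame_sources` on each downloaded page and queue the returned URLs, repeating for nested framesets.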

cyberdiva:
I just upgraded Surfulater, since I used to use it, but even its browser extension only saves a link to the page, and that link just goes to the login page - it can't save the contents!
-Carol Haynes (April 04, 2012, 07:53 PM)
--- End quote ---
I think that Web Research will save the content of linked pages, though I don't know whether it plays nicely with frames.  I'm pretty sure you can download it for a free trial period.
