UrlSnooper / Cannot find network adapter
« on: November 26, 2015, 07:49 AM »
I installed the latest update and tried to run UrlSnooper 2, and it says it cannot find a network adapter.
So... what are you doing, anyway?-ayryq (June 15, 2015, 07:26 AM)
Go to http://www.pedersonfuneralhome.com/obituaries/ObitSearchList/1 (and increment the final number)-ayryq (June 13, 2015, 07:18 PM)
The last page number is also stored within the source:
Code: HTML
<input type="hidden" id="totPages" value="83" />
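For what it's worth, a userscript could read that value directly by its id; a minimal, untested sketch (it assumes the hidden input is present on every search page):
Code: Javascript
// Read the last page number from the hidden "totPages" input shown above
var totPagesInput = document.getElementById("totPages");
var maxPage = totPagesInput ? parseInt(totPagesInput.value, 10) : 1;  // fall back to 1 if it's missing
console.log("Last page: " + maxPage);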
This could probably be done with a GreaseMonkey script that cycles through each page grabbing the links and, at the end, displays a page with all of them, which could then be saved using Save Page As ...
Just messing around, this is a heavily modified site scraper from http://blog.nparashuram.com/2009/08/screen-scraping-with-javascript-firebug.html
Currently it will start at the URL @ayryq mentioned above and load every page until the last one (requires GreaseMonkey, naturally), at a rate of about one every 3 seconds. It also grabs all the URLs from each page, but as I haven't worked out how to store them yet, they get overwritten at each page load.
Code: Javascript
// ==UserScript==
// @name       Get The Deadites
// @namespace  http://blog.nparashuram.com/2009/08/screen-scraping-with-javascript-firebug.html
// @include    http://www.pedersonfuneralhome.com/obituaries/ObitSearchList/*
// ==/UserScript==

/*
 * Much modified from the original script for a specific site
 */

function loadNextPage(){
    var url = "http://www.pedersonfuneralhome.com/obituaries/ObitSearchList/";
    var num = parseInt(document.location.href.substring(document.location.href.lastIndexOf("/") + 1));
    if (isNaN(num)) {
        num = 1;
    }
    // If the counter exceeds the max number of pages we need to stop loading pages otherwise we go energizer bunny.
    if (num < maxPage) {
        document.location = url + (num + 1);
//  } else {
        // Reached last page, need to read LocalStore using JSON.parse
        // Create document with URLs retrieved from LocalStore and open in browser, user can then use Save Page As ...
    }
}

function start(newlyDeads){
    // Need to get previous entries from LocalStore (if exists)
//  var oldDeads = localStorage.getItem('obits');
//  if (typeof oldDeads === undefined) {
        // No previous data so just store the new stuff
//      localStorage.setItem('obits', JSON.stringify(newlyDeads));
//  } else {
        // Convert to object using JSON.parse
//      var tmpDeads = JSON.parse('oldDeads');
        // Merge oldDeads and newlyDeads - new merged object stored in first object argument passed
//      m(tmpDeads, newlyDeads);
        // Save back to LocalStore using JSON.stringify
//      localStorage.setItem('obits', JSON.stringify(tmpDeads));
//  }

    /*
     * Dont run a loop, better to run a timeout sort of a function.
     * Will not put load on the server
     */
    var timerHandler = window.setInterval(function(){
        window.clearInterval(timerHandler);
        window.setTimeout(loadNextPage, 2000);
    }, 1000); // this is the time taken for your next page to load
}

// https://gist.github.com/3rd-Eden/988478
// function m(a,b,c){for(c in b)b.hasOwnProperty(c)&&((typeof a[c])[0]=='o'?m(a[c],b[c]):a[c]=b[c])}

var maxPage;
var records = document.getElementsByTagName("A");     // Grab all Anchors within page
//delete records[12];                                 // Need to delete "Next" anchor from object (property 13)
var inputs = document.getElementsByTagName("INPUT");  // Grab all the INPUT elements
maxPage = inputs[2].value;                            // Maximum pages is the value of third INPUT tag
start(records);
The comments within the code are what I think should happen, but I haven't tested it yet (mainly because I can't code in Javascript ... but I'm perfectly capable of hitting it with a sledgehammer until it does what I want ... or I give up :P ).
Someone who actually does know Javascript could probably fill in the big blank areas in record time.-4wd (June 13, 2015, 09:12 PM)
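For anyone who wants to have a go at those blank areas, here is a rough, untested sketch of what the LocalStore part could look like. It keeps the 'obits' key from the comments above; the helper names saveLinks/showAllLinks and the idea of rewriting the final page's body are just assumptions, not tested code:
Code: Javascript
// Untested sketch: store each page's links, then dump them all on the last page.
// The key name 'obits' comes from the comments in the script above; everything else is assumed.

function saveLinks(newAnchors) {
    // Convert the live anchor collection into a plain array of hrefs
    var links = [];
    for (var i = 0; i < newAnchors.length; i++) {
        links.push(newAnchors[i].href);
    }
    // Merge with whatever earlier pages already stored
    var stored = localStorage.getItem('obits');
    var all = stored ? JSON.parse(stored) : [];
    localStorage.setItem('obits', JSON.stringify(all.concat(links)));
}

function showAllLinks() {
    // Reached the last page: rebuild the document from everything collected so far
    var all = JSON.parse(localStorage.getItem('obits') || '[]');
    var html = all.map(function (url) {
        return '<a href="' + url + '">' + url + '</a><br>';
    }).join('\n');
    document.body.innerHTML = html;   // user can now use Save Page As ...
    localStorage.removeItem('obits'); // clean up for the next run
}
saveLinks(records) would replace the commented-out block at the top of start(), and showAllLinks() would go in the commented-out else branch of loadNextPage(). Again, untested, so expect some sledgehammering.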
Well, there are a few programs designed to "spider" a page and download all linked pages, images, etc.
One well-known one is "Teleport Pro", but there are others.-mouser (June 13, 2015, 05:52 PM)