

Is there software for this?


4wd:
Go to http://www.pedersonfuneralhome.com/obituaries/ObitSearchList/1 (and increment the final number)
-ayryq (June 13, 2015, 07:18 PM)
--- End quote ---

The last page number is also stored within the source:


--- Code: HTML ---
<input type="hidden" id="totPages" value="83" />
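Incidentally, reading that field directly would be sturdier than counting INPUT tags the way the script further down does; a minimal, untested sketch:

--- Code: Javascript ---
// Read the last page number straight from the hidden "totPages" input,
// falling back to 1 if the field is missing or its value isn't a number.
var totField = document.getElementById("totPages");
var maxPage = totField ? parseInt(totField.value, 10) : 1;
if (isNaN(maxPage)) {
  maxPage = 1;
}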
This could probably be done with a GreaseMonkey script that cycles through each page grabbing the links, and at the end displays a page with all of them, which could then be saved using Save Page As ...

Just messing around, this is a heavily modified version of the site scraper from http://blog.nparashuram.com/2009/08/screen-scraping-with-javascript-firebug.html

Currently it will start at the URL @ayryq mentioned above and load every page until the last one (requires GreaseMonkey, naturally) at a rate of about one every 3 seconds.  It also grabs all the URLs from each page, but as I haven't worked out how to store them yet, they get overwritten on each page load.

--- Code: Javascript ---
// ==UserScript==
// @name       Get The Deadites
// @namespace  http://blog.nparashuram.com/2009/08/screen-scraping-with-javascript-firebug.html
// @include    http://www.pedersonfuneralhome.com/obituaries/ObitSearchList/*
// ==/UserScript==

/*
* Much modified from the original script for a specific site
*/

function loadNextPage() {
  var url = "http://www.pedersonfuneralhome.com/obituaries/ObitSearchList/";
  var num = parseInt(document.location.href.substring(document.location.href.lastIndexOf("/") + 1), 10);
  if (isNaN(num)) {
    num = 1;
  }
  // If the counter exceeds the max number of pages we need to stop loading pages, otherwise we go energizer bunny.
  if (num < maxPage) {
    document.location = url + (num + 1);
//  } else {
//    Reached last page, need to read LocalStore using JSON.parse.
//    Create document with URLs retrieved from LocalStore and open in browser, user can then use Save Page As ...
  }
}

function start(newlyDeads) {
  // Need to get previous entries from LocalStore (if they exist)
//  var oldDeads = localStorage.getItem('obits');
//  if (oldDeads === null) {               // getItem returns null when the key doesn't exist
//    // No previous data so just store the new stuff
//    localStorage.setItem('obits', JSON.stringify(newlyDeads));
//  } else {
//    // Convert to object using JSON.parse
//    var tmpDeads = JSON.parse(oldDeads);
//    // Merge oldDeads and newlyDeads - the merged result is stored in the first object argument passed
//    m(tmpDeads, newlyDeads);
//    // Save back to LocalStore using JSON.stringify
//    localStorage.setItem('obits', JSON.stringify(tmpDeads));
//  }

  /*
  * Don't run a loop, better to run a timeout sort of a function.
  * Will not put load on the server.
  */
  var timerHandler = window.setInterval(function() {
    window.clearInterval(timerHandler);
    window.setTimeout(loadNextPage, 2000);
  }, 1000); // this is the time taken for your next page to load
}

// https://gist.github.com/3rd-Eden/988478
// function m(a,b,c){for(c in b)b.hasOwnProperty(c)&&((typeof a[c])[0]=='o'?m(a[c],b[c]):a[c]=b[c])}

var maxPage;
var records = document.getElementsByTagName("A");      // Grab all Anchors within page
//delete records[12];                                  // Need to delete "Next" anchor from object (property 13)
var inputs = document.getElementsByTagName("INPUT");   // Grab all the INPUT elements
maxPage = parseInt(inputs[2].value, 10);               // Maximum pages is the value of the third INPUT tag
start(records);
The comments within the code are what I think should happen, but I haven't tested it yet (mainly because I can't code in Javascript ... but I'm perfectly capable of hitting it with a sledgehammer until it does what I want ... or I give up  :P ).

Someone who actually does know Javascript could probably fill in the big blank areas in record time.
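For what it's worth, something along these lines might fill in those blanks (completely untested, and the 'obits' key plus both function names are just guesses at what the comments describe). Storing the href strings rather than the anchor elements matters, since DOM nodes won't survive JSON.stringify:

--- Code: Javascript ---
// Hypothetical fill-in for the commented-out sections above: keep the
// href of every anchor on the current page, then dump the whole list
// on the last page so it can be saved with Save Page As ...
function storeLinks(anchors) {
  var old = localStorage.getItem('obits');            // previous pages' links, if any
  var links = (old === null) ? [] : JSON.parse(old);
  for (var i = 0; i < anchors.length; i++) {
    links.push(anchors[i].href);                      // store just the URL string
  }
  localStorage.setItem('obits', JSON.stringify(links));
}

function showAllLinks() {
  var links = JSON.parse(localStorage.getItem('obits') || '[]');
  var html = '';
  for (var i = 0; i < links.length; i++) {
    html += '<a href="' + links[i] + '">' + links[i] + '</a><br>';
  }
  document.body.innerHTML = html;                     // replace the page with the full list
  localStorage.removeItem('obits');                   // clear the store for the next run
}
A storeLinks(newlyDeads) call would replace the commented block in start(), and a showAllLinks() call belongs in the empty else branch of loadNextPage().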

SomebodySmart:
Excellent! It looks like now I'll be able to build hyperlinks to every obituary on every website built by FuneralOne.com! Thanks.

As for GreaseMonkey, I don't know anything about that, but I do use curl and my home-made Python 3.2 programs.
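For the curl route, its built-in URL globbing should pull the whole run in one command; assuming the 83 pages reported by the totPages value above, something like:

--- Code: ---
curl "http://www.pedersonfuneralhome.com/obituaries/ObitSearchList/[1-83]" -o "ObitSearchList_#1.html"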


Just messing around, this is a heavily modified version of the site scraper ... Someone who actually does know Javascript could probably fill in the big blank areas in record time.
-4wd (June 13, 2015, 09:12 PM)
--- End quote ---

ayryq:
So... what are you doing, anyway?

SomebodySmart:
So... what are you doing, anyway?
-ayryq (June 15, 2015, 07:26 AM)
--- End quote ---

I'm building a genealogy website that will help people trace their family trees. There's a lot of genealogy info in obituaries. I can't copy the obituaries onto my new website for obvious copyright reasons, but I can help people find those obituaries on the funeral home and newspaper websites for free.
