Topic: User Filter for Google Search Results

mraeryceos

  • Participant
  • Joined in 2010
  • Posts: 41
User Filter for Google Search Results
« on: September 01, 2020, 05:17 PM »
I'd like to browse a non-corporate web: websites where all content is delivered by a single domain.
Perhaps this could be coded as a Greasemonkey or Tampermonkey user script?  It would filter Google search results down to web pages that load all of their content from one domain.  I guess this would take bot-based browsing of the search results, page after page, until the desired number of matching results is listed.
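
A minimal sketch of the per-page check this request implies, written in Python rather than as a user script so it can run standalone. It only inspects the static HTML that is passed in, so anything a page loads later through JavaScript or CSS would be missed. It also counts every `<link href>` (including non-loading ones such as canonical links), so it errs on the strict side. The names `ResourceCollector`, `resource_hosts` and `is_single_domain` are made up for illustration, and "one domain" is assumed here to mean "exactly the same host as the page".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Tags and the attribute that usually triggers a resource load (not exhaustive).
RESOURCE_ATTRS = {
    "script": "src",
    "img": "src",
    "iframe": "src",
    "link": "href",
    "audio": "src",
    "video": "src",
    "source": "src",
    "embed": "src",
}


class ResourceCollector(HTMLParser):
    """Collects the hosts of resources referenced in a page's static HTML."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative and protocol-relative URLs against the page.
                host = urlparse(urljoin(self.page_url, value)).hostname
                if host:  # data: URIs and similar have no host and are skipped
                    self.hosts.add(host)


def resource_hosts(page_url, html):
    """Return the set of hosts the HTML references for its resources."""
    parser = ResourceCollector(page_url)
    parser.feed(html)
    return parser.hosts


def is_single_domain(page_url, html):
    """True if every referenced resource comes from the page's own host."""
    own_host = urlparse(page_url).hostname
    return resource_hosts(page_url, html) <= {own_host}
```

For example, `is_single_domain("https://example.org/", '<script src="https://cdn.example.net/lib.js"></script>')` is false, while a page that only uses relative paths passes.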

4wd

  • Supporting Member
  • Joined in 2006
  • Posts: 5,641
Re: User Filter for Google Search Results
« Reply #1 on: September 01, 2020, 08:52 PM »
If I understand correctly you want a Google search to return only those results for sites that do not pull content from any other domain and that aren't connected to big business.

Aside from the almost impossibly hard chore of trawling every result returned for sites that don't reference other sites (not to mention deciding what does and does not constitute a "corporate" website), it would most likely break 99% of websites, since so many of them rely on third-party resources, for example JavaScript libraries hosted on CDNs or by Google, Google Analytics, etc.

How do you determine what benign content should be left without having to manually load each site to see if it works?

I think the closest you're going to get is to install the uMatrix extension to stop cross-site requests (be prepared to click a lot to get a site to have minimal functionality) and to try a different search engine where you can remove results based on certain criteria, e.g. MillionShort.

There's also the Google Hit Hider userscript, which lets you hide results from whole domains on Google (and other recognised search engines).

mraeryceos

  • Participant
  • Joined in 2010
  • Posts: 41
Re: User Filter for Google Search Results
« Reply #2 on: September 02, 2020, 10:35 AM »
I figure corporate sites are going to draw from multiple domains.  That is all.  I don't expect an algorithm to know what a corporate website is.

As for breaking websites, it won't break the websites that rely on only one domain, which is what I want to explore.

uMatrix and MillionShort are not doing anything related to the task.
« Last Edit: September 03, 2020, 12:02 AM by mraeryceos »

4wd

  • Supporting Member
  • Joined in 2006
  • Posts: 5,641
Re: User Filter for Google Search Results
« Reply #3 on: September 02, 2020, 01:08 PM »
I figure non-corporate sites are not going to draw from a single domain.

...

As for breaking websites, it won't break the websites that rely on only one domain, which is what I want to explore.

This is confusing: your first sentence implies corporate websites will draw content from their own domain.

Your second paragraph states you want to explore websites that rely on only one domain, i.e. corporate websites according to your first sentence.

An interesting example (or not): my own website does nothing but display text and scanned images of a family-history nature, but it references JavaScript library routines to display the images. Considering 99.9% of the content is hosted on the one domain, I guess that makes it a corporate website.

mraeryceos

  • Participant
  • Joined in 2010
  • Posts: 41
Re: User Filter for Google Search Results
« Reply #4 on: September 02, 2020, 05:40 PM »
I accept that your site will be a casualty.  I am looking for sites that only use one domain.  Sorry if this is confusing to you.

Deozaan

  • Charter Member
  • Joined in 2006
  • Points: 1
  • Posts: 9,747
Re: User Filter for Google Search Results
« Reply #5 on: September 02, 2020, 11:37 PM »
I figure non-corporate sites are not going to draw from a single domain.

I think the confusion is understandable, as the above quoted text can be re-written without two negatives to read "I figure non-corporate sites are going to draw from multiple domains."

That said, I suppose it's pretty clear at this point what mraeryceos desires:

I am looking for sites that only use one domain.

Whether these sites are "corporate" or not is immaterial to the request, and IMO that nomenclature isn't helping the discussion. That said, I think 4wd got it (mostly) right the first time and offered an adequate response with good suggestions. It may be possible to use some kind of blocker such as uMatrix to block any attempts a site makes to connect to third-party sites. But it doesn't seem feasible to filter search results, especially Google search results, to list only sites that don't attempt to connect to or load from third-party sites. You'd have to load each page in the search results to see whether it connects to anything else, and possibly go through thousands upon thousands of them before you got even one page of results that met those criteria.
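
To make the cost concrete, here is a rough Python sketch of the loop described above: load candidate pages one at a time and stop once enough of them pass the check. The function `collect_matching` and its parameters `fetch` and `accept` are hypothetical placeholders. How the candidate URLs are obtained from a search engine is deliberately left out, and `accept` stands in for whatever "uses only one domain" test is chosen.

```python
from typing import Callable, Iterable, List, Tuple


def collect_matching(
    candidates: Iterable[str],
    fetch: Callable[[str], str],
    accept: Callable[[str, str], bool],
    wanted: int,
) -> Tuple[List[str], int]:
    """Load candidates one by one until `wanted` pages pass `accept`.

    Returns the accepted URLs and the number of pages that were loaded.
    """
    accepted: List[str] = []
    loaded = 0
    for url in candidates:
        if len(accepted) >= wanted:
            break  # enough results for one page of output
        try:
            html = fetch(url)
        except OSError:
            continue  # unreachable page: skip it and move on
        loaded += 1
        if accept(url, html):
            accepted.append(url)
    return accepted, loaded
```

The second return value is the interesting one: if only a small fraction of pages pass, the number of pages loaded grows roughly in inverse proportion to that fraction, which is the feasibility concern raised here.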

At that point you might as well be making an internet archive. In fact, that may be another alternative. Use a site like https://archive.md/ to archive any page you want to visit. It will archive the content on its own servers and serve a static page.

mraeryceos

  • Participant
  • Joined in 2010
  • Posts: 41
Re: User Filter for Google Search Results
« Reply #6 on: September 03, 2020, 12:03 AM »
You're right, that was confusing.  I fixed it.

I don't want to archive.  I want to search for search terms, whatever they may be.  If I have to wait and go do something else while the search completes, I'm ok with that.

mraeryceos

  • Participant
  • Joined in 2010
  • Posts: 41
Re: User Filter for Google Search Results
« Reply #7 on: September 05, 2020, 06:02 PM »
Most websites draw resources from multiple domains.  If you could search for websites drawing only from their own domain, plus some domains in a whitelist, which domains would you add to the whitelist?
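
A small Python sketch of how such a whitelist test might look, assuming the set of hosts a page loads from has already been collected somehow. `host_matches` and `passes_whitelist` are illustrative names, and the whitelist entry in the usage note is only an example, not a recommendation. Subdomains of a whitelisted domain are treated as allowed, which is an assumption; note that a page on `www.example.org` would not automatically allow `static.example.org` without a proper registrable-domain lookup.

```python
from urllib.parse import urlparse


def host_matches(host, domain):
    """True if host is the domain itself or one of its subdomains."""
    host, domain = host.lower(), domain.lower()
    return host == domain or host.endswith("." + domain)


def passes_whitelist(page_url, resource_hosts, whitelist):
    """True if every resource host is the page's own host or whitelisted."""
    own_host = urlparse(page_url).hostname or ""
    allowed = [own_host, *whitelist]
    return all(
        any(host_matches(host, domain) for domain in allowed)
        for host in resource_hosts
    )
```

With `whitelist = ["ajax.googleapis.com"]`, a page on `example.org` that loads only from `example.org` and `ajax.googleapis.com` passes, while one that also loads from `www.google-analytics.com` does not.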

4wd, which site hosts your JavaScript?
« Last Edit: September 06, 2020, 04:01 AM by mraeryceos »