Author Topic: IDEA: Searchengine Scraper (Read 5468 times)

The Code Queryer · « **on:** April 06, 2019, 02:00 PM »

Hi,

My idea is a searchengine scraper.
It works like this:
You come to a webpage and you see a search box (like Google and the like). You type a URL and click the "Scrape SERPs" button.
Now, the web app would visit the SERP page and scrape all the result links. It would then follow on to the next SERP pages and do likewise until it has reached the depth you set.
So it is a spider that visits SERP pages and scrapes all the result links. It then saves them in the website's database under your member username. Others can search and see what you scraped by searching for your username. Likewise, you can do the same for theirs.
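To make the "follow the next SERP pages until the depth is reached" step concrete, here is a minimal sketch. It is in Python rather than PHP, only to show the loop, and it is not taken from any existing scraper. The names `scrape_serps`, `fake_fetch` and `FAKE_SERPS` are made up for illustration, and the fetch step is a stand-in dictionary so the sketch runs without touching a real search engine. A real version would swap `fake_fetch` for an HTTP request (cURL in PHP, for instance) plus HTML parsing.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical page shape: (result links on this SERP, URL of the next SERP or None).
Page = Tuple[List[str], Optional[str]]


def scrape_serps(start_url: str, depth: int, fetch: Callable[[str], Page]) -> List[str]:
    """Collect result links from up to `depth` consecutive SERP pages."""
    links: List[str] = []
    seen = set()  # guards against a "next" link that loops back
    url: Optional[str] = start_url
    pages_visited = 0
    while url is not None and pages_visited < depth and url not in seen:
        seen.add(url)
        results, next_url = fetch(url)
        links.extend(results)
        pages_visited += 1
        url = next_url
    return links


# Stand-in data (an assumption for this sketch) instead of real HTTP requests.
FAKE_SERPS = {
    "serp?page=1": (["http://a.example/", "http://b.example/"], "serp?page=2"),
    "serp?page=2": (["http://c.example/"], "serp?page=3"),
    "serp?page=3": (["http://d.example/"], None),
}


def fake_fetch(url: str) -> Page:
    """Pretend to download and parse one SERP page."""
    return FAKE_SERPS[url]


if __name__ == "__main__":
    # Depth 2: only the first two SERP pages are visited.
    print(scrape_serps("serp?page=1", 2, fake_fetch))
```

Passing the fetch function in as a parameter keeps the paging logic separate from the downloading, so the same loop could be pointed at different search engines.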
The scraper would scrape not only the links but also their anchor texts, page titles, meta keywords and meta descriptions.
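As a rough illustration of pulling out those fields (links, anchor texts, page title, meta keywords and meta description), here is a small Python sketch using only the standard library's `html.parser`. It is an assumption about how one might do it, not the finished scraper. The class name `PageInfoParser`, the helper `extract_page_info` and the `SAMPLE` page are invented for the example, and real pages are messier than this sample.

```python
from html.parser import HTMLParser


class PageInfoParser(HTMLParser):
    """Collects the title, meta keywords/description and (href, anchor text) pairs."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self._in_title = False
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_page_info(html):
    """Parse one HTML document and return the scraped fields as a dict."""
    parser = PageInfoParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "keywords": parser.meta.get("keywords", ""),
        "description": parser.meta.get("description", ""),
        "links": parser.links,
    }


# Made-up sample page for the sketch.
SAMPLE = """<html><head><title>Example Page</title>
<meta name="keywords" content="scraper, serp">
<meta name="description" content="A sample page.">
</head><body>
<a href="http://a.example/">First <b>result</b></a>
<a href="http://b.example/">Second result</a>
</body></html>"""

if __name__ == "__main__":
    print(extract_page_info(SAMPLE))
```

The returned dict is the kind of record that could then be stored in the site's database against a member username.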
In other words, a searchengine scraper. A web app. Built with PHP.
Anybody can build this and then do the community a favour by releasing the source code here under the GPL so we can learn from it. I am a PHP student. I reckon cURL is good for the job.

Anyone like this idea? Give it a thumbs up!

Just imagine, you can scrape any searchengine with this.
I have built a .exe one. If anyone builds a .php one, I am willing to trade: I will give you a copy of mine if you give me the .php copy along with comments so I can learn from your code.

NetRunner · « **Reply #1 on:** April 12, 2019, 04:48 PM »

What for? Who is going to use this?

One can do something similar already with existing tools. My feed reader basically does that, except that it does not grab every linked page, as excerpts are fine for me. It would be easy to make it grab the full pages, though.

nickodemos · « **Reply #2 on:** April 12, 2019, 05:34 PM »

I just can't see who would be willing to store all that kind of data for others to look at. I would imagine that it would become obsolete fairly fast.

A simpler idea is to find a way to post links you visited.
