Author Topic: Targeted Web clipping or scraping and formatting  (Read 3001 times)

sphere

  • Participant
  • Joined in 2018
  • Posts: 176
Targeted Web clipping or scraping and formatting
« on: February 12, 2020, 05:33 PM »

What is the best way to automate the collection of specific data, i.e. contact information, when visiting a webpage? Years ago I used to methodically copy or enter the person's or company's name, address, web URL, telephone number, email and notes into a contact manager or address book. More recently, I have simply saved a copy of their contact page. I can link to the saved page, tag it, and/or search for it. This works, but it is a little chaotic.

It occurred to me that maybe it would be possible for Clipboard Help + Spell to identify different types of text strings and sort them neatly into categories based on the attributes of the text. A ".com", ".org", etc. would indicate that the text string is part of a web URL. An "@" sign would indicate an email address, an Instagram handle, etc.
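A minimal sketch of that idea in Python, not a feature of Clipboard Help + Spell. The patterns, the category names, and the `classify` function are all assumptions made up for illustration, and real-world contact data would need looser or stricter rules.

```python
import re

# Hypothetical patterns for illustration only; none of this comes from an existing tool.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
HANDLE_RE = re.compile(r"^@\w[\w.]*$")  # e.g. an Instagram-style handle
URL_RE = re.compile(
    r"^(https?://)?[\w-]+(\.[\w-]+)*\.(com|org|net)(/\S*)?$", re.IGNORECASE
)
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,}$")


def classify(text):
    """Guess which contact field a clipped string belongs to."""
    s = text.strip()
    if EMAIL_RE.match(s):
        return "email"
    if HANDLE_RE.match(s):
        return "social handle"
    if URL_RE.match(s):
        return "url"
    # Require at least seven digits so stray punctuation is not taken for a number.
    if PHONE_RE.match(s) and sum(c.isdigit() for c in s) >= 7:
        return "telephone"
    return "unknown"


if __name__ == "__main__":
    for clip in ["info@example.org", "@examplebakery", "www.example.com", "(555) 010-0199"]:
        print(clip, "->", classify(clip))
```

Names and street addresses have no such telltale characters, so a text-only approach like this would likely leave them as "unknown".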

It would be great to be able to pull:
Name
Address
Telephone
Mobile
Email
Facebook page
Instagram (more and more businesses are on Instagram)
A description
Page URL
Etc.

It seems like an easier way to identify which text is which would be to use the page coding (the HTML markup).
I have looked into some customer relationship managers (CRMs) to see if they offer a way to pull contact information into their systems, and I have not found anything yet. I have also looked into web scraping tools, but most seem to target an entire site. I would like to indicate a single page and pull only the information I want.

Any ideas?

sphere

Re: Targeted Web clipping or scraping and formatting
« Reply #1 on: February 06, 2024, 07:07 PM »
So I realize this post is old, but once again I am looking for options. I never really found anything the first time around.

publicdomain

  • Honorary Member
  • Joined in 2019
  • Posts: 736
  • Call me Vic!
Re: Targeted Web clipping or scraping and formatting
« Reply #2 on: February 07, 2024, 06:42 AM »
I would like to indicate a page and pull the information I want.

Any ideas?

A user-configurable custom web scraper should be able to do this very smoothly.

Basically, you configure which HTML tags to extract the information from (as they appear in the page source) and then send the collected data to the target contact manager program via automation (e.g. the Win32 API's WM_SETTEXT message, for a traditional Windows program).
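As a rough sketch of the extraction half of that approach, the Python below uses the standard library's `html.parser`. The rule format, the `ContactExtractor` class, the `extract_contact` helper, and the sample markup are all hypothetical. The automation half (sending the fields to another program with WM_SETTEXT) is Windows-specific and is not shown.

```python
from html.parser import HTMLParser


class ContactExtractor(HTMLParser):
    """Collect text from configured (tag, class) pairs plus mailto:/tel: links."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules        # hypothetical config, e.g. {("h1", "name"): "name"}
        self.fields = {}
        self._current = None      # field currently being captured, if any
        self._capture_tag = None  # tag whose end closes the capture

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        if tag == "a" and href.startswith("mailto:"):
            self.fields["email"] = href[len("mailto:"):]
        elif tag == "a" and href.startswith("tel:"):
            self.fields["telephone"] = href[len("tel:"):]
        for cls in (attrs.get("class") or "").split():
            field = self.rules.get((tag, cls))
            if field and self._current is None:
                self._current, self._capture_tag = field, tag
                self.fields[field] = ""

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: capture stops at the first matching end tag,
        # so nested elements of the same tag are not handled.
        if self._current and tag == self._capture_tag:
            text = self.fields[self._current]
            self.fields[self._current] = " ".join(text.split())
            self._current = self._capture_tag = None


def extract_contact(html, rules):
    parser = ContactExtractor(rules)
    parser.feed(html)
    parser.close()
    return parser.fields


# Made-up page fragment and rules, for illustration only.
SAMPLE = """
<div class="card">
  <h1 class="name">Example Bakery</h1>
  <p class="address">1 Main Street,
     Springfield</p>
  <a href="tel:+15550100199">Call us</a>
  <a href="mailto:hello@example.com">Email us</a>
</div>
"""
RULES = {("h1", "name"): "name", ("p", "address"): "address"}

if __name__ == "__main__":
    print(extract_contact(SAMPLE, RULES))
```

The rules would have to be adjusted per site, since every contact page names its tags and classes differently.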

This is perfectly doable :up:
My name's Victor but do feel free to call me Vic! (now known as "paradisusvic")

❤️ Support on Patreon @ www.patreon.com/paradisusis
New Email/Paypal: paradisusvicgmail.com
« Last Edit: February 07, 2024, 06:56 AM by publicdomain »

publicdomain

Re: Targeted Web clipping or scraping and formatting
« Reply #3 on: February 16, 2024, 12:16 AM »
A user-configurable custom web scraper should be able to perform this in a very smooth way.

Talks with @sphere about creating such a custom web scraper are well advanced :)

We've been brainstorming back & forth and there's a pretty good idea as to what's needed to achieve a proper release :Thmbsup:

(Official dedicated thread for the new "AddressBooker" program is to be posted!)
« Last Edit: February 16, 2024, 10:31 AM by publicdomain »

publicdomain

Re: Targeted Web clipping or scraping and formatting
« Reply #4 on: February 23, 2024, 09:54 PM »
(Official dedicated thread for the new "AddressBooker" program is to be posted!)

Done! Thread is published @ https://www.donationcoder.com/forum/index.php?topic=53987.0