Author Topic: Utility to convert embedded hyperlinks to displayed text  (Read 2834 times)

questorfla

  • Supporting Member
  • Joined in 2012
  • **
  • Posts: 570
  • Fighting Slime all the Time
I am trying to find a tool that can scan a document (usually a PDF file) and do an in-place replacement, changing any words or images with embedded hyperlinks so that the active link is replaced by the text it contains.

This may end up with some ugly documents when done but for now it is something I need to find a way to accomplish.

My original idea was simply to remove the link and leave the text, but I found that too often people type in only a short part of the link and embed the full link behind it, or Adobe etc. chops the displayed text off to an acceptable length.

Most antivirus software is flagging a lot of these as containing malware. I found out that while the PDF or DOC itself is clean, it may contain hyperlinks to sites that are currently blacklisted. 90% or more of the time they end up being false positives, but I cannot risk telling people to ignore them because of the 10% that are not.

No one here goes to these sites but they do have to post these documents so that they are available to others and that is where the problem occurs.  The people they send them to don't understand the full implications of the warnings.  Either they toss the email and the attachments due to the warnings or they open them and maybe end up getting malware from a site we sent them the link to.

I would like to find a way to remove the ACTIVE part of the hyperlink, replacing it with the text it contains. I have been told that if a site really is malware, the viewer's own 'web-shield' would presumably protect them from going there, but at least the 90% "false positives" would not pop up on every PDF. Anyone who wants to go there can copy the text and paste it into a browser, or highlight it and click to get Google to do it for them. Most of the time no one goes there; it is just provided as a reference.


I am sure there are tools that can do this but I have not yet found one other than maybe in the full Adobe Acrobat package.

4wd

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 5,641
Re: Utility to convert embedded hyperlinks to displayed text
« Reply #1 on: April 20, 2016, 07:08 PM »
Find and replace hyperlinks in a PDF

Replace the second step with a DOS find/replace command or do it in PowerShell.

Of course your problem is going to be correlating the link text that's displayed in the PDF with the actual link :)
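
As a rough illustration of the find/replace idea, here is a minimal Python sketch rather than DOS or PowerShell. It assumes the PDF has already been uncompressed so that link actions appear as plain `/URI (...)` entries, and that each target is a simple literal string without nested or escaped parentheses. The function names `list_link_targets` and `blank_link_targets` are made up for this example. Listing the targets first gives something to correlate against the displayed text. Whether a blanked target stops a given antivirus product from flagging the file is untested.

```python
import re

# Matches a URI action entry such as "/URI (http://example.com/page)".
# Assumption: uncompressed PDF, target is a plain literal string with no
# nested or escaped parentheses.
URI_PATTERN = re.compile(rb"/URI\s*\((?P<target>[^()\\]*)\)")


def list_link_targets(pdf_bytes: bytes) -> list[str]:
    """Return every link target found, in file order."""
    return [m.group("target").decode("latin-1")
            for m in URI_PATTERN.finditer(pdf_bytes)]


def blank_link_targets(pdf_bytes: bytes) -> bytes:
    """Overwrite each target with spaces of the same length.

    Keeping the length unchanged means the byte offsets recorded in the
    PDF cross-reference table still point at the right objects.
    """
    def _blank(m: re.Match) -> bytes:
        prefix_len = m.start("target") - m.start()
        prefix = m.group(0)[:prefix_len]           # "/URI (" part
        return prefix + b" " * len(m.group("target")) + b")"

    return URI_PATTERN.sub(_blank, pdf_bytes)


if __name__ == "__main__":
    sample = b"<< /Subtype /Link /A << /S /URI /URI (http://example.com/a) >> >>"
    print(list_link_targets(sample))
    print(blank_link_targets(sample))
```

The sketch only touches the link target, not the text drawn on the page, so matching each target to what the reader actually sees remains a manual step.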
« Last Edit: April 20, 2016, 07:29 PM by 4wd »

questorfla

Re: Utility to convert embedded hyperlinks to displayed text
« Reply #2 on: April 20, 2016, 07:34 PM »
OK 4WD.  I can always count on you.  :thmbsup:
Now you KNOW things are never as simple as that.  What I posted on the MS Office 365 forum was the REAL need.  These documents are in emails.
No one is going to do anything if it requires "doing anything".  :-\

Also, while most are PDFs, some will be Word docs.
I am hoping this turns out to be all OLD stuff.  Right now I just have a raw list from AVAST, and I first have to track down where each file came from before it got quarantined.
It can be done, but I would rather make (eh... ask politely?) the people who put them there do it before putting them there.
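
For the Word side, a small sketch of how the external link targets in a document could be listed for checking against the AVAST list. It assumes the files are in the newer .docx format (a zip archive), not legacy .doc, and it only reads the relationships of the main document body; headers, footers and footnotes keep their links in separate relationship parts that this does not look at. The function name `list_docx_links` is made up for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Part inside the .docx archive that holds the main body's relationships.
RELS_PATH = "word/_rels/document.xml.rels"
REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"
HYPERLINK_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/"
    "relationships/hyperlink"
)


def list_docx_links(path):
    """Return the external hyperlink targets of a .docx main document.

    `path` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(path) as archive:
        if RELS_PATH not in archive.namelist():
            return []
        root = ET.fromstring(archive.read(RELS_PATH))
    return [
        rel.get("Target")
        for rel in root.iter(REL_NS + "Relationship")
        if rel.get("Type") == HYPERLINK_TYPE
        and rel.get("TargetMode") == "External"
    ]


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        for target in list_docx_links(name):
            print(name, target)
```

Run against a folder of quarantined files, this would give one line per link, which can then be compared with the blacklisted sites the antivirus reported.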