Author Topic: Copying from pdf's from the web (download them first!)  (Read 2659 times)

ital2

  • Member
  • Joined in 2017
  • Posts: 115
Copying from pdf's from the web (download them first!)
« on: January 29, 2017, 06:56 AM »
I am too lazy to try this with alternative browsers, but here's the link in case you're interested in checking:
http://eur-lex.europ...284:0043:0072:IT:PDF

Just a bit of the text as a mock "screenshot" (formatted as in the original two-column layout):

Il regolamento (CE) n. 883/2004 del Parlamento europeo
e del Consiglio, del 29 aprile 2004, relativo al coordina-
mento dei sistemi di sicurezza sociale(3), dispone che il
contenuto degli allegati II, X e XI di detto regolamento sia
determinato prima della sua data di applicazione.

Note that the "(3)" is a link to a footnote, so because of the way the PDF is built, the footnote text shows up in the copied results below. You can try any other paragraph in these European Union PDFs, including one without a footnote, and the formatting results are similar, as follows:

Display the linked page in Firefox, then copy and paste the paragraph in question into any editor or text program. You'll get:

Il regolamento
(CE) n. 883/2004 del Parlamen
to europeo
e
 del Con
siglio,
 del    29 aprile 2004, relativo al coordin
a
­
men
to
  dei  sistemi  di  sicurezza  sociale
(
3
)  GU L 166 del 30.4.2004, pag
.
1.
 (
3
),  dispon
e  che  il
con
tenuto
deg
li
alleg
ati
I
I
, X e XI di
 detto reg
olamento
sia
determin
ato
prima della sua data di applicazion
e.

As you can see, this text is unusable; retyping it from the screen would be faster than manually reformatting what you've got.
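For anyone tempted to clean up the pasted fragments with a script instead, here is a minimal Python sketch. It assumes the stray character in the paste is a soft hyphen (U+00AD), and the function name `flatten_pasted_text` is made up for illustration. It only collapses whitespace, so words that Firefox split in the middle stay split, which is why it is not a real fix.

```python
import re

SOFT_HYPHEN = "\u00ad"  # assumption: the lone character in the paste is U+00AD


def flatten_pasted_text(raw: str) -> str:
    """Drop soft hyphens and collapse all runs of whitespace into one space."""
    without_soft_hyphens = raw.replace(SOFT_HYPHEN, "")
    return re.sub(r"\s+", " ", without_soft_hyphens).strip()


if __name__ == "__main__":
    sample = "coordin\na\n\u00ad\nmen\nto\n  dei  sistemi"
    # Prints: coordin a men to dei sistemi
    print(flatten_pasted_text(sample))
```

The result is a single line, but "coordinamento" still comes out as "coordin a men to", because nothing in the pasted text says which line breaks fell inside a word.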

But if you save the PDF and then open it in Adobe Reader (or probably any other PDF viewer; I didn't try those), you'll get:

Il regolamento (CE) n. 883/2004 del Parlamento europeo e del Consiglio, del 29 aprile 2004, relativo al coordinamento dei sistemi di sicurezza sociale(3)
GU L 166 del 30.4.2004, pag. 1. (3), dispone che il contenuto degli allegati II, X e XI di detto regolamento sia determinato prima della sua data di applicazione.

As you can see, there's a line break between the footnote link and the footnote text it points to, but apart from that, you get the text as expected.

The problem described here regularly appears with PDFs from the EU, and in some cases with third-party PDFs as well. So when you encounter it, don't assume they have found some way to prevent copying other than securing the PDF. Just download the file and copy from your local copy, or tweak your browser settings so that a non-browser PDF viewer opens PDF links from the web.
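If you'd rather script the "download first" step than use the browser's save dialog, something like the following Python sketch should do, using only the standard library. The function name `save_pdf` and the file name in the usage note are made up for illustration, and the link above is truncated, so you would need to supply the full URL yourself.

```python
import shutil
import urllib.request
from pathlib import Path


def save_pdf(url: str, destination: str) -> Path:
    """Stream the file at `url` to `destination` and return the local path."""
    target = Path(destination)
    # Copy the response to disk in chunks rather than loading it all into memory.
    with urllib.request.urlopen(url) as response, target.open("wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return target
```

Call it as `save_pdf(full_url, "regolamento.pdf")`, then open the saved file in a desktop PDF viewer and copy from there.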

Curt

  • Supporting Member
  • Joined in 2006
  • Posts: 7,566
Re: Copying from pdf's from the web (download them first!)
« Reply #1 on: January 29, 2017, 08:08 AM »
Selected (in Firefox) and copied as plain text, as displayed on the screen online:

I l reg olamento (CE ) n . 883/2004 del Parlamen to europeo e del Con siglio, del 29 aprile 2004, relativo al coordin a men to dei sistemi di sicurezza sociale ( 3 ) GU L 166 del 30.4.2004, pag . 1. ( 3 ), dispon e che il con tenuto deg li alleg ati I I , X e XI di detto reg olamento sia determin ato prima della sua data di applicazion e

Auto Context and ColT are both said to have been abandoned, but both can do this ("copy as plain text"). I am too lazy to give you a link...


IainB

  • Supporting Member
  • Joined in 2008
  • Posts: 7,540
  • @Slartibartfarst
Re: Copying from pdf's from the web (download them first!)
« Reply #2 on: January 30, 2017, 02:57 AM »
@Curt: Well done for understanding the lingo.    :Thmbsup:
I never could understand French.