Topic: server-side configuration Open to View PDFs in browser not download them

questorfla

  • Supporting Member
  • Joined in 2012
  • Posts: 570
  • Fighting Slime all the Time
This is a pretty urgent need and I would be willing to look at any alternatives.
In past years this was not really a problem, and I am not sure when the rules changed nor with what software.
On a site we host there are hundreds of PDF files in various libraries.

The Intent is for a User to be able to Open and view the files as much as necessary but other than making a screenshot, they should not be able to download and save the original PDF from the site.

This is a standard Apache website using a PHP/MySQL setup to manage the documents.  I have tried inserting a few suggested code snippets with no results.  The same sites never acted this way before, but I am unable to determine when this new "Download first/Open 2nd" behavior became the default.

People visiting the website DO need to be able to view the files, but I am beginning to believe it would be best (IF possible) to find a way to let them view the files using a viewer hosted on the web server, as the only way to accomplish what is needed.

And there are so many other PDF viewers out there that I am sure any attempt to prevent this after the file is already in their temporary internet files would be a waste.

Any Advice Appreciated

Stoic Joker

  • Honorary Member
  • Joined in 2008
  • Posts: 6,646
The Intent is for a User to be able to Open and view the files as much as necessary but other than making a screenshot, they should not be able to download and save the original PDF from the site.

I honestly can't fathom how that would be possible, as all web content is downloaded before being displayed. DRM (ick) can try to prevent print/save/etc. but for the truly creative there are always options for circumventing it.

wraith808

  • Supporting Member
  • Joined in 2006
  • Posts: 11,186
These users that you're trying to prevent from downloading are internet users, correct?

questorfla

Yes, these are internet users.  And to be honest, "I" personally have no issue with the downloads anyway, and I have explained to the >"Powers That Be"< that these files have "ALWAYS" been downloaded.  If they (the viewing person) see it, they have downloaded it.  The difference is that in the past, this download stayed in their Temporary Internet Files and had a random number for a title.

Please remember that I am speaking of "Ordinary Mortals" here and not anyone who has even a shred of an idea how things really work.  For these "Internet visitors" who come to the website to view these files, all I am trying to do is either get it to work like it used to, such that the files end up in the temporary internet folder as random numbers, OR

in some way manage to control how much further they are able to disseminate what they see.  I am well aware that there are hundreds of ways to "cheat", but this issue is about normal use by normal "non-technical" people.  The only reason this even came up was that one of the clients happened to notice that a laptop they had been using, which was not their own but rather one that was to be returned to a "stockpile", contained copies of all these documents they thought they were only viewing, now stored in the downloads folder.  While the files are not "secret", they are also not something that was meant to be shared with the next user of that laptop.  This "storage in the downloads folder" is something that has started relatively recently; viewing PDF files did not work like this in the recent past, maybe 6 months ago?  I am not sure when I first noticed it, but I have known that eventually this issue would come up.

I have considered things like encryption, with the file unable to be read without a "key" which could be provided to the people who are reading them.  ("I" know this is a dumb idea and pointless, as the reader could simply supply that key to anyone along with the encrypted file.  But you would be surprised at how many people will BELIEVE that this would help, and belief in something even when wrong is sometimes all the "cure" that is needed!)

Anything I can do "server side" to make things "Look" like they used to is what I am trying to find right now.  A way to either automatically prevent the downloading of the file to the downloads folder where it looks like a normal pdf file with name title body etc.    OR    If it IS downloaded to go back to putting it in the temporary internet files folder stored under a random bunch of letters and numbers.

Eventually, the ideal solution would be a way to do something on the “Server-Side” to make the file >"unusable/unreadable"< by anyone beyond the person who was given the link to the website.  The Pdf's are already set with every protection offered by Adobe but almost everyone these days knows plenty of ways to by-pass that type of password protection.  Fortunately, this is not a "High Security" Issue but more one of "What They Want the Site To Do".  I would be more than willing to use a different file type, or any other option that would achieve this. 

Please keep in mind that for a “Quick Fix”  a "Trick" is as "Good as Gold" in this case.  If the "TRICK" works to make it seem as though there is not a copy of that file on the viewer's system that can easily be passed around to anyone they please, I am dealing with people who are only concerned with things "as they know them to be".  Like a magician at a child's birthday party, the problem can be solved with "sleight of hand" temporarily if necessary until I can find a way to employ a "REAL" solution.

The Ultimate Solution will probably involve using some proprietary file type that would require using a "viewer" which we would have to provide to the people who are supposed to be able to "view" it.  If anyone here can offer alternatives, I am open to anything but I need to resolve it, at least in some temporary way, as soon as possible. 

I also would be very interested in knowing what change over the last 6 months would have created this issue?

There are several of these sites and every one of them now has the same issue, but as recently as 6 months ago all of them worked fine (as far as I know).  The shift to the file ending up as an actual download in the viewer's downloads folder, instead of as a rather anonymous file in the temporary internet files cache, just "happened".  If I knew what caused this change, there could be a quick solution.

While looking at other sites that host PDF files, I found one that shows its files as "filename.pdf" with a download button next to each.  When I clicked the filename nothing happened, but when I clicked the download button, the file opened in a PDF viewer of some kind.  If I do the SAME THING on one of our sites, the file immediately goes to the Chrome downloads.  I just found this yesterday, so I am not sure which app opened when I clicked that "Download" button, but if I could get my sites to do the same thing that would solve a lot of the problem.

On their site, instead of instantly downloading in Chrome (the browser I was using), it opened in another application which allowed me to read the whole PDF on-screen.  This "shell viewer" does not look like Adobe and more resembles the one built into Edge.  As long as I did not click the option to SAVE or PRINT (which were icons of a floppy disk and a printer), when I closed the file after reading there were no "left over" files on my system that I could easily find.

However this happened, it would be perfect for what I need.  Knowing Windows 10, this could even be tied to whatever program is set as the "Default PDF handler" on the web server that hosts the Apache websites.  At the least it demonstrates that there must be a solution, even though I do not yet know what it is.

Thanks to everyone for any ideas, as this is an urgent situation in the eyes of some and I have to resolve it in a way that has the least impact on the ability of the intended people to view the files.
« Last Edit: March 03, 2016, 12:26 PM by questorfla, Reason: Spelling and clarification »

wraith808

The only way is to make the user culpable in their restrictions, and trust your workaround.  And even then, you'll have problems.

You need to make something that downloads the PDFs to a cache that you control, and renders them itself using that cache.

A good example is shown here: https://www.scribd.c...Delta-Green-Rulebook

I uploaded that as a PDF.  Note that you can't download it as such.  It's rendered as an image.  And when you go there, the original PDF is not in your internet cache. (As a note, you can download the original PDF through their interface- but that's allowed use rather than a function of the mechanics for viewing).
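
The "render it yourself from a cache you control" idea can be sketched roughly as follows. This is only an illustration, not how Scribd does it: it assumes Poppler's `pdftoppm` command-line tool is installed on the server, and the function names and cache layout are made up for the example. The site would then serve the resulting page images instead of the PDF itself.

```python
import shutil
import subprocess
from pathlib import Path


def build_render_command(pdf_path, out_prefix, dpi=100):
    """Return the pdftoppm argument list that renders every page to a PNG."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def render_pdf_to_images(pdf_path, cache_dir, dpi=100):
    """Render pdf_path into cache_dir and return the page images, sorted by name.

    Assumes Poppler's pdftoppm is on the PATH; raises RuntimeError otherwise.
    """
    if shutil.which("pdftoppm") is None:
        raise RuntimeError("pdftoppm (Poppler) is not installed on this server")
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem
    # pdftoppm writes files named <prefix>-<page number>.png
    subprocess.run(build_render_command(pdf_path, cache_dir / stem, dpi), check=True)
    return sorted(cache_dir.glob(stem + "-*.png"))
```

Visitors who only ever receive page images never get the original file, although they can still save or screenshot the images.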

questorfla

Thanks Wraith.  Here is the site where I found the example I described, and I am trying to contact whoever hosts this setup so they can explain to me how to institute it on my servers.  So far I have not gotten past the secretary who answers the phone at the business :(  Maybe if you see what it does, this will look familiar to you. Click HERE

I hope I did that properly as I do not normally post links.

On that page, if you click any of the "Download" buttons, it opens a viewer of some kind; it does not automatically download anything.  In that viewer you DO have options for saving, printing, etc.  I noticed that the link that opens when you click Download does not go directly to the PDF file itself.  I am now even seeing that if the file is rather large you can see that it is being opened by something called WORD; I am not sure whether it is an app or the full program, nor whether it is on my system or on the HOST system.
 
I am using CHROME as a browser with no special add-ons.  My Windows 10 system is set to use FOXIT READER as the default for PDF file extensions.  This site does exactly what I need my site to do as far as handling PDF files, as I cannot ask each user to make changes to their default configurations.


Perhaps that site has WORD set to be the default viewer for PDF files or some other?   If I choose to open a PDF file using WORD on my own system locally I get various warnings about "may not create a 100% accurate view of the item displayed" and other disclaimers.  But on this site, clicking download apparently causes the document to either open in Word or a WORD APP either at their end or go directly into Word at my end.

While this does leave open the 'OPTION' of saving, printing etc, it would put an end to the Automatic downloading of every pdf when viewed.




wraith808

On my browser, it downloads the PDF and opens it in the Adobe Reader plugin.  The URLs even point to the files themselves, i.e. http://www.regionalu...Quality%20Report.pdf

That's the first link on that page.

I'm also using Chrome with no special extensions to handle anything.

questorfla

That IS odd that it would react differently on different systems using the same browser.
Now more than ever I want to blame Windows 10.  There was always a sort of "default program" for various file types, but Windows 10 has taken that to a whole new level.
I may try to attach a screenshot here.
In any event, your method would work as well or better.  How hard would it be to use that on pre-existing Apache websites?  Is it an application or a method of display, and of course, where can I buy it?  At least with yours, I can see how I could BLOCK downloads.  The other looks to me like it opens in some kind of Apache ...  or maybe it is Foxit Reader, since Foxit is my default for opening PDF files on that system.  Shows how seldom I ever use it!  I like yours better.

wraith808

I guess the answer to that question depends on what the powers that be are thinking.  Why do they want to restrict downloading?  Is it because of printing and/or copy and paste?  If so, it seems that it would be better to alter the PDF to restrict those.

Otherwise, what language are you using?  I looked for something you can use from .NET, and came up empty.  That might be a custom solution on scribd.

questorfla

It's like this:
The downloaders are people on a committee who are discussing candidates for a position to be filled at a company.  Because the documents they are reading are only resumes and similar, there isn't much there which could not be obtained from various public resources. 
But!
IF someone were to have access to all the resumes that had been read by a specific committee or member of one they would know who the specific applicants are and could also make a pretty good guess at how far along specific applicants might be in the deliberation process.  Since every document involved is now copied to the downloads folder of every committee member's laptop it just makes a very neat package for anyone who wanted to be underhanded about it.

As this is a temporary committee formed for only this one purpose, they are most likely using laptops drawn from a company "equipment pool".  While they "could" just be told to make new rules about completely cleaning all files and folders from each laptop after every meeting (something I would have advised doing anyway as a matter of course), it isn't how things have been done before.  And I get the unenviable job of having to explain WHY things have changed to people who barely understand how things work in the first place, as well as defend methods which once worked one way but now work another, while trying to explain that I have no control over it anyway. :(

In my eyes, if it matters this much they should be doing it altogether differently anyway.  BUT.  That decision is above my paygrade.  I only have to find a way to prevent the files from being automatically downloaded to the "downloads folder" while keeping the names and text in such an obvious and easy to access form.   Even normal people know to look for downloaded files in the downloads folder.  Well... most do anyway.

But few if any would dream that the music they are listening to on Pandora could easily be captured and saved to make a "bootleg" recording to keep if they only knew where to find the stored data stream and convert it back to an mp3 format.  So as long as those files went to that Unnamed temporary data storage compartment, all was well in the Universe.  I need to put them back into the "dark-hole" so to speak.


 

Stoic Joker

As this is a temporary committee formed for only this one purpose, they are most likely using laptops drawn from a company "equipment pool".  While they "could" just be told to make new rules about completely cleaning all files and folders from each laptop after every meeting (something I would have advised doing anyway as a matter of course), it isn't how things have been done before.


Ding! Ding! Ding!!! We have a Winner!!!

Seriously, this should be a totally carved in stone policy for any company controlled equipment/data. Part of what we do as an MPS (Managed Print Services) provider is to lease printers to clients that need high-end printing equipment for variable periods of time. Most of the devices have Send-2-X networking features that require usernames, passwords, address books, etc... to do said task. So there is potential for much sensitive info to be gleaned from the device if one knows how to extract it. This is why we have a carved in stone policy of enabling the security encryption on these devices and wiping them when they're returned. From anywhere, for any reason. Data should never be left on a machine...ever.


One other thing that comes to mind is that your issue of auto-downloaded files sounds much like a - completely client side - browser configuration issue. The type of issue that would never have existed in the first place if you had a proper domain...that allowed you to enforce configuration restriction policies on company (should be) controlled equipment. You may want to consider looking into a - Cloud Based - Intune/Azure AD solution that affords you the ability to control what happens to your company's information on both company controlled and BYOD equipment. That would show management some value because you could flatten a device remotely as they were using it to demonstrate the advantage of having truly and properly centralized administration available.

skwire

  • Global Moderator
  • Joined in 2005
  • Posts: 5,286
Folks, are we certain this isn't a server-side MIME type issue?

Typically, PDFs have a server-side content-type of application/pdf which browsers can recognise and choose to display said PDF within the browser.  However, if PDFs are set as a content-type of application/octet-stream, most browsers would then simply download the file.  At any rate, it's something worth checking out and you should be able to see if this is the problem by looking at the headers returned in the links you're using for the PDFs.
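
A quick way to look at those headers, as a minimal sketch: the snippet below sends a HEAD request using Python's standard library and makes a rough guess at what a browser would do. The function names are invented for this example, and the guess is only a rule of thumb, since the visitor's own browser settings can still force a download.

```python
import urllib.request


def fetch_headers(url):
    """Send a HEAD request and return the response headers as a plain dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return dict(response.headers.items())


def likely_browser_behaviour(headers):
    """Guess "display" or "download" from the response headers alone."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    # Drop any parameters after the media type, e.g. "; charset=..."
    content_type = lowered.get("content-type", "").split(";")[0].strip()
    disposition = lowered.get("content-disposition", "").strip()
    if disposition.startswith("attachment"):
        return "download"  # the server explicitly asks for a download
    if content_type == "application/pdf":
        return "display"   # the browser may show it in its built-in viewer
    return "download"      # e.g. application/octet-stream


# Example, using a placeholder URL:
# print(likely_browser_behaviour(fetch_headers("https://example.com/some.pdf")))
```

If the type comes back as `application/octet-stream`, or there is a `Content-Disposition: attachment` header, the Apache side is worth a look: `AddType application/pdf .pdf` (mod_mime) and `Header set Content-Disposition inline` (mod_headers) are the usual directives for this, though whether they apply depends on how the PHP code serves the files.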

wraith808

Folks, are we certain this isn't a server-side MIME type issue?

Typically, PDFs have a server-side content-type of application/pdf which browsers can recognise and choose to display said PDF within the browser.  However, if PDFs are set as a content-type of application/octet-stream, most browsers would then simply download the file.  At any rate, it's something worth checking out and you should be able to see if this is the problem by looking at the headers returned in the links you're using for the PDFs.

But if it changed suddenly, then the mime type shouldn't have changed- but it's worth checking.  Good catch, skwire!

questorfla

Thanks skwire, and I did bring that up to the web designers.  The reply was "We have not made any changes to anything", along with
"This must be something (you) changed on the hosting servers".  Again, as far as I know, "NO one" did anything to anything.  All was well until one day it wasn't.  I was informed just a few days ago and started looking into it, so I can't even say for sure how long it has been happening.

But I am very sure of what once happened and this isn't it.

RE: "  ...However, if PDFs are set as a content-type of application/octet-stream.."  The configuration contains statements doing the exact opposite, to prevent them from downloading, unless Apache has changed the way this is supposed to be stated.  However, I think you are on the right track.  Something along these lines is exactly what is happening, if I can only figure out why and when it began.

BUT!  In the process, if I could find a better way that offered more control over the documents themselves it would certainly be a "Bonus".  This has always been a problem even when it worked properly.  The original idea of buying ADOBE Acrobat was a waste before we even got it.   The "crack" programs for all the ADOBE "file-locks" are FREE. 

The applications I looked at a couple of years ago to handle server-side viewing control were either only available if you used that particular company's web hosting, or so expensive and complicated as to not be feasible for our needs.  Maybe there are better ways now and I should do some research on current methods.

Something that could do both (or multiple products if necessary) would be an even better solution, as there are times when we would want specific people to be able to download the files but would like to be able to control further dissemination of them to other systems or other people.

questorfla

If I could get Xpdf into the site's display module correctly, and if I have interpreted what it does and why, that should solve the problem.  Maybe.
Either way, I have now dipped into Dark Waters as far as I am concerned.  Adding the viewer into the site, even if it works, is beyond anything I can do.  And Xpdf was the simplest embeddable viewer I could find.
As there are no cases where the PDF files would ever need to be downloaded, it would be fine if they always displayed and had no option to download at all.

questorfla

OK, update.  I now have a Moodle app (??) that is a cross-platform web viewer for multiple file types, but the author does not use Windows at all  :(
So, I am lost in this mess trying to find the specific PHP module that is activated when someone clicks on the desired document.
If any browser reaches the file by going through this app, it works perfectly as a viewer with all the usual controls available.

So far, though, I have to do this manually by adding the filename to the address I type into the browser.

For it to be of any use it needs to work within the context of the site, as viewed by a user who clicks the name rather than typing it into their web browser.  I guess it's back to Website School for me :(
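
The missing piece described above is having the site generate the viewer address for each document, instead of someone typing the filename in by hand. A minimal sketch of that step follows. The viewer path `/viewer/index.php` and the `file` parameter are purely hypothetical stand-ins, since the thread does not say what address the Moodle app actually expects, and on the real site this logic would live in the PHP that builds the document list.

```python
from urllib.parse import urlencode

# Hypothetical address of the web viewer; replace with whatever the app really uses.
VIEWER_URL = "/viewer/index.php"


def viewer_link(filename, viewer_url=VIEWER_URL):
    """Build the link a user clicks, with the filename safely URL-encoded."""
    return viewer_url + "?" + urlencode({"file": filename})


def document_links(filenames, viewer_url=VIEWER_URL):
    """Pair each display name with its viewer link, ready for the page template."""
    return [(name, viewer_link(name, viewer_url)) for name in filenames]


for name, link in document_links(["Quality Report.pdf", "Smith & Jones.pdf"]):
    print(name, "->", link)
```

The site's document list would then point each filename at the viewer link rather than at the PDF itself, so a normal click goes through the viewer.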