Show Posts - jity2

Screenshot Captor / Re: Scroll capture - Errors and Home button

« on: July 19, 2020, 04:18 AM »

Dear Mouser,

I think I have a similar problem with the scrolling window capture (with some large, non-downloadable PDF files shared on Google Drive).
An idea: the "save each capture as a separate image" feature is great, but could it run automatically, without keeping all the captured images in RAM at once? That way, no more RAM problem! ;)
I.e., Screenshot Captor would save each screenshot file to the hard drive just before scrolling to the next page. ;)
Then I could split all the saved image files in one go, for instance with XnView (see Tools/Batch processing), and make one big PDF out of them. ;)

Many thanks in advance ;)

Post New Requests Here / Re: REQ: CLI : basically unzip + run PDFTextChecker and move resulting files

« on: November 21, 2019, 09:09 AM »

Hi 4wd,

Thank you so much for your updated code. ;) This is working great. ;)

Not sure whether you mean folder1 should be completely empty at the end or not since files that aren't PDF will be remaining - ie. delete everything in folder1 after running the Check-PDF function.

In which case, what happens to the non-PDF files that were in the folders?
Delete or move with PDF?

Keeping (like you have already done) what is left inside folder1 is fine. ;)

I did some tests, and here is a zip file with examples:
The main bug (see "2-itextsharp.pdfa") occurs when a folder inside folder1 is named "bla.pdf": it causes a "stackoverflow" bug in PowerShell (PowerShell closes and restarts when I run the script in edit mode).
The other thing is strange filenames (maybe some Asian text?). Edit 1: it is because, on my computer, the folder path is too long for some of them! Otherwise it renames them fine! ;)
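
For the folder named "bla.pdf", one possible guard is to filter out folders before anything is handed to the PDF check, so that only real files are processed. This is only a minimal sketch and not 4wd's actual code; the function name Get-PdfFile is hypothetical.

```powershell
# Hypothetical helper: list only real *.pdf files under a root folder.
# Folders whose name happens to end in ".pdf" are skipped via PSIsContainer.
function Get-PdfFile {
    param (
        [Parameter(Mandatory = $true)]
        [string]$Root
    )
    Get-ChildItem -LiteralPath $Root -Recurse |
        Where-Object { -not $_.PSIsContainer -and $_.Extension -eq '.pdf' }
}
```

Something like `Get-PdfFile -Root $folder1` would then return the files only, whatever the folders are called.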

The other examples are about detecting invalid PDF files (see "1-malformed_pdf"; from what I understand there are at least 2 problems: 1) invalid PDF file and 2) the image file format has not been recognized). From experience it is a complex problem, and I think it is better if I handle it by hand with PDFinfoGUI (*) and remove the files with an Excel macro, since I can see very quickly whether I need to download some important files again. So please forget about those. ;)
(*) I look for files with neither "yes" nor "no" in the "encrypted" column and the other columns; then I copy that list, except for the important ones.
https://filebin.net/t5ai1inqcfw7vp67/20191121tests.zip?t=6qfxn4l3

Also, many thanks for the detailed explanation of long names. I appreciate it. ;) Your current filename-truncation code is already just fine for me. ;) Thanks. ;)

Currently doesn't check for the existence of a file with the same name before renaming.

If I understand correctly, it is because "folder1\venise-.pdf" would end up with the same name as a file already present in folder3, "folder3\venise.pdf". Renaming the new one (for instance with a counter, "venise1.pdf") would be fine.
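
The counter idea could be sketched roughly like this. It is only an illustration under my own assumptions, not part of 4wd's script; the function name Get-UniquePath and its parameters are hypothetical.

```powershell
# Hypothetical helper: return a destination path that does not exist yet,
# appending a counter to the base name when needed (venise.pdf -> venise1.pdf).
function Get-UniquePath {
    param (
        [Parameter(Mandatory = $true)][string]$Folder,
        [Parameter(Mandatory = $true)][string]$FileName
    )
    $base = [System.IO.Path]::GetFileNameWithoutExtension($FileName)
    $ext  = [System.IO.Path]::GetExtension($FileName)
    $candidate = Join-Path $Folder $FileName
    $counter = 1
    # Keep trying venise1.pdf, venise2.pdf, ... until a free name is found.
    while (Test-Path -LiteralPath $candidate) {
        $candidate = Join-Path $Folder ("{0}{1}{2}" -f $base, $counter, $ext)
        $counter++
    }
    return $candidate
}
```

The result could then be used as the destination of the move, e.g. `Move-Item -LiteralPath $source -Destination (Get-UniquePath -Folder $folder3 -FileName 'venise.pdf')`, where `$source` stands for the file being moved.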

I have added a small function in order to delete empty folders in folder3

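Function Delete-SmallPDF2 {
    param (
        [bool]$delSmall
    )
    # $folder3 is defined elsewhere in the script.
    # Optionally delete PDF files smaller than 2 KB (folders named *.pdf are skipped).
    if ($delSmall) {
        Get-ChildItem "$($folder3)\*" -Include *.pdf -Recurse |
            Where-Object { -not $_.PSIsContainer -and $_.Length -lt 2048 } |
            ForEach-Object { Remove-Item -LiteralPath $_.FullName -Force }
    }
    # Delete every folder under $folder3 that contains no files at any depth.
    # Test-Path skips subfolders already removed together with their parent.
    Get-ChildItem "$($folder3)" -Recurse |
        Where-Object { $_.PSIsContainer -and (Test-Path -LiteralPath $_.FullName) } |
        Where-Object { @(Get-ChildItem -LiteralPath $_.FullName -Recurse | Where-Object { -not $_.PSIsContainer }).Count -eq 0 } |
        Remove-Item -Recurse
}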

Edit 2:
I forgot to mention that I got this error message:


PS C:\Windows\System32\WindowsPowerShell\v1.0> C:\Users\E\Documents\tests\jityPDF.ps1
Add-Type : Cannot bind parameter 'Path' to the target. Exception setting "Path": "Cannot find path 'C:\Windows\System32\WindowsPowerShell\v1.0\itextsharp.dll' because it does not exist."

So I copied "itextsharp.dll" to C:\Windows\System32\WindowsPowerShell\v1.0\itextsharp.dll
That may explain why, when I try to use 4wd's code on another drive (for example L:\), even though I copied the 7z.dll, 7z.exe and itextsharp.dll files and adapted the code to the new locations, the script shows no errors but fails to run the Check-PDF part properly: it moves some image-based PDFs into folder2 instead of folder3. So I am staying on C:\ ;)
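
The error message suggests that the DLL was looked up in the current directory of the PowerShell session rather than next to the script. If that is the case, one possible workaround is to build the full path from the script's own folder and to fail early if the file is missing. A minimal sketch, under that assumption; the function name Resolve-LocalFile is hypothetical and this is not 4wd's code.

```powershell
# Hypothetical helper: build the full path of a file expected to sit in a
# given folder, and fail early with a clear message if it is missing.
function Resolve-LocalFile {
    param (
        [Parameter(Mandatory = $true)][string]$BaseFolder,
        [Parameter(Mandatory = $true)][string]$FileName
    )
    $full = Join-Path $BaseFolder $FileName
    if (-not (Test-Path -LiteralPath $full)) {
        throw "Required file not found: $full"
    }
    return $full
}

# Assumed usage inside the script ($PSScriptRoot needs PowerShell 3.0 or later):
# Add-Type -Path (Resolve-LocalFile -BaseFolder $PSScriptRoot -FileName 'itextsharp.dll')
```

That way the script would not depend on which folder or drive PowerShell was started from.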

Thanks in advance ;)
Jity2

Post New Requests Here / Re: REQ: CLI : basically unzip + run PDFTextChecker and move resulting files

« on: November 18, 2019, 12:36 PM »

Hi 4wd,
Wow. Many thanks. This is fantastic. ;)

I did some tests, and the only small things that are not working are:
- it doesn't delete the empty folders in folder1 (whether the subfolders are completely empty or only contain other kinds of files).

- it fails with this error:


Testing for text PDFs ...
Move-Item : Cannot retrieve the dynamic parameters for the cmdlet. The specified wildcard character pattern is not valid: [Lac_ven_Drnyvn,_Sr_ehajoe_Uduizn,_Giles_Suilo-Sm(e-kjd.org).pdf
At C:\Users\E\Documents\test\jityPDFt3v5_7zip.ps1:82 char:7
+ Move-Item "$($files[$i])" -Destination "$($outfile)"
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidArgument: (:) [Move-Item], ParameterBindingException
+ FullyQualifiedErrorId : GetDynamicParametersException,Microsoft.PowerShell.Commands.MoveItemCommand

Then the "Testing for text PDFs" step stops, and the remaining files of the same kind are not processed.

Also, if a PDF file has "[" in its filename, the script ignores it (but still creates an empty folder in folder2 if the PDF is located in a subfolder).
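
Both symptoms look consistent with "[" being treated as a wildcard character when the file name is passed as a normal path. If so, passing the name with -LiteralPath should avoid the problem, since it takes the path exactly as written. A minimal sketch, not 4wd's code; the function name Move-FileLiteral is hypothetical.

```powershell
# Hypothetical helper: move one file whose name may contain wildcard
# characters such as "[" or "]". -LiteralPath disables wildcard expansion.
function Move-FileLiteral {
    param (
        [Parameter(Mandatory = $true)][string]$Source,
        [Parameter(Mandatory = $true)][string]$DestinationFolder
    )
    # Create the destination folder only when there is really a file to move.
    if (-not (Test-Path -LiteralPath $DestinationFolder)) {
        New-Item -ItemType Directory -Path $DestinationFolder | Out-Null
    }
    Move-Item -LiteralPath $Source -Destination $DestinationFolder
}
```

The same switch exists on Test-Path, Get-ChildItem and Remove-Item, so it would probably need to be applied wherever the script touches these files.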

And the great thing is that it finds encrypted PDFs, with or without OCR already done, and moves them accordingly. ;)

Thanks in advance ;)
Jity2

Post New Requests Here / Re: REQ: CLI : basically unzip + run PDFTextChecker and move resulting files

« on: November 17, 2019, 01:39 PM »

Hi 4wd,

Many thanks for your detailed explanations. ;) Your last code update is working great for 'unzipping' zip and rar files. ;) :Thmbsup:

(A quick note for those following along: I downloaded this 7-Zip version (https://www.7-zip.org/a/7z1900-x64.exe) and extracted it to "C:\Program Files\7-Zip". Then I copied "7z.exe" and "7z.dll" into my working directory.)

Still doesn't do extracted archives ... still thinking about it :P

I thought that if I ran it twice, it would find them during the next run (example: folder1\6\6\example.zip), which would be fine, but it does not.
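
One way to pick up archives that were themselves inside archives might be to repeat the extraction pass until no unprocessed archive is left. This is only a rough sketch under my own assumptions, not 4wd's code: the function name Expand-NestedArchive, its parameters, and the default location of 7z.exe are hypothetical, and it extracts each archive into the folder where it was found.

```powershell
# Hypothetical sketch: keep extracting until no unprocessed archive is left,
# so archives that were themselves inside archives are picked up too.
function Expand-NestedArchive {
    param (
        [Parameter(Mandatory = $true)][string]$Root,
        # Assumed location of 7z.exe (copied into the working directory).
        [string]$SevenZip = '.\7z.exe'
    )
    $done = @{}   # full paths of the archives already extracted
    do {
        $pending = @(Get-ChildItem -LiteralPath $Root -Recurse |
            Where-Object {
                (-not $_.PSIsContainer) -and
                (@('.zip', '.rar') -contains $_.Extension) -and
                (-not $done.ContainsKey($_.FullName))
            })
        foreach ($archive in $pending) {
            # x = extract with full paths, -o = output folder, -y = answer yes to prompts
            & $SevenZip x $archive.FullName "-o$($archive.DirectoryName)" -y | Out-Null
            $done[$archive.FullName] = $true
        }
    } while ($pending.Count -gt 0)
}
```

Each archive is extracted only once, so the loop stops as soon as a pass finds nothing new.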

Thanks in advance ;)
Jity2

Post New Requests Here / Re: REQ: CLI : basically unzip + run PDFTextChecker and move resulting files

« on: November 16, 2019, 04:48 AM »

Hi again,

Would this help for testing with PowerShell whether PDFs have been OCRed?
https://superuser.com/a/1278521/27956

Thanks in advance, ;)
Jity2
