Hi,
Many thanks, 4wd.
Much appreciated.
I ran some tests with your updated script (one older month of pages saved with IE and one saved with Firefox), and it handles html and htm files in the same run.
As I have quite a lot of files, converting the htm and html files should take a few months! But it seems that I can speed up the conversion if I run several PowerShell instances (from copied and renamed copies of the script, shortcut included).
DONE: Mass convert already locally saved html (+htm +mht) files to pdf
I have added a few manual steps:
Modified from
https://stackoverflo...irectory-recursively , here is the PowerShell code that I use to remove the pdf files that are smaller than 3 KB (created by wkhtmltopdf; in my case they in fact contain no text):
Get-ChildItem $path -Filter *.pdf -recurse -file | ? {$_.length -lt 3000} | % {Remove-Item $_.fullname}
Then I use the freeware "Remove Empty Directories"
http://www.jonasjohn.de/red.htm which removes all empty directories in all the subfolders. It is very powerful IMHO.
I don't know if this is possible, but it would be great if the PowerShell script could skip htm and html files that are smaller than 3 KB. Thanks in advance.
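For what it's worth, here is a minimal sketch of how such a size filter might look. The function name Get-ConvertibleFile and its parameters are hypothetical (they are not part of 4wd's script), and the 3000-byte default simply mirrors the threshold I use for the pdf cleanup above.

```powershell
# Hypothetical helper: list *.htm / *.html files of at least $MinBytes bytes.
function Get-ConvertibleFile {
    param(
        [Parameter(Mandatory)] [string] $Folder,   # source folder to scan
        [int] $MinBytes = 3000                      # assumed threshold, same as the pdf cleanup
    )
    # The trailing '*' lets -Include match on file names while recursing.
    Get-ChildItem -Path (Join-Path $Folder '*') -Include *.htm, *.html -Recurse -File |
        Where-Object { $_.Length -ge $MinBytes }
}

# Possible use in the script, instead of the plain Get-ChildItem call:
# $aFiles = @(Get-ConvertibleFile -Folder $srcFolder)
```

The -File switch needs PowerShell 3 or later, which the script already requires.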
You also have a great memory for my 2016 request.
But I must acknowledge that I wouldn't have been able to modify the remaining 10% of the new code!!!
For mht files to pdf:
Thanks for the mht to html link. It reminds me that finding a simple solution can lead to many trials!
Mine were not created with Internet Explorer; they were, and still are, created with Google Chrome. In my manual tests, these mht files often display better in Chrome than in IE.
Here are my tests :
[windows+R]
cmd
cd C:\Program Files (x86)\Google\Chrome\Application
Apparently Chrome also understands if I change the command from
chrome --headless --print-to-pdf="C:\result\20170619_075623.pdf" "C:\source\t2\20170619_075623.htm"
to
chrome --headless "C:\source\t2\20170619_075623.htm" --print-to-pdf="C:\result\20170619_075623.pdf"
After some tests, and thanks to
https://www.autohotk...iewtopic.php?t=26819, I got working AutoHotkey code. It works,
but it also copies other files (png, etc.) into the target folder!
WorKingDir := "C:\Program Files (x86)\Google\Chrome\Application"
pdParams := "chrome.exe --headless "
FileSelectFolder,SourcePath,,0,Select Source Folder
If SourcePath =
    ExitApp
FileSelectFolder,TargetPath,*%SourcePath%,0,Select Target Folder
If TargetPath =
    ExitApp
RunWait % comspec " /c xCopy """ SourcePath A_loopField """ """ TargetPath A_loopField """ *.mht /s /i /y",, Hide
Loop, Files, % TargetPath "\*.mht" , R
{
    SplitPath, A_LoopFileFullPath, name, dir, ext, name_no_ext
    outPDF_repared := dir "\" name_no_ext "" ".pdf"
    pCmd := pdParams " " """" A_LoopFileFullPath """" " " """" "--print-to-pdf="outPDF_repared """"
    RunWait % comspec " /c " pCmd , % WorKingDir , Hide
    FileAppend, % "Result pdrepair`n" outPDF_repared "`n", % A_Temp "\LOG_pdrepair.txt"
    FileRead, outLOG, % TargetPath "\LOG.txt"
    FileAppend, % outLOG "`n" , % A_Temp "\LOG_pdrepair.txt"
    FileDelete, % A_LoopFileFullPath
}
Msgbox 0x40000,, % "END!",1
ExitApp
I have tried to change:
RunWait % comspec " /c xCopy """ SourcePath A_loopField """ """ TargetPath A_loopField """ /s /i /y",, Hide
with
RunWait % comspec " /c xCopy """ SourcePath A_loopField """ """ TargetPath A_loopField """ *.mht /s /i /y",, Hide
or
RunWait % comspec " /c xCopy """ SourcePath A_loopField """ *.mht """ TargetPath A_loopField """ /s /i /y",, Hide
or
RunWait % comspec " /c xCopy *.mht """ SourcePath A_loopField """ """ TargetPath A_loopField """ /s /i /y",, Hide
or
RunWait % comspec " /c xCopy "\*.mht" """ SourcePath A_loopField """ """ TargetPath A_loopField """ /s /i /y",, Hide
Alas, I am stuck!
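As far as I know, xcopy expects the wildcard inside the source argument itself (something like "C:\source\*.mht" followed by the target folder and /s /i /y), not as a separate third argument, but I have not verified that with my folders. As an alternative, here is a rough PowerShell sketch that copies only the mht files and keeps the sub-folder layout. The function name Copy-MhtTree and its parameters are hypothetical, and it assumes Windows-style paths.

```powershell
# Hypothetical helper: copy only *.mht files, keeping the sub-folder layout.
function Copy-MhtTree {
    param(
        [Parameter(Mandatory)] [string] $Source,   # folder that contains the mht files
        [Parameter(Mandatory)] [string] $Target    # folder that receives the copies
    )
    # Full source path without a trailing separator, used to cut the relative part.
    $root = (Resolve-Path -LiteralPath $Source).ProviderPath.TrimEnd('\', '/')
    Get-ChildItem -LiteralPath $Source -Recurse -File |
        Where-Object { $_.Extension -eq '.mht' } |   # skips png and other side files
        ForEach-Object {
            $relative = $_.FullName.Substring($root.Length + 1)
            $dest     = Join-Path $Target $relative
            $destDir  = Split-Path $dest -Parent
            if (-not (Test-Path -LiteralPath $destDir)) {
                New-Item -Path $destDir -ItemType Directory | Out-Null
            }
            Copy-Item -LiteralPath $_.FullName -Destination $dest -Force
        }
}

# Possible use:
# Copy-MhtTree -Source 'C:\source' -Target 'C:\result'
```

With the PowerShell route the copy step may not be needed at all, since the pdf files can be written straight into the output folder.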
So I have tried to modify your PowerShell script:
<#
CTP.ps1
Recursively convert *.mht to PDF.
#>
Function Get-Folder {
    Add-Type -AssemblyName System.Windows.Forms
    $FolderBrowser = New-Object System.Windows.Forms.FolderBrowserDialog
    [void]$FolderBrowser.ShowDialog()
    $temp = $FolderBrowser.SelectedPath
    If($temp -eq '') {Exit}
    If(-Not $temp.EndsWith('\')) {$temp = $temp + '\'}
    Return $temp
}
If($PSVersionTable.PSVersion.Major -lt 3) {
    Write-Host '** Script requires at least Powershell V3 **'
} else {
    Write-Host 'Choose folder with MHT files: ' -NoNewline -BackgroundColor DarkGreen -ForegroundColor White
    $srcFolder = (Get-Folder)
    Write-Host $srcFolder
    Write-Host 'Choose output folder: ' -NoNewline -BackgroundColor DarkGreen -ForegroundColor White
    Do {$dstFolder = (Get-Folder)} While($dstFolder -eq $srcFolder)
    Write-Host $dstFolder
    $aFiles = (Get-ChildItem -Include *.mht -Path ($srcFolder + "*") -Recurse)
    for($i = 0; $i -lt $aFiles.Count; $i++) {
        $inFile = [string]$aFiles[$i]
        Write-Host 'File:' $inFile -BackgroundColor DarkBlue -ForegroundColor Yellow
        $outFile = $dstFolder + $inFile.Replace($srcFolder, "") + '.pdf'
        $temp = Split-Path $outFile -Parent
        if (!(Test-Path $temp)) {
            New-Item $temp -ItemType Directory | Out-Null
        }
        $args = "`"$($infile)`" chrome --headless --print-to-pdf=`"$($outFile)`""
        Start-Process -FilePath "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" -Wait -NoNewWindow -ArgumentList $args
    }
}
See the line with "--print-to-pdf=". Alas, this doesn't work!
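One guess at the cause, not yet verified: my argument string still contains the word "chrome", although -FilePath already names chrome.exe, so Chrome may be treating that word as something to open. Also, $args is an automatic variable in PowerShell, so a different name seems safer. A minimal sketch of how the argument string might be built instead, mirroring the command line that worked in cmd (the function name Get-ChromePdfArgument and the variable names are hypothetical):

```powershell
# Hypothetical helper: build the argument string for one headless conversion.
function Get-ChromePdfArgument {
    param(
        [Parameter(Mandatory)] [string] $InFile,    # mht/htm file to print
        [Parameter(Mandatory)] [string] $OutFile    # pdf file to create
    )
    # Only the switches and the quoted paths; the executable name is NOT repeated,
    # because Start-Process -FilePath already points at chrome.exe.
    "--headless --print-to-pdf=`"$OutFile`" `"$InFile`""
}

# Possible use inside the for loop (untested assumption):
# $chromeExe  = 'C:\Program Files (x86)\Google\Chrome\Application\chrome.exe'
# $chromeArgs = Get-ChromePdfArgument -InFile $inFile -OutFile $outFile
# Start-Process -FilePath $chromeExe -ArgumentList $chromeArgs -Wait -NoNewWindow
```

For the test files above this yields the same switches, in the same order, as the first cmd command that worked for me.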
@lainB: I am now saving mht files like you (
https://www.donation....msg417446#msg417446). I no longer use Google Docs for uploaded files (I used to save htm files from Firefox, but I have stopped). I now mainly upload pdf files to Google Drive.
Many thanks in advance