Author Topic: DONE: Delete every N files, or keep only every N files  (Read 14642 times)

Deozaan

  • Charter Member
  • Joined in 2006
  • ***
  • Points: 1
  • Posts: 9,747
DONE: Delete every N files, or keep only every N files
« on: April 21, 2015, 06:12 PM »
Hi folks,

When I do Ludum Dare, I take screenshots which I can compile into a timelapse video. This means that over the course of 72 hours I often end up with around 80,000 to 120,000 screenshots. I've recently decided that I don't need to take so many screenshots and the timelapse video will be just fine, either by being shorter or having a slower framerate. But that's all kind of beside the point.

So basically, I have ~80,000+ files all in the format of ss######.png (where the #s are digits, such as ss000001.png) and I'd like a utility that lets me easily cull them, either by deleting every Nth image or, perhaps even more effectively, by deleting all of them except every Nth image.

Is there already a utility that exists to do this? If not, can one be provided as a Coding Snack?

Thanks.
« Last Edit: September 15, 2015, 11:54 AM by Deozaan »

MilesAhead

  • Supporting Member
  • Joined in 2009
  • **
  • Posts: 7,736
Re: IDEA: Delete every N files, or keep only every N files
« Reply #1 on: April 21, 2015, 06:45 PM »
You could run this as a test to see what would happen.  I'm paranoid about doing file deletions.  But if you have a folder with no subfolders and that folder is backed up you should be ok.

This only reports how many files it would delete for the Modulus set in the script; it does not delete anything.
Also set PngFolder to the real folder.  :)

#NoEnv
SendMode Input
SetWorkingDir %A_ScriptDir%
Total := 0      ; number of PNG files seen
Deleted := 0    ; number of files that would be deleted
Modulus := 4    ; keep every Nth file, where N = Modulus
Loop c:\PngFolder\*.png
{
   Total++
   if (! Mod(A_Index, Modulus))
      continue  ; every Nth file is kept
   Deleted++    ; dry run: count only, nothing is deleted
}
MsgBox %Deleted% out of %Total% PNG files would be erased

skwire

  • Global Moderator
  • Joined in 2005
  • *****
  • Posts: 5,286
Re: IDEA: Delete every N files, or keep only every N files
« Reply #2 on: April 21, 2015, 07:36 PM »
Nice work, MilesAhead.   :Thmbsup:

Here's a script that handles both scenarios.  It should be pretty self-explanatory.  Comment/uncomment the message box and delete/recycle lines as necessary.

Code: Autohotkey
Path := "C:\tmp\04"
Mode := "K" ; Use "K" to keep every Nth file or "D" to delete every Nth file.
N := 5

Loop, % Path . "\*.png"
{
    If ( Mode = "K" )
    {
        If ( Mod( A_Index, N ) != 0 )
        {
            MsgBox, % "Delete: " . A_LoopFileFullPath
            ; FileRecycle, % A_LoopFileFullPath
            ; FileDelete, % A_LoopFileFullPath
        }
    }
    Else If ( Mode = "D" )
    {
        If ( Mod( A_Index, N ) = 0 )
        {
            MsgBox, % "Delete: " . A_LoopFileFullPath
            ; FileRecycle, % A_LoopFileFullPath
            ; FileDelete, % A_LoopFileFullPath
        }
    }
}
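
A side note on the loop above: it selects files by A_Index, so the result depends on the order in which the file loop returns them. As far as I know that order is alphabetical on NTFS but is not guaranteed on every file system. A minimal sketch of a variant that uses the number embedded in the file name instead, assuming AutoHotkey v1.1.21 or later (for Loop, Files) and names of the form ss######.png as described in the first post. The variable names FileNum and ToDelete are made up for illustration, and the sketch only counts files.

```autohotkey
#NoEnv
Path := "C:\tmp\04"   ; same placeholder folder as in the script above
N := 5                ; keep files whose number is a multiple of N
ToDelete := 0         ; hypothetical counter, for the dry run only

Loop, Files, %Path%\ss*.png
{
    ; Pull the digits out of names such as ss000123.png.
    If !RegExMatch( A_LoopFileName, "i)^ss(\d+)\.png$", Match )
        Continue                  ; skip names that do not fit the pattern
    FileNum := Match1 + 0         ; "000123" becomes 123
    If ( Mod( FileNum, N ) != 0 )
        ToDelete++                ; swap in FileDelete once the count looks right
}
MsgBox, %ToDelete% files would be deleted.
```

With this approach the same files are picked no matter how the folder happens to be enumerated, and rerunning the script after an interrupted pass does not shift which files count as "every Nth".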
« Last Edit: April 23, 2015, 08:31 AM by skwire »

Deozaan

Re: IDEA: Delete every N files, or keep only every N files
« Reply #3 on: April 22, 2015, 01:06 AM »
Thanks!

I'm letting it run now, keeping only every 6th file. It doesn't win any prizes for speed, but it does seem to be working as advertised. I'll let it run overnight so it can finish its thing and I'll report back later.

:Thmbsup:

MilesAhead

Re: IDEA: Delete every N files, or keep only every N files
« Reply #4 on: April 22, 2015, 05:34 AM »
Quote from: Deozaan on April 22, 2015, 01:06 AM
I'm letting it run now, keeping only every 6th file. It doesn't win any prizes for speed, but it does seem to be working as advertised. I'll let it run overnight so it can finish its thing and I'll report back later.

Even "compiled", an AHK script is still run through an interpreter.  But the good thing is that it is efficient in terms of programmer time: the file handling is so simple I could knock it off without making a mistake.  In a C-based language I would have to use some variant of FindFirstFile/FindNextFile, which needs to be kept as a snippet or encapsulated in a class or wrapper once it works as desired.  Lots of little gotchas when using the API calls directly.  :)

It might run a tad faster if compiled just because ahk would only be running the single script.  You may want to change it to output the filename it would keep to a result file to make sure it works as expected before inserting the actual delete line.
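
The "write it to a result file first" idea could look roughly like this. It is only a sketch, assuming AutoHotkey v1.1.21 or later (for Loop, Files); the folder path and the keep-list.txt file name are placeholders, not anything from the scripts posted in this thread. Nothing is deleted except the previous copy of the list itself.

```autohotkey
#NoEnv
Folder  := "C:\PngFolder"                   ; placeholder screenshot folder
LogFile := A_ScriptDir . "\keep-list.txt"   ; hypothetical result file
N := 6                                      ; keep every Nth file

FileDelete, %LogFile%                       ; clear the list from a previous run
KeepList := ""
Loop, Files, %Folder%\*.png
{
    If ( Mod( A_Index, N ) = 0 )
        KeepList .= A_LoopFileName . "`n"   ; this file would be kept
}
FileAppend, %KeepList%, %LogFile%           ; one write instead of thousands
MsgBox, Wrote the keep list to %LogFile%
```

Opening the list in a text editor and checking that the names step by N is a quick sanity check before adding the real delete line.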

Deozaan

Re: IDEA: Delete every N files, or keep only every N files
« Reply #5 on: April 22, 2015, 11:15 AM »
Sorry I wasn't clear. I took your advice about being paranoid about file deletions, made a backup of the 80GB directory of screenshots, then I used Skwire's script to delete things. I figured if I wasn't happy with the results after it ran overnight I could just restore the backup and try something else.

Starting at almost 80,000 files (~78k to be more precise) I let it run for about half an hour before I went to bed, at which point it had reduced the number of files to just below 60,000 files. So, about 20k files per half hour meant it should have finished in about 2 hours.

When I came back to my computer this morning just before 10AM, my PC was completely locked up and unresponsive. The clock still said 8:23AM, the mouse cursor wouldn't move, and ctrl-alt-delete or ctrl-shift-escape would not bring up the task manager.

After resetting my computer, I see that the AHK script didn't finish the job. There are still about 23,000 files left and there should be roughly only 13,000 when it finishes. So it still had about 10k to go before things locked up.

I'm guessing that the reason my PC locked up was due to the script, as I can't recall ever having such a total lockup on this system, and the script was putting my Core i7 2600K to near 100% usage on all cores/threads.
« Last Edit: April 22, 2015, 08:32 PM by Deozaan »

MilesAhead

Re: IDEA: Delete every N files, or keep only every N files
« Reply #6 on: April 22, 2015, 04:04 PM »
Quote from: Deozaan on April 22, 2015, 11:15 AM
I used Skwire's script to delete things.

When I last posted, Skwire's script wasn't visible to me.  Sorry for the confusion.  :)

Deozaan

Re: IDEA: Delete every N files, or keep only every N files
« Reply #7 on: April 22, 2015, 08:31 PM »
Yeah, that script definitely caused issues. I moved all the images that were already kept into another directory so it could just resume from where it left off and started the script up again. Here are some screenshots of behavior I saw.

As expected, it immediately maxed out my CPU usage to 100% and made my computer somewhat sluggish (occasional "hiccups" of unresponsiveness). I opened the task manager to take a look at the CPU usage and noticed the RAM usage climbing sky high:

[Attachment: Screenshot - 15-04-22, 10-19-37.png]

[Attachment: Screenshot - 15-04-22, 10-21-35.png]

RAM usage kept climbing, so I closed out Chrome, Firefox, and other misc. programs, anything using more than 100MB of RAM got shut down.

[Attachment: Screenshot - 15-04-22, 10-22-04.png]

The RAM kept climbing up, but soon peaked around 13-14GB of RAM(!), and then started alternating between going down and climbing back up, like so:

[Attachment: Screenshot - 15-04-22, 10-26-25.png]

And to get an idea of how fast/slow the files were being deleted, here's an animated gif:

[Attachment: AHK Scrip Speed.gif]

I lowered the process priority so it wouldn't slow down my PC and left it running for a few hours while I was away from my PC again. It eventually finished the job, though I'm not sure how long it took.

Final results of the cull: ~80,000 files totaling ~80GB reduced to ~13,000 files totaling ~13GB. I guess that means the average filesize is about 1MB per image.
« Last Edit: April 23, 2015, 09:57 AM by Deozaan »

Ath

  • Supporting Member
  • Joined in 2006
  • **
  • Posts: 3,612
Re: IDEA: Delete every N files, or keep only every N files
« Reply #8 on: April 23, 2015, 01:59 AM »
Be aware that having a (1) directory with ~80,000 files in it will bring Explorer, and other tools looking at that file-list, to a crawl :'(

MilesAhead

Re: IDEA: Delete every N files, or keep only every N files
« Reply #9 on: April 23, 2015, 05:29 AM »
Quote from: Ath on April 23, 2015, 01:59 AM
Be aware that having a (1) directory with ~80,000 files in it will bring Explorer, and other tools looking at that file-list, to a crawl :'(

It may have been easier to move every Nth file to an outside folder on the same drive, then just delete the original folder and recreate it.  Not sure if AHK would be faster that way (depends how FileRemoveDir is implemented), but it would be fewer file operations until right at the end.  A move on the same drive should be quite fast.
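
A rough sketch of that move-then-remove approach, assuming AutoHotkey v1.1.21 or later (for Loop, Files). Both folder names are placeholders, and the destination is assumed to be on the same drive so the move is a rename rather than a copy. The list of files is collected first so the folder is not being changed while it is still being enumerated, and the destructive FileRemoveDir line is left commented out on purpose.

```autohotkey
#NoEnv
Source := "C:\PngFolder"        ; placeholder: folder holding all screenshots
Dest   := "C:\PngFolder_keep"   ; hypothetical folder on the same drive
N := 6                          ; keep every Nth file

; Pass 1: collect the full paths of the files to keep.
KeepList := ""
Loop, Files, %Source%\*.png
{
    If ( Mod( A_Index, N ) = 0 )
        KeepList .= A_LoopFileFullPath . "`n"
}

; Pass 2: move the kept files out of the way.
FileCreateDir, %Dest%
Loop, Parse, KeepList, `n
{
    If ( A_LoopField = "" )
        Continue                 ; ignore the empty field after the last newline
    FileMove, %A_LoopField%, %Dest%
}

; Only after checking Dest: remove the original folder and everything left in it.
; FileRemoveDir, %Source%, 1
```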

Edit:  It is difficult to deal with a large number of objects using untested code.  For one thing, it's not trivial to set up a test case generating scads of fake data.  Nobody wants to do that unless they have a guinea pig PC lying around.  Plus it is very time consuming.  Perhaps a mechanism to lower the AHK task priority would help to make it more of a background task.  (I have never used SetBatchLines, and I'm not sure if it's designed for this situation.)
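
For the priority idea, AutoHotkey v1 has a Process command that can change the script's own priority. A minimal sketch, to be placed at the top of whichever file loop is being used:

```autohotkey
#NoEnv
; Ask Windows to schedule this script behind normal-priority programs.
; With the PID-or-name parameter left blank, Process acts on the script itself.
Process, Priority, , Low        ; "BelowNormal" is a milder alternative

; SetBatchLines is a different knob: it controls how often the script
; pauses between lines, not its priority in the Windows scheduler.
; The default is 10ms, which already makes the script yield regularly.
SetBatchLines, 10ms
```

This only makes the script itself more polite. It would not be expected to help if the heavy lifting happens somewhere else, for example in the shell or an antivirus scanner.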
« Last Edit: April 23, 2015, 05:37 AM by MilesAhead »

mouser

  • First Author
  • Administrator
  • Joined in 2005
  • *****
  • Posts: 40,896
Re: IDEA: Delete every N files, or keep only every N files
« Reply #10 on: April 23, 2015, 07:56 AM »
From the CPU usage patterns, and more significantly the memory usage patterns, you are describing, it sounds like the delete function (perhaps because it is sending files to the Recycle Bin) is being run asynchronously and creating a new thread for each iteration of the loop. Essentially the loop is creating thousands and thousands of threads trying to delete files at the SAME TIME, while the loop itself runs super fast spawning all of the threads/processes, rather than processing each file in turn.

Alternatively, it could be an antivirus program having the same effect.

It would be one thing for the cpu use on one of your cores going to 100%, but the memory use suggests that each iteration of the loop is causing something to be spawned.
« Last Edit: April 23, 2015, 09:26 AM by mouser »

skwire

Re: IDEA: Delete every N files, or keep only every N files
« Reply #11 on: April 23, 2015, 09:21 AM »
I set up a couple of test folders, each with 80,000 one meg files.  From my tests, using FileRecycle will cause the issues you saw.  When I used FileDelete, the script took about five seconds to run with little to no hit to CPU and RAM.
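
For anyone who wants to repeat the comparison on their own test folder, here is a sketch of the FileDelete variant with simple timing added. It assumes AutoHotkey v1.1.21 or later (for Loop, Files); the path is a placeholder, the Failed and Elapsed names are made up for illustration, and the deletions are permanent, so only point it at a backed-up or throwaway folder.

```autohotkey
#NoEnv
Path := "C:\tmp\04"      ; placeholder test folder
N := 5                   ; keep every Nth file
Failed := 0              ; hypothetical counter for failed deletions
Start := A_TickCount     ; milliseconds since the system started

Loop, Files, %Path%\*.png
{
    If ( Mod( A_Index, N ) != 0 )
    {
        FileDelete, %A_LoopFileFullPath%   ; permanent: bypasses the Recycle Bin
        Failed += ErrorLevel               ; ErrorLevel = number of files that failed
    }
}
Elapsed := ( A_TickCount - Start ) / 1000
MsgBox, Finished in %Elapsed% seconds with %Failed% failed deletions.
```

Swapping FileDelete for FileRecycle in the same loop gives the other half of the comparison.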

Deozaan

Re: IDEA: Delete every N files, or keep only every N files
« Reply #12 on: April 23, 2015, 10:00 AM »
Quote from: skwire on April 23, 2015, 09:21 AM
I set up a couple of test folders, each with 80,000 one meg files.  From my tests, using FileRecycle will cause the issues you saw.  When I used FileDelete, the script took about five seconds to run with little to no hit to CPU and RAM.

Well shoot! Looks like I picked the wrong one. Thanks for getting to the bottom of this. And thanks for the script in the first place. :Thmbsup:

skwire

Re: IDEA: Delete every N files, or keep only every N files
« Reply #13 on: April 23, 2015, 10:32 AM »
You're welcome.  Thanks for the DC credits, too.   :D

Stoic Joker

  • Honorary Member
  • Joined in 2008
  • **
  • Posts: 6,646
Re: IDEA: Delete every N files, or keep only every N files
« Reply #14 on: April 23, 2015, 11:19 AM »
Quote from: Ath on April 23, 2015, 01:59 AM
Be aware that having a (1) directory with ~80,000 files in it will bring Explorer, and other tools looking at that file-list, to a crawl :'(

Damn it! ...So what is the best way to store porn?

kyrathaba

  • N.A.N.Y. Organizer
  • Honorary Member
  • Joined in 2006
  • **
  • Posts: 3,200
Re: IDEA: Delete every N files, or keep only every N files
« Reply #15 on: July 08, 2015, 08:33 AM »
Consider this one DONE? Using the script myself. Good work  :Thmbsup:

Deozaan

Re: DONE: Delete every N files, or keep only every N files
« Reply #16 on: July 08, 2015, 02:58 PM »
While I think it would be nice to have a simple front-end GUI to make changing the settings a little easier, it works well enough to consider it done.

Thanks again, Skwire! :Thmbsup:
« Last Edit: September 15, 2015, 11:54 AM by Deozaan »