Topic: The Best Way to handle finding and removing Duplicate Files

questorfla

In trying to create an all-encompassing archive of all the company documents, I have ended up with an extremely bloated folder containing more redundant info than anything else.  There are probably 10 copies of the exact same file, all in different places.  I can't risk running a mass "delete all but one copy" (though I would like to) and am looking for the best way to scan this huge repository, maybe in small chunks the first time, to remove as many as I can.
Some of these files are backup copies of PST files over 5GB in size, and I may have 3 or 4 copies of each.

This has been a work in progress for some time, and as it all starts getting into a single location I am finding this more and more often.  The ability to scan for duplicates exceeding 500MB might be a good start.

Years ago I had a program I used a lot on MP3s and photos, but I don't remember the name and it may not even be Windows 8 compatible.  If you can point me to anything at all right now, just cleaning out the duplicated PST files would be a huge help.

The more control the program offers for filters, the better.  I doubt I would even throw away the extras, but getting them out of the main reference area would be a big help.

4wd

« Reply #1 on: October 26, 2014, 12:14 AM »
Well, up to 4GB in size you could use VBinDiff but HxD will handle files up to 8EB ... if you feel the need  :D

Possibly a good start would be to generate hashes on each file, (SHA-1, etc), then you need only compare those with matching hashes - you could probably whip up a command file to do it  ;)

EDIT: The inbuilt Windows fc command can apparently also handle files >4GB.

Do all these files reside in the same directory?

Maybe I'll look at a simple, (or not), command file.
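
For anyone who would rather script the hash idea than use a command file, here is a minimal Python sketch. It assumes everything sits under one root folder. The names `sha1_of`, `find_duplicates` and `min_size` are made up for illustration. It groups files by size first, so only files that share a size get hashed, and `min_size` could be used to look at the big files first.

```python
import hashlib
import os
from collections import defaultdict


def sha1_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so multi-GB files never sit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, min_size=0):
    """Return lists of paths under root whose size and SHA-1 both match."""
    by_size = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            size = os.path.getsize(path)
            if size >= min_size:
                by_size[size].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[sha1_of(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Something like `find_duplicates(r"D:\Archive", min_size=500 * 1024 * 1024)` would only report groups of files of 500MB or more (the path is a placeholder). Matching hashes are a very strong hint, but a byte-for-byte compare of each group is still the safe final check before deleting anything.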
« Last Edit: October 26, 2014, 01:50 AM by 4wd »

IainB

« Reply #2 on: October 26, 2014, 03:28 AM »
My exercises in comparing/selecting duplicates and removing/deleting them have usually tended to be of sub-folders nested within a folder/directory, whose contents have been treated as a "flat file" (i.e., the process has not necessitated documents being moved from their original holding folders first). I use xplorer² to do this, and it has an exceptionally powerful and handy duplicate checker.
Example:

Here is a partial screen clip of the duplicate checker being selected for the nested folders shown:
xplorer² duplicate checking 01.png


The folders' contents are first treated as a flat file.
Here is a partial screen clip of the duplicate checker selection box floating above the flat file display (which is in a "scrap" pane that can be operated on in various ways as a logical object). Note the checks for "size" and "content" (checksum), and "select all duplicates":
xplorer² duplicate checking 02.png


Here is a partial screen clip of the list of all the duplicates, with all but one duplicate (I think it might be the earliest-dated or something) being "auto-selected" in each case (this is also a scrap pane):
xplorer² duplicate checking 03.png


You can then select or unselect files on the list, at your whim, and operate on them as a set - e.g., copy them into a .ZIP file for archiving before deleting them en bloc.

I don't know the constraints (if any) on max file sizes. The user guide should describe such, and is a PDF file that can be downloaded from zabkat.com (the xplorer² website) with a trial version.
The support site is http://zabkat.com/x2support.htm , which also has an online manual.

I use the xplorer² PRO version, but the Ultimate may suit your needs better.
For a comparison of the versions, refer to http://zabkat.com/x2down.htm
Hope this helps or is of use.
« Last Edit: October 26, 2014, 06:21 PM by IainB, Reason: Minor corrections for clarity. »

4wd

« Reply #3 on: October 26, 2014, 06:18 AM »
A modification of one of my other command files that's around here somewhere.

FCompare-GNU.cmd

REQUIREMENTS:
It requires the following three (3) files from the GNU DiffUtils for Windows packages, (because they're faster than the native Windows commands).  Put them in the same directory as the command file.

  • cmp.exe
  • libiconv2.dll
  • libintl3.dll

These are available from the following two archives, (they're both on SourceForge, both less than 1MB in size):
Binaries
Dependencies


RUNNING:
Just double-click on it, it'll prompt you for two directories and the extension type of the files to compare.

It'll output to both the console it opens and a log file, which you get the option to open at the end.

As a matter of interest, it did a compare of two 8.17GB ISOs in ~3:20 - the files were on separate HDDs for speed, otherwise you're going to get disk thrashing.

NOTES:
  • It recurses through both directories chosen looking for matching file extensions.
  • It performs a size comparison before it resorts to binary comparison, (seems logical).
  • Because it uses DisableDelayedExpansion it's not going to work properly if your filenames have ! in them - rename them.
  • Recommend you choose directories on different HDDs to avoid disk thrashing.
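
The size-then-binary order in the notes above can be sketched in Python for anyone without the GNU tools. This is only an illustration of the idea, not what cmp.exe does internally, and `files_identical` is a made-up name.

```python
import os


def files_identical(path_a, path_b, chunk_size=1 << 20):
    """Return True only if both files have the same size and the same bytes."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # cheap check first: different sizes can never match
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:
                return True  # both files ended without a difference
```

The early return on size is what keeps this cheap: the expensive read only happens for pairs that could actually be duplicates.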

If you want, I can put up the native Windows command version but it is woefully slow by comparison.

Code:
@if (@CodeSection == @Batch) @then

    @echo off
    color 1a
    setlocal EnableExtensions
    setlocal DisableDelayedExpansion

    rem Each folder prompt re-runs this file as JScript to show a folder picker.
    echo Select SOURCE folder
    for /F "delims=" %%S in ('CScript //nologo //E:JScript "%~F0"') do (
        set "srce=%%S"
    )
    if "%srce%"=="" goto :End2
    echo Selected folder: "%srce%"

    echo Select DESTINATION folder
    for /F "delims=" %%D in ('CScript //nologo //E:JScript "%~F0"') do (
        set "dest=%%D"
    )
    if "%dest%"=="" goto :End2

    echo Selected folder: "%dest%"

    if "%srce%"=="%dest%" goto :End1

    :GetExt
    echo.
    set /P ext=Please enter extension [eg. jpg] ENTER to exit:
    if not defined ext goto :End2
    set /a totfiles=0
    set /a matchfiles=0
    rem Keep the quotes out of the value; they are added where the variable is used.
    set "logfile=%~dp0FCompare.log"

    echo --------------------------- >>"%logfile%"
    echo %date%  %time% >>"%logfile%"
    echo --------------------------- >>"%logfile%"

    for /r "%srce%" %%a in (*.%ext%) do (call :CheckSize "%dest%" %%~za "%%~fa")

    :End
    echo Total files:    %totfiles%
    echo Matching files: %matchfiles%
    if %matchfiles% equ 0 (echo Matching files: 0 >>"%logfile%")
    set totfiles=
    set matchfiles=
    set ext=
    if exist "%logfile%" call :ViewLog
    goto :GetExt
    :End1
    color 1c
    echo **** SOURCE and DESTINATION are the same! ****
    :End2
    set srce=
    set dest=
    pause
    exit

    :ViewLog
    set /p view=View logfile [y,n]
    if "%view%"=="y" (start notepad.exe "%logfile%")
    set view=
    goto :EOF

    rem %1 = folder to search, %2 = size of the source file, %3 = source file
    :CheckSize
    set /a totfiles+=1
    for /r %1 %%b in (*.%ext%) do (
        if %2==%%~zb (
            echo.
            echo Comparing: "%~3" "%%~b"
            echo Sizes: %2   %%~zb
            "%~dp0cmp.exe" -s "%~3" "%%~b"
            if not errorlevel 1 (call :Matching "%~3" "%%~b")
        )
    )
    goto :EOF

    rem cmp exits with 0 only when the files are identical. "if errorlevel 0"
    rem is true for every exit code, so the test above uses "if not errorlevel 1".

    :Matching
    set /a matchfiles+=1
    echo - Match -
    echo Match: "%~1" --- "%~2" >>"%logfile%"
    goto :EOF

    endlocal

    End of Batch section
@end


// JScript section

// Creates a dialog box that enables the user to select a folder and display it.
var title = "Select a folder", rootFolder = 0x11;
var shl = new ActiveXObject("Shell.Application");
var folder = shl.BrowseForFolder(0, title, 0, rootFolder);
WScript.Stdout.WriteLine(folder ? folder.self.path : "");

Log file example:
Code:
---------------------------
Sun 26/10/2014  22:19:46.65
---------------------------
Match: "R:\test\dir1\FC3 - Blood Dragon.iso" --- "D:\dir2\why.iso"
« Last Edit: October 26, 2014, 06:41 AM by 4wd »

x16wda

« Reply #4 on: October 26, 2014, 10:09 AM »
If you have everything in one location, it might be easier to use fsum to build a list of the MD5s of the files, then you can sort the result and see what is duplicated. Names and file extensions can be misleading.  ;)

Navigate to the top level and use "fsum -r *.* > checksums.txt"  and you'll get a nice report. Then you could "cat checksums.txt | sort | uniq -D -w 33 > dups.txt" to get a good list of just the duplicates. (You'd need unxutils for cat and uniq - and other useful stuff too.)

Both reports are attached here for a sample folder.
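
If unxutils isn't handy, the sort/uniq step can be approximated in Python. This is a rough sketch that assumes each line of checksums.txt starts with a 32-character MD5 digest followed by a separator and the file path, and that comment lines start with ";". Check a real report before relying on that. `duplicate_lines` is a made-up name.

```python
from collections import defaultdict


def duplicate_lines(report_lines, digest_width=32):
    """Return report lines whose leading digest appears more than once."""
    by_digest = defaultdict(list)
    for line in report_lines:
        line = line.rstrip("\n")
        if not line or line.startswith(";"):
            continue  # assumed: blank lines and ';' comment lines carry no checksum
        by_digest[line[:digest_width].lower()].append(line)

    result = []
    for digest in sorted(by_digest):
        if len(by_digest[digest]) > 1:
            result.extend(by_digest[digest])
    return result
```

Reading checksums.txt with `open(...)`, passing the file object to `duplicate_lines`, and writing the result to dups.txt gives roughly the same list as the `uniq -D -w 33` pipeline, with lines for the same digest kept together.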
vi vi vi - editor of the beast

IainB

« Reply #5 on: October 26, 2014, 06:15 PM »
NB: I have just modified the images in my last comment above, to make it clear that file size and contents (checksum) are being used for duplicate checking in the example given. I made the images a bit smaller (so no scrolling needed) and added some comments/arrows to them.

questorfla

« Reply #6 on: October 27, 2014, 02:44 PM »
I used to have a duplicate remover for photos that would have worked.  It shows you all the various files and where they are, along with dates and sizes etc.  Then you go to each group of duplicates and pick which one(s) to keep.  I used this back in the Windows XP days, so I do not remember the name.  But I am sure there are plenty out there like it.  The preference would be to move all but the ones I pick to another folder, maintaining the directory structure of where they were removed from.
Seems like it was a pretty fast program (for that point in time).

I can see I need to review your proffered code as well, 4wd  :)  It might be all I need.
« Last Edit: October 27, 2014, 02:55 PM by questorfla »

questorfla

« Reply #7 on: October 27, 2014, 02:54 PM »
Thanks IainB.     ;) I am looking at your choice as well.
This is a HUGE folder with multiple subdirectories and levels, so it isn't the size of the files that matters so much during the scanning phase.
And to be honest, if it had the ability to be selective about the compare fields, I would first want to look for files of the same name, date, and size.
This would at least allow me to clean out the "dead wood".  After that, I can get more selective on subsequent scans.
I need to get it down to a reasonable size to be used with File Locator Pro.  Excellent program, but it does not build any kind of index, so every run takes a long time.  Indexing is a feature that they will be adding in the next version.
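
For reference, the "same name, date, and size" first pass can be sketched in a few lines of Python. It never opens a file, so it should be quick even on a huge tree. The function name is made up, and rounding the modified time to whole seconds is my assumption about what counts as "same date".

```python
import os
from collections import defaultdict


def group_by_name_size_date(root):
    """Group files under root that share name, size, and modification time."""
    groups = defaultdict(list)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            # Assumption: whole-second precision is enough to call dates equal.
            key = (name.lower(), info.st_size, int(info.st_mtime))
            groups[key].append(path)
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

Each returned key is a (name, size, modified time) triple with the list of paths that share it. These are only candidates: nothing here proves the contents match, so a binary or hash check is still worth running before anything gets moved.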

Stoic Joker

« Reply #8 on: October 27, 2014, 03:35 PM »
If they ever do let you upgrade that server, Server 2012 has a built-in deduplication feature that produced very promising results (like a 60% size reduction) in the tests I ran.

This approach can be helpful when dealing with users that chronically like to squirrel away common files to their own little stash ... and then freak out when it can't be found. Otherwise, 6 months after you spend all that time cleaning up their mess, they'll just recreate it and run you out of space again.

4wd

« Reply #9 on: October 27, 2014, 03:38 PM »
If you want to just match on name, date (which one?), and size, I can modify the command file easily enough.  I'll look at it in a few hours.

Probably also get it to spit out a command file to do the moving of matching files.

Not far different from the original.

EDIT: New version, old one is still above.

Compares Size, Date, Name, and optionally compares binaries.
Generates a logfile that includes the commands to copy or move the duplicates to another directory using Robocopy, (duplicates folder tree).  Files from the 2nd chosen directory are the sacrificial victims.

Code:
@if (@CodeSection == @Batch) @then

    @echo off
    color 1a
    setlocal EnableExtensions
    setlocal DisableDelayedExpansion

    echo Select PRIMARY folder
    for /F "delims=" %%S in ('CScript //nologo //E:JScript "%~F0"') do (
        set "srce=%%S"
    )
    if "%srce%"=="" goto :End2
    echo Selected folder: "%srce%"

    echo Select folder to COMPARE
    for /F "delims=" %%D in ('CScript //nologo //E:JScript "%~F0"') do (
        set "dest=%%D"
    )
    if "%dest%"=="" goto :End2

    echo Selected folder: "%dest%"

    if "%srce%"=="%dest%" goto :End1

    echo.
    echo The following prompt asks you for a folder to move duplicates to.
    echo.
    echo ---- HOWEVER, THIS COMMAND FILE DOES NOT MOVE THEM ----
    echo.
    echo The path is written to the logfile so it can be run separately if
    echo required.
    echo.
    echo Select folder to MOVE duplicates to
    for /F "delims=" %%S in ('CScript //nologo //E:JScript "%~F0"') do (
        set "mdest=%%S"
    )
    if "%mdest%"=="" goto :End2
    echo Selected folder: "%mdest%"

    set /P como=Copy or Move Duplicates (Set in logfile) [ENTER = Copy]:

    echo.
    echo If you just want to match on Size, Date, and Name just hit
    echo ENTER at the next prompt.
    echo If you want to do a binary compare also, enter any character.
    echo.
    set /P cbin=Do binary compare [ENTER = No]:

    :GetExt
    echo.
    set /P ext=Please enter extension [eg. jpg, * = ALL] ENTER to exit:
    if not defined ext goto :End2
    set /a totfiles=0
    set /a matchfiles=0
    rem Quote the script folder so a path containing spaces stays one argument.
    call :SetLogfile "%~dp0"

    rem The logfile is itself a command file, so every line written must be a command.
    echo @echo off >"%logfile%"
    echo echo %date%  %time% >>"%logfile%"

    for /r "%srce%" %%a in (*.%ext%) do (call :CheckSize "%dest%" %%~za "%%~fa")

    :End
    echo Total files:    %totfiles%
    echo Matching files: %matchfiles%
    if %matchfiles% equ 0 (echo echo Matching files: 0 >>"%logfile%")
    set totfiles=
    set matchfiles=
    set ext=
    if exist "%logfile%" call :ViewLog
    goto :GetExt
    :End1
    color 1c
    echo **** SOURCE and DESTINATION are the same! ****
    :End2
    set srce=
    set dest=
    pause
    rem Leave here instead of falling through into :SetLogfile.
    exit /b

    :SetLogfile
    set "logfile=%~1FCompare-%time:~0,2%%time:~3,2%%time:~6,2%.cmd"
    goto :EOF

    :ViewLog
    set /p view=View logfile [y,n]
    if "%view%"=="y" (start notepad.exe "%logfile%")
    set view=
    goto :EOF

    rem %1 = folder to search, %2 = size of the primary file, %3 = primary file
    :CheckSize
    set /a totfiles+=1
    for /r %1 %%b in (*.%ext%) do (
        if %2==%%~zb (
            echo.
            echo Comparing: "%~3" "%%~b"
            echo Sizes: Match
            call :CheckDate "%~3" "%%~b"
        )
    )
    goto :EOF

    :CheckDate
    if "%~t1" equ "%~t2" (
        echo Dates: Match
        call :CheckName "%~1" "%~2"
    )
    goto :EOF

    :CheckName
    if "%~nx1" equ "%~nx2" (
        echo Names: Match
        if not defined cbin (
            call :Matching "%~1" "%~2"
        ) else (
            call :CheckBin "%~1" "%~2"
        )
    )
    goto :EOF

    rem cmp exits with 0 only when the files are identical. "if errorlevel 0"
    rem is true for every exit code, so test "if not errorlevel 1" instead.
    :CheckBin
    "%~dp0cmp.exe" -s "%~1" "%~2"
    if not errorlevel 1 (
        echo Binaries: Match
        call :Matching "%~1" "%~2"
    )
    goto :EOF

    rem Strip the trailing backslash, then the drive letter, so the duplicate's
    rem folder tree can be rebuilt underneath the chosen MOVE folder.
    :Matching
    set /a matchfiles+=1
    echo echo "%~1" matches "%~2" >>"%logfile%"
    set "tdir=%~dp2"
    set "tdir=%tdir:~0,-1%"
    set "tdir2=%tdir:~2%"
    if not defined como (
        echo robocopy "%tdir%" "%mdest%%tdir2%" "%~nx2" >>"%logfile%"
    ) else (
        echo robocopy "%tdir%" "%mdest%%tdir2%" "%~nx2" /MOV >>"%logfile%"
    )
    echo.
    goto :EOF

    endlocal

    End of Batch section
@end


// JScript section

// Creates a dialog box that enables the user to select a folder and display it.
var title = "Select a folder", rootFolder = 0x11;
var shl = new ActiveXObject("Shell.Application");
var folder = shl.BrowseForFolder(0, title, 0, rootFolder);
WScript.Stdout.WriteLine(folder ? folder.self.path : "");

Sample output with Move option.
Code:
@echo off
echo Wed 29/10/2014  12:53:49.64
echo "R:\test\Root\SimpleBackup.cmd" matches "D:\Root\SimpleBackup.cmd"
robocopy "D:\Root" "D:\test\Root" "SimpleBackup.cmd" /MOV
echo "R:\test\Root\fred h\1234\Hunters & Collecters - Holy Grail.mp3" matches "D:\Root\fred h\1234\Hunters & Collecters - Holy Grail.mp3"
robocopy "D:\Root\fred h\1234" "D:\test\Root\fred h\1234" "Hunters & Collecters - Holy Grail.mp3" /MOV
echo "R:\test\Root\fred h\34\cache\Don McLean - American Pie.mp3" matches "D:\Root\fred h\34\cache\Don McLean - American Pie.mp3"
robocopy "D:\Root\fred h\34\cache" "D:\test\Root\fred h\34\cache" "Don McLean - American Pie.mp3" /MOV
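
The path mapping in those Robocopy lines (drop the drive, then rebuild the same folder tree under the chosen folder) can also be sketched in Python, for anyone who wants to do the move from a script instead. `mirrored_target` and `move_duplicate` are made-up names, and `dry_run` defaults to True so nothing moves until you say so.

```python
import shutil
from pathlib import Path, PurePath, PureWindowsPath


def mirrored_target(duplicate: PurePath, dest_root: PurePath) -> PurePath:
    """Re-root a file under dest_root, keeping its folder tree minus the drive."""
    parts = duplicate.parts[1:] if duplicate.anchor else duplicate.parts
    return dest_root.joinpath(*parts)


def move_duplicate(duplicate: Path, dest_root: Path, dry_run: bool = True) -> Path:
    """Move one duplicate into the mirrored tree and return the target path."""
    target = Path(mirrored_target(duplicate, dest_root))
    if not dry_run:
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(duplicate), str(target))
    return target


example = mirrored_target(
    PureWindowsPath(r"D:\Root\fred h\1234\song.mp3"),
    PureWindowsPath(r"D:\test"),
)
print(example)  # D:\test\Root\fred h\1234\song.mp3
```

Because the original tree is preserved under the target folder, any file can be put back by reversing the mapping.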


NOTE:
If you want another free program, I heartily recommend Duplicate File Finder by Rashid Hoda - it has the best layout for a dupe checker I have ever seen; all the options are right in front of you without having to screw around in menus.

Hasn't been updated in years but the only problem I've found with it is if you have a rather large number of files to check, eg. 40k+

2014-10-29 13_34_21.png
« Last Edit: October 28, 2014, 09:38 PM by 4wd »

questorfla

« Reply #10 on: November 15, 2014, 08:35 PM »
OK.  Another pile of treasure!    My lucky night.
My own contribution to this would be "Yet Another Duplicate File Remover" <YADFR> on SourceForge.  (I hope no one chimes in now that they got that and it messed up their system :(.  )
While it seems slower than some, it was also very thorough and extremely simple to use, wizard and all.  I have not yet tested 4wd's scripts, but from past experience I trust them implicitly.
Today has been one surprise after the next (all good so far) so I am hoping for some home runs tonight on something.


tomos

« Reply #11 on: November 16, 2014, 08:47 AM »
Good to get a couple of recommendations for duplicate file finders :up:

@questorfla, any chance you could change the title of the thread to something more topic-related?
That would be helpful :) (I think it needs to be changed in first post to stick)
Tom

MilesAhead

« Reply #12 on: November 16, 2014, 09:59 AM »
@questorfla could you check the total number of files in the tree?

One way is just to right click the root folder and click Properties.  If it is 4096 or less, MD5Hash can do it.  Just drag and drop the folder on a running MD5Hash or use the ancillary program to install the shell extensions. Once that is done, right click the root folder and click MD5Hash.


When done you can cut the results to the clipboard and paste them into any editor that does line sorting, such as EditPad Lite 7.  The results are shown as the MD5 sum string followed by white space, then the full file path.  One file per line.

Edit:  This assumes all the files are in a tree with a single root folder.  Hidden and system files are ignored.

Edit2:  During testing I calculated MD5 for folders with some video files over 6 GB thrown in.  It completed with correct results.





« Last Edit: November 16, 2014, 10:10 AM by MilesAhead »

questorfla

« Reply #13 on: January 01, 2015, 04:01 PM »
First:  Happy New Year to all.
2nd.  Tomos:    I can't see a way to change the title of an existing post?  Maybe there is one, but nothing I clicked on would get me there.
I appreciate everyone's good advice and even if it doesn't help ME, I am sure someone who reads these posts can use something of it.  Though I agree I should have been more on topic when I named it :(

3rd:  Stoic.  This de-duplicator in Server 2012 sounds great and I can probably manage that.  I had looked at 2012 a while back but am not sure what version.  Whatever it was, it seemed to be a command-line-only version of some sort. I am not sure I ever looked at a full GUI setup and probably should.
Thanks again to all.

tomos

« Reply #14 on: January 01, 2015, 05:32 PM »
Guess I'm being a fussy beggar, but when you click on 'unread posts', each time you can't remember what a particular thread is about :-[ (and you are not alone in using vague titles...)

Click 'modify' on the first post in the thread, and edit title thus:

Screenshot - 2015-01-02 , 00_25_30.png
Tom

questorfla

« Reply #15 on: January 03, 2015, 06:17 PM »
THANK YOU TOMOS!
I fixed it I hope.  In the future I will try to be more on target as well.

arvin23

« Reply #16 on: October 13, 2015, 04:55 AM »
I use WinMerge, which is a free download. It allows you to look at two folders, side by side, and see if files are indeed identical, different, newer, older, etc... You can then select one file, either left or right side, and delete it. You can also copy a file from left to right or right to left, then delete the one you don't want.
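
For a scripted version of the same side-by-side idea, Python's standard `filecmp` module can classify the files in two folders. This is only a rough sketch of the concept, not how WinMerge works. It looks at the top level of each folder only, and `compare_folders` is a made-up name.

```python
import filecmp
import os


def compare_folders(left, right):
    """Classify the files directly inside two folders, side by side."""
    left_names = {n for n in os.listdir(left) if os.path.isfile(os.path.join(left, n))}
    right_names = {n for n in os.listdir(right) if os.path.isfile(os.path.join(right, n))}
    common = sorted(left_names & right_names)
    # shallow=False forces a content comparison instead of trusting size and mtime.
    identical, different, errors = filecmp.cmpfiles(left, right, common, shallow=False)
    return {
        "identical": sorted(identical),
        "different": sorted(different),
        "left_only": sorted(left_names - right_names),
        "right_only": sorted(right_names - left_names),
        "errors": sorted(errors),
    }
```

Like WinMerge, this only pairs files that have the same name on both sides, so it won't spot a duplicate that has been renamed.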

mohanarun

« Reply #17 on: October 20, 2015, 10:19 PM »
I am afraid I don't have a solution for company-wide "possibly" duplicate files in the same folder, except for a semi-automated process in which you do the diff manually. Scanning and directory listing can be automated, but you will need to be careful before choosing to delete something.

Manual:
Choosing carefully a selection of files that look like possible duplicates.
Diff them manually.
Delete the duplicates.

Automated:
Scan the directory fast.
Produce a large list of files that look like possible duplicates.

Here is how I do duplicate file detection for personal use. I use CCleaner.
Click on Tools> Duplicate file finder...

Screenshot attached.

David.P

« Reply #18 on: March 16, 2019, 04:46 AM »
Hi forum,

is there any news on a possibly best duplicate finder?

I'm looking for a program where I can check a specific folder (typically like the "Downloads" folder...) to see if the files it contains already exist somewhere else on the computer -- possibly also under a different name.

Thanks for tips or experience!
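
To make the requirement concrete, here is a rough Python sketch of that kind of check. It is not a replacement for a proper tool, and the function names are made up. It compares by content (size, then SHA-256), so a copy stored under a different name is still found.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files are fine."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def walk_files(root):
    for folder, _dirs, files in os.walk(root):
        for name in files:
            yield os.path.join(folder, name)


def already_present(check_dir, search_root):
    """Map each file in check_dir to identical files elsewhere under search_root."""
    check_dir = os.path.abspath(check_dir)
    prefix = os.path.normcase(check_dir) + os.sep
    candidates = list(walk_files(check_dir))
    wanted_sizes = {os.path.getsize(p) for p in candidates}

    # Index only files outside check_dir whose size matches some candidate.
    index = defaultdict(list)
    for path in walk_files(os.path.abspath(search_root)):
        if os.path.normcase(path).startswith(prefix):
            continue
        if os.path.getsize(path) in wanted_sizes:
            index[file_digest(path)].append(path)

    found = {}
    for path in candidates:
        matches = index.get(file_digest(path))
        if matches:
            found[path] = sorted(matches)
    return found
```

Called as `already_present(downloads_folder, drive_root)`, it returns only the files in the checked folder that have a twin somewhere else. Files with no match are left out of the result.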

David.P

« Reply #19 on: March 16, 2019, 03:37 PM »
Done. I am very confident that one does not have to look any further than to AllDup.

Akertyna

« Reply #20 on: June 26, 2019, 02:33 AM »
I tried searching on laptopmag and techgara for how to use CCleaner to delete duplicate files, but that method can only delete files with the same name.
Is there a way to delete duplicate files that have different names?

Curt

« Reply #21 on: June 26, 2019, 04:09 AM »
AllDup (https://download.cne...20_4-10029585-1.html) and Duplicate File Finder (https://www.softpedi...te-File-Finder.shtml) were both mentioned.


Akertyna

« Reply #22 on: July 03, 2019, 03:44 AM »
Thanks for your response, sir!

ewemoa

« Reply #23 on: December 21, 2019, 01:54 PM »
I'm looking for something with source code because I'm a bit wary of using a program that's going to be scanning a lot of files -- though in this case it's for someone who is unlikely to be using a non-GUI solution.

My preliminary searches turned up a couple of candidates.

Are there any reasons to not consider the current incarnation of dupeguru or Duplicate Files Finder?

https://github.com/arsenetar/dupeguru
http://doubles.sourceforge.net/

Appreciate some opinions (informed or otherwise) :)

4wd

« Reply #24 on: December 22, 2019, 01:27 AM »
Used dupeGuru reasonably recently to remove something like 400+GB of duplicate backup files for a friend, (a laborious undertaking - backups of backups of backups of .... ), worked well.

There's also AntiDupl if you want to find primarily image duplicates, (or close duplicates), including across different formats - my go to software for that instance.