Author Topic: detect duplicates for 100% sure  (Read 4974 times)

kalos

  • Member
  • Joined in 2006
  • Posts: 1,823
detect duplicates for 100% sure
« on: March 12, 2012, 09:50 AM »
hello!

I want to detect and delete duplicate files, but is there a way to be 100% sure that they are identical?

thanks!

Renegade

  • Charter Member
  • Joined in 2005
  • Posts: 13,288
  • Tell me something you don't know...
Re: detect duplicates for 100% sure
« Reply #1 on: March 12, 2012, 09:56 AM »
The ONLY way is to do a byte-for-byte check, and to also check for streams.

This can be done by first checking file sizes, then comparing identically sized files byte for byte, then checking for streams.

You CANNOT rely on hashes alone, as two different files can produce the same hash (a collision).

Streams MUST be checked, because even if the main data of two files is identical, either file could have alternate data streams attached (on NTFS), which are invisible to most software.

I do not know of any software that does this, but I'm sure there is some. Perhaps someone else can chime in there.
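For anyone who would rather script it, the size-then-bytes approach described above could look roughly like this in Python. This is only a minimal sketch, not a tested tool: the function name `find_duplicates` is made up for illustration, it relies on the standard library's `filecmp.cmp` with `shallow=False` for the content comparison, and it does NOT look at alternate data streams at all, so that check would still have to be done separately.

```python
import filecmp
import os
from collections import defaultdict


def find_duplicates(root):
    """Return sorted groups of paths under root with byte-for-byte identical contents.

    Hypothetical helper for illustration; alternate data streams are not examined.
    """
    # Step 1: bucket files by size; files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Step 2: within each size bucket, compare contents byte for byte.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        clusters = []  # each cluster holds files already found to be identical
        for path in paths:
            for cluster in clusters:
                # shallow=False makes filecmp read and compare the file contents
                if filecmp.cmp(cluster[0], path, shallow=False):
                    cluster.append(path)
                    break
            else:
                clusters.append([path])
        groups.extend(sorted(c) for c in clusters if len(c) > 1)
    return sorted(groups)
```

Calling `find_duplicates` on a folder returns a list of groups, each group being a list of paths whose contents matched. Nothing is deleted; deciding which copy to keep is left to the user.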

Slow Down Music - Where I commit thought crimes...

Freedom is the right to be wrong, not the right to do wrong. - John Diefenbaker

flamerz

  • Supporting Member
  • Joined in 2011
  • **
  • Posts: 157
    • View Profile
    • Donate to Member
Re: detect duplicates for 100% sure
« Reply #2 on: March 12, 2012, 09:56 AM »
I use Duplicate File Detective.

do you want to check it manually?

kalos

Re: detect duplicates for 100% sure
« Reply #3 on: March 12, 2012, 06:02 PM »
congrats on the depth of knowledge, where did you learn all this?

indeed, it seems I need software like this

Renegade

Re: detect duplicates for 100% sure
« Reply #4 on: March 12, 2012, 07:14 PM »
That's just some basic knowledge about programming and file systems. Nothing special. There are tonnes more people here at DC who can tell you a lot more than I can. :)

Now, hopefully someone will chime in about this with a software recommendation...


rjbull

  • Charter Member
  • Joined in 2005
  • Posts: 3,199
Re: detect duplicates for 100% sure
« Reply #5 on: March 13, 2012, 04:01 PM »
DoubleKillerPro has plenty of other options, though it doesn't mention streams.  There's a lighter free-for-personal-use version as well.

xtabber

  • Supporting Member
  • Joined in 2007
  • Posts: 618
Re: detect duplicates for 100% sure
« Reply #6 on: March 16, 2012, 05:07 PM »
Unless a file is open, I don't believe streams have any bearing on its content, so they probably can be safely ignored unless you are operating in an environment where files to be compared are constantly being accessed.

I use Beyond Compare to do byte-for-byte comparisons of any two files, or all the files in two different directory trees.  However, Beyond Compare does not search for duplicates.

I use TreeSize Pro to find duplicates and optionally delete them. TreeSize Pro has a number of options for comparing files, including MD5 and SHA256 checksums, which I feel is more than adequate. If you don't feel that's enough, you can always check based only on size and use Beyond Compare to do a byte-for-byte comparison of all matches found.
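To illustrate the checksum-first idea, here is a rough Python sketch that groups files by SHA-256 and then confirms each candidate with a byte-for-byte comparison. It has nothing to do with how TreeSize Pro or Beyond Compare work internally; the names `sha256_of` and `confirmed_duplicates` are invented for this example, and it assumes the files fit the simple case where each candidate is compared with the first file in its hash group.

```python
import filecmp
import hashlib
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def confirmed_duplicates(paths):
    """Return (first, other) pairs that share a SHA-256 and also match byte for byte.

    Hypothetical helper: each later file is compared only with the first
    file in its hash group.
    """
    by_hash = defaultdict(list)
    for path in paths:
        by_hash[sha256_of(path)].append(path)

    confirmed = []
    for candidates in by_hash.values():
        first = candidates[0]
        for other in candidates[1:]:
            # The hash match is only a hint; the content comparison is the proof.
            if filecmp.cmp(first, other, shallow=False):
                confirmed.append((first, other))
    return confirmed
```

The hash pass keeps the number of full comparisons small, and the final `filecmp.cmp` call means a hash collision alone can never cause two different files to be reported as duplicates.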


SKA

  • Charter Member
  • Joined in 2006
  • Posts: 229
Re: detect duplicates for 100% sure
« Reply #7 on: March 18, 2012, 12:11 AM »
Check out Delete Duplicate Files (USD 19/-):
http://www.drivehq.com/web/brana/ddf.htm

Ska