
DonationCoder.com Software > N.A.N.Y. 2011

NANY 2011 Release: Duplicate Photo Finder


Renegade:
Well, I'm on to doing some threading stuff now to speed things up. (Got a few hours to burn today.)

On a funny note, my first attempt focused on threading for the simple comparison method, but it was simply too fast in parts and didn't work. In other words, time to do some significant refactoring. :)

nharding:
When you check hashes for identical files, you can speed that up significantly by only calculating the hashes for files whose size matches another file's. So if you have 8 files:
a.jpg [102456 bytes]
b.jpg [232583 bytes]
c.jpg [104356 bytes]
d.jpg [232483 bytes]
e.jpg [102456 bytes]
f.jpg [232583 bytes]
g.jpg [38914 bytes]
h.jpg [89583 bytes]
then you only need to calculate hashes for the files that are 102456 or 232583 bytes long. This is what I do as part of my check of all archives for duplicates in DCDisplay (only there, since I want to go via image data, I check if the hash matches, then I check the number of pages and the average resolution).

Neil Harding
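
For readers who want to try the size-first filter described above, here is a minimal sketch in Python. It is not code from Duplicate Photo Finder or DCDisplay; the function names `file_hash` and `find_identical_files` and the choice of SHA-256 are assumptions made for illustration. It only covers byte-identical files, not image-data comparison.

```python
import hashlib
import os
from collections import defaultdict


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(paths):
    """Group byte-identical files, hashing only files that share a size."""
    # Pass 1: bucket by size. This needs only file metadata, no file reads.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have an identical twin
        # Pass 2: hash only the files whose size collides with another file.
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_hash(path)].append(path)
        duplicates.extend(
            sorted(group) for group in by_hash.values() if len(group) > 1
        )
    return duplicates
```

With the eight files listed above, only a.jpg, e.jpg, b.jpg and f.jpg would be hashed; the other four have unique sizes and are skipped.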

Renegade:
When you check hashes for identical files, you can speed that up significantly by only calculating the hashes for files whose size matches another file's. So if you have 8 files:
a.jpg [102456 bytes]
b.jpg [232583 bytes]
c.jpg [104356 bytes]
d.jpg [232483 bytes]
e.jpg [102456 bytes]
f.jpg [232583 bytes]
g.jpg [38914 bytes]
h.jpg [89583 bytes]
then you only need to calculate hashes for the files that are 102456 or 232583 bytes long. This is what I do as part of my check of all archives for duplicates in DCDisplay (only there, since I want to go via image data, I check if the hash matches, then I check the number of pages and the average resolution).

Neil Harding

-nharding (January 14, 2011, 09:47 PM)
--- End quote ---

Thanks for the tip Neil.

At the moment, the logic for my 2 methods is combined and needs refactoring. One checks whether files are exactly identical, while the other checks whether the image data is exactly identical. File size doesn't help when comparing image data, so at the moment that won't get done. I need to refactor things to make it cleaner. Threading forces that, so once I get that done, your tip will be an excellent optimization~! :)


tomos:
I hope you won't be cursing me for these bug reports, Renegade  :)

I was helping someone move their photos to a new computer today. Their older pc has a maze of duplicated folders and files - there seemed to be 10 copies of one batch of images in the one folder :tellme:
Everything was already copied onto an external drive.

So I installed your app on the new laptop:
Windows 7 64-bit; classic theme (I believe it's up to date, but I don't know if SP1 is installed)

I compared the pics on the external drive.

Some minor problems:

1) When I selected the folder in the lower pane, the selected folder in the upper pane was no longer readable, i.e. the text and the highlighted background were both dark blue. If I then clicked on the folder in the upper half, it became readable, but the lower one was then unreadable.

2) when it was checking for duplicates, there was no indication it was working, until I actually clicked (anywhere) on the programme window - then the hour-glass showed.

3) Eventually I had to compare the "My Pictures" folder with itself. (I used the fast setting.) I'm not sure exactly how many, but there were over two thousand loose images in there and a few folders**. It showed the warning message and I clicked OK, but basically it seized up after that: when I clicked (anywhere) on the window, the title bar said it was not responding (or whatever it says in English; here it was "Keine Rückmeldung"). So eventually I closed it down. It did close normally; I didn't have to kill it. I tried a few times, but it was definitely too much for it...


** I presume contents of subfolders are NOT looked into?

Renegade:
I hope you won't be cursing me for these bug reports, Renegade  :)

I was helping someone move their photos to a new computer today. Their older pc has a maze of duplicated folders and files - there seemed to be 10 copies of one batch of images in the one folder :tellme:
Everything was already copied onto an external drive.

So I installed your app on the new laptop:
Windows 7 64-bit; classic theme (I believe it's up to date, but I don't know if SP1 is installed)

I compared the pics on the external drive.

Some minor problems:

1) When I selected the folder in the lower pane, the selected folder in the upper pane was no longer readable, i.e. the text and the highlighted background were both dark blue. If I then clicked on the folder in the upper half, it became readable, but the lower one was then unreadable.

2) when it was checking for duplicates, there was no indication it was working, until I actually clicked (anywhere) on the programme window - then the hour-glass showed.

3) Eventually I had to compare the "My Pictures" folder with itself. (I used the fast setting.) I'm not sure exactly how many, but there were over two thousand loose images in there and a few folders**. It showed the warning message and I clicked OK, but basically it seized up after that: when I clicked (anywhere) on the window, the title bar said it was not responding (or whatever it says in English; here it was "Keine Rückmeldung"). So eventually I closed it down. It did close normally; I didn't have to kill it. I tried a few times, but it was definitely too much for it...


** I presume contents of subfolders are NOT looked into?
-tomos (March 12, 2011, 03:36 PM)
--- End quote ---

Cursing? Heck no! I'm glad you've told me~! :)


1) When I selected the folder in the lower pane, the selected folder in the upper pane was no longer readable, i.e. the text and the highlighted background were both dark blue. If I then clicked on the folder in the upper half, it became readable, but the lower one was then unreadable.
--- End quote ---

Is the computer using a theme? Or is the default color scheme changed in Windows?

2) When it was checking for duplicates, there was no indication it was working until I actually clicked (anywhere) on the programme window; then the hourglass showed.
--- End quote ---

That shouldn't happen. The progress bars at the bottom should start.

3) Eventually I had to compare the "My Pictures" folder with itself. (I used the fast setting.) I'm not sure exactly how many, but there were over two thousand loose images in there and a few folders**. It showed the warning message and I clicked OK, but basically it seized up after that: when I clicked (anywhere) on the window, the title bar said it was not responding (or whatever it says in English; here it was "Keine Rückmeldung"). So eventually I closed it down. It did close normally; I didn't have to kill it. I tried a few times, but it was definitely too much for it...
--- End quote ---

Can you let me know the computer specs? I was testing on folders of about 1,000 photos that were 4~5 GB, and it worked fine.

Also, can you let me know the rough sizes of the pictures?

Offhand, I don't know what the problem is. There's nothing special going on, and nothing really tricky that could "mess up".

I think it might be the folder browser control... It's based on the stock listview. I probably should replace that with a custom control I have that's designed for very large data sets. Still... 2000 isn't "that" many...

I have some work to get done here, so I'll look into it this evening if I get done in time, or tomorrow, and get back to you.

Thanks for letting me know.
