Thread: Duplicate File Finder (Support Forums, Perl Programming Support)
RE: Duplicate File Finder - AceInfinity - 01-04-2012

(01-04-2012, 04:01 PM)Yellows Wrote:

I explained that: it compares files of similar sizes to narrow its search, which speeds things up, then takes those similarly sized files and compares each one to check for matching MD5 hashes, and returns them in the printed results. It is quicker for that reason.

RE: Duplicate File Finder - Yellows - 01-04-2012

(01-04-2012, 04:10 PM)AceInfinity Wrote: I explained that: it compares files of similar sizes to narrow its search, which speeds things up, then takes those similarly sized files and compares each one to check for matching MD5 hashes, and returns them in the printed results.

What I am trying to ask is: why does it search for similar file sizes when two files that are not exactly the same size can't be the same file?

RE: Duplicate File Finder - AceInfinity - 01-04-2012

(01-04-2012, 04:12 PM)Yellows Wrote:

Answer: To narrow the search. For example, if you have the following files in a directory:

-File1 (10 GB)
-File2 (11 GB)
-File3 (3.05 KB)
-File4 (3.21 KB)
-File5 (3.25 KB)
-File6 (3.35 KB)
-File7 (3.05 KB)

If you're looking for matching MD5 hashes, comparing File5 against every file, including File1 and File2, would be senseless: they obviously aren't going to be the same file, since they hold gigabytes more data than File5 itself. If you compare File5 only against File3 through File7, that speeds things up, because you're no longer checking the file's MD5 hash against every other file, only against a select few.

RE: Duplicate File Finder - Yellows - 01-04-2012
Yes, but it is also sensible to conclude that the only files that could possibly be identical are File3 and File7; no other pair of files could be the same.
Why even do the search on the others?

RE: Duplicate File Finder - AceInfinity - 01-04-2012

But that's what I'm trying to say, lol. I thought you weren't understanding the method of narrowing a search, but this is what my script does with the comparison operation $b <=> $a from my hash array.

RE: Duplicate File Finder - Yellows - 01-04-2012
Wait, so File4 and File5 wouldn't be scanned against each other?
If not, everything we've said is completely worthless, since I had understood what you were saying, LOL.

RE: Duplicate File Finder - AceInfinity - 01-04-2012

(01-04-2012, 04:40 PM)Yellows Wrote:

I wouldn't say completely worthless, as it did raise some discussion on how my script works. I haven't had that before, because the Perl section here is virtually dead anyway. But yes, that's what the comparison method does. Otherwise I would have to set a range of similar file sizes to look for, which would bring no benefit: it would waste lines in my code, and a file that is off by even one byte gets a different MD5, even if the difference is just a few added null bytes.

RE: Duplicate File Finder - AceInfinity - 01-05-2012

No, but have you tried getting banned yet?

RE: Duplicate File Finder - brianofen - 11-04-2012

I do not know the code, that's why I use duplicate files deleter. It's very easy to use and very helpful for finding duplicated files.

RE: Duplicate File Finder - AnnaLorf - 03-19-2013

I highly recommend using duplicate files deleter. Here's the link: http://DuplicateFilesDeleter.com
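
For reference, the approach discussed in this thread (only files of exactly the same size are candidates, and MD5 hashes are compared only within each such group) can be sketched in Perl roughly as follows. This is a minimal sketch, not AceInfinity's script, which is not shown in this part of the thread. The sub name find_duplicates and the demo files are made up for illustration, and using $b <=> $a to walk the size groups from largest to smallest is only an assumption about how the original uses that comparison.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Basename qw(basename);
use File::Temp qw(tempdir);
use Digest::MD5;

# Hypothetical helper (not from the original script): returns a list of
# array refs, each holding the paths of files with identical content.
sub find_duplicates {
    my ($dir) = @_;

    # Step 1: bucket every regular file by its exact size in bytes.
    my %by_size;
    find(
        {
            no_chdir => 1,    # $_ holds the full path inside 'wanted'
            wanted   => sub {
                return unless -f $_;
                my $size = -s $_;
                push @{ $by_size{$size} }, $_;
            },
        },
        $dir
    );

    # Step 2: hash only the files that share their size with another file.
    my @groups;
    for my $size ( sort { $b <=> $a } keys %by_size ) {    # largest first
        my @same_size = @{ $by_size{$size} };
        next if @same_size < 2;    # a unique size cannot have a duplicate

        my %by_md5;
        for my $path (@same_size) {
            open my $fh, '<', $path or next;    # skip unreadable files
            binmode $fh;
            my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
            close $fh;
            push @{ $by_md5{$md5} }, $path;
        }

        # Keep only the hashes that more than one file produced.
        push @groups, map { [ sort @$_ ] } grep { @$_ > 1 } values %by_md5;
    }
    return @groups;
}

# Demo with made-up files in a temporary directory.
my $dir     = tempdir( CLEANUP => 1 );
my %content = (
    File3 => "same bytes",
    File7 => "same bytes",      # exact copy of File3
    File4 => "same bytes\0",    # one added null byte, so a different size
    File5 => "other data",      # same size as File3, different content
);
for my $name ( keys %content ) {
    open my $out, '>', "$dir/$name" or die "Cannot write $name: $!";
    binmode $out;
    print {$out} $content{$name};
    close $out;
}

for my $group ( find_duplicates($dir) ) {
    print join( ", ", map { basename($_) } @$group ), "\n";
}
# prints: File3, File7
```

In the demo, File4 is never hashed at all, because no other file has its size. File5 is hashed because it shares a size with File3 and File7, but its MD5 differs, so it is not reported.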