sai Posted April 14, 2014 (Author)
Hi all,
I have accumulated multiple hard drives over time, and they contain many duplicated photos. I've managed to put every file on a 4 TB hard drive, manually deleting a great number of obvious duplicates.
Now, here is my plan to get rid of all the other 'not-so-obvious' duplicates, e.g., files with different file names. I'll start a brand new Lightroom 5 catalog, using both the option to copy the files to a new location and the option to not import suspected duplicates. In theory, at least in my head, this will go through all the files I compiled on the 4 TB drive and copy them to the new location while skipping duplicated ones.
Does this make sense to anyone other than me? Any other suggestions on how to accomplish this?

Thanks,
Simon
Charles_Webster Posted April 14, 2014
The "don't import suspected duplicates" option looks only at file name and time/date stamp. It won't find duplicate photos with different names.
<Chas>
sai Posted April 14, 2014 (Author)
Hi Chas,
I actually did a test with a few photos, i.e., duplicate photos with different names in different folders on my computer, and when I tried to import the duplicated version it was "greyed out" and I wasn't able to select it. Do you not see this?
I thought that LR looked at various types of data (file name, time stamp, and other Exif and other metadata) to compare files.
bgelfand Posted April 14, 2014
There are many "Duplicate File Finder" programs available. Digital pictures are simply digital files. Of course, change one byte in the file in an edit and it is no longer a true duplicate.
Here are links to two programs (I have never used either of them):
http://www.pcworld.com/article/2013264/review-auslogics-duplicate-file-finder-frees-up-hard-drive-space-quickly.html
http://www.pcworld.com/article/2025412/review-ashisoft-duplicate-finder-can-get-rid-of-duplicate-files-if-you-help-it.html
Before using any program to remove files en masse, be sure you have very, very good backups: two or more logical backups and at least one system image backup. Remember, "Never go nowhere you can't get back from no how."
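To make the "change one byte" point concrete, here is a minimal sketch using SHA-256 from Python's standard library. The byte strings are made-up stand-ins for photo files, and `sha256_of` is a hypothetical helper name; nothing here reflects how either linked program actually works.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in bytes for a photo file (illustrative only, not a real JPEG).
original = b"\xff\xd8\xff\xe0" + b"pixel data" * 1000
copy_renamed = bytes(original)      # same bytes, as in a copy under another name
edited = original[:-1] + b"\x00"    # only the last byte changed, as in an edit

print(sha256_of(original) == sha256_of(copy_renamed))  # True: identical content
print(sha256_of(original) == sha256_of(edited))        # False: one byte differs
```

A renamed copy keeps the same digest because the file name is not part of the file's contents, while any edit, however small, produces a different digest.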
sai Posted April 14, 2014 (Author)
Thanks for the suggestions Brooks, I'll give them a try! Yes, I'll definitely have a backup of a backup before trying anything.
JeffOwen Posted April 14, 2014
One problem I have with deleting by duplicate names is that, across my changes of Canon cameras, they have always used the same file name format, i.e. xxx_IMG.jpg. This means that unless I have changed the file name (which I often do) there may be several files with the same name (albeit taken on different dates etc.). I have now discovered that my iPad also uses this xxx_IMG.jpg format!
All this means is that if you use an automatic search for similar file names and auto-delete them you could lose a lot of files.
sai Posted April 14, 2014 (Author)
Thanks for the warning Jeff! Definitely don't want that!
However, don't these types of programs (including Lightroom) look at many different types of information to identify duplicates, e.g., file name, date, camera, shutter speed, etc.?
JeffOwen Posted April 15, 2014
I have not used the more modern file search and delete programs, but I guess that if they use even some of the EXIF data this would avoid the problem I suggested.
Colin O Posted April 15, 2014
I have successfully used a freeware program called VisiPics that uses an algorithm to find duplicates based on the actual images themselves, not filenames or EXIF data. It can "match" images that are edited slightly or resized, as well as images that may not be duplicates but are judged to be similar. Using it on 4 TB worth of images would be a tedious job though!
I would suggest first slimming down your duplicates by using a program to rename all your images according to the date/time the photo was taken (from the EXIF data). Don't let the program automatically delete "duplicates" at this stage; that would be something you would do with some kind of manual intervention. At least, I would.
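The rename-by-timestamp step could be sketched roughly as below, as a dry run that only builds a plan and never touches a file. One loud assumption: the file's modification time stands in for the EXIF capture time, because reading real EXIF data needs a third-party library. `plan_renames` is a hypothetical helper, not part of any program mentioned in this thread.

```python
from datetime import datetime
from pathlib import Path
from typing import Dict, Set


def plan_renames(folder: Path) -> Dict[Path, Path]:
    """Map each file to a timestamp-based name without renaming anything.

    ASSUMPTION: modification time is used as a stand-in for the EXIF
    capture time, which the standard library cannot read.
    """
    plan: Dict[Path, Path] = {}
    taken: Set[str] = set()
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        stamp = datetime.fromtimestamp(path.stat().st_mtime)
        base = stamp.strftime("%Y%m%d_%H%M%S")
        suffix = path.suffix.lower()
        name = f"{base}{suffix}"
        counter = 1
        # Files sharing a timestamp get _1, _2, ... instead of colliding.
        while name in taken:
            name = f"{base}_{counter}{suffix}"
            counter += 1
        taken.add(name)
        plan[path] = path.with_name(name)
    return plan
```

Files that end up sharing a timestamp prefix are then the candidates to review by hand, which matches the advice not to let any program delete them automatically at this stage.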
alan_cox3 Posted April 15, 2014
In the past I have used ThumbsPlus to find similar images. It does a good job and doesn't require them to be an exact match. In fact, shots taken at the same shoot are sometimes (rightly) identified as similar. You can control the amount of similarity required.
sai Posted April 15, 2014 (Author)
Thanks for the input Colin. Unfortunately, I'm using a Mac and VisiPics only runs on Windows. I'll do some searching around for a similar program for Mac.
After I trimmed the obvious duplicates from my collection I ended up with 'only' 1 TB of images. I'll do some more testing with dummy sets using LR and other software and report back!
lex_jenkins Posted April 15, 2014
About a month ago I stumbled across a Lightroom add-on doodad that supposedly would seek and find duplicates from within the Lightroom environment. It seemed to work okay, although I tried only the free trial version which had very limited functionality. Might be worth Googling for if it sounds like it might do the job for you.
parv Posted April 16, 2014
There is software named "rsync" [0,1,2,3] that can sync, among other ways, based on the checksum of files [4], as is the case here. There should be a port for Apple OS X [5] and MS Windows [5]; one does exist for Unix-like systems [1].
[0] The Source: https://rsync.samba.org/
[1] Multiple ways to download: https://rsync.samba.org/download.html
[2] A tutorial: http://everythinglinux.org/rsync/
[3] How it works: https://rsync.samba.org/how-rsync-works.html
[4] Option "--checksum" in manual page: https://rsync.samba.org/ftp/rsync/rsync.html
[5] Resources for, among other things, running on Apple OS X & MS Windows: https://rsync.samba.org/resources.html
parv Posted April 16, 2014
Also, from a 2010 thread entitled "Duplicate image finder":
http://www.photo.net/casual-conversations-forum/00VQYq
...
http://www.photo.net/casual-conversations-forum/00VQcp
sai Posted April 16, 2014 (Author)
Thank you parv! This is turning out to be more complicated than I thought it would be.

Last night I did an experiment: I created a directory with a couple of subdirectories within it. I placed the same image in all of them, as well as that same image with a different name and a copy where I edited the image, i.e., original file, original file with different names, duplicated original files, edited files. Lightroom was only able to identify the files with the same name, but not the edited ones or some of the ones with different names...
I then did the same experiment using a nice piece of free software that I found online, dupeGuru http://www.hardcoded.net/dupeguru/, and it was able to find all copies of the file including the ones with different names. I'm guessing it's doing a checksum. It did of course not find the edited version of the file, since this file is effectively not a duplicate anymore.
I think that I will have to do this slowly and on a directory-by-directory basis, using dupeGuru or another piece of software that does some sort of checksum, and not in a giant batch mode. Probably better and safer this way anyway.

Thank you all for the multiple suggestions!

Cheers,
Simon
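For anyone curious what a checksum-based pass over one directory might look like, here is a minimal sketch in Python. It is only a guess at the general technique and not dupeGuru's actual implementation; `file_sha256` and `find_duplicates` are hypothetical names. It groups files by size first, then by SHA-256 digest, and only reports the groups without deleting anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large photos are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> List[List[Path]]:
    """Return groups of files under root whose size and SHA-256 digest match."""
    by_size: Dict[int, List[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    groups: List[List[Path]] = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        by_hash: Dict[str, List[Path]] = defaultdict(list)
        for path in same_size:
            by_hash[file_sha256(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Calling `find_duplicates(Path("some/folder"))` on a test folder like the one described above should list the renamed copies together and leave the edited file out, since its bytes differ. Running it one directory at a time fits the slow, folder-by-folder approach.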
lex_jenkins Posted April 16, 2014
Thanks for that reference, Simon. I'm going through my laptop's photo files folder by folder to eliminate duplicates while trying to avoid deleting similar but not identical photos. I somehow screwed up when I set up LR on this laptop and didn't realize, until the hard drive had filled up unusually quickly, that imported photos were not only being duplicated but the raw files and JPEGs were also being sorted into separate folders with apparently identical dates. Fortunately that mess only covers one year of photos.
peterson07 Posted June 5, 2017
Why not try Duplicate Files Deleter? It does a thorough search of your hard disk and finds copies of the same file that may be stored at different locations. It gives you a comprehensive list of all those files, and you can decide for yourself what you want to do with them.
Ed_Ingold Posted June 5, 2017
There is a risk in using software to delete duplicates: files with the same names may be deleted even if they're different. I keep the original image names intact, but they roll over after only 10,000 or so. Mine are distinguished by placing them in named directories. Cleanup software may not recognize the directory name as part of the unique identifier. Just as bad, you may be asked to decide for each duplicate. My conclusion is, they're a waste of time and pose a considerable risk. Now, software to synchronize two drives or directories can be very useful.
bgelfand Posted June 5, 2017
It would depend upon how the programs determine that the files are duplicates. Most use some sort of checksum on the file contents. If the program used the SHA-256 hash of the file to define duplicates, I would have very, very high confidence that the files were, indeed, duplicates of one another.
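As a rough illustration of a content-based check that ignores file names entirely, here is a minimal Python sketch. `confirmed_duplicate` is a hypothetical helper, not taken from any of the programs named in this thread; it compares SHA-256 digests and then, as an extra precaution, compares the two files byte for byte with the standard library's `filecmp`.

```python
import filecmp
import hashlib
from pathlib import Path


def confirmed_duplicate(keep: Path, candidate: Path) -> bool:
    """True only if the two paths are distinct files with identical contents."""
    if not (keep.is_file() and candidate.is_file()):
        return False
    if keep.samefile(candidate):
        return False  # the same file reached twice; deleting it would lose it
    keep_hash = hashlib.sha256(keep.read_bytes()).digest()
    candidate_hash = hashlib.sha256(candidate.read_bytes()).digest()
    if keep_hash != candidate_hash:
        return False
    # shallow=False forces a content comparison instead of trusting os.stat() data.
    return filecmp.cmp(keep, candidate, shallow=False)
```

Under this check, two files both named IMG_0001.jpg in different directories are treated as different unless their bytes match, so rolled-over camera names are not a problem by themselves. Good backups are still the real safety net.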