Review: Gemini – the Duplicate File Finder
With hard drives getting both bigger and cheaper these days, you may not think you need an app whose sole purpose is to locate duplicate files and folders to help you reclaim disk space. However, with Apple making a fairly clear move towards smaller, more expensive SSDs, getting back an extra 5-10 GB or so of space may actually be worth shelling out five bucks, which just happens to be how much Gemini ($4.99, Mac App Store) costs.
Gemini is a very slick little app that makes finding duplicate files a snap. Simply launch Gemini, select a folder (or Command-click to select multiple folders) and drag it onto Gemini’s interface.
It will then begin analyzing the contents of the folders…
… and when it’s done you’ll be prompted to view the duplicates it found.
Oh my! Look at all the duplicate files Gemini found! And mainly porn, apparently! (Believe it or not, those images were actually for a legitimate job I was doing.)
Anyway, the interface is quite slick. When you have finished your scan, Gemini will display a list of the potentially duplicate files and folders it found, and clicking one will show you the file path of each copy of the file. You can then manually click a checkbox and hit the red REMOVE SELECTED button at the top of the interface. This will add that version of the file to the removal queue (it does NOT yet delete the file) and you can continue to go through the list, adding files as you see fit. Once you’ve finished choosing all the files you’d like to remove, you can then hit the REMOVE button and you will be treated to a little animation of your files being “shredded”. And really, even at this point your files are still OK. Shredding simply moves the duplicates to the Trash. If you REALLY wanted to, you could find a mis-shredded file and restore it.
Now, if the idea of manually picking and choosing files seems too time-intensive to you, Gemini also has an “auto-select duplicates” function, which will sift through the duplicate results and check off the ones it thinks you don’t need. However, in my results, with nearly 4 GB of duplicate files found in one folder, the auto-select only found 4 files it was willing to delete. I suppose ultimately it is safer to err on the side of caution when it comes to deleting files, and Gemini’s developers claim they are working to refine the algorithms used by the software, so while perhaps one day this will be a usable feature, for now I feel you’ll probably still end up making file-by-file decisions.
There’s a nice little pie-chart graph at the bottom left that shows you graphically which types of files are taking up the most room on your drive, and clicking a section will filter the results, so if you only want to focus on removing duplicate FOLDERS, or just see your duplicate photos, you can.
And speaking of photos, one nice feature that should give you a little peace of mind when using Gemini is that it will NOT remove any photos or songs that are currently in your iPhoto, iTunes, Aperture or Photo Booth libraries. If it finds a duplicated file that is both inside AND outside your iTunes library, it will show the one NOT in your library, and you are free to delete it, but Gemini will not do so itself. I think that is ultimately a good thing, since iTunes can look for duplicate files anyway.
What’s nice is the way Gemini decides which files to keep or to throw away. There are certain things Gemini will look for when helping you decide which version of a file to keep, such as dates of creation and modification, folders they are placed in and so on. For instance, if you have a file in your Downloads folder, but Gemini finds you have an identical copy in another folder, it will suggest you delete the file in your Downloads folder, as it assumes you copied it from there to a place you wanted, and just forgot to delete it.
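To make the Downloads example concrete, here is a minimal Python sketch of that kind of rule. It is not Gemini's code, which isn't public; the function name `suggest_removal` and the `low_priority` parameter are made up for illustration, and the date-based criteria mentioned above are left out because the review doesn't say how Gemini weighs them.

```python
from pathlib import PurePosixPath


def suggest_removal(copies, low_priority=("Downloads",)):
    """Given paths to identical files, return the copies to suggest removing.

    Hypothetical rule: if some copies sit in a low-priority folder such as
    Downloads and at least one copy lives elsewhere, suggest the low-priority
    copies. Otherwise make no suggestion and leave the choice to the user.
    """
    paths = [PurePosixPath(p) for p in copies]
    # A copy is low priority when any folder in its path is on the list.
    in_low = [p for p in paths if any(part in low_priority for part in p.parts)]
    elsewhere = [p for p in paths if p not in in_low]
    if in_low and elsewhere:
        return in_low
    return []
```

Called with one copy in `/Users/me/Downloads` and one in `/Users/me/Pictures`, it suggests only the Downloads copy; if every copy is in Downloads, it suggests nothing.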
Whatever the algorithm Gemini uses to compare files is, it works pretty well. We downloaded the same Muppet file at different times, renamed one, and it still flagged them as dups.
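The rename test suggests the comparison is based on file contents rather than file names. A minimal Python sketch of a name-independent check, using the standard library's `filecmp` module, could look like this. It only illustrates the idea and says nothing about how Gemini actually does it; `same_content` is a hypothetical helper name.

```python
import filecmp


def same_content(path_a, path_b):
    """Return True when both files hold identical bytes.

    File names and dates play no part in the result.
    """
    # shallow=False makes filecmp compare the file contents instead of
    # accepting a match on os.stat() data (type, size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Two downloads of the same file, one of them renamed, would come back as `True` here, while two different files would come back as `False` even if they shared a name.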
While I am pleased with the dupe finding abilities of Gemini, there are a couple of issues. The first is the somewhat useless auto-select feature I mentioned – but as I said, I sort of want to oversee any utility that would decide what files I should be deleting. And Gemini’s developers claim they are working to improve the dupe-finding algorithms, so in time this automated feature may prove more useful. But a more important issue is that there aren’t really any settings that allow you to specify folders you do NOT want Gemini to include in its results. While your iPhoto/iTunes results may be protected, any other folder you may have that you want untouched will not be afforded that protection. This means you can’t simply drag your entire hard drive or user folder onto Gemini unless you are sure of where you keep certain files. This is not ultimately a killer by any stretch, but it just means you need to pay attention as you make your selections.
The final issue I have with Gemini is not so much one I have, as one that others who downloaded the app have had. According to reviews, each time Gemini is launched it sends out some sort of data to a remote server without asking you. I questioned Gemini’s lead developer, who explained that “It is just launch reports, to have real time stats on how many people purchased and launched product. No personal information is sent or collected. We are going to included option of turning off this functionality in users preferences. We are going to push this update in nearest week or two along with some other advanced settings.”
So assuming that is indeed the case, it shouldn’t be a problem, although for the life of me I can’t think of why a developer would need to know how many times a person is using their software, especially one designed to remove duplicate files. What would they change about it if I launched it every day versus once a year? Once it’s bought and paid for, what does it matter to them? Either way, privacy aficionados may want to wait for the update before downloading.
Conclusion
Gemini is a handy utility you may very well want to add to your arsenal. The slick, user-friendly interface is easy to use and understand, and makes finding and deleting duplicate files a breeze. However, the interface might be just a little TOO simple for some people, as there are not a lot of options to allow things like setting up folders you’d like excluded from results. Still, if you are looking for an easy way to reclaim disk space, Gemini is a well-designed tool that will do just that.
Price: $4.99
Pros: Simple, sleek design, fast, displays results in an easy to understand way making it easy to decide which files to keep
Cons: The “auto-select files” feature is a bit hit or miss, no way to exclude certain folders from results. Currently sends “launch data” to the developers every time you launch the app
Too bad Apple dropped ZFS when it turned into an Oracle product. The latest version has block-level deduplication that takes care of these things automatically.
ZFS is back via third parties. You can get it here for only twenty bucks! http://tenscomplement.com
No you can’t. This is vaporware.
They talk about products, but they have no pricing and no downloads, and they don’t mention duplicate file issues anywhere.
Great post Doc! I’ve been trying to wrangle up all those Emma Rea Randell duplicates in my porn folder for months. Of course I’ll have to look at all those dups one last time before I bid them a fond farewell. Now to tackle those annoying Muppet pics that keep showing up and get them off my Mac once and for all.
🙂
Nice!
One question though, I can’t find this information on their site or your review: is it able to find “similar images”, even if not 100% duplicates?
For example, say you have one photo, and a second one that is a 80% size version of it. 1000×1000 for one and 800×800 for the other. Not cropped, just resized. Will it be able to consider them as duplicates?
“Duplicate Image Detector” (http://www.blackbilby.com/did.html) does that, but the app is not very pretty and, most of all, it is very impractical to use and has been unmaintained since 2010.
Thank you for the review in any case; this one seems pretty handy 🙂
Doc, does it work on NAS drives too?
Didn’t try, but they claim in their FAQ
“Q Can I scan removable media and network volumes?
A Yes. You can either drag them from Finder and drop on Gemini window, or choose them in a new Finder window that will appear after you click “+” button.”
So maybe.
– The Doc
Thanks, Doc – will give it a shot and see …
It looks like they doubled their price to $9.99 after getting a little free publicity from you. At that price I’ll pass on Gemini.
If you don’t mind not having a GUI, I wrote a script a while back to compare two folders after copying. It didn’t take much to modify it to compare files and isolate all the duplicates. I’m not sure what algorithm Gemini uses; this script will find any files that are exact duplicates.
http://pastebin.com/gW9MAuq4
It’s a Bash script, so download and save as a text file, chmod +x, and run in Terminal. It uses MD5 sums to recursively compare files in a directory and move any duplicates to a “duplicates” folder (leaving one copy alone).
Disclaimer: I’ve tested it a bit, but YMMV, so use at your own risk!
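For anyone who would rather not run a script straight from a paste site, the approach described above (MD5 sums, a recursive walk, and a “duplicates” folder that receives every copy but the first) can be sketched in Python roughly as follows. This is not the linked Bash script, just an illustration of the same idea; the function names and the renaming scheme for clashing file names are assumptions, and the same use-at-your-own-risk caveat applies.

```python
import hashlib
import shutil
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to save memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_duplicates(root, dest_name="duplicates"):
    """Move every file under root whose content was already seen into
    root/dest_name, leaving the first copy found (in sorted path order)."""
    root = Path(root)
    dest = root / dest_name
    seen = {}   # digest -> first path found with that content
    moved = []  # (original path, new path) pairs
    # sorted() collects the full listing before any file is moved.
    for path in sorted(root.rglob("*")):
        # Skip folders, symlinks, and anything already in the duplicates folder.
        if path.is_symlink() or not path.is_file() or dest in path.parents:
            continue
        digest = md5sum(path)
        if digest not in seen:
            seen[digest] = path
            continue
        dest.mkdir(exist_ok=True)
        target = dest / path.name
        counter = 1
        while target.exists():  # avoid overwriting a same-named duplicate
            target = dest / f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        shutil.move(str(path), str(target))
        moved.append((path, target))
    return moved
```

Running `move_duplicates("/path/to/folder")` returns a list of what was moved and where, so nothing is deleted and any move can be undone by hand.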
The new Product is Gemini II. It needs OS X 10.7. Too bad, so sad. It’s of no use to me at present.
Well, the job HAD to be legitimate for obvious reasons, I mean no one is distributing soft erotic pics in CR2 format. As such, I wonder whether this article was an ad for Gemini, for your Canon, for Emma or for your job as a photographer 🙂
Otherwise it is odd to see removal options in Gemini choosing between “remove oldest” or “remove newest”. They are obviously comparing filesize, hash, meta tags and the like, but definitely NOT filename and creation date, or the above mentioned option couldn’t have had any meaning. Your Muppet test bears this out as well.
Was there any reason you needed to use soft porn images* to illustrate this software?
Totally turned me off reading the review – you’ve done Gemini a disservice because I’ve now only read the review on Macdaily for their competitor, PhotoSweep.
I’m sure there were other images you could have used!
* Just to make it clear, I don’t object to these images in porn, but for a review of totally unrelated software? Unnecessary.
You must have a hard time walking around.