I've used https://github.com/elcorto/imagecluster. It computes neural feature vectors via VGG16. It used around 100 GB of RAM for 20k photos. Works OK on CPU, if you have something like 16 cores.
For clustering, the sqeuclidean (squared Euclidean) metric worked best for me.
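In case it helps, here's a rough sketch of clustering feature vectors with that metric using SciPy directly. This is not imagecluster's own API. The random arrays stand in for real VGG16 feature vectors, and the `cluster_features` helper, the average linkage method and the cluster count are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_features(features, n_clusters, metric="sqeuclidean", method="average"):
    """Hierarchically cluster row vectors and return one label per row.

    Hypothetical helper: `features` is an (n_images, n_dims) array.
    """
    # Build the linkage tree using the squared Euclidean distance.
    tree = linkage(features, method=method, metric=metric)
    # Cut the tree so that at most `n_clusters` flat clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


rng = np.random.default_rng(0)
# Stand-in data: two well-separated blobs of 5 "images" each, 64 dims.
features = np.vstack([
    rng.normal(0.0, 0.1, size=(5, 64)),
    rng.normal(5.0, 0.1, size=(5, 64)),
])
labels = cluster_features(features, n_clusters=2)
print(labels)
```

With real data you'd swap the random array for your extracted feature matrix and try a few cluster counts or distance thresholds to see what groups sensibly.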
There's also https://github.com/facebookresearch/deepcluster, but all it did for me was mess with CUDA and crash.
Bonus picture from dataset