Losslessly Compressing My JPEG Photos with jpegoptim
I’ve recently been running low on disk space on my laptop. I’ve freed some by removing files, but I’ve also been looking for ways to save space through compression.
My photo collection is currently 117GB. And that’s after removing the “everyone closed their eyes” shots and walrus memes!
Looks like a prime candidate for compression.
jpegoptim is an open source utility for losslessly optimizing JPEGs. It’s from simpler times when names were obvious and websites didn’t need CSS. It works its magic by optimizing the Huffman coding used to compress the image data.
JPEG encoders don’t always find the optimal coding for an image, prioritizing speed over perfection. Camera software especially opts for speed to keep the “shutter” available.
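If you'd like to see the potential gain before letting it rewrite anything, jpegoptim has a dry-run mode:

```shell
# -n (--noaction) makes jpegoptim report the would-be saving
# without modifying the file -- a safe dry run on any JPEG.
jpegoptim -n IMG_4586.JPG
```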
I wanted to verify jpegoptim would be safe to use on my photos. I first optimized a single photo:
$ cp IMG_4586.JPG IMG_4586-optim.JPG
$ jpegoptim IMG_4586-optim.JPG
IMG_4586-optim.JPG 4032x3024 24bit N Exif XMP ICC [OK] 4750852 --> 4018279 bytes (15.42%), optimized.
I verified jpegoptim preserved the exact pixels with GraphicsMagick compare on the before and after images:
$ gm compare -metric mse IMG_4586.JPG IMG_4586-optim.JPG
Image Difference (MeanSquaredError):
           Normalized    Absolute
          ============  ==========
     Red: 0.0000000000        0.0
   Green: 0.0000000000        0.0
    Blue: 0.0000000000        0.0
   Total: 0.0000000000        0.0
Zero difference - they're exactly the same!
I double-checked this by converting the images to bitmaps with GraphicsMagick convert, then comparing their hashes with sha256sum:
$ gm convert IMG_4586.JPG IMG_4586.bmp
$ gm convert IMG_4586-optim.JPG IMG_4586-optim.bmp
$ sha256sum IMG_4586.bmp IMG_4586-optim.bmp
0fd1d8d5a2286ca220746317612c0d8dfb757919a027ca8ad21e0c680f0954df  IMG_4586.bmp
0fd1d8d5a2286ca220746317612c0d8dfb757919a027ca8ad21e0c680f0954df  IMG_4586-optim.bmp
The hashes of the bitmap files are the same, so they contain exactly the same data. jpegoptim is indeed lossless.
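If you want to repeat this check on other files, the decode-and-hash steps can be wrapped into a small shell function. This is a sketch: `same_pixels` is my own name for it, and it assumes GraphicsMagick and sha256sum are installed.

```shell
# Decode two JPEGs to raw PPM on stdout and compare the hashes of the
# decoded pixels. Prints "lossless" if they are pixel-identical.
# (Assumes `gm` and `sha256sum` are on PATH.)
same_pixels() {
  a=$(gm convert "$1" ppm:- | sha256sum)
  b=$(gm convert "$2" ppm:- | sha256sum)
  [ "$a" = "$b" ] && echo lossless || echo different
}
# Usage: same_pixels IMG_4586.JPG IMG_4586-optim.JPG
```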
JPEGs also have Exif metadata. This contains tags such as the date/time the photo was actually taken, the GPS coordinates, and the camera settings.
I used exiftool with diff to check that jpegoptim preserves all this metadata:
$ exiftool IMG_4586.JPG > IMG_4586.JPG-exif
$ exiftool IMG_4586-optim.JPG > IMG_4586-optim.JPG-exif
$ diff IMG_4586.JPG-exif IMG_4586-optim.JPG-exif
2c2
< File Name                       : IMG_4586.JPG
---
> File Name                       : IMG_4586-optim.JPG
4,7c4,7
< File Size                       : 4.5 MB
< File Modification Date/Time     : 2019:09:20 11:27:24+01:00
< File Access Date/Time           : 2019:09:20 11:27:33+01:00
< File Inode Change Date/Time     : 2019:09:20 11:27:24+01:00
---
> File Size                       : 3.8 MB
> File Modification Date/Time     : 2019:09:20 11:27:36+01:00
> File Access Date/Time           : 2019:09:20 11:27:37+01:00
> File Inode Change Date/Time     : 2019:09:20 11:27:36+01:00
11a12
> JFIF Version                    : 1.01
72c73
< Thumbnail Offset                : 2312
---
> Thumbnail Offset                : 2330
Great. The only differences are in the non-Exif fields that exiftool outputs, such as the file name and modification times. The Exif data itself remains intact, including the time the photo was actually taken.
But How Much Savings?
When I ran jpegoptim above, its output ended with:
4750852 --> 4018279 bytes (15.42%), optimized
15% saving - not bad! (Or “good” in non-British English.)
This image might have been randomly more or less compressible though. To get a more accurate figure, I ran jpegoptim on my entire “incoming imports” folder. This contains the last 30 days of photos imported from my phone (my only camera).
I checked the disk usage of the folder before and after with du:
$ du -sh .
2.4G    .
$ jpegoptim *.JPG
APFG7754.JPG 1920x1440 24bit P JFIF [OK] 377974 --> 366834 bytes (2.95%), optimized.
AVEL7283.JPG 2583x2583 24bit N Exif XMP IPTC JFIF [OK] 1245429 --> 1208442 bytes (2.97%), optimized.
BFJO7034.JPG 1440x1920 24bit P JFIF [OK] 359923 --> 352126 bytes (2.17%), optimized.
...
XTPF0658.JPG 1081x1920 24bit P JFIF [OK] 272653 --> 264491 bytes (2.99%), optimized.
YCQQ4283.JPG 3840x2160 24bit N Exif [OK] 1583447 --> 1534028 bytes (3.12%), optimized.
YHDI8749.JPG 1600x901 24bit P JFIF [OK] 161337 --> 156934 bytes (2.73%), optimized.
YHFZ1946.JPG 1600x1200 24bit P JFIF [OK] 115151 --> 115385 bytes (-0.20%), skipped.
$ du -sh .
2.0G    .
So we went from 2.4GB to 2.0GB: 0.4 / 2.4 ≈ 16.7% savings. Across my current collection that would be 117 × 0.167 ≈ 19.5GB. Not bad indeed!
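Being precise about that arithmetic (a quick awk sketch using the rounded du figures, so the exact byte counts would shift it slightly):

```shell
# Recompute the savings ratio from the rounded `du` figures and
# project it across the full 117GB collection.
awk 'BEGIN {
  before = 2.4; after = 2.0
  ratio = (before - after) / before
  printf "savings: %.1f%%\n", ratio * 100   # savings: 16.7%
  printf "projected: %.1f GB\n", 117 * ratio # projected: 19.5 GB
}'
```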
(N.B. The first and last images in the folder, starting with random letters rather than IMG_, are from WhatsApp. It seems they are already better compressed than those taken with my phone’s camera, so there are fewer savings to be had. The camera images compressed by up to 40%.)
(When jpegoptim compresses an image “negatively,” like the last one in my output, it means it could only find worse codings. It leaves the original in place, rather than increasing the file size!)
I’ll be running jpegoptim on my whole collection in due course. It will take a lot of internet bandwidth afterwards: my backup software Arq will have to back up the photos again, since to it they’re entirely new data.
I’ll also make it part of my import process so I don’t need to think about it again.
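That import step could look something like this. A sketch only: the folder path is an assumption, and -p asks jpegoptim to preserve file timestamps so date-sorted views stay stable.

```shell
#!/bin/sh
# Hypothetical post-import hook: losslessly optimize newly imported
# JPEGs. The folder path is an assumption -- adjust to taste.
IMPORT_DIR="$HOME/Pictures/incoming-imports"

if [ -d "$IMPORT_DIR" ] && command -v jpegoptim >/dev/null 2>&1; then
  # -print0 / -0 handle file names containing spaces safely.
  # -p preserves the files' modification timestamps.
  find "$IMPORT_DIR" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) -print0 |
    xargs -0 jpegoptim -p
fi
```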
I hope this post has helped advertise a great tool to you.