Managing RAW photos with Python
Use Python to recursively scan directories and remove unused and redundant CR2 files.
I use a Canon EOS 5D Mark II digital SLR camera to take photos. Because I shoot in raw, my camera generates proprietary CR2 raw image files, which then need to be processed into a commonly readable format (typically JPG, but sometimes PNG).
I use Adobe Photoshop to read the CR2 files, process my images and make the required format conversions. When Adobe Photoshop processes raw files, it generates an XMP file which tracks all the edits made to the CR2 and is automatically saved to the same directory. So there is a chain of file formats whenever I process an image: CR2, XMP, JPG.
Of course, I never use all of the images that I take. For tricky subjects where there is a lot of movement (such as children or events), I may take hundreds of photos to get a smaller number of good ones. Until I download the images off the camera and take a look, I cannot be certain what is worth keeping and what can be discarded. But what I typically end up with is a directory full of CR2 files that I will never use and have no need to retain mixed in with a few that I absolutely must keep.
Yeah, I can delete these CR2 files manually, but it’s tedious and I’d prefer to automate the process if I can. The solution is this Python script that I wrote to do the task in Windows for me.
The code works on the command line in Windows by:
- Recursively reading through a directory.
- Identifying CR2 files.
- Checking if there is an associated XMP or JPG file (which will always have the same base name as the original CR2 file).
- If there is no associated XMP or JPG file, sending the CR2 file to the Recycle Bin.
#!/usr/bin/env python
import argparse
import os

import winshell


def options():
    """Parse the command-line arguments."""
    parser = argparse.ArgumentParser(
        description="Send CR2 files with no matching XMP or JPG to the Recycle Bin")
    parser.add_argument("-f", "--folder", help="Target folder of images.", required=True)
    return parser.parse_args()


def list_files():
    # Get options
    args = options()
    # Normalise the target path (separators and any trailing slash)
    target = os.path.normpath(args.folder)

    for root, dirnames, filenames in os.walk(target):
        for filename in filenames:
            if filename.lower().endswith(".cr2"):
                filepath = os.path.join(root, filename)
                # Remove the CR2 extension to test for XMP or JPG.
                # splitext returns (base, extension), so keep only the base.
                name = os.path.splitext(filepath)[0]
                # Check that there is no XMP or JPG
                if not os.path.isfile(name + ".xmp") and not os.path.isfile(name + ".jpg"):
                    # Move CR2 to Recycle Bin
                    winshell.delete_file(filepath, no_confirm=True)
                    print("Deleted: " + filepath)


if __name__ == '__main__':
    list_files()
The code relies on argparse to read options from the command line and on winshell to send files to the Recycle Bin (which is a safer option than deleting the files outright, just in case something goes wrong). Only one argument is required: -f, which is the target folder. For example:
/path/to/phototidy.py -f c:\photos\2021\queensland
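The script only looks for XMP and JPG companions, so a CR2 that was exported solely to PNG would be treated as unused. Below is a minimal sketch of one way to make the companion extensions configurable. The `-e`/`--ext` option and the `build_parser` and `has_companion` helpers are hypothetical names for illustration and are not part of the script above.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(
        description="Find CR2 files with no companion files")
    parser.add_argument("-f", "--folder", required=True,
                        help="Target folder of images.")
    # Hypothetical option: extensions that mark a CR2 as "in use"
    parser.add_argument("-e", "--ext", nargs="+", default=["xmp", "jpg"],
                        help="Companion extensions to look for.")
    return parser


def has_companion(cr2_path, extensions):
    """Return True if a file with the same base name and any of the
    given extensions exists next to cr2_path."""
    base = os.path.splitext(cr2_path)[0]
    return any(os.path.isfile(base + "." + ext) for ext in extensions)
```

With something like this in place, the call might look like `phototidy.py -f c:\photos\2021\queensland -e xmp jpg png`, and the walk loop would test `has_companion(filepath, args.ext)` instead of the two hard-coded `os.path.isfile` checks.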
Caution: Before running this, be sure that this is suitable for your workflow. After all, if you are the sort of person who wants to keep everything, even if it’s blurry or otherwise unsuitable, then this isn’t the script for you. But if you want to clear out your folders and only keep the CR2 files associated with your best shots, then this will make the task of clearing out directories a whole lot easier and save you a massive amount of storage.
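One way to check that the script suits your workflow is to preview what it would remove before anything goes to the Recycle Bin. Here is a minimal sketch of such a dry run, assuming the same XMP/JPG rule as the script. The function name `find_unused_cr2` is my own for illustration, and it needs only the standard library, so it runs without winshell.

```python
import os


def find_unused_cr2(folder):
    """Return CR2 paths under folder that have no XMP or JPG companion.

    Nothing is deleted; this only reports what the script would remove.
    """
    unused = []
    for root, _dirnames, filenames in os.walk(folder):
        for filename in filenames:
            if not filename.lower().endswith(".cr2"):
                continue
            filepath = os.path.join(root, filename)
            # Base name shared by the CR2 and its companions
            base = os.path.splitext(filepath)[0]
            if not os.path.isfile(base + ".xmp") and not os.path.isfile(base + ".jpg"):
                unused.append(filepath)
    return sorted(unused)
```

Printing each path returned by `find_unused_cr2(r"c:\photos\2021\queensland")` gives a list to look over before running the real thing.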