Managing RAW photos with Python

Use Python to recursively scan directories and remove unused and redundant CR2 files.

I use a Canon EOS 5D Mark II digital SLR camera to take photos. Because I shoot in RAW, my camera generates proprietary CR2 image files, which then need to be processed into a commonly readable format (typically JPG, but sometimes PNG).

I use Adobe Photoshop to read the CR2 files, process my images and make the required format conversions. When Adobe Photoshop processes raw files, it generates an XMP file which tracks all the edits made to the CR2 and is automatically saved to the same directory. So there is a chain of file formats whenever I process an image: CR2, XMP, JPG.
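That chain can be sketched in code by swapping the extension on the raw file's path. This is only an illustration, assuming the XMP and JPG share the CR2's base name and directory; sidecar_paths and the file name IMG_0001.CR2 are made up for the example.

```python
from pathlib import PureWindowsPath


def sidecar_paths(raw_path):
    """Return the XMP and JPG paths expected beside a CR2 file (hypothetical helper)."""
    raw = PureWindowsPath(raw_path)
    # with_suffix swaps only the final extension and keeps the folder and base name
    return raw.with_suffix(".xmp"), raw.with_suffix(".jpg")


# Made-up file name, used purely for illustration
xmp, jpg = sidecar_paths(r"c:\photos\2021\queensland\IMG_0001.CR2")
print(xmp)
print(jpg)
```

PureWindowsPath only manipulates the path text, so the sketch runs on any operating system and never touches the disk.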

Of course, I never use all of the images that I take. For tricky subjects where there is a lot of movement (such as children or events), I may take hundreds of photos to get a smaller number of good ones. Until I download the images off the camera and take a look, I cannot be certain what is worth keeping and what can be discarded. But what I typically end up with is a directory full of CR2 files that I will never use and have no need to retain mixed in with a few that I absolutely must keep.

Yeah, I can delete these CR2 files manually, but it’s tedious and I’d prefer to automate the process if I can. The solution is this Python script that I wrote to do the task in Windows for me.

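The code runs on the command line in Windows. It walks the target folder recursively, finds every CR2 file, checks whether an XMP or JPG file with the same name sits beside it, and sends any CR2 without one to the Recycle Bin: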

#!/usr/bin/env python
"""Send CR2 files that have no matching XMP or JPG to the Windows Recycle Bin."""

import argparse
import os

import winshell  # Windows-only; provides Recycle Bin support


def options():
    """Parse the command-line arguments."""
    parser = argparse.ArgumentParser(
        description="Recursively remove CR2 files that have no matching XMP or JPG file")
    parser.add_argument("-f", "--folder", help="Target folder of images.", required=True)
    return parser.parse_args()


def list_files():
    """Walk the target folder and recycle every unprocessed CR2 file."""
    args = options()

    # os.walk and os.path.join accept Windows paths as given, so no
    # manual slash conversion is needed
    for root, dirnames, filenames in os.walk(args.folder):
        for filename in filenames:
            # Match the extension regardless of case (.CR2 or .cr2)
            if not filename.lower().endswith(".cr2"):
                continue
            filepath = os.path.join(root, filename)
            # Remove CR2 file extension to test for XMP or JPG
            name = os.path.splitext(filepath)[0]
            # Keep the CR2 if either an XMP or a JPG exists alongside it
            if os.path.isfile(name + ".xmp") or os.path.isfile(name + ".jpg"):
                continue
            # Move CR2 to Recycle Bin
            winshell.delete_file(filepath, no_confirm=True)
            print("Deleted: " + filepath)


if __name__ == '__main__':
    list_files()

The code relies on argparse to receive arguments from the command line and on winshell to send files to the Recycle Bin (which is safer than deleting the files outright, just in case something goes wrong). Only one argument is required: -f, which is the target folder. For example:

/path/to/phototidy.py -f c:\photos\2021\queensland
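To see how argparse handles the -f argument without opening a shell, the parser can be given a list of strings directly. This sketch mirrors the script's argument definition; build_parser is a hypothetical name, and the folder is simply the example path from above.

```python
import argparse


def build_parser():
    """Build a parser with the same single required argument (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Remove CR2 files that have no matching XMP or JPG file")
    parser.add_argument("-f", "--folder", help="Target folder of images.", required=True)
    return parser


# parse_args accepts an explicit list in place of the real command line
args = build_parser().parse_args(["-f", r"c:\photos\2021\queensland"])
print(args.folder)
```

Because the argument is marked required=True, leaving out -f makes argparse print a usage message and exit before any files are touched.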

Caution: Before running this, be sure that this is suitable for your workflow. After all, if you are the sort of person who wants to keep everything, even if it’s blurry or otherwise unsuitable, then this isn’t the script for you. But if you want to clear out your folders and only keep the CR2 files associated with your best shots, then this will make the task of clearing out directories a whole lot easier and save you a massive amount of storage.
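One way to check suitability before anything goes to the Recycle Bin is a dry run that only lists the candidates. The following is a minimal sketch, assuming the same matching rule as the script (a CR2 with no .xmp or .jpg of the same name beside it). The function find_unprocessed_raws and the file names in the demonstration are hypothetical, and nothing is deleted.

```python
import os
import tempfile


def find_unprocessed_raws(folder):
    """Return CR2 paths under folder with no matching .xmp or .jpg (hypothetical helper)."""
    candidates = []
    for root, _dirnames, filenames in os.walk(folder):
        for filename in filenames:
            if not filename.lower().endswith(".cr2"):
                continue
            filepath = os.path.join(root, filename)
            base = os.path.splitext(filepath)[0]
            # A CR2 counts as processed if either file exists alongside it
            if os.path.isfile(base + ".xmp") or os.path.isfile(base + ".jpg"):
                continue
            candidates.append(filepath)
    return sorted(candidates)


# Demonstration on a throwaway folder with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ("keep.CR2", "keep.xmp", "discard.CR2"):
        open(os.path.join(tmp, name), "w").close()
    for path in find_unprocessed_raws(tmp):
        print("Would delete: " + os.path.basename(path))
    # prints: Would delete: discard.CR2
```

Pointing the function at a real photo folder and reading through the list gives a chance to spot anything worth keeping before running the script itself.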

   
