Creating Metadata Free Images

In this post we’ll go over how to view metadata in images and how to create a new metadata-free image from it. If you don’t know what metadata is, it’s basically described as “data about data”. What this means is that it hold information about the data, in this case the image, such as the time the image was taken, camera used, model of camera, GPS location, etc. In images, I’ve found out that this is commonly known as exif data.

Before we begin you need to install Pillow, it’s the library we’ll be using in this tutorial. Also a quick note, I’m using Python 2 but I tested out the code in Python 3 and it should work just fine. You can install Pillow through pip: pip install Pillow or if you’re in Windows you can use this site.

We’ll begin by creating a file, metadata.py, and creating a function to view the metadata first.

# 3rd party libs
from PIL import Image
from PIL.ExifTags import Tags, GPSTAGS  # Dictionaries containing key codes

def getMetadata(file_path):
    '''
    Read metadata (exif) from images and return it
    :param file_path: file path to where image is at
    :return: metadata from image
    '''
    # Open Image
    img = Image.open(file_path)
    # Grab metadata from image if any
    exif_data = img._getexif()
    # if image has metadata do this
    if exif_data is not None:
        # make metadata 'readable'
        exif_data = {TAGS[k] if k in TAGS  # Known generic exif codes
                     else GPSTAGS[k] if k in GPSTAGS  # Known GPS exif codes
                     else k: v for k, v in exif_data.items()}
    return exif_data

Here everything is straight forward, we open the image, read it for metadata and if doesn’t have any it’ll simply return None. If it has data we do a dictionary comprehension running the data we retrieved with the commonly known exif keys. If you want to see what is happening, you can print exif_data before and after we run the dictionary comprehension. Also here is file that contains those keys if you’re curious.

To test our new fuction, lets add this to the end of the file:

if __name__ == '__main__':
    from pprint import pprint as _ # This modules makes printing stuff pretty
    path = r'path/to/image.jpg'
    metadata = getMetadata(path)
    # print out the metadata
    _(metadata)

If you want, you can download this image and test it out. It’s the local rodeo from where I live. It has some metadata.

The pprint library is something I use when I’m printing stuff to the screen. It just makes the output ‘pretty’, just like it’s name. Next we give the path to where the image is located, pass that to the function and then print the metdata we received. This function serves well if you just want to see what metadata you have in your image, but what if you want to create a new image without the metadata? The next function we will build takes care of that. Let’s make it

def createCleanImage(file_path):
    '''
    Create a metadata free image from image path given
    :param file_path: path of file to create a 'clean' copy.
    :return: path of 'clean' image
    '''
    # Open Image
    img = Image.open(file_path)

    # Next 3 lines strip metadata
    data = list(img.getdata())  # Get pixels
    # Create image with same size and mode only
    clean_img = Image.new(img.mode, img.size)
    # Put the pixels in here
    clean_img.putdata(data)

    # Save file with 'CLEAN_' appended to the font of file name
    path, filename = os.path.split(file_path)
    name, extension = filename.split('.')
    filename = "CLEAN_{name}.{ext}".format(name=name, ext=extension)

    path = os.path.join(path, filename)
    # quality=100, if you don't want your image to be compressed
    # If not passed, it uses the default compression ratio
    clean_img.save(path)

Here we are getting the pixels from the image we want, create a new template image with the same size and mode so everything fits together and then finally throw the pixel data into the new image. This way, no metadata is copied over. The next couple lines just create a path string where your file should be save. It’ll also be saved under a different name. The last line is the bit of code that actually saves the image to your computer. The save method takes in a couple of optional parameters, you can read them here, but basically if you don’t want your file to be compressed, pass in quality=100 else just leave it as is.

Lets test this out.

if __name__ == "__main__":
    from pprint import pprint as _
    path = r'/path/to/image/rodeo.jpg'
    metadata = getMetadata(path)
    _(metadata)
    # If image has metadata, lets create one that doesn't
    if metadata is not None:
        print("Creating clean image")
        createCleanImage(path)

If you run this you should be able to see the metadata the file contained and then after a new image created in the same directory. If you check the metadata on the new image you will see that it won’t have any. Also, if you din’t pass quality=100 you should be able to see the size of image changed drastically. It was 2.2MB and now it’s only .5MB, this happened because of the defaul compresion that happens with jpg. There you go, you now know how to create metadata free images! unfortunately, I don’t think that you are able to edit the metadata from Pillow. From what I’ve read, you would have to use a library called pexif.