The dumb reason your fancy Computer Vision app isn’t working: Exif Orientation, Hacker News



Adam Geitgey

I’ve written about lots of computer vision and machine learning projects like object recognition systems and face recognition projects. I also have an open source Python face recognition library that is somehow one of the top 10 most popular machine learning libraries on Github. Together, that means that I get asked a lot of questions from people new to Python and computer vision.

In my experience, there is one technical problem that trips people up more often than any other. No, it’s not a complicated theoretical issue or an issue with expensive GPUs. It’s the fact that almost everyone is loading their images into memory sideways without even knowing it. And computers are less than excellent at detecting objects or identifying faces in sideways images.

How Digital Cameras Auto-Rotate Images

When you take a picture, the camera will sense which end you have tilted up. This is so the picture will appear in the correct orientation when you look at it again in another program:

But the tricky part is that your camera doesn’t actually rotate the image data inside the file that it saves to disk. Because image sensors inside digital cameras are read line-by-line as a continuous stream of pixel information, it’s easier for a camera to always save the pixel data in the same order no matter which way the camera was held.

It’s actually up to the image viewer application to rotate the image correctly before displaying it. Along with the image data, your camera also saves metadata about each picture – lens settings, location data, and of course, the camera’s rotation angle. The image viewer is supposed to use this information to display the image correctly.

The most common format for image metadata is called Exif (short for Exchangeable image file format). The Exif-formatted metadata is shoved inside the jpeg file that your camera saves. You can’t see Exif data as part of the image itself, but it is readable by any program that knows where to look for it.

Here’s the Exif metadata inside our Goose jpeg image as displayed by exiftool:

Notice the ‘Orientation’ data element. This tells the image viewer program that the image needs to be rotated 90 degrees counter-clockwise before being displayed on screen. If the program forgets to do this, the image will be sideways!
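If you want to check a file yourself, you can read the Orientation value from Python. This is only a minimal sketch, assuming the Pillow library is installed. The names `read_orientation` and `ORIENTATION_MEANINGS` are made up for this illustration, and the descriptions follow the wording that exiftool prints for each of the eight possible values:

```python
from PIL import Image

# Standard Exif/TIFF tag ID for "Orientation"
ORIENTATION_TAG = 0x0112

# The eight defined Orientation values, described the way exiftool prints them
ORIENTATION_MEANINGS = {
    1: "Horizontal (normal)",
    2: "Mirror horizontal",
    3: "Rotate 180",
    4: "Mirror vertical",
    5: "Mirror horizontal and rotate 270 CW",
    6: "Rotate 90 CW",
    7: "Mirror horizontal and rotate 90 CW",
    8: "Rotate 270 CW",
}


def read_orientation(source):
    """Return the Exif Orientation value of an image (1 if the tag is absent).

    `source` can be a file path or an open binary file object.
    """
    with Image.open(source) as img:
        return img.getexif().get(ORIENTATION_TAG, 1)
```

For example, `ORIENTATION_MEANINGS[read_orientation("my_file.jpg")]` gives a readable description of how a viewer is expected to transform the stored pixels. A value of 8 ("Rotate 270 CW") is the same thing as a 90 degree counter-clockwise rotation.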

Why does this break so many Python Computer Vision Applications?

Exif metadata is not a native part of the Jpeg file format. It was an afterthought taken from the TIFF file format and tacked onto the Jpeg file format much later. This maintained backwards compatibility with old image viewers, but it meant that some programs never bothered to parse Exif data.

Most Python libraries for working with image data like numpy, scipy, TensorFlow, Keras, etc., think of themselves as scientific tools for serious people who work with generic arrays of data. They don’t concern themselves with consumer-level problems like automatic image rotation – even though basically every image in the world captured with a modern camera needs it.

This means that when you load an image with almost any Python library, you get the original, unrotated image data. And guess what happens when you try to feed a sideways or upside-down image into a face detection or object detection model? The detector fails because you gave it bad data.

You might think this problem is limited to Python scripts written by beginners and students, but that’s not the case! Even Google’s flagship Vision API demo doesn’t handle Exif orientation correctly:

Google Vision’s API demo fails to rotate a portrait-oriented image captured with a standard cell phone.

And while Google Vision still manages to detect some of the animals in the sideways image, it detects them with a non-specific “Animal” label. This is because it is a lot harder for a model to detect a sideways goose than an upright goose. Here’s what Google Vision detects if the image is correctly rotated before being fed into the model:

With the correct image orientation, Google detects the birds with the more specific “Goose” label and a higher confidence score. Much better!

This is a super obvious problem if you can see that the image is sideways like in this demo. But this is where things get insidious: normally you can’t see it! Every normal program on your computer will only display the image in its properly rotated form, not the sideways form that is actually stored on disk. So when you open the image to figure out why your model isn’t working, it will be displayed the right way up and you still won’t know what’s wrong!

Finder on a Mac always displays images with Exif rotation applied. There is no way to see that the image data is actually sideways inside the file.

This inevitably leads to people posting issues on Github complaining that the open source projects that they are using are broken or the models aren’t very accurate. But the problem is so much simpler – they are feeding in sideways and/or upside-down images!

Fixing the Problem

The solution is that whenever you load images in your Python programs, you should check them for Exif Orientation metadata and rotate the images if needed. It’s pretty simple to do, but surprisingly hard to find examples of code online that does it correctly for all orientations.

Here is code to load any image into a numpy array with the correct rotation applied:
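One way to do it, as a minimal sketch that assumes NumPy and a reasonably recent version of Pillow are installed. It leans on Pillow’s `ImageOps.exif_transpose` helper instead of hand-written rotation logic, and `load_image_upright` is just an illustrative name, not necessarily the function published in any library:

```python
import numpy as np
from PIL import Image, ImageOps


def load_image_upright(source, mode="RGB"):
    """Load an image as a numpy array with its Exif Orientation applied.

    `source` can be a file path or an open binary file object.
    The function name and the `mode` parameter are illustrative choices.
    """
    with Image.open(source) as img:
        # exif_transpose returns a copy that is rotated/flipped according to
        # the Orientation tag (or an unchanged copy if there is no such tag).
        upright = ImageOps.exif_transpose(img)
        # Convert to a fixed color mode so the array always has shape
        # (height, width, channels) for the default "RGB" mode.
        return np.array(upright.convert(mode))
```

A quick sanity check is to compare the array’s shape with what you expect: a portrait photo should come back taller than it is wide, i.e. `array.shape[0] > array.shape[1]`.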

From there, you can pass the array of image data to any standard Python ML library that expects arrays of image data, like Keras or TensorFlow.

Since this comes up so often, I published this function as a library on pip called image_to_numpy. You can install it like this:

pip3 install image_to_numpy

You can use it in any Python program to load an image correctly, like this:

import matplotlib.pyplot as plt
import image_to_numpy

# Load your image file
img = image_to_numpy.load_image_file("my_file.jpg")

# Show it on the screen (or whatever you want to do)
plt.imshow(img)
plt.show()

Check out the readme file for more details.

Have fun!

