In this post, I’m going to use all of these tools to build an image processing system that manipulates two images and combines them into a final image, completely automatically.
Use case: Putting sunglasses on your face
Imagine that you work for an e-commerce site selling sunglasses. You’ll have thousands of images of frames, but when it comes down to it – it doesn’t really matter how cool the sunglasses look by themselves – people want to know what they look like on their own face:
So – wouldn’t it be cool to build a tool where users upload their own image – and then the sunglasses are superimposed on their face? And the process is completely automatic with no human interaction? Let’s give it a go!
Manipulating the Sunglass Images
Let’s start by grabbing an image of a pair of sunglasses:
Ok, there are a couple of things we need to fix here. First, the white background has to go. Then, we need to remove the temples from behind the lens, because that would just look weird if you superimposed that on a face:
And finally, we should add some transparency so you can see your eyes through the lenses.
Identifying the Temples in the Image
In Part 2 of this series, I used Google Cloud’s vision platform to build an image object detection model. We’re going to do that again – to identify the temples of the sunglasses. So I uploaded 36 images of sunglasses to the platform, and I manually identified the temples of the sunglasses:
You may notice that I did not just grab the entire temple with one rectangle – and the reason for this is that I want to minimise the amount of frame that I capture in the identification boxes.
I can then train this model to identify the temples of the glasses. The model works really well – but only when it detects an even number of regions. It turns out that logos on the lenses can break the model (here is a pair of Ray-Bans where the upper-left part of the temple is not IDed):
Removing the Temples
Now that we have IDed the areas with the temples, I’d like to remove them from the photo. We can do this using inpainting (which is included in the Python OpenCV library).
Inpainting takes a region of the image (the boxes IDed by my object detection model), and paints the inside of the region based on the colors on the perimeter of the region. This really only works well for images with uniform colors on the edge of the region. For example – when I try to remove the finger over the lens with inpainting, the sky area is inpainted well, but the regions with buildings look horrible:
Luckily, the lenses of sunglasses are uniform in color, so the inpainting algorithm works remarkably well. (This is also why I used multiple regions to avoid the frames: capturing a corner of the frame would adversely affect the inpainting step.)
So for each box, I can apply OpenCV’s Navier-Stokes inpainting algorithm (cv2.INPAINT_NS), and then do some blurring to smooth the inpainted region:
As long as the object detection tool finds an even number of regions, this actually works very well:
If you look closely at a larger version of the image, you can see the rectangles where the inpainting occurred, but when we reduce the image size, they are much harder to see:
Adding transparency and removing the background
Now that we have removed the temples, we’re nearly there! All that’s left is to remove the background and add some transparency to the glasses. I can do this in one step while also uploading the image to Cloudinary:
I force the image to be a PNG, and make it 70% transparent. This makes the lenses a bit transparent – you can see your eyes through them, making the effect more realistic. I also use Cloudinary’s background removal tool to remove the white background from the image.
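One plausible way to express these transformations as a Cloudinary delivery URL is sketched below. The `e_bgremoval` effect and `o_30` opacity value are my assumptions for illustration (Cloudinary’s `o_` parameter sets opacity, so 70% transparent means 30% opaque); the post’s actual upload settings may differ:

```python
def sunglasses_url(cloud_name, public_id):
    """Build a hypothetical Cloudinary delivery URL that converts the
    image to PNG, strips the white background, and adds transparency."""
    # f_png       -> force PNG output (JPEG cannot hold an alpha channel)
    # e_bgremoval -> Cloudinary's white-background removal effect
    # o_30        -> 30% opacity, i.e. 70% transparent lenses
    transformation = "f_png,e_bgremoval,o_30"
    return (f"https://res.cloudinary.com/{cloud_name}"
            f"/image/upload/{transformation}/{public_id}")
```

Requesting this URL makes Cloudinary apply all three steps on delivery, so no local image editing is needed.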
Trying on the glasses
We now have images of glasses where I used object detection and inpainting to remove the temples, and Cloudinary to remove the background. Now we’re ready to add them to a face! If a user were to upload a photo, we would have no idea where the face would be.
In Part 1 of this series I used Cloudinary’s g_auto parameter to crop around the main object in the image. There is also a g_face parameter that will identify a face, and allow you to crop around it:
This makes a 1500×1500 image cropped around my face. We apply this to make sure that the face is front and center (and so are the sunglasses).
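A minimal sketch of that face-centered crop as a delivery URL, using Cloudinary’s `c_fill` crop mode with `g_face` gravity (the crop mode is my assumption; the 1500×1500 size comes from the post):

```python
def face_crop_url(cloud_name, public_id, size=1500):
    """Build a Cloudinary URL that crops a square around the detected face."""
    # g_face centers the crop on the detected face; c_fill resizes the
    # result to exactly size x size while keeping the face in frame.
    transformation = f"c_fill,g_face,w_{size},h_{size}"
    return (f"https://res.cloudinary.com/{cloud_name}"
            f"/image/upload/{transformation}/{public_id}")
```

Because the face detection runs server-side, the same URL pattern works for any uploaded photo.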
Applying the Glasses
Now to add the sunglasses. I want to make sure that they are placed over my eyes, and Cloudinary has a transformation that will identify the region with eyes: “g_adv_eyes.”
When I uploaded the sunglasses image, they were given a label with the filename, so I can apply the following transformation:
The red text crops the image of me (as above). The purple transformation grabs the image with label 6558 and sets it to 80% opacity (20% transparent). The green transformation applies the sunglasses to the parent image, using the g_adv_eyes parameter to place them over my eyes.
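Assembled as one chained delivery URL, the three pieces might look like the sketch below. It follows Cloudinary’s layer syntax (an `l_` component opens the overlay and `fl_layer_apply` with a gravity closes and positions it); note that g_adv_eyes requires Cloudinary’s Advanced Facial Attributes Detection add-on, and any overlay resizing is omitted here:

```python
def try_on_url(cloud_name, selfie_id, glasses_id="6558"):
    """Build a Cloudinary URL that overlays the glasses on a face photo."""
    crop = "c_fill,g_face,w_1500,h_1500"  # center the selfie on the face
    overlay = f"l_{glasses_id},o_80"      # pull in the glasses at 80% opacity
    place = "fl_layer_apply,g_adv_eyes"   # anchor the overlay on the eyes
    return (f"https://res.cloudinary.com/{cloud_name}"
            f"/image/upload/{crop}/{overlay}/{place}/{selfie_id}")
```

Swapping `glasses_id` for another frame’s label is all it takes to try on a different pair.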
And there we have it – AI and machine learning used to automatically manipulate images of glasses, remove unwanted objects, and place the glasses on a person’s face!
In this post, I was able to train a machine learning model to identify the temples on images of sunglasses, and then use inpainting methods in Python to remove them from the image. I could then apply a transparent image of the sunglasses to any face using Cloudinary’s face and eye identification tools.
AI and ML image manipulation is still in its infancy, but the tools that are available are extremely powerful and have so many potential applications that your imagination is really the limit. If you have an idea of how you’d like to use object identification models or AI in your images, contact me, and let’s make it happen!