Using Screenshots in Production

While looking at WebPageTest waterfalls, I often come across surprising details.  A few weeks back, I came across a page that had images with a familiar file naming convention:

This website is using the Mac screenshot tool to create images that they are using online!

 

Image Optimization

I’ve written a lot about image optimization: GIFs, Base64 encoding and more (read more in the history of this blog :). My gut feeling was that websites that use screenshots are not optimizing those images for the web.  So I decided to find out.

 

Are Screen Shot Images On The Web Optimized?

The HTTP Archive dataset for January 2019 has data on the load characteristics of ~4M mobile websites.  Searching the filenames of all requests made for those sites, I discovered that 36k (~1%) of pages have an image with “screen shot” in the filename (112k discrete files).  To determine whether they were optimized, I selected 1500 images with the term “screen shot” in the filename. To keep the data timely, I only used images that also had ‘2019’ in the filename.
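The filename filter described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the request URLs have already been exported into a plain list; the function name and the sample URLs are hypothetical, and this is not the actual query run against the HTTP Archive.

```python
from urllib.parse import unquote, urlsplit


def is_recent_screenshot(url: str, year: str = "2019") -> bool:
    """Return True if the URL's filename contains 'screen shot' and the given year."""
    # Take the last path segment and decode escapes such as %20 back to spaces.
    filename = unquote(urlsplit(url).path.rsplit("/", 1)[-1]).lower()
    return "screen shot" in filename and year in filename


# Hypothetical sample URLs, for illustration only.
urls = [
    "https://example.com/img/Screen%20Shot%202019-01-03%20at%2010.15.02%20AM.png",
    "https://example.com/img/Screen%20Shot%202017-06-11%20at%204.01.55%20PM.png",
    "https://example.com/img/logo.png",
]
selected = [u for u in urls if is_recent_screenshot(u)]
```

Only the first sample URL passes both checks: the second is a screenshot from 2017, and the third is not a screenshot at all.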

To create optimized versions of the images in my dataset, I used Cloudinary’s fetch feature, which allows Cloudinary to optimize images hosted on remote servers.  I used the f_auto and q_auto parameters to optimize the format and quality of the images.   For requests made with curl, f_auto converted the PNGs to JPEG, and q_auto then found the optimal size/quality using SSIM.

Original: https://example.com/screenshot.png

Optimized: https://res.cloudinary.com/demo/image/fetch/f_auto,q_auto/https://example.com/screenshot.png
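Building the optimized URL for every image in the list is just string assembly following the pattern above. Here is a minimal sketch: the function name is hypothetical, "demo" is the cloud name from the example URL, and escaping spaces in the remote URL is my assumption about what a fetch URL needs when a filename contains them.

```python
from urllib.parse import quote


def cloudinary_fetch_url(remote_url: str, cloud_name: str = "demo") -> str:
    """Wrap a remote image URL in a Cloudinary fetch URL with f_auto,q_auto."""
    # Assumption: escape spaces and similar characters, but leave ':', '/'
    # and any existing '%' escapes untouched so nothing is double-encoded.
    escaped = quote(remote_url, safe=":/%")
    return f"https://res.cloudinary.com/{cloud_name}/image/fetch/f_auto,q_auto/{escaped}"


optimized = cloudinary_fetch_url("https://example.com/screenshot.png")
```

For the example image, this reproduces the optimized URL shown above.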

To get the file size of each of these images, I used curl to create a csv with the image url and the size downloaded:

xargs -n 1 curl --write-out '%{url_effective},%{size_download}\n' --silent --output /dev/null < 1500screenshots.txt >results.txt

Of 1289 results, 11% of the optimized images were within 10% of the original size, so we will call those originals optimized.  However, 86% of the images were reduced by at least 50%, and 73% could be made at least 75% smaller than the original. This confirms my hypothesis that files with the term “screen shot” in the file name are generally not optimized for web delivery.
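The percentages above come from comparing each original size with its optimized size. A minimal sketch of that arithmetic, assuming the two curl runs have been joined into (original bytes, optimized bytes) pairs; the function name and the sample numbers are hypothetical, not values from my dataset.

```python
def reduction_stats(pairs):
    """Summarize savings for an iterable of (original_bytes, optimized_bytes)."""
    # Fraction saved per image: 0.6 means the optimized file is 60% smaller.
    savings = [1 - optimized / original for original, optimized in pairs if original > 0]
    if not savings:
        raise ValueError("no usable size pairs")

    def share(predicate):
        # Percentage of images whose saving satisfies the predicate.
        return 100 * sum(1 for s in savings if predicate(s)) / len(savings)

    return {
        "within_10_percent": share(lambda s: abs(s) <= 0.10),
        "at_least_50_percent_smaller": share(lambda s: s >= 0.50),
        "at_least_75_percent_smaller": share(lambda s: s >= 0.75),
    }


# Hypothetical sizes in bytes, for illustration only.
sample = [(1000, 950), (1000, 400), (1000, 200), (1000, 100)]
stats = reduction_stats(sample)
```

Note that the 75% bucket is a subset of the 50% bucket, which is why the published percentages do not add up to 100.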

Screen Shot 2019-02-09 at 9.58.36 PM.png

Aside: All the images in this post are screen shots.

But with ImageOptim installed on my Mac, I can just “right click” – choose ImageOptimize before uploading to WordPress.  One extra click – and all the images are optimized.

 

Fun with Regex: Dates and Times in Screenshots

So, there are a lot of unoptimized images out there – this is not a huge surprise.  But we have some cool extra data here – the exact moment the image was captured.  So – when are these screen captures made?  Using some fancy regular expressions, we can run the numbers:
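The regular expression does not have to be very fancy. Here is a minimal sketch, assuming the default Mac naming pattern with a 12-hour clock ("Screen Shot YYYY-MM-DD at H.MM.SS AM/PM"); the names below are hypothetical, and filenames from Macs set to a 24-hour clock would not match this pattern.

```python
import re
from urllib.parse import unquote

# Matches e.g. "Screen Shot 2019-02-09 at 9.58.36 PM.png"
SCREENSHOT_RE = re.compile(
    r"Screen Shot (\d{4})-(\d{2})-(\d{2}) at (\d{1,2})\.(\d{2})\.(\d{2}) (AM|PM)"
)


def parse_screenshot_name(name: str):
    """Extract the capture date and time from a screenshot filename or URL."""
    # URLs usually carry the spaces as %20, so decode before matching.
    match = SCREENSHOT_RE.search(unquote(name))
    if match is None:
        return None
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    return {
        "year": year, "month": month, "day": day,
        "hour": hour, "minute": minute, "second": second,
        "meridiem": match.group(7),  # "AM" or "PM"; the hour is still 12-hour
    }


parsed = parse_screenshot_name("Screen%20Shot%202019-02-09%20at%209.58.36%20PM.png")
```

The hour is returned exactly as it appears in the filename, so it still needs the AM/PM suffix to be interpreted correctly.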

Hours that screenshots are taken:

Screen Shot 2019-02-09 at 10.26.36 AM.png

 

Most screenshots are taken between 9AM and 6PM.

Note: While it looks like there is a huge drop during lunch and a big spike at midnight, I think this is due to my data processing – adding 12 to any time with the “PM” suffix (12:30 PM becomes 24:30 – maths can be hard).
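For reference, the conversion that avoids this artifact treats 12 as a special case: 12 AM is hour 0 and 12 PM stays hour 12. A minimal sketch with a hypothetical function name; this is the fix, not the processing that produced the chart above.

```python
def to_24_hour(hour_12: int, meridiem: str) -> int:
    """Convert a 12-hour clock hour plus AM/PM into a 0-23 hour."""
    if not 1 <= hour_12 <= 12:
        raise ValueError(f"not a 12-hour clock value: {hour_12}")
    # 12 wraps to 0 first, so 12 AM -> 0 and 12 PM -> 0 + 12 = 12.
    hour = hour_12 % 12
    if meridiem.upper() == "PM":
        hour += 12
    return hour


noon = to_24_hour(12, "PM")
midnight = to_24_hour(12, "AM")
```

With this version, a 12:30 PM screenshot lands in the noon bucket instead of spilling over to hour 24.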

Day of Week

Screen Shot 2019-02-09 at 10.31.27 AM.png

Mostly during the week, and Friday is slightly lower than M-Th.
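Once the date is pulled out of the filename, the day of the week comes straight from the calendar. A minimal sketch, assuming the dates are already available as (year, month, day) tuples; the function name and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def weekday_counts(dates):
    """Count screenshots per day of week for an iterable of (year, month, day)."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter(DAY_NAMES[date(y, m, d).weekday()] for y, m, d in dates)
    # Return every day, including those with no screenshots.
    return {name: counts.get(name, 0) for name in DAY_NAMES}


# Hypothetical sample dates, for illustration only.
sample_dates = [(2019, 2, 9), (2019, 2, 8), (2019, 2, 4)]
by_day = weekday_counts(sample_dates)
```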

What year are screenshots taken?

This actually surprised me a bit: for a dataset taken in early January, already 14% of all screenshots were from 2019, and nearly 70% were from 2018 or later.  Of course, that means that about 30% of screenshots on the web are from 2017 or earlier.

Screen Shot 2019-02-09 at 10.39.42 AM.png

We can further see how recently the screenshots were taken.  Here is the count of screenshots by day for the last 2 years.

Screen Shot 2019-02-09 at 11.04.39 AM.png

The numbers are not terribly important, but we can see that most of the screenshots are recent – and a huge number were taken in January – while the dataset was being collected! 🙂

 

Conclusion

There are many websites that are using screen capture on the Mac as an essential part of their image processing pipeline for regular updates on the web.  However, very few of these images have any optimizations performed (86% of images can be reduced in size by at least 50%).  Based on this, it is a reasonable assumption that if your website is using Screen Shot images, you have some optimization work to do.
