I blog a lot about data that I find in the HTTP Archive, and how I apply that data to improve websites. One interesting thing that I continually come across in the data is files with names like “optimized.jpg” or “/images/optimized/small.jpg” that, when I look at them, are not optimized for delivery (meaning that they are way too big).
That got me thinking – how many files in the HTTP Archive have the term ‘optimized’ in the URL? (Oh, and hello my British friends, I also included “optimised” in my query, but it appears much less frequently.) There are also a lot of very small tracking GIF files with “optimized” in the URL, so I limited my search to files over 500 bytes. Finally, I broke the data down by file extension, so we can see how many of the files are images or text or videos:
ext | count optimized |
js | 11794 |
jpg | 5937 |
png | 5891 |
php | 3844 |
css | 3747 |
(none) | 1366 |
svg | 526 |
gif | 309 |
jpeg | 184 |
woff | 156 |
mp4 | 123 |
JPG | 115 |
woff2 | 81 |
tif | 70 |
PNG | 54 |
ashx | 53 |
ico | 43 |
webp | 40 |
ttf | 19 |
html | 18 |
jsonp | 11 |
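The filtering behind this table can be sketched in a few lines of Python. This is only an illustration, not the query I actually ran: the function names, the `(url, size)` input shape, and the case-insensitive match are all assumptions made for the example. Extensions are kept case-sensitive, which is why “jpg” and “JPG” show up as separate rows.

```python
import posixpath
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches both spellings; case-insensitive matching is an assumption.
OPTIMIZED = re.compile(r"optimi[sz]ed", re.IGNORECASE)

# Skip tiny tracking files, per the 500-byte cutoff described above.
MIN_BYTES = 500


def extension(url):
    """Return the extension of the URL path without the dot ('' if none)."""
    path = urlsplit(url).path  # drops the query string and fragment
    return posixpath.splitext(path)[1][1:]


def count_optimized_by_extension(requests):
    """Count 'optimized' URLs over MIN_BYTES, grouped by file extension.

    `requests` is a hypothetical iterable of (url, size_in_bytes) pairs.
    """
    counts = Counter()
    for url, size in requests:
        if size > MIN_BYTES and OPTIMIZED.search(url):
            counts[extension(url)] += 1
    return counts
```

Calling `count_optimized_by_extension(rows).most_common()` on such pairs would give rows sorted by count, in the same shape as the table above.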
I can also obtain the median size of these files. But just knowing the size does not mean much; I want to know if these files are actually optimized. That means we need to compare these files to the general population.
Are They Optimized?
Now, I am just doing a comparison to other files in the HTTP Archive dataset. The files listed as optimized may actually have been optimized (as in “you should have seen this file before!”) and are now a lot slimmer. These optimized files may also exist on pages that are more complicated than the typical website, and thus they might be expected to be larger.
But, just for fun – let’s see if our optimized files are smaller than the general population. We can compare the percentiles of optimized files to the general population. A positive value indicates that the optimized files are smaller than the general population – and the percentage tells us how optimized they are. In the chart below, we see that JavaScript, png, PHP and HTML files that are named “optimized” are generally smaller than all files.
However, it is pretty clear that jpg, CSS, svg, gif, jpeg, woff and mp4 files are all larger when the term ‘optimized’ is present. I had to cut off GIF, MP4 and ASHX files to keep the scale of the chart reasonable to read.
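For anyone who wants to reproduce this kind of comparison on their own data, here is a minimal sketch of the arithmetic. The exact formula is an assumption on my part for this example: it takes the same percentile from both groups and reports how much smaller the “optimized” value is, as a percentage of the general population value. The helper names and the nearest-rank percentile method are also just choices for this illustration.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list (p between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def percent_smaller(optimized_sizes, all_sizes, p=50):
    """Percent by which 'optimized' files are smaller at percentile p.

    Positive: the optimized files are smaller than the general population.
    Negative: the optimized files are larger.
    """
    base = percentile(all_sizes, p)
    opt = percentile(optimized_sizes, p)
    return (base - opt) / base * 100
```

With sizes of 100, 200 and 300 bytes for the optimized files and 200, 400 and 600 bytes for all files, the medians are 200 and 400, so `percent_smaller` returns 50.0. If the optimized files had a median of 900 bytes against the same population, the result would be -125.0, which is the kind of bar that blows out the scale of a chart.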
Conclusion
Do not take URL terms at face value – trust, but verify. If it says “optimized”, that does not mean that the file has been optimized for size. If an image has ‘width_1000’ in the URL path, I check the actual width anyway. Changing URL names is (marginally) easier than re-encoding, and sometimes only one of the two steps is taken.