It is well established in the web performance community that compressing text files (with algorithms like Gzip or Brotli) makes them download faster. Gzip generally compresses text files to be 5-8x smaller, and Brotli’s compression is generally better than Gzip’s.
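Those ratios are easy to demonstrate with Python’s standard library. Here’s a minimal sketch, using a made-up, repetitive HTML-like sample (real pages compress well for the same reason: repeated tags and attribute names):

```python
import gzip

# Hypothetical sample text -- repetitive markup, like most real HTML/CSS/JS.
text = ("<div class='item'><span>Hello, world!</span></div>\n" * 500).encode("utf-8")

# Compress with gzip and compare sizes.
compressed = gzip.compress(text)
ratio = len(text) / len(compressed)

print(f"original:   {len(text):>7} bytes")
print(f"compressed: {len(compressed):>7} bytes")
print(f"ratio:      {ratio:.1f}x smaller")
```

A sample this repetitive compresses far better than the typical 5-8x, but the principle is the same: text shrinks dramatically under gzip, at essentially no cost.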
WebpageTest grades your website’s use of text compression at the top of every report:
If you are testing a mobile app, AT&T’s Video Optimizer runs a packet capture of all the files transferred, and identifies the files that were not compressed.
Unfortunately, in my experience, text compression is a test that many mobile applications fail. In the above chart, there are three files over 200 KB and two around 50 KB (921 KB of text files in total) that could have been compressed. We frequently find ourselves recommending that developers turn on text compression.
Last week, I did just that. I presented evidence to an application developer that they were not compressing the JSON files sent to their mobile app, and made the recommendation to start doing so. Their reply surprised me a little bit:
Now, I instinctively know that compressed files must download faster. Every round trip on mobile takes on the order of hundreds of milliseconds, and when files are 5-8x larger, they require many more round trips to deliver.
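To make that concrete, here’s a rough back-of-envelope sketch of how many round trips an idealized TCP slow start needs to deliver a file. The RTT and initial congestion window below are assumed values for illustration, not measurements:

```python
def round_trips(size_bytes, init_cwnd_segments=10, mss=1460):
    """Estimate TCP slow-start round trips to deliver size_bytes.
    Idealized model: the congestion window doubles each RTT, no loss."""
    cwnd = init_cwnd_segments * mss
    trips, delivered = 0, 0
    while delivered < size_bytes:
        delivered += cwnd
        cwnd *= 2
        trips += 1
    return trips

rtt_ms = 200  # assumed mobile round-trip time
for size_kb in (40, 320):  # an 8x difference, e.g. compressed vs. uncompressed
    trips = round_trips(size_kb * 1024)
    print(f"{size_kb:>4} KB -> {trips} round trips (~{trips * rtt_ms} ms)")
```

Under these assumptions, the 8x larger file needs more than twice the round trips, and every extra round trip adds a full RTT of delay.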
So, I knew this assertion to be incorrect. But, as I lay awake in bed the other night, I wondered “how much faster will the file download?” (ASIDE: This is normal right? To ponder web performance whilst trying to go to sleep? Should I be counting WOFF files or something?)
How Much Faster Are Files Downloaded When Compressed?
How can we determine this? The HTTPArchive collects WebPageTest data on the top 1M mobile websites every two weeks, and all of the data is available on BigQuery to examine in detail. So, I started there. In the table of all mobile responses, we can determine whether or not Gzip is utilized, the download size, and the time it took to download each file. But if you look more carefully, you’ll also find the column “_gzip_save”, which tells you how much data would have been saved if the file *had* used gzip compression.
In this query, I am only looking at files with potential savings (I force _gzip_save>0):
SELECT time, resp_content_encoding, type, respSize, _gzip_save,
       respSize - _gzip_save potentialSize
FROM httparchive.runs.latest_requests
WHERE (type CONTAINS "text" OR type CONTAINS "html" OR type CONTAINS "css"
       OR type CONTAINS "js" OR type CONTAINS "JSON")
  AND respSize > 1000
  AND _gzip_save > 0
potentialSize is simply the size of the download minus the potential savings (_gzip_save). The results appear below:
It is pretty clear that respSize - _gzip_save = potentialSize gives us the file size the server would have delivered if Gzip compression had been turned on. If I apply this math to files that have already been compressed, respSize will equal potentialSize (because the _gzip_save term is 0).
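In Python terms, the per-row arithmetic looks like this (the sample values are made up for illustration):

```python
def potential_size(resp_size, gzip_save):
    """Size the response would have been with gzip enabled.
    For already-compressed files, gzip_save is 0, so the size is unchanged."""
    return resp_size - gzip_save

# Hypothetical rows: (respSize, _gzip_save)
print(potential_size(200_000, 160_000))  # uncompressed file -> 40000
print(potential_size(40_000, 0))         # already gzipped   -> 40000
```

Both hypothetical files end up the same size, which is exactly the comparison the rest of this analysis relies on.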
Now, let’s group the data by whether we use compression (gzip, brotli or none); and the potentialSize (<2, <5, <10, <25, <50, <75, <100, >100 KB) categories. (Search is here)
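The bucketing itself is straightforward. Here’s a small sketch of how a potentialSize value (in KB) maps into those categories; the function name is mine, not part of the query:

```python
def size_bucket(kb):
    """Assign a potentialSize (in KB) to the buckets used in the study."""
    for limit in (2, 5, 10, 25, 50, 75, 100):
        if kb < limit:
            return f"<{limit} KB"
    return ">100 KB"

# A few hypothetical file sizes:
for kb in (1.5, 30, 250):
    print(kb, "->", size_bucket(kb))
```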
In this study, I discarded the Brotli data, so I am only comparing gzip vs. not gzip. I end up with a big table of data to slice and dice. Let’s compare the number of files by their potential size and whether or not they are compressed:
It is pretty clear that most files are using some sort of compression (as we previously calculated, about 82% of all text files are zipped on desktop).
We also see that most text files are under 25 KB (whether compressed or not). Brotli has a really big bump in the 10-50 KB range, showing that for these files, the added compression must really make a difference. (In this chart, I excluded all files smaller than 1 KB, as they really skewed the axes.)
We can chart the other median values as well. Let’s look at file size. We have the median Brotli, gzipped, and not-gzipped sizes, plus a fourth line showing the POTENTIAL file size if the uncompressed files were zipped:
What we see here is that uncompressed files are generally a lot larger, but that if we apply a compression algorithm, the files will match the size of those already compressed (the blue, red, and gray lines are nearly identical). In general, applying gzip compression will give ~80% savings (slightly less for smaller files):
But What About TIME Savings?
We get it – compressed files are smaller. Of course, smaller files download faster – but HOW much faster? Let’s compare in a chart:
At all speeds, the median time to download is much faster for compressed vs. non-compressed text files. For files under 25 KB, the median compressed download is 40% faster, and for files over 25 KB, the median download is over 55% faster when compressed. The median savings for files between 50-75 KB is around 2.9 seconds! That’s a significant speedup!
The Brotli compressed files seem to always be faster than even those that are gzipped. In fact, for files > 10KB, the median Brotli files are 290-477ms faster than the median Gzip file. That’s 25-75% faster download time. This is certainly something that should be examined further.
Most text files are compressed for transport. But for those files that are not, we set out to quantify the potential savings. When we apply the _gzip_save data in HTTPArchive, we see that the larger uncompressed files would generally match the size of the compressed files.
We also see that the uncompressed files take longer to download, and applying compression would speed the download by 32-64%, eliminating hundreds of milliseconds to seconds of transfer time. One unexpected result was that despite similar median file sizes, the Brotli compressed files had a faster median download than those using gzip (25-75% faster).
Finally (and most importantly), test your website with WebPageTest or your mobile app with Video Optimizer. These tools are free, and they will help you make sure you are not missing simple performance wins like Text File Compression.