Too Many Redirects on Load Time

One of the main culprits for slowly loading pages is the presence of redirects.  The site tells the browser the content has moved, and a second request has to be made for the content to actually be downloaded.

A few years ago, I attempted to create a query to discover the webpage in the HTTP Archive with the highest number of redirects before the first HTML response. I was never totally satisfied with the results, and I keep coming back to it from time to time to identify the biggest culprits.

I think I have it working now.  The inner query (at the bottom) identifies the requestid of the “firsthtml” response for each page.  I then join it to another query for all requests with the same pageid (meaning the same page load), but with a requestid smaller than that of the firsthtml request.  I further limit the results to non-200 HTTP responses (about 95% of these are 301 and 302 redirects).

 

    select allreq.pageid, requestid, status
    from httparchive:summary_requests.2018_10_15_mobile allreq
    join(
      //gets firsthtml for pageid
      select pageid, requestid as htmlreq, firsthtml
      from httparchive:summary_requests.2018_10_15_mobile
      where firsthtml=true
    ) firsthtmlreq
    on (allreq.pageid = firsthtmlreq.pageid)
    where allreq.requestid < htmlreq and allreq.status != 200
    order by allreq.pageid desc, requestid asc

 

By counting them all, and joining to the url of the page from the summary_pages table, the final query looks like this:

    select allreq.pageid, count(allreq.pageid) cnt, pages.url
    from(
      select allreq.pageid, requestid, status
      from httparchive:summary_requests.2018_10_15_mobile allreq
      join(
        //gets firsthtml for pageid
        select pageid, requestid as htmlreq, firsthtml
        from httparchive:summary_requests.2018_10_15_mobile
        where firsthtml=true
      ) firsthtmlreq
      on (allreq.pageid = firsthtmlreq.pageid)
      where allreq.requestid < htmlreq and allreq.status != 200
      order by allreq.pageid desc, requestid asc
    ) redirects
    join
    (select url, pageid
      from httparchive:summary_pages.2018_10_15_mobile) pages
    on (pages.pageid = redirects.pageid)
    //where allreq.pageid=35053890
    group by allreq.pageid, status, pages.url
    order by cnt desc
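To sanity-check the logic without touching BigQuery, here is a minimal sketch that runs an equivalent query against a toy in-memory SQLite database from Python. Everything in it is an assumption made for illustration: the rows, the example URLs, the shortened table names, and the rewrite into a standard SQL dialect. It also groups by page only (not by status), so each page gets a single total, which is a slightly different choice from the query above.

```python
import sqlite3

# Toy data (hypothetical). A lower requestid means an earlier request on the page.
REQUESTS = [
    # (pageid, requestid, status, firsthtml)
    (1, 10, 301, 0),
    (1, 11, 302, 0),
    (1, 12, 200, 1),  # first HTML response for page 1
    (1, 13, 200, 0),  # a resource loaded after the HTML
    (2, 20, 200, 1),  # page 2 has no redirects
    (3, 30, 302, 0),
    (3, 31, 200, 1),
]
PAGES = [
    (1, "http://a.example/"),
    (2, "http://b.example/"),
    (3, "http://c.example/"),
]

QUERY = """
SELECT pages.url, COUNT(*) AS cnt
FROM summary_requests AS allreq
JOIN (
    -- gets the first HTML request for each pageid
    SELECT pageid, requestid AS htmlreq
    FROM summary_requests
    WHERE firsthtml = 1
) AS firsthtmlreq ON allreq.pageid = firsthtmlreq.pageid
JOIN summary_pages AS pages ON pages.pageid = allreq.pageid
WHERE allreq.requestid < firsthtmlreq.htmlreq
  AND allreq.status != 200
GROUP BY allreq.pageid, pages.url
ORDER BY cnt DESC
"""


def redirect_counts():
    """Return (url, count of non-200 requests before the first HTML) per page."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE summary_requests "
        "(pageid INTEGER, requestid INTEGER, status INTEGER, firsthtml INTEGER)"
    )
    conn.execute("CREATE TABLE summary_pages (pageid INTEGER, url TEXT)")
    conn.executemany("INSERT INTO summary_requests VALUES (?, ?, ?, ?)", REQUESTS)
    conn.executemany("INSERT INTO summary_pages VALUES (?, ?)", PAGES)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for url, cnt in redirect_counts():
        print(cnt, url)
```

Pages with no redirects before the first HTML (page 2 in the toy data) simply drop out of the result, because the inner join finds no matching rows for them.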

 

There are a few false positives, but a number of interesting trends that we can quickly find in the results.

And the Winner is:

The “winner” (it's really hard to call a website with a lot of redirects a winner) has 13 redirects before successfully requesting HTML. It actually fails when re-run in WebPageTest with the error “Too Many redirects”.

[Screenshot]

But what is the site actually doing? Let’s take a look in devTools:

  1. https://www.site.com redirects to https://www.site.com/en-US, as the site is not originally in English.
  2. The www English site sees that I am testing on a mobile device, so it redirects me to http://m.site.com
    1. Now here, they are doing something correctly.  They are using “Upgrade-Insecure-Requests:1” to enforce HTTPS on all pages.
    2. However, you’ll notice that they redirected to an http:// site.
  3. http://m.site.com redirects to https://m.site.com, because of the directive to enforce HTTPS.
  4. Now, the HTTPS m-dot site recognizes that the browser is set to English, so it redirects to https://m.site.com/en-US
  5. We now enter a loop of sorts: the English site redirects to http://m.site.com (and we repeat steps 2-5 several times) before the page fully loads.
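
The chain above can be walked through offline with a small sketch. The redirect table is a hypothetical stand-in modelled on the steps listed (it is not captured from the real site), `follow` is an illustrative helper rather than anything from a library, and the default of 20 hops is an assumed cutoff. A real browser does not stop at the first repeated URL the way this sketch does, since cookies can change the outcome on a later pass; it just gives up after too many hops.

```python
# Hypothetical redirect table modelled on the steps above:
# each URL maps to the Location it redirects to.
REDIRECTS = {
    "https://www.site.com": "https://www.site.com/en-US",
    "https://www.site.com/en-US": "http://m.site.com",
    "http://m.site.com": "https://m.site.com",
    "https://m.site.com": "https://m.site.com/en-US",
    "https://m.site.com/en-US": "http://m.site.com",  # back to the http host: a loop
}


def follow(url, redirects, max_hops=20):
    """Follow redirects starting at url and return (hops, outcome).

    hops is the list of URLs visited, starting with url itself.
    outcome is "ok" when a URL with no redirect is reached, "loop" when a
    URL is visited a second time, and "too many redirects" when max_hops
    redirects have been followed without reaching a final URL.
    """
    hops = [url]
    seen = {url}
    while url in redirects:
        if len(hops) - 1 >= max_hops:
            return hops, "too many redirects"
        url = redirects[url]
        hops.append(url)
        if url in seen:
            return hops, "loop"
        seen.add(url)
    return hops, "ok"


if __name__ == "__main__":
    hops, outcome = follow("https://www.site.com", REDIRECTS)
    print(outcome, "after", len(hops) - 1, "redirects")
    for hop in hops:
        print("  ", hop)
```

Keeping the table separate from the walker makes it easy to try a fix, for example pointing https://www.site.com/en-US straight at https://m.site.com/en-US and removing the last entry, and see how many hops disappear.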

On an emulated 3G connection, we see that the redirects add about 10.5 seconds to the time to first byte.

[Screenshot: redirects on an emulated 3G connection]
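
As a rough back-of-the-envelope check on why a chain like this hurts so much on mobile, the sketch below estimates the delay that redirects add. The cost model is a simplification I am assuming for illustration: one round trip per redirect, plus a few extra round trips for DNS, TCP and TLS whenever a hop lands on a different scheme or host. The 300 ms round-trip time is an assumed figure for a slow mobile link, not a measurement from this test, and `estimate_redirect_delay` is a made-up helper name.

```python
from urllib.parse import urlsplit


def estimate_redirect_delay(chain, rtt_ms=300, new_origin_rtts=3):
    """Estimate the extra time (in ms) spent on redirects in a chain.

    chain is the list of URLs visited in order, ending with the final page.
    Each redirect costs one round trip for the request and its 3xx response.
    When a hop changes scheme or host, new_origin_rtts extra round trips are
    added for connection setup (an assumed, simplified figure that ignores
    the fact that plain http needs no TLS handshake).
    """
    total = 0
    for current, nxt in zip(chain, chain[1:]):
        total += rtt_ms  # the redirect response itself
        a, b = urlsplit(current), urlsplit(nxt)
        if (a.scheme, a.netloc) != (b.scheme, b.netloc):
            total += new_origin_rtts * rtt_ms  # new connection needed
    return total


if __name__ == "__main__":
    chain = [
        "https://www.site.com",
        "https://www.site.com/en-US",
        "http://m.site.com",
        "https://m.site.com",
        "https://m.site.com/en-US",
    ]
    print(estimate_redirect_delay(chain), "ms")
```

Under these assumptions, hops that switch host or scheme cost several times more than hops that stay on the same origin, which is why bouncing between www, m, http and https is particularly painful.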

 

Trends in the Data

 

Logout Before Login

There are a lot of login pages on the internet.  Many of the pages with a lot of redirects are sites that want to ensure you are completely logged out of every domain before logging you in. The waterfalls look like this:

[Screenshot: waterfall with logout requests to the portal, talent, and onboarding subdomains]

The page cycles through all the domains you might be logged into, and logs you out of each.  In the above screenshot, we see requests to the subdomains portal, talent, and onboarding; all are logged out, and then you can commence logging in.  This appears to be the major method used for many e-commerce and e-mail login systems (as there are many in the results).  Sometimes these requests are cached on subsequent visits, but many sites have the same number of redirects on subsequent visits.

Journals

All of the Elsevier Health Journals use the same initial load setup.  As above, the sites first check whether the user is logged in properly. Then, several cookies are added serially (each with its own 302 redirect). As a result, all of these journals have 7 redirects on initial load:

[Screenshot: the 7 redirects on initial load]

Finally, one of the top offenders in the data is CVS.com.  In testing with WebPageTest, I initially saw no redirects (and sometimes the HTTP Archive catches a site in a funny state that no longer exists). But testing in my own browser (in the EU), the site did fail.  It turns out that the redirect to http://www.cvs.com/international.html begins an infinite loop that quickly fails in Chrome:

[Screenshot: the redirect loop failing in Chrome]

 

Conclusion

Using multiple redirects on page load increases the time to first byte, and can seriously affect the load time of your website (especially on mobile!).  Working to optimize login procedures (minimizing the number of logouts before logging in) can speed up the content appearing on the page.

Finally, test every version of your website. If you redirect to an English version or an international version, those visitors may not be your main target audience, but since you built the page, you should ensure that it loads without an infinite number of redirects.
