Quick check book to reduce page load time

by Thierry Notermans | Feb 18, 2022 | Application Performance, Articles

Thierry Notermans

Chief Product Officer & Chief Information Security Officer

Introduction

Nowadays, when you access digital services like SaaS applications or simply surf the net, you expect a high level of responsiveness, right? You typically expect to get useful information on your screen within 1 to 2 seconds. High-bandwidth network technologies like 5G only reinforce this expectation.

With that much bandwidth available (a theoretical peak of 20 Gbps for 5G), users expect instantaneous interactivity with all web applications. In reality, however, rendering a web page in a browser quickly becomes complex, and optimizing all the different pieces of the puzzle requires a certain level of knowledge.

This article does not claim to cover every technical implementation detail, but you can use it as a general guideline for reducing the load time of web pages.

The basic principles

When you access web applications or websites, you basically ask your browser to get data from one or more servers. Sometimes the server must process the browser’s requests before it returns any data; sometimes the browser has to perform some local processing before rendering the final data on the screen.

But in all cases, the same thing happens: communications between a browser and servers. What users finally see on their screen is nothing more than the result of all these communications.

To reduce page load time, you can work at three different levels:

You can try to minimize the number of communications required
You can try to optimize the sequence of required communications (when they should happen and how)
For each required communication, you can try to reduce its duration

The picture below illustrates these basic concepts:

In this example, sending one big request takes 800 ms in total
For the same amount of data, sending three consecutive smaller requests of 200 ms each takes 600 ms, which reduces the page load time by 200 ms
But using small requests is not the ultimate solution. Sometimes, using bigger requests (300 ms each in the third example) but sending them in parallel can be an even better choice, because the total time then approaches the duration of a single request

Of course, combining these techniques will be the best way to reduce the page load time.
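
The arithmetic behind these three scenarios can be sketched in a few lines of Python. This is an idealized model, not a measurement: it assumes that sequential requests simply add up and that parallel requests do not slow each other down. The function names are made up for this illustration.

```python
def sequential_time(durations_ms):
    """Requests sent one after another: their durations add up."""
    return sum(durations_ms)


def parallel_time(durations_ms):
    """Requests sent at the same time: the slowest one sets the total
    (assumes the requests do not compete for bandwidth)."""
    return max(durations_ms, default=0)


one_big = sequential_time([800])                 # one big request
three_small = sequential_time([200, 200, 200])   # three consecutive requests
three_parallel = parallel_time([300, 300, 300])  # three parallel requests

print(one_big, three_small, three_parallel)  # prints: 800 600 300
```

In practice, parallel requests share the same link, so the real gain is usually smaller than this model suggests.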

Let’s now discuss some techniques you can make use of to put these general concepts into practice.

Minimizing the number of communications

Fewer communications between your browser and the server mean less time spent transferring data on the network and fewer requests for the server to handle.

One obvious way to reduce the number of communications is to request more data at once, using techniques like bundling your assets (JavaScript, CSS, images). Webpack is a well-known solution for that. This kind of technique is especially useful in Single Page Applications (SPAs), where a lot of application logic is processed on the client side.
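
To get a feel for why bundling helps, here is a rough model, assuming that files are fetched one after another and that each request costs one round trip plus its transfer time. The file sizes, round-trip time and bandwidth are hypothetical values chosen for the illustration.

```python
def load_time_ms(file_sizes_kb, rtt_ms, bandwidth_kb_per_ms):
    """Sequential fetches: each request pays one round trip plus transfer time."""
    return sum(rtt_ms + size / bandwidth_kb_per_ms for size in file_sizes_kb)


files = [20] * 10        # ten 20 KB assets (hypothetical sizes)
bundle = [sum(files)]    # the same bytes served as one 200 KB bundle

separate_ms = load_time_ms(files, rtt_ms=50, bandwidth_kb_per_ms=10)
bundled_ms = load_time_ms(bundle, rtt_ms=50, bandwidth_kb_per_ms=10)

print(f"separate: {separate_ms:.0f} ms, bundled: {bundled_ms:.0f} ms")
# prints: separate: 520 ms, bundled: 70 ms
```

The transfer time is identical in both cases (200 KB in total). The 450 ms difference comes entirely from the nine round trips that bundling avoids.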

Other techniques can help reduce the number of communications. For example, make sure to avoid HTTP redirections as much as possible. As explained in detail in our article “How HTTP redirections impact your web performance”, avoiding them can have a real positive impact on application performance.
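
As a small illustration of the cost of redirections, the sketch below walks through a redirect chain and counts the hops. The redirect table, the URLs and the 100 ms per round trip are assumptions made up for this example; a real audit would read the Location headers returned by the server.

```python
def follow_redirects(url, redirects, max_hops=10):
    """Return the final URL and the number of redirect hops needed to reach it."""
    hops = 0
    while url in redirects:
        if hops >= max_hops:
            raise ValueError("too many redirects (possible loop)")
        url = redirects[url]
        hops += 1
    return url, hops


# Hypothetical redirect table: requested URL -> redirect target
redirects = {
    "http://example.com/": "https://example.com/",
    "https://example.com/": "https://www.example.com/",
    "https://www.example.com/": "https://www.example.com/home",
}

final_url, hops = follow_redirects("http://example.com/", redirects)
extra_ms = hops * 100  # assumed cost of 100 ms per extra round trip

print(final_url, hops, extra_ms)
# prints: https://www.example.com/home 3 300
```

Linking directly to the final URL would remove all three hops in this example.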

Optimizing the communications sequence

As illustrated above, being able to process multiple requests in parallel greatly helps reduce the page load time.
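
The effect can be reproduced with a small simulation. The sketch below uses Python’s asyncio and replaces real network requests with simple waits, so fake_request and its durations are stand-ins, not real HTTP calls.

```python
import asyncio
import time


async def fake_request(duration_s):
    """Stand-in for a network request: just waits for the given duration."""
    await asyncio.sleep(duration_s)
    return duration_s


async def fetch_sequentially(durations_s):
    """Start each request only after the previous one has finished."""
    return [await fake_request(d) for d in durations_s]


async def fetch_in_parallel(durations_s):
    """Start all requests at once and wait for all of them."""
    return await asyncio.gather(*(fake_request(d) for d in durations_s))


def timed(coro):
    """Run a coroutine to completion and return the elapsed time in seconds."""
    start = time.perf_counter()
    asyncio.run(coro)
    return time.perf_counter() - start


durations = [0.05, 0.05, 0.05]
sequential_s = timed(fetch_sequentially(durations))
parallel_s = timed(fetch_in_parallel(durations))
print(f"sequential: {sequential_s:.2f}s, parallel: {parallel_s:.2f}s")
```

The sequential run takes roughly the sum of the three waits, while the parallel run takes roughly the duration of the longest one.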

Looking at protocols like HTTP/2 and HTTP/3 should be on your to-do list! Our articles on these protocols explain in more detail how they can boost your web performance.

Optimizing the communications sequence is not only about parallelizing flows. It is also about thinking carefully about which elements to request, and in which order! The Critical Rendering Path is an important concept to keep in mind. Choosing when to fetch and execute CSS or JavaScript can have a huge impact on the page load time.

So think about best practices like putting the critical components at the top of your HTML page so that the browser processes them early, and excluding the others from the critical path. For example, you can postpone non-essential scripts by using the defer or async attributes (have a look at our article “How to minimize performance impact of your JavaScripts by using “defer” or “async” attributes” for more information), or use loadCSS to asynchronously fetch non-essential CSS files.
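
To make the defer and async attributes concrete, here is a minimal Python sketch that renders the corresponding markup for a page head. The helper script_tag and the file paths are hypothetical, chosen only for this illustration.

```python
from html import escape


def script_tag(src, mode=None):
    """Render a <script> tag; mode may be None, "defer" or "async"."""
    if mode not in (None, "defer", "async"):
        raise ValueError(f"unsupported mode: {mode!r}")
    attr = f" {mode}" if mode else ""
    return f'<script src="{escape(src, quote=True)}"{attr}></script>'


head = "\n".join([
    '<link rel="stylesheet" href="/css/critical.css">',  # needed for first render
    script_tag("/js/app.js", "defer"),        # runs after parsing, in document order
    script_tag("/js/analytics.js", "async"),  # runs as soon as it has downloaded
])
print(head)
```

Both attributes let the browser download the script without blocking HTML parsing. They differ in when the script executes, which is why defer usually suits application code and async suits independent scripts.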

You can also work proactively, which means making resources available to the browser before it actually needs them. Some techniques are embedded in the HTTP protocol itself. “Server Push”, included in the HTTP/2 protocol, is one of them: the server anticipates the browser’s needs by sending resources before the browser requests them. Note that browser support for Server Push has declined since (Chrome disabled it by default in version 106), so check support before relying on it. A browser can also work proactively by resolving a DNS name (through dns-prefetch) and connecting to the server hosting a resource (through preconnect) before it really needs to fetch that resource.
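
As an illustration of dns-prefetch and preconnect, the sketch below derives the hint tags from a list of resource URLs. The function resource_hints and the example hosts are assumptions made for this example; whether a given origin deserves a hint depends on your own pages.

```python
from urllib.parse import urlsplit


def resource_hints(resource_urls):
    """Build dns-prefetch and preconnect <link> tags, one pair per distinct origin."""
    origins = []
    for url in resource_urls:
        parts = urlsplit(url)
        origin = f"{parts.scheme}://{parts.netloc}"
        if origin not in origins:  # keep first-seen order, skip duplicates
            origins.append(origin)
    tags = []
    for origin in origins:
        tags.append(f'<link rel="dns-prefetch" href="{origin}">')
        tags.append(f'<link rel="preconnect" href="{origin}">')
    return tags


hints = resource_hints([
    "https://cdn.example.com/js/app.js",
    "https://cdn.example.com/css/site.css",
    "https://fonts.example.net/font.woff2",
])
print("\n".join(hints))
```

These tags belong in the page head, as early as possible, so that the browser can resolve names and open connections while it is still parsing the rest of the document.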

Optimizing each communication performance

When you have determined which elements to load first and how the sequence of events should happen, you can focus on the communication performance itself.

When a client and a server exchange data on a network, you can act at three different levels to optimize performance: the client, the network and the server.

Client and Server side

When dealing with client-side and server-side performance optimization, the main objective is to find the right balance between executing code (mainly JavaScript) in the browser and executing it on the server. Single Page Applications rely heavily on the browser to process and render the page. This is done through client-side JavaScript logic, which requires the browser to fetch a number of scripts. Frameworks and libraries like React, Angular, and Vue are well known for that purpose.

At the other end, server-side rendering uses the server to render the application into HTML.

It would be too easy if one solution fit all requirements:

A client-side rendering focus means a lot more JavaScript to fetch and a heavy load on the device’s CPU
A server-side rendering focus means a higher TTFB (Time To First Byte), that is, the time between sending the request and receiving the first byte of the response, because the server has to process the request instead of delivering simple static HTML content

The network part

Transferring data from one point to another on a network takes time. To reduce the duration of a communication on a network, you can act on two main factors.

You can transfer data faster. SD-WAN is a technology that automatically routes traffic through the most performant path. Using CDNs (Content Delivery Networks) ensures that static resources are available as close as possible to the client.
You can reduce the number of round trips between the client and the server. For example, make sure that the browser can cache static resources. For more information, refer to our article “How to improve web performance with web content caching”. You can also try to reduce the size of the resources to send over the network. Techniques like minifying CSS and JavaScript files help reduce their size. You can also reduce image sizes by using compressed formats like WebP. All these techniques aim at the same goal: reducing the amount of data to transmit on the network. Less data means fewer packets to transmit in sequence on the network, which in the end reduces the duration of the communication.
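
The effect of minification and compression on payload size can be checked with a short experiment. The minifier below is deliberately naive (it would mangle CSS that contains significant whitespace inside strings) and only serves as an illustration; real projects should rely on a dedicated build tool. The sample stylesheet is made up.

```python
import gzip
import re


def minify_css(css):
    """Naive CSS minifier: strips comments and collapses whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove comments
    css = re.sub(r"\s+", " ", css)                          # collapse whitespace
    css = re.sub(r"\s*([{};])\s*", r"\1", css)              # trim around { } ;
    return css.strip()


# Hypothetical stylesheet, repeated to get a payload of a realistic size
css = """
/* Main layout */
body {
    margin: 0;
    font-family: sans-serif;
}
""" * 50

minified = minify_css(css)
raw_size = len(css.encode())
minified_size = len(minified.encode())
gzipped_size = len(gzip.compress(minified.encode()))

print(f"raw: {raw_size} B, minified: {minified_size} B, gzipped: {gzipped_size} B")
```

Highly repetitive input like this compresses unusually well, so expect smaller gains on real files.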

Takeaways

Reducing the page load time can represent a significant challenge, especially in modern web applications. The very first step in your optimization journey consists of understanding your application performance profile. To do this, you must be able to:

Identify all application dependencies and third-party services, like DNS and CDNs
Monitor network performance in potentially complex environments like SD-WAN (check our article “Best practices for SD-WAN monitoring” for more details)
Identify all critical resources, their profile (nature, size, cacheability, …), as well as their individual performance
Identify clients and their respective profiles in terms of location, device, operating system, browser type, and way of connecting to the application

Kadiska can help you in this challenging journey. If you want to know how, keep reading here.
