Optimizing web performance starts with understanding what happens between receiving the HTML, CSS, and JavaScript bytes and turning them into rendered pixels. This sequence of steps is called the “critical rendering path”.
In this article, we’ll focus mainly on what happens locally in the browser, so we won’t address network flows and their impact on web performance. Other articles, such as “Network latency and jitter monitoring explained”, “What drives the latency from users to cloud services?” and “Why network latency drives digital performance”, cover that topic more specifically.
Here, we’ll analyze the different steps involved in building and rendering a webpage and will discuss the main components that must be carefully configured to ensure optimal web performance.
The main principles of webpage rendering
Let’s say that a browser requests a webpage from a web server through an HTTP(S) GET request. If this request is successful, the server sends a “status code 200” response back to the browser together with a number of bytes of data corresponding to the HTML page.
From this point, the browser performs the following actions:
- Construct the DOM
- Construct the CSSOM
- Build the render tree
- Build the layout
- Paint
Wow, it seems that the browser has to do a lot of things before finally painting something on the screen. But what does this all mean? Let’s explain what happens in a bit more detail.
Building the DOM
DOM stands for “Document Object Model”. It represents the structure of the webpage as defined by the HTML markup. When receiving data (bytes) from the web server, this is what the browser does:
- Conversion: The browser reads the raw bytes of HTML and translates them to individual characters based on the specified encoding of the file (for example, UTF-8)
- Tokenizing: The browser converts strings within angle brackets into distinct tokens. Each token has a special meaning and its own set of rules. Examples are <head>, <body> and <p>
- Lexing: The emitted tokens are converted into “objects,” which define their properties and rules
- DOM construction: Finally, as the HTML markup language defines relationships between different tags, the created objects are linked together to form a tree structure
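The four steps above can be sketched with Python's standard `html.parser` module. This is only a toy model, not how a real browser engine is implemented: the `Node` and `TreeBuilder` classes, the `build_dom` and `outline` helpers, and the sample markup are hypothetical names introduced for illustration, and the error recovery that real HTML parsers perform is ignored.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay open on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """A simplified DOM node: a tag name, attributes, optional text and children."""

    def __init__(self, tag, attrs=None, text=""):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Turns the token stream emitted by HTMLParser into a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]  # elements that are currently open

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)  # token -> object
        self.stack[-1].children.append(node)  # link it to its parent
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the most recently opened element with this tag name, if any.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", text=data.strip()))


def build_dom(raw, encoding="utf-8"):
    characters = raw.decode(encoding)  # conversion: bytes -> characters
    builder = TreeBuilder()
    builder.feed(characters)  # tokenizing, then tree construction in the handlers
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Returns the tree as indented lines, one per node."""
    label = node.tag if node.tag != "#text" else repr(node.text)
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


raw_html = b"<html><body><h1>Demo</h1><p>Hello <span>web</span></p></body></html>"
dom = build_dom(raw_html)
print("\n".join(outline(dom)))
```

Printing the outline shows the nesting that the markup defines: the span sits under the paragraph, which sits under the body.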
The DOM tree captures the properties and relationships of the document markup, but it does not tell us how the elements will look when rendered. That is the responsibility of the CSSOM.
Building the CSSOM
CSS stands for “Cascading Style Sheets”. It is the language used to style an HTML document. It describes how HTML elements must be displayed.
As with HTML, the CSS must be converted into something that the browser can understand and work with. This is done through the CSSOM (CSS Object Model) building process.
Typically, CSS stylesheets are referenced by a <link> tag in the <head> section of the HTML document. In such a scenario, they are fetched through additional HTTP(S) requests. Then the process of building the CSSOM is similar to its DOM counterpart.
The CSS bytes are converted into characters, then tokens, then nodes, and finally they are linked into a CSSOM tree structure.
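As a rough illustration of that conversion, the sketch below parses a tiny subset of CSS (plain tag selectors and simple declarations) into a flat dictionary. It is an assumption-laden simplification: `build_cssom` and the sample stylesheet are hypothetical, and a real CSSOM is a tree that also handles specificity, inheritance, at-rules, and much more.

```python
import re


def build_cssom(raw, encoding="utf-8"):
    """Parses a very small CSS subset into {selector: {property: value}}."""
    characters = raw.decode(encoding)  # bytes -> characters
    characters = re.sub(r"/\*.*?\*/", "", characters, flags=re.S)  # drop comments
    cssom = {}
    # Each match is one rule: a selector followed by a declaration block.
    for selector, block in re.findall(r"([^{}]+)\{([^{}]*)\}", characters):
        declarations = cssom.setdefault(selector.strip(), {})
        for declaration in block.split(";"):
            if ":" in declaration:
                name, value = declaration.split(":", 1)
                # A later declaration of the same property overrides an earlier one.
                declarations[name.strip()] = value.strip()
    return cssom


raw_css = b"body { font-size: 16px } p { color: red; font-weight: bold } span { display: none }"
cssom = build_cssom(raw_css)
for selector, declarations in cssom.items():
    print(selector, declarations)
```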
Render Tree
At this stage, the DOM and CSSOM trees are two independent structures. The browser combines them into a render tree, which is then used to compute the layout of each visible element and serves as an input to the paint process that renders the pixels on the screen.
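Combining the two structures can be sketched as a recursive walk over the DOM that attaches matching styles and leaves out nodes that are not displayed. This is a minimal model under stated assumptions: the dictionary shapes, the `INHERITED` set, and `build_render_tree` are hypothetical, only tag selectors are matched, and only `display: none` is treated as "not visible".

```python
# Properties that a child takes over from its parent in this toy model.
INHERITED = {"color", "font-size", "font-weight"}


def build_render_tree(node, cssom, parent_style=None):
    """Returns a render node for `node`, or None when it is not displayed."""
    parent_style = parent_style or {}
    style = {k: v for k, v in parent_style.items() if k in INHERITED}
    style.update(cssom.get(node["tag"], {}))  # apply the rules matching this tag
    if style.get("display") == "none":
        return None  # the node and its whole subtree are left out
    children = []
    for child in node.get("children", []):
        rendered = build_render_tree(child, cssom, style)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node["tag"], "style": style, "children": children}


dom = {"tag": "body", "children": [
    {"tag": "p", "children": [{"tag": "span"}, {"tag": "em"}]},
]}
cssom = {
    "body": {"font-size": "16px"},
    "p": {"color": "red"},
    "span": {"display": "none"},
}
render_tree = build_render_tree(dom, cssom)
paragraph = render_tree["children"][0]
print([child["tag"] for child in paragraph["children"]], paragraph["style"])
```

The hidden span exists in the DOM but never reaches the render tree, while the paragraph carries both its own color and the font size inherited from the body.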
Layout stage or “reflow”
At this point, the browser knows which nodes must be visible and with which corresponding styles, but it does not yet know their exact position and size within the viewport of the user’s device. That is the purpose of the “layout” stage, also referred to as “reflow”.
The output of the layout process is a “box model,” which precisely captures the exact position and size of each element within the viewport: all of the relative measurements are converted to absolute pixels on the screen.
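The conversion from relative measurements to absolute pixels can be illustrated with a deliberately naive layout pass. The sketch assumes block boxes stacked vertically, widths given in `%` or `px`, and heights given in `px` only; `to_px`, `layout`, and the sample page are hypothetical and ignore margins, padding, floats, and everything else a real engine handles.

```python
def to_px(length, parent_px):
    """Converts '50%' (relative to the parent) or '200px' into absolute pixels."""
    if length.endswith("%"):
        return parent_px * float(length[:-1]) / 100
    return float(length[:-2])  # assumes a 'px' suffix


def layout(box, x, y, available_width):
    """Assigns x, y, width and height to `box` and to its children."""
    width = to_px(box.get("width", "100%"), available_width)
    cursor_y = y
    placed_children = []
    for child in box.get("children", []):
        placed = layout(child, x, cursor_y, width)
        placed_children.append(placed)
        cursor_y += placed["height"]  # the next sibling starts below this one
    # An explicit height wins; otherwise the box is as tall as its children.
    height = float(box["height"][:-2]) if "height" in box else cursor_y - y
    return {"name": box["name"], "x": x, "y": y, "width": width,
            "height": height, "children": placed_children}


page = {"name": "body", "children": [
    {"name": "header", "height": "80px"},
    {"name": "main", "width": "50%", "children": [
        {"name": "article", "width": "50%", "height": "300px"},
    ]},
]}
result = layout(page, x=0, y=0, available_width=1200)  # a 1200px-wide viewport
main = result["children"][1]
print(main["width"], main["y"], main["children"][0]["width"], result["height"])
```

With a 1200px viewport, the 50% wide main box resolves to 600px and its 50% wide article to 300px, which is the kind of resolution the layout stage performs for every node.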
Painting
Finally, now that the browser knows which nodes are visible, and their computed styles and geometry, it can go forward to the final stage, which converts each node in the render tree to actual pixels on the screen. This step is often referred to as “painting” or “rasterizing.”
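To make the idea of turning geometry into pixels concrete, here is a toy rasterizer that fills a grid of characters instead of pixels. The `paint` function, the box dictionaries, and the `fill` characters are all hypothetical. Real browsers paint in layers and often on the GPU, which this sketch does not attempt to model.

```python
def paint(boxes, width, height):
    """Fills a character grid with the given boxes, in order."""
    canvas = [["." for _ in range(width)] for _ in range(height)]
    for box in boxes:  # later boxes paint over earlier ones
        for row in range(box["y"], min(box["y"] + box["height"], height)):
            for col in range(box["x"], min(box["x"] + box["width"], width)):
                canvas[row][col] = box["fill"]
    return ["".join(row) for row in canvas]


boxes = [
    {"x": 0, "y": 0, "width": 6, "height": 1, "fill": "H"},  # a header bar
    {"x": 1, "y": 1, "width": 3, "height": 2, "fill": "P"},  # a paragraph block
]
for line in paint(boxes, width=6, height=3):
    print(line)
```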
Recap
Here’s a quick recap of the steps involved at the browser level when rendering a page:
- Process HTML markup and build the DOM tree
- Process CSS markup and build the CSSOM tree
- Combine the DOM and CSSOM into a render tree
- Run layout on the render tree to compute geometry of each node
- Paint the individual nodes to the screen
Optimizing the critical rendering path is the process of minimizing the total amount of time spent performing the five steps in the above sequence.
The challenges
The critical rendering path relies on both the DOM and the CSSOM to construct the render tree. In other words, the browser cannot paint a piece of content until the corresponding HTML has been parsed into the DOM and the CSS has been fetched and parsed into the CSSOM. This is why HTML and CSS are both called “render-blocking” resources.
In its simplest form, a webpage contains an HTML file that does not include any CSS or other types of resources (like JavaScript).
The whole process of requesting the page and rendering the result on the screen is the following:
- The browser requests the page by sending an HTTP(S) GET request to the web server through the network
- While waiting for the web server’s response, the browser cannot do anything and waits (idle state)
- As soon as the browser gets the HTML file, it builds the DOM as described previously
- Once the DOM is constructed, the browser can render the page
Quite simple, right? Let’s now see what happens when there is a CSS file involved.
CSS
By default, CSS is treated as a render-blocking resource, which means that the browser won’t render any processed content until the CSSOM is constructed. So it is important to deliver the CSS to the browser as quickly as possible to avoid additional rendering delay.
This time, the process is the following:
- The browser requests the page by sending a GET request to the web server through the network
- While waiting for the web server’s response, the browser cannot do anything and waits (idle state)
- As soon as the browser gets the HTML file, it builds the DOM. This time though, there is a <link> tag referring to a CSS file
- The browser can proceed with the DOM construction but also has to build the CSSOM. For this, it sends a request on the network to get the CSS file
- The DOM is ready before the CSS file arrives, but the browser needs the CSSOM to build the render tree, so it stays in an idle state until then
- Once the CSSOM is done, the browser can construct the render tree and finally render the page
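The difference between the HTML-only page and the page with a stylesheet can be expressed as a small timing model. The function name, its parameters, and all durations below are made-up values for illustration, and the model assumes that the CSS request starts as soon as the HTML has arrived.

```python
def first_render_ms(html_fetch, dom_build, css_fetch=None, cssom_build=0):
    """Returns the time (ms) at which the first render can start in this toy model."""
    dom_ready = html_fetch + dom_build
    if css_fetch is None:
        return dom_ready  # HTML only: render as soon as the DOM exists
    # Assumption: the CSS request is sent as soon as the HTML has arrived.
    cssom_ready = html_fetch + css_fetch + cssom_build
    return max(dom_ready, cssom_ready)  # the render tree needs both trees


html_only = first_render_ms(html_fetch=100, dom_build=20)
with_css = first_render_ms(html_fetch=100, dom_build=20, css_fetch=80, cssom_build=10)
print(html_only, with_css)
```

In this example the DOM is ready at 120 ms in both cases, but the page with a stylesheet cannot render before 190 ms because it has to wait for the CSSOM.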
JavaScript
Blocking JavaScript
When the HTML parser encounters a <script> tag, it pauses its process of constructing the DOM and yields control to the JavaScript engine. After the JavaScript engine finishes running, the browser then picks up where it left off and resumes DOM construction.
Obviously, delaying the DOM construction ultimately means delaying the rendering process.
This is what happens when the browser encounters a script during the HTML parsing process:
- The browser requests the page by sending a GET request to the web server through the network
- While waiting for the web server’s response, the browser cannot do anything and waits (idle state)
- As soon as the browser gets the HTML file, it builds the DOM. This time though, there is a <link> tag referring to a CSS file as well as a <script> tag referring to a JS file
- The browser cannot proceed with the DOM construction and must first fetch and execute the JS file. In parallel, it must also fetch the CSS file to construct the CSSOM
- Once the CSS file is received, the browser can construct the CSSOM
- As soon as the JS file is received, the browser must execute it and wait for the result before continuing parsing the HTML file and constructing the DOM
- Once the DOM is built, the browser can construct the render tree and finally render the page
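The same kind of timing model shows how a parser-blocking script pushes back the first render. Everything here is hypothetical (function name, parameters, durations). The sketch assumes that both the CSS and the JS requests start as soon as the HTML has arrived, and it ignores the extra dependency of the script on the CSSOM.

```python
def render_with_blocking_script(html_fetch, dom_before, dom_after,
                                js_fetch, js_exec, css_fetch, cssom_build):
    """Time (ms) of the first render when a parser-blocking <script> is present."""
    # Assumption: both requests start as soon as the HTML has arrived.
    cssom_ready = html_fetch + css_fetch + cssom_build
    js_received = html_fetch + js_fetch
    parser_paused = html_fetch + dom_before  # the parser reaches <script>
    js_done = max(parser_paused, js_received) + js_exec
    dom_ready = js_done + dom_after  # parsing resumes only after execution
    return max(dom_ready, cssom_ready)


blocked = render_with_blocking_script(
    html_fetch=100, dom_before=10, dom_after=10,
    js_fetch=120, js_exec=30, css_fetch=80, cssom_build=10,
)
print(blocked)
```

With these numbers the CSSOM is ready at 190 ms, but the DOM is only complete at 260 ms because the parser sat idle while the script was downloaded and executed.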
Non-blocking JavaScript
To deliver optimal performance, minimize the impact of executing JavaScript during the HTML parsing process (DOM construction).
For this, you can:
- Eliminate any unnecessary JavaScript from the critical rendering path
- Make your JavaScript execute in “async” mode
This last method is explained in our article “How to minimize performance impact of your JavaScripts by using “defer” or “async” attributes”.
This is the process when using the “async” mode of JavaScript execution:
- The browser requests the page by sending a GET request to the web server through the network
- While waiting for the web server’s response, the browser cannot do anything and waits (idle state)
- As soon as the browser gets the HTML file, it builds the DOM. There is a <link> tag referring to a CSS file as well as a <script> tag referring to a JS file, this time in “async” mode
- As the JavaScript is referenced in “async” mode, the browser can proceed with the DOM construction. While building it, it sends a request on the network to get the CSS file and another to get the JS file
- Once the CSS file is received, the browser can construct the CSSOM
- Once the JS file is received, the browser executes it. The script no longer blocks DOM construction, so the page can be rendered as soon as the DOM and the CSSOM are ready
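For comparison, here is a toy timing model of the “async” case. It rests on two explicit assumptions that are not spelled out in the steps above: the script arrives after HTML parsing has finished, and an async script does not hold back the first render. The function name, parameters, and durations are hypothetical.

```python
def render_with_async_script(html_fetch, dom_build, js_fetch, js_exec,
                             css_fetch, cssom_build):
    """Returns (first_render_ms, script_done_ms) for a page using <script async>."""
    dom_ready = html_fetch + dom_build  # the parser is never paused
    cssom_ready = html_fetch + css_fetch + cssom_build
    # Assumption: the script arrives after parsing and runs as soon as it arrives.
    script_done = html_fetch + js_fetch + js_exec
    return max(dom_ready, cssom_ready), script_done


first_render, script_done = render_with_async_script(
    html_fetch=100, dom_build=20, js_fetch=120, js_exec=30,
    css_fetch=80, cssom_build=10,
)
print(first_render, script_done)
```

Under these assumptions, the first render depends only on the DOM and the CSSOM (190 ms here), while the script finishes later (250 ms) without delaying DOM construction.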
A possible bad scenario
What if the browser hasn’t finished downloading and building the CSSOM when the browser is ready to execute the script? The answer is simple and not very good from a performance standpoint: the browser delays the script execution and DOM construction until it has finished downloading and constructing the CSSOM.
In short, JavaScript introduces a lot of new dependencies between the DOM, the CSSOM, and JavaScript execution. This can cause significant delays in processing and rendering the page on the screen.
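The dependency of script execution on the CSSOM can be quantified with one more small helper. `script_start_ms` and its inputs are hypothetical. The helper simply takes the latest of the three events a parser-blocking script has to wait for and reports how much of the wait is due to the CSS.

```python
def script_start_ms(parser_reaches_script, js_received, cssom_ready):
    """Returns (start time in ms, delay in ms caused by waiting for the CSSOM)."""
    ready_without_css = max(parser_reaches_script, js_received)
    start = max(ready_without_css, cssom_ready)
    return start, start - ready_without_css


slow_css = script_start_ms(parser_reaches_script=110, js_received=150, cssom_ready=300)
fast_css = script_start_ms(parser_reaches_script=110, js_received=150, cssom_ready=140)
print(slow_css, fast_css)
```

In the first call, the script could have started at 150 ms but has to wait until the CSSOM is ready at 300 ms. In the second call, the CSSOM is ready first and adds no delay.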
Takeaways
Guidelines to optimize the critical rendering path
These are some general rules you can follow to optimize the critical rendering path:
- Analyze and characterize your critical path: number of resources, sizes and sequence of events
- Minimize the number of critical resources: eliminate them, defer their download, mark them as async, …
- Optimize the resources’ size to reduce the download time (number of roundtrips)
- Optimize the order in which the remaining critical resources are loaded: download all critical assets as early as possible to shorten the critical path length
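A first characterization of the critical path (number of resources, sizes, roundtrips) can be sketched as follows. The roundtrip estimate relies on assumptions that do not come from this article: a TCP initial congestion window of 10 segments of 1460 bytes that doubles every roundtrip, with no packet loss and without counting connection or TLS setup. The resource names and sizes are invented for the example.

```python
def roundtrips(size_bytes, initial_segments=10, segment_bytes=1460):
    """Estimates the roundtrips needed to deliver `size_bytes` under slow start."""
    window = initial_segments * segment_bytes  # bytes deliverable in the first roundtrip
    delivered, trips = 0, 0
    while delivered < size_bytes:
        delivered += window
        window *= 2  # assumption: the window doubles every roundtrip
        trips += 1
    return trips


# Hypothetical critical resources and their compressed sizes in bytes.
critical_resources = {"index.html": 14_000, "style.css": 50_000, "app.js": 120_000}

summary = {
    "count": len(critical_resources),
    "total_bytes": sum(critical_resources.values()),
    "roundtrips": {name: roundtrips(size) for name, size in critical_resources.items()},
}
print(summary)
```

Under these assumptions, a 14 KB HTML file fits in a single roundtrip, while the 50 KB stylesheet needs three and the 120 KB script needs four, which is why shrinking critical resources pays off quickly.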
But it’s not all about critical rendering path…
To a large degree, “optimizing the critical rendering path” refers to understanding and optimizing the dependency graph between HTML, CSS, and JavaScript.
Nevertheless, not all resources are critical to trigger the first paint. In fact, when we talk about the critical rendering path, we are typically talking about the HTML markup, CSS, and JavaScript. Images, for example, do not block the initial rendering of the page. That does not mean we don’t care about loading images fast: in many cases, images directly impact the Core Web Vitals metrics. Furthermore, the browser itself can be slowed down by other processes, such as anti-virus checks. So the critical rendering path is definitely not the end of the story…