Faster multipage applications with streams

These days, websites—or web apps if you prefer—tend to use one of two navigation schemes:

  • The navigation scheme browsers provide by default—that is, you enter a URL in your browser's address bar and a navigation request returns a document as a response. Then you click on a link, which unloads the current document for another one, ad infinitum.
  • The single page application pattern, which involves an initial navigation request to load the application shell, then relies on JavaScript to populate that shell with client-rendered markup, using content from a back-end API, for each "navigation".

The benefits of each approach have been touted by their proponents:

  • The navigation scheme that browsers provide by default is resilient, as routes don't require JavaScript to be accessible. Client-rendering of markup by way of JavaScript can also be a potentially expensive process, meaning that lower-end devices may end up in a situation where content is delayed because the device is blocked processing scripts that provide content.
  • On the other hand, Single Page Applications (SPAs) may provide faster navigations after the initial load. Rather than relying on the browser to unload a document for an entirely brand new one (and repeating this for every navigation) they can offer what feels like a faster, more "app-like" experience—even if that requires JavaScript to function.

In this post, we're going to talk about a third method that strikes a balance between the two approaches described above: relying on a service worker to precache the common elements of a website—such as header and footer markup—and using streams to provide an HTML response to the client as fast as possible, all while still using the browser's default navigation scheme.

Why stream HTML responses in a service worker?

Streaming is something your web browser already does when it makes requests. This is extremely important in the context of navigation requests, as it ensures the browser isn't blocked waiting for the entirety of a response before it can start to parse document markup and render a page.

A diagram depicting non-streaming HTML versus streaming HTML. In the former case, the entire markup payload isn't processed until it arrives. In the latter, markup is processed incrementally as it arrives in chunks from the network.

For service workers, streaming is a bit different as it uses the JavaScript Streams API. The most important task a service worker fulfills is to intercept and respond to requests—including navigation requests.

These requests can interact with the cache in a number of ways, but a common caching pattern for markup is to favor using a response from the network first, but fall back to the cache if an older copy is available—and optionally provide a generic fallback response if a usable response isn't in the cache.
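To make that pattern concrete, here is a minimal sketch of it in plain JavaScript. It is not Workbox's implementation: `networkFirst` is a hypothetical helper, and `fetchFn`, `cache`, and `offlineResponse` are passed in as assumptions so the logic can be read without a real service worker. The `cache` object only needs `match` and `put` methods, as the Cache interface has.

```javascript
// Network-first lookup with a cache fallback and a generic fallback.
// All three dependencies are injected (assumed names, not Workbox APIs).
async function networkFirst(request, {fetchFn, cache, offlineResponse}) {
  try {
    const response = await fetchFn(request);
    if (response.ok) {
      // Keep a copy so this page is available offline later.
      await cache.put(request.url, response.clone());
    }
    return response;
  } catch (error) {
    // The network failed: use the last cached copy if there is one...
    const cached = await cache.match(request.url);
    // ...otherwise fall back to the generic response.
    return cached ?? offlineResponse.clone();
  }
}
```

In a service worker, `fetchFn` would typically be `fetch` and `cache` the result of `caches.open()`.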

This is a time-tested pattern for markup that works well, but while it helps with reliability in terms of offline access, it doesn't offer any inherent performance advantages for navigation requests that rely on a network first or network only strategy. That's where streaming comes in, and we'll explore how to use the Streams API-powered workbox-streams module in your Workbox service worker to speed up navigation requests on your multipage website.

Breaking down a typical web page

Structurally speaking, websites tend to have common elements that exist on every page. A typical arrangement of page elements often goes something like:

  • Header.
  • Content.
  • Footer.

Using an example website, that breakdown of common elements looks like this:

A breakdown of the common elements on the website. The common areas delineated are marked 'header', 'content', and 'footer'.

The goal of identifying the parts of a page is to determine what can be precached and retrieved without going to the network (namely the header and footer markup common to all pages) and which part of the page we'll always go to the network for first (the content in this case).

When we know how to segment the parts of a page and identify the common elements, we can write a service worker that always retrieves the header and footer markup instantly from the cache while requesting only the content from the network.

Then, using the Streams API via workbox-streams, we can stitch all these parts together and respond to navigation requests instantly—while requesting the minimum amount of markup necessary from the network.
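workbox-streams takes care of the stitching for you, but it can help to see the underlying idea. The following is a hand-rolled sketch using only the Streams API, not Workbox's actual code; `stitchResponses` is a hypothetical name, and each source is assumed to be a function that returns a `Response` or a promise for one.

```javascript
// Concatenate several response bodies into one streamed Response.
function stitchResponses(sources, headers = {'Content-Type': 'text/html'}) {
  // Start every source right away so the network request for the
  // content runs in parallel with reading the precached header.
  const pending = sources.map((source) => Promise.resolve().then(source));
  // Avoid unhandled-rejection noise; errors still surface when awaited below.
  pending.forEach((promise) => promise.catch(() => {}));

  const stream = new ReadableStream({
    async start(controller) {
      try {
        for (const responsePromise of pending) {
          const response = await responsePromise;
          if (!response.body) continue;
          const reader = response.body.getReader();
          // Forward chunks as they arrive instead of buffering them.
          // (This simple version does not apply backpressure.)
          while (true) {
            const {done, value} = await reader.read();
            if (done) break;
            controller.enqueue(value);
          }
        }
        controller.close();
      } catch (error) {
        controller.error(error);
      }
    },
  });

  return new Response(stream, {headers});
}
```

Because the header source resolves from the cache right away, its bytes can be enqueued while the content request is still in flight, which is where the speedup comes from.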

Building a streaming service worker

There are a lot of moving parts when it comes to streaming partial content in a service worker, but each step of the process will be explored in detail as you go, starting with how to structure your website.

Segmenting your website into partials

Before you can start writing a streaming service worker, you'll need to do three things:

  1. Create a file containing only your website's header markup.
  2. Create a file containing only your website's footer markup.
  3. Pull out each page's main content into a separate file, or set up your back end to conditionally serve only the page content based on an HTTP request header.

As you might expect, the last step is the hardest, especially if your website is static. If that's the case for you, you'll need to generate two versions of each page: one version will contain the full page markup, while the other will contain only the content.
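If you do generate two versions of each page, you'll need a predictable way to get from a page's URL to the URL of its content-only version. A minimal sketch, assuming the content-only files live under a `/partials` prefix (that prefix and the `toPartialUrl` name are illustrative conventions, not something Workbox requires):

```javascript
// Map a page URL to the URL of its content-only variant.
// The `/partials` prefix is an assumed convention for this sketch.
function toPartialUrl(pageUrl, prefix = '/partials') {
  const url = new URL(pageUrl);
  url.pathname = prefix + url.pathname;
  return url.href;
}
```

A helper like this could be called wherever the service worker builds the request for the content, so that it asks for the partial rather than the full page.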

Composing a streaming service worker

If you haven't installed the workbox-streams module, you'll need to do so in addition to whatever Workbox modules you have currently installed. For this specific example, that involves the following packages:

npm i workbox-navigation-preload workbox-strategies workbox-routing workbox-precaching workbox-streams --save

From here, the next step is to create your new service worker and precache your header and footer partials.

Precaching partials

The first thing you'll do is create a service worker in the root of your project named sw.js (or whatever filename you prefer). In it, you'll start off with the following:

// sw.js
import * as navigationPreload from 'workbox-navigation-preload';
import {NetworkFirst} from 'workbox-strategies';
import {registerRoute} from 'workbox-routing';
import {matchPrecache, precacheAndRoute} from 'workbox-precaching';
import {strategy as composeStrategies} from 'workbox-streams';

// Enable navigation preload for supporting browsers
navigationPreload.enable();

// Precache partials and some static assets
// using the InjectManifest method.
precacheAndRoute([
  // The header partial:
  {
    url: '/partial-header.php',
    revision: __PARTIAL_HEADER_HASH__
  },
  // The footer partial:
  {
    url: '/partial-footer.php',
    revision: __PARTIAL_FOOTER_HASH__
  },
  // The offline fallback:
  {
    url: '/offline.php',
    revision: __OFFLINE_FALLBACK_HASH__
  },
  // Static assets injected at build time:
  ...self.__WB_MANIFEST
]);

// To be continued...

This code does three things:

  1. Enables navigation preload for browsers that support it.
  2. Precaches the header and footer markup. This means that the header and footer markup for every page will be retrieved instantaneously, as it won't be blocked by the network.
  3. Precaches the static assets listed in the __WB_MANIFEST placeholder, which is filled in when you build with the injectManifest method.

Streaming responses

Getting your service worker to stream concatenated responses is the biggest part of this whole effort. Even so, Workbox and its workbox-streams module make this a much more succinct affair than if you had to do all of this on your own:

// sw.js
import * as navigationPreload from 'workbox-navigation-preload';
import {NetworkFirst} from 'workbox-strategies';
import {registerRoute} from 'workbox-routing';
import {matchPrecache, precacheAndRoute} from 'workbox-precaching';
import {strategy as composeStrategies} from 'workbox-streams';

// ...
// Prior navigation preload and precaching code omitted...
// ...

// The strategy for retrieving content partials from the network:
const contentStrategy = new NetworkFirst({
  cacheName: 'content',
  plugins: [
    {
      // NOTE: This callback will never be run if navigation
      // preload is supported, because the navigation
      // request is dispatched while the service worker is
      // booting up. This callback will only run if navigation
      // preload is _not_ supported.
      requestWillFetch: ({request}) => {
        const headers = new Headers();

        // If the browser doesn't support navigation preload, we need to
        // send a custom `X-Content-Mode` header for the back end to use
        // instead of the `Service-Worker-Navigation-Preload` header.
        headers.append('X-Content-Mode', 'partial');

        // Send the request with the new headers.
        // Note: if you're using a static site generator to generate
        // both full pages and content partials rather than a back end
        // (as this example assumes), you'll need to point to a new URL.
        return new Request(request.url, {
          method: 'GET',
          headers
        });
      },
      // What to do if the request fails.
      handlerDidError: async ({request}) => {
        return await matchPrecache('/offline.php');
      }
    }
  ]
});

// Concatenates precached partials with the content partial
// obtained from the network (or its fallback response).
const navigationHandler = composeStrategies([
  // Get the precached header markup.
  () => matchPrecache('/partial-header.php'),
  // Get the content partial from the network.
  ({event}) => contentStrategy.handle(event),
  // Get the precached footer markup.
  () => matchPrecache('/partial-footer.php')
]);

// Register the streaming route for all navigation requests.
registerRoute(({request}) => request.mode === 'navigate', navigationHandler);

// Your service worker can end here, or you can add more
// logic to suit your needs, such as runtime caching, etc.

This code consists of three main parts that satisfy the following requirements:

  1. A NetworkFirst strategy is used to handle requests for content partials. Using this strategy, a custom cache name of content is specified to contain the content partials, as well as a custom plugin that handles whether to set an X-Content-Mode request header for browsers that don't support navigation preload (and therefore don't send a Service-Worker-Navigation-Preload header). This plugin also figures out whether to send the last cached version of a content partial, or send an offline fallback page in the event that no cached version for the current request is stored.
  2. The strategy method in workbox-streams (aliased as composeStrategies here) is used to concatenate the precached header and footer partials along with the content partial requested from the network.
  3. The whole scheme is rigged up via registerRoute for navigation requests.

With this logic in place, we have streaming responses set up. However, there may be some work you'll need to do on a back end in order to ensure that the content from the network is a partial page that you can merge with the precached partials.

If your website has a back end

You'll recall that when navigation preload is enabled, the browser sends a Service-Worker-Navigation-Preload header with a value of true. However, in the code sample above, we sent a custom header of X-Content-Mode in the event navigation preload is unsupported in a browser. In the back end, you'd change the response based on the presence of these headers. In a PHP back end, that might look something like this for a given page:

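<?php
// Check if we need to render a content partial
$navPreloadSupported = isset($_SERVER['HTTP_SERVICE_WORKER_NAVIGATION_PRELOAD']) && $_SERVER['HTTP_SERVICE_WORKER_NAVIGATION_PRELOAD'] === 'true';
$partialContentMode = isset($_SERVER['HTTP_X_CONTENT_MODE']) && $_SERVER['HTTP_X_CONTENT_MODE'] === 'partial';
$isPartial = $navPreloadSupported || $partialContentMode;

// Figure out whether to render the header
if ($isPartial === false) {
  // Get the header include
  require_once($_SERVER['DOCUMENT_ROOT'] . '/includes/site-header.php');

  // Render the header (function name is illustrative)
  siteHeader();
}

// Get the content include (path is illustrative)
require_once('./content.php');

// Render the content (function name is illustrative)
content($isPartial);

// Figure out whether to render the footer
if ($isPartial === false) {
  // Get the footer include
  require_once($_SERVER['DOCUMENT_ROOT'] . '/includes/site-footer.php');

  // Render the footer (function name is illustrative)
  siteFooter();
}
?>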

In the above example, the content partials are invoked as functions, which take the value of $isPartial to change how the partials are rendered. For example, the content renderer function may include certain markup only when it's retrieved as a partial, something that'll be covered shortly.
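The same decision can be expressed in JavaScript if your back end happens to run on a JavaScript runtime. This is only a sketch: `isPartialRequest` and `renderPage` are hypothetical names, and `headers` is assumed to be a Fetch API `Headers` object (or anything else with a case-insensitive `get` method).

```javascript
// Decide whether a request should receive only the content partial.
function isPartialRequest(headers) {
  const navPreload = headers.get('Service-Worker-Navigation-Preload') === 'true';
  const contentMode = headers.get('X-Content-Mode') === 'partial';
  return navPreload || contentMode;
}

// Render either the full page or just the content.
// The three strings stand in for whatever your templates produce.
function renderPage(headers, {header, content, footer}) {
  return isPartialRequest(headers) ? content : header + content + footer;
}
```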


Before you deploy a service worker to stream and stitch partials together, there are some things you must consider. While it's true that using a service worker in this way doesn't fundamentally change the browser's default navigation behavior, there are some things that you'll likely need to address.

Updating page elements when navigating

The trickiest part of this approach is that some things will need to be updated on the client. For example, precaching the header markup means every page will have the same content in the <title> element, and the on/off states for navigation items won't reflect the current page. These things (and others) may have to be updated on the client for each navigation request.

The way to get around this might be to place an inline <script> element into the content partial that comes from the network to update a few important things:

<!-- The JSON below contains information about the current page. -->
<script id="page-data" type="application/json">{"title":"Sand Wasp \u2014 World of Wasps","description":"Read all about the sand wasp in this tidy little post."}</script>
<script>
  const pageData = JSON.parse(document.getElementById('page-data').textContent);

  // Update the page title
  document.title = pageData.title;
</script>
<!-- Page content omitted... -->

This is just one example of what you might have to do if you decide to go with this service worker setup. For more complex applications with user information, for example, you might have to store bits of relevant data in a web storage mechanism like localStorage and update the page from there.
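The on/off states for navigation items can be handled in a similar way. Here is a minimal sketch, assuming the precached header contains links matched by a `nav a` selector and that the current item is flagged with `aria-current="page"`; both the selector and the `markCurrentNavItem` name are assumptions about your markup, not part of the setup above.

```javascript
// Mark the navigation link that matches the current path
// and clear the marker from every other link.
function markCurrentNavItem(links, currentPath) {
  for (const link of links) {
    if (link.pathname === currentPath) {
      link.setAttribute('aria-current', 'page');
    } else {
      link.removeAttribute('aria-current');
    }
  }
}

// In the browser, run it against the header's links.
if (typeof document !== 'undefined') {
  markCurrentNavItem(document.querySelectorAll('nav a'), location.pathname);
}
```

A CSS rule targeting `[aria-current="page"]` could then style the active item.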

Dealing with slow networks

One drawback of streaming responses using markup from the precache can occur when network connections are slow. The problem is that the header markup from the precache will arrive instantaneously, but the content partial from the network can take quite some time to arrive after the initial paint of the header markup.

This can create something of a confusing experience, and if networks are very slow, it can even feel like the page is broken and not rendering any further. In cases like this, you can opt to show a loading icon or message that you hide once the content is loaded.

One way to do this is through CSS. Say your header partial ends with an opening <article> element that's empty until the content partial arrives to populate it. You could write a CSS rule similar to this:

article:empty::before {
  text-align: center;
  content: 'Loading...';
}

This works, but it will show a loading message on the client regardless of the network speed. If you want to avoid a strange flash of messaging, you can try this approach where we nest the selector in the above snippet within a slow class:

.slow article:empty::before {
  text-align: center;
  content: 'Loading...';
}

From here you could use JavaScript in your header partial to read the effective connection type (at least in Chromium browsers) to add the slow class to the <html> element on select connection types:

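<script>
  const effectiveType = navigator?.connection?.effectiveType;

  if (effectiveType !== '4g') {
    // Flag the page as being on a slow connection.
    document.documentElement.classList.add('slow');
  }
</script>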

This will ensure that effective connection types slower than the 4g type will get a loading message. Then in the content partial, you can put an inline <script> element to remove the slow class from the HTML to get rid of the loading message:
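A minimal sketch of what that inline script might contain follows. The `clearSlowClass` wrapper is a hypothetical helper added only so the logic is easy to follow; in the content partial, the code would sit inside a `<script>` element.

```javascript
// Remove the `slow` class so the loading message rule no longer applies.
function clearSlowClass(root) {
  root.classList.remove('slow');
}

// In the browser, the root is the <html> element.
if (typeof document !== 'undefined') {
  clearSlowClass(document.documentElement);
}
```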


Providing a fallback response

Let's say you're using a network-first strategy for your content partials. If the user is offline and goes to a page they've already been to, they're covered. However, if they go to a page they haven't been to yet, they'll get nothing. To avoid this, you'll need to serve a fallback response.

The code required to achieve a fallback response is demonstrated in prior code samples. The process requires two steps:

  1. Precache an offline fallback response.
  2. Set up a handlerDidError callback in the plugin for your network-first strategy to check the cache for the last-accessed version of a page. If the page was never accessed, you'll need to use the matchPrecache method from the workbox-precaching module to retrieve the fallback response from the precache.

Caching and CDNs

If you're using this streaming pattern in your service worker, assess whether the following applies to your situation:

  • You use a CDN or any other sort of intermediate/public cache.
  • You have specified a Cache-Control header with a non-zero max-age and/or s-maxage directive(s) in combination with the public directive.

If both of these are the case for you, the intermediate cache may hold onto responses for navigation requests. However, remember that when you use this pattern, you may be serving two different responses for any given URL:

  • The full response, containing the header, content, and footer markup.
  • The partial response, containing only the content.

This can cause some undesired behaviors, resulting in doubled header and footer markup, because the service worker may be fetching a full response from the CDN cache and combining that with your precached header and footer markup.

To get around this, you'll need to rely on the Vary header, which affects caching behavior by keying cacheable responses to one or more headers that were present in the request. Because we're varying the responses to navigation requests based on the Service-Worker-Navigation-Preload and custom X-Content-Mode request headers, we need to specify this Vary header in the response:

Vary: Service-Worker-Navigation-Preload,X-Content-Mode

With this header, the browser will differentiate between complete and partial responses for navigation requests, avoiding issues with doubled header and footer markup, as will any intermediate caches.
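To see why this helps, consider a simplified model of how a Vary-aware cache tells responses apart. The sketch below is only an illustration of the idea, not how any particular browser or CDN stores variants; `varyCacheKey` is a hypothetical function.

```javascript
// Build a cache key from the URL plus the request's value for
// every header named in the response's Vary header.
function varyCacheKey(url, requestHeaders, varyHeader) {
  const names = varyHeader
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter(Boolean)
    .sort();
  const parts = names.map((name) => `${name}=${requestHeaders.get(name) ?? ''}`);
  return [url, ...parts].join('|');
}

const vary = 'Service-Worker-Navigation-Preload,X-Content-Mode';
const pageUrl = 'https://example.com/wasps/';

// A plain navigation and a partial request for the same URL
// produce different keys, so their responses are never mixed up.
const fullKey = varyCacheKey(pageUrl, new Headers(), vary);
const partialKey = varyCacheKey(pageUrl, new Headers({'X-Content-Mode': 'partial'}), vary);
```

Without the Vary header, both requests would share the URL as their only key, which is how a full page can end up being served where a partial was expected.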

The outcome

Most load-time performance advice boils down to "show them what you got"—don't hold back, don't wait until you have everything before showing the user anything.

Jake Archibald in Fun Hacks for Faster Content

Browsers excel when it comes to dealing with responses to navigation requests, even for huge HTML response bodies. By default, browsers progressively stream and process markup in chunks that avoid long tasks, which is good for startup performance.

This works to our advantage when we use a streaming service worker pattern. Whenever you respond to a request from the service worker cache from the get-go, the start of the response arrives almost instantaneously. When you stitch together precached header and footer markup with a response from the network, you get some notable performance advantages:

  • Time to First Byte (TTFB) will often be greatly reduced, as the first bytes of the response to a navigation request come from the precache rather than the network.
  • First Contentful Paint (FCP) will be very fast, as the precached header markup will contain a reference to a cached style sheet, meaning that the page will paint very, very quickly.
  • In some cases, Largest Contentful Paint (LCP) can be faster as well, particularly if the largest onscreen element is provided by the precached header partial. Even so, just serving something out of the service worker cache as soon as possible in tandem with smaller markup payloads may result in a better LCP.

Streaming multipage architectures can be a bit tricky to set up and iterate on, but the complexity involved is often no more onerous, in theory, than that of SPAs. The main benefit is that you're not replacing the browser's default navigation scheme; you're enhancing it.

Better yet, Workbox makes this architecture not just possible, but easier than if you were to implement this on your own. Give it a try on your own website and see how much faster your multipage website can be for users in the field.