Zero-latency WordPress Front-end
================================

In this example, we're going to build a zero-latency front-end for WordPress. When a visitor clicks on a link, a story will appear instantly. No hourglass. No spinner. No blank page.

We'll accomplish this by aggressively prefetching data in our client-side code. At the same time, we're going to employ server-side rendering (SSR) to minimize the time to first impression. The page should appear within a fraction of a second after the visitor enters the URL. Combined with aggressive back-end caching, we'll end up with a web site that feels very fast and is cheap to host.

This is a complex example with many moving parts. It's definitely not for beginners. You should already be familiar with the technologies involved: [React](https://reactjs.org/), [Nginx caching](https://www.nginx.com/blog/nginx-caching-guide/), and of course [WordPress](https://wordpress.org/) itself.

## Live demo

For the purpose of demonstrating what the example code can do, I've prepared two web sites:

* [pfj.trambar.io](https://pfj.trambar.io)
* [et.trambar.io](https://et.trambar.io)

Both are hosted on the same AWS [A1 medium instance](https://aws.amazon.com/ec2/instance-types/a1/), powered by a single core of a [Graviton CPU](https://www.phoronix.com/scan.php?page=article&item=ec2-graviton-performance&num=1) and backed by 2 GB of RAM. In terms of computational power, we have roughly one quarter that of a phone. Not much. For our system, though, it's more than enough. Most requests will result in cache hits. Nginx will spend most of its time sending data that's already in memory. We'll be I/O-bound long before we're CPU-bound.

[pfj.trambar.io](https://pfj.trambar.io) obtains its data from a test WordPress instance running on the same server. It's populated with random lorem ipsum text. You can log into the [WordPress admin page](https://pfj.trambar.io/wp-admin/) and post an article using the account `bdickus` (password: `incontinentia`).
Publication of a new article will trigger a cache purge. The article should appear on the front page automatically after 30 seconds or so (no need to hit the refresh button). You can see a list of what's in the Nginx cache [here](https://pfj.trambar.io/.cache).

[et.trambar.io](https://et.trambar.io) obtains its data from [ExtremeTech](https://www.extremetech.com/). It's meant to give you a better sense of how the example code fares with real-world contents. The site has close to two decades' worth of articles. Our server does not receive cache purge commands from this WordPress instance, so the contents could be out of date. Cache misses will also lead to slightly longer pauses.

## Server-side rendering

Isomorphic React components are capable of rendering both on a web server and in a web browser. One primary purpose of server-side rendering (SSR) is search engine optimization. Another is to mask JavaScript loading time. Rather than displaying a spinner or progress bar, we render the front-end on the server and send the HTML to the browser. Effectively, we're using the front-end's own appearance as its loading screen.

The following animation depicts how an SSR-augmented single-page web site works. Click on it if you wish to view it as separate images.

[![Server-side rendering](docs/img/ssr.gif)](docs/ssr.md)

While the SSR HTML is not backed by JavaScript, it does have functional hyperlinks. If the visitor clicks on a link before the JavaScript bundle is done loading, he'll end up at another SSR page. As the server has immediate access to both code and data, it can generate this page very quickly. It's also possible that the page already exists in the server-side cache, in which case it'll be sent even sooner.

## Back-end services

Our back-end consists of three services: WordPress itself, Nginx, and Node.js.
The following diagram shows how contents of various types move between them:

![Back-end services](docs/img/services.png)

Note how Nginx does not fetch JSON data directly from WordPress. Instead, the data goes through Node first. This detour is due mainly to WordPress not attaching [e-tags](https://en.wikipedia.org/wiki/HTTP_ETag) to JSON responses. Without e-tags, the browser cannot perform cache validation (i.e. conditional request → 304 Not Modified). Passing the data through Node also gives us a chance to strip out unnecessary fields. Finally, it lets us compress the data prior to sending it to Nginx. Size reduction means more contents will fit in the cache. It also saves Nginx from having to gzip the same data over and over again.

Node will request JSON data from Nginx when it runs the front-end code. If the data isn't found in the cache, Node will end up serving its own request. This round trip will result in Nginx caching the JSON data. We want that to happen, since the browser will soon be requesting the same data (as it'll be running the same front-end code).

## Uncached page access

The following animation shows what happens when the browser requests a page and Nginx's cache is empty. Click on it to view it as separate images.

[![Uncached page access](docs/img/uncached.gif)](docs/uncached.md)

## Cached page access

The following animation shows how page requests are handled once contents (both HTML and JSON) are cached. This is what happens most of the time.

[![Cached page access](docs/img/cached.gif)](docs/cached.md)

## Cache purging

The following animation depicts what happens when a new article is published on WordPress.

[![Cache purging](docs/img/purge.gif)](docs/purge.md)

## Getting started

This example is delivered as a Docker app. Please install Docker and Docker Compose if they aren't already installed on your computer. On Windows and OSX, you might need to enable port forwarding for port 8000.
In a command-line prompt, run `npm install` or `npm ci`. Once all libraries have been downloaded, run `npm run start-server`. Docker will proceed to download four official images from Docker Hub: [WordPress](https://hub.docker.com/_/wordpress/), [MariaDB](https://hub.docker.com/_/mariadb), [Nginx](https://hub.docker.com/_/nginx), and [Node.js](https://hub.docker.com/_/node/).

Once the services are up and running, go to `http://localhost:8000/wp-admin/`. You should be greeted by WordPress's installation page. Enter some information about your test site and create the admin account. Log in and go to **Settings** > **Permalinks**. Choose one of the URL schemes.

Next, go to **Plugins** > **Add New**. Search for `Proxy Cache Purge`. Install and activate the plugin. A new **Proxy Cache** item will appear in the side navigation bar. Click on it. At the bottom of the page, set the **Custom IP** to 172.129.0.3. This is the address of our Node.js service.

In a different browser tab, go to `http://localhost:8000/`. You should see the front page with just a sample post:

![Welcome page](docs/img/front-page-initial.png)

Now return to the WordPress admin page and publish another test post. After 30 seconds or so, the post should automatically appear on the front page:

![Welcome page](docs/img/front-page-new-story.png)

To see the code running in debug mode, run `npm run watch`. The client-side code will be rebuilt whenever changes occur.

To populate your test site with dummy data, install the [FakerPress plugin](https://wordpress.org/plugins/fakerpress/).

To shut down the test server, run `npm run stop-server`. To remove the Docker volumes used by the example, run `npm run remove-server`.

If you have a production web site running WordPress, you can see how its contents look in the example front-end (provided that the REST interface is exposed and permalinks are enabled). Open `docker-compose-remote.yml` and change the environment variable `WORDPRESS_HOST` to the address of the site.
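The edited portion of `docker-compose-remote.yml` would end up looking something like this — the service layout shown here is an assumption based on the description above, and only the `WORDPRESS_HOST` value matters:

```yaml
# Illustrative fragment; substitute your own site's address.
services:
  node:
    environment:
      - WORDPRESS_HOST=https://www.example.com
```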
Then run `npm run start-server-remote`.

## Nginx configuration

Let us look at the [Nginx configuration file](https://github.com/trambarhq/relaks-wordpress-example/blob/master/server/nginx/default.conf). The first two lines tell Nginx where to place cached responses, how large the cache should be (1 GB), and for how long to keep inactive entries (7 days):

```
proxy_cache_path /var/cache/nginx/data keys_zone=data:10m max_size=1g inactive=7d;
proxy_temp_path /var/cache/nginx/tmp;
```

[`proxy_cache_path`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_path) is specified without `levels` so that files are stored in a flat directory structure. This makes it easier to scan the cache. [`proxy_temp_path`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_temp_path) points to a location on the same volume as the cache, so Nginx can move files into it with a rename operation.

The following section configures reverse-proxying for the WordPress admin page:

```
location ~ ^/wp-* {
    proxy_pass http://wordpress;
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header X-Forwarded-Host $server_name;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass_header Set-Cookie;
    proxy_redirect off;
}
```

The following section controls Nginx's interaction with Node:

```
location / {
    proxy_pass http://node;
    proxy_set_header Host $http_host;
    proxy_cache data;
    proxy_cache_key $uri$is_args$args;
    proxy_cache_min_uses 1;
    proxy_cache_valid 400 404 1m;
    proxy_ignore_headers Vary;
    add_header Access-Control-Allow-Origin *;
    add_header Access-Control-Expose-Headers X-WP-Total;
    add_header X-Cache-Date $upstream_http_date;
    add_header X-Cache-Status $upstream_cache_status;
}
```

We select the cache zone we defined earlier with the [`proxy_cache`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache) directive.
We set the cache key using [`proxy_cache_key`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_key). The MD5 hash of the path plus the query string becomes the name under which each cached server response is saved. With the [`proxy_cache_min_uses`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_min_uses) directive we tell Nginx to start caching on the very first request. With the [`proxy_cache_valid`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_valid) directive we ask Nginx to cache error responses for one minute.

The [`proxy_ignore_headers`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_headers) directive is there to keep Nginx from creating separate cache entries when requests for the same URL carry different `Accept-Encoding` headers (additional compression methods, for example).

The first two headers added using [`add_header`](http://nginx.org/en/docs/http/ngx_http_headers_module.html#add_header) are there to enable [CORS](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS). The last two `X-Cache-*` headers are for debugging purposes. They let us figure out whether a request resulted in a cache hit when we examine it using the browser's development tools:

![Chrome Dev Tools](docs/img/dev-tool-x-cache.png)

## HTML page generation

The following Express handler ([index.js](https://github.com/trambarhq/relaks-wordpress-example/blob/master/server/index.js#L100)) is invoked when Nginx asks for an HTML page. This should happen infrequently, as page navigation is handled client-side. Most visitors will enter the site through the root page, and that's inevitably cached. The handler detects whether the remote agent is a search-engine spider and handles the request accordingly.

```javascript
async function handlePageRequest(req, res, next) {
    try {
        const path = req.url;
        const noJS = (req.query.js === '0');
        const target = (req.isSpider() || noJS) ?
            'seo' : 'hydrate';
        const page = await PageRenderer.generate(path, target);
        if (target === 'seo') {
            // not caching content generated for SEO
            res.set({ 'X-Accel-Expires': 0 });
        } else {
            res.set({ 'Cache-Control': CACHE_CONTROL });
            // remember the URLs used by the page
            pageDependencies[path] = page.sourceURLs;
        }
        res.type('html').send(page.html);
    } catch (err) {
        next(err);
    }
}
```

`PageRenderer.generate()` ([page-renderer.js](https://github.com/trambarhq/relaks-wordpress-example/blob/master/server/page-renderer.js#L13)) uses our isomorphic React code to generate the page. Since the Fetch API doesn't exist on Node.js, we need to supply a compatible function to the data source. We use this opportunity to capture the list of URLs that the front-end accesses. Later, we'll use this list to determine whether a cached page has become out of date.

```javascript
async function generate(path, target) {
    console.log(`Regenerating page: ${path}`);
    // retrieve cached JSON through Nginx
    const host = NGINX_HOST;
    // create a fetch() that remembers the URLs used
    const sourceURLs = [];
    const agent = new HTTP.Agent({ keepAlive: true });
    const fetch = (url, options) => {
        if (url.startsWith(host)) {
            sourceURLs.push(url.substr(host.length));
            options = addHostHeader(options);
            options.agent = agent;
        }
        return CrossFetch(url, options);
    };
    const options = { host, path, target, fetch };
    const frontEndHTML = await FrontEnd.render(options);
    const htmlTemplate = await FS.readFileAsync(HTML_TEMPLATE, 'utf-8');
    let html = htmlTemplate.replace(``, frontEndHTML);
    if (target === 'hydrate') {
        // add