
My Bet on Rust has been Vindicated

Programmers navigating mountainous opinions on languages

As a side project, I’m writing a Europa Universalis IV (EU4) leaderboard and in-browser save file analyzer called Rakaly. No need to be familiar with the game or the app, but feel free to check Rakaly out if you want – we’re just getting started.

I’m writing this post because whenever Rust comes up on the internet, it tends to elicit polar opposite responses: the evangelists and the disillusioned. Both can be correct in their viewpoints, but when one is neck deep in a Rust side project and stumbles across a discussion that is particularly critical of Rust, self-doubt may creep in. And while self-doubt isn’t inherently bad (second guessing can be beneficial), too much of it can cause one to disengage from a hobby. This post is me sharing the struggles seen along the way and realizing that the wins Rust gave me far outweigh them.

One of the visualizations created by Rakaly based on game data

It may seem surprising that someone who has been programming Rust as a hobby for 4 years would feel the need for vindication now, but this project was my first user-facing app with over 15k lines of Rust, and while I’ve been programming Rust for a “long” time, I would only consider myself proficient and not an expert by any means.

Also for those who need to hear it: any side project that you’re proud of is a success.

As one might imagine from the title of this post, Rakaly is mainly written in Rust. This single Rust code base serves several use cases:

  • The server side app which receives an EU4 game save file, runs diagnostics, and stores the save.
  • The wasm bundle which runs the same diagnostics client side
  • A shared library that I can give to third party developers so they can integrate the functionality.

In total: a backend server, client side wasm, and a shared library all are derived from the same code base. I can’t imagine doing this in any other language.

Why the focus on client side

Before diving into Rust specifics, I want to take a moment to share my philosophy: it’s important to offer a service with minimal friction for both visitors and myself. A solution that can run self contained on the client is perfect. This is where Rust compiled to WASM (accessed through web workers) is critical.

Visitors can jump right in with just their browser without needing an account. No need to download a potentially sketchy executable or give credentials. They just drag and drop a file and all the processing is done inside the browser. Spotty internet, bandwidth limitations, or a slow or down server doesn’t inhibit the visitor’s experience.

For me a client side solution means that the server isn’t critical. A spike in visitors doesn’t necessarily mean a load spike on the server. If no cross-user state needs persistence, one may not even need a backend. That’s the route I took with the Rocket League replay parser I wrote (it can be seen hosted here, and there’s even a community written Python wrapper – I seem to like programming game tools more than playing them). With a leaderboard though, backend state is necessary for Rakaly. I take comfort in knowing that if I ever want to sunset the app, whether for cost or time savings, users will still be able to use it, as all the static assets are stored in Cloudflare (and the use of Cloudflare Workers and B2 will allow even uploaded game save files to remain usable).

For those curious about the architecture:

  • Cloudflare Workers (static assets + fetch S3 files)
  • Backblaze B2 (S3 compatible storage)
  • DigitalOcean (VPS)
  • Postgres (relational data)
  • Redis (leaderboard, session storage)

Cloudflare ❤️ Backblaze ❤️ DigitalOcean ❤️ Postgres ❤️ Redis

Enough architecture, more Rust.

Wins

Sharing. The headlining win is the shared core. Simplifying things a bit, there are essentially 4 crates in this project:

  • The parser which sets the foundation for all the other crates
  • The server app written with warp
  • A C compatible dynamic library with a nice header file created through cbindgen. Other parties have asked for Rakaly functionality (specifically in a C++ wx app) and this fits the bill perfectly.
  • The wasm interface that is translated by wasm-bindgen

Then there is the react frontend that communicates with the wasm through a web worker (can’t block the UI thread!). Additional use cases can be added with ease: I’m thinking about one day adding a native app (a different use case than the web), and there’ll be an incredible amount of code reuse.

All this sharing allows one-person development teams to be anywhere and everywhere at once. Nothing is off limits.
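
To make the sharing concrete, below is a minimal sketch of how one core function can be exposed over both a C ABI (the input to cbindgen) and wasm-bindgen. In the real project these wrappers live in their own crates, and the melt / rakaly_melt / melt_save names are illustrative rather than Rakaly’s actual API:

use std::ffi::CString;
use std::os::raw::c_char;

use wasm_bindgen::prelude::*;

// Shared core logic: save data in, report out. Illustrative only.
pub fn melt(data: &[u8]) -> Result<String, String> {
    // the real parsing is elided for this sketch
    Ok(format!("parsed {} bytes", data.len()))
}

// C ABI wrapper. cbindgen generates a header from this signature so a C or C++
// app can link against the compiled cdylib. The caller owns the returned string
// and must hand it back to a matching free function (omitted here).
#[no_mangle]
pub unsafe extern "C" fn rakaly_melt(ptr: *const u8, len: usize) -> *mut c_char {
    let data = std::slice::from_raw_parts(ptr, len);
    match melt(data) {
        Ok(out) => CString::new(out).unwrap().into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

// Wasm wrapper. wasm-bindgen (driven by wasm-pack) generates the JS glue and
// TypeScript declarations for the browser bundle.
#[wasm_bindgen]
pub fn melt_save(data: &[u8]) -> Result<String, JsValue> {
    melt(data).map_err(|e| JsValue::from_str(&e))
}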

Serde. I’m able to write regular structs that happen to hold deserialized data. It’s important to not have additional ceremony in parsing the many fields as there are over 5000 unique fields that occur in a save. Having to write any additional code to support a field would be overwhelming.

To give an example, below is the only struct needed to deserialize and work with the version that the game save file declares itself to be. Notice that it has the logic to deal with a misspelling in the data, but the code sees the correct spelling.

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq)]
pub struct SavegameVersion {
    pub first: u16,
    pub second: u16,
    pub third: u16,
    #[serde(rename(deserialize = "forth"))]
    pub fourth: u16,
    pub name: String,
}

That’s it. That’s 5 fields covered, only 4,995 more to go.

An additional benefit of serde is that it is parser agnostic. Save files come in two flavors – JSON-like plaintext and binary (where keys are identified by a 16 bit lookup table) – yet both formats are deserialized into the same struct. I shudder at the thought of where I would be in this project if I needed to duplicate every field, one for plaintext and another for binary. Serde is a huge productivity gain.
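
To illustrate what parser agnostic buys (a minimal sketch, with serde_json standing in for one of the formats since the real deserializers are project specific), the SavegameVersion struct above can be handed to any serde Deserializer without changing a single field:

use serde::{Deserialize, Deserializer};

// Assumes the SavegameVersion struct from above is in scope. Any serde
// Deserializer -- plaintext, binary, or serde_json as a stand-in here -- can
// drive the exact same definition.
fn version_from<'de, D: Deserializer<'de>>(de: D) -> Result<SavegameVersion, D::Error> {
    SavegameVersion::deserialize(de)
}

fn main() {
    let json = r#"{ "first": 1, "second": 30, "third": 4, "forth": 0, "name": "Emperor" }"#;
    let mut de = serde_json::Deserializer::from_str(json);
    let version = version_from(&mut de).unwrap();
    assert_eq!(version.fourth, 0);
}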

Speed. Save files are zip files that when inflated can reach in excess of 100MB. Parsing these files quickly is of the utmost importance. The same parsing code is executed on the client and server side and so must be fast for both. With Rust, files are parsed and deserialized at over 500 MB/s on the server side and around 100 MB/s through client side wasm. Could other languages be as fast or faster? Sure, especially if they are written in the style of simdjson, but out of the parsers I’ve seen or written myself in years past, none can hold a candle to the performance of Rust.

Tools. The tools that Rust and the Rust community provide are invaluable.

Below is a screenshot from a criterion run showing that the core parsing logic for the binary format is almost reaching 1000 MB/s.
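
For the curious, a criterion throughput benchmark is only a handful of lines. This is a hedged sketch – the parse_binary function and the fixture path are placeholders, not the project’s real names – but setting a byte throughput is what makes criterion report MB/s figures like the one above:

use criterion::{criterion_group, criterion_main, Criterion, Throughput};

// Placeholder for the real binary parser entry point.
fn parse_binary(data: &[u8]) -> usize {
    data.len()
}

fn binary_parsing(c: &mut Criterion) {
    // Hypothetical fixture path; the real benchmark feeds an actual save file.
    let data = std::fs::read("assets/saves/sample.eu4").unwrap();
    let mut group = c.benchmark_group("binary");
    // Reporting throughput turns criterion's timings into MB/s.
    group.throughput(Throughput::Bytes(data.len() as u64));
    group.bench_function("parse", |b| b.iter(|| parse_binary(&data)));
    group.finish();
}

criterion_group!(benches, binary_parsing);
criterion_main!(benches);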

Struggles

With Rust wins comes Rust struggles. Some will be new struggles that may not have been discussed yet, while others are well known, but it’s important to list them all as they all have a share in casting doubt.

Serde. Serde is both a win and struggle. The underlying format of the game save files is proprietary and undocumented – it’s JSON-like but definitely not JSON. Here’s a snippet.

{
    "core": "core1",
    "nums": [1, 2, 3, 4, 5],
    "core": "core2"
}

Notice the core field occurs multiple times and the occurrences don’t follow one another. The end result is that data should be deserialized into:

struct MyDocument {
  core: Vec<String>,
  nums: Vec<u8>,
}

Since serde doesn’t support aggregating multiple occurrences of a field, I needed to buffer the whole document in memory and provide a small translation layer where I’d gather similarly named fields in one place for serde. I wrote a small segment about this inside another post with the intimidating title: Parsing Performance Improvement with Tapes and Spatial Locality. While ideally I’d like to parse the document iteratively, it hasn’t been a blocker on either the client or server side (probably because people are already used to browsers being memory hogs).

2020-08-12 Update: a custom derive macro was created that will aggregate fields marked with #[jomini(duplicated)]. And while it would have been preferable to have this natively in serde, there ended up being a big enough performance improvement around edge cases to make the macro worth the time to maintain.
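
For completeness, here is a sketch of what that looks like for the snippet above. The derive and attribute names are as I understand the jomini crate’s macro; treat them as an assumption and consult the crate docs:

use jomini::JominiDeserialize;

// Fields tagged as duplicated are collected into a Vec even when their
// occurrences are scattered throughout the document.
#[derive(Debug, JominiDeserialize)]
pub struct MyDocument {
    #[jomini(duplicated)]
    core: Vec<String>,
    nums: Vec<u8>,
}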

Compile times. Everyone already expected this, so I’ll keep it short. There’s about 15k lines of Rust total, and building the server crate on my 8 core machine takes 9 minutes for a clean build and 6 minutes for an incremental one. It’s not great, but I haven’t found it intolerable yet. For coping mechanisms: write more code between rebuilds, move code into a lower level crate (which shaves compile times down to a hair under a minute, though there’s only so much that can be done when using crates like serde – serde seems full of pros and cons), or try to be productive during compiles and watch (Rust) youtube videos. On a lower specced machine, compile times do verge on intolerable – my mid-range laptop (about 7 years old) takes a bit over 25 minutes for a clean compile and will make the machine occasionally unresponsive. Been thinking about buying a new laptop.

Test execution times. This may or may not come as a surprise but when working with files that can reach 100MB (and this 100MB can’t really be whittled down), tests can be slow. I previously lauded Rust for how performant the code can be, but in debug mode the opposite is true. I don’t think I’ve seen a language with such a performance gap between debug and production mode. I’ll leave a snippet of my Cargo.toml to explain things.

# We override the test profile so that our tests run in a tolerable time as
# some of the asset files are heavyweight and can take a significant amount of
# time. Here is some timing data recorded to run one test:
#
# cargo test                 0m15.037s
# cargo test (opt-level=3)   0m9.644s
# cargo test (+lto=thin)     0m0.907s
# cargo test --release       0m0.620s
[profile.test]
opt-level = 3
lto = "thin"

So that’s 15s vs 0.6s to run one test. An incredible difference. The tests ran 25x faster in release mode, which may not make a big difference with minuscule tests (25x a tiny number can also be tiny), but these tests are anything but small. So the fix is essentially to configure tests to be optimized before running. As can be imagined, this has a detrimental effect on test compilation times, but at least I’m not waiting a half hour for the test suite to finish.

Conclusion

The moment of vindication came after I announced Rakaly to the world. 3rd party developers saw it and wanted some of the functionality in their C++ app – a use case I hadn’t considered. If I had written Rakaly in another language, I might have been hard pressed to meet it. Thankfully, it wasn’t hard and I was able to produce a shared library for them.

I do want to quickly mention that Go could satisfy every use case as well (I recently learned one can create shared libraries in Go), but it is discouraging that the official Go wiki states that the created wasm bundle will be a minimum of 2MB. TinyGo is listed as a solution, though my preference is still Rust as it has great official support for wasm.

So yes, it should go without saying that Rakaly could have been implemented in other languages (eg: C, C++, Go), but Rust was able to deliver the project quickly, safely, and without too many compromises. This is why I like Rust and will continue to use it, as it can be adapted to whatever I need next.


Reasons to Migrate away from Gatsby

Holding steady against the onslaught of Gatsby bloat and dependencies

Gatsby is a site generation framework. I was recently using it for one of my side projects, sff.life, a site dedicated to small form factor computing (eg: small computers). I decided to migrate away from Gatsby, as it is not a zero cost abstraction for static sites, for two reasons:

  • Needless bloat via javascript and JSON bundles
  • Way too many dependencies and releases

Before I give a bit more background and expand on these reasons, I want to note that I still believe that in many situations Gatsby’s benefits outweigh the costs. In this way, Gatsby isn’t “bad” so much as one (ie: me) should explore the simplest solution first.

Background

I had initially settled on Gatsby as I had used it for another site that I created for a friend. This friend is not technical, so any content they add has to be added in a user friendly way. Netlify CMS allows one to make edits in the browser, and when coupled with Gatsby the in-browser preview is quite convincing, since a Gatsby site is constructed through Javascript. This made for a great solution for the non-technical, and despite the downsides (which we’ll get to), I’d still consider deploying Gatsby + Netlify CMS for non-technical folks.

So since I already had working knowledge of Gatsby, I decided to use it for a project of my own, documenting my journey through SFF. I had grand visions of contributors who could help via a non-technical interface. It’s been a while since I’ve written content for it, as life has gotten in the way, but I plan to breathe life into it again in the fall. However, Gatsby has also gotten in the way.

Screenshot of sff.life below

Bloat

Gatsby is not lying when it claims that it is blazing fast. It is fast.

But there is a lot of room for improvement.

When using Gatsby, requesting the article How to undervolt your ryzen cpu results in 18 network requests and a total of 504 kB of resources. As a comparison, requesting the same article at the time of this post results in only 6 requests and 143 kB of resources. That’s roughly a 3x reduction in both the number of requests and data transfer – a marked improvement.

Where did all of the Gatsby bloat come from? Two major factors: javascript and JSON. And here’s the thing: I can’t tell you why Gatsby needs to bring down 300 kB of this stuff. For instance, there is a page-data.json that is fetched (100 kB) that contains the markdown from which the page was converted. Why a static site would need to see how it was rendered is beyond me, and I dug frantically in the gatsby docs over several days to find where this was coming from and remove it. No dice. This made me realize that gatsby adds too many abstractions on top of site generation for me to be comfortable with it.

I believe there are solutions (or more likely – there are plugins) that can help strip away needless javascript and JSON, but this will clearly be going against the Gatsby mantra.

Here are a couple of statements from the Gatsby team that I’d like to examine, as they illustrate the team’s viewpoint on JS.

From the creator of Gatsby:

Why remove JavaScript? That disables your React components on the client, slows down page transitions, prevents service workers from preloading page data, etc.

From the 2nd top contributor, on removing JS:

you would [lose] a lot of functionality like page prefetching and in-browser navigation, lazy image loading, etc.

I agree that javascript is an integral part of the web, and it’s important to use it to the full extent in web apps, but I would prefer it is not shoved down throats for sites that don’t need it.

For non-apps, I do not see the benefit to page prefetching and in-browser navigation. Visitors to a blog most likely arrived via a google search or a link and are interested in only that one page. Therefore priority number one should be loading the initial page as quick as possible so they can see the content. I do not understand why there would be a significant focus on prefetching and navigation – content is king, let’s focus on getting that in front of the user as fast as possible.

To address lazy loading images – I have good news: browsers can natively lazy load images without any JS, as shown below.

<img src="my-img.jpg" loading="lazy" />

70% of all users’ browsers support lazy loading images. This percentage will only increase with time, and while 70% is not as comfortable as 90, 95, or 99%, when a majority of users on a side project can take advantage of an emerging standard, I’m happy to drop the JS.

The Gatsby team isn’t blind to natively supported lazy load images as they will also tack on loading="lazy" (or is it via a plugin?) to the generated page.

I’m satisfied with how I’ve addressed their case for JS and shown that it’s not as widely applicable as they imagine. Interestingly, it also sounds like Gatsby advertises that it is faster than any equivalent site. From the creator of Gatsby:

PWAs are faster and use less data than a normal website. It’s a bit counterintuitive but in 2017, removing JavaScript makes your site slower and waste more data.

The quote is a bit dated, but I don’t think the author has changed opinions. The quote seems to imply that all sites should be PWAs. A List Apart’s “Yes, That Web Project Should Be a PWA” agrees with that sentiment, though it acknowledges that PWAs’ “oft-touted ‘major’ benefits […] are less applicable” for a site as simple as SFF.life. So when a site requires zero javascript, introducing javascript strictly for a PWA seems like an inefficient use of my time and the visitor’s resources.

I also struggle to follow the second half of that statement, “removing JavaScript makes your site slower and waste more data”. I wish the Gatsby creator had elaborated or provided a link, because intuitively, if I serve a page to a visitor who is only ever going to visit once and the page is as small as possible (no js, inlined css, lazy loaded images, etc), then adding anything at all will, by definition, make the site slower and consume more data.

I also have a hypothesis that executing javascript drains the battery more than not executing it. To me this seems like an obvious statement (doing something should be more taxing than doing nothing). On the topic of javascript and batteries, Chromium recently announced it is considering throttling the javascript timers of background tabs to conserve battery. While I believe Gatsby doesn’t use the affected APIs, it’s undeniable that javascript has a tangible effect on batteries.

Dependencies

To do anything in Gatsby, a plethora of plugins is needed. Sff.life, as simple as it is (a point I can’t stress enough), needed 21 direct dependencies. These 21 dependencies explode into a 21k line package-lock.json. A lockfile that large seems unmanageable: one of the recommendations for mitigating vulnerabilities introduced via lockfiles is to “Carefully review changes to lockfiles”, but that won’t be happening with a 21k line file for a side project.

Don’t get me wrong, I can appreciate a slim core with plenty of extensibility points; however, Gatsby takes it to the extreme. Less than 4% of the commits on the site are authored by me. I’ve made 40 commits, and dependency updates account for over 1000 commits. This is in the span of less than a year. That’s hard to stomach and makes me numb to dependency updates. I should be attentive, ensuring that nothing nefarious sneaks in, or just keeping up with what’s new and what bugs were fixed, but I just can’t. Waking up to hundreds of emails about new dependencies prompted me to create a separate email folder that is essentially black holed and to ignore github notifications.

Just take a look at the Gatsby Github page and see how many releases there are:

Yes over 13,000 releases. This is almost hard to fathom.

There’s a potential monetary cost to all these dependency updates. Previously I’ve praised the Netlify CMS + Gatsby combination, and one of its benefits is that whenever there was an update to the backing repo, Netlify would build and deploy the latest version. The issue is that when there are so many dependency updates, one quickly closes in on the max allowed build minutes on the free tier. I normally have no problem paying for services rendered, but this is not one of those times.

Maybe upselling builds is how GatsbyJS (the business) drives most of its revenue. But if it’s not already the case, it wouldn’t be a bad idea for Netlify (and maybe other CI / CD companies) to sponsor Gatsby. Make it seem like Gatsby should be used for every situation no matter how simple (a hobby project or personal blog) – then watch as users are unable to keep pace with dependency updates and are forced onto paid plans.

The Solution

Can we reduce the bloat and dependencies? If not then this would be a pretty somber post, but it turns out we can. While I will be recounting the solution I settled upon, keep in mind that any dedicated static site generator will probably be sufficient.

Without further ado, I replaced Gatsby with Hugo. You can view the pull request for in depth details on the change, but it was remarkably straightforward, mainly fixing file paths, and adding a custom shortcode so that images are lazy loaded and sized appropriately.

Some notes on why I chose Hugo:

Hugo is a single binary that can be downloaded and installed easily for any platform.

Hugo extended bundles a SASS compiler, so I could install sanitize.css via npm and write SASS that references this dependency. I have hardly any experience with SASS, but the fact that SASS is included out of the box and can reference node_modules made me ditch postcss in a heartbeat. The new package-lock.json file is now 13 lines.

Hugo has a built in image processing pipeline. I make good use of it to create an array of images at varying sizes so that users don’t have to waste data and time downloading a high fidelity image when a smaller one would suffice.

CI Build times have decreased as well. Here are some back of the envelope timings:

  • old (gatsby): 120 seconds
  • new (hugo): 30 seconds

I enjoy that Hugo abstracts away all of its internal dependencies. When Hugo creates a release, it’s implied that it is a bundle of internal dependencies that work well together. One is not always so lucky when piecing together a mish-mash of javascript dependencies.

Conclusion

Hopefully it’s a bit clearer where Gatsby doesn’t shine as brightly. I routinely see articles spinning Gatsby as the panacea for all web sites, and I just wanted to add a dissenting view to the conversation. To me, there’s a time and place for Gatsby, but it’s not for projects that desire minimal maintenance, cost, and bloat.


Opinionated Guide for Web Development with Rust Web Workers

Webpack configuration: look – don’t touch

I have a toy site that parses Rocket League replays using the rust crate boxcars. Parsing replays is done synchronously and shouldn’t block the browser’s UI thread, so parsing is offloaded to a web worker. Getting this side project to a point where it works + is maintainable (minimum configuration) has been an exercise in doggedness as I spent weeks exploring options.

I believe I’ve found a happy medium and here’s the recipe:

  • Wasm-pack for converting rust to js + wasm
  • Typescript. Refrain from anything special that can’t be compiled (eg: no css modules or importing images). jsx / tsx is fine. No need for babel
  • Use a highly trimmed webpack config to output fingerprinted files
  • Hugo to glue everything together and preprocess sass files and other assets

Here’s the philosophy: by minimizing configuration and dependencies while maintaining production quality, the project becomes easier to maintain.

To give a sneak peek, the entire pipeline can be executed as

wasm-pack build crate && tsc && mkdir -p data && webpack --mode production && hugo

or an npm run build.

Other than wasm-pack, everything else has basically been derived from a process of elimination, so let’s go through this recipe in detail.

Sample Code

Here is some sample code to demonstrate the core interaction between our UI and web assembly. First we need a web worker so that one can pass messages between the UI and worker thread.

src/index.tsx

// unused, but an example node_modules import
import { render, h } from "preact";

const replayWorker = new Worker("worker.js", { type: "module" });
replayWorker.postMessage("LOAD");

While { type: "module" } is a valid option for a worker, no browser supports it, so typescript will leave it in without complaint and have webpack split the worker into a separate bundle.

“worker.js” is not a typo even though the worker is written in typescript: typescript converts the code to js before it is passed to the webpack phase.

src/worker.ts

In our worker code we load the web assembly.

import { ReplayParser } from "./ReplayParser";

let parser: ReplayParser | null = null;
onmessage = async (e) => {
  switch (e.data) {
    case "LOAD":
      // dynamic import keeps the wasm out of the worker's initial bundle
      const module = await import("../crate/pkg/rl_wasm");
      parser = new ReplayParser(module);
      break;
  }
};

src/ReplayParser.ts

And we can use a neat typescript trick in ReplayParser to take advantage of the typescript declarations written by wasm-pack to ensure that our javascript is adhering to the wasm return types.

import * as rl_mod from "../crate/pkg/rl_wasm";
type RLMod = typeof rl_mod;
export class ReplayParser {
  mod: RLMod;
  constructor(mod: RLMod) {
    this.mod = mod;
  }
}
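
For context, here is an illustrative sketch of the Rust side of that boundary (the exported name and the body are placeholders rather than the real rl_wasm API): a single #[wasm_bindgen] function is all wasm-pack needs to emit both the wasm and the TypeScript declarations that RLMod picks up.

// crate/src/lib.rs -- illustrative sketch, not the real rl_wasm API
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub fn parse_replay(data: &[u8]) -> Result<String, JsValue> {
    // The real crate hands the bytes to boxcars; a stub keeps the sketch small.
    if data.is_empty() {
        return Err(JsValue::from_str("empty replay"));
    }
    Ok(format!("replay of {} bytes", data.len()))
}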

Configuration Files

I hate configuration files. I’ve tried to whittle them down, but they are necessary, so let’s get them out of the way.

tsconfig.json (remove comments if copying + pasting)

{
  "compilerOptions": {
    "strict": true,              // strict code quality
    "module": "esnext",          // allow for dynamic import of wasm bundle
    "moduleResolution": "node",  // allow imports of npm dependnecies
    "target": "es2018",          // Allow modern features as we're using wasm anyways
    "jsx": "react",              // I like preact so these options are for writing tsx
    "jsxFactory": "h",
    "outDir": "./dist/"          // output directory for webpack to consume
  },
  "include": ["src/**/*"]
}

The typescript phase can be run with npx tsc.

webpack.config.js

const path = require("path");
const WorkerPlugin = require("worker-plugin");
const AssetsPlugin = require("assets-webpack-plugin");

module.exports = {
  entry: "./dist/index.js", // the output of our typescript phase
  devtool: "source-map",
  plugins: [
    new WorkerPlugin(),
    new AssetsPlugin({ filename: "data/webpack.json" }),
  ],
  output: {
    filename: "[name].[contenthash].js",
    path: path.join(__dirname, "static", "js"),
    publicPath: "/js/",
  },
};

There are only two webpack plugins needed to accomplish this:

  • worker-plugin so that webpack correctly splits the worker script into a fingerprinted file that also can correctly import a fingerprinted web assembly module
  • assets-webpack-plugin which will output a file that contains a map of original js filenames to the fingerprinted ones for the downstream hugo process.

That’s it for webpack. This all can be accomplished with one of the shortest webpack configs I’ve ever seen.

Couple things to note:

  • The source maps are distributed into production. I have no problem with this and neither should you.
  • The source maps will be of the compiled javascript (not original typescript source). Since typescript generates modern javascript that is quite close to the original typescript, one shouldn’t have a problem stepping through the code.

The webpack phase can be run with npx webpack --mode production.

Directory structure

The project directory should be setup with these folders:

  • assets: owned by hugo. Place files that will be processed by hugo here (images, sass, etc)
  • crate: our rust crate that wasm-pack compiles to wasm
  • cypress: (optional) used for end to end integration testing.
  • data: owned by hugo but generated by the build process. We configure webpack to output a webpack.json here which contains a mapping of original source file names to fingerprinted ones.
  • dev: (optional) miscellaneous files (eg: nginx.conf used for static hosting) or files used in multiple places to be symlinked.
  • layouts: owned by hugo. Place an index.html in this directory.
  • src: typescript code
  • static: owned by hugo. Where one sticks static content, so this is where we’ll put the resulting js bundles written by webpack as well as any static images
  • tests: unit tests written with jest

Hugo

Let’s take a look at the most unorthodox piece of this puzzle – using the static site generator hugo (specifically hugo extended). Over time I’ve grown more sure of the decision to use hugo, for a few reasons:

  • Hugo’s philosophy of convention over configuration means one can introduce it with very little configuration.
  • Hugo is distributed as a single binary for all platforms
  • Hugo (extended) has a built in Sass preprocessor, so no additional dependencies or configuration for css processing
  • Hugo can fingerprint assets (which ruled out the Rust alternative, zola)
  • Hugo can work hand in hand with webpack through the data/webpack.json file generated by assets-webpack-plugin, so hugo can insert the fingerprinted js links
  • A static site generator is purpose made to glue everything together into a cohesive site through a built in templating language – something webpack needs plugin after plugin and tons of configuration to try and replicate.

Aside: the reason why we need webpack to fingerprint the js (instead of having hugo do it) is that webpack also spits out multiple fingerprinted js files even with a single entry point (eg: web worker, wasm, extracted vendor dependencies)

Those familiar with the hugo ecosystem may recognize this as a heavily simplified victor-hugo setup. In my opinion this guide is better due to its simplicity; being opinionated can drastically decrease cognitive overhead. That is the goal here. Supporting everything under the sun leads to too much magic or configuration.

Yes it seems like overkill to introduce a static site generator for this project, but I believe it is the right approach. There is no magic glue between any of the components. Nothing really to go wrong.

I see people complain about the complexity of hugo and all the ceremony it takes to set it up but I only had to move around some files (see directory structure section described above) and add a single config.toml file. Brace yourself, it’s long:

disableKinds = ["taxonomy", "taxonomyTerm", "RSS", "sitemap"]

Even this isn’t required, but for a simple project I don’t want all those extra files generated.

To import our main javascript into the app:

<script type="text/javascript" src={{ .Site.Data.webpack.main.js | relURL }}></script>

Hugo fits nicely into the equation. Hearing that hugo is a static site generator may scare off SPA enthusiasts, but don’t be scared – it works well for a SPA setup.

JS Bundlers

I’ve tried to love the parcel bundler. Its zero configuration setup works well most of the time, and while it still has a place for standard projects that need to hit the ground running, I’ve grown disillusioned with it for Rust WebAssembly purposes:

  • Bizarre, baroque, and hard to reproduce errors
  • Incorrect cache busting causing stale files to break the production web app (I’ve had to resort to a poor person’s implementation of an asset pipeline: sha256 + sed)
  • And a whole host of other problems revolving around rust.
  • Specifying a typescript compilation target means nothing without a browserslist.

Such began the search for other bundlers. I dabbled in the “no bundler” snowpack, but its reliance on etags instead of fingerprinting assets reflects poorly in lighthouse performance audits (though this may be changing). Also the potential for loading each source file as a separate request is frightening. I need fingerprinting.

Webpack is a natural replacement for parcel, except I have a big gripe with webpack: its configuration. I’m sure if you polled developers on how they feel about webpack configuration, the answer would be one word: magic. I find that configs are written once and fingers are crossed that they’re perfect for eternity. I’m not ok with trading one type of magic (parcel’s one size fits all) for another (webpack’s unintelligible config).

It seems every day there is a new article about how to configure webpack (the irony of me writing an article on the same topic isn’t lost on me). Here are some top results for webpack: “How to configure Webpack 4 from scratch”, “A tale of Webpack 4 and how finally configure it”, “An annotated webpack 4 config” – these articles contain bloated dependencies and configs. The last example in particular tops out at 60 dependencies and a 130 line config just to build the project. It can feel like webpack is a hammer and everything is a nail. No, you don’t need hot module replacement (HMR), css modules, mdx, babel, or a separate dev / prod config. Keep it simple.

There are a ton of webpack intermediaries that claim a zero or simplified config: create-react-app, nwb, neutrino, poi – but they all fell short and left a sour taste in my mouth, as these intermediaries only service their own use cases, and rust + webassembly isn’t one of them. I spent way more time trying to simplify a webpack setup through these tools than if I had just written it myself.

Ts-loader

One can consolidate the typescript and webpack invocations into the webpack config by adding a dash of complexity and the ts-loader dependency. Personally, I don’t see the need for this consolidation, as the “simplest setup” is still far too long to be a replacement for a single tsc command. Here is what our config would look like if we added in ts-loader:

Not recommended

const path = require("path");
const WorkerPlugin = require("worker-plugin");
const AssetsPlugin = require("assets-webpack-plugin");

module.exports = {
  entry: "./src/index.tsx",
  devtool: "source-map",
  module: {
    rules: [
      {
        test: /\.tsx?$/,
        loader: "ts-loader",
      },
    ],
  },
  resolve: {
    extensions: [".ts", ".tsx", ".js"],
  },
  plugins: [
    new WorkerPlugin(),
    new AssetsPlugin({ filename: "data/webpack.json" }),
  ],
  output: {
    filename: "[name].[contenthash].js",
    path: path.join(__dirname, "static", "js"),
    publicPath: "/js/",
  },
};

The benefits of ts-loader:

  • Sourcemaps can now contain the original typescript code (may need to update tsconfig.json to enable source maps)

If it had been a single new TsLoaderPlugin() then it might be worth considering, but the extra configuration seems a bit much.

Typescript

I write my projects in Typescript because I like static types.

SASS

Sass is not my first choice for a css preprocessor. I had gotten quite fond of using postcss in other projects, but I was not fond of adding the necessary configuration + multiple dependencies. If Hugo had an out of the box experience with postcss, I would have gone with that, but it doesn’t – it requires configuration and requires the postcss dependency to be installed out of band.

So sass it is. At first I was grimacing at the thought of the work of converting from the chosen postcss dialect to scss, but it turns out that browsers have really good support for modern css features, so there was actually a minimal amount of conversion necessary.

Sass usage in hugo is quite pleasant. I define a variable to hold a dictionary of options dictating that we want includes from “node_modules” and that we want the resulting css compressed:

{{ $opts := (dict "outputStyle" "compressed" "includePaths" (slice "node_modules")) }}

Critical css can have its own stylesheet and be inlined:

{{ $critical := resources.Get "critical.scss" | toCSS $opts }}
<style>{{$critical.Content | safeCSS }}</style>

With the rest of the app’s styles coming later.

{{ $app_css := resources.Get "app.scss" | toCSS $opts | fingerprint }}
<link rel="stylesheet" type="text/css" href="{{ $app_css.RelPermalink }}" />

One can also create a stylesheet that vendors all 3rd party stylesheets so small app css updates don’t trigger a re-download of a potentially heavy 3rd party stylesheet.

Testing

Cypress is used for end to end tests against the actual site. These tests confirm the ability of the WebAssembly and js to function against real world input. Integration tests allow the project to skip component testing in favor of real life interactions. With no component tests, the project is free to swap out the internals (ie: move to a different framework or eschew all of them) without invalidating any tests (easy maintenance). There are still unit tests, but only those that don’t use the DOM.

I’m not too opinionated with unit tests. Ideally js would have a built in test runner kinda like rust, but for now jest with ts-jest is fine (ts-jest is necessary until jest natively supports es6 modules). Removing jest from the dependencies shrank a package-lock.json from 12k to 6k lines, which is incredible, so there is still a lot of room left for improvement in unit testing.

FAQ

How to handle cross platform contributors

The command posted earlier:

wasm-pack build crate && tsc && mkdir -p data && webpack --mode production && hugo

Is not cross platform and leaves Windows users out in the cold. This can be ok if the project is for yourself and you don’t use windows. Don’t sweat the details. With WSL so popular these days, it’s not even much of a problem.

If native windows development is absolutely necessary, one can write cross platform node js scripts.

Development & Livereload

Rust and typescript aren’t known for their quick compilation times, and these lags can be felt, as even the tiniest change necessitates an entire rebuild with npm run build. It would be nice not to have to execute the rust toolchain when only a typescript file changed. When I edit frontend code, I like to see the browser refreshed quickly with my changes without effort on my part.

Fortunately, 3 out of the 4 tools used so far natively support a watch mode, and they meld together nicely. Below is a script I have that is aliased to npm start:

#!/bin/bash

npm run build
npx tsc --watch --preserveWatchOutput &
npx webpack --mode production --progress --watch &
hugo serve --renderToDisk

Modifying a typescript file causes a quick site reload. Perfect for quick iteration. Even though I exclusively use webpack in production mode, I’ve not found watch recompilation times to be an issue.

The one tool that doesn’t support a watch mode is wasm-pack, but it is a known issue with workarounds using fswatch, cargo-watch, or entr, etc.

while true; do
  ls -R -d crate/Cargo.toml crate/Cargo.lock crate/src/ \
    | entr -d wasm-pack build crate;
done

In practice, I have not needed to use the above workaround as it’s been intuitive enough for me to execute wasm-pack build after any rust modifications. After wasm-pack is done executing, webpack sees the wasm was modified and automatically picks it up and a new site is generated.

How to deploy

I prefer deploying sites by packaging them up into docker images. No exception here, just ensure that no generated files are copied into the dockerfile to aid reproducibility.

If instead you are uploading the public/ folder directly someplace, it’s probably a good idea to clean out the webpack generated static/js on every build with the clean-webpack-plugin. Since I don’t consider this essential, I’ve excluded it from the posted webpack config. And when the additional configuration is approximately the same size as the equivalent command (rm -rf ./static/js), the configuration is preferred. Anyways, here is the config if one wants to clean the directory on every build.

const path = require("path");
const WorkerPlugin = require("worker-plugin");
const AssetsPlugin = require("assets-webpack-plugin");
const { CleanWebpackPlugin } = require('clean-webpack-plugin');

module.exports = {
  entry: "./dist/index.js", // the output of our typescript phase
  devtool: "source-map",
  plugins: [
    new CleanWebpackPlugin(),
    new WorkerPlugin(),
    new AssetsPlugin({ filename: "data/webpack.json" }),
  ],
  output: {
    filename: "[name].[contenthash].js",
    path: path.join(__dirname, "static", "js"),
    publicPath: "/js/",
  },
};

Splitting Vendored JS

In the event that you are embedding heavy dependencies that don’t change at nearly the same rate as your application, one can split the dependencies out in order to reduce the amount of data that users need to pull down when the app is deployed. One can do this with an optimization.splitChunks:

  optimization: { splitChunks: { chunks: "all" } },

That’s it. Don’t get carried away by all the options. Full config shown below:

const path = require("path");
const WorkerPlugin = require("worker-plugin");
const AssetsPlugin = require("assets-webpack-plugin");

module.exports = {
  entry: "./dist/index.js",
  devtool: "source-map",
  plugins: [
    new WorkerPlugin(),
    new AssetsPlugin({ filename: "data/webpack.json" }),
  ],
  output: {
    filename: "[name].[contenthash].js",
    path: path.join(__dirname, "static", "js"),
    publicPath: "/js/",
  },
  optimization: { splitChunks: { chunks: "all" } },
};

Again, only use this optimization if dependencies significantly outweigh the app and one deploys often. I have a preact app where I don’t use this trick, as there really isn’t any dependency baggage, so there isn’t a reason to split dependencies off. Other times it’s a fat react + antd app where I don’t want my users to download those dependencies again if I simply fix a typo in the app.

Proxy backend requests

When developing a frontend, it is beneficial to interact with the backend. While this can vary greatly from project to project, for my projects I’m able to reuse everything that is used for production. I have a docker container for housing the API and containers for S3 (minio), postgres, and redis. All of these are orchestrated by docker compose.

And since I’m already using nginx to host the frontend assets, I can simply bind mount /usr/share/nginx/html to the ./public directory to serve all the latest assets as they change. This is why the start.sh script shown earlier has hugo use --renderToDisk.

The last thing I do is have a traefik container which is the entrypoint for all requests; it routes /api requests to the api server and / to the nginx server.

I’m quite satisfied with the result:

  • Purpose built tools watching their domain for any changes
  • Re-using production containers for development

Linting / Formatting / Other tools

For formatting, run prettier without installing as a dev dependency. Pinning a formatter to a specific version seems a tad asinine, as one should always be using the latest version. Just like how I don’t customize rustfmt through rustfmt.toml, don’t customize prettier, as it can be confusing where configuration is located (eg package.json, .prettierrc, prettier.config.js) and serves only to pollute the project directory.

I eschew linting in frontend projects. It’s a bit of a surprise considering how much I love clippy for rust projects, but I don’t have any love for eslint. Every guide on how to get started with eslint, typescript, prettier, react, etc has one installing a dozen linting dependencies and crafting a configuration that can reach hundreds of lines. This is not ok. My preference would be for an opinionated, self-contained tool that does not need to be added as a dev dependency to run lints. Rome can’t come fast enough. Until then, typescript type checking + prettier get me most of the way there.

As for other tools – when in doubt don’t use them:

  • You don’t need css modules when you can write regular sass files or attach styles directly to the component
  • You don’t need to Extract & Inline Critical-path CSS in HTML pages: split up your sass stylesheets and have hugo inline the style in the html.
  • You don’t need to Remove unused CSS when you import stylesheets of just the components you need
  • You don’t need dynamically loaded images fingerprinted (this is where etags do come in handy)
  • You don’t need it. It’s not worth maintaining an extra dependency + configuration

CSS in JS

Thus far, I’ve been mainly advocating for a separate sass stylesheet, and I believe that one should continue using a separate stylesheet. However, I recognize the arguments about the shortcomings of css. So with that said, I have some thoughts if one finds css in js a necessity:

  • Don’t use styled-components, as one “will have to do a little bit of configuration” to use with Typescript.
  • Consider avoiding TypeStyle even though it is a css in js library that has a deep integration with Typescript, as developer mind share will be less due to the library targeting a specific language and the slow pace of development.
  • styled-jsx requires configuration and doesn’t play nice with typescript.

The one css in js library that checked most of the boxes is the framework agnostic version of emotion, mainly due to “no additional setup, babel plugin, or other config changes [required].” I can drop it into any project and expect it to work. The same can’t be said for the @emotion/core package, which needs the magic comment /** @jsx jsx */ at the head of the file in order to work (and I was unable to witness the css typechecking promised by the @emotion/core package).

There is no pressure to jump on the css in js bandwagon. I had a pure preact project, migrated to css in js with emotion, and then migrated back to a standalone stylesheet, as the styles in that project hadn’t grown unruly enough for me to feel the need for a css in js solution. Emotion is relatively lightweight, but for very lightweight sites its effect on the bundle size can be felt.

In Conclusion

It may seem like I’m bending over backwards to find pieces that fit the web development puzzle, but I’m left quite satisfied. Every tool chosen (wasm-pack, typescript, webpack, hugo, emotion (optional)) is industry standard, and no one can say any of them is an obscure or bad choice. I like to lay out the investigative process, which means weeding out the tools that don’t work.

Some parting words of advice:

  • Keep the project simple
  • You don’t need that configuration
  • You don’t need that dependency
  • Use purpose built tools for the job

Yes, there will be those who clearly shouldn’t follow this recipe, but I needed to reduce my overhead as I switched between projects, and reducing magic through fewer dependencies and less configuration has proven valuable.