
Lessons learned: ZFS, databases, and backups

The graves of data without backups

I have a self-hosted Nextcloud instance for cloud storage installed on a ZFS RAID 6 array. I use rclone to keep my laptop in sync with my cloud. I was setting up a new computer and wanted a local copy of the cloud, so I executed rclone sync . nextcloud:. This ended up deleting a good chunk of my cloud files. The correct command was rclone sync nextcloud: .. The manual for rclone sync includes this snippet:

Important: Since this can cause data loss, test first with the --dry-run flag to see exactly what would be copied and deleted.

✅ - Lesson: Prefer rclone copy or rclone copyto where possible as they do not delete files.
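For illustration, the safer workflow might look like the following sketch (same hypothetical nextcloud: remote; the flags are standard rclone):

# Preview exactly what would change before touching anything
rclone copy nextcloud: . --dry-run

# copy transfers new and changed files but never deletes;
# sync mirrors the source, deleting anything that doesn't match
rclone copy nextcloud: .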

Oof. Now that I had just deleted a bunch of files, it became a test to see if I could restore them. Since I use zfs-auto-snapshot, I figured rolling back to the most recent snapshot would fix the problem. So I logged onto the server and ran zfs list:

NAME                        USED  AVAIL  REFER  MOUNTPOINT
tank                       1.03T  9.36T   941G  /tank

I have only a single ZFS dataset, so if I rolled back to a snapshot, I'd be rolling back every application, database, and media file to a certain point in time. Since I had just executed the erroneous rclone command, I thought it safe to roll back everything to the previous snapshot, taken a few minutes prior. So I did it.

✅ - Lesson: Use more datasets. Datasets are cheap and can each have a different configuration (sharing, compression, snapshots, etc). The FreeBSD handbook on ZFS states:

The only drawbacks to having an extremely large number of datasets is that some commands like zfs list will be slower, and the mounting of hundreds or even thousands of datasets can slow the FreeBSD boot process. […] Destroying a dataset is much quicker than deleting all of the files that reside on the dataset, as it does not involve scanning all of the files and updating all of the corresponding metadata.
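To make the lesson concrete, here is a sketch of how I could have split up the single tank dataset (the dataset names are illustrative):

# Each dataset gets its own properties and can be rolled back independently
zfs create tank/applications
zfs create tank/databases
zfs create tank/database-backups

# Databases opt out of automatic snapshots (more on why below);
# the backup dumps stay snapshotted
zfs set com.sun:auto-snapshot=false tank/databases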

I regretted rolling back. I opened up Nextcloud to see a blank screen. Nextcloud relies on MySQL and logs showed severe MySQL errors. Uh oh, why would MySQL be broken when it had been working at the provided snapshot? MySQL wouldn’t start. Without too much thought I incremented innodb_force_recovery all the way to 5 to get it to start, but then no data was visible. I had no database backups.

✅ - Lesson: Always make database backups using proper database tools (mysqldump, pg_dumpall, .backup). Store these in a snapshotted directory in case you need to rollback the backup.
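A minimal sketch of such a backup job (paths and names are illustrative; run the appropriate line from cron):

# MySQL: dump all databases into the snapshotted backup dataset
mysqldump --all-databases > /tank/database-backups/mysql-$(date +%F).sql

# PostgreSQL equivalent
pg_dumpall > /tank/database-backups/postgres-$(date +%F).sql

# SQLite: .backup takes a consistent copy even while the app is running
sqlite3 /tank/databases/app.db ".backup /tank/database-backups/app-$(date +%F).db"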

So I scrapped that database, but why had it gone awry? Here I only have hypotheses. The internet is not abundant with technicians diagnosing why a file system snapshot of a database went bad, but here are some good leads. A zfs snapshot is not instantaneous. A database has a data file and several logs that ensure that power loss doesn't cause any corruption. However, if the database and these logs get out of sync (like they might with a snapshot), you might see the database try to insert data into unavailable space. I say "might" because with a low volume application or snapshotting at just the right time, the files may be in sync and you won't see this problem.

✅ - Lesson: If you are taking automatic zfs snapshots do not take snapshots of datasets containing databases: zfs set com.sun:auto-snapshot=false tank/containers-db

I went back through the initial installation for Nextcloud. Thankfully, it recognized all the files restored from the snapshot. I thought my troubles were over, but no such luck. I wrote an application called rrinlog that ingests nginx logs and exposes metrics for Grafana (previously blogged: Replacing Elasticsearch with Rust and SQLite). This application uses SQLite with journal_mode=WAL, and I started noticing that writes didn't go through. They didn't fail, they just didn't insert! Well, from the application's perspective the data appeared to insert, but I couldn't SELECT it. A VACUUM remarked that the database was corrupt.

✅ - Lesson: SQLite, while heavily resistant to corruption, is still susceptible, so don’t forget to backup SQLite databases too!

Maybe it's a bug in the library that I'm using or maybe it's a SQLite bug. An error should have been raised somewhere along the way, so I could have caught the issue earlier and not lost as much data. The next step was to recover what data I had left using .backup. Annoyingly, this backup ended with a ROLLBACK statement, so I needed to hand edit the backup.
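For reference, here is a sketch of that recovery dance, assuming the SQL-text route via sqlite3's .dump (file names are illustrative):

# Dump whatever sqlite3 can still read out of the corrupt database.
# On error the dump terminates with "ROLLBACK; -- due to errors",
# which would discard the whole import, hence the hand edit to COMMIT.
sqlite3 rrinlog.db '.dump' > recovered.sql
sed -i 's/^ROLLBACK;.*/COMMIT;/' recovered.sql

# Rebuild a fresh database from the salvaged SQL
sqlite3 rrinlog-recovered.db < recovered.sql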

After these trials I’ve changed my directory structure a little bit and applied all the lessons learned:

|- applications (snapshotted)
|- databases (not snapshotted)
|- database-backups (snapshotted)

It's always a shame when one has to undergo a bit of stress in order to realize best practices, but the hope is that, having had this experience, I'll apply these practices in round 1 instead of round 2 next time.


Routing Select Docker Containers through Wireguard VPN

Wouldn’t be a docker post without an image of a ship

Scenario: You have a host running many Docker containers. Several sets of these containers need to route traffic through different VPNs. Below I’ll describe my solution that doesn’t resort to VMs and doesn’t require modification to any docker images.

This post assumes that one has already set up working wireguard servers, and will focus only on the client side. For a quick wireguard intro, see WireGuard VPN Walkthrough.

Solution #1

If you know the openvpn client trick then this will look familiar. We're going to create a Wireguard container and link all desired containers to this Wireguard container.

First we’re going to create a Wireguard Dockerfile:

FROM ubuntu:16.04

RUN apt-get update && \
    apt-get install -y software-properties-common debconf-utils iptables curl && \
    add-apt-repository --yes ppa:wireguard/wireguard && \
    apt-get update && \
    echo resolvconf resolvconf/linkify-resolvconf boolean false | debconf-set-selections && \
    apt-get install -y iproute2 wireguard-dkms wireguard-tools curl resolvconf

COPY wgnet0.conf /etc/wireguard/.
COPY startup.sh /.

EXPOSE <PORT-LIST>

ENTRYPOINT ["/startup.sh"]

Some notes about this Dockerfile:

  • One needs to supply wgnet0.conf (described in previously linked post), as it’ll contain VPN configuration
  • The EXPOSE contains all the applications in the VPN that will need their port exposed. Feel free to omit EXPOSE as it’s completely optional
  • The startup.sh script connects to the VPN, verifies that the container’s external IP address is the same as the VPN server’s, and stops the container if not. Implementation below.
#!/bin/bash
set -euo pipefail

wg-quick up wgnet0

VPN_IP=$(grep -Po 'Endpoint\s=\s\K[^:]*' /etc/wireguard/wgnet0.conf)

function finish {
    echo "$(date): Shutting down vpn"
    wg-quick down wgnet0
}

# Our IP address should be the VPN endpoint for the duration of the
# container, so this function will give us a true or false if our IP is
# actually the same as the VPN's
function has_vpn_ip {
    curl --silent --show-error --retry 10 --fail http://checkip.dyndns.com/ | \
        grep "$VPN_IP"
}

# If our container exits, is terminated, or is interrupted, we'll be tidy
# and bring down the vpn
trap finish EXIT
trap 'exit 1' TERM INT

# Every minute, check that our IP address is still the VPN's
while has_vpn_ip; do
    sleep 60;
done

echo "$(date): VPN IP address not detected"

Some notes:

  • Every minute we check to see what our IP address is from dyndns.com. This is the same endpoint that the dynamic dns client, ddclient, uses.
  • A more efficient kill switch would integrate with iptables to ensure that all traffic is routed via the VPN. But don’t use the example provided in wg-quick, as that will block traffic from the host (192.168.1.x) to the containers. I have a cool trick later on with port forwarding.
  • When the script exits, it brings down the VPN

The best way to see this in action is through a docker compose file. We’ll have grafana traffic routed through the VPN.

version: '3'
services:
  wireguard:
    container_name: 'wireguard'
    build: .
    restart: 'unless-stopped'
    sysctls:
      - "net.ipv4.conf.all.rp_filter=2"
    cap_add:
      - net_admin 
      - sys_module
    ports:
      - '3000:3000' # grafana
  grafana:
    container_name: 'grafana'
    image: 'grafana/grafana'
    restart: 'unless-stopped'
    network_mode: "service:wireguard"

Notes:

  • Just like the OpenVPN solution, we need the NET_ADMIN capability
  • Since Wireguard is a kernel module we need the SYS_MODULE capability too
  • See the sysctls configuration? This affects only the container’s networking.
  • All ports exposed for the wireguard container are the ones that would normally be exposed in other services. For instance, I have wireguard exposing the grafana port 3000.
  • network_mode: "service:wireguard" is the magic that has grafana use the wireguard vpn

When dependent services bind to wireguard’s network, they are binding to that container’s id. If you rebuild the wireguard container, you’ll need to rebuild all dependent containers. This is somewhat annoying. Ideally, they would bind to whichever container currently has the name wireguard.
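In practice, the rebuild dance looks something like this sketch (service names from the compose file above):

# Rebuild the vpn container, then recreate anything bound to its network
docker-compose up -d --build wireguard
docker-compose up -d --force-recreate grafana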

Quick quiz, which of these addresses will resolve to our grafana instance (taken from the host machine)?

localhost
[host-ip:192.168.1.6]
[vpn-ip:10.192.122.2]
[vpn-external-ip]
[wireguard-docker-interface:172.19.0.1]
[docker0:172.17.0.1]

Correct answers:

localhost
[wireguard-docker-interface:172.19.0.1]
[docker0:172.17.0.1]

If I hadn’t run the experiment, I would have gotten this wrong!

If we log onto the VPN server, we see that only curling our client IP address will return Grafana. This is good news: it means we are not accidentally exposing services on our VPN’s external IP address. It also allows a cool trick to see the services locally through the host machine without being on the VPN. Normally one would put in http://host-ip.com:3000 to see Grafana, but as we just discovered, that no longer routes to Grafana because it lives on the VPN. We can, however, ssh into the host machine and port forward localhost:3000 to great success!
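The port forward is a one-liner; a sketch with a hypothetical user and the host IP from the quiz above:

# Forward local port 3000 to the host's localhost:3000, which (per the
# quiz) still resolves to Grafana
ssh -L 3000:localhost:3000 user@192.168.1.6
# then browse to http://localhost:3000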

The end result gives me a good feeling about the security of the implementation. There is no way someone can access the services routed through the VPN unless they are also on the VPN, they are on the host machine, or port forward to the host machine. We have a rudimentary kill switch as well for some more comfort.

Solution #2

Our second solution will involve installing Wireguard on the host machine. This requires gcc and other build tools, which is annoying as the whole point of docker is to keep hosts disposable, but we’ll see how this solution shakes out as it has some nice properties too.

Initial plans were to follow Wireguard’s official Routing & Network Namespace Integration, as it explicitly mentions docker as a use case, but it’s light on docker instructions. It mentions only using the pid of a docker process. This doesn’t seem like the “docker” approach, as it’s cumbersome. If you are a linux networking guru, I may be missing something obvious, and this may be your most viable solution. For mere mortals like myself, I’ll show a similar approach, but more docker friendly.

First, wg-quick has been my crutch, as it abstracts away some of the routing configuration. Running wg-quick up wgnet0 to have all traffic routed through the Wireguard interface is a desirable property, and it was a struggle to figure out how to route only select traffic.

For those coming from wg-quick: we’re going to be doing things manually, so to avoid confusion I’m going to be creating another interface called wg1. Our beloved DNS and Address configurations found in wgnet0.conf have to be commented out, as they are wg-quick specific. These settings are written out explicitly in the manual invocation:

ip link add dev wg1 type wireguard
wg setconf wg1 /etc/wireguard/wg1.conf
ip address add 10.192.122.2/24 dev wg1
ip link set up dev wg1
printf 'nameserver %s\n' '10.192.122.1' | resolvconf -a tun.wg1 -m 0 -x
sysctl -w net.ipv4.conf.all.rp_filter=2

At this point if your VPN is hosted externally you can test that the Wireguard interface is working by comparing these two outputs:

curl 'http://httpbin.org/ip'
curl --interface wg1 'http://httpbin.org/ip'

For my future self, I’m going to break down what just happened by annotating the commands.

# Create a wireguard interface (device) named `wg1`. The kernel knows what a 
# wireguard interface is as we've already installed the kernel module
ip link add dev wg1 type wireguard

# Point our new wireguard interface at the VPN server and allocate addresses
# for the interface
wg setconf wg1 /etc/wireguard/wg1.conf
ip address add 10.192.122.2/24 dev wg1

# Start the interface and add the VPN server as our DNS nameserver. This is so
# our VPN will resolve hostnames like httpbin.org or google.com.
ip link set up dev wg1
printf 'nameserver %s\n' '10.192.122.1' | resolvconf -a tun.wg1 -m 0 -x

# rp_filter is reverse path filtering. By default it will ensure that the
# source of the received packet belongs to the receiving interface. While a nice
# default, it will block data for our VPN client. By switching it to '2' we only
# drop the packet if it is not routable through any of the defined interfaces.
sysctl -w net.ipv4.conf.all.rp_filter=2

Now for the docker fun. We’re going to create a new docker network for our VPN docker containers:

docker network create docker-vpn0 --subnet 10.193.0.0/16

Now to route traffic for docker-vpn0 through our new wg1 interface:

ip rule add from 10.193.0.0/16 table 200
ip route add default via 10.192.122.2 table 200

My layman understanding is that the rule sends traffic originating from our docker subnet to routing table 200, kinda like fwmark. We then set the default route in that table to our wg1 interface. The default route allows the docker subnet to reach unknown IPs and hosts (ie. everything that is not a docker container in the 10.193.0.0/16 space). Because this lookup is more specific than the main routing table, data is routed through wg1 instead of eth0.
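To double-check the plumbing, both of these commands are read-only:

# Our new rule shows up alongside the default lookup rules
ip rule show

# Table 200 should contain just our default route via wg1
ip route show table 200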

You can test it out with:

docker run -ti --rm --net=docker-vpn0 appropriate/curl http://httpbin.org/ip

Once we docker network remove docker-vpn0 (the compose file below will recreate it), we can slim down our docker compose file.

version: '3'
services:
  grafana:
    container_name: 'grafana'
    image: 'grafana/grafana'
    restart: 'unless-stopped'
    ports:
      - '3000:3000'
    networks:
      wireguard: {}
networks:
  wireguard:
    ipam:
      config:
        - subnet: 10.193.0.0/16

Now when we bring up grafana, it will automatically be connected through the VPN thanks to the subnet routing.

If we want to bring the VPN up on boot, we need to create /etc/network/interfaces.d/wg1 encoding the same commands:

auto wg1
iface wg1 inet manual
pre-up ip link add dev wg1 type wireguard
pre-up wg setconf wg1 /etc/wireguard/wg1.conf
pre-up ip address add 10.192.122.2/24 dev wg1
up ip link set up dev wg1
post-up /bin/bash -c "printf 'nameserver %s\n' '10.192.122.1' | resolvconf -a tun.wg1 -m 0 -x"
post-up ip rule add from 10.193.0.0/16 table 200
post-up ip route add default via 10.192.122.2 table 200
post-up sysctl -w net.ipv4.conf.all.rp_filter=2
post-down ip link del dev wg1

The auto wg1 is what starts the interface automatically on boot, else we’d have to rely on ifup and ifdown. Everything else should look familiar.
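For completeness, without auto wg1 the interface would be managed by hand:

# Runs the pre-up/up/post-up commands from the stanza above
ifup wg1

# Runs post-down, deleting the interface
ifdown wg1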

The last thing that needs mentioning is the kill switch. We’ve already seen how to run curl inside our networked docker container; we could use this to periodically check that the IP address is as expected. I know we can do better, but I can’t quite formulate a complete solution yet, so I’ll include my work in progress.

We can deny all traffic from our subnet with the following:

ip route add unreachable 0.0.0.0/0 table 200

But how to run this command when the VPN disintegrates? I’ve thought about putting it in a post-down step for wg1, but I don’t think it’s a surefire approach. The scary part is that if wg1 goes down, its route in table 200 vanishes with it, so instead of dropping packets for an interface that no longer exists, they are sent out the next applicable route, which is eth0! We have to be smarter. For reference, the kill switch used as an example in wg-quick:

PostUp = iptables -I OUTPUT ! -o %i -m mark ! --mark $(wg show %i fwmark) -j REJECT
PreDown = iptables -D OUTPUT ! -o %i -m mark ! --mark $(wg show %i fwmark) -j REJECT

The end result should look something like this. It will be more complete than any curl kill switch. I didn’t want to bumble through to a halfway decent solution, so I called it a night! If I come across the solution or someone shouts it at me, I’ll update the post.

Conclusion

I think both solutions should be in one’s toolkit. At this stage I’m not sure if there is a clear winner. I ran the first solution for about a week. It can feel like a bit of a hack, but knowing that everything is isolated in the container and managed by docker can be a relief.

I’ve been running the second solution for about a day or so, as I’ve only just figured out how all the pieces fit together. The solution feels more flexible. The apps can be more easily deployed anywhere, as there is nothing about a VPN encoded in the compose file. The only thing that will stand out as different is the networking section. I also find pros and cons to having the VPN managed by the host machine. On one hand, having all the linux tools and wg show readily available to monitor the tunnel is nice, and tools like collectd have the easiest time reporting stats on top level interfaces. On the other hand, installing build tools is annoying, and managing routing tables makes me anxious if I think too much about it. In testing these solutions, I’ve locked myself out of a VM more than once, forcing a reboot, and I don’t take rebooting actual servers lightly.

Only time will tell which solution is best, but I thought I should document both.


Integrating Kotlin with a Dropwizard App

Witness the beauty of Kotlin (not actually an image of Kotlin island)

Kotlin is a newish language that runs on the JVM and integrates seamlessly with existing Java code. Kotlin is concise, ergonomic, and thoughtful. We’ll look at low cost and frictionless ways of integrating Kotlin into an existing Java codebase. You can add Kotlin in piecemeal, so there is very little time lost to writing Kotlin today. For demonstration purposes, our frame of mind will be that of writing a Dropwizard app.

Setup

Basic structure of our app:

  • JDK8
  • Maven for building
  • Intellij IDE

Your setup may be different, but these steps can always be adapted.

When creating your first Kotlin class, Intellij will ask to modify your pom.xml. Here’s what it’s adding:

A couple of dependencies

<dependency>
    <groupId>org.jetbrains.kotlin</groupId>
    <artifactId>kotlin-stdlib-jdk8</artifactId>
    <version>${kotlin.version}</version>
</dependency>
<dependency>
    <groupId>org.jetbrains.kotlin</groupId>
    <artifactId>kotlin-test</artifactId>
    <version>${kotlin.version}</version>
    <scope>test</scope>
</dependency>

The Kotlin stdlib defines all the functional goodies Kotlin brings to the table, as well as the Java interop. It will increase your shaded jars by approximately 1MB (a tad of an overestimate).

We also can’t forget to instruct maven to compile our kotlin code (something has to convert it into JVM bytecode).

<plugin>
    <groupId>org.jetbrains.kotlin</groupId>
    <artifactId>kotlin-maven-plugin</artifactId>
    <version>${kotlin.version}</version>
    <executions>
        <execution>
            <id>compile</id>
            <phase>compile</phase>
            <goals>
                <goal>compile</goal>
            </goals>
        </execution>
        <execution>
            <id>test-compile</id>
            <phase>test-compile</phase>
            <goals>
                <goal>test-compile</goal>
            </goals>
        </execution>
    </executions>
    <configuration>
        <jvmTarget>1.8</jvmTarget>
    </configuration>
</plugin>

Data Classes

Value objects are key to any application. Small, immutable, hashable classes that compare equal to any other instance containing the same values make understanding code significantly easier, as there can’t be clever usage tricks / hacks.

I currently write my Java value objects using AutoValue. Why mention AutoValue? Even though it dramatically decreases the amount of code to write, a non-negligible amount remains to write and maintain. Below is a PointJ class for representing x and y coordinates using AutoValue.

@AutoValue
public abstract class PointJ {
    public abstract int x();
    public abstract int y();

    public static Builder builder() {
        return new AutoValue_PointJ.Builder();
    }

    public abstract Builder toBuilder();

    public static PointJ create(int x, int y) {
        return builder()
                .x(x)
                .y(y)
                .build();
    }

    public PointJ withX(int x) {
        return toBuilder().x(x).build();
    }

    public PointJ withY(int y) {
        return toBuilder().y(y).build();
    }

    @AutoValue.Builder
    public abstract static class Builder {
        public abstract Builder x(int x);

        public abstract Builder y(int y);

        public abstract PointJ build();
    }
}

The generated class (AutoValue_PointJ) expands to 101 lines, so we’re saving anywhere between 2x and 3x the amount of code we’d need to write otherwise. A big win!

Aside: using a builder here was not necessary, but it keeps a certain form of consistency as classes scale up to additional members. Using a builder makes it obvious where x and y are set.

Using Kotlin data classes, we need only a single line:

data class Point(val x: Int, val y: Int)

Quick usage in Kotlin:

fun usePoint() {
    val point = Point(x = 10, y = 20)
    val x = point.x

    // Simulate translation along y-axis and destructure
    // the data class into two variables
    val (new_x, new_y) = point.copy(y = 30)
}
  • No need for a builder anymore as we see the explicit x = 10 in initialization
  • Data is immutable (you can’t reassign x or y to different values)
  • Easily create copies with updated properties

What does the same usage look like in Java?

final Point point = new Point(10, 20);
final int x = point.getX();
final Point newPoint = point.copy(x, 30);
  • The lack of explicit argument names makes instantiation more error prone (maybe contrived with a Point class, but a concern nonetheless). The fix would be to create a custom builder – but I understand if this suggestion is met with disdain.
  • The function getX() is needed in Java, else someone could change the value of x and ruin immutability
  • To update a point (and receive a new value) we have to pass in previous values (creating helper functions like withX and withY are possible)
  • Lack of type inference makes code more verbose
  • Denoting variables as final makes the immutable path, which should be the default, more verbose than the mutable way

To make updating a little easier on the Java side, you can define expression functions that delegate to the Kotlin copy:

data class Point(val x: Int, val y: Int) {
    fun withX(x: Int): Point = copy(x = x)
    fun withY(y: Int): Point = copy(y = y)
}

It’s up to the reader how they feel about adding these with functions. I’m torn. I don’t like to pollute data classes with too many functions, as they start to move away from truly being a “data” class. But on the other hand, these functions are concise, less error prone, and it makes Kotlin data classes easier to use in Java.

Jackson

If you’re using Dropwizard, you’re using Jackson. If you’re using Jackson, you’re using Jackson annotations. To update our AutoValue class for the bare minimum number of Jackson annotations needed:

@AutoValue
@JsonDeserialize(builder = PointJ.Builder.class)
public abstract class PointJ {
    @JsonProperty
    public abstract int x();

    @JsonProperty
    public abstract int y();

    public static Builder builder() {
        return new AutoValue_PointJ.Builder();
    }

    public abstract Builder toBuilder();

    public static PointJ create(int x, int y) {
        return builder()
                .x(x)
                .y(y)
                .build();
    }

    public PointJ withX(int x) {
        return toBuilder().x(x).build();
    }

    public PointJ withY(int y) {
        return toBuilder().y(y).build();
    }

    @AutoValue.Builder
    @JsonPOJOBuilder(withPrefix = "")
    public abstract static class Builder {
        @JsonCreator
        private static Builder create() {
            return PointJ.builder();
        }

        public abstract Builder x(int x);

        public abstract Builder y(int y);

        public abstract PointJ build();
    }
}

At this point, you start feeling less satisfied with AutoValue. What about Kotlin?

  • Add the jackson-module-kotlin dependency
  • Register the KotlinModule on the ObjectMapper
<dependency>
    <groupId>com.fasterxml.jackson.module</groupId>
    <artifactId>jackson-module-kotlin</artifactId>
</dependency>

And a Java JUnit test case:

@Test
public void deserialize() throws IOException {
    final ObjectMapper mapper = Jackson.newObjectMapper();
    mapper.registerModule(new KotlinModule());
    final Point point = mapper.readValue("{ \"x\": 1, \"y\": 2 }", Point.class);
    assertThat(point.getX()).isEqualTo(1);
    assertThat(point.getY()).isEqualTo(2);
}

Done. We did not need to modify our data class at all!

I can’t recommend data classes enough. In a single line of Kotlin, you:

  • Have an immutable value class
  • Have Jackson support
  • Get decent out of the box Java interop
  • Can add concise with methods to make Java interop even better
  • Eliminate 53 lines of an AutoValue class, which previously held the gold standard.

To give one final example, you know the configuration class in the Dropwizard getting started page? Those 27 lines can be reduced to:

data class MyConfig(
        @field:NotEmpty val template: String?,
        @field:NotEmpty val defaultName: String = "Stranger"
): Configuration()

A few new things here:

  • Default value for defaultName
  • Allow template to be null with the String? type
  • Annotate the backing fields of the data class with @NotEmpty. By default, the annotation would apply to the constructor argument, which is not validated – the fix is the @field: use-site target.

It seems contradictory, doesn’t it? That there is an annotation for NotEmpty on a type that clearly states that it can be null. If we remove the contradicting ? and the annotation, Jackson will provide an ok error message when we try to deserialize a null value. Another solution would be to have a MyConfigRaw with the annotations, a MyConfig with the non-nullable types, and convert the raw data class to MyConfig after validation (sketched below). Data classes are concise enough that this is a viable alternative. Or live with the contradiction.
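Here is a sketch of that two-class alternative (class names from the prose above; the conversion function is hypothetical):

data class MyConfigRaw(
        @field:NotEmpty val template: String?,
        @field:NotEmpty val defaultName: String = "Stranger"
): Configuration() {
    // Call only after Dropwizard has validated the annotations,
    // at which point template is known to be non-null
    fun toConfig(): MyConfig = MyConfig(template!!, defaultName)
}

data class MyConfig(val template: String, val defaultName: String)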

Also, don’t forget: to get this to work in your app, you’ll need to register the KotlinModule during bootstrap:

@Override
public void initialize(final Bootstrap<MyConfig> bootstrap) {
    bootstrap.getObjectMapper().registerModule(new KotlinModule());
}

Algebraic Data Types

After you get a taste of data classes, you’ll want to write your business logic in Kotlin. Algebraic data types allow a single type to encompass other types without inheritance and all its problems. ADTs are perfect for business logic, as they allow inner functions to be pure while outer level functions deal with the side effects (logging, metrics, etc). Previously, I used Derive4j, which is a decent Java-only solution, but it only gets you so far due to constraints of the language.

Kotlin solves this with a combination of sealed classes and smart casts.

One of my pet peeves is that there is often a lack of precision associated with date and time types. For instance, if I said “I was born in 2018” or even “I was born in January 2018”, it is not the same as saying “I was born on January 1st, 2018”, yet all would be represented as LocalDate.of(2018, 1, 1) and they would all be considered equal. Same thing with time: “I have practice on Tuesdays” vs “I have practice on Tuesdays at 5:30pm”. There is not enough granularity between the times to know if they really are the same; I could be talking about soccer and violin practices that both happen on Tuesdays. So we are going to fix this by creating a new type for each granularity (year, month, day). This task would be burdensome in Java, but not so in Kotlin:

sealed class DatePrecision
data class Year(val year: Int) : DatePrecision()
data class Month(val year: Int, val month: Int) : DatePrecision()
data class Day(val year: Int, val month: Int, val day: Int) : DatePrecision()

Working with these classes is easy (using explicit types for readability):

val dp: DatePrecision = Year(year = 2018)
val dy: Year = Year(year = 2018)
val dt: DatePrecision = Month(year = 2018, month = 1)
assertThat(dp).isNotEqualTo(dt)
assertThat(dy).isEqualTo(dp)
assertThat(dp).isEqualTo(Year(year = 2018))

Using a when statement with smart casts, we have each precision’s fields available:

fun datePrint(dt: DatePrecision) = when(dt) {
    is Year -> println("Year precision: ${dt.year}")
    is Month -> println("Month precision: ${dt.year} - ${dt.month}")
    is Day -> println("Day precision: ${dt.year} - ${dt.month} - ${dt.day}")
}

Now the business logic can determine how 2018 and January 2018 are treated instead of needing a companion enum that holds the precision of a LocalDate. It becomes significantly less error prone as you can’t accidentally interpret a monthly date as a yearly one.

I’m a large proponent of ADTs and I could continue to extol their benefits ad nauseam, but I’ll spare you and leave just one more thought.

Anyone who knows me knows that I’m not a fan of exceptions. I prefer Rust’s form of error handling, Haskell’s, or even Go’s. Being able to communicate that the return value is either a success or some sort of error is invaluable. I won’t dive any deeper, as Kotlin’s premier functional companion, Arrow, defines an Either type, and the docs should be enough to get one started on their way to fewer exceptions. My one gripe is that Either lists the error type first in its signature. I find this confusing, as it is more intuitive to have a separate Result type with success listed before failure (sketched below), but this argument would probably fall on deaf ears, as the hard core functional crowd made up their minds a long time ago.
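To show why I find success-first more intuitive, here is a hand-rolled sketch of such a Result type, built from the same sealed class machinery as DatePrecision (Arrow’s actual Either works differently):

sealed class Result<out T, out E>
data class Ok<out T>(val value: T) : Result<T, Nothing>()
data class Err<out E>(val error: E) : Result<Nothing, E>()

// Hypothetical usage: parse a port number without throwing
fun parsePort(s: String): Result<Int, String> {
    val port = s.toIntOrNull() ?: return Err("not a number: $s")
    return if (port in 1..65535) Ok(port) else Err("out of range: $port")
}

// Smart casts make consuming the result pleasant
fun describe(r: Result<Int, String>): String = when (r) {
    is Ok -> "listening on ${r.value}"
    is Err -> "bad config: ${r.error}"
}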

Yeah Basic

Here’s a hodgepodge of Kotlin features you’ll immediately love:

Iterate iterators!

Iterator<String> lines = /* ... */;
while (lines.hasNext()) {
    final String line = lines.next();
}

is less pretty than Kotlin’s

val lines: Iterator<String> = /* ... */
for (line in lines) {
}

Lazy and eager collections! Java has too much ceremony surrounding lazy streams.

final List<Integer> list = Collections.singletonList(1);
final List<Integer> anotherList = list.stream()
        .map(x -> x + 1)
        .collect(Collectors.toList());

Kotlin gives us the ergonomic option and the lazy option (not that it makes much of a difference here, but if there are many chains then it would become noticeable)

val list = listOf(1)
val eager: List<Int> = list.map { it + 1 }
val lazy: List<Int> = list.asSequence().map { it + 1 }.toList()

Single constructor classes! All of my Jersey resource classes only have a single constructor – in fact, I think almost all my classes have a single constructor.

class MyResource {
    private final MyDb mydb;
    public MyResource(MyDb mydb) {
        this.mydb = mydb;
    }
}

becomes

class MyResource(private val mydb: MyDb) {
}

Raw string literals! Writing string literals that contain quotes can be a pain in Java, especially when writing inline tests for JSON serialization and deserialization – it’s never a copy and paste task. Kotlin to the rescue.

final ObjectMapper mapper = Jackson.newObjectMapper();
final String jsonStr = "[\"a\", \"b\", \"c\"]";
final List<String> list = mapper.readValue(jsonStr,
        new TypeReference<List<String>>() {});
assertThat(list).containsExactly("a", "b", "c");

becomes

val mapper = Jackson.newObjectMapper()
val jsonStr = """["a", "b", "c"]"""
val list: List<String> = mapper.readValue(jsonStr)
assertThat(list).containsExactly("a", "b", "c")

It doesn’t look like a large improvement, but I want to point out:

  • Raw string literal allows us to sanely type JSON by hand or copy and paste. Use trimMargin if you want a multiline string trimmed (see the sketch after this list)
  • Reified generics means we don’t need to specify List<String> type twice
  • And, as always, type inference saves unnecessary typing
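A quick sketch of that trimMargin tip; each | marks where a line begins after trimming:

val json = """
    |{
    |  "x": 1,
    |  "y": 2
    |}
    """.trimMargin()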

In the interest of keeping this post a reasonable length, I’ll direct you to idioms in Kotlin for more reasons.

Testing

If the Dropwizard project in question can’t have its production code touched for some reason (eg. adding Kotlin code seems like rocking the boat too much), then you can always add Kotlin exclusively for tests. You still benefit from all the niceties Kotlin brings to the table with none of the downsides (not that there are many to begin with). I find this approach similar to Java projects that use Groovy for tests (spoiler: I like Kotlin better).

Dropwizard currently bundles junit 4; Dropwizard 1.3 will optionally bundle junit 5. I personally am still using junit 4 because Dropwizard 1.3 is not released yet, so I haven’t gotten a chance to explore the new features in junit 5 (though it appears to be not as straightforward as it should be). I hate all the ceremony surrounding parameterized tests in junit 4, which has been fixed in 5. Unfortunately, junit 5 parameterized tests are labelled as experimental. So when you see guides for testing Kotlin, make sure you note the framework used: junit 4, junit 5, or Kotlin-specific frameworks like Spek (built on junit) and KotlinTest.

Lots of options here, lots of room for confusion. I keep it simple: I use the frameworks included with Dropwizard, junit 4 and AssertJ. The examples I’ve posted thus far use this approach. I tend to write business logic tests in Kotlin and leave tests that are heavy in junit annotations in Java. The thought is to keep tests straightforward enough for anyone to pick up and contribute to, yet concise and readable. I groan when projects use Groovy for tests, as it makes the barrier to contributing code with tests much higher – and I don’t want that to be the case with Kotlin tests.

Coroutines

I had my first trip up with Kotlin when I started dabbling in coroutines.

java.lang.NoClassDefFoundError: kotlin/coroutines/experimental/CoroutineContext$Element$DefaultImpls

This is due to an old version of kotlin-stdlib being picked up, as it was not explicitly stated as a dependency. Curiously, Intellij didn’t generate it when setting up the project for Kotlin. I expect this is an oversight that will soon be remedied. The fix was to state it in the pom:

<dependency>
    <groupId>org.jetbrains.kotlin</groupId>
    <artifactId>kotlin-stdlib</artifactId>
    <version>${kotlin.version}</version>
</dependency>

Anyways, I hope this helps someone frantically googling (like I was). Onto the good stuff. Since coroutines are still experimental, subject to change, and the setup process isn’t flawless, I’ve left them to the end of this post.

Jersey supports asynchronous endpoints. Using coroutines, let’s simulate a redis increment and an http request (each taking a second to complete). Below is such an endpoint:

@Path("/")
@Produces(MediaType.APPLICATION_OCTET_STREAM)
class ProxyResource {
    @GET
    fun proxy(@Suspended resp: AsyncResponse) {
        launch {
            val incr = redisIncr().await()
            val content = async {
                delay(1, SECONDS)
                Utility.httpRequest().await()
            }.await()

            resp.resume("The answer is $content + $incr")
        }
    }

    fun redisIncr(): Deferred<Int> = async {
        delay(1, SECONDS)
        10
    }
}

For a bit of Java interop fun, I created Utility.httpRequest() to return a CompletableFuture.

public class Utility {
    public static CompletableFuture<String> httpRequest() {
        return CompletableFuture.completedFuture("Boom");
    }
}

Our endpoint executes the redis command and the http request sequentially. This is ideal when the http request depends on the redis result before it; however, since there is no such dependency here, we can execute both coroutines concurrently, cutting latency in half.

launch {
    val incr = redisIncr()
    val content = async {
        delay(1, SECONDS)
        Utility.httpRequest().await()
    }

    resp.resume("The answer is ${content.await()} + ${incr.await()}")
}

Caution

By default, launch and async schedule their tasks on the ForkJoinPool.commonPool(). This pool is typically used for CPU intensive tasks, as, by default, it will only run a number of tasks concurrently equal to the number of cores - 1 (my four core, hyper-threaded cpu has a parallelism of seven). My recommendation: if you want to use the default common pool (ie. not specify a context param to launch or async), keep everything CPU intensive or inside a deferred.

Replacing

val content = async {
    delay(1, SECONDS)
    Utility.httpRequest().await()
}.await()

with

val content = async {
    Thread.sleep(1000)
    Utility.httpRequest().await()
}.await()

Caused 16 concurrent requests to finish in three seconds instead of one! (16 requests at a parallelism of 7 means three one-second waves.)

Another option is to use the Unconfined context, which should use the thread pool that Jetty uses to service requests. If my hunch is correct, this will let us still benefit from Jetty’s handling of request pressure while increasing throughput.
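A sketch of what that would look like (Unconfined comes from kotlinx.coroutines.experimental; whether it truly rides Jetty’s request threads is exactly the unverified hunch):

launch(Unconfined) {
    val incr = redisIncr().await()
    val content = Utility.httpRequest().await()
    resp.resume("The answer is $content + $incr")
}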

Conclusion

If you’ve followed through this whirlwind of a tour of data classes, algebraic data types, lambdas, function expressions, smart casts, interoperability, coroutines, reified generics, raw strings, and immutability, then congratulations! It may seem overwhelming to see all the features mentioned in tandem, but each feature examined individually is digestible. Kotlin isn’t here to make writers feel smart and readers dumb, but to improve on Java, a language sorely lacking these features. I hope you give Kotlin a try in your next app. Let me know how it goes!