ISO 8601 and Nanosecond Precision Across Languages
Date formats are like fruits: sometimes you don’t know they’re rotten until you cut them open
Given a date format that supports arbitrary precision, the question of how best to represent the date in a language of choice is generally tough to answer. In situations like these, I like to look at current standards and see how implementations work. As an example I will use ISO 8601, which is a standard way to represent a datetime with arbitrary precision (obligatory xkcd).
Before we get to the examples, the impetus for this post comes from an emerging healthcare standard called fhir (it’s pronounced “fire” and is an acronym, but I prefer writing it in lowercase). One of fhir’s jobs is to define models described in JSON to improve interoperability, and part of these models includes dates. JSON does not specify how dates should be formatted, so fhir defines the format for us as a regular expression (visualized):
It’s not pretty and it looks to have a large intersection with valid ISO 8601 dates. We’ll have to see if ISO 8601 compliance status gets added to the docs.
Anyways, the author of hapi fhir (“happy fire”) was looking for tips on implementing these dates in Java, with an emphasis on Java 6, precision, and data representation. As a comparison, I decided to look at ISO 8601 implementations.
Python’s datetime.isoformat() returns a string of the form YYYY-MM-DDTHH:MM:SS.mmmmmm or, if microsecond is 0, YYYY-MM-DDTHH:MM:SS.
If the payload specifies more than microsecond precision, parsing will start raising errors.
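As a minimal sketch of that failure mode, assuming parsing is done with the standard library's `datetime.strptime` and the `%f` directive (which accepts one to six fractional digits), the sample timestamps below are made up for illustration:

```python
from datetime import datetime

FORMAT = "%Y-%m-%dT%H:%M:%S.%f"  # %f accepts one to six fractional digits

# Microsecond precision parses and round-trips through isoformat().
micro = datetime.strptime("2016-06-10T21:42:24.760738", FORMAT)
print(micro.isoformat())

# Nanosecond precision leaves three unconverted digits, so strptime raises.
try:
    datetime.strptime("2016-06-10T21:42:24.760738123", FORMAT)
    nano_error = None
except ValueError as exc:
    nano_error = str(exc)
print(nano_error)
```

The second call fails rather than silently dropping digits, which is arguably the safer default when the payload format is not under your control.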
Popular Python libraries don’t do much better. Lacking an intermediate format, they simply truncate the trailing digits:
If you want a higher resolution, you’ll have to use NumPy’s datetime64, which spends its 64 bits on a unit of your choosing. For instance, I could have 64 bits’ worth of years or 64 bits’ worth of attoseconds:
So if one wants ISO 8601 strings down to the nanosecond, only a single NumPy datetime64 is needed; any higher precision will require coordination between two datetime64 values.
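The trade-off behind a fixed 64-bit count can be checked with plain arithmetic. The sketch below does not use NumPy at all; it only assumes a signed 64-bit counter around an epoch and a mean Gregorian year, and the helper name `span_in_years` is hypothetical:

```python
NS_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60  # mean Gregorian year

def span_in_years(units_per_second):
    """Years reachable on either side of the epoch with a signed 64-bit count."""
    max_count = 2**63 - 1
    return max_count / units_per_second / SECONDS_PER_YEAR

ns_span = span_in_years(NS_PER_SECOND)
as_span_seconds = (2**63 - 1) / 10**18  # attoseconds per second is 10**18

print(f"nanoseconds: about {ns_span:.0f} years either side of the epoch")
print(f"attoseconds: about {as_span_seconds:.1f} seconds either side of the epoch")
```

A nanosecond unit still covers a few centuries around the epoch, whereas an attosecond unit covers only seconds, which is why anything much finer than nanoseconds needs a second value to carry the coarse part of the timestamp.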
Eric Lippert wrote a great article on Precision and accuracy of DateTime. It uses the .NET-specific DateTime, but his finding that accuracy versus precision is a very real distinction applies across all languages.
DateTime has a resolution of 100 nanoseconds (known as a tick).
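To make the tick resolution concrete, here is a small Python sketch of what is lost when a nanosecond value is stored in 100-nanosecond ticks. It only illustrates the arithmetic and is not .NET code; `to_ticks` is a hypothetical helper:

```python
NS_PER_TICK = 100  # one tick is 100 nanoseconds

def to_ticks(nanoseconds):
    """Return (whole ticks, nanoseconds lost) for a nanosecond count."""
    return divmod(nanoseconds, NS_PER_TICK)

# Fractional second .760738123 expressed in nanoseconds.
ticks, lost = to_ticks(760_738_123)
print(ticks, lost)
```

The last two decimal digits of a nanosecond fraction cannot survive a round trip through ticks.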
I’m not familiar with any other date libraries for .NET, so if you need precision finer than 100 nanoseconds you may have to consider rolling your own object, as suggested by an (unaccepted) Stack Overflow answer, though the answer could certainly be improved upon.
To get additional precision in Node.js, process.hrtime() exists, which returns a tuple of [seconds, nanoseconds], where the nanoseconds are those elapsed since the last whole second. However, the seconds are counted from an arbitrary point in time, so the function is not applicable to this discussion; I wanted to showcase it for completeness’ sake.
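For readers who don't use Node, a rough Python analogue of that tuple shape can be built from `time.monotonic_ns()`. This is a sketch of the idea and not the Node API; `hrtime_like` is a hypothetical name:

```python
import time

NS_PER_SECOND = 1_000_000_000

def hrtime_like():
    """Split a monotonic nanosecond reading into [seconds, nanoseconds]."""
    seconds, nanoseconds = divmod(time.monotonic_ns(), NS_PER_SECOND)
    return [seconds, nanoseconds]

start = hrtime_like()
end = hrtime_like()

# Only the difference between two readings is meaningful, because the
# reference point of a monotonic clock is arbitrary.
elapsed_ns = (end[0] - start[0]) * NS_PER_SECOND + (end[1] - start[1])
print(elapsed_ns)
```

Since the reference point is arbitrary, such readings are suited to measuring durations, not to producing ISO 8601 timestamps.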
Kibana, the popular ELK frontend, seems afflicted with this limited granularity, partly because the underlying Elasticsearch (written in Java) suffers from the same problem: dates are recorded only down to milliseconds.
Rust is a breath of fresh air. Everything related to datetimes is handled through the rust-chrono package, which is precise to the nanosecond and can parse ISO 8601 formatted strings, truncating digits more granular than nanoseconds.
The team’s research is documented on their wiki.
Go is in the same boat as Rust: support for ISO 8601 dates with up to nanosecond resolution is readily accessible in the standard library. In Go’s case, it’s RFC 3339, a profile of ISO 8601. Pretty straightforward.
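The behaviour described above for rust-chrono (nanosecond precision, with anything finer truncated) can be sketched in Python for comparison. This is an illustration under narrow assumptions, not a port of either library: it only handles UTC strings ending in `Z`, and `parse_utc_nanos` is a hypothetical helper that returns the nanoseconds separately because `datetime` itself stops at microseconds.

```python
from datetime import datetime, timezone

def parse_utc_nanos(text):
    """Parse 'YYYY-MM-DDTHH:MM:SS[.fraction]Z' into (datetime, nanoseconds).

    The datetime carries whole seconds only; the fraction is returned as an
    integer nanosecond count, with digits past the ninth truncated.
    """
    if not text.endswith("Z"):
        raise ValueError("this sketch only handles UTC timestamps ending in 'Z'")
    body, dot, fraction = text[:-1].partition(".")
    whole = datetime.strptime(body, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    if dot and not fraction.isdigit():
        raise ValueError("fractional seconds must be one or more digits")
    # Pad short fractions with zeros, cut long ones at nine digits.
    nanos = int(fraction[:9].ljust(9, "0")) if fraction else 0
    return whole, nanos

whole, nanos = parse_utc_nanos("2016-06-10T21:42:24.7607381234Z")
print(whole.isoformat(), nanos)
```

Keeping the whole seconds and the nanoseconds as two values avoids any floating-point rounding of the fraction.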
Before Java 8’s time.LocalDateTime there was no built-in datetime class that had sub-millisecond resolution (lookin’ at you, util.Date). sql.Timestamp doesn’t count, as it inherits util.Date only for implementation and not for semantics (makes one almost wish that Java had C++’s private inheritance).
Another alternative is date4j, a pretty straightforward (and small) library that can parse ISO 8601 dates down to nanoseconds:
Only ISO 8601 format is supported, but since this library is open source there isn’t a reason why it couldn’t support other formats.
Probably the newest and best date library for pre-Java 8 code is ThreeTen, which is the Java 8 time API backported to Java 6. Quick example:
Anyone familiar with the Java 8 time module will be familiar with ThreeTen. If the only reason for not using the Java 8 time classes is that they require Java 8, then ThreeTen seems an appropriate fit.
Sometimes one wants to keep track of the precision of the data. For instance, 2016 would be tagged with a precision of Year, and 2016-01 would be tagged with Month. Here are a couple of ways to represent this model (I’m using F# because the implementation is the shortest!)
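One way to read the tagging idea, sketched in Python (not F#) as a full datetime plus a precision tag. The names `Precision` and `TaggedDate` are hypothetical and chosen only for this illustration:

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum

class Precision(Enum):
    YEAR = 1
    MONTH = 2
    DAY = 3
    SECOND = 4

@dataclass(frozen=True)
class TaggedDate:
    value: datetime        # fields finer than the precision are filler
    precision: Precision   # how much of `value` was actually supplied

year_only = TaggedDate(datetime(2016, 1, 1), Precision.YEAR)    # parsed from "2016"
year_month = TaggedDate(datetime(2016, 1, 1), Precision.MONTH)  # parsed from "2016-01"

print(year_only.value == year_month.value, year_only == year_month)
```

Both records hold the same datetime, so the tag is the only thing that distinguishes them, and nothing stops a caller from ignoring it.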
Alternatively, one can use a discriminated union of tuples to ensure the data can never be misinterpreted (e.g. a 2016 parsed into a 2016-01-01T00:00:00 could be misinterpreted as having second precision).
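Python has no discriminated unions as such, but a union of small frozen dataclasses gets close to the same guarantee. In this sketch each variant stores only the fields that were present in the input; all type and function names are hypothetical:

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Year:
    year: int

@dataclass(frozen=True)
class YearMonth:
    year: int
    month: int

@dataclass(frozen=True)
class YearMonthDay:
    year: int
    month: int
    day: int

PartialDate = Union[Year, YearMonth, YearMonthDay]

def describe(date: PartialDate) -> str:
    """Render a partial date with exactly the precision it was given."""
    if isinstance(date, Year):
        return f"{date.year:04d}"
    if isinstance(date, YearMonth):
        return f"{date.year:04d}-{date.month:02d}"
    return f"{date.year:04d}-{date.month:02d}-{date.day:02d}"

print(describe(Year(2016)), describe(YearMonth(2016, 1)))
```

Because a `Year` has no month, day, or time fields at all, there is no filler value that could later be mistaken for real data.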
I believe that it may not be a good idea to keep track of the original date’s precision, as programmers may erroneously latch onto the precision as the source of truth. To give an example, imagine two systems recording the time of an operation that happened at 10:15am. System A has hour precision, so it creates a datetime of 10am. System B has half-hour precision, so it records 10:30am. While System B has more precision, System A has the same accuracy in this instance. A downstream recipient of these events may get confused if it uses the parsed date’s precision in any meaningful business logic. Most of the time, dates are truncated anyway when grouping, filtering, and selecting. For instance, the business logic may detect if two dates occurred on the same day:
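A minimal Python sketch of such a check, using the two systems from the example and assuming both values are naive datetimes in the same time zone:

```python
from datetime import datetime

def same_day(a, b):
    """Compare two datetimes after truncating both to the calendar day."""
    return a.date() == b.date()

system_a = datetime(2016, 6, 10, 10, 0)   # hour precision: recorded as 10am
system_b = datetime(2016, 6, 10, 10, 30)  # half-hour precision: recorded as 10:30am

print(same_day(system_a, system_b))
```

The comparison never consults the precision either system parsed with; truncation makes the difference irrelevant.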
By eschewing retention of the parsed precision, it’s easier to support standard parsing mechanisms. Notice that none of the date parsing libraries showcased, outside of date4j (and arguably NumPy), kept track of precision.
But I can understand some situations where precision is wanted. If someone sends in a “2016-06” for a birthdate the client should have the opportunity to reject it because the date is not precise enough. This may be the strongest argument to write a custom parser, though it is certainly possible to determine precision with custom Java 8 (and by extension Threeten) datetime formatters.
Before committing to a custom date format, check if ISO 8601 doesn’t already cover it.
Nanosecond precision is as granular as one should go. It should be clear from all the code examples that even though ISO 8601 supports arbitrary precision dates, the libraries around it don’t, partially because a 1GHz processor has a clock period of 1ns, so needing to record a higher precision is rare. Thus, creating an arbitrarily precise date class would only hurt interoperability, as too many home-grown solutions of questionable quality would appear across systems to satisfy the requirement, or they’d truncate the data.
If what you care about is supporting Java 6, nanosecond granularity, and ability to parse strings in multiple formats, consider using the ThreeTen library. Precision can be determined based on the parser that succeeds.
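The "parser that succeeds" idea is language-agnostic. Here is a sketch of it in Python with `strptime` standing in for the ThreeTen formatters; the format list, the precision labels, and both function names are assumptions made for this example:

```python
from datetime import datetime

# Tried from most to least precise; the first format that matches wins.
FORMATS = [
    ("%Y-%m-%dT%H:%M:%S", "second"),
    ("%Y-%m-%d", "day"),
    ("%Y-%m", "month"),
    ("%Y", "year"),
]

def parse_with_precision(text):
    """Return (datetime, precision label) from the first matching format."""
    for fmt, precision in FORMATS:
        try:
            return datetime.strptime(text, fmt), precision
        except ValueError:
            continue
    raise ValueError(f"unsupported date format: {text!r}")

def parse_birthdate(text):
    """Reject inputs that are not precise to at least the day."""
    value, precision = parse_with_precision(text)
    if precision in ("year", "month"):
        raise ValueError(f"birthdate {text!r} is only precise to the {precision}")
    return value.date()

print(parse_with_precision("2016-06"))
```

The precision is a by-product of parsing that the caller can act on immediately, such as rejecting a too-vague birthdate, without having to store it alongside the date.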
If one’s still stuck on using util.Date and building a custom parser for multiple formats, then at least have int nanoseconds since the last millisecond so there is no data overlap between util.Date and nanoseconds, as util.Date has a millisecond field (i.e., you don’t want a fractional seconds field, you want fractional milliseconds).
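The split itself is simple arithmetic. A sketch in Python, where the milliseconds would go into the util.Date and the remainder into the extra int field (`split_fraction` is a hypothetical helper, and the input is the string of digits after the decimal point):

```python
NS_PER_MS = 1_000_000

def split_fraction(fraction_digits):
    """Split fractional-second digits into (milliseconds, nanoseconds past that millisecond)."""
    # Normalise to exactly nine digits: pad short input, truncate long input.
    nanos_of_second = int(fraction_digits[:9].ljust(9, "0"))
    return divmod(nanos_of_second, NS_PER_MS)

millis, extra_nanos = split_fraction("760738123")
print(millis, extra_nanos)
```

With this layout the extra field always stays below one million, so the two fields never describe the same digits twice.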