Climate model output is large. The ACCESS-OM2-01 run used here produces roughly a terabyte of NetCDF files per simulated year, spread across thousands of individual files on Acacia — Pawsey Supercomputing Centre's object storage. Traditionally, doing anything with that data means either being on the HPC cluster yourself, or downloading a substantial chunk of it first. Neither option is great if you just want to take a quick look.
The first step in the pipeline is virtualisation with VirtualiZarr. Rather than copying or converting the data, VirtualiZarr reads the internal structure of each NetCDF file — where each variable's chunks live on disk, their byte offsets and lengths — and builds a lightweight virtual Zarr store. Nothing is moved; you end up with a reference catalogue that says "chunk [0,0,0] of sst_m is bytes 171,279,300–172,408,502 of this file on S3". The whole catalogue for 42 months of the global 0.1° CICE sea ice run shown here fits in a few-hundred-kilobyte JSON file.
That JSON is written out in Kerchunk reference format — a simple spec that maps Zarr chunk keys to [url, offset, length] triples. Any Zarr-aware client that knows how to issue HTTP range requests can consume it directly, without any special server software. The files themselves never move; you're just handing the client a roadmap.
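To make the "roadmap" idea concrete, here is a minimal sketch of how a client might turn one reference entry into an HTTP range request. The `refs` dictionary follows the general shape of a Kerchunk version 1 reference file, but the URL is a placeholder, the helper name `range_header` is hypothetical, and the chunk length is simply derived from the byte range quoted above.

```python
# Hypothetical reference set in the general shape of a Kerchunk v1 file.
# The URL is a placeholder, not the real Acacia endpoint.
refs = {
    "version": 1,
    "refs": {
        # Zarr metadata is stored inline as a JSON string.
        ".zgroup": '{"zarr_format": 2}',
        # Chunk data is stored as a [url, offset, length] triple.
        "sst_m/0.0.0": ["https://example.invalid/bucket/iceh.nc", 171279300, 1129203],
    },
}


def range_header(refs, key):
    """Resolve a chunk key to (url, headers) for an HTTP range request."""
    entry = refs["refs"][key]
    if isinstance(entry, str):
        raise ValueError(f"{key!r} is stored inline, nothing to fetch")
    if len(entry) != 3:
        raise ValueError(f"{key!r} is not a [url, offset, length] triple")
    url, offset, length = entry
    # HTTP byte ranges are inclusive at both ends, hence the -1.
    last_byte = offset + length - 1
    return url, {"Range": f"bytes={offset}-{last_byte}"}


url, headers = range_header(refs, "sst_m/0.0.0")
print(url, headers)
```

The returned headers can be passed to any HTTP client; the object store replies with just those bytes, which is the compressed chunk.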
In the browser, the reference JSON is loaded and passed to zarrita.js (a TypeScript Zarr implementation) backed by a range-request store. When you select a time step, the client works out which chunks are needed for that slice, fires off a handful of HTTP range requests to the object storage endpoint, decompresses the chunks in-browser using a WASM codec, and renders the result. You never download the full dataset: a single monthly SST field at 0.1° means pulling around 4–8 MB of compressed chunks out of a ~200 GB archive.
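The "works out which chunks are needed" step is plain index arithmetic. The sketch below is written in Python for brevity rather than in the TypeScript the browser client uses, and it is not zarrita.js's implementation. The array shape, the chunk shape, and the function name `chunk_keys_for_time_step` are all assumptions chosen for illustration; the real chunking of the files may differ.

```python
import itertools
import math


def chunk_keys_for_time_step(name, t, shape, chunks):
    """List the Zarr v2 chunk keys covering time index t of a (time, y, x) array."""
    if not 0 <= t < shape[0]:
        raise IndexError(f"time index {t} outside 0..{shape[0] - 1}")
    # Index of the chunk along the time axis that contains step t.
    t_chunk = t // chunks[0]
    # Number of chunks along each spatial axis (the last one may be partial).
    n_y = math.ceil(shape[1] / chunks[1])
    n_x = math.ceil(shape[2] / chunks[2])
    return [
        f"{name}/{t_chunk}.{j}.{i}"
        for j, i in itertools.product(range(n_y), range(n_x))
    ]


# Assumed layout: 42 monthly steps on a 2700 x 3600 grid, chunked (1, 675, 900).
keys = chunk_keys_for_time_step("sst_m", 5, (42, 2700, 3600), (1, 675, 900))
print(len(keys), keys[0], keys[-1])
```

Each key is then looked up in the reference JSON and fetched with its own range request, so the number of requests per time step equals the number of spatial chunks.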
The result is, again, completely serverless: no backend, no tiling service, no data pipeline running on a VM somewhere. The only infrastructure is the object storage bucket (already there for model output) and the static site you're reading this on. The main catch is that the S3 bucket needs permissive CORS headers — which, at a supercomputing centre, is sometimes easier said than done.
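For reference, a permissive-enough CORS policy for this use case might look like the sketch below. It uses the rule structure of the S3 bucket CORS API; whether a given S3-compatible store accepts exactly this document is an assumption, and the origin is a placeholder for wherever the static site is hosted.

```python
import json

# Hypothetical CORS policy in the S3 "CORSRules" structure.
cors_configuration = {
    "CORSRules": [
        {
            # Placeholder origin; replace with the static site's real origin.
            "AllowedOrigins": ["https://example.invalid"],
            # Read-only access is enough for a viewer.
            "AllowedMethods": ["GET", "HEAD"],
            # The browser sends a Range header, so preflight must allow it.
            "AllowedHeaders": ["Range"],
            # Let client-side code read the range-related response headers.
            "ExposeHeaders": ["Content-Range", "Content-Length", "ETag"],
            # Cache preflight responses for an hour.
            "MaxAgeSeconds": 3600,
        }
    ]
}

print(json.dumps(cors_configuration, indent=2))
```

The important parts are allowing `GET` from the site's origin and permitting the `Range` request header; without the latter, the browser's preflight check can block the chunk requests.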