Link

Kopia is a backup/restore tool which allows creating snapshots of filesystem contents and uploading them to some specified remote cloud storage (it supports S3, GCS, and a few more).

As of this commit, Kopia does non-quiescent periodic garbage collection (abbreviated GC) to free up data corresponding to deleted snapshots which is known to break safety conditions (data of a snapshot is deleted even when the snapshot isn’t deleted). Let us call the strategy of garbage collection followed in the specified commit as vanilla garbage collection. The TLA+ specification in the master branch of this repository corresponds to vanilla GC and is written to confirm and identify the behaviour where safety is broken. There is a new design of GC (called GC2) documented here by Julio Lopez. The document also explains why vanilla GC breaks safety (I will explain it here as well). The specification in gc2 branch corresponds to the “GC with metadata” design as presented in the GC2 design document (we will call this gc2 unless explicitly mentioned). However, GC2 doesn’t guarantee safety either and the specification for gc2 demonstrates this. Later sections explain the various optimizations and tricks used to abstract out relevant parts of the implementation to be specified. The final section explains the behaviour that violates safety in GC2.

The architecture of Kopia is explained well here. I will explain the important pieces to keep in mind for our purposes of specifying behaviours relevant to GC in Kopia.