Let’s now discuss the final piece, the steps we need to have in the specification -

  1. TickTimeForward - just increment the logical_timestamp. This step is always enabled i.e., has no preconditions.
  2. For snapshot processing -
    1. TriggerSnapshot -
      • Precondition - cardinality of snapshot_processes_bag < NumSnapshots
      • Action - add a new snapshot process record to the bag of snapshot processes. Populate content_ids_to_be_written_list with a permutation of a subset of possible content IDs (i.e., 0 to NumContentIDs-1). Populate local_index_entries_set based on index_blobs_list. Set timestamp as logical_timestamp.
    2. PackContents -
      • Precondition - snapshot hasn’t yet packed/ written all contents it was supposed to write (i.e., pointer_to_next_content_to_pack < length of content_ids_to_be_written_list)
      • Action - choose some contents from content_ids_to_be_written_list (starting at pointer_to_next_content_to_pack) to be packed in the same timestamp (and which are not in local_index_entries_set as an undeleted latest entry) and append them to local_data_blob_record.
    3. FlushLocalDataBlob -
      • Precondition - local_data_blob_list is not empty
      • Action - append local_data_blob_list to data_blobs_list and append corresponding index entries to local_index_blob_list
    4. FlushLocalIndexBlob -
      • Precondition - local_index_blob_list is not empty
      • Action - append the local_index_blob_list blob to the global index_blobs_list if the local_data_blob_list is empty (so that an index blob is flushed only once all of its contents have already been flushed to the repository). Add entries in local_index_blob_list to local_index_entries_set.
    5. MarkComplete -
      • Precondition - all contents have been written within the time limit (i.e., pointer_to_next_content_to_pack = length of content_ids_to_be_written_list and local_data_blob_list is empty and logical_timestamp < start_timestamp + MaxSnapshotTime)
      • Action - mark the status of the snapshot as complete
    6. DeleteSnapshot -
      • Precondition - snapshot is in complete status
      • Action - change status to deleted
  3. For garbage collection processing -
    1. TriggerGC -
      • Precondition - cardinality of gc_processes_bag < NumGCs
      • Action - add a new gc process record to the global bag. Populate local_snapshot_entries_bag with snapshot_processes_bag. Populate local_index_entries_set based on entries in index_blobs_list.
    2. MarkContentDeleted -
      • Precondition - based on the local copy of snapshots and index blobs, the GC knows which content IDs are to be marked deleted. If there are content IDs which haven’t been marked deleted, but need to be deleted, the precondition is True.
      • Action - Mark deleted some contents which are yet to be deleted. This involves adding the delete index entries to the local_index_blob_list. Also add the entries to local_index_entries_set.
    3. FlushLocalIndexBlob -
      • Precondition - local_index_blob_list is not empty.
      • Action - append local_index_blob_list to index_blobs_list and reset local_index_blob_list to empty list.
  4. For index blob compaction processing -
    1. TriggerIndexBlobCompaction -
      • Precondition - cardinality of index_blob_compaction_processes_bag < NumIndexBlobCompactions
      • Action - add an index compaction process record to the global bag. Populate local_index_blobs_list with a copy of index_blobs_list. Set new_index_written to False.
    2. FlushMergedIndexBlob -
      • Precondition - new_index_written is False.
      • Action - Flush the new index based on compaction (it is easy to compute which index blobs are to be compacted, so we skip the details here). Set new_index_written to True. Populate index_blobs_ids_to_be_deleted_set based on the compaction process.
    3. DeleteOldIndexBlob -
      • Precondition - new_index_written is True and there exists an index blob in index_blobs_ids_to_be_deleted_set.
      • Action - Remove such a blob from index_blobs_list and remove the index blob id from index_blobs_ids_to_be_deleted_set.
  5. For data blob compaction processing -
    1. TriggerDataBlobCompaction -
      • Precondition - cardinality of data_blob_compaction_processes_bag < NumDataBlobCompactions
      • Action - add a data blob compaction process record to the global bag. Populate local_index_blob_ids_list with a copy of blob_ids from index_blobs_list. Populate local_index_entries_set based on index_blobs_list. Set data_blobs_written to False.
    2. PackContents -
      • Precondition - current_blob_content_pointer hasn’t reached the last content in the last data blob in local_index_blob_ids_list.
      • Action - add the content to the local_data_blob_record. Add the data blob id to data_blobs_ids_to_be_deleted_set in case it is not already added.
    3. FlushLocalDataBlob - Similar to the snapshot process
    4. FlushLocalIndexBlob - Similar to the snapshot process. If it was the last index to be flushed, mark data_blobs_written as True.
    5. DeleteDataBlob -
      • Precondition - data_blobs_ids_to_be_deleted_set is not empty and data_blobs_written is True
      • Action - delete a data blob whose ID is in data_blobs_ids_to_be_deleted_set from data_blobs_list and remove its ID from data_blobs_ids_to_be_deleted_set.
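
To make the precondition/action structure concrete, here is a minimal Python sketch of a few of the snapshot steps above. It is not the specification itself: the class names, the function names, the value of MAX_SNAPSHOT_TIME, and the decision to reset the local lists after a flush are assumptions made for illustration, and the de-duplication against local_index_entries_set is left out.

```python
from dataclasses import dataclass, field

MAX_SNAPSHOT_TIME = 3  # assumed value for the MaxSnapshotTime constant


@dataclass
class State:
    """Global state shared by all processes (only the parts used here)."""
    logical_timestamp: int = 0
    data_blobs_list: list = field(default_factory=list)
    index_blobs_list: list = field(default_factory=list)


@dataclass
class Snapshot:
    """Local record of one snapshot process."""
    content_ids_to_be_written_list: list
    start_timestamp: int
    pointer_to_next_content_to_pack: int = 0
    local_data_blob_list: list = field(default_factory=list)
    local_index_blob_list: list = field(default_factory=list)
    status: str = "in_progress"


def tick_time_forward(state):
    # Always enabled: no precondition.
    state.logical_timestamp += 1


def can_pack_contents(snap):
    return snap.pointer_to_next_content_to_pack < len(snap.content_ids_to_be_written_list)


def pack_contents(snap, count=1):
    assert can_pack_contents(snap)
    start = snap.pointer_to_next_content_to_pack
    chunk = snap.content_ids_to_be_written_list[start:start + count]
    snap.local_data_blob_list.extend(chunk)
    snap.pointer_to_next_content_to_pack += len(chunk)


def can_flush_local_data_blob(snap):
    return len(snap.local_data_blob_list) > 0


def flush_local_data_blob(state, snap):
    assert can_flush_local_data_blob(snap)
    state.data_blobs_list.append(list(snap.local_data_blob_list))
    snap.local_index_blob_list.extend(snap.local_data_blob_list)
    snap.local_data_blob_list.clear()  # assumption: the local blob is reset after a flush


def can_flush_local_index_blob(snap):
    return len(snap.local_index_blob_list) > 0


def flush_local_index_blob(state, snap):
    assert can_flush_local_index_blob(snap)
    # Only publish index entries once every packed content has been flushed.
    if not snap.local_data_blob_list:
        state.index_blobs_list.append(list(snap.local_index_blob_list))
        snap.local_index_blob_list.clear()  # assumption: reset after a flush


def can_mark_complete(state, snap):
    return (
        snap.pointer_to_next_content_to_pack == len(snap.content_ids_to_be_written_list)
        and not snap.local_data_blob_list
        and state.logical_timestamp < snap.start_timestamp + MAX_SNAPSHOT_TIME
    )


def mark_complete(state, snap):
    assert can_mark_complete(state, snap)
    snap.status = "complete"


# One possible interleaving of steps for a single snapshot.
state = State()
snap = Snapshot([2, 0, 1], start_timestamp=state.logical_timestamp)
pack_contents(snap, count=2)
flush_local_data_blob(state, snap)
pack_contents(snap, count=1)
flush_local_data_blob(state, snap)
flush_local_index_blob(state, snap)
mark_complete(state, snap)
```

A model checker would explore every interleaving of the enabled steps across all processes. The sequence at the end of the sketch is just one such interleaving, chosen by hand.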

Note that we don’t have a “failed” state for any snapshot process. In the actual implementation, a snapshot that has not completed within its time limit can either stay in progress for any length of time after the limit and still write contents, or fail. We haven’t explicitly specified a step to fail the snapshot, because the specification still captures behaviours in which a snapshot that is in progress after the time limit never takes another step to write content, and that is as good as a failed snapshot process.
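
The following Python sketch illustrates the point about missing the time limit. It lists which snapshot steps are enabled in a given local state, using the preconditions described earlier. The function name, its parameters, and the value of MAX_SNAPSHOT_TIME are hypothetical, and FlushLocalIndexBlob is left out for brevity.

```python
MAX_SNAPSHOT_TIME = 3  # assumed value for the MaxSnapshotTime constant


def enabled_snapshot_steps(logical_timestamp, start_timestamp, pointer, total, unflushed):
    """Return the names of the snapshot steps whose preconditions hold.

    pointer   - pointer_to_next_content_to_pack
    total     - length of content_ids_to_be_written_list
    unflushed - number of contents currently in local_data_blob_list
    """
    steps = []
    if pointer < total:
        steps.append("PackContents")
    if unflushed > 0:
        steps.append("FlushLocalDataBlob")
    within_limit = logical_timestamp < start_timestamp + MAX_SNAPSHOT_TIME
    if pointer == total and unflushed == 0 and within_limit:
        steps.append("MarkComplete")
    return steps


# All contents written, still within the time limit: the snapshot can complete.
before_deadline = enabled_snapshot_steps(2, 0, pointer=3, total=3, unflushed=0)

# Same local state, but the time limit has passed: MarkComplete is never enabled again.
after_deadline = enabled_snapshot_steps(3, 0, pointer=3, total=3, unflushed=0)

# Past the time limit with work remaining: it may keep writing, but it can never complete.
late_with_work = enabled_snapshot_steps(5, 0, pointer=1, total=3, unflushed=0)
```

Once the time limit has passed, MarkComplete stays disabled because logical_timestamp only increases. The remaining steps are enabled but never forced, so the explored behaviours include ones where the late snapshot writes nothing more, and those are indistinguishable from an explicit failure.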