feat: add VersionedSMT for historical queries #20

Closed
wants to merge 16 commits into from
6 changes: 3 additions & 3 deletions .github/workflows/test.yml
@@ -9,15 +9,15 @@ on:

env:
# Even though we can test against multiple versions, this one is considered a target version.
TARGET_GOLANG_VERSION: "1.18"
TARGET_GOLANG_VERSION: "1.20"

jobs:
tests:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
go: ["1.18"]
go: ["1.20"]
name: Go Tests
steps:
- uses: actions/checkout@v3
@@ -83,7 +83,7 @@ jobs:
fail-fast: false
matrix:
goarch: ["arm64", "amd64"]
go: ["1.18"]
go: ["1.20"]
timeout-minutes: 5
name: Build for ${{ matrix.goarch }}
steps:
104 changes: 104 additions & 0 deletions KVStore.md
@@ -0,0 +1,104 @@
# KVStore <!-- omit in toc -->

- [Overview](#overview)
- [Implementation](#implementation)
- [In-Memory and Persistent](#in-memory-and-persistent)
- [Store methods](#store-methods)
- [Lifecycle Methods](#lifecycle-methods)
- [Data Methods](#data-methods)
- [Backups](#backups)
- [Restorations](#restorations)
- [Accessor Methods](#accessor-methods)
- [Prefixed and Sorted Get All](#prefixed-and-sorted-get-all)
- [Clear All Key-Value Pairs](#clear-all-key-value-pairs)
- [Len](#len)

## Overview

The `KVStore` interface is a key-value store that is used by the `SMT` and `SMST` as the underlying database for their nodes. However, it is an independent key-value store that can be used for any purpose.

## Implementation

The `KVStore` is implemented in [`kvstore.go`](./kvstore.go) and is a wrapper around the [BadgerDB](https://github.com/dgraph-io/badger) key-value database.

The interface defines simple key-value store accessor methods as well as other methods desired from a key-value database in general.

```go
type KVStore interface {
// Store methods
Get(key []byte) ([]byte, error)
Set(key, value []byte) error
Delete(key []byte) error

// Lifecycle methods
Stop() error

// Data methods
Backup(writer io.Writer, incremental bool) error
Restore(io.Reader) error

// Accessors
GetAll(prefixKey []byte, descending bool) (keys, values [][]byte, err error)
Exists(key []byte) (bool, error)
ClearAll() error
Len() int
}
```

_NOTE: Any other key-value store that satisfies the `KVStore` interface can be used as the database for the `SM(S)T`._

### In-Memory and Persistent

The `KVStore` implementation can be used as an in-memory or persistent key-value store. The `NewKVStore` function takes a `path` argument that can be used to specify a path to a directory to store the database files. If the `path` is an empty string, the database will be stored in-memory.

_NOTE: When providing a path for a persistent database, the directory must exist and be writeable by the user running the application._

### Store methods

As a key-value store the `KVStore` interface defines the simple `Get`, `Set` and `Delete` methods to access and modify the underlying database.
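
To make the shape of these three methods concrete, here is a minimal sketch backed by a plain Go map rather than BadgerDB. The `mapStore` type, `newMapStore` and `errKeyNotFound` are hypothetical names used only for illustration and are not part of this library.

```go
package main

import (
	"errors"
	"fmt"
)

// errKeyNotFound is a hypothetical sentinel error for this sketch.
var errKeyNotFound = errors.New("key not found")

// mapStore is a hypothetical map-backed store used only for illustration.
type mapStore struct {
	data map[string][]byte
}

func newMapStore() *mapStore {
	return &mapStore{data: make(map[string][]byte)}
}

// Get returns the value stored under key, or an error if it is absent.
func (m *mapStore) Get(key []byte) ([]byte, error) {
	value, ok := m.data[string(key)]
	if !ok {
		return nil, errKeyNotFound
	}
	return value, nil
}

// Set stores a copy of value under key, replacing any existing value.
func (m *mapStore) Set(key, value []byte) error {
	m.data[string(key)] = append([]byte(nil), value...)
	return nil
}

// Delete removes key; deleting a missing key is a no-op in this sketch.
func (m *mapStore) Delete(key []byte) error {
	delete(m.data, string(key))
	return nil
}

func main() {
	store := newMapStore()
	_ = store.Set([]byte("foo"), []byte("bar"))
	value, _ := store.Get([]byte("foo"))
	fmt.Println(string(value))

	_ = store.Delete([]byte("foo"))
	_, err := store.Get([]byte("foo"))
	fmt.Println(errors.Is(err, errKeyNotFound))
}
```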

### Lifecycle Methods

The `Stop` method **must** be called when the `KVStore` is no longer needed. This method closes the underlying database and frees up any resources used by the `KVStore`.

For persistent databases, the `Stop` method should be called when the application no longer needs to access the database. For in-memory databases, the `Stop` method should be called when the `KVStore` is no longer needed.

_NOTE: A persistent `KVStore` that is not stopped will prevent another `KVStore` from opening the same database._

### Data Methods

The `KVStore` interface provides two methods to allow backups and restorations.

#### Backups

The `Backup` method takes an `io.Writer` and a `bool` to indicate whether the backup should be incremental or not. The `io.Writer` is then filled with the contents of the database in an opaque format used by the underlying database for this purpose.

When the `incremental` bool is `false` a full backup will be performed, otherwise an incremental backup will be performed. This is enabled by the `KVStore` keeping the timestamp of its last backup and only backing up data that has been modified since the last backup.
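
The idea behind incremental backups can be sketched with a logical clock, assuming each write is stamped with a counter and the store remembers the counter value of its last backup. This is not how the library or BadgerDB encodes backups; `backupStore`, `entry` and `copyBackup` are hypothetical, `encoding/gob` stands in for the opaque backup format, and deletions are not tracked.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"io"
)

// entry pairs a value with the logical timestamp of its last write.
type entry struct {
	Value   []byte
	Version uint64
}

// backupStore is a hypothetical store that remembers when it was last backed up.
type backupStore struct {
	data       map[string]entry
	clock      uint64 // incremented on every write
	lastBackup uint64 // clock value captured by the previous backup
}

func newBackupStore() *backupStore {
	return &backupStore{data: make(map[string]entry)}
}

func (s *backupStore) Set(key, value []byte) {
	s.clock++
	s.data[string(key)] = entry{Value: value, Version: s.clock}
}

// Backup writes every entry (full) or only entries written since the
// previous backup (incremental), then records the current clock.
func (s *backupStore) Backup(w io.Writer, incremental bool) error {
	var since uint64
	if incremental {
		since = s.lastBackup
	}
	out := make(map[string]entry)
	for k, e := range s.data {
		if e.Version > since {
			out[k] = e
		}
	}
	if err := gob.NewEncoder(w).Encode(out); err != nil {
		return err
	}
	s.lastBackup = s.clock
	return nil
}

// Restore applies the backed-up entries on top of the store's contents,
// overwriting any keys that are present in the backup.
func (s *backupStore) Restore(r io.Reader) error {
	in := make(map[string]entry)
	if err := gob.NewDecoder(r).Decode(&in); err != nil {
		return err
	}
	for k, e := range in {
		s.data[k] = e
		if e.Version > s.clock {
			s.clock = e.Version
		}
	}
	return nil
}

// copyBackup backs up src and restores the result into dst.
func copyBackup(src, dst *backupStore, incremental bool) error {
	var buf bytes.Buffer
	if err := src.Backup(&buf, incremental); err != nil {
		return err
	}
	return dst.Restore(&buf)
}

func main() {
	src := newBackupStore()
	src.Set([]byte("a"), []byte("1"))
	src.Set([]byte("b"), []byte("2"))

	replica := newBackupStore()
	_ = copyBackup(src, replica, false) // full backup carries both keys

	src.Set([]byte("c"), []byte("3"))
	_ = copyBackup(src, replica, true) // incremental backup carries only "c"
	fmt.Println(len(replica.data))
}
```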

#### Restorations

The `Restore` method takes an `io.Reader` and restores the database from this reader.

The `KVStore` calling the `Restore` method is expected to be initialised and open, otherwise the restore will fail.

_NOTE: Any data contained in the `KVStore` when calling restore will be overwritten._

### Accessor Methods

The accessor methods enable simpler access to the underlying database for certain tasks that are desirable in a key-value store.

#### Prefixed and Sorted Get All

The `GetAll` method supports the retrieval of all keys and values, where the key has a specific prefix. The `descending` bool indicates whether the keys should be returned in descending order or not.

_NOTE: To retrieve all keys and values, pass the empty prefix `[]byte{}`, which matches every key._
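
The prefix filtering and ordering can be sketched over a plain map as follows. The `getAll` helper is hypothetical and only mirrors the behaviour described here; the library's implementation iterates the underlying database instead.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// getAll returns every key that starts with prefix, sorted bytewise in
// ascending or descending order, along with the matching values.
func getAll(data map[string][]byte, prefix []byte, descending bool) (keys, values [][]byte) {
	for k := range data {
		// An empty prefix matches every key.
		if bytes.HasPrefix([]byte(k), prefix) {
			keys = append(keys, []byte(k))
		}
	}
	sort.Slice(keys, func(i, j int) bool {
		cmp := bytes.Compare(keys[i], keys[j])
		if descending {
			return cmp > 0
		}
		return cmp < 0
	})
	for _, k := range keys {
		values = append(values, data[string(k)])
	}
	return keys, values
}

func main() {
	data := map[string][]byte{
		"app:1": []byte("a"),
		"app:2": []byte("b"),
		"cfg:1": []byte("c"),
	}
	keys, values := getAll(data, []byte("app:"), true)
	for i := range keys {
		fmt.Printf("%s=%s\n", keys[i], values[i])
	}
}
```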

#### Clear All Key-Value Pairs

The `ClearAll` method removes all key-value pairs from the database.

_NOTE: The `ClearAll` method is intended for debugging purposes and should not be used in production unless necessary._

#### Len

The `Len` method returns the number of keys in the database, similarly to how the `len` function can return the length of a map.
10 changes: 8 additions & 2 deletions MerkleSumTree.md
@@ -233,9 +233,12 @@ import (
)

func main() {
// Initialise a new key-value store to store the nodes of the tree
// Initialise a new in-memory key-value store to store the nodes of the tree
// (Note: the tree only stores hashed values, not raw value data)
nodeStore := smt.NewSimpleMap()
nodeStore := smt.NewKVStore("")

// Ensure the database connection closes
defer nodeStore.Stop()

// Initialise the tree
tree := smt.NewSparseMerkleSumTree(nodeStore, sha256.New())
@@ -245,6 +248,9 @@ func main() {
_ = tree.Update([]byte("baz"), []byte("zab"), 7)
_ = tree.Update([]byte("bin"), []byte("nib"), 3)

// Commit the changes to the nodeStore
_ = tree.Commit()

sum := tree.Sum()
fmt.Println(sum == 20) // true

55 changes: 29 additions & 26 deletions README.md
@@ -6,28 +6,29 @@
[![Tests](https://github.com/pokt-network/smt/actions/workflows/test.yml/badge.svg)](https://github.com/pokt-network/smt/actions/workflows/test.yml)
[![codecov](https://codecov.io/gh/pokt-network/smt/branch/main/graph/badge.svg)](https://codecov.io/gh/pokt-network/smt)

Note: **Requires Go 1.18+**
Note: **Requires Go 1.20+**

- [Overview](#overview)
- [Implementation](#implementation)
- [Inner Nodes](#inner-nodes)
- [Extension Nodes](#extension-nodes)
- [Leaf Nodes](#leaf-nodes)
- [Lazy Nodes](#lazy-nodes)
- [Lazy Loading](#lazy-loading)
- [Visualisations](#visualisations)
- [General Tree Structure](#general-tree-structure)
- [Lazy Nodes](#lazy-nodes-1)
- [Inner Nodes](#inner-nodes)
- [Extension Nodes](#extension-nodes)
- [Leaf Nodes](#leaf-nodes)
- [Lazy Nodes](#lazy-nodes)
- [Lazy Loading](#lazy-loading)
- [Visualisations](#visualisations)
- [General Tree Structure](#general-tree-structure)
- [Lazy Nodes](#lazy-nodes-1)
- [Paths](#paths)
- [Visualisation](#visualisation)
- [Visualisation](#visualisation)
- [Values](#values)
- [Nil values](#nil-values)
- [Nil values](#nil-values)
- [Hashers \& Digests](#hashers--digests)
- [Proofs](#proofs)
- [Verification](#verification)
- [Verification](#verification)
- [Database](#database)
- [Data Loss](#data-loss)
- [Data Loss](#data-loss)
- [Sparse Merkle Sum Tree](#sparse-merkle-sum-tree)
- [Versioned and Immutable Trees](#versioned-and-immutable-trees)
- [Example](#example)

## Overview
@@ -295,20 +296,12 @@ The verification step simply uses the proof data to recompute the root hash with

## Database

This library defines the `MapStore` interface, in [mapstore.go](./mapstore.go)

```go
type MapStore interface {
Get(key []byte) ([]byte, error)
Set(key []byte, value []byte) error
Delete(key []byte) error
}
```

This interface abstracts the `SimpleMap` key-value store and can be used by the SMT to store the nodes of the tree. Any key-value store that implements the `MapStore` interface can be used with this library.
This library defines the `KVStore` interface, which is implemented by default using [BadgerDB](https://github.com/dgraph-io/badger); however, any database that implements this interface can be used as a drop-in replacement. The `KVStore` allows both in-memory and persisted databases to be used to store the nodes of the SMT.

When changes are committed to the underlying database using `Commit()`, the digests of the leaf nodes are stored at their respective paths. If retrieved manually from the database the returned value will be the digest of the leaf node, **not** the leaf node's value, even when `WithValueHasher(nil)` is used. The node value can be parsed from this value, as the tree `Get` function does by removing the prefix and path bytes from the returned value.

See [KVStore.md](./KVStore.md) for the details of the implementation.

### Data Loss

In the event of a system crash or unexpected failure of the program utilising the SMT, if the `Commit()` function has not been called, any changes to the tree will be lost. This is due to the underlying database not being changed **until** the `Commit()` function is called and changes are persisted.
@@ -317,6 +310,10 @@ In the event of a system crash or unexpected failure of the program utilising th

This library also implements a Sparse Merkle Sum Tree (SMST), the documentation for which can be found [here](./MerkleSumTree.md).

## Versioned and Immutable Trees

This library provides a versioned tree that allows multiple versions of the same SMT to be stored and used for historical queries, the documentation for which can be found [here](./Versioned.md).

## Example

```go
@@ -330,16 +327,22 @@ import (
)

func main() {
// Initialise a new key-value store to store the nodes of the tree
// Initialise a new in-memory key-value store to store the nodes of the tree
// (Note: the tree only stores hashed values, not raw value data)
nodeStore := smt.NewSimpleMap()
nodeStore := smt.NewKVStore("")

// Ensure the database connection closes
defer nodeStore.Stop()

// Initialise the tree
tree := smt.NewSparseMerkleTree(nodeStore, sha256.New())

// Update the key "foo" with the value "bar"
_ = tree.Update([]byte("foo"), []byte("bar"))

// Commit the changes to the node store
_ = tree.Commit()

// Generate a Merkle proof for "foo"
proof, _ := tree.Prove([]byte("foo"))
root := tree.Root() // We also need the current tree root for the proof
113 changes: 113 additions & 0 deletions Versioned.md
@@ -0,0 +1,113 @@
# Versioned Trees <!-- omit in toc -->

- [Overview](#overview)
- [Implementation](#implementation)
- [SMT Methods](#smt-methods)
- [Versioning](#versioning)
- [Set Initial Version](#set-initial-version)
- [Version Accessors](#version-accessors)
- [Save Version](#save-version)
- [Get Versioned Keys](#get-versioned-keys)
- [Get Immutable Tree](#get-immutable-tree)
- [Database Lifecycle](#database-lifecycle)

## Overview

The `VersionedSMT` interface defines an SMT that supports versioning: the versioned tree stores its previous versions, allowing historical queries and proofs to be generated. The interface is implemented by two separate types: `ImmutableTree` and `VersionedTree`. A `VersionedTree` is mutable and can be changed and saved. Once saved, a tree becomes immutable and the `Update`, `Delete`, `Commit`, `SetInitialVersion` and `SaveVersion` methods will cause a panic.

A `VersionedTree` maintains its own internal state, keeping track of its available versions and keeping a `KVStore` which it can use to retrieve the data for previous versions. The `KVStore` contains the information needed to import an `ImmutableTree` from its specific database path.

_NOTE: If the user deletes the database of a previous version, the tree will not be able to import that version and will panic if its use is attempted._

See: [immutable.go](./immutable.go) and [versioned.go](./versioned.go) for more details on the implementation specifics.

## Implementation

The `VersionedSMT` interface is as follows

```go
type VersionedSMT interface {
SparseMerkleTree

// --- Versioning ---
SetInitialVersion(uint64) error
Version() uint64
AvailableVersions() []uint64
SaveVersion() error
VersionExists(uint64) bool
GetVersioned(key []byte, version uint64) ([]byte, error)
GetImmutable(uint64) (*ImmutableTree, error)

// --- Database ---
Stop() error
}
```

Both the `VersionedTree` and `ImmutableTree` types implement this interface where the `ImmutableTree` panics when any modifying method is called, due to it being immutable.
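
The "same interface, but mutators panic" pattern can be sketched as a thin wrapper. The `mutableTree`, `immutableTree` and `didPanic` names are hypothetical and the types are heavily simplified; the real `ImmutableTree` is defined in `immutable.go`.

```go
package main

import "fmt"

// mutableTree is a hypothetical, simplified stand-in for a mutable tree.
type mutableTree struct {
	data map[string][]byte
}

func (t *mutableTree) Update(key, value []byte) error {
	t.data[string(key)] = value
	return nil
}

func (t *mutableTree) Get(key []byte) []byte {
	return t.data[string(key)]
}

// immutableTree exposes the same methods but refuses to be modified.
type immutableTree struct {
	inner *mutableTree
}

// Get is read-only, so it simply delegates to the wrapped tree.
func (t *immutableTree) Get(key []byte) []byte {
	return t.inner.Get(key)
}

// Update panics because the tree is immutable.
func (t *immutableTree) Update(key, value []byte) error {
	panic("immutable tree: Update is not allowed")
}

// didPanic reports whether calling f panicked.
func didPanic(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	tree := &mutableTree{data: make(map[string][]byte)}
	_ = tree.Update([]byte("foo"), []byte("bar"))

	frozen := &immutableTree{inner: tree}
	fmt.Println(string(frozen.Get([]byte("foo"))))
	fmt.Println(didPanic(func() { _ = frozen.Update([]byte("foo"), []byte("baz")) }))
}
```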

The `VersionedTree` implementation keeps track of the available versions, and keeps a `KVStore` to restore these versions when needed. The last saved version is not only stored with the others in the `KVStore` but is also kept open and accessible as an `ImmutableTree` embedded within the `VersionedTree` struct. This allows for the `VersionedTree` to easily access its most recent version without having to restore it from the `KVStore`.

When a new version is saved, the previous version is closed and the version being saved is stored in the `KVStore` and imported as an `ImmutableTree`, which replaces the older previous version in the struct.

_NOTE: This does not overwrite the database of the previously saved version; it only replaces the pointer to the open, in-memory `ImmutableTree`._

### SMT Methods

The inclusion of the `SparseMerkleTree` interface within the `VersionedSMT` interface enables the use of all the regular SMT methods.

See: [smt.go](./smt.go) and [types.go](./types.go) for more details on the SMT implementation.

### Versioning

The `VersionedSMT` interface naturally defines methods relevant to storing previous versions of the tree.

#### Set Initial Version

Upon creation a `VersionedTree` has a default version of 0. This can be overridden once, if and only if the tree has not yet been saved and had its version incremented.
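
The "once, and only before saving" rule could be enforced with two flags, as in the sketch below. The struct fields and error messages are hypothetical, and `SaveVersion` here models only the version bookkeeping (it assumes that saving increments the version); see `versioned.go` for the actual checks.

```go
package main

import (
	"errors"
	"fmt"
)

// versionedTree is a hypothetical model of the version bookkeeping only.
type versionedTree struct {
	version    uint64
	initialSet bool // SetInitialVersion has already been used
	saved      bool // at least one version has been saved
}

// SetInitialVersion overrides the default version of 0. It succeeds at
// most once, and only while no version has been saved.
func (t *versionedTree) SetInitialVersion(v uint64) error {
	if t.initialSet {
		return errors.New("initial version already set")
	}
	if t.saved {
		return errors.New("cannot set initial version after saving")
	}
	t.version = v
	t.initialSet = true
	return nil
}

// SaveVersion marks the tree as saved and moves on to the next version.
func (t *versionedTree) SaveVersion() error {
	t.saved = true
	t.version++
	return nil
}

func main() {
	tree := &versionedTree{}
	fmt.Println(tree.SetInitialVersion(10)) // first override succeeds
	fmt.Println(tree.SetInitialVersion(20)) // second override is rejected
	_ = tree.SaveVersion()
	fmt.Println(tree.version)
}
```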

#### Version Accessors

The following version accessors have simple functionalities:

- `Version` method returns the current version of the `VersionedSMT`.
- `AvailableVersions` method returns a slice (`[]uint64`) of all the available versions of the `VersionedSMT`.
- `VersionExists` method returns a boolean indicating if the given version exists.

#### Save Version

The `SaveVersion` method saves the current version of the `VersionedTree`. As detailed above ([Implementation](#implementation)), the current version is saved in the `KVStore` and is also kept open, embedded in the `VersionedTree` struct. This in-memory copy gives easy access to the most recently saved version without having to reopen its database using the data retrieved from the `KVStore`.

A `VersionedTree` can be saved by encoding it into the following struct:

```go
type storedTree struct {
Db_path string
Root []byte
Version uint64
Th hash.Hash
Ph hash.Hash
Vh hash.Hash
}
```

This struct is serialised and stored in the `KVStore` under a key derived from the version: the version number encoded as big-endian bytes using `binary.BigEndian.PutUint64`.
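
Note that `binary.BigEndian.PutUint64` writes into a caller-supplied buffer of at least 8 bytes rather than returning a slice. A small sketch of deriving such a key follows; the `versionKey` and `versionFromKey` helper names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// versionKey encodes a version number as an 8-byte big-endian key.
// The helper name is hypothetical; only the encoding call comes from the text.
func versionKey(version uint64) []byte {
	key := make([]byte, 8)
	binary.BigEndian.PutUint64(key, version)
	return key
}

// versionFromKey reverses versionKey.
func versionFromKey(key []byte) uint64 {
	return binary.BigEndian.Uint64(key)
}

func main() {
	key := versionKey(258)
	fmt.Printf("%x\n", key)
	fmt.Println(versionFromKey(key))
}
```

One consequence of the big-endian encoding is that keys compare bytewise in the same order as their version numbers, so a sorted scan over the keys visits versions in numeric order.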

If `max_stored_versions` was set when the `VersionedTree` was created or imported, then whenever saving causes the number of saved versions to exceed `max_stored_versions`, the oldest version is deleted from the `KVStore` and its database is deleted from the file system.
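
The pruning rule can be sketched as a pure function over the list of saved versions. The `pruneOldest` helper is hypothetical, assumes the versions are sorted in ascending order, and treats a limit of 0 as "no limit", which is an assumption rather than documented behaviour.

```go
package main

import "fmt"

// pruneOldest assumes versions is sorted in ascending order and returns
// the versions to keep and the versions to remove. A maxStored of 0 is
// treated as "no limit" in this sketch, which is an assumption.
func pruneOldest(versions []uint64, maxStored int) (kept, removed []uint64) {
	if maxStored <= 0 || len(versions) <= maxStored {
		return versions, nil
	}
	cut := len(versions) - maxStored
	return versions[cut:], versions[:cut]
}

func main() {
	versions := []uint64{1, 2, 3, 4}
	kept, removed := pruneOldest(versions, 3)
	for _, v := range removed {
		// In the real tree this is where the KVStore entry and the
		// version's database directory would be deleted.
		fmt.Println("removing version", v)
	}
	fmt.Println("kept", kept)
}
```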

#### Get Versioned Keys

The `GetVersioned` method retrieves the value for a given key at the specified version. If the version matches the current version or the previously stored version, both of which are open in memory, that tree is used directly. Otherwise the database is opened using the data from the `KVStore` and the value is retrieved from an imported `ImmutableTree`.
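
The lookup order described here can be sketched as below, with plain maps standing in for trees. Everything in the example is hypothetical: `snapshot` replaces the tree types, `load` replaces reading the `KVStore` and importing an `ImmutableTree`, and returning an error for a missing key is a simplification.

```go
package main

import (
	"errors"
	"fmt"
)

// snapshot is a hypothetical stand-in for a tree's key-value contents.
type snapshot map[string][]byte

var errKeyNotFound = errors.New("key not found")

type versionedTree struct {
	version     uint64   // version currently being built
	current     snapshot // contents of the current version
	prevVersion uint64   // most recently saved version
	previous    snapshot // kept open in memory; nil if nothing was saved
	// load stands in for reading the version's metadata from the KVStore
	// and importing an ImmutableTree from its database path.
	load func(version uint64) (snapshot, error)
}

// GetVersioned prefers the in-memory versions and only falls back to
// loading a stored version when neither matches.
func (t *versionedTree) GetVersioned(key []byte, version uint64) ([]byte, error) {
	var src snapshot
	switch {
	case version == t.version:
		src = t.current
	case t.previous != nil && version == t.prevVersion:
		src = t.previous
	default:
		loaded, err := t.load(version)
		if err != nil {
			return nil, err
		}
		src = loaded
	}
	value, ok := src[string(key)]
	if !ok {
		return nil, errKeyNotFound
	}
	return value, nil
}

// newDemoTree builds a tree at version 3 with versions 1 and 2 saved.
func newDemoTree() *versionedTree {
	stored := map[uint64]snapshot{
		1: {"foo": []byte("v1")},
		2: {"foo": []byte("v2")},
	}
	return &versionedTree{
		version:     3,
		current:     snapshot{"foo": []byte("v3")},
		prevVersion: 2,
		previous:    stored[2],
		load: func(version uint64) (snapshot, error) {
			s, ok := stored[version]
			if !ok {
				return nil, errors.New("version not stored")
			}
			return s, nil
		},
	}
}

func main() {
	tree := newDemoTree()
	for _, v := range []uint64{3, 2, 1} {
		value, err := tree.GetVersioned([]byte("foo"), v)
		fmt.Println(v, string(value), err)
	}
}
```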

#### Get Immutable Tree

The `GetImmutable` method returns an `ImmutableTree` for the given version. It can only do so if the version has already been stored. If the version has not been stored it will return an error.

The `ImmutableTree` returned cannot be modified without directly writing to the underlying database, which **should never** be done.

_NOTE: A user who retrieves an `ImmutableTree` is responsible for closing the connection to its underlying database. The exception is the previously stored version, which **should not** be closed as it is expected to remain open._

### Database Lifecycle

As the `VersionedTree` requires a persistent `KVStore` the database must be closed properly once the tree is no longer in use. When calling the `SaveVersion` method this is handled automatically, however if the user wishes to stop using the tree before saving a version they must call the `Stop` method to close the database connection.

The `Stop` method will close the `VersionedTree`'s node store, as well as that of the previously stored `ImmutableTree` and also the `KVStore` that stores the saved versions.