Archive storage #44

Open
Scooletz opened this issue Apr 17, 2023 · 1 comment
Labels
ethereum An Ethereum specific work item that requires a good understanding of Eth

Comments

@Scooletz
Contributor

Scooletz commented Apr 17, 2023

An archive node is an instance of an Ethereum client configured to build an archive of all historical states. There are two queries that it needs to answer:

  1. the account information at the given block
  2. the storage information at the given block for the given account.

This means that both queries could be transformed into a query that:

  1. starts with the account = @account equality,
  2. then is followed by storage = @storage (for storage)
  3. then is followed by block <= @block, taking the entry with the highest such block, as the most recent change might have been applied in an earlier block
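
To make the intended semantics concrete, here is a minimal reference sketch of that query as a plain linear scan. It is not how Paprika would execute it; the `Change` tuple layout and the `query` function are hypothetical names used only for illustration, and an account-level entry is modeled with `None` as the storage part.

```python
from typing import Optional

# Hypothetical change record: (account, storage slot or None, block, payload).
Change = tuple[bytes, Optional[bytes], int, bytes]

def query(changes: list[Change], account: bytes,
          storage: Optional[bytes], block: int) -> Optional[bytes]:
    """Return the payload of the latest change at or before `block`, or None."""
    best = None
    for acc, slot, blk, payload in changes:
        # Equality on account and storage, then block <= @block.
        if acc == account and slot == storage and blk <= block:
            if best is None or blk > best[0]:
                best = (blk, payload)
    return None if best is None else best[1]

changes = [
    (b"\xaa", None, 5, b"balance=1"),
    (b"\xaa", None, 9, b"balance=7"),
    (b"\xaa", b"\x01", 7, b"slot=42"),
]
```

Asking for the account at block 8 resolves to the change written at block 5, because that is the most recent one not newer than the requested block.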

It should then be possible to use the same db structure, where the key is encoded as a NibblePath by concatenating account + storage + block and some payload is mapped to it. This concatenation could allow Paprika's prefixing behavior to make the dataset smaller. A separate page type, with a different flushing behavior and a different fan out, could also be applied.
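
A minimal sketch of the concatenated key, under several assumptions that are not part of the proposal: the block number is stored as 8 big-endian bytes (so byte-wise order matches numeric order), all keys in one index have the same length, and a sorted Python list stands in for the on-disk tree. `make_key`, `to_nibbles` and `SortedArchive` are hypothetical names and do not reflect Paprika's actual NibblePath API.

```python
import bisect

BLOCK_LEN = 8  # assumption: block number as 8 big-endian bytes

def make_key(account: bytes, storage: bytes, block: int) -> bytes:
    # Big-endian keeps byte-wise ordering equal to numeric block ordering.
    return account + storage + block.to_bytes(BLOCK_LEN, "big")

def to_nibbles(key: bytes) -> list[int]:
    # Two nibbles per byte, high nibble first; this preserves key ordering.
    out = []
    for b in key:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return out

class SortedArchive:
    """Sorted key list standing in for the tree; all keys share one length."""

    def __init__(self):
        self._keys = []
        self._values = []

    def put(self, account: bytes, storage: bytes, block: int, payload: bytes):
        key = make_key(account, storage, block)
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = payload  # overwrite an existing entry
        else:
            self._keys.insert(i, key)
            self._values.insert(i, payload)

    def get_at(self, account: bytes, storage: bytes, block: int):
        prefix = account + storage
        # Greatest stored key <= account + storage + block.
        i = bisect.bisect_right(self._keys, make_key(account, storage, block)) - 1
        if i >= 0 and self._keys[i].startswith(prefix):
            return self._values[i]
        return None  # no change for this prefix at or before `block`
```

Because the block number is the last and most variable part of the key, all versions of one slot sit next to each other, and the `block <= @block` condition becomes a single seek to the predecessor key.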

From the consumption point of view, archive processing could be a sidecar that processes the storage & state slots flushed by the Blockchain.FlusherTask method. Each block that is applied could then be inspected and applied to a separate archive root. This could be done by extending RootPage with a single additional address that would point to the separate Archive tree.
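
The sidecar idea can be sketched roughly as follows. `ArchiveSidecar` and `on_block_flushed` are hypothetical names; they do not mirror the real signature of Blockchain.FlusherTask, and a dictionary of per-key histories stands in for the separate Archive tree.

```python
from collections import defaultdict

class ArchiveSidecar:
    """Hypothetical consumer that receives the changes of each flushed block."""

    def __init__(self):
        # (account, slot or None) -> list of (block, payload) in block order.
        self.history = defaultdict(list)
        self.last_block = None

    def on_block_flushed(self, block: int, accounts: dict, storage: dict):
        # Assumption: blocks arrive in strictly increasing order.
        if self.last_block is not None and block <= self.last_block:
            raise ValueError("blocks must be applied in increasing order")
        for account, payload in accounts.items():
            self.history[(account, None)].append((block, payload))
        for (account, slot), payload in storage.items():
            self.history[(account, slot)].append((block, payload))
        self.last_block = block
```

Since entries are appended in block order, each per-key history stays sorted, which is what a later "latest change at or before block N" lookup relies on.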

Remarks:

  1. The application of the data could be done in a relaxed manner, where the flush to disk happens only now and then. There are options for this already.
  2. Storing or calculating hashes could be made optional.
  3. Querying the state should take into consideration both the transient in-memory state and the archive.
  4. The other option is to have the Archive totally separated. That could be considered, but there is something to be said for having it embedded in the block processing, where the data are fresh, in memory, and ready to be stored by this archiving sidecar.
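
A minimal sketch of remark 3, assuming that every block still held in memory is newer than every block already written to the archive. `read_at`, `transient` and `archive_lookup` are hypothetical names introduced only for this illustration.

```python
from typing import Callable, Optional

def read_at(key, block: int,
            transient: dict,
            archive_lookup: Callable) -> Optional[bytes]:
    """Resolve `key` as of `block`, checking in-memory blocks before the archive.

    transient: {block_number: {key: payload}} for blocks not yet archived.
    archive_lookup: callable (key, block) -> payload or None.
    """
    # Newest in-memory block first; the first hit is the latest change.
    for blk in sorted(transient, reverse=True):
        if blk <= block and key in transient[blk]:
            return transient[blk][key]
    # Nothing recent enough in memory: fall back to the archive.
    return archive_lookup(key, block)
```

With this layering, the flush to disk can lag behind block processing without queries observing stale data, as long as the ordering assumption above holds.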
@Scooletz Scooletz added this to the Protocol&Migration milestone Apr 17, 2023
@LukaszRozmej
Member

There should be a (slow) way to support deep reorganizations with archive storage.
I expect archive storage to be indexed by block (or by each transaction within a block). It would be good to have an option to prune archive storage, keeping only some arbitrary number of blocks (for example, only the last year or month).
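
Both operations can be sketched against a simple per-key history, assuming each history is a list of (block, payload) pairs sorted by block. `revert_to` and `prune_before` are hypothetical helpers, not existing Paprika functions. One design point worth noting: when pruning, the latest entry before the cutoff is kept as a baseline, otherwise a query for a retained block could no longer find the value that was in effect at that time.

```python
def revert_to(history: dict, fork_block: int) -> None:
    """Drop every entry written after `fork_block` (full scan, hence slow)."""
    for key in list(history):
        kept = [(b, v) for b, v in history[key] if b <= fork_block]
        if kept:
            history[key] = kept
        else:
            del history[key]

def prune_before(history: dict, cutoff: int) -> None:
    """Drop entries older than `cutoff`, keeping one baseline entry per key."""
    for key, entries in history.items():
        older = [e for e in entries if e[0] < cutoff]
        newer = [e for e in entries if e[0] >= cutoff]
        # Keep only the last older entry so blocks >= cutoff still resolve.
        history[key] = older[-1:] + newer
```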

@Scooletz Scooletz added the ethereum An Ethereum specific work item that requires a good understanding of Eth label Aug 3, 2023
Projects
Status: Backlog