Full Proto Encoding #5444

Closed
18 of 28 tasks
alexanderbez opened this issue Dec 23, 2019 · 28 comments

Comments

@alexanderbez
Contributor

alexanderbez commented Dec 23, 2019

Summary

Encoding in the SDK is currently handled by the go-amino wire codec. The implementation and use of this encoding protocol has introduced several major concerns over time, with the two biggest being performance and cross-client/language support.

The context and discussions around amino are dense, so the above is solely meant to be a high-level primer.

Problem Definition

In order to facilitate easier integration for clients and drastically improve the usability of the SDK and its performance, alternative encoding methods need to be explored.

Proposal

As it stands, the proposals are as follows:

  1. Update go-amino to be fully proto3 compatible. There is already a PR that attempts to achieve this.
    a. This is probably the path of least resistance, but it doesn't actually solve any of our problems directly, and it seems like a fragile attempt that will require handling corner cases, increasing the surface area for bugs and poor UX.
  2. Remove go-amino entirely in favor of gogo-proto.
    a. This seems to be the most viable and promising approach. It will address both performance and client UX concerns.
  3. Remove go-amino entirely in favor of capn-proto.
    a. Similar to proposal (2) but with zero-copy and canonicalization features. However, the concern is a degree of vendor lock-in, in the sense of not having as rich multi-client support as pure proto. AFAIK, the encoding uses fixed-width integers, so this approach could lead to larger encoded sizes. Compaction is available, but doing so negates the zero-copy feature.
  4. Consider another encoding scheme entirely (e.g. flatbuffers).
    a. Most likely too late for this (atm). It would require heavy research and pretty compelling arguments to steer in this direction. In addition, it would probably take drastically more time than any other proposal.

Accepted Proposal

Proposal (2) will be adopted. However, @jaekwon will continue to develop go-amino with the goal of it being fully wire-compatible with this proposal. The go-amino proposal is tracked in tendermint/go-amino#302.

What does this mean?

  • When the amino changes are complete, we shall revisit reincorporating go-amino based mainly on UX and performance.
  • Interfaces must adhere to the oneof implementation where the oneof is the only field member and the sum fields must start at 1.
  • Modules will define their own messages and types that must be binary serialized and persisted using package name cosmos_sdk.{module}.v1.
  • An application-level codec will exist with a single oneof to handle messages.

Roadmap & Actionables

State

Client

TX CLI/REST

Tutorials

  • nameservice
  • scavenge

Documentation & Upgrade Guide

Client libs

  • cosmos/amino-js (if need be)

@jordansexton @aaronc


@dshulyak
Contributor

There is one concern with using protobufs for blockchain protocols. Protobuf supports arbitrary maps, but serialization of maps is not deterministic; it usually depends on the library and the language. Some libraries sort maps before serializing them, but AFAIK there is no specification or decision for it, and developers must not depend on sorting even when a library happens to provide it.

gogo-proto serializes maps in iteration order, so it is almost guaranteed that two consecutive serializations will produce different results. It is easy to imagine all sorts of bugs if someone misuses this; consider the example:

syntax = "proto3";

message Block {
        repeated Tx tx = 1;
}

message Tx {
        map<string, string> OpCodes = 1;
}

package types

import (
	"crypto/ed25519"
	"strconv"
	"testing"
)

func TestNondeterministic(t *testing.T) {
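	// Two consecutive Marshal calls on this block produce different bytes
	// because the map field is serialized in randomized iteration order, so
	// the signature over data1 fails to verify against data2.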
	tx := Tx{
		OpCodes: map[string]string{},
	}
	block := Block{
		Tx: []*Tx{&tx},
	}

	for i := 0; i < 100; i++ {
		ia := strconv.Itoa(i)
		tx.OpCodes[ia] = ia
	}

	data1, _ := block.Marshal()
	data2, _ := block.Marshal()

	pubk, pk, _ := ed25519.GenerateKey(nil)
	sig := ed25519.Sign(pk, data1)

	if !ed25519.Verify(pubk, data2, sig) {
		t.Error("how come?")
	}
}

I know that Ethereum 2 client developers opted out of using protobuf for this reason, but the protocol requirements might be a bit different. I am not sure if there is a safe workaround that would make protobuf less error-prone for these use cases.

@dshulyak
Contributor

I initially thought that the nondeterminism issue is only present if maps are used, but it turns out there are more subtle cases: https://gist.github.com/kchristidis/39c8b310fd9da43d515c4394c3cd9510

@aaronc
Member

aaronc commented Dec 26, 2019

Thanks for sharing @dshulyak. We are aware of these concerns and the gist you linked above. First of all, we are not intending to use maps at all, but yes, there are still potential issues. Here is my current take on what can be done with protobuf:

  1. we define a deterministic protobuf encoding, publish its behavior clearly, and ensure that the go implementation and clients follow this for signing. Likely this is mostly covered by saying all fields should be encoded in order and with no empty fields. Gogo proto has a deterministic mode, but I think we should be willing to upstream PR's to gogo proto if it isn't sufficient. I am aware that a generalized canonical encoding is impossible for protobuf because of the presence of unknown fields - so a necessary constraint on this deterministic encoding would be that messages can contain no unknown fields. This is maybe okay since generally transaction processors should reject messages they don't understand.
  2. we champion Canonical JSON and insist that clients canonicalize the protobuf-generated JSON for signing purposes (see the sketch after this list). In my mind using JSON provides the most safety - because a textual encoding captures intent the best - with likely less effort for implementors, but with a definite performance cost for signature verification. However, if this can be done in CheckTx and the results can be cached, maybe it's irrelevant.
  3. don't worry about deterministic encoding because tendermint is going to store the raw message bytes anyway and we can just use those for signing. This has the least performance hit and least effort, but least safety for clients.
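As a rough illustration of the canonical JSON idea in (2), here is a minimal Go sketch - an assumption about how it could look, not the SDK's actual signing code. It relies on the fact that Go's encoding/json sorts object keys and emits compact output when re-marshaling a decoded value:

package canonical

import (
	"crypto/ed25519"
	"encoding/json"
)

// CanonicalizeJSON is a minimal sketch of canonical JSON: round-tripping
// through Go's encoding/json sorts object keys and strips insignificant
// whitespace. A real spec would also have to pin down number and string
// escaping rules, which this sketch does not.
func CanonicalizeJSON(bz []byte) ([]byte, error) {
	var v interface{}
	if err := json.Unmarshal(bz, &v); err != nil {
		return nil, err
	}
	return json.Marshal(v)
}

// SignCanonicalJSON signs the canonicalized form, so two clients that emit
// differently formatted but equivalent JSON produce the same sign bytes.
func SignCanonicalJSON(priv ed25519.PrivateKey, txJSON []byte) ([]byte, error) {
	canonical, err := CanonicalizeJSON(txJSON)
	if err != nil {
		return nil, err
	}
	return ed25519.Sign(priv, canonical), nil
}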

I would love it if someone else could comment on the relative pros and cons of the above approaches.

If a canonical encoding is really important, I have been arguing that we should use Cap'n Proto where this is defined in the spec. The downsides for doing this now are a) less mature client support in general - some implementations are active, others less so, and b) we would likely need to fork and improve the go implementation - no json support, getters are hard to use. There are however other benefits like zero copy which I think are not completely negated by using a packed encoding. Anyway, if there were really strong argument for going this approach I could be convinced that the benefits outweigh the effort.

For the time-being, my current thinking is that we can make protobuf work with one of the above approaches, but definitely welcome other input.

@alexanderbez
Contributor Author

alexanderbez commented Dec 26, 2019

We don't and will not use maps, so that's not even a concern. Essentially, I second what @aaronc stated.

@dshulyak
Contributor

I was mainly worried that Cosmos module (non-core) developers would be able to accidentally use encoding features without fully understanding what that can lead to.

@aaronc
Member

aaronc commented Jan 2, 2020

So I actually think even maps wouldn't be an issue if we figure out the correct signature scheme. Sorting map keys canonically would address this. We are currently leaning toward approach (2), FYI.
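For illustration only, a minimal sketch of what "sorting map keys canonically" means when serializing a map by hand (the key=value format here is made up, not any real wire format):

package determinism

import (
	"fmt"
	"sort"
	"strings"
)

// encodeMapDeterministic walks the map in sorted key order, so the same map
// always produces the same output regardless of Go's randomized iteration.
func encodeMapDeterministic(m map[string]string) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s=%s;", k, m[k])
	}
	return b.String()
}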

@ebuchman
Member

ebuchman commented Jan 6, 2020

Cool stuff!

Why do you say making go-amino proto compatible doesn't solve any problems? It doesn't solve performance, but doesn't it solve cross-client/language support (everyone else can use protobuf libs)?

My main concern with a change like this is the impact on the codebase, in terms of large PRs and API changes. There also does seem to be desire to consider experiments with other encoding formats, and so I would hope this work will proceed in a direction that could facilitate that (so if someone later wanted to try cap n' proto, it would be easier than it is today). I think this means preserving something like a Codec, but making it an interface (it probably always should have been one).

In general, gogo-proto seems like a fine direction. It's what we use in Tendermint for the ABCI and many popular projects use. But in Tendermint we don't use the proto generated types for business logic directly, only for communicating over the ABCI wire. We translate back and forth between the generated types and our Go types. While gogo-proto does have support for custom types, which in theory solve this, I recall this being a source of headaches when I experimented with it. I've found the proto generated types much less ergonomic, and I would consider depending on the output of their generators for your every day types to be a significant risk. So I would be a bit wary, and would probably recommend having distinct types for serialization and business logic and shuffling back and forth as necessary. Granted, shuffling back and forth between proto and native types can be a source of errors, but it can mitigate the dependency risk, and it can be very helpful for enforcing boundaries in the code and preserving APIs into the future in the case there may be changes, so it might be a net good thing.

As for determinism, where can I learn about the gogo-proto deterministic mode? In theory the SDK was always supposed to come with some level of static analysis tooling, since you can already shoot yourself in the foot by doing things like using maps, iteration over which is randomized by Go. Users shouldn't use maps and we should some day have tooling to check for it, both in proto types and in Go types.

As for unknown fields, I would expect this to be covered by a strict protocol versioning policy, such that where an unknown field would cause a problem (ie. it's expected to be in a hash), there would be a protocol version bump, so peers would already know they're incompatible.

In any case, championing deterministic protobuf for use in blockchains and pushing upstream to gogo-proto seem like valiant causes.

As for CanonicalJSON, seems right, and I think we basically already do that. But note for instance in Tendermint we don't and won't use JSON for validator signing because the performance impact is too large, there's no humans in the loop, and it's much more painful to verify in the EVM.

@liamsi
Contributor

liamsi commented Jan 6, 2020

As for determinism, where can I learn about the gogo-proto deterministic mode?

AFAIR, gogo-proto's deterministic mode only deals with maps. Seems like nowadays this is achievable via the marshalto plugin and the stable_marshaler option:
https://godoc.org/github.com/gogo/protobuf/plugin/marshalto

@alexanderbez
Contributor Author

alexanderbez commented Jan 6, 2020

Why do you say making go-amino proto compatible doesn't solve any problems? It doesn't solve performance, but doesn't it solve cross-client/language support (everyone else can use protobuf libs)?

Is this under the context of necessary changes being made to amino like @liamsi's PR? Do we not see performance as a primary motivating factor? Overall, I'm just simply not optimistic about major features like this being upstreamed to amino any time soon. Not to mention the lack of domain knowledge and overall set of eyes on it.

My main concern with a change like this is the impact on the codebase, in terms of large PRs and API changes.

Mine as well -- it's a very big risk. But all technical decisions come with tradeoffs. The question is, is this decision worth those tradeoffs?

I think this means preserving something like a Codec, but making it an interface (it probably always should have been one).

What would this codec interface look like? Would it be unique per module's needs?

... I recall this being a source of headaches when I experimented with it. I've found the proto generated types much less ergonomic, and I would consider depending on the output of their generators for your every day types to be a significant risk.

Can you shed some light on why this is a big risk and why you recommend doing this? What issues did you come across that would warrant this?

As for determinism, protobuf provides us this already, no? The only thing that needs to be guaranteed deterministic outside of this context, is the tx bytes over which you sign and we're using canonical JSON for this.

@ebuchman
Member

ebuchman commented Jan 6, 2020

Is this under the context of necessary changes being made to amino like @liamsi's PR?

Yes

Do we not see performance as a primary motivating factor?

Sure.

Overall, I'm just simply not optimistic about major features like this being upstreamed to amino any time soon. Not to mention the lack of domain knowledge and overall set of eyes on it.

Fair enough!

The question is, is this decision worth those tradeoffs?

I think the change can be done iteratively and that doing so will probably be worth it.

What would this codec interface look like? Would it be unique per module's needs?

I expect it would just have Marshal/Unmarshal methods, maybe one for Binary and one for JSON.
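For instance, a hypothetical sketch of such an interface (the names are illustrative, not an actual SDK type):

// Codec is a hypothetical marshaling interface of the kind discussed here;
// concrete implementations could be backed by amino, gogo-proto, or
// something else entirely.
type Codec interface {
	MarshalBinary(o interface{}) ([]byte, error)
	UnmarshalBinary(bz []byte, ptr interface{}) error

	MarshalJSON(o interface{}) ([]byte, error)
	UnmarshalJSON(bz []byte, ptr interface{}) error
}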

Can you shed some light on why this is a big risk and why you recommend doing this? What issues did you come across that would warrant this?

I don't remember the exact issues now. But I do remember it being very painful. Looking at the generated ABCI types, they're riddled with garbage fields, for starters. Maybe we're doing something wrong, but that's kind of the point, you become very dependent on the particulars of gogo-proto to get things working the way you want and that's an awfully tight coupling that is almost guaranteed to cause problems, especially for a large project that is itself a dependency of others. Perhaps this can be mitigated somewhat by looking at how other similar projects are using/doing things, but I think that investigation needs to happen first to derisk the effort.

And regardless, I think having separate types for use in the code and use in serialization is still a net positive and good code hygiene. It also may enable these changes to proceed more incrementally.

As for determinism, protobuf provides us this already, no? The only thing that needs to be guaranteed deterministic outside of this context, is the tx bytes over which you sign and we're using canonical JSON for this.

So AFAIK there's three sources of serialization in the SDK:

  • serializing a transaction for the blockchain
  • serializing a transaction for signing (JSON)
  • serializing state to be stored in the merkle tree

The first one is most likely to be handled by code in many languages, and so any differences in proto impls may cause certain clients to fail. Just flagging it, but as far as I understood it should work properly ...


In sum, my main recommendation is that work be done to ensure this change can proceed incrementally as much as possible. This will have the effect of improving the structure of the SDK codebase, it will make the work more manageable, and it should keep us flexible to experiment with other encoding schemes down the road.

@aaronc
Member

aaronc commented Jan 6, 2020

Thanks @ebuchman for chiming in here!

I want to say that I suspect the issue with gogo proto generated structs is mostly related to configuration. By default there are all these XXX_ fields but there's a flag to shut those off and we're using it. Also with things like customtype and casttype, there's a fair amount of control over what's generated. Not perfect, but in general I'm quite happy with the generated structs in most cases - in many cases almost identical to what was there before. For cases where some conversion is needed, it's usually possible to either add conversion getters to the struct and/or create helper functions.

I agree it's possible to create an interface that looks sort of like amino and that will actually make the refactoring quicker with less code change. I will give it a try in #5478.

I do think, however, that it's unrealistic to think that we can easily swap in or out some other encoding scheme. I spent some time exploring Cap'n proto in #5438 and will maybe comment more on it in a separate post. Basically there's going to be some overhead to whatever encoding scheme you choose and it's good to choose wisely. Maybe there are some levels where things are hot-swappable - i.e. we can support json, yaml, xml, etc. representations on some level. But what we are discussing here is a high performance binary encoding and that comes at a relatively high cost.

This reminds me a bit of the phase of ORM's when people wanted to do everything they could to be compatible with all the SQL engines, only to later find out that vendor lock-in was 1) almost impossible to avoid and 2) not always a bad thing because embracing some of the vendor's custom stuff sometimes brought a lot of value. To me actually getting into some of these encoding decisions in practice feels more akin to marriage - you want to go in with eyes wide open and embrace what you're getting. There are subtle things at many levels of the system (how oneof's behave, determinism, etc.). Maybe it is worth discussing whether cap'n proto really would be the right solution, or something else. But having worked with these encodings for the past couple weeks, I think we're best off making the choice consciously and investing in the ecosystem of whatever we choose.

With gogo proto, there is the issue of generating switch statements to convert oneofs to interfaces. In weave they generated all those by hand initially and then someone eventually wrote a helper to use reflection. I spent an afternoon last week and found out gogo proto has a nice plugin system and now we have this: https://github.com/regen-network/cosmos-proto which auto-generates all those switch statements likely saving lots of developer time and reducing potential error surface. Gogo proto is not immutable, it can be improved. If we use cap'n proto we will need to invest even more in improving it, and sticking with amino would likely be an even bigger investment, but that's a separate discussion...
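For context, the switch statements in question look roughly like the hand-written sketch below; the message and wrapper type names are made up for illustration, and the cosmos-proto plugin generates the equivalent automatically:

package codec

// Msg is the Go interface that concrete message types implement.
type Msg interface{ Route() string }

// MsgSend and MsgVote are illustrative concrete messages.
type MsgSend struct{}
type MsgVote struct{}

func (MsgSend) Route() string { return "bank" }
func (MsgVote) Route() string { return "gov" }

// Message mimics the struct gogo-proto emits for a message whose only field
// is a oneof; Sum holds exactly one of the wrapper types below.
type Message struct {
	Sum isMessage_Sum
}

type isMessage_Sum interface{ isMessage_Sum() }

type Message_MsgSend struct{ MsgSend *MsgSend }
type Message_MsgVote struct{ MsgVote *MsgVote }

func (*Message_MsgSend) isMessage_Sum() {}
func (*Message_MsgVote) isMessage_Sum() {}

// GetMsg is the kind of oneof-to-interface switch that previously had to be
// written by hand for every message type.
func (m *Message) GetMsg() Msg {
	switch sum := m.Sum.(type) {
	case *Message_MsgSend:
		return sum.MsgSend
	case *Message_MsgVote:
		return sum.MsgVote
	default:
		return nil
	}
}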

@aaronc
Member

aaronc commented Jan 6, 2020

Based on other discussions that are happening I want to share some additional thoughts that may help make this an incremental transition.

We can separate where encoding is needed into client-facing and store-facing. These can actually be different things and evolve differently.

Client Encoding

Currently Cosmos supports amino binary and json encoding and ripping these out would break all existing client integrations. With careful design we can likely maintain support for a client-facing amino API and add support for a client-facing protobuf SDK. These can likely live side by side. Maybe the existing amino API eventually becomes legacy but is still useful for the time-being for existing wallets, block explorers, etc. Even if these tools were mostly using the amino json (for both tx's and queries) and not binary format, changing the json still does break things.

So how about we add a client-facing protobuf API without breaking the amino API? We could also add a Cap'n Proto API. There are ways to make these things live side-by-side (for instance we can prefix protobuf tx's with some magic bytes).

We could even support different signature formats with flags. For instance, we could have flags to support all 3 of the protobuf signing formats I outlined above - deterministic binary, canonical JSON, and raw bytes.
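As one possible shape for the magic-bytes idea mentioned above (purely an assumption for illustration; the prefix value and decoder plumbing are not the SDK's actual design):

package txdecode

import (
	"bytes"
	"errors"
)

// protoTxPrefix is a hypothetical marker; a real value would be chosen so it
// cannot collide with valid amino-encoded transactions.
var protoTxPrefix = []byte{0xAA, 0x01}

// DecodeTx routes raw transaction bytes to the protobuf decoder when the
// magic prefix is present and falls back to the legacy amino decoder
// otherwise. The two decoder funcs stand in for real implementations.
func DecodeTx(
	bz []byte,
	decodeProto func([]byte) (interface{}, error),
	decodeAmino func([]byte) (interface{}, error),
) (interface{}, error) {
	if len(bz) == 0 {
		return nil, errors.New("empty tx bytes")
	}
	if bytes.HasPrefix(bz, protoTxPrefix) {
		return decodeProto(bz[len(protoTxPrefix):])
	}
	return decodeAmino(bz)
}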

Store Encoding

For a given module, only a single store encoding scheme can be supported, but different modules within an app can actually use different encoding schemes. Some could use protobuf, others could use cap'n proto for example.

I think protobuf is actually a pretty good choice for this level because it's relatively easy to ensure all implementations encode deterministically and protobuf provides tools for maintaining backwards compatibility. I want to re-emphasize that this is also a pretty important thing because changing the block structure in Tendermint makes it hard to upgrade smoothly (tendermint/tendermint#4078). Having proto files would allow us to flag deprecated fields as reserved so that changes to block structure don't just break things.


Just a footnote here... let's all keep in mind this is all dependent on how things go in practice. Certain things are nice in theory and don't play out so well in practice for various reasons. @alexanderbez and I have been trying things in practice and are learning about the gotchas. As I noted in my previous comment - some of the consequences of amino or proto in different places are larger than one might expect and we'll only know once we dig into the code... I would propose we give this incremental approach I'm presenting a try and see how it actually works in practice...

@ebuchman
Member

ebuchman commented Jan 7, 2020

Thanks Aaron, that's helpful.

We could even support different signature formats with flags. For instance, we could have flags to support all 3 of the protobuf signing formats I outlined above - deterministic binary, canonical JSON, and raw bytes.

We probably shouldn't do this, seems like unnecessary complexity, especially if it means we need to support all these on the hardware signers! Probably should stick with the JSON-based signing like we're doing now. The cosmos ledger app is awesome (thanks Juan! https://twitter.com/JTremback/status/1148756902370664448)

@wangkui0508

Hello, Everyone.

We (the CoinEx Chain Team) happen to be developing a serialization library named codon (https://github.com/coinexchain/codon) to replace go-amino. It is proto3-compatible and fast (it uses code generation instead of reflection). Besides, it keeps (almost) the same API as go-amino, so the changes required in the Cosmos-SDK are minimized.

If you want to integrate it into Cosmos-SDK, we'd be happy to help.

@alexanderbez
Contributor Author

alexanderbez commented Jan 9, 2020

Hey @wangkui0508, pretty neat! I took a look at the code and I don't really see any specification or anything like that. I know this is a direction @jaekwon wanted to take, so I would definitely recommend he take a look at it and work together with you.

Can it generate proto messages/schemas? Note performance is not the only reason why we're looking to migrate. We need rich multi-language and cross-client support.

That being said, I think going with protobuf encoding is still the best strategy forward atm.

@wangkui0508

I am writing documentation for it. Please wait another 12 hours. In the meantime, maybe you are interested in how it can be integrated into the Cosmos-SDK. Here is a branch using it: https://github.com/coinexchain/cosmos-sdk/tree/use_codon

It can generate proto schemas. But there are some bugs. Maybe this weekend I can fix these bugs.

@alexanderbez
Contributor Author

It can generate proto schemas. But there are some bugs. Maybe this weekend I can fix these bugs.

I really don't feel comfortable with this approach as I'd rather use something more battle-tested and with richer client support. Happy to have others' opinions on this though 👍

@wangkui0508

I cannot fully understand why you are uncomfortable. codon uses proto3 encoding, and it generates .proto files so you can use any battle-tested protobuf implementation with richer client support, in any programming language. Here is an example .proto file generated by codon:
https://github.com/coinexchain/cosmos-sdk/blob/use_codon/codongen/types.proto

If you do not like the codec source code generated by codon, you can use gogo-proto to generate it from this types.proto file. At the very least, codon avoids manually writing .proto files.

I am still working on the documentation. Here is a rough introduction: https://github.com/coinexchain/codon/blob/master/README.md

@alexanderbez
Contributor Author

@wangkui0508 codon has merit and the work done on it is great! I just have a hard time visualizing what it really buys us. We're using gogo-proto to write our proto files, and we're doing so intentionally to utilize the rich set of features it provides us, so unless codon can hook into gogo-proto somehow, what does it give us atm?

I would also argue that we should take great care in writing the proto messages manually and not have the process automated as that may increase the surface area of bugs as developers are lazy.

@wangkui0508

Well it is a matter of different trade-offs.

It is OK to manually manage the proto messages and use the .pb.go files generated by gogo-proto when writing modules. If go-amino had never been created and the Cosmos-SDK had used gogo-proto for serialization from the beginning, there would be no argument now, because this method CAN WORK.

But go-amino was created and the Cosmos-SDK used it in many places. Later projects based on the Cosmos-SDK, such as CoinEx Chain, also used go-amino. Go-amino does have advantages: simplicity and ease of use. You just register plain Go structs and interfaces and do not care much about serialization, because the codec takes care of it.

When there is a lot of code based on go-amino's API, migration will be a huge pain. In CoinEx Chain we developed a lot of modules. We would have to rewrite a lot of files to migrate from go-amino to gogo-proto, and without go-amino's simplicity, we would have to write more lines of code than before.

So, when we became aware of the performance and compatibility problems of go-amino, we decided to enhance it to solve those problems instead of abandoning it. That way we can avoid the effort of rewriting code and continue to enjoy the simplicity of go-amino.

Of course your team has the right to decide what the best path is for the Cosmos-SDK. But since the Cosmos-SDK wants to be the cornerstone for many different blockchains, I think other blockchains' opinions are also important. We DO want to keep using our existing modules. So if the Cosmos-SDK can remain backward-compatible and not introduce breaking changes, we would be very thankful.

And in our experience, the rich set of features of gogo-proto is not needed for a blockchain. Many of those features are Go-specific and harmful to cross-language interoperation.

In a word, we think the Cosmos-SDK is a great platform and good enough now. Thank you very much for developing such a good SDK. Please make it stable (to a reasonable degree) and do not make breaking changes.

@aaronc
Member

aaronc commented Jan 10, 2020

Thanks for sharing @wangkui0508. First of all codon looks really interesting and I applaud what you guys have come up with. I share your concerns being a zone author building on top of the Cosmos SDK as well.

I first want to address the backwards compatibility concerns. The approach we are currently taking would preserve an interface that is compatible with amino and would preserve essentially the same structs that currently exist in the SDK. These structs will, however, be enhanced with protobuf marshal methods via gogo protobuf. We have been working towards a path that allows this smoothly. With this approach, the hope is that amino or protobuf encoding could be chosen for each module. So I believe this would mean that you could still use codon even with the gogo protobuf code added. We have also been discussing maintaining the existing APIs that use amino JSON format for backwards compatibility.

Would these address your immediate concerns?

My hope long term is that an approach with .proto files and codon could co-exist in the same code base as they are effectively different approaches to the same encoding. There may be some details to work out there, however.

Now regarding the difference in approach between using an amino compatible interface vs codon files, I think there are a few trade-offs. I do think your approach of generating protobuf marshal methods for amino types is very interesting and solves a lot of problems. Here are what I see as the advantages of a hand-written .proto file approach:

  • in order to share the same .proto files between chains and create client modules which are chain-independent, there needs to be an extensible solution for translating Go interfaces to protobuf oneofs. Our current approach is to manually allow chains to override these oneofs rather than automatically generating a oneof prefix. This does have a bit of overhead and will require some reworking of the code, but it does allow .proto files to reference each other, which I think wouldn't work with codon's approach, although maybe a solution is possible.
  • starting from .proto files does allow us to think about cross-platform encoding questions more directly. Yes, we are using gogo proto enhancements, but we do have a method for stripping these extensions for code generators in other languages. The gogo proto enhancements, however, let us write our protobuf files in a cross-platform friendly way first while also generating canonical Go structures. I actually think the codon proto file looks pretty close, but it could also use some tweaks.

Anyway, hopefully this helps address some of your concerns and also explains some of our thinking.

@wangkui0508

Thank you very much for the explanation @aaronc

It would be smooth to allow amino or protobuf encoding to be chosen for each module. I do hope the interface of codec.Codec will not be changed. So many functions depend on it, such as GetTxCmd(cdc *codec.Codec) *cobra.Command and GetQueryCmd(cdc *codec.Codec) *cobra.Command.

I agree with you that codon and gogo-proto are effectively different approaches to the same encoding (proto3). So it is natural to allow module authors to choose the solution that suits them: simplicity or cross-chain awareness, different trade-offs.

When the scheme allowing .proto files to reference each other is finalized, we will try to tweak codon to make it compatible.

@alexanderbez
Contributor Author

Thanks @aaronc and @wangkui0508 for sharing perspective here. We certainly are keeping backward compatibility and current applications built on the SDK in mind! That being said, don't worry @wangkui0508, the amino interface will not change. In fact, we are going for a more flexible solution in that modules and client logic will work by using an interface instead of a *amino.Codec directly, but as amino already implements the expected interface, it will not be a concern for you 👍.

@fedekunze
Collaborator

added ibc and capability modules for reference

@clevinson clevinson added this to the v0.39 milestone Apr 30, 2020
@qizhou

qizhou commented Jun 26, 2020

Are there any experiments on

  • the benchmark comparisons between gogo-proto and amino? I can find one here (Investigate amino performance/mem usage tendermint/go-amino#254), but I am not sure it is the official or final one;
  • the cost percentage of amino encoding/decoding in a heavily-loaded Cosmos network, i.e., load testing a 4- or 10-node gaia-like testnet?

To my knowledge, in other blockchains such as Eth the major bottleneck is disk IO rather than the CPU cost of encoding/decoding. My concern is that without experiments and concrete profiling numbers, the gain of using gogo-proto in practice is not clear.
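For what it's worth, a hypothetical micro-benchmark shape for the first bullet, comparing encoders side by side; the payload and the json/gob encoders below are stand-ins for the real amino and gogo-proto codecs, not actual SDK calls:

package bench

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"testing"
)

// account is a stand-in payload; a real comparison would marshal actual SDK
// types with amino and with gogo-proto instead of json/gob.
type account struct {
	Address string
	Balance uint64
}

// benchmarkMarshal times one marshal implementation and reports allocations.
func benchmarkMarshal(b *testing.B, marshal func() ([]byte, error)) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		if _, err := marshal(); err != nil {
			b.Fatal(err)
		}
	}
}

func BenchmarkJSONMarshal(b *testing.B) {
	acc := account{Address: "addr1", Balance: 42}
	benchmarkMarshal(b, func() ([]byte, error) { return json.Marshal(acc) })
}

func BenchmarkGobMarshal(b *testing.B) {
	acc := account{Address: "addr1", Balance: 42}
	benchmarkMarshal(b, func() ([]byte, error) {
		var buf bytes.Buffer
		err := gob.NewEncoder(&buf).Encode(acc)
		return buf.Bytes(), err
	})
}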

@alexanderbez
Contributor Author

@qizhou you can take a look at #4977 for some reference. In the context of the SDK, the bottleneck was Amino, and then DB IO. Amino is an extreme performance hit; we've seen 100x improvements in simulation speeds by switching to proto. After the migration to proto, yes, then the bottleneck becomes network and DB IO.

@qizhou

qizhou commented Jun 29, 2020

@qizhou you can take a look at #4977 for some reference. In the context of the SDK, the bottleneck was Amino, and then DB IO. Amino is an extreme performance hit; we've seen 100x improvements in simulation speeds by switching to proto. After the migration to proto, yes, then the bottleneck becomes network and DB IO.

Thanks for the info. I took a look at the benchmark. First of all, this is really good work and it helps to identify the real problem by replaying mainnet blocks. However, my main concern is that replaying the current mainnet blocks may not reflect the actual bottleneck when the network is heavily used, as ETH is - the current mainnet blocks contain very few transactions each, and thus I believe the major cost is from BeginBlock/EndBlock/CommitBlock rather than DeliverTx.

So for the optimizations, my questions are:

  1. What is the performance limit of a gaia-like network using Cosmos SDK? The performance can be measured in TPS, and the test transaction can be a balance transfer from a random user to another random user for simplicity;
  2. Under such a workload, what are the profiling numbers, similar to those demonstrated in x/slashing performance #4977? And which part is the most expensive and should be optimized?
  3. After the optimization, what is the new performance number?
  4. Repeat 1-3

Actually, we want to start some experiments on 1, but we are not sure if the Cosmos team has already done that or whether there are any results/tools that we could re-use (like the one in #4977). Thanks.

@alexanderbez
Contributor Author

Great questions, but unfortunately, I'm not aware of any true throughput measurements against the Gaia state machine. I would personally love to see this. I can, however, state that regardless of the ABCI method (e.g. DeliverTx vs BeginBlock), the bottleneck on the state-machine side will be Amino (e.g. getting all validators and decoding them). The low-hanging fruit on the state-machine side is to really improve encoding performance and then to tweak various design aspects of modules (e.g. x/slashing).
