Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocate builtin instructions budget with its actual cost #170

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
177 changes: 177 additions & 0 deletions proposals/simd-0170-builtin-instruction-cost-and-budget.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,177 @@
---
simd: '0170'
title: Allocate precise builtin instructions budget
authors:
- Tao Zhu (Anza)
category: Standard
type: Core
status: Draft
created: 2024-08-26
feature: https://github.com/anza-xyz/agave/issues/2562
supersedes:
superseded-by:
extends:
---

## Summary

Builtin instructions always consume a statically defined amount of compute
units (CUs). Therefore, they should allocate the exact same amount of compute
budget and count the same amount toward the block limit during the banking
stage.

## Motivation

Builtin instructions in the SVM consume their static DEFAULT_COMPUTE_UNITS from
the compute budget during execution. These DEFAULT_COMPUTE_UNITS are also
counted against block limits during the banking stage. However, historically,
builtin instructions have been allocated DEFAULT_INSTRUCTION_COMPUTE_UNIT_LIMIT
units of compute budget. This discrepancy between the allowed consumption and
the actual usage tracked for block limits has led to several issues that need
to be addressed.

Allocating DEFAULT_INSTRUCTION_COMPUTE_UNIT_LIMIT instead of the actual
DEFAULT_COMPUTE_UNITS for builtin instructions distorts the tracking of
transaction compute budgets. This can result in more expensive instructions
being executed when they should fail due to exceeding the budget. Consequently,
the cost tracker may need to account for additional compute units toward block
limits after transaction execution, potentially producing blocks that exceed
those limits.

Furthermore, maintaining consistency in transaction costs between the banking
stage and SVM would simplify the code logic and make reasoning more
straightforward.

## Alternatives Considered
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to take this approach of using static costs per builtin program, and alternative is to just fix the implementation of CU usage in execution.

If our cost-model says that builtin program Z always uses X CUs, then that should be what is actually used by the execution, regadless of what it does internally, including CPI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to take this approach of using static costs per builtin program, and alternative is to just fix the implementation of CU usage in execution.

Acknowledging that several alternatives exist, this proposal focuses on the specific issue: "If the VM must consume builtin.default_compute_units, then it should allocate exactly that amount for the builtin, rather than a fixed 200,000 units." This approach addresses block overpacking in the short term, while longer-term solutions are still being explored.

If our cost-model says that builtin program Z always uses X CUs,

To clarify, it's the cost model being told that a builtin program Z always uses X CUs.

then that should be what is actually used by the execution, regadless of what it does internally, including CPI.

Yea, that'd be ideal. I am not 100% how and when this will happen tho. Happy to discuss more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(agave implementation specific)
I believe we'd need to modify the declare_process_instruction macro so that we can somehow have a call that doesn't consume CUs and one that does - either through some flag or separate call.
Calls from inside our builtins would then need to call the correct one that does not consume.
Would also need to make sure that actual user-code cannot call into the non-consuming version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apfitzge you can just provide zero and it becomes a no-op.
https://github.com/anza-xyz/agave/blob/a72f981370c3f566fc1becf024f3178da041547a/program-runtime/src/invoke_context.rs#L71-L76

A good standard would be to follow the ZK proof program's instructions (shared previously by @tao-stones). We would just need to sort out how to represent the constants to get some kind of builtin-program dictionary of CUs per instruction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then I think then it'd always consume 0, since it is a macro defining a function, not a function itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then I think then it'd always consume 0, since it is a macro defining a function, not a function itself.

Whoops, sorry, I don't think I was specific enough to describe what I meant. Yeah, the generated code consumes zero if you do that, but the salient bit from the ZK program I was intending to share was where each instruction does a manual consumption.

https://github.com/anza-xyz/agave/blob/e207c6e0eaf8e1657fbfaff07da05ca6a7928349/programs/zk-token-proof/src/lib.rs#L205-L208

Globally, default CU consumption is zero, but each instruction consumes its own individual value.


- One possible alternative approach would be to maintain the current allocation
of compute budget for builtin instructions but add logic to the cost tracker
to account for the discrepancy during tracking. However, this would add
complexity and could introduce additional corner cases, potentially leading to
more issues.

- Another alternative would be to treat builtin instructions the same as other
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's already plans underway to move the builtins to bpf.

at that point it'd effectively be the same as this alternative since builtins would be treated the same as other user programs - (?) possible exception for compute-budget which is a configuration not an instruction (imo).

Thus we will need some way to address the over-reserving described here.
There have been discussions on ramping down the default ix CUs over time, what is the issue with such an approach now if it is eventually the path we have to take?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if moving builtins to bpf solves the particular issue (mentioned above) this proposal aims in immediate term, cause still need to allocate "proper" budget for builtin, or any instruction in that scenario.

But lower default cu/ix will help - lowered default per ix will ease over reserving. What would the new "default" be is arbitrary. In extreme case, if it is lowered to 0, we'd be in the ideal world that every transaction will have to set (in other words, configure) its cu limit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, but I think there's simplicity in saying "all programs are treated the same" vs "builtins are treated differently AND these specific builtin instructions are treated even different from that"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. to be at "all programs are treated the same" in terms of allocating budget for VM, it needs to shift to 100% request-based mode. More tools need to be built to help devs to request cu limits easily and accurately, also lower the default cu/ix when < 100% transaction requests cu limit. This proposal serves stopgap before all that land.

instructions by allocating DEFAULT_INSTRUCTION_COMPUTE_UNIT_LIMIT units for them
as well. However, this approach has concerns. DEFAULT_INSTRUCTION_COMPUTE_UNIT_LIMIT
is a very conservative estimate, currently set at 200,000 CUs per instruction.
Estimating all builtin instructions, including votes and transfers, would cause
the banking stage to significantly over-reserve block space during block
production, potentially leading to under-packed blocks. Additionally, if it's
known that builtin instructions will consume a fixed amount of CUs, it doesn't
make sense to estimate them with a generic DEFAULT_INSTRUCTION_COMPUTE_UNIT_LIMIT.

## New Terminology

None

## Detailed Design

The following changes are proposed to ensure consistent CU allocation and
consumption for builtin instructions: transactions containing only builtin
instructions should not need to explicitly request compute-unit-limits, and
these transactions should execute without any cost adjustment. If a transaction
explicitly sets a compute-unit-limit, the requested limit will be applied
accordingly.

1. Allocate Compute Budget per Builtin Instruction:

When set_compute_unit_limit is not explicitly requested, the compute budget
should always allocate the maximum number of compute units (MAX_COMPUTE_UNIT)
as declared by each individual instruction, including compute-budget
instructions.

Currently, when no set_compute_unit_limit is used, all instructions are
allocated a DEFAULT_INSTRUCTION_COMPUTE_UNIT_LIMIT, which is capped by the
transaction’s MAX_COMPUTE_UNIT_LIMIT. However, historically, no CUs have been
allocated for compute-budget instructions, which can lead to over-packed
blocks in certain cases.

Example Scenarios:

[Example 1](https://github.com/anza-xyz/agave/pull/2746/files#diff-c6c8658338536afbf59d65e9f66b71460e7403119ca76e51dc9125e1719f4f52R13403-R13429):
A transaction contains both a Transfer instruction and an expensive non-
builtin instruction. If the non-builtin instruction requires more compute
units than the DEFAULT_INSTRUCTION_COMPUTE_UNIT_LIMIT, over-budgeting for the
Transfer instruction allows the expensive instruction to complete execution,
forcing an upward adjustment of the compute units used.

[Example 2](https://github.com/anza-xyz/agave/pull/2746/files#diff-c6c8658338536afbf59d65e9f66b71460e7403119ca76e51dc9125e1719f4f52R13431-R13451):
A builtin instruction that calls (CPI) other instructions may not reserve
enough CUs. Upon successful execution, this under-reservation forces an
upward adjustment.

### Detailed Changes:

1.1 Changes to Builtin Programs:

- Each builtin program will now expose the DEFAULT_COMPUTE_UNIT for each of
its instructions (similar to how ZK programs do it).
- Each instruction will also expose its MAX_COMPUTE_UNIT, which represents
the worst-case scenario. This MAX_COMPUTE_UNIT accounts for any variation,
such as the instruction calling (CPI-ing) other instructions based on input
data or account states. It must be equal to or greater than the
DEFAULT_COMPUTE_UNIT.
- This makes builtin programs more transparent about their compute usage.
During execution, the runtime will use the DEFAULT_COMPUTE_UNIT to track
actual compute usage, while MAX_COMPUTE_UNIT will be used to allocate the
compute budget.

1.2 Changes to Builtin-Default-Costs Crate:

Instead of the current dictionary of [builtin program, DEFAULT_COMPUTE_UNIT],
a new dictionary will be created with [instruction, MAX_COMPUTE_UNIT].
This will allow the system to accurately calculate compute allocation for
each instruction, factoring in its potential CPIs.

1.3 Call-Site Implementation:

- Instruction Type Lookup: At the call-site (e.g., in the compute budget or
cost model), the type of builtin instruction will be determined.
- If the instruction type cannot be identified, a small number of CUs will
be allocated to account for basic work done on the transaction, such as
deserialization or instruction type lookup.
- If the instruction type is determined, the system will allocate the
MAX_COMPUTE_UNIT for that instruction.
- The transaction’s program_id_index will only be checked once during this
process, and the result will be cached to prevent redundant lookups for
efficiency.


2. Respect Explicit Compute Unit Requests During Block Production:

When a compute-unit-limit is explicitly requested, always use it to reserve
block space during block production, even if there are no user-space instructions.

[Example 3](https://github.com/anza-xyz/agave/pull/2746/files#diff-c6c8658338536afbf59d65e9f66b71460e7403119ca76e51dc9125e1719f4f52R13457-R13484)
The cost model ignores explicitly requested CUs for transactions has all
buitin instructions, resulting in an upward adjustment instead of the usual
downward adjustment.

3. Fail Transactions with Invalid Compute Unit Limits Early:

If set_compute_unit_limit sets an invalid value, the transaction should fail
before being sent for execution.
"invalid value" is defined as `> MAX_COMPUTE_UNIT_LIMIT || < Sum(builtin_instructions)`

[Example 4](https://github.com/anza-xyz/agave/pull/2746/files#diff-c6c8658338536afbf59d65e9f66b71460e7403119ca76e51dc9125e1719f4f52R13344-R13373):
If the explicitly requested CU limit is invalid, the transaction should
fail during sanitization, saving it from being sent to the SVM for execution.


*Note*: Users are encouraged to explicitly request a reasonable amount of
compute-unit-limits. Requesting more than needed not only increases the
prioritization fee the user pays but also lowers the transaction's priority.

## Impact

Users who previously relied on including builtin instructions, instead of
explicitly setting compute-unit limits, to allocate budget for their
transactions may experience an increase in transaction failures. To avoid this,
users are encouraged to use set_compute_unit_limit to explicitly request the
necessary budget for their transactions.

## Security Considerations

Both Agave and FD clients should implement this proposal to avoid breaking
consensus.

Loading