eval: increase opcode budget with fee credit #5943

Draft
wants to merge 7 commits into master

Conversation

@joe-p (Contributor) commented Feb 24, 2024

Summary

Right now using inner txns is a very common pattern to increase opcode budget within an app call. This works, but it is a clunky experience for a developer trying to figure out where the opups (or opup checks) should go as they iterate. It also leads to extra bytes in programs and extraneous inner transactions in blocks (and potentially created apps, if they are not properly deleted).

This PR is a proposal to add a new feature that increases the opcode budget by deducting MinTxnFee from the fee credit whenever extra opcode budget is needed. pooledAllowedInners is deducted accordingly to ensure that a single group doesn't consume more budget than has historically been possible.
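
As a rough illustration of the proposed rule, here is a standalone sketch of the top-up loop. The names and constants are illustrative only, not the actual evaluator code in this PR:

```go
package main

import "fmt"

// Illustrative constants mirroring the current consensus values; the real ones
// come from the protocol parameters in go-algorand.
const (
	minTxnFee         = 1_000 // microAlgos
	maxAppProgramCost = 700   // opcode budget granted per MinTxnFee of credit
)

// groupState is a stand-in for the evaluator's pooled group accounting; the
// field names here are illustrative, not the actual go-algorand structures.
type groupState struct {
	feeCredit           uint64 // fees paid beyond the group's minimum
	pooledAllowedInners int    // remaining inner-txn allowance for the group
	pooledBudget        int    // remaining pooled app opcode budget
}

// topUpBudget converts fee credit into opcode budget until `needed` is covered,
// deducting one inner-txn slot per MinTxnFee so a group cannot consume more
// budget than has historically been possible. Returns false if it cannot cover
// the need.
func (g *groupState) topUpBudget(needed int) bool {
	for g.pooledBudget < needed {
		if g.feeCredit < minTxnFee || g.pooledAllowedInners == 0 {
			return false
		}
		g.feeCredit -= minTxnFee
		g.pooledAllowedInners--
		g.pooledBudget += maxAppProgramCost
	}
	return true
}

func main() {
	g := groupState{feeCredit: 3_000, pooledAllowedInners: 256, pooledBudget: 700}
	ok := g.topUpBudget(2_000)
	fmt.Println(ok, g.pooledBudget, g.feeCredit) // true 2100 1000
}
```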

Test Plan

TestInnerBudgetIncrementFromFee verifies that the budget increase and fee credit deduction work as expected in applications.

  • Test with inner transactions to ensure pooledAllowedInners is working as expected (or maybe just make it public so it can be verified in tests)

TODO

  • Put behind a new AVM version
  • Figure out the best way to reconcile the fact that the semantics of a transaction have changed. Perhaps increment the txn counter?

@joe-p changed the title from "feat: increase opcode budget with fee credit" to "eval: increase opcode budget with fee credit" Feb 24, 2024

codecov bot commented Feb 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 55.72%. Comparing base (787f758) to head (54914c5).
Report is 40 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5943      +/-   ##
==========================================
- Coverage   55.75%   55.72%   -0.04%     
==========================================
  Files         488      488              
  Lines       68101    68115      +14     
==========================================
- Hits        37969    37956      -13     
- Misses      27568    27586      +18     
- Partials     2564     2573       +9     


@pbennett commented:

This will be a VERY welcome change to the AVM. Can't wait!

@robdmoore commented:

Hugely supportive of this. Opup is the one thing in Algorand that feels really inelegant and is a real pain for development.

This approach can easily be combined with simulate to programmatically determine how to appropriately fund the compute needed for a transaction, without the contract developer or contract consumer needing to do (sometimes laborious) experimentation to find the magic number of opups needed for a given call.

@joe-p (Contributor, Author) commented May 3, 2024

I have updated the PR so that this only affects apps. Doing this for logic sigs doesn't make sense since they are executed in parallel.

The main issue blocking this PR from going further is the fact that this fundamentally changes the semantics of what a transaction is (as pointed out to me by @jannotti). I see a few paths forward, each with pros/cons:

Potential Options

A) Send Inner Transactions

For every additional 700 budget needed, we automatically send an inner txn. This is the most similar to current functionality and just adds a nice QoL improvement for developers. The major downside of this is the complexities around existing subtxns. If a group is currently being formed when more budget is needed, we would need to set the existing group aside and form an additional group.

B) Increment TxnCounter

Rather than actually sending a transaction, we could just increment the TxnCounter in the block for every 700 opcode budget we give. This preserves the meaning of transaction in the context of sequential units of work and circumvents the problem of working around transaction groups that are currently being formed. It does, however, change the meaning of transaction in most other contexts (such as explorers, TPS counters, etc.).

C) New Field in Block

Rather than using TxnCounter to track sequential units of work, we could have a new field that is used to measure the work done by a block. This could be done in various ways, most simply by having a counter that increases per txn AND per 700 opcodes given. We could also have a direct measure of opcode budget consumed per block.
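
For concreteness, a minimal standalone sketch of how the block-level bookkeeping might differ between B and C; all names and fields here are hypothetical, not actual go-algorand structures:

```go
package main

import "fmt"

// blockHeader is a stand-in for the real block header; the WorkCounter field
// is hypothetical and would only exist under option C.
type blockHeader struct {
	TxnCounter  uint64 // today: count of transactions ever committed
	WorkCounter uint64 // option C: transactions plus 700-budget grants
}

// Option B: every 700-budget grant ticks the existing counter, so TxnCounter
// keeps meaning "sequential units of work" but stops meaning "transactions".
func grantBudgetB(h *blockHeader) { h.TxnCounter++ }

// Option C: TxnCounter keeps meaning "transactions"; work is tracked separately.
func grantBudgetC(h *blockHeader) { h.WorkCounter++ }

func main() {
	h := blockHeader{TxnCounter: 10, WorkCounter: 10}
	grantBudgetB(&h)
	grantBudgetC(&h)
	fmt.Println(h.TxnCounter, h.WorkCounter) // 11 11
}
```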

My Thoughts

B seems like the quickest way forward, but C would make the most sense long term in my opinion. While B preserves the current semantics of transaction in terms of units of sequential work, I question how many consumers are actually interpreting transactions that way. The colloquial interpretation of transaction would get lost with B, and A seems like unnecessary extra work during evaluation. Of course, there are many decisions to be made if we go the route of C, but I feel like it makes the most logical sense.

@jannotti (Contributor) commented May 3, 2024

The major downside of this is the complexities around existing subtxns.

Option A doesn't sound that bad. It would certainly be annoying if there was the possibility of creating a new weird transaction group. But the txn being created is always a singleton, and need not be configurable in any way. So in the worst case, I think the opcode would just move the existing subtxns aside, set up the opup call, execute it, and then restore subtxns. And it's possibly even simpler than that - leave subtxns alone, but invoke the inner txn machinery on the very simple txn needed for opup directly.

This also desperately needs to be opted into by an app. You don't want this draining the app account on wasted cycles. (Perhaps you intend that the opup txn must never charge the app account?)

@joe-p (Contributor, Author) commented May 3, 2024

And it's possibly even simpler than that - leave subtxns alone, but invoke the inner txn machinery on the very simple txn needed for opup directly.

True, there's no reason we couldn't do this. I suppose it just seems kind of "ugly" to me. For example, it would add a bunch of extra transactions to the simulate response with traces enabled, and on explorers. We could make special exceptions to show these transactions differently, but at that point it would seem simpler to just increment the TxnCounter directly.

This also desperately needs to be opted into by an app. You don't want this draining the app account on wasted cycles. (Perhaps you intend that the opup txn must never charge the app account?)

Yes, it's using the FeeCredit, which means this action alone does not deduct from account balances. If we were to actually have it emit transactions, then I think it would make the most sense to keep the fee at 0 so it's covered by fees on other transactions.
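
To illustrate where that credit comes from, here is a tiny sketch with purely illustrative numbers (not code from this PR):

```go
package main

import "fmt"

func main() {
	// Illustrative numbers only: a two-transaction group whose senders paid
	// 5,000 microAlgos in fees against a 2 * 1,000 microAlgo group minimum.
	const (
		minTxnFee         = 1_000
		maxAppProgramCost = 700
	)
	feesPaid := uint64(5_000)
	groupMinimum := uint64(2 * minTxnFee)

	feeCredit := feesPaid - groupMinimum                        // 3,000: surplus the group already paid
	extraBudget := (feeCredit / minTxnFee) * maxAppProgramCost  // 3 grants of 700 budget

	// No further balance deduction is needed beyond the fees already paid.
	fmt.Println(feeCredit, extraBudget) // 3000 2100
}
```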

@giuliop (Contributor) commented May 4, 2024

The main reason I chose to hack on the AVM is that it feels very elegantly designed, and that gives an intangible but very real and important joy factor to developing on it.
Except for the opcode budget mechanism, of course; you can feel a piece of you die every time you make up a dummy inner transaction :)

So I believe it's important to make it right, not just to make it work, in a way that can also support future increases of the AVM limits without needing to retouch it.

To that effect, it's worth taking a moment to think from first principles about what we want, and then figure out the most effective, and least disruptive, implementation strategy.
I will start by recapping the current context to make sure we are all on the same page, and then attempt to derive the principles for a redesign, with some limited comments on implementation strategy since I haven't studied the code in detail yet.

Current context

(I might make some mistakes here so please correct me as needed!)

Today the main limitations to smart contracts (apps) / logic signatures (lsigs) on the AVM are:

  1. The opcode budget -> a combination of budget per transaction and maximum number of transactions
  2. The program size -> total size in KB for an lsig teal program, or an app approval/clear teal program
  3. The "inputs" size -> the size of the various arrays you can pass in e.g., arguments, foreign apps, etc.

I will focus on the first two here, since 3 is not directly linked to the opcode budget while 2 actually is, given that the program size is also a direct limit on the computation size for things that cannot be "compressed" through loops or subroutines (of course there are a few more low-level details like stack depth, scratch space, etc., which I am also ignoring since I've never run into their limits).

Opcode budget

At the moment we have the following limits:

  • 20,000 per lsig call
  • 700 per app call
  • 16 max outer transactions
  • 256 max inner transactions (app calls only, cannot invoke lsigs)

So you can spend a max of:

  • 16 * 20,000 = 320,000 on lsigs' computation, and
  • 16 * 700 + 256 * 700 = 190,400 on apps' computation.

To achieve the max possible opcode budget you need to build a transaction group of 16 lsigs signing an app call each and make 256 inner transactions from the app calls, so you can get: 16 * 20,000 + 16 * 700 + 256 * 700 = 510,400

The key here is that you can pool together the lsig budget and have one single lsig consume almost 320,000 by adding 15 dummy lsig transactions and also pool together the app budget and have one single app consume almost 190,400 by adding 15 + 256 dummy app calls but you cannot pool together the two budgets.

The "you cannot pool together lsigs and apps budget" makes sense because lsigs are stateless and can run in parallel to app calls.

Program size

  • 1,000 bytes for lsigs
  • 2048 * 4 bytes = 8192 bytes for apps

Also, since apps are composable, and you can call up to 16 + 256 apps, you can have a "super app" split into 16 + 256 apps of 8192 bytes each, so plenty of space.

The limit on lsigs really bites, though: lsigs are composable neither with other lsigs nor with apps, as there is no communication mechanism between lsigs, or between lsigs and apps, except for success/failure.
That juicy 320,000 opcode budget is really difficult to use :(

As a concrete example, a BLS12-381 zk verifier generated by AlgoPlonk consumes ~185,000 opcode budget which is too much to make it an app.
Ideally it would be made an lsig; it is stateless after all, it just processes its inputs and either succeeds or fails.
Unfortunately its program size is over 3 KB and cannot be split into multiple independent 1 KB programs.

This means you are forced to use AlgoPlonk's BN254 verifier, which "only" consumes ~145,000 opcode budget. Since it has the same program length as the BLS12-381 one, you need to instantiate it as a smart contract and eat into the overall apps' budget.
If you now want to do some other expensive computation of over ~40,000 in cost (e.g., updating a 32-level merkle tree with the upcoming 🙏 mimc opcode) you need to split that into two blocks, which is frustrating because you are "wasting" that 320,000 lsig budget that would be perfect for the job.

Now it might be that this specific example can be optimized for program size and eventually even fit in the 1K limit (I haven't worked on that yet) but the general point stays the same.

Ideal design

My attempt at an ideal design would be:

  1. We define the maximum opcode budget allowed for lsigs and apps. We keep two separate opcode budgets since lsigs can run in parallel to apps (in theory we could also run lsigs in parallel with each other, which argues for further granularity in the budget definition, but let's leave that as a future optimization).
    To start with we can use the current ones: 320,000 for lsigs and, let's call it, 200,000 for apps (intuitively it makes sense that the former is bigger because of the extra parallelism opportunity).

  2. We give a single lsig or app call the maximum opcode budget for its category, but we also impose that as a limit per atomic transaction group (together with the current limits on the total number of outer / inner transactions).
    At the risk of oversimplification, this means raising the current LogicSigMaxCost from 20,000 to 320,000 and the current MaxAppProgramCost from 700 to 200,000, while also checking those limits at the group level (I guess this is where the modification challenge might lie, but I have not studied the code enough to appreciate it fully).

  3. We modify the minimum fee per transaction to be MinTxnFee * opcode_budget_blocks_count, where

  • MinTxnFee is the current constant (i.e., 0.001 algo)

  • opcode_budget_blocks_count is calculated as ceiling(total_transaction_opcode_cost / opcode_budget_block_size), where opcode_budget_block_size is set, as today, to 20,000 for lsigs and 700 for apps (two new constants); see the sketch after this list.
    Here again there is probably a modification challenge I don't fully appreciate.

    This also raises the question of why you should pay the same fee for 20,000 of budget in an lsig as for 700 in an app call; that probably doesn't make sense and should be normalized somewhat.

  4. We keep, of course, a limit on the program size for lsigs and apps. The apps one is fine as discussed above.
    The lsig one feels very limited. For instance, there is a limit of 2048 bytes on app call arguments and a limit of 1024 bytes on note fields, both larger than the 1,000-byte limit on lsigs.
    I think it would be more balanced to set the lsig program limit to 4096 bytes (half the max for an app teal size), or at least 2048 bytes, as one page of app program.
    In the future we can optimize how we store those transactions in the ledger by caching the lsigs and only storing their hash, if we don't do that already. I guess we could also optimize how we broadcast them by sending only their hash once they are 'cached' by the network.
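
As a rough illustration of the fee rule from point 3, here is a standalone sketch with assumed constants, reusing the ~185,000 verifier cost quoted above; it is not actual go-algorand code:

```go
package main

import "fmt"

// Illustrative constants for the proposed fee rule.
const (
	minTxnFee      = 1_000  // microAlgos
	lsigBudgetUnit = 20_000 // opcode_budget_block_size for lsigs
	appBudgetUnit  = 700    // opcode_budget_block_size for app calls
)

// minFee returns MinTxnFee * ceiling(cost / unit), the minimum fee a
// transaction with the given opcode cost would owe under the proposal.
func minFee(cost, unit uint64) uint64 {
	blocks := (cost + unit - 1) / unit // ceiling division
	if blocks == 0 {
		blocks = 1 // even a zero-cost txn pays the minimum fee
	}
	return minTxnFee * blocks
}

func main() {
	// A ~185,000-cost verifier as an app call: ceil(185000/700) = 265 blocks
	// -> 265,000 microAlgos (0.265 Algo).
	fmt.Println(minFee(185_000, appBudgetUnit))
	// The same cost as an lsig: ceil(185000/20000) = 10 blocks
	// -> 10,000 microAlgos (0.010 Algo).
	fmt.Println(minFee(185_000, lsigBudgetUnit))
}
```

The 0.265 vs 0.010 Algo gap for the same amount of work is exactly the normalization question raised in point 3.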

This, and the P2P gossip network, are to me the most important changes to the AVM right now:
the latter because decentralization is the only reason we are all here, after all, and this one for the developer experience.

@joe-p (Contributor, Author) commented May 4, 2024

We modify the minimum fee per transaction to be MinTxnFee * opcode_budget_blocks_count, where

Unless I am misunderstanding, that is essentially what this PR is doing. For every additional MinTxnFee you get another 700 opcode units to work with.

I think it would be more balanced to set the lsig program limit to 4096 bytes (half the max for an app teal size), or at least 2048 bytes, as one page of app program.

Agreed, but seems orthogonal to the problem this PR is specifically addressing. I've created an issue to continue the discussion #5990

@giuliop (Contributor) commented May 5, 2024

We modify the minimum fee per transaction to be MinTxnFee * opcode_budget_blocks_count, where

Unless I am misunderstanding, that is essentially what this PR is doing. For every additional MinTxnFee you get another 700 opcode units to work with.

Yes, I am arguing for option C: break free of the inner transaction machinery so that it works for lsigs too and is a clean solution. Polluting the ledger with dummy transactions cannot be a long-term solution; it will surely come back to bite us sooner or later, and it will always be more painful to break things later. I'd rather force changes to frontends now if we have to; I'm pretty sure everybody will be happy to get this cleaned up even if it creates some short-term work.

I think it would be more balanced to set the lsig program limit to 4096 bytes (half the max for an app teal size), or at least 2048 bytes, as one page of app program.

Agreed, but seems orthogonal to the problem this PR is specifically addressing. I've created an issue to continue the discussion #5990

Thank you for that; I commented there.
