Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVM: Implement lsig size pooling #6057

Merged
merged 22 commits into from
Nov 14, 2024

Conversation

giuliop
Copy link
Contributor

@giuliop giuliop commented Jul 8, 2024

Implement pooling of logic signatures' size across a transaction group.

This PR addresses this issue by implementing pooling of logic signatures' size across a transaction group.
As discussed in the issue, given that currently it is possible to submit a group with 16 1KB lsigs, it seems logical to allow pooling of the total size across a group.

Implementation

The pooled max lsig length for the group is set at len(group) * LogicSigMaxSize, as done for lsig opcode budget pooling, for both node validation and goal clerk dryrun.

A group is valid if the sum of all lsigs' program and args size in the group does not exceed the pooled max lsig length.

The consensus change(EnableLogicSigSizePooling bool) is enabled in vFuture in this PR.

To make pass the TestMaxSizesCorrect, I had to change the value of TxnTagMaxSize.
Two observations on this:

  1. The comments of TestMaxSizesCorrect say:
// If this test fails, DO NOT JUST UPDATE THE CONSTANTS OR MODIFY THE TEST!
// Instead you need to introduce a new version of the protocol and mechanisms
// to ensure that nodes on different proto versions don't reject each others messages due to exceeding
// max size network protocol version

Beside updating the constant I did introduce a new version of the protocol (in vFuture) but I don't fully understand the comment so not sure if I was supposed to do something else as well.

  1. To calculate the value of TxnTagMaxSize, TestMaxSizesCorrect calls the function SignedTxnMaxSize which calculates the LogicSigMaxSize calling this function:
// MaxSize returns a maximum valid message size for this message type
func LogicSigMaxSize() (s int) {
	s = 1 + 2 + msgp.BytesPrefixSize + config.MaxLogicSigMaxSize + 4 + crypto.SignatureMaxSize() + 5 + crypto.MultisigSigMaxSize() + 4
	// Calculating size of slice: z.Args
	s += msgp.ArrayHeaderSize + ((EvalMaxArgs) * (msgp.BytesPrefixSize + config.MaxLogicSigMaxSize))
	return
}

config.MaxLogicSigMaxSize used to be 1KB and this PR changes it to 16KB.
What I don't undestand is why we multiply config.MaxLogicSigMaxSize by EvalMaxArgs which is 255, ending up with what looks like an exagerrated value for LogicSigMaxSize.
I did not want to touch this in case I am missing something.

Testing plan

The PR is marked as draft since it does not includes tests yet.
I've run separately integration tests testing success and failure for single transactions and groups, both for goal clerk dryrun and actually sending the transactions.

Before integrating the tests in the codebase I'd like to get some feedback:

  1. Is the overall approach ok?
  2. How would you like to see the tests? I suppose integration tests only? Any guidance on where to put the tests and which existing tests to use as reference would be very helpful

Copy link

codecov bot commented Jul 10, 2024

Codecov Report

Attention: Patch coverage is 74.07407% with 7 lines in your changes missing coverage. Please review.

Project coverage is 55.05%. Comparing base (39f7485) to head (0b5564e).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
cmd/goal/clerk.go 0.00% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6057      +/-   ##
==========================================
- Coverage   56.28%   55.05%   -1.23%     
==========================================
  Files         494      494              
  Lines       69958    69972      +14     
==========================================
- Hits        39375    38523     -852     
- Misses      27912    28717     +805     
- Partials     2671     2732      +61     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gmalouf gmalouf requested a review from jannotti July 10, 2024 19:10
cmd/goal/clerk.go Outdated Show resolved Hide resolved
data/transactions/verify/txn.go Outdated Show resolved Hide resolved
data/transactions/verify/txn.go Show resolved Hide resolved
@jannotti
Copy link
Contributor

I tentatively agree with you about you thinking about LogicSigMaxSize(). There is no explicit constant for a maximum argument size, so an argument can indeed by the whole 16kb in size. But, of course, you can't have 255 of them. So I think we want to capture the fact that the the msgp size can by 16kb plus the overhead of (up to) 255 arguments. Each additional argument causes the msgp serialization to be a bit bigger in order to encode the length of the argument. I think that means we just remove config.MaxLogicSIgMaxSize from the sum being multiplied.

protocol/tags.go Outdated Show resolved Hide resolved
@giuliop
Copy link
Contributor Author

giuliop commented Jul 20, 2024

I tentatively agree with you about you thinking about LogicSigMaxSize(). There is no explicit constant for a maximum argument size, so an argument can indeed by the whole 16kb in size. But, of course, you can't have 255 of them. So I think we want to capture the fact that the the msgp size can by 16kb plus the overhead of (up to) 255 arguments. Each additional argument causes the msgp serialization to be a bit bigger in order to encode the length of the argument. I think that means we just remove config.MaxLogicSIgMaxSize from the sum being multiplied.

I agree with your logic and my first attempt was to do:

type LogicSig struct {
	_struct struct{} `codec:",omitempty,omitemptyarray"`

	// Logic signed by Sig or Msig, OR hashed to be the Address of an account.
	Logic []byte `codec:"l,allocbound=config.MaxLogicSigMaxSize"`

	Sig  crypto.Signature   `codec:"sig"`
	Msig crypto.MultisigSig `codec:"msig"`

	// Args are not signed, but checked by Logic
-	Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=config.MaxLogicSigMaxSize"
+	Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=0"`
}

This makes msgp generate:

func LogicSigMaxSize() (s int) {
	s = 1 + 2 + msgp.BytesPrefixSize + config.MaxLogicSigMaxSize + 4 + crypto.SignatureMaxSize() + 5 + crypto.MultisigSigMaxSize() + 4
	// Calculating size of slice: z.Args
-	s += msgp.ArrayHeaderSize + ((EvalMaxArgs) * (msgp.BytesPrefixSize + config.MaxLogicSigMaxSize))
+	s += msgp.ArrayHeaderSize + ((EvalMaxArgs) * (msgp.BytesPrefixSize + 0))
	return
}

which implements the idea that the size of the actual arguments is already captured in config.MaxLogicSigMaxSize and we only want to add the overhead of the arg len encoding times the max number of args

The issue is that we are saying that the max length on an argument is 0 and this is used also in the func (z *LogicSig) UnmarshalMsgWithState(...) which gets broken as shown by the resulting diff:

diff --git a/data/transactions/msgp_gen.go b/data/transactions/msgp_gen.go
index 7cc22db08..955c1812f 100644
--- a/data/transactions/msgp_gen.go
+++ b/data/transactions/msgp_gen.go
@@ -3306,8 +3306,8 @@ func (z *LogicSig) UnmarshalMsgWithState(bts []byte, st msgp.UnmarshalState) (o
...
-                               if zb0007 > config.MaxLogicSigMaxSize {
-                                       err = msgp.ErrOverflow(uint64(zb0007), uint64(config.MaxLogicSigMaxSize))
+                               if zb0007 > 0 {
+                                       err = msgp.ErrOverflow(uint64(zb0007), uint64(0))
                                        return
                                }
@@ -3395,8 +3395,8 @@ func (z *LogicSig) UnmarshalMsgWithState(bts []byte, st msgp.UnmarshalState) (o
...
-                                       if zb0011 > config.MaxLogicSigMaxSize {
-                                               err = msgp.ErrOverflow(uint64(zb0011), uint64(config.MaxLogicSigMaxSize))
+                                       if zb0011 > 0 {
+                                               err = msgp.ErrOverflow(uint64(zb0011), uint64(0))
                                                return
                                        }

My second attempt was to say instead that the max len of the arg array is one max size arg by doing:

Args [][]byte `codec:"arg,allocbound=1,allocbound=config.MaxLogicSigMaxSize"`

which generates
s += msgp.ArrayHeaderSize + ((1) * (msgp.BytesPrefixSize + config.MaxLogicSigMaxSize))

which also breaks func (z *LogicSig) UnmarshalMsgWithState(...) since we are saying the max number of args is 1:

+++ b/data/transactions/msgp_gen.go
@@ -3287,8 +3287,8 @@ func (z *LogicSig) UnmarshalMsgWithState(bts []byte, st msgp.UnmarshalState) (o
...
-                       if zb0005 > EvalMaxArgs {
-                               err = msgp.ErrOverflow(uint64(zb0005), uint64(EvalMaxArgs))
+                       if zb0005 > 1 {
+                               err = msgp.ErrOverflow(uint64(zb0005), uint64(1))
                                err = msgp.WrapError(err, "struct-from-array", "Args")
                                return
                        }
@@ -3376,8 +3376,8 @@ func (z *LogicSig) UnmarshalMsgWithState(bts []byte, st msgp.UnmarshalState) (o
...
-                               if zb0009 > EvalMaxArgs {
-                                       err = msgp.ErrOverflow(uint64(zb0009), uint64(EvalMaxArgs))
+                               if zb0009 > 1 {
+                                       err = msgp.ErrOverflow(uint64(zb0009), uint64(1))

I am out of quick fixes, would you have some other idea? Or shall we just leave it very generous as it is now?
Alternatively I would need to understand in depth how msgp works and what is go-algorand codec architecture to suggest some other alternatives but it's a non trivial time investment

@giuliop
Copy link
Contributor Author

giuliop commented Jul 29, 2024

OK, wasn't too bad, studying algorand/msgp I found an handy maxtotalbytes directive which I used to fix the max size of the argument array:

@@ -39,7 +39,7 @@ type LogicSig struct {
	// Args are not signed, but checked by Logic
-       Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=config.MaxLogicSigMaxSize"`
+       Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=config.MaxLogicSigMaxSize,maxtotalbytes=config.MaxLogicSigMaxSize"`
 }

I used config.MaxLogicSigMaxSize for the max size, which generates:

func LogicSigMaxSize() (s int) {
       s = 1 + 2 + msgp.BytesPrefixSize + config.MaxLogicSigMaxSize + 4 + crypto.SignatureMaxSize() + 5 + crypto.MultisigSigMaxSize() + 4
       // Calculating size of slice: z.Args
-       s += msgp.ArrayHeaderSize + ((EvalMaxArgs) * (msgp.BytesPrefixSize + config.MaxLogicSigMaxSize))
+       s += msgp.ArrayHeaderSize + config.MaxLogicSigMaxSize
       return
}

and leaves everything else intact.

Let me know if you like this or would consider something else instead of config.MaxLogicSigMaxSize.

I like it because that's very close to the max size for the args array (a bit more because the Logic cannot be zero, a bit less because we are not including the overhead from maxArgs BytesPrefixSize).

In any case in the overall LogicSigMaxSize() we are double counting MaxLogicSigMaxSize for both the Logic and the Args so we have a generous, but reasonable, margin.

@jannotti
Copy link
Contributor

I was thinking about another approach. How about we just keep the allocbound on Args at 1000? There's no particular reason to need very large args. And, in fact, it's totally pointless for it to be bigger than 4k, since teal values can't be bigger that 4k. If you tried to access an arg > 4k, you will panic anyway.

@giuliop
Copy link
Contributor Author

giuliop commented Jul 29, 2024

You made me realize that indeed we need to decide which is the max limit for a logicsig arg.
Before we had a combined cap for Logic + Args of 1000 which was lower than the AVM limit on teal args of 4096, so it implicitly set the limit at 1000.
Now that the combined cap is 16000 > 4096 we need to decide what is the limit for args.

My first reaction would be to say 4096 to match the AVM, why artificially lower it after all.

The argument I see in favor of using 1000 is not to change the old value of TxnTagMaxSize.
If there is a concern that we might inadvertently break something then it's more prudent to use 1000 and of course if someone needs to pass a larger arg they can always split it up over multiple args.

In sum, I see an allocbound on arg size of 4096 as the most logical choice, but if maintaining the old TxnTagMaxSize is valuable then 1000 is a fine choice too.

@jannotti jannotti self-assigned this Aug 26, 2024
@jannotti jannotti changed the title Implement lsig size pooling AVM: Implement lsig size pooling Aug 26, 2024
@jannotti
Copy link
Contributor

You made me realize that indeed we need to decide which is the max limit for a logicsig arg. Before we had a combined cap for Logic + Args of 1000 which was lower than the AVM limit on teal args of 4096, so it implicitly set the limit at 1000. Now that the combined cap is 16000 > 4096 we need to decide what is the limit for args.

My first reaction would be to say 4096 to match the AVM, why artificially lower it after all.

Yes, we ought to allow 4096 for individual args. I suppose it can be checked in transactions/data/logic/eval.go near here:

	if cx.txn.Lsig.Args != nil && len(cx.txn.Lsig.Args) > transactions.EvalMaxArgs {
		return false, errTooManyArgs
	}

though it's unclear why we don't check this in txn.WellFormed. It can be an unconditional check (no need to have versioning) because old transactions are implicitly limited to < 1000 bytes.

I dislike that this causes a change in TxnTagMaxSize but I don't see a way around it. It was nice when state proof txn was so much bigger than anything else, we didn't have to adjust TxnTagMaxSize for a simple txn addition. In the future, I suppose we will have to do so, so the limit is hit by a "normal" txn, so any increase increases the limit. The comment explaining TxnTagMaxSize will need to be reworked.

There is a unit test that caught this right?

cmd/goal/clerk.go Outdated Show resolved Hide resolved
cmd/goal/clerk.go Show resolved Hide resolved
@jannotti
Copy link
Contributor

I'm really confused by that comment I mentioned. It says:

// TxnTagMaxSize is the maximum size of a TxnTag message. This is equal to SignedTxnMaxSize()
// which is size of just a single message containing maximum Stateproof. Since Stateproof
// transactions can't be batched we don't need to multiply by MaxTxnBatchSize.

But that doesn't seem right. TransactionMaxSize() seems to bound the size of a single transaction, not stateproof in particular, and it includes the use of all fields, though many are mutually exclusive.

But, it also hints that the TxnTag should take into account 16 "normal" transactions, but doesn't bother to do so, since it's the stateproof txn that really causes the number to blow up. (I think the TxnTag is used for groups, not single transactions)

@jannotti
Copy link
Contributor

After further discussion, we're confident that we can make changes to the max size tests so it doesn't fail so easily, without issue. I'll make those changes in a separate PR. If you address the rest in the meantime (limit arg length, maybe remove the assembly checks) I'll get the max length changes in.

@giuliop
Copy link
Contributor Author

giuliop commented Aug 28, 2024

Yes, we ought to allow 4096 for individual args. I suppose it can be checked in transactions/data/logic/eval.go near here:

	if cx.txn.Lsig.Args != nil && len(cx.txn.Lsig.Args) > transactions.EvalMaxArgs {
		return false, errTooManyArgs
	}

though it's unclear why we don't check this in txn.WellFormed.

Done, and I also think that the checks in func eval on the number and size of lsig args ought to be in txn.WellFormed

data/transactions/logic/eval.go
...
+var errLogicSigArgTooLarge = errors.New("LogicSig argument too large")
...
@@ -1305,8 +1306,15 @@ func eval(program []byte, cx *EvalContext) (pass bool, err error) {
        if (cx.EvalParams.Proto == nil) || (cx.EvalParams.Proto.LogicSigVersion == 0) {
                return false, errLogicSigNotSupported
        }
-       if cx.txn.Lsig.Args != nil && len(cx.txn.Lsig.Args) > transactions.EvalMaxArgs {
-               return false, errTooManyArgs
+       if cx.txn.Lsig.Args != nil {
+               if len(cx.txn.Lsig.Args) > transactions.EvalMaxArgs {
+                       return false, errTooManyArgs
+               }
+               for _, arg := range cx.txn.Lsig.Args {
+                       if len(arg) > transactions.MaxLogicSigArgSize {
+                               return false, errLogicSigArgTooLarge
+                       }
+               }

data/transactions/logicsig.go
 // EvalMaxArgs is the maximum number of arguments to an LSig
 const EvalMaxArgs = 255

+// MaxLsigArgSize is the maximum size of an argument to an LSig
+// We use 4096 to match the maximum size of a TEAL value
+// (as defined in `const maxStringSize` in package logic)
+const MaxLogicSigArgSize = 4096
...
@@ -39,7 +44,7 @@ type LogicSig struct {
        Msig crypto.MultisigSig `codec:"msig"`

        // Args are not signed, but checked by Logic
-       Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=config.MaxLogicSigMaxSize,maxtotalbytes=config.MaxLogicSigMaxSize"`
+       Args [][]byte `codec:"arg,allocbound=EvalMaxArgs,allocbound=MaxLogicSigArgSize,maxtotalbytes=config.MaxLogicSigMaxSize"`
 }

I dislike that this causes a change in TxnTagMaxSize but I don't see a way around it. It was nice when state proof txn was so much bigger than anything else, we didn't have to adjust TxnTagMaxSize for a simple txn addition. In the future, I suppose we will have to do so, so the limit is hit by a "normal" txn, so any increase increases the limit. The comment explaining TxnTagMaxSize will need to be reworked.

There is a unit test that caught this right?

Yes, this one

@giuliop
Copy link
Contributor Author

giuliop commented Aug 28, 2024

After further discussion, we're confident that we can make changes to the max size tests so it doesn't fail so easily, without issue. I'll make those changes in a separate PR. If you address the rest in the meantime (limit arg length, maybe remove the assembly checks) I'll get the max length changes in.

I did implement them before reading your comment, I spent some time thinking about it and my best idea in the end was to use the max block length as the max for the transaction / transaction group message tag (as per below).
I can remove that if you want.

protocol/tags.go
@@ -84,10 +84,10 @@ const StateProofSigTagMaxSize = 6378
 // Matches  current network.MaxMessageLength
 const TopicMsgRespTagMaxSize = 6 * 1024 * 1024

-// TxnTagMaxSize is the maximum size of a TxnTag message. This is equal to SignedTxnMaxSize()
-// which is size of just a single message containing maximum Stateproof. Since Stateproof
-// transactions can't be batched we don't need to multiply by MaxTxnBatchSize.
-const TxnTagMaxSize = 4394756
+// TxnTagMaxSize is the maximum size of a TxnTag message.
+// Matches current config.MaxTxnBytesPerBlock, the maximum lenght of a block,
+// since a transaction or transaction group cannot be larger than a block.
+const TxnTagMaxSize = 5 * 1024 * 1024

node/node_test.go
@@ -805,7 +804,7 @@ func TestMaxSizesCorrect(t *testing.T) {
        require.Equal(t, ppSize, protocol.ProposalPayloadTag.MaxMessageSize())
        spSize := uint64(stateproof.SigFromAddrMaxSize())
        require.Equal(t, spSize, protocol.StateProofSigTag.MaxMessageSize())
-       txSize := uint64(transactions.SignedTxnMaxSize())
+       txSize := uint64(config.MaxTxnBytesPerBlock)
        require.Equal(t, txSize, protocol.TxnTag.MaxMessageSize())

@giuliop
Copy link
Contributor Author

giuliop commented Oct 9, 2024

I think this PR is good to go now.

Let me know if you prefer that I move the integration test I added to a unit test instead

@jannotti
Copy link
Contributor

Here is the PR to the spec for the similar change for logicsig cost pooling. It contains a lot of other stuff too, but please look for the logicsig bits and make a similar PR to that repo.

https://github.com/algorandfoundation/specs/pull/98/files

jannotti
jannotti previously approved these changes Oct 22, 2024
@jannotti
Copy link
Contributor

There are two small linter errors that should be fixed. They just appeared because the linter run needed approval to run.

Copy link
Contributor

@algorandskiy algorandskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks reasonable. Please add unit tests to verify/txn_test.go

data/transactions/verify/txn.go Show resolved Hide resolved
@giuliop
Copy link
Contributor Author

giuliop commented Oct 23, 2024

There are two small linter errors that should be fixed. They just appeared because the linter run needed approval to run.

Done, I could have run make lint locally to catch them

algorandskiy
algorandskiy previously approved these changes Oct 23, 2024
data/transactions/verify/txn_test.go Outdated Show resolved Hide resolved
@giuliop
Copy link
Contributor Author

giuliop commented Oct 23, 2024

All looks good to me now; codecov shows some lines not covered in cmd/goal/clerk.go but I suppose that's ok; let me know otherwise

@giuliop
Copy link
Contributor Author

giuliop commented Oct 24, 2024

I've seen that this PR fails the type serialization checking with go-algorand-sdk so I made this PR to go-algorand-sdk to hopefully fix it.

Below the failing case:

Compare the Type Trees "config.ConsensusParams" v "config.ConsensusParams"

========================================================
			STRUCT DIFF REPORT
comparing
	<<<<<github.com/algorand/go-algorand/config :: "config.ConsensusParams" (struct)>>>>>
VS
	<<<<<github.com/algorand/go-algorand-sdk/v2/protocol/config :: "config.ConsensusParams" (struct)>>>>>
========================================================
--------------------------------------------------------
		DIFFERENCES FOUND (partial)
--------------------------------------------------------
Common path of length 0:

X-DIFF
------
EXISTS IN:	 	"github.com/algorand/go-algorand/config :: \"config.ConsensusParams\" (struct)"
MISSING FROM:		"github.com/algorand/go-algorand-sdk/v2/protocol/config :: \"config.ConsensusParams\" (struct)"
1 TYPES TOTAL:

(1)
[FIELD](+codec): 	[EnableLogicSigSizePooling]()
SOURCE: 		 :: "bool" (bool)

@jannotti jannotti merged commit 62f9082 into algorand:master Nov 14, 2024
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants