txscript: Zero alloc optimization refactor. #1656
Conversation
24+ hours says testnet3 miner approves
Thanks. I fixed the commit message locally, but I'll wait to push it until the end to avoid changing all the commit hashes during the review period.
Typo in commit message a83a43f: "multiple input transaciton"
Ran benchmarks before & after and verified numbers.
Fixed the commit message.
Commit message for fae1186 refers to adding checkScriptParses, but that was added in an earlier commit.

Nice catch. I reordered commits for clarity when splitting everything up for easier review and must have missed updating that.
Reviewed up to commit 0543f6f so far.
This is called out explicitly in the commits, but just to point out for those only looking at the PR: this changes the semantics of the exported function GetScriptClass for the following script types:
func TestMyZZZZ(t *testing.T) {
	script01 := mustParseShortForm("0 0 OP_CHECKMULTISIG")
	cls01 := GetScriptClass(0, script01)
	t.Logf("class of first script is: %v", cls01)

	script02 := mustParseShortForm("'b'{100} 1 OP_CHECKSIGALT")
	cls02 := GetScriptClass(0, script02)
	t.Logf("class of second script is: %v", cls02)
}
Executing this test (with go test -v) on current master shows that the first script is of class "nonstandard" and the second is "pubkeyalt". Executing the same test at commit 0543f6f outputs classes of "multisig" and "nonstandard".
As far as I can tell, these particular changes don't provoke a hard fork since GetScriptClass is only used in consensus code in instances where its result is compared via equality testing to a few pre-defined script classes which don't include the affected ones.
That is correct regarding why it doesn't cause a hard fork. I also have issues #1625 and #1626 open which point out that the consensus code needs to be updated to remove the dependence on these functions because they are not supposed to be used in consensus. In the meantime, I was extremely careful to examine all call sites and ensure that any changed semantics do not affect consensus. I'm glad to see you independently verify.
Typo in commit message for 6188092: "... each the dection of each script ..."
On commit message for 1be0973:
"...by making using of..."
"... which describing..."

Fixed commit messages.
🏁 🏁 🏁 Got to the end of it, reviewing each individual commit. This took somewhat of a long time, but was at least possible (compared to reviewing a single big diff) given that each individual commit made sense on its own. I'm still going to do a couple of functional tests, which I haven't done yet, but otherwise this looks pretty much good to go based on other people already running it, so if you want to go ahead and rebase with the fixed commit messages, feel free from my end. Might be a bit understated, but nice work! 😄
Force-pushed from 44f3fd3 to 4b45582
Performed a full sync, all good.
Made a full sync and an atomic swap against Bitcoin Core and everything worked nicely.
System configs: Memory: 3 GB

Initial sync times:
- 1.5.0-pre+dev dcrd: testnet: 44 minutes and 54 seconds; mainnet: 1 hour, 21 minutes and 24 seconds
- 1.4.0-pre+dev dcrd: testnet: 1 hour, 11 minutes and 17 seconds; mainnet: 1 hour, 47 minutes and 8 seconds

1.4.0 to 1.5.0: 24% faster
This converts SignTxOutput and supporting funcs, namely sign, mergeScripts and mergeMultiSig, to make use of the new tokenizer as well as some recently added funcs that deal with raw scripts in order to remove the reliance on parsed opcodes as a step towards ultimately removing them altogether, and updates the comments to explicitly call out the script version semantics. It is worth noting that this has the side effect of optimizing the function as well; however, since this change is not focused on the optimization aspects, no benchmarks are provided.
This introduces a new function named removeOpcodeByDataRaw which accepts the raw scripts and data to remove versus requiring the parsed opcodes to both significantly optimize it as well as make it more flexible for working with raw scripts. There are several places in the rest of the code that currently only have access to the parsed opcodes, so this only introduces the function for use in the future and deprecates the existing one. Note that, in practice, the script will never actually contain the data that is intended to be removed since the function is only used during signature verification to remove the signature itself which would require some incredibly non-standard code to create. Thus, as an optimization, it avoids allocating a new script unless there is actually a match that needs to be removed. Finally, it updates the tests to use the new function.
This converts the isDisabled function defined on a parsed opcode to a standalone function which accepts an opcode as a byte instead in order to make it more flexible for raw script analysis. It also updates all callers accordingly.
This converts the alwaysIllegal function defined on a parsed opcode to a standalone function named isOpcodeAlwaysIllegal which accepts an opcode as a byte instead in order to make it more flexible for raw script analysis. It also updates all callers accordingly.
This converts the isConditional function defined on a parsed opcode to a standalone function named isOpcodeConditional which accepts an opcode as a byte instead in order to make it more flexible for raw script analysis. It also updates all callers accordingly.
This converts the checkMinimalDataPush function defined on a parsed opcode to a standalone function which accepts an opcode and data slice instead in order to make it more flexible for raw script analysis. It also updates all callers accordingly.
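As a concrete illustration of the minimal data push rules such a standalone check enforces, here is a self-contained sketch. The function and constant names are illustrative, not the dcrd internals, though the opcode values match the standard script encoding.

```go
package main

import "fmt"

// Opcode constants used by the sketch (values match the standard script
// encoding used by Decred and Bitcoin).
const (
	opZero      = 0x00 // OP_0
	opPushData1 = 0x4c
	opPushData2 = 0x4d
	opPushData4 = 0x4e
	op1Negate   = 0x4f
	op1         = 0x51 // OP_1
)

// isMinimalDataPush reports whether the given opcode is the smallest
// possible way to encode the given data per the standard minimal-push
// rules.  It takes the opcode as a plain byte plus the data slice, in the
// spirit of the standalone checkMinimalDataPush described above.
func isMinimalDataPush(op byte, data []byte) bool {
	dataLen := len(data)
	switch {
	case dataLen == 0:
		return op == opZero
	case dataLen == 1 && data[0] >= 1 && data[0] <= 16:
		return op == op1+data[0]-1 // small ints use OP_1 through OP_16
	case dataLen == 1 && data[0] == 0x81:
		return op == op1Negate
	case dataLen <= 75:
		return int(op) == dataLen // direct push opcode encodes the length
	case dataLen <= 255:
		return op == opPushData1
	case dataLen <= 65535:
		return op == opPushData2
	}
	return op == opPushData4
}

func main() {
	fmt.Println(isMinimalDataPush(op1, []byte{0x01}))  // true: OP_1 encodes the value 1
	fmt.Println(isMinimalDataPush(0x01, []byte{0x01})) // false: a direct push of 0x01 is larger than OP_1
}
```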
This converts the engine's current program counter disassembly to make use of the standalone disassembly function to remove the dependency on the parsed opcode struct. It also updates the tests accordingly.
This refactors the script engine to store and step through raw scripts by making use of the new zero-allocation script tokenizer as opposed to the less efficient method of storing and stepping through parsed opcodes. It also improves several aspects while refactoring, such as optimizing the disassembly trace, showing all scripts in the trace in the case of execution failure, and providing additional comments describing the purpose of each field in the engine.

It should be noted that this is a step towards removing the parsed opcode struct and associated supporting code altogether; however, in order to ease the review process, this retains the struct and all function signatures for opcode execution which make use of an individual parsed opcode. Those will be updated in future commits.

The following is an overview of the changes:
- Modify internal engine scripts slice to use raw scripts instead of parsed opcodes
- Introduce a tokenizer to the engine to track the current script
- Remove no longer needed script offset parameter from the engine since that is tracked by the tokenizer
- Add an opcode index counter for disassembly purposes to the engine
- Update check for valid program counter to only consider the script index
- Update tests for bad program counter accordingly
- Rework the NewEngine function
  - Store the raw scripts
  - Setup the initial tokenizer
  - Explicitly check against version 0 instead of DefaultScriptVersion which would break consensus if changed
  - Check the scripts parse according to version 0 semantics to retain current consensus rules
  - Improve comments throughout
- Rework the Step function
  - Use the tokenizer and raw scripts
  - Create a parsed opcode on the fly for now to retain existing opcode execution function signatures
  - Improve comments throughout
- Update the Execute function
  - Explicitly check against version 0 instead of DefaultScriptVersion which would break consensus if changed
  - Improve the disassembly tracing in the case of error
- Update the CheckErrorCondition function
  - Modify clean stack error message to make sense in all cases
  - Improve the comments
- Update the DisasmPC and DisasmScript functions on the engine
  - Use the tokenizer
  - Optimize construction via the use of strings.Builder
- Modify the subScript function to return the raw script bytes since the parsed opcodes are no longer stored
- Update the various signature checking opcodes to use the raw opcode data removal and signature hash calculation functions since the subscript is now a raw script
  - opcodeCheckSig
  - opcodeCheckMultiSig
  - opcodeCheckSigAlt
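The stepping model described above can be sketched with a stripped-down tokenizer. This is illustrative only: the real ScriptTokenizer is version-aware, handles OP_PUSHDATA1/2/4, and exposes more methods; here only direct pushes are understood, but the shape of the engine's step loop, including the separate opcode index counter kept for disassembly, is the same.

```go
package main

import (
	"errors"
	"fmt"
)

// scriptTokenizer is a stripped-down illustration of the zero-allocation
// tokenizer the engine now steps through.  It only understands direct data
// pushes (0x01-0x4b) and bare opcodes.
type scriptTokenizer struct {
	script []byte
	offset int
	op     byte
	data   []byte
	err    error
}

// Next advances to the next instruction in the raw script, returning false
// at the end of the script or on a malformed push.
func (t *scriptTokenizer) Next() bool {
	if t.err != nil || t.offset >= len(t.script) {
		return false
	}
	t.op = t.script[t.offset]
	t.data = nil
	instrLen := 1
	if t.op >= 0x01 && t.op <= 0x4b {
		instrLen = 1 + int(t.op)
		if t.offset+instrLen > len(t.script) {
			t.err = errors.New("data push past end of script")
			return false
		}
		t.data = t.script[t.offset+1 : t.offset+instrLen]
	}
	t.offset += instrLen
	return true
}

// Done reports whether all instructions have been consumed or an error hit.
func (t *scriptTokenizer) Done() bool {
	return t.err != nil || t.offset >= len(t.script)
}

func main() {
	// Step through a raw script the way the reworked engine does: one
	// token per step, with no up-front per-opcode allocations.
	script := []byte{0x02, 0xaa, 0xbb, 0x87} // push <aabb>, then OP_EQUAL
	tokenizer := scriptTokenizer{script: script}
	opIdx := 0 // opcode index kept for disassembly, as the engine now does
	for tokenizer.Next() {
		fmt.Printf("step %d: opcode %#x data %x\n", opIdx, tokenizer.op, tokenizer.data)
		opIdx++
	}
	if tokenizer.err != nil {
		fmt.Println("script error:", tokenizer.err)
	}
}
```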
This renames the removeOpcodeByDataRaw to removeOpcodeByData now that the old version has been removed.
This renames the calcSignatureHashRaw to calcSignatureHash now that the old version has been removed.
Also remove tests associated with unparsing opcodes accordingly.
Also remove tests associated with the func accordingly.
This converts the executeOpcode function defined on the engine to accept an opcode and data slice instead of a parsed opcode as a step towards removing the parsed opcode struct and associated supporting code altogether. It also updates all callers accordingly.
This converts the callback function defined on the internal opcode struct to accept the opcode and data slice instead of a parsed opcode as the final step towards removing the parsed opcode struct and associated supporting code altogether. It also updates all of the callbacks and tests accordingly and finally removes the now unused parsedOpcode struct.
Force-pushed from 4b45582 to 6adbaa6
//
// NOTE: This function is only valid for version 0 scripts. Since the function
// does not accept a script version, the results are undefined for other script
// versions.
func ExtractCoinbaseNullData(pkScript []byte) ([]byte, error) {
Is it possible to make this function valid for versions > 0?
It's possible, but would require a v2 module since it requires the addition of a new version parameter which constitutes a major change according to semver. However, it's important to note that no script versions greater than 0 are currently valid and this is simply calling out those existing semantics.
That said, part of introducing new script versions will necessarily involve ensuring the new ScriptTokenizer, which this PR introduces and which does take a script version, is able to properly parse the newly introduced script version(s).
To that end, what this function performs is now possible and trivial to implement via the aforementioned new ScriptTokenizer this PR introduces:
package main

import (
	"encoding/hex"
	"fmt"

	"github.com/decred/dcrd/txscript"
)

func main() {
	// This would ordinarily come from the tx output.
	const scriptVersion = 0

	// This is from main chain block 330233 and would ordinarily come from
	// somewhere else.
	pkScript, err := hex.DecodeString("6a0cf9090500f1ee035d7e430274")
	if err != nil {
		fmt.Println(err)
		return
	}

	// Note that this is ignoring the max allowed length which could be added
	// too, but for most applications, it won't matter, hence I omitted it here.
	var data []byte
	if len(pkScript) > 1 && pkScript[0] == txscript.OP_RETURN {
		tokenizer := txscript.MakeScriptTokenizer(scriptVersion, pkScript[1:])
		if tokenizer.Next() && tokenizer.Done() &&
			tokenizer.Opcode() <= txscript.OP_PUSHDATA4 {

			data = tokenizer.Data()
		}
	}
	fmt.Printf("Coinbase data: %x", data)
}
Output:
Coinbase data: f9090500f1ee035d7e430274
Live profiling data pulled from an instance of dcrd v1.4.0 performing an initial sync shows that roughly 50% of all allocations, which equates to around a massive 295.12GB, are the result of parsing scripts into slices of parsed opcodes prior to performing any analysis on them and executing them. Further, almost 72% of the entire CPU time is spent in garbage collection which is a direct result of producing a ton of allocations.
CPU Profiling Data:
Memory Profiling Data:
Consequently, this completely refactors the txscript module to remove its current reliance on parsed opcodes by introducing a new zero-allocation script tokenizer, updating all script analysis functions to make use of a combination of the new tokenizer and raw (unparsed) script analysis, and finally completely reworking the script engine to work with raw scripts by also making use of a combination of the new tokenizer and raw (unparsed) script analysis.

It is important to note that since this is consensus critical code, and, in many respects, one of the most error prone areas of consensus, significant effort has been put into making these changes easier to reason about and review by carefully crafting a series of individual commits such that every commit thoroughly describes its purpose, maintains consensus, and therefore passes all tests.
As can be seen in the following graphs, which show the overall effect of this entire set of changes on initial sync, the total number of allocations is significantly reduced, while the overall in-use memory remains around the same, as expected, which results in a significantly faster sync time (the effect is even more pronounced without the profiling that was active to generate these graphs):
The following is a before and after comparison of the allocations of the vast majority of the optimized functions:
Finally, the following is a before and after comparison of the performance of the vast majority of the optimized functions: