-
Notifications
You must be signed in to change notification settings - Fork 720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce memory usage of create staked command #4021
Reduce memory usage of create staked command #4021
Conversation
1d7c997
to
60eae8c
Compare
f817ee4
to
7a72ec1
Compare
661ba47
to
2f68910
Compare
Example run:
|
cfca52c
to
3c8c31d
Compare
Slight improvement and uses about 1GB of memory:
|
I need to make a second pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest review comments courtesy of @dcoutts .
distribution = [pool | (pool, poolIx) <- zip pools [1 ..], _ <- [1 .. delegsForPool poolIx]] | ||
|
||
-- Distribute M delegates across N pools: | ||
delegations <- liftIO $ Lazy.forM distribution $ computeDelegation network |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say lets just implement out own version of forM
and replicateM
in the cli and remove the hw-lazy
dependency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cardano-node
doesn't seem the right place to put them given how generic they are. Can I put them into cardano-base
?
Note these sorts of functions need associated tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cardano-prelude
might be a good place for those functions and tests, unless the overall plan is to stop depending on cardano-prelude
(at least that is what we are striking for in cardano-base
and cardano-ledger
repos)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've put them in Cardano.CLI.IO.Lazy
for now. It can be moved to a better place in a different PR if we find one.
-- times ensures that any data structures that are created as a result of the read is not | ||
-- retained in memory. | ||
|
||
let numDelegations = length delegations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is suspicious, calling length
could destroy the laziness. We could get numDelegations
in this list comprehension: distribution = [pool | (pool, poolIx) <- zip pools [1 ..], _ <- [1 .. delegsForPool poolIx]]
if we multiply the number of pools by the number of delegations per pool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, calling length
destroys laziness, but I've been in discussion with @kosyrevSerge and he is happy to pay the memory cost in this case.
Note, length
is not the only thing that "destroys laziness". Using delegations
more than once does the same thing. Previously, delegations
was written to a temporary file and the temporarily file was read multiple times to preserve laziness, but @kosyrevSerge didn't want to pay the serialisation cost of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment above is out of date however and I've deleted it.
StakeVerificationKey stakeVK <- firstExceptT ShelleyGenesisCmdTextEnvReadFileError | ||
. newExceptT | ||
$ readFileTextEnvelope (AsVerificationKey AsStakeKey) stakeVKF | ||
computeDelegation :: () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in this function we are generating two keys but all we are interested in are their hashes. We should create a new primitive in the cli that creates non-cryptographically secure key hashes for testing purposes like this.
We can use System.Random
for the entropy and hashFromBytes
(in cardano-base
) to generate dummy key hashes. This should speed things up a lot as generateSigningKey
is an expensive function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A new generateInsecureSigningKey
function has been introduced.
import qualified Data.List as L | ||
import qualified Data.Vector as V | ||
|
||
newtype ListMap k v = ListMap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to generically derive the JSON instances for ListMap
and it should do the right thing. @newhoggy can you try this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, generically deriving JSON would do one of two things:
deriving newtype
would change the format of the JSON file to use [["key1", "value1"], ["key2", "value2"]]
instead of {"field1": "value1", "field2": "value2"}
.
DeriveGeneric
would additionally add { "unListMap": ... }
wrapper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like we're settling on changing the ledger code, so this code will disappear once the ledger has been updated.
I'm keeping this the same for now.
3c8c31d
to
d2abe0b
Compare
@@ -330,6 +330,18 @@ source-repository-package | |||
tag: ee59880f47ab835dbd73bea0847dab7869fc20d8 | |||
--sha256: 1lrzknw765pz2j97nvv9ip3l1mcpf2zr4n56hwlz0rk7wq7ls4cm | |||
|
|||
source-repository-package |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we should duplicate the functions used from hw-lazy
. The node has enough dependencies and we need to be stricter about introducing new ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put the functions into Cardano.CLI.IO.Lazy
for the moment.
cardano-api/src/Cardano/Api/Key.hs
Outdated
@@ -67,6 +69,16 @@ generateSigningKey keytype = do | |||
seedSize = deterministicSigningKeySeedSize keytype | |||
|
|||
|
|||
generateInsecureSigningKey :: (Key keyrole, SerialiseAsRawBytes (SigningKey keyrole)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually we indent the ::
cardano-api/src/Cardano/Api/Key.hs
Outdated
-> IO (SigningKey keyrole) | ||
generateInsecureSigningKey keytype = do | ||
g <- Random.getStdGen | ||
let (bs, _) = Random.genByteString (fromIntegral (deterministicSigningKeySeedSize keytype)) g |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we ditch a pair of the parentheses with $
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
cardano-api/src/Cardano/Api/Key.hs
Outdated
let (bs, _) = Random.genByteString (fromIntegral (deterministicSigningKeySeedSize keytype)) g | ||
case deserialiseFromRawBytes (AsSigningKey keytype) bs of | ||
Just key -> return key | ||
Nothing -> error "Unable to generate insecure key" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better error message would be "generateInsecureSigningKey: Unable to generate insecure key"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
makeShelleyTransactionBody :: () | ||
=> ShelleyBasedEra era | ||
makeShelleyTransactionBody :: | ||
ShelleyBasedEra era |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not indent the ::
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the ::
@@ -481,3 +483,8 @@ readVerificationKeyOrHashOrTextEnvFile asType verKeyOrHashOrFile = | |||
eitherVk <- readVerificationKeyOrTextEnvFile asType vkOrFile | |||
pure (verificationKeyHash <$> eitherVk) | |||
VerificationKeyHash vkHash -> pure (Right vkHash) | |||
|
|||
generatePaymentKeys :: Key keyrole => AsType keyrole -> IO (VerificationKey keyrole, SigningKey keyrole) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So these may not necessarily be payment keys because we can vary the keyrole
. generateKeyPair
might be a better name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed
newtype VerificationKey StakeKey = | ||
StakeVerificationKey (Shelley.VKey Shelley.Staking StandardCrypto) | ||
newtype VerificationKey StakeKey = StakeVerificationKey | ||
{ unStakeVerificationKey :: Shelley.VKey Shelley.Staking StandardCrypto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use the field accessor here: https://github.com/input-output-hk/cardano-node/pull/4021/files#r915507713
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, I am always in favour of added an accessor to a newtype. If its there and no one uses it, its not a big deal (especially with a totally un-ambiguous name like this), but when its needed having it already there is a huge win IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments but nothing major
AddressKeyHash vkf mOFp -> runAddressKeyHash vkf mOFp | ||
AddressBuild paymentVerifier mbStakeVerifier nw mOutFp -> runAddressBuild paymentVerifier mbStakeVerifier nw mOutFp | ||
AddressBuildMultiSig sFp nId mOutFp -> runAddressBuildScript sFp nId mOutFp | ||
AddressInfo txt mOFp -> firstExceptT ShelleyAddressCmdAddressInfoError $ runAddressInfo txt mOFp | ||
|
||
runAddressKeyGen :: AddressKeyType | ||
runAddressKeyGenToFile :: AddressKeyType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent/alignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the ::
AddressKeyShelleyExtended -> generateAndWriteKeyFiles AsPaymentExtendedKey vkf skf | ||
AddressKeyByron -> generateAndWriteKeyFiles AsByronKey vkf skf | ||
|
||
generateAndWriteKeyFiles :: Key keyrole |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent/alignment of ::
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the ::
generateAndWriteKeyFiles asType vkf skf = do | ||
uncurry (writePaymentKeyFiles vkf skf) =<< liftIO (generatePaymentKeys asType) | ||
|
||
writePaymentKeyFiles :: Key keyrole |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indent/alignment of ::
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the ::
start genDlgs mNonDlgAmount (length nonDelegAddrs) nonDelegAddrs stakePools stake | ||
stDlgAmount numDelegations delegAddrs stuffedUtxoAddrs template | ||
|
||
-- shelleyGenesis contains lazy loaded data, so using lazyToJson to serialise to avoid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to delete this comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleted
StakeVerificationKey stakeVK <- firstExceptT ShelleyGenesisCmdTextEnvReadFileError | ||
. newExceptT | ||
$ readFileTextEnvelope (AsVerificationKey AsStakeKey) stakeVKF | ||
computeInsecureDelegation :: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add a comment here to indicate that the keys are not cryptographically generated and we don't care because this is for testing purposes. computeInsecureDelegation
is a strange name. Why not generateTestDelegation
or something to that effect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -1229,4 +1282,3 @@ readAlonzoGenesis fpath = do | |||
lbs <- handleIOExceptT (ShelleyGenesisCmdGenesisFileError . FileIOError fpath) $ LBS.readFile fpath | |||
firstExceptT (ShelleyGenesisCmdAesonDecodeError fpath . Text.pack) | |||
. hoistEither $ Aeson.eitherDecode' lbs | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is your editor deleting newlines at the end of the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so.
skeyDesc, vkeyDesc :: TextEnvelopeDescr | ||
skeyDesc = "Stake Signing Key" | ||
vkeyDesc = "Stake Verification Key" | ||
runStakeAddressKeyGenToFile :: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indent/alignment of ::
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved the ::
-- Genesis staking: pools/delegation map & delegated initial UTxO spec: | ||
-> [(Ledger.KeyHash 'Ledger.StakePool StandardCrypto, Ledger.PoolParams StandardCrypto)] | ||
-> [(Ledger.KeyHash 'Ledger.Staking StandardCrypto, Ledger.KeyHash 'Ledger.StakePool StandardCrypto)] | ||
-> Lovelace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add haddocks for these parameters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the haddocks
{ dInitialUtxoAddr = shelleyAddressInEra initialUtxoAddr | ||
, dDelegStaking = Ledger.hashKey stakeVK | ||
, dDelegStaking = Ledger.hashKey (unStakeVerificationKey stakeVK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of unStakeVerificationKey
46db29c
to
5be4f73
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
bors r+ |
Merge conflict. |
5be4f73
to
cc24684
Compare
bors r+ |
4021: Reduce memory usage of create staked command r=newhoggy a=newhoggy The problem arises we build a very large in-memory JSON value and then write it out as `genesis.json`. The amount of memory used by the in-memory JSON value can be so large as to use up all the memory. This PR reduces the severity of the memory usage in two ways. 1. It introduces a `ListMap` type, which is almost like `Map`, but has a `[(k, v)]` as its internal representation. This avoids the serialisation cost of constructing a map only to convert it back into a list. 2. Uses Lazy IO so that generated stuffed utxos are created on demand rather than upfront an all in memory. 3. Introduces a new `LazyToJson` type class which doesn't have the memory retention problems of `aeson` library. 4. For evaluation of fields to `WHNF` to allow parent object to be GCed which allows large fields that have already be serialised to be collected as well. 5. Writes delegations to a single `delegations.jsonl` file which is a newline delimited JSON file. This file is streamed multiple times so that generation of the `genesis.json` file does not retain memory unnecessarily. This PR also changes the command to no longer generate payment keys and stake keys. If we want to have the ability optionally output these files, there is additional work to do. Addresses #3938 Co-authored-by: John Ky <[email protected]>
This PR was included in a batch that successfully built, but then failed to merge into master. It will not be retried. Additional information: {"message":"Waiting on code owner review from JaredCorduan, Jimbo4350, dcoutts, and/or erikd.","documentation_url":"https://docs.github.com/articles/about-protected-branches"} |
bors r+ |
Build succeeded: |
The problem arises we build a very large in-memory JSON value and then write it out as
genesis.json
. The amount of memory used by the in-memory JSON value can be so large as to use up all the memory.This PR reduces the severity of the memory usage in two ways.
It introduces a
ListMap
type, which is almost likeMap
, but has a[(k, v)]
as its internal representation. This avoids the serialisation cost of constructing a map only to convert it back into a list.Uses Lazy IO so that generated stuffed utxos are created on demand rather than upfront an all in memory.
Introduces a new
LazyToJson
type class which doesn't have the memory retention problems ofaeson
library.For evaluation of fields to
WHNF
to allow parent object to be GCed which allows large fields that have already be serialised to be collected as well.Writes delegations to a single
delegations.jsonl
file which is a newline delimited JSON file. This file is streamed multiple times so that generation of thegenesis.json
file does not retain memory unnecessarily.This PR also changes the command to no longer generate payment keys and stake keys. If we want to have the ability optionally output these files, there is additional work to do.
Addresses #3938