Use fee rate estimation in block assembly #2859
Conversation
src/core/mempool.rs
Outdated
/// at most `max_updates` entries in the database before returning.
///
/// Returns `Ok(number_updated)` on success
pub fn estimate_tx_rates<CE: CostEstimator + ?Sized, CM: CostMetric + ?Sized>(
The mempool really can grow without bound. Is there a strategy for prioritizing transactions for fee reconsideration beyond just doing so in the order in which they were inserted into the mempool (which is what's implicitly happening in the SQL query here)?
Nope, there isn't; we could instead sort randomly (or by raw fee, though that has the potential to skew things).
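For illustration, a random order could be expressed directly in the selection query. Here's a sketch using rusqlite -- the table and column names mirror the `fee_estimates` schema discussed later in this thread, but the query itself is hypothetical, not code from this PR:

```rust
use rusqlite::Connection;

/// Hypothetical sketch: select the next batch of txids to (re-)estimate in
/// random order instead of insertion order, so late arrivals aren't starved.
fn txids_needing_estimates(conn: &Connection, max_updates: u32) -> rusqlite::Result<Vec<String>> {
    let mut stmt = conn.prepare(
        "SELECT m.txid FROM mempool m
         LEFT OUTER JOIN fee_estimates f ON m.txid = f.txid
         WHERE f.fee_rate IS NULL
         ORDER BY RANDOM()
         LIMIT ?1",
    )?;
    let rows = stmt.query_map(rusqlite::params![max_updates], |row| row.get(0))?;
    rows.collect()
}
```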
@@ -469,6 +474,9 @@ impl<'a> StacksMicroblockBuilder<'a> {
    let mut num_selected = 0;
    let deadline = get_epoch_time_ms() + (self.settings.max_miner_time_ms as u128);

    mem_pool.reset_last_known_nonces()?;
Won't this obliterate the known nonces for the entire mempool? Isn't that going to make microblock mining expensive, if we have to re-calculate nonces all the time? Or, is that bound by the subsequent call to `estimate_tx_rates` with `max_updates = 100`?
Yes, it will obliterate the nonces for the entire mempool. However, it doesn't make mining any more expensive (in terms of nonce lookups) than the current implementation, because nonce calculation is done lazily. When a transaction is considered for inclusion and the expected nonces aren't known, the origin and sponsor nonces are looked up and the database is updated with them; the transaction is then thrown back in for consideration, and the iterator polls for the next transaction to consider (which could be the one it just saw, if it was valid given the expected nonce). That is:

1. Get the transaction from the mempool with the highest fee rate whose expected nonce is either not known or matches the transaction's nonce.
2. If the transaction's nonce was not known, look it up and update the table; go back to 1.
3. Otherwise, try to add the transaction to the block, then bump the expected nonce; go back to 1.
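A minimal, self-contained sketch of that loop -- the types and the `chain_nonce` closure here are hypothetical stand-ins for the mempool iterator and the MARF nonce lookup, not this PR's actual API:

```rust
use std::collections::HashMap;

#[derive(Clone)]
struct CandidateTx {
    origin: String,
    nonce: u64,
    fee_rate: f64,
}

/// `chain_nonce` stands in for the (relatively expensive) chainstate lookup
/// of an account's next expected nonce.
fn assemble(mut candidates: Vec<CandidateTx>, chain_nonce: impl Fn(&str) -> u64) -> Vec<CandidateTx> {
    // Mirror the mempool's sort order: highest fee rate first.
    candidates.sort_by(|a, b| b.fee_rate.partial_cmp(&a.fee_rate).unwrap());
    let mut known: HashMap<String, u64> = HashMap::new(); // nonce cache, reset per block
    let mut block = Vec::new();
    loop {
        // Step 1: best-rate tx whose expected nonce is unknown or matches.
        let pick = candidates
            .iter()
            .position(|tx| known.get(&tx.origin).map_or(true, |n| *n == tx.nonce));
        let Some(i) = pick else { break };
        let origin = candidates[i].origin.clone();
        if known.contains_key(&origin) {
            // Step 3: nonce matches -- "mine" the tx and bump the expected nonce.
            known.insert(origin, candidates[i].nonce + 1);
            block.push(candidates.remove(i));
        } else {
            // Step 2: nonce unknown -- look it up, cache it, reconsider from the top.
            known.insert(origin, chain_nonce(&candidates[i].origin));
        }
    }
    block
}
```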
Is resetting the last known nonces for each new microblock a step to prevent any type of "unsynced" info? Each microblock builds on state that the miner has been keeping track of, so it seems like overkill to reset the nonce info here? I'm not fully clear on the reasoning behind doing this.
Yes -- the reason to reset the nonces is to prevent unsynced info between the mempool's view of nonces and the chainstate MARF. While it's true that a more aggressive cache-maintenance strategy could avoid this reset, the actual cost of recomputing the nonces is pretty low (and recomputing is already what the currently deployed approach does: there's no state in the miner that tracks nonces between iterations).
src/chainstate/stacks/miner.rs
Outdated
@@ -1464,6 +1495,9 @@ impl StacksBlockBuilder {
    let mut epoch_tx = builder.epoch_begin(&mut chainstate, burn_dbconn)?;
    builder.try_mine_tx(&mut epoch_tx, coinbase_tx)?;

    mempool.reset_last_known_nonces()?;
    mempool.estimate_tx_rates(estimator, metric, 100)?;
Because there can be far more than 100 transactions available for consideration, what's our strategy for helping miners find the maximum-fee-rate transactions, if we're only going to estimate fees for 100 of them per attempt at building an anchored block?
The estimates aren't wiped out between blocks -- given the growth rate of the mempool, in practice the miner would construct estimates for all the transactions in the mempool. Alternatively, the miner could do this iteratively: if there are more than 100 transactions without estimates in the mempool, the miner could try to estimate more of them after doing one pass of block assembly.
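If we went the iterative route, the shape would be something like this sketch (the two closures are hypothetical stand-ins, not this PR's API):

```rust
/// Hypothetical sketch: alternate between estimating another batch of mempool
/// transactions and attempting assembly, until the batch comes back short
/// (nothing left to estimate) or the block is done.
fn estimate_then_assemble(
    mut estimate_batch: impl FnMut(usize) -> usize, // returns #txs actually estimated
    mut assembly_pass: impl FnMut() -> bool,        // true when block full / deadline hit
) {
    loop {
        let updated = estimate_batch(100);
        let done = assembly_pass();
        if done || updated < 100 {
            break;
        }
    }
}
```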
So, I think this PR is on the right track, but I think there are a couple of architectural changes that would make this PR a lot stronger and less brittle. They are:

1. Take maintenance of the `fee_estimates` table out of the mempool. While the table needs to be in the same database as the mempool in order for the `LEFT OUTER JOIN`s to work, I think at a minimum the miner should be explicitly managing that table's lifetime -- including creating it, curating it, and deleting it. However, if we're ultimately aiming to make it so that transactions in the mempool have fee estimations for the purposes of both finding high-value transactions and making transaction relay decisions, then I think that makes a compelling case for having the chains coordinator (or some background worker thread the chains coordinator controls) responsible for building up the fee estimations. Then the estimates are available to the miner on mining nodes, and are also available to help the relayer decide if a transaction's fee rate is high enough to be added to the mempool (or, more likely, to replace a low-fee transaction in the mempool, since we will eventually want to bound the size of the mempool for the dual purposes of avoiding DDoS attacks and making mempool set-difference calculations tractable).
2. Separate fee estimation and nonce (re-)calculation from building blocks. The size of the mempool can grow without bound, and thus so can the amount of work a mining node needs to do to estimate fees. A fixed-size "update 100 transactions per block attempt" makes it hard for miners to do what they really want: prioritize the globally highest-fee-rate transactions for mining. To this end, I wonder if instead there should be a background process in the chains coordinator that asynchronously updates nonces and fees, so that when the miner wakes up, there's a better chance that the fee estimates for the highest-value transactions are already known. If we intend to use fee estimates to make relay decisions, then updating fee estimates outside of the miner is going to be required.
What do you think?
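To make the second suggestion concrete, the worker could be as simple as this sketch -- all names here are hypothetical, and `refresh` stands in for opening the mempool DB and calling `estimate_tx_rates`:

```rust
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};
use std::thread;
use std::time::Duration;

/// Hypothetical sketch of a chains-coordinator-owned estimation worker.
fn spawn_estimator_worker(
    stop: Arc<AtomicBool>,
    refresh: impl Fn() + Send + 'static, // e.g. mempool.estimate_tx_rates(estimator, metric, 100)
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        while !stop.load(Ordering::SeqCst) {
            refresh();
            thread::sleep(Duration::from_secs(5));
        }
    })
}
```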
However, the "ground truth" of cost estimation and fee estimation does in fact happen in the chains coordinator -- that's where the estimators receive updates from the block receipts.

The design of the nonce tracking in this PR is lazy specifically to avoid unbounded work -- it should perform almost identically to the current system with regards to nonce lookups. The problem with trying to persist and maintain nonce data between block assembly invocations is that this state is highly dependent on the Stacks fork (unlike the cost estimates, which are also fork-dependent, but less critically so), and so maintaining a SQL table with that information would be at once laborious and error-prone. It may be worthwhile to separate the fee rate estimation of transactions in the mempool from block assembly, but I'm not convinced yet; while updating just a subset per block attempt is naive and insufficient to span the mempool, an approach like iteratively estimating more transactions in the mempool while assembling the block would be best implemented during block assembly. Finally, for transactions that are not yet even possible to estimate (i.e., the called contract is new and has not been invoked before), the only way to provide an initial measurement to the estimator is by trying to evaluate the transaction in a block -- again, a job best done in block assembly (see the ...).

Ultimately, my answer here is "maybe", but I'm not convinced that such a design should be a requirement for a first version of a miner that uses cost estimation.
Yup, that all makes sense. But then, should the miner be responsible for carrying out fee estimations at all? It's unclear to me why the responsibility for curating the state of this table is spread between these two modules.
I'm in a position where I have to think about this now, because bounding the size of the mempool is an important part of my proposal for a mempool sync protocol (https://docs.google.com/document/d/1uHLUZEkzJJA8HtKfKVZmZIn5n7sY0X9j6gg90IyixYM/edit#). I'd welcome approaches that don't require bounding the size of the mempool, but supposing for a second that the bound must be in place, we're going to need a way to determine which transactions get evicted if the mempool is full. I'm not sure what criteria could be used besides the transaction's value to the miner; otherwise we run the risk of a DDoS attack whereby cheap transactions crowd out expensive transactions. Other blockchains use the fee rate -- lower-value transactions get evicted by higher-value transactions. But, other blockchains also have a well-defined fee estimation procedure -- Bitcoin has satoshis/byte, and Ethereum has wei/opcode. We may never have such an estimator, but that's what motivated me to ask about the possibility of exposing fee data to the relayer. Now, it's entirely possible that we always relay transactions if they pay more than some minimum absolute fee, but only store them if they furthermore clear a minimum fee rate. The mempool synchronization logic would still work, but it would simply be a little bit less efficient -- Alice could send Bob some transactions that he ultimately won't store. Do you think that would be alright?
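For concreteness, here's a sketch of that two-threshold policy -- every name and threshold is hypothetical; it's just the decision table from the paragraph above:

```rust
/// Hypothetical two-threshold admission rule: relay anything paying at least a
/// minimum absolute fee, but only store transactions that also clear a minimum
/// fee *rate* (evicting lower-rate ones if the mempool is full).
struct AdmissionPolicy {
    min_relay_fee: u64,      // absolute fee, in microSTX
    min_store_fee_rate: f64, // estimated fee per unit of cost
}

enum Admission {
    RelayAndStore,
    RelayOnly,
    Reject,
}

impl AdmissionPolicy {
    fn admit(&self, fee: u64, fee_rate: Option<f64>) -> Admission {
        if fee < self.min_relay_fee {
            return Admission::Reject;
        }
        match fee_rate {
            Some(rate) if rate >= self.min_store_fee_rate => Admission::RelayAndStore,
            Some(_) => Admission::RelayOnly,
            // No estimate yet: relaying but not storing is one option; storing
            // optimistically is another. That choice is exactly what's at issue.
            None => Admission::RelayOnly,
        }
    }
}
```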
The downside to this -- more generally, the downside of lazy estimation -- is that the probability that the highest-value transaction will be considered at all (let alone mined in the next block) is independent of the value of the other transactions. The highest-value transaction is not guaranteed to be considered at all if the mempool grows without bound -- it's possible that new transactions arrive frequently enough that it's simply unlikely (a less-than-50% chance) that the miner will "get around" to this high-value transaction within 256 blocks. This leads to a very frustrating user experience and miner experience under load, precisely when the miners really ought to be searching for the highest-value transactions and users who pay more really ought to be prioritized. Would it be unreasonable from a performance standpoint to simply analyze the transaction once it's stored to the mempool (or shortly afterwards)?
Because one module is only concerned with constructing ...
Nope, not unreasonable at all -- I can update the code to do that as well.
Updated so that estimates are applied on mempool admission. At this point, is the only blocker on this PR testing?
Force-pushed from 8762252 to c59290f
just had a few questions!
needs:
  - build-integration-image
strategy:
  matrix:
Suggested change:

fail-fast: false
matrix:
Added in 903fdda
src/net/download.rs
Outdated
@@ -2554,6 +2554,9 @@ pub mod test {
    use vm::costs::ExecutionCost;
    use vm::representations::*;

    use crate::cost_estimates::metrics::UnitMetric;
unused in file?
Removed in 903fdda
txid TEXT NOT NULL,
fee_rate NUMBER,
PRIMARY KEY (txid),
FOREIGN KEY (txid) REFERENCES mempool (txid) ON DELETE CASCADE ON UPDATE CASCADE
Just a note after reading about "cascade": it seems to me that `ON UPDATE CASCADE` is unnecessary because the txid is not really going to be updated in the parent table? Keep it if you want to, though.
Yeah, `ON UPDATE CASCADE` isn't necessary here (because there are no updates), but it also isn't costing us anything for the same reason. However, in case some future implementation change did, for whatever reason, want to update the txid, I wanted this statement here.
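A quick self-contained demonstration of the cascade behavior, using a trimmed-down, hypothetical version of the schema above:

```rust
use rusqlite::{params, Connection};

fn main() -> rusqlite::Result<()> {
    let conn = Connection::open_in_memory()?;
    // Foreign keys (and thus cascades) are off by default in SQLite.
    conn.execute_batch("PRAGMA foreign_keys = ON;")?;
    conn.execute_batch(
        "CREATE TABLE mempool (txid TEXT PRIMARY KEY);
         CREATE TABLE fee_estimates (
             txid TEXT NOT NULL,
             fee_rate NUMBER,
             PRIMARY KEY (txid),
             FOREIGN KEY (txid) REFERENCES mempool (txid)
                 ON DELETE CASCADE ON UPDATE CASCADE
         );",
    )?;
    conn.execute("INSERT INTO mempool VALUES ('tx1')", params![])?;
    conn.execute("INSERT INTO fee_estimates VALUES ('tx1', 1.5)", params![])?;
    // Deleting the mempool row drops the estimate row along with it.
    conn.execute("DELETE FROM mempool WHERE txid = 'tx1'", params![])?;
    let left: i64 =
        conn.query_row("SELECT COUNT(*) FROM fee_estimates", params![], |r| r.get(0))?;
    assert_eq!(left, 0);
    Ok(())
}
```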
@@ -393,18 +461,131 @@ impl MemPoolDB {
    conn.busy_handler(Some(tx_busy_handler))
        .map_err(db_error::SqliteError)?;

    conn.execute_batch("PRAGMA foreign_keys = ON;")?;
Why not move this statement to the SCHEMA?
It needs to be executed every time the connection opens -- this pragma is connection-local.
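In other words, something like this sketch has to run on every open (the function name here is hypothetical):

```rust
use rusqlite::{params, Connection};

/// PRAGMA foreign_keys is connection-local state in SQLite, so it must be
/// re-issued each time a connection is opened; it can't live in the schema.
fn open_mempool_conn(path: &str) -> rusqlite::Result<Connection> {
    let conn = Connection::open(path)?;
    conn.execute_batch("PRAGMA foreign_keys = ON;")?;
    // Reads back 1 on this connection only; a different connection to the
    // same database file would still report 0 until it runs the pragma.
    let on: i64 = conn.query_row("PRAGMA foreign_keys", params![], |r| r.get(0))?;
    debug_assert_eq!(on, 1);
    Ok(conn)
}
```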
Interesting. We don't do this elsewhere, but it appears that we should (#2885).
src/core/mempool.rs
Outdated
let (next_tx, update_estimate): (MemPoolTxInfo, bool) =
    match query_row(&self.db, select_estimate, rusqlite::NO_PARAMS)? {
        Some(tx) => (tx, false),
        None => match query_row(&self.db, select_no_estimate, rusqlite::NO_PARAMS)? {
Maybe switch to using a probabilistic approach for choosing when to select a tx from `select_estimate` vs `select_no_estimate`?
Added a configurable probability in 903fdda
);
let fee_rate_f64 = match estimator_result {
    Ok(x) => Some(x),
    Err(EstimatorError::NoEstimateAvailable) => continue,
To make sure we aren't retrying generating a fee estimate for a given block, could we add a column to the `fee_estimates` table to signal that an estimate generation attempt already occurred? A column like `estimate_generation_failed`. This would reset when the estimates are reset, and would be false by default. Then, we can select for txs where this column is false on line 617. In this error case, `EstimatorError::NoEstimateAvailable`, we could update that column.
I think it's better for the mempool to keep treating those failed estimates like transactions that need estimates. Once the estimator has received enough information to estimate those transactions, the mempool should try to estimate them again.
src/core/mempool.rs
Outdated
"txid" => %tx.txid(), | ||
"error" => ?e); | ||
return Err(MemPoolRejection::Other( | ||
"Failed to estimate mempool tx rate".into(), |
Include error `e` in the string? Do you think this error warrants a new enum? (Or maybe even just the use of an existing enum more descriptive than `Other`?)
Added in 903fdda
src/core/mempool.rs
Outdated
- for (i, (tx, (origin, sponsor))) in [good_tx, underfunded_tx]
+ for (i, (tx, (origin, sponsor))) in [good_tx]
can remove for loop since there is only a single tx now
Removed in 903fdda
src/chainstate/stacks/miner.rs
Outdated
.map(|addr| (addr.to_account_principal(), 100000000000))
.collect();

let mut peer_config = TestPeerConfig::new("build_anchored_incrementing_nonces", 2006, 2007);
These ports are used by `test_build_anchored_blocks_skip_too_expensive`.
Good catch -- fixed.
/// Probability percentage to consider a transaction which has not received a cost estimate.
/// That is, with x%, when picking the next transaction to include in a block, select one that
/// either failed to get a cost estimate or has not been estimated yet.
pub consider_no_estimate_tx_prob: u8,
What are the liveness consequences for a very small probability or a very large probability? I see you have `5` below -- is this 5%, i.e., a transaction that has no estimate or no known nonce has only a 1-in-20 chance of even being considered by the miner?
Well, it's not a 5% chance of being considered by the miner -- it's a 5% chance that, while there are transactions which do have cost estimates available, transactions without estimates will be considered first instead. So, while assembling a block, 5% of the time that a miner picks a transaction, it will first consider transactions without estimates. For new contracts, this means a `contract-call` to that new contract may not be considered right away for inclusion in a block, but it will eventually (5% of the times a transaction could be included in a block, that contract-call would be eligible -- and it would need to use its fee to outcompete other contract-calls without estimates).

If it's a very large probability, the inverse would be true: the first transactions considered for inclusion in every block would be the transactions that have never been seen before.
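A sketch of that selection rule -- the helper is hypothetical, and the real iterator also has to fall back to the other queue when the preferred one is empty:

```rust
/// With probability `prob`% (e.g. 5), consult the no-estimate query first;
/// otherwise start from the transactions that already have fee-rate estimates.
fn consider_no_estimate_first(prob: u8) -> bool {
    // u8 % 100 is slightly biased, but close enough for a sketch.
    (rand::random::<u8>() % 100) < prob
}
```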
@@ -955,6 +1195,13 @@ impl MemPoolDB {
        event_observer,
    )?;

    mempool_tx
Thanks for adding this!
/// because the fee goes up with the nonce.
/// (2) Discovery of nonce in the mempool iteration: this behavior allows the miner
///     to consider an origin's "next" transaction immediately. Prior behavior would
///     only do so after processing any other origin's transactions.
I'm having a hard time understanding how (or if) the new algorithm addresses fairness. Suppose Alice sends 25 transactions worth 1, 2, 3, 4, ..., 25 STX, and Bob sends 21 transactions, where the first transaction is worth 0.5 STX, but the last 20 are worth 20 STX each. Alice sent 325 STX in total to the miner, and Bob sent 400.5 STX in total. Will all of Bob's transactions get mined before Alice's, or will all of Alice's get mined before Bob's? Or will the miner interleave Alice's and Bob's transactions?
Based on my read, I think that with high probability, they'll be interleaved? When the transactions arrive, they'll either have an estimated fee off the bat, or they'll eventually get one as the miner considers transactions. Once Alice's and Bob's transactions have known fee rates, they'll be mined in highest-fee-rate order. So, if Alice's and Bob's transactions are otherwise identical in the CPU usage they have, we'd see Alice-1, Bob-1, Bob-2, Bob-3, ..., Bob-21, Alice-2, Alice-3, Alice-4, ... Alice-25, right?
I'm not positive it's a question of fairness, rather a question of how the miner orders transactions based on fees. In the current algorithm, the miner will consider transactions in order of each transaction's fee rate. Considering the fees paid by subsequent transactions is certainly a possibility, but it can get complicated: the miner would need to consider the aggregate fee rate of those transactions, but there are also cases where the miner is near the block limit and is only capable of including < 25 transactions, and so it may not be correct to consider the rate of those transactions together. I think that taking subsequent transactions into account in the ordering algorithm may make sense, but I consider it something that should be a refinement on top of this. That is, we should start with the simplest approach (considering transactions independently) and if miners want to improve on this, they can do so.
This LGTM @kantai. Just minor nits, but I did have one question about mining fairness I'd like to understand better, if you could reply before merging.
lgtm!
testnet/stacks-node/src/config.rs
Outdated
@@ -499,6 +503,9 @@ impl Config {
    subsequent_attempt_time_ms: miner
        .subsequent_attempt_time_ms
        .unwrap_or(miner_default_config.subsequent_attempt_time_ms),
    probability_no_estimate_tx: miner
`probability_no_estimate_tx` -> `probability_pick_no_estimate_tx` (or `probability_consider_no_estimate_tx`)
This draft PR alters the miner's block assembly logic to use the configured cost estimator and cost metric to produce fee rate estimates for candidate transactions in the mempool. These fee rate estimates are used as the sort order for transaction consideration.
This PR also implements the expected nonce calculation proposed in #2834.
There isn't a new end-to-end test case yet for the block assembly -- I'm still trying to understand the best way to test the end-to-end changes here (many of the existing e2e tests cover these code paths, but they don't specifically test for the desired behavior here).