
query cost, gas, and fees #410

Open
9 tasks
jchappelow opened this issue Dec 4, 2023 · 4 comments

@jchappelow
Member

jchappelow commented Dec 4, 2023

See https://github.com/kwilteam/kwil-db/wiki/Kwil-v0.9 for v0.9 goals w.r.t. cost and pricing.
This issue breaks down into a few main hurdles:

  • Creating a logical query plan from SQL
  • Given some table statistics, create a virtual plan that can compute a unit-less cost estimate.
  • Collecting and maintaining statistics (e.g. row count, MCVs, min/max/range, etc.). Full scan vs. incremental. Incremental updates using changeset info.
  • For procedures:
    • The wiki for 0.9 says "Therefore, we should use Postgres extensions to allow the procedural language to run with an interpreter that can track the gas while it executes." This option refers to a PostgreSQL extension that takes the place of PL/pgSQL, and we would be responsible for all flow control. @brennanjl Would you expand on the likely mechanics w.r.t. the PL AST and implementing an interpreter? This sounds compelling, but I had not even considered this all-encompassing approach.
    • A different approach might keep using PL/pgSQL but would use an extension with functions to compute the cost of each query just before executing it. This may also require PostgreSQL tricks such as triggers, an audit table, or a novel use of an FDW (a foreign data wrapper, a special standard extension).
    • Other requirements: the engine needs to know the total gas used, and PostgreSQL (or at least the procedure) needs to be able to cancel execution in the event that the gas limit is reached, ...
  • In the application (abci/txApp/engine), redesign execution to involve progressive reduction of an allotted gas amount (either from the tx or the user's balance).
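The last point, progressive reduction of an allotted gas amount, might look something like the sketch below. All names here (GasMeter, ErrOutOfGas) are illustrative, not the actual engine API; the engine would call Consume with the planner's cost estimate just before executing each statement.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrOutOfGas = errors.New("out of gas")

// GasMeter tracks an allotted gas amount (from the tx's fee limit or
// the user's balance) that is progressively reduced as statements in
// a transaction execute.
type GasMeter struct {
	limit, used uint64
}

func NewGasMeter(limit uint64) *GasMeter { return &GasMeter{limit: limit} }

// Consume deducts the estimated cost of the next statement, failing
// if the allotment would be exhausted.
func (m *GasMeter) Consume(cost uint64) error {
	if cost > m.limit-m.used {
		return ErrOutOfGas
	}
	m.used += cost
	return nil
}

func (m *GasMeter) Used() uint64      { return m.used }
func (m *GasMeter) Remaining() uint64 { return m.limit - m.used }

func main() {
	m := NewGasMeter(100)
	fmt.Println(m.Consume(60), m.Remaining()) // consumes 60 gas, 40 remains
	fmt.Println(m.Consume(50))                // exceeds the remaining allotment
}
```

On a Consume failure, the engine would abort the action and still charge the consumed gas, which is why cancellation support inside PostgreSQL matters for procedures.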

tasks

  • update query_planner for insert/update/delete. Currently only select is supported. In progress.
  • support CTEs and subqueries. buildTableSource needs to consider these from the fields of PlannerContext, e.g. GetCTE
  • complete costmodel functions for computing cost from a LogicalPlan, similar to the virtual_plan pkg. This is now working, but not completely accurate. Selectivity needs to be computed to reduce cost. Expression cost is now computed. Accessing the scan/from plan's statistics in child plans is ready and will inform selectivity computations, which matter most in a join, where one table's size can become small due to a filter. In progress.
  • implement a Catalog (GetDataSourcer for qp.NewPlanner) at the level of the engine, with the actual DB and any stored statistics to avoid full rescan. In progress.
  • wire up engine and txApp/abci with new progressive transaction cost from the logical plans.
  • collect and maintain statistics
    • changesets at end can update ground truth stats
    • between statements (either within an action or in different chain txns), stats changes require an audit table
  • translate stats solutions inside of procedures
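As a rough illustration of why the selectivity work above matters most for joins, consider the toy cost functions below. The names and numbers are hypothetical, not the actual costmodel or Catalog APIs; a nested-loop cost is used only because it makes the effect obvious.

```go
package main

import "fmt"

// TableStats is the kind of per-table ground truth a Catalog might
// serve to the planner (hypothetical, not the actual Catalog API).
type TableStats struct {
	RowCount int64
}

// scanCost: a filtered scan reads every row, but only the selected
// fraction of rows flows upward to parent plan nodes.
func scanCost(s TableStats, selectivity float64) (cost, outRows float64) {
	cost = float64(s.RowCount)
	outRows = float64(s.RowCount) * selectivity
	return
}

// nestedLoopJoinCost pairs every surviving left row with every right
// row, so shrinking one input shrinks the whole product.
func nestedLoopJoinCost(leftRows, rightRows float64) float64 {
	return leftRows * rightRows
}

func main() {
	users := TableStats{RowCount: 1_000_000}
	orders := TableStats{RowCount: 10_000}

	// Without accounting for a filter, the join looks enormous:
	// 1e6 * 1e4 = 1e10 row pairs.
	_, uAll := scanCost(users, 1.0)
	_, oAll := scanCost(orders, 1.0)
	fmt.Println(nestedLoopJoinCost(uAll, oAll))

	// A selective filter on users (~1,000 surviving rows) makes the
	// same join several orders of magnitude cheaper.
	_, uFiltered := scanCost(users, 0.001)
	fmt.Println(nestedLoopJoinCost(uFiltered, oAll))
}
```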

Work branch: https://github.com/jchappelow/kwil-db/commits/cost-stats/

IMPORTANT NOTE: Development will take an incremental approach. We wish to have functional components along the way, even if they are simplistic and contain known limitations. For example, there may be a point where we've wired up the app/engine changes, but we run with a cost system that involves no live updates of statistics, only query-based cost. Similarly, an intermediate cost system may use some statistics, perhaps just row count, with only some of the logical plans working (e.g. no proper update or join costs, only a constant penalty for these). Finally, fitting a Postgres extension to support procedures should come relatively late, so that the general approach is solidified and we won't have to rewrite the extension multiple times.

Note that some statistics cannot be updated without a full scan. There may be some ground truth refresh at the end of a block, or action, etc.
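A sketch of why only some statistics are incrementally maintainable from changesets (illustrative types, not the actual stats package): inserts can update row count and min/max exactly, but deleting a boundary value leaves only a stale, conservative bound until a rescan.

```go
package main

import "fmt"

// ColumnStats for a single integer column (illustrative only).
type ColumnStats struct {
	RowCount int64
	Min, Max int64
	Exact    bool // whether Min/Max are exact rather than stale bounds
}

// ApplyChangeset incrementally updates what it can. Inserts widen
// min/max exactly, but a delete that removes the current min or max
// invalidates the bound: recovering the true value requires a full
// scan, hence the periodic ground-truth refresh (e.g. end of block).
func (s *ColumnStats) ApplyChangeset(inserts, deletes []int64) {
	for _, v := range inserts {
		if s.RowCount == 0 || v < s.Min {
			s.Min = v
		}
		if s.RowCount == 0 || v > s.Max {
			s.Max = v
		}
		s.RowCount++
	}
	for _, v := range deletes {
		s.RowCount--
		if v == s.Min || v == s.Max {
			s.Exact = false // bounds stale until a rescan refreshes them
		}
	}
}

func main() {
	s := ColumnStats{Exact: true}
	s.ApplyChangeset([]int64{5, 1, 9}, nil)
	fmt.Println(s.RowCount, s.Min, s.Max, s.Exact) // exact after inserts
	s.ApplyChangeset(nil, []int64{9})
	fmt.Println(s.RowCount, s.Exact) // bounds now stale
}
```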

The original issue, which was more speculative, is below:


There are a few things to decide upon regarding gas and fees in Kwil.

First, there's a funny detail regarding the txn fee in all of our spending code where if a transaction specifies a fee X > price, the actual amount spent from the account's balance is just price rather than X. That essentially makes the Fee field of the transaction body more like the fee limit, but it is not documented this way. I feel like the transaction should reduce account balance by the given fee. Or is there a reason to interpret this as a limit, like removing a painful client footgun?
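The two interpretations differ by one line. A minimal sketch (function names and the error handling here are illustrative, not the actual spending code):

```go
package main

import (
	"errors"
	"fmt"
)

// spendAsLimit models the current behavior: Fee is the most the user
// will pay, but only the computed price is actually deducted.
func spendAsLimit(balance, fee, price uint64) (uint64, error) {
	if fee < price || balance < price {
		return balance, errors.New("insufficient fee or balance")
	}
	return balance - price, nil
}

// spendAsFee models the alternative reading: the account is reduced
// by exactly the fee the transaction declared.
func spendAsFee(balance, fee, price uint64) (uint64, error) {
	if fee < price || balance < fee {
		return balance, errors.New("insufficient fee or balance")
	}
	return balance - fee, nil
}

func main() {
	b1, _ := spendAsLimit(100, 10, 7) // 93 remains: only price spent
	b2, _ := spendAsFee(100, 10, 7)   // 90 remains: full fee spent
	fmt.Println(b1, b2)
}
```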

Another thing is gas vs. fee. We have an implicit gas cost of unity i.e. 1 atom (a kwil?) per gas. The transaction has no way to specify a fee rate, which is typically how transactions would be prioritized on a busy chain, just an absolute fee or fee limit depending on how we want to interpret this field.

Lastly, every abci module returns an "ExecutionResponse" with GasUsed left at zero, only a Fee set. The event logs are confusing since they say gasUsed is zero despite a fee being paid based on our hardcoded "prices". This is sorta OK since we are handling execution costs (gas!) at the app level, and keeping cometBFT blind to it. So, do we remove this "gas used" info and keep cometbft in the dark, or do we formalize what we mean by gas and fees?

I think it's too late for big decisions for 0.6.0, but we should discuss these points anyway.

@brennanjl
Collaborator

Missed this, will circle back after release with some thoughts

@brennanjl
Collaborator

I've been thinking a lot about gas and fees. Monetization is a big focus for us this year, and so in addition to block gas limits, it is becoming a big priority.

The intent of the transaction Fee was essentially to have a fee limit. The network would then simply subtract the total it spent, assuming the user consented to spending that much. This fee system was one of the very first parts of Kwil, and therefore might be overly simplistic, so a big emphasis for our next version (v0.8) could be how we overhaul this.

No real suggestions here, but we don't have any strict precedents / legacy to adhere to, so we can design a solution as we wish.

@jchappelow
Member Author

I think this is bumped from 0.8, but perhaps it could tentatively find its way into 0.8 with some initial iteration? I'll put it in the 0.8.0 milestone so we can make the plan definitive and remove it or not.

@jchappelow jchappelow added this to the v0.8.0 milestone May 2, 2024
@jchappelow jchappelow removed this from the v0.8.0 milestone May 28, 2024
@jchappelow jchappelow added this to the v0.9 milestone Jun 24, 2024
@jchappelow jchappelow self-assigned this Jun 24, 2024
@jchappelow jchappelow changed the title gas and fees query cost, gas, and fees Jul 14, 2024
@jchappelow jchappelow pinned this issue Jul 17, 2024
@jchappelow
Member Author

jchappelow commented Jul 17, 2024

Plan for mitigating cost estimate exploits and gross inaccuracy, which are both inevitable.

  • Even the best cost estimates (i.e. PostgreSQL's own internal costs) are often very poor indicators of the time and compute resources required to execute a query.
  • The primary purpose of a cost is to avoid user exploits and spam that could cripple a node and/or deny legitimate transactions. This may not even be intentional, as it's quite easy to accidentally write a very costly query.
  • A secondary purpose is to have a fair cost for utilizing a Kwil network, even assuming all actors are honest, which adds value to a linked token.

We will do our best at achieving these goals with a cost computed for each query. However, it will be almost impossible to do this without any loopholes, bugs, or other inaccurate cost constants used in the computation. Because the cost value is part of consensus (gas is deducted from account balances), we cannot simply tweak and patch it repeatedly as we wish. Changes that require frequent network migrations or coordinated upgrades for rule changes are a poor outcome.

To be robust to issues where the actual query cost (i.e. execution time in postgres) is much higher than our computed cost estimate (i.e. it would exceed a block's max gas and/or an account balance if the cost were more accurate), I propose the following:

  • Unlike an RPC that may be canceled if it takes too long, we cannot simply cancel the execution of a blockchain transaction once it is in a validator-approved block.
  • Tendermint consensus permits executing transactions at various stages ahead of FinalizeBlock, which is when a block's transactions are accepted and must be executed deterministically.
  • The block proposer should attempt to execute their proposed block prior to actually proposing it (in PrepareProposal). At this stage, a transaction that takes too much time to execute may be cancelled and excluded from the block. This method is permitted to be non-deterministic.
  • The other validators, who approve or reject the proposed block (in ProcessProposal) before the block becomes canonical and is broadcast to all nodes, should reject a block containing a transaction that takes much longer to execute than its cost would suggest. The execution would be isolated in a transaction that is either rolled back or held as a candidate state to be committed later. This pre-execution is suggested in the docs. However, the docs say ProcessProposal must be deterministic, which this time-based cancellation would not be.

Despite the determinism requirement for ProcessProposal, would this actually work? Say a handful of validators with slower hardware rejected the block while others accepted it. Other than going through multiple rounds to make it into the next block, is this worse in terms of liveness than taking, say, many minutes or hours to execute a finalized block with a poorly priced query? As long as the result of the tx execution is deterministic in the case where the proposal passes, I don't see an issue. The result of ProcessProposal is binary (accept or reject), and disagreement should be resolvable by the supermajority, unlike an apphash mismatch, for example. Is my thinking about this method's role in Tendermint consensus incorrect?

A basic simulation with random rejections in ProcessProposal should indicate the actual liveness consequences. However, the whole point of Tendermint consensus is to be BFT, so I believe the network should not fail as long as the rules used in proposal preparation and validation agree.
