Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache statistics and provide estimation methods #19474

Closed
wants to merge 17 commits into from

Conversation

zeripath
Copy link
Contributor

Currently whenever the prometheus metrics endpoint or /admin endpoint are viewed
the statistics are recalculated immediately - using COUNT rather than a less expensive
method.

This PR provides a mechanism to cache these statistics, avoids generating all of the
metrics on the admin page and provides an estimation method for the plain table counts.

Fix #17506

Signed-off-by: Andrew Thornton [email protected]

Currently whenever the prometheus metrics endpoint or `/admin` endpoint are viewed
the statistics are recalculated immediately - using COUNT rather than a less expensive
method.

This PR provides a mechanism to cache these statistics, avoids generating all of the
metrics on the admin page and provides an estimation method for the plain table counts.

Fix go-gitea#17506

Signed-off-by: Andrew Thornton <[email protected]>
@zeripath zeripath added the type/enhancement An improvement of existing functionality label Apr 23, 2022
@zeripath zeripath added this to the 1.17.0 milestone Apr 23, 2022
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Apr 24, 2022
@lunny
Copy link
Member

lunny commented Apr 24, 2022

Maybe it's database's responsibility to optimize the select count(*)?

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Apr 24, 2022

Maybe it's database's responsibility to optimize the select count(*)?

All (at least most) transactional database's COUNT(*) is a full table (index) scan.

Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except why the accurate number is necessary or makes sense.

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Apr 24, 2022
ensure that statistics are only calculated one at a time.

Signed-off-by: Andrew Thornton <[email protected]>
Signed-off-by: Andrew Thornton <[email protected]>
@wxiaoguang
Copy link
Contributor

Since there is a new related PR #19561 , I still think the mechanism should be simplified to providing estimated numbers without cache, with no setting option.

Complexity is the enemy if it's not a must. If the accurate numbers don't help admins really, then it's not necessary to make it complex. It's not ideal to introduce more and more (not-that-useful) options, now there were 4 new options, if there are other similar requirements in future then there will be 6 or 8 options which doesn't help users really.

And simplified code would help other PRs like #19561 and maybe more.

@zeripath
Copy link
Contributor Author

zeripath commented May 4, 2022

I must absolutely disagree about the cache. We should be caching these statistics.

It's insanely wasteful to be repeatedly recalculating these results everytime you go to the admin dashboard.

I also disagree about whether we should be simply providing estimates or not. If drop to Estimates by default we will IMMEDIATELY receive complaints from people because the numbers are "incorrect". It's simply not worth the hassle.

@zeripath
Copy link
Contributor Author

zeripath commented May 4, 2022

In terms of providing estimates the non-simple counts, e.g. the UserCount in postgres we can use:

EXPLAIN SELECT COUNT(*) FROM `user` WHERE `type`=0;

and postgres will return something like:

                            QUERY PLAN                            
------------------------------------------------------------------
 Aggregate  (cost=10.25..10.26 rows=1 width=8)
   ->  Seq Scan on user  (cost=0.00..10.20 rows=20 width=0)

Extracting the rows=... from this will give a reasonable estimate of the number of rows.

MySQL also has EXPLAIN and will return the numbers of rows in a rows column but I'm not sure whether it will provide a quicker estimate.

For MSSQL we likely need to add a NONCLUSTERED COLUMNSTORE INDEX on to the appropriate columns.

@wxiaoguang
Copy link
Contributor

. If drop to Estimates by default we will IMMEDIATELY receive complaints from people because the numbers are "incorrect". It's simply not worth the hassle.

I disagree that we should make code more than complex just because of some non-sense complaints.

There are 2 kinds of complaints, one is that the user is really hurt, one is that the user doesn't know what they want or what they are doing.

For this case, I believe it's kind two (users do not know what they want). Do users really know what the accurate action/comment number mean? Will they be really hurt by estimated numbers to make wrong decisions?

And if you use cache, when the underlying number keeps changing (like comments/actions), do users really get accurate numbers? They are already contradictions, because in they end they still read the estimated numbers instead of the accurate ones.


Could there be some compromise plans that for small tables like user always use accurate numbers, for large and non-sense tables like comment/action just use estimated numbers? Then no more option, no complex code and no complaint.

@zeripath
Copy link
Contributor Author

zeripath commented May 6, 2022

OK I've dropped the ESTIMATE_COUNTS option.

You can field and answer the issues when they complain - because they will.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented May 7, 2022

OK I've dropped the ESTIMATE_COUNTS option.

You can field and answer the issues when they complain - because they will.

Could we make a poll about how to continue?

  • Introduce new options ESTIMATE_COUNTS=true/false, STATISTICS_TTL=5m (vote 🚀 )

    • totally 4 new options for metrics/dashboard
  • (vs) Introduce no option, just tell users some numbers are estimated (vote: 🎉 ), what I mean is:

    • there are too many options already, if a new option doesn't really help users, then we should avoid it.
    • write the StatisticsTtl as a const to 5m in code, if there is any real requirement, use a new simple PR to make it an option next time. Do not pay the not-happened-yet cost for future.
    • for small tables like user / repository, Gitea can provide accurate numbers. Because on the user/repository list page, these tables are always counted, so accurate counting is fine.
    • for large tables like attachment / action / comment, estimated numbers are enough because accurate numbers helps nothing, when the table becomes large, it only wastes the database's performance.
    • on the dashboard UI, tell users the some numbers are estimated, by using a unicode char or .
    • I could take all the work I described (I could work on this PR directly if you don't mind), and I will answer user questions if they have to see their real requirments.
  • (or) Neutral, either is fine, and will approve either ( vote 👀 )

Indeed, my personal opinion might be wrong (and it's not a blocker). If nobody else likes any of the idea, such improvement can not get merged.

@lafriks
Copy link
Member

lafriks commented May 8, 2022

Can't we just cache values and reset cache on specific events (user created/deleted etc)?

models/db/context.go Outdated Show resolved Hide resolved
modules/metrics/statistics.go Outdated Show resolved Hide resolved
modules/metrics/statistics.go Outdated Show resolved Hide resolved
Copy link
Member

@delvh delvh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine apart from the entries missing in custom/conf/app.example.ini.
Also, what's the reason to estimate only some of the statistics, but not all of them?

@go-gitea go-gitea deleted a comment from codecov-commenter May 22, 2022
@lunny lunny modified the milestones: 1.17.0, 1.18.0 Jun 16, 2022
@lunny lunny modified the milestones: 1.18.0, 1.19.0 Oct 17, 2022
@lunny lunny modified the milestones: 1.19.0, 1.20.0 Feb 2, 2023
@wxiaoguang
Copy link
Contributor

After long time, nobody complains that "the action is not accurate".

I have 100% confidence that my conclusion is right: no need to do more ticks for these large tables.

In the following PR, I will do:

  1. Remove the "estimated action number"
  2. Move the statistics to a separate page to avoid it affecting the default admin dashboard page's loading time.

So this PR could be closed.

@wxiaoguang wxiaoguang closed this May 2, 2023
@GiteaBot GiteaBot removed this from the 1.20.0 milestone May 2, 2023
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Aug 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 1 This PR needs approval from one additional maintainer to be merged. type/enhancement An improvement of existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve admin dashboard summary display to prevent timeout
6 participants