VCenter Advanced performance data charts and Sexi Graph charts do not match #349

Open
torck82 opened this issue Jul 18, 2023 · 14 comments

@torck82

torck82 commented Jul 18, 2023

Hi, I'm comparing the charts for all the statuses, and the data does not match. For example, comparing CPU usage in percent in vCenter with CPU utilization in SexiGraf: at 8:50 am vCenter shows a CPU spike at 90 percent, but SexiGraf reports only 61.5 percent.

@rschitz
Member

rschitz commented Jul 18, 2023

Hi, what time range are you looking at?

@torck82
Author

torck82 commented Jul 18, 2023 via email

@rschitz
Member

rschitz commented Jul 18, 2023

In vCenter's last-hour view you get 20-second data granularity. In SexiGraf we collect that data every 5 minutes and store the median of those 15 samples (15 × 20 s = 5 min) to optimize storage usage. After 24 hours we start reducing the granularity, for storage usage and performance, like so: 5m:24h,10m:48h,60m:7d,240m:30d,720m:90d,2880m:1y,5760m:2y,17280m:5y
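To illustrate why a median over a 5-minute window can hide a spike that a 20-second chart shows, here is a minimal sketch. The sample values are made up for illustration (they are not data from this issue), and `summarize_window` is a hypothetical helper, not part of SexiGraf.

```python
from statistics import median


def summarize_window(samples):
    """Summarize one 5-minute window made of 20-second samples.

    Returns both the median (what gets stored, per the comment above)
    and the peak (what a 20-second chart would let you see).
    """
    return {"median": median(samples), "peak": max(samples)}


# Hypothetical CPU usage (%) for fifteen 20-second samples with one short spike.
window = [58, 60, 61, 62, 59, 61.5, 90, 63, 60, 62, 61, 64, 61.5, 62, 63]

summary = summarize_window(window)
print(summary)
```

With a single outlier among 15 samples, the median stays close to the typical value while the peak reflects the spike, which is consistent with the kind of gap reported in this issue.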

@torck82
Author

torck82 commented Jul 18, 2023 via email

@rschitz
Member

rschitz commented Jul 18, 2023

Only for a few metrics, because we also collect quickstats (which are basically the same thing as the median) for even faster processing, so that wouldn't really be helpful in your situation. For troubleshooting you should really use esxtop, IMHO.

@torck82
Author

torck82 commented Jul 18, 2023 via email

@rschitz
Member

rschitz commented Jul 18, 2023

You're welcome. Do you have a specific issue with a VM? Maybe I can help.

@torck82
Author

torck82 commented Jul 18, 2023 via email

@rschitz
Member

rschitz commented Jul 18, 2023

It can't be more real-time than 20 seconds anyway, since that's the minimum the API can provide; below that you need esxtop. Having worked in many big VDI environments, I'd say that even with 5-minute averages of CPU usage, ready and demand you should have more than enough to tell if something is wrong (the vCPU/pCPU ratio also helps); otherwise it's trickier than simple over-commitment.
I'd be interested in an example that proves me wrong :)
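As a rough sketch of the two indicators mentioned above, the snippet below converts a CPU ready summation value (milliseconds accumulated over one sampling interval) into a percentage and computes a vCPU/pCPU ratio. The function names and input numbers are hypothetical; the conversion assumes the commonly documented formula of ready time divided by interval length, and no alert thresholds are implied.

```python
def cpu_ready_percent(ready_ms, interval_s):
    """Convert a CPU ready summation (ms) into a percentage of the interval.

    interval_s is the sampling interval in seconds, e.g. 20 for vCenter
    real-time data or 300 for a 5-minute rollup.
    """
    return 100.0 * ready_ms / (interval_s * 1000.0)


def vcpu_pcpu_ratio(total_vcpus, physical_cores):
    """Ratio of vCPUs allocated to VMs versus physical cores on the host."""
    return total_vcpus / physical_cores


# Hypothetical values, for illustration only.
print(cpu_ready_percent(ready_ms=1000, interval_s=20))
print(vcpu_pcpu_ratio(total_vcpus=96, physical_cores=32))
```

Note that the same number of ready milliseconds means a very different percentage depending on whether it was accumulated over 20 seconds or 5 minutes, so the interval has to match the data source.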

@torck82
Author

torck82 commented Jul 18, 2023 via email

@rschitz
Member

rschitz commented Jul 19, 2023

That would require a lot of changes and of course wouldn't be possible overnight, but you're not the first one to ask, so I'm really starting to consider it.
What would be the minimum retention for those 20-second samples before we start reducing the granularity? 2 days, 1 week?
Also, how many VMs would the appliance collect data from?
Thanks for your feedback.

@torck82
Author

torck82 commented Jul 19, 2023 via email

@rschitz
Member

rschitz commented Jul 20, 2023

According to https://m30m.github.io/whisper-calculator/ that would be 46.89 gigabytes for VMs only, instead of 600.81 megabytes. For reference, we deployed a single SexiGraf appliance in a 220K+ VM environment with 50+ vCenters, and the total storage consumption was ~100 GB :)
I'll try it on CPU- and latency-related metrics and let you know.
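For anyone who wants to redo this kind of estimate by hand, here is a minimal sketch of a per-metric Whisper file size calculation. It assumes the usual Whisper layout (16-byte file header, 12 bytes of header per archive, 12 bytes per data point) and a 365-day year; the helper names are hypothetical. It does not reproduce the totals quoted above, since the VM count and the proposed 20-second retention are not stated in this thread.

```python
# Seconds per unit suffix; a year is assumed to be 365 days.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def parse_archive(spec):
    """Parse 'precision:retention' (e.g. '5m:24h') into (step_seconds, points)."""
    precision, retention = spec.split(":")
    step = int(precision[:-1]) * UNIT_SECONDS[precision[-1]]
    span = int(retention[:-1]) * UNIT_SECONDS[retention[-1]]
    return step, span // step


def whisper_file_size(retentions):
    """Estimate the size in bytes of one Whisper file for a retention string."""
    archives = [parse_archive(spec) for spec in retentions.split(",")]
    total_points = sum(points for _, points in archives)
    # Assumed layout: file header + per-archive headers + data points.
    return 16 + 12 * len(archives) + 12 * total_points


# Retention schedule quoted earlier in this thread.
CURRENT = "5m:24h,10m:48h,60m:7d,240m:30d,720m:90d,2880m:1y,5760m:2y,17280m:5y"
print(whisper_file_size(CURRENT))
```

Under these assumptions the current schedule comes to 19552 bytes per metric file. Prepending a 20-second archive (for example `20s:2d`, a hypothetical choice) and multiplying by the number of metrics gives a rough idea of the extra storage.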

@torck82
Author

torck82 commented Jul 20, 2023 via email
