Worktable workfile stats #8587

Merged
merged 12 commits into from
Jul 27, 2021

@m82labs m82labs commented Dec 17, 2020

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

This is a small PR to add some new stats to help in troubleshooting tempdb contention from sort/hash spills. I have added two new metrics to the sqlServerPerformanceCounters query:

  • Workfiles Created/sec
  • Worktables Created/sec

I also noticed that some metrics from the V2 query seem to have been omitted when the queries were split out, so I added them back:

  • Distributed Query
  • DTC calls
  • Query Store CPU usage
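For anyone who wants to look at the two new counters by hand, here is a minimal sketch against `sys.dm_os_performance_counters`. It is not the query the plugin runs. The "/sec" counters in that DMV are cumulative, so the sketch takes two samples and divides the difference by the interval. The 10-second interval and the temp table name `#sample1` are illustrative assumptions.

```sql
-- First sample of the two cumulative counters.
-- counter_name is a padded NCHAR column, so trim it for readability.
SELECT RTRIM(counter_name) AS counter_name, cntr_value
INTO #sample1
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Workfiles Created/sec', N'Worktables Created/sec');

-- Illustrative sampling interval (assumption): 10 seconds.
WAITFOR DELAY '00:00:10';

-- Second sample; the difference divided by the interval is the rate.
SELECT
     s1.counter_name
    ,(pc.cntr_value - s1.cntr_value) / 10.0 AS per_second
FROM sys.dm_os_performance_counters AS pc
INNER JOIN #sample1 AS s1
    ON RTRIM(pc.counter_name) = s1.counter_name
WHERE pc.counter_name IN (N'Workfiles Created/sec', N'Worktables Created/sec');

DROP TABLE #sample1;
```

A sustained high rate on either counter while tempdb is under pressure would be consistent with the sort/hash spill scenario described above.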


m82labs commented Dec 17, 2020

I might need a review from @denzilribeiro on this one. I don't think the metrics above were removed on purpose, but I don't honestly know.


@denzilribeiro denzilribeiro left a comment

Mark, can you add these to the sqlAzureDBPerformanceCounters and sqlAzureMIPerformanceCounters as well? Did these get dropped along the way?

@sjwang90 sjwang90 added area/sqlserver feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin labels Dec 18, 2020

m82labs commented Dec 18, 2020

I'll add them. No, the worktable/workfile stats were never there, but the Query Store CPU and DTC-related metrics were at one point.


m82labs commented Dec 18, 2020

All set @denzilribeiro.


@telegraf-tiger telegraf-tiger bot left a comment


m82labs commented Apr 26, 2021

I haven't contributed in a while. Will this automatically get merged at some point, or do I need to do something else? Thanks!


m82labs commented May 6, 2021

@denzilribeiro I also noticed all the resource governor metrics were removed when the queries were split out. Do you know if the normal RG DMVs exist on all versions (Azure SQL DB, Managed instances, etc.)?

Specifically in the perf counter query I was inserting some "fake" metrics via a query:

SELECT
	 ''SQLServer:Workload Group Stats'' AS [object]
	,[counter]
	,[instance]
	,CAST(vs.[value] AS BIGINT) AS [value]
	,1
FROM
(
    SELECT
		 rgwg.name AS instance
		,rgwg.total_request_count AS [Request Count]
		,rgwg.total_queued_request_count AS [Queued Request Count]
		,rgwg.total_cpu_limit_violation_count AS [CPU Limit Violation Count]
		,rgwg.total_cpu_usage_ms AS [CPU Usage (time)]
		,rgwg.total_lock_wait_count AS [Lock Wait Count]
		,rgwg.total_lock_wait_time_ms AS [Lock Wait Time]
		,rgwg.total_reduced_memgrant_count AS [Reduced Memory Grant Count]
		' + @Columns + N'
	FROM sys.[dm_resource_governor_workload_groups] AS rgwg
	INNER JOIN sys.[dm_resource_governor_resource_pools] AS rgrp
		ON rgwg.[pool_id] = rgrp.[pool_id]
) AS rg
UNPIVOT (
    value FOR counter IN ( [Request Count], [Queued Request Count], [CPU Limit Violation Count], [CPU Usage (time)], [Lock Wait Count], [Lock Wait Time], [Reduced Memory Grant Count] ' + @PivotColumns + N')
) AS vs'

INSERT INTO @PCounters
EXEC( @SqlStatement )

I can add it to this same PR unless there were compatibility issues. 
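The snippet above is a fragment of a dynamic SQL string, so it does not run on its own. As a rough, self-contained sketch of the same UNPIVOT idea, the query below drops the dynamic `@Columns` / `@PivotColumns` pieces and keeps only three of the columns. The choice of columns and the explicit `CAST`s are my assumptions for illustration, not part of the original query.

```sql
-- Static sketch of the "fake counter" UNPIVOT shown above.
-- Every unpivoted column is cast to BIGINT so that all of them
-- share one type, which UNPIVOT requires.
SELECT
     N'SQLServer:Workload Group Stats' AS [object]
    ,vs.[counter]
    ,vs.[instance]
    ,vs.[value]
FROM
(
    SELECT
         rgwg.name AS [instance]
        ,CAST(rgwg.total_request_count AS BIGINT) AS [Request Count]
        ,CAST(rgwg.total_queued_request_count AS BIGINT) AS [Queued Request Count]
        ,CAST(rgwg.total_cpu_usage_ms AS BIGINT) AS [CPU Usage (time)]
    FROM sys.dm_resource_governor_workload_groups AS rgwg
) AS rg
UNPIVOT (
    [value] FOR [counter] IN ([Request Count], [Queued Request Count], [CPU Usage (time)])
) AS vs;
```

Each workload group yields one row per unpivoted column, in the same object/counter/instance/value shape as the real performance counter rows.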

@denzilribeiro

@m82labs the semantics of RG are different on SQL DB (MI is the same as on-prem).
Why not take what you can from the actual counters rather than simulated ones, i.e. the perfmon Workload Group Stats object? It can be confusing otherwise, right?
https://docs.microsoft.com/en-us/sql/relational-databases/performance-monitor/sql-server-workload-group-stats-object?view=sql-server-ver15
If it is just for backward compat, then no need to add for MI.
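A minimal sketch of reading the real Workload Group Stats counters straight from `sys.dm_os_performance_counters`, rather than simulating them from the resource governor DMVs. The `LIKE` filter is an assumption made so the query matches both default and named instances. As noted above, the semantics differ on SQL DB, so the output may not be comparable there.

```sql
-- Real perfmon counters for the Workload Group Stats object.
-- The name columns are padded NCHAR, so trim them for readability.
SELECT
     RTRIM(object_name)   AS [object]
    ,RTRIM(counter_name)  AS [counter]
    ,RTRIM(instance_name) AS [instance]  -- the workload group name
    ,cntr_value           AS [value]
FROM sys.dm_os_performance_counters
WHERE object_name LIKE N'%Workload Group Stats%';
```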


@denzilribeiro denzilribeiro left a comment

Can you remove these two from the SQL DB portion (given no DTC support)? Sorry I missed that.

,'Distributed Query'
,'DTC calls'
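The PR handles this by keeping separate queries per platform. Purely as an illustration of the same rule in a single script, the sketch below builds the counter list at run time and leaves out the two DTC-related counters when `SERVERPROPERTY('EngineEdition')` returns 5 (Azure SQL Database). The table variable `@Counters` and the short counter list are hypothetical and are not taken from the plugin.

```sql
-- Hypothetical counter list; the plugin's real list is much longer.
DECLARE @Counters TABLE (counter_name NVARCHAR(128) NOT NULL);

INSERT INTO @Counters (counter_name)
VALUES (N'Workfiles Created/sec'), (N'Worktables Created/sec'), (N'Query Store CPU usage');

-- EngineEdition 5 is Azure SQL Database, which has no DTC support,
-- so the two DTC-related counters are only added for other editions.
IF CAST(SERVERPROPERTY('EngineEdition') AS INT) <> 5
    INSERT INTO @Counters (counter_name)
    VALUES (N'Distributed Query'), (N'DTC calls');

-- COLLATE DATABASE_DEFAULT on both sides avoids a collation conflict
-- between the DMV column and the table variable column.
SELECT
     RTRIM(pc.object_name)   AS [object]
    ,RTRIM(pc.counter_name)  AS [counter]
    ,RTRIM(pc.instance_name) AS [instance]
    ,pc.cntr_value           AS [value]
FROM sys.dm_os_performance_counters AS pc
INNER JOIN @Counters AS c
    ON RTRIM(pc.counter_name) COLLATE DATABASE_DEFAULT = c.counter_name COLLATE DATABASE_DEFAULT;
```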


m82labs commented May 6, 2021

> @m82labs the semantics of RG are different on SQL DB (MI is the same as on-prem).
> Why not take what you can from the actual counters rather than simulated ones, i.e. the perfmon Workload Group Stats object? It can be confusing otherwise, right?
> https://docs.microsoft.com/en-us/sql/relational-databases/performance-monitor/sql-server-workload-group-stats-object?view=sql-server-ver15
> If it is just for backward compat, then no need to add for MI.

You have made me re-evaluate how much I actually need the few extra metrics the DMVs provide. I don't think I need them. :)


m82labs commented May 6, 2021

> Can you remove these two from the SQL DB portion (given no DTC support)? Sorry I missed that.
>
> ,'Distributed Query'
> ,'DTC calls'

Done


m82labs commented Jul 15, 2021

Is there anything I need to do to get these changes merged, @jagularr?


@sspaink sspaink left a comment

lgtm. @m82labs sorry for the slow response, thanks for waiting!

@sspaink sspaink merged commit 57ecd1d into influxdata:master Jul 27, 2021
reimda pushed a commit that referenced this pull request Jul 28, 2021
(cherry picked from commit 57ecd1d)
phemmer added a commit to phemmer/telegraf that referenced this pull request Aug 13, 2021