[vSphere][host] Add support for new metrics in host metricset #40429

kush-elastic · 2024-08-05T05:40:07Z

Description

Here are the following metrics to be added for the host data stream in the vSphere metricbeat module. Here we added a new performance API to get more detailed information from the vSphere.

Metrics Type	Metrics	API	Field Mappings
Host Disk Performance
	disk.deviceLatency.average (ms)	Performance	disk.devicelatency.average.ms
	disk.maxTotalLatency.latest (ms)	Performance	disk.latency.total.ms
	disk.usage.average (KBps)	Performance	disk.total.bytes
	disk.read.average (KBps)	Performance	disk.read.bytes
	disk.write.average (KBps)	Performance	disk.write.bytes
	disk.capacity.usage.average (KB)	Performance	disk.capacity.usage.bytes

Host uptime/status
	host.Summary.OverallStatus	Summary	status
	host.Summary.QuickStats.Uptime	Summary	uptime
	host.Datastore	Extra	datastore.names
			datastore.count
	host.Vm	Extra	vm.names
			vm.count

Host Network Stats	host.Netowork	Extra	network.names
			network.count
	net.transmitted.average (KBps)	Performance	network.bandwidth.transmitted.bytes
	net.received.average (KBps)	Performance	network.bandwidth.received.bytes
	net.usage.average (KBps)	Performance	network.bandwidth.total.bytes
	net.packetsTx.summation (num)	Performance	network.packets.transmitted.count
	net.packetsRx.summation (num)	Performance	network.packets.received.count
	net.errorsRx.summation (num)	Performance	network.packets.errors.received.count
	net.errorsTx.summation (num)	Performance	network.packets.errors.transmitted.count
			network.packets.errors.total.count
	net.multicastTx.summation (num)	Performance	network.packets.multicast.transmitted.count
	net.multicastRx.summation (num)	Performance	network.packets.multicast.received.count
			network.packets.multicast.total.count
	net.droppedTx.summation (num)	Performance	network.packets.dropped.transmitted.count
	net.droppedRx.summation (num)	Performance	network.packets.dropped.received.count
			network.packets.dropped.total.count

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Related issues

Relates https://github.com/elastic/obs-integration-team/issues/16

mergify · 2024-08-05T05:40:43Z

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @kush-elastic? 🙏.
For such, you'll need to label your PR with:

The upcoming major version of the Elastic Stack
The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

metricbeat/module/vsphere/host/_meta/data.json

….com/kush-elastic/beats into obs_16_host_metricset_implementation

harnish-elastic

LGTM!

shmsr · 2024-08-20T09:26:27Z

@kush-elastic Here are some more changes that I'd recommend:

diff --git a/metricbeat/module/vsphere/host/host.go b/metricbeat/module/vsphere/host/host.go
index 22e4756661..1a2e3a1a7c 100644
--- a/metricbeat/module/vsphere/host/host.go
+++ b/metricbeat/module/vsphere/host/host.go
@@ -66,24 +66,24 @@ type assetNames struct {
 }
 
 // Define metrics to be collected
-var metricNames = []string{
-	"disk.capacity.usage.average",
-	"disk.deviceLatency.average",
-	"disk.maxTotalLatency.latest",
-	"disk.usage.average",
-	"disk.read.average",
-	"disk.write.average",
-	"net.transmitted.average",
-	"net.received.average",
-	"net.usage.average",
-	"net.packetsTx.summation",
-	"net.packetsRx.summation",
-	"net.errorsTx.summation",
-	"net.errorsRx.summation",
-	"net.multicastTx.summation",
-	"net.multicastRx.summation",
-	"net.droppedTx.summation",
-	"net.droppedRx.summation",
+var metricSet = map[string]struct{}{
+	"disk.capacity.usage.average": {},
+	"disk.deviceLatency.average":  {},
+	"disk.maxTotalLatency.latest": {},
+	"disk.usage.average":          {},
+	"disk.read.average":           {},
+	"disk.write.average":          {},
+	"net.transmitted.average":     {},
+	"net.received.average":        {},
+	"net.usage.average":           {},
+	"net.packetsTx.summation":     {},
+	"net.packetsRx.summation":     {},
+	"net.errorsTx.summation":      {},
+	"net.errorsRx.summation":      {},
+	"net.multicastTx.summation":   {},
+	"net.multicastRx.summation":   {},
+	"net.droppedTx.summation":     {},
+	"net.droppedRx.summation":     {},
 }
 
 // Fetch methods implements the data gathering and data conversion to the right
@@ -130,77 +130,73 @@ func (m *MetricSet) Fetch(ctx context.Context, reporter mb.ReporterV2) error {
 	// Create a performance manager
 	perfManager := performance.NewManager(c)
 
-	// Retrieve metric IDs for the specified metric names
+	// Retrieve all available metrics
 	metrics, err := perfManager.CounterInfoByName(ctx)
 	if err != nil {
 		return fmt.Errorf("failed to retrieve metrics: %w", err)
 	}
 
-	// Retrieve only the required metrics
-	requiredMetrics := make(map[string]*types.PerfCounterInfo)
-
-	for _, name := range metricNames {
-		metric, exists := metrics[name]
-		if !exists {
-			m.Logger().Warnf("Metric %s not found", name)
-			continue
+	// Filter for required metrics
+	var metricIds []types.PerfMetricId
+	for metricName := range metricSet {
+		if metric, ok := metrics[metricName]; ok {
+			metricIds = append(metricIds, types.PerfMetricId{CounterId: metric.Key})
+		} else {
+			m.Logger().Warnf("Metric %s not found", metricName)
 		}
-		requiredMetrics[name] = metric
-	}
-
-	metricIDs := make([]types.PerfMetricId, 0, len(requiredMetrics))
-	for _, metric := range requiredMetrics {
-		metricIDs = append(metricIDs, types.PerfMetricId{
-			CounterId: metric.Key,
-		})
 	}
 
 	pc := property.DefaultCollector(c)
 	for i := range hst {
-		assetNames, err := getAssetNames(ctx, pc, &hst[i])
-		if err != nil {
-			m.Logger().Errorf("Failed to retrieve object from host %s: %w", hst[i].Name, err)
-		}
+		select {
+		case <-ctx.Done():
+			return ctx.Err()
+		default:
+			assetNames, err := getAssetNames(ctx, pc, &hst[i])
+			if err != nil {
+				m.Logger().Errorf("Failed to retrieve object from host %s: %w", hst[i].Name, err)
+			}
 
-		spec := types.PerfQuerySpec{
-			Entity:     hst[i].Reference(),
-			MetricId:   metricIDs,
-			MaxSample:  1,
-			IntervalId: 20, // right now we are only grabbing real time metrics from the performance manager
-		}
+			spec := types.PerfQuerySpec{
+				Entity:     hst[i].Reference(),
+				MetricId:   metricIds,
+				MaxSample:  1,
+				IntervalId: 20, // right now we are only grabbing real time metrics from the performance manager
+			}
 
-		// Query performance data
-		samples, err := perfManager.Query(ctx, []types.PerfQuerySpec{spec})
-		if err != nil {
-			m.Logger().Errorf("Failed to query performance data from host %s: %v", hst[i].Name, err)
-			continue
-		}
+			// Query performance data
+			samples, err := perfManager.Query(ctx, []types.PerfQuerySpec{spec})
+			if err != nil {
+				m.Logger().Errorf("Failed to query performance data from host %s: %v", hst[i].Name, err)
+				continue
+			}
 
-		if len(samples) == 0 {
-			m.Logger().Debug("No samples returned from performance manager")
-			continue
-		}
+			if len(samples) == 0 {
+				m.Logger().Debug("No samples returned from performance manager")
+				continue
+			}
 
-		results, err := perfManager.ToMetricSeries(ctx, samples)
-		if err != nil {
-			m.Logger().Errorf("Failed to convert performance data to metric series for host %s: %v", hst[i].Name, err)
-		}
+			results, err := perfManager.ToMetricSeries(ctx, samples)
+			if err != nil {
+				m.Logger().Errorf("Failed to convert performance data to metric series for host %s: %v", hst[i].Name, err)
+			}
 
-		metricMap := make(map[string]interface{})
-		for _, result := range results[0].Value {
-			if len(result.Value) > 0 {
-				metricMap[result.Name] = result.Value[0]
-				continue
+			metricMap := make(map[string]interface{})
+			for _, result := range results[0].Value {
+				if len(result.Value) > 0 {
+					metricMap[result.Name] = result.Value[0]
+					continue
+				}
+				m.Logger().Debugf("For host %s,Metric %v: No result found", hst[i].Name, result.Name)
 			}
-			m.Logger().Debugf("For host %s,Metric %v: No result found", hst[i].Name, result.Name)
-		}
 
-		reporter.Event(mb.Event{
-			MetricSetFields: m.eventMapping(hst[i], &metricData{
-				perfMetrics: metricMap,
-				assetsName:  assetNames,
-			}),
-		})
+			reporter.Event(mb.Event{
+				MetricSetFields: m.eventMapping(hst[i], &metricData{
+					perfMetrics: metricMap,
+					assetsName:  assetNames,
+				}),
+			})
+		}
 	}
 
 	return nil

shmsr

Except for the CI failures, the rest LGTM!

niraj-elastic

Provided some minor nits.

CHANGELOG.next.asciidoc

metricbeat/module/vsphere/host/_meta/fields.yml

Co-authored-by: niraj-elastic <[email protected]>

….com/kush-elastic/beats into obs_16_host_metricset_implementation

shmsr · 2024-08-21T20:33:04Z

Some more changes:

diff --git a/metricbeat/module/vsphere/host/_meta/fields.yml b/metricbeat/module/vsphere/host/_meta/fields.yml
index 6ca8380c21..65ff1e110b 100644
--- a/metricbeat/module/vsphere/host/_meta/fields.yml
+++ b/metricbeat/module/vsphere/host/_meta/fields.yml
@@ -1,21 +1,21 @@
 - name: host
   type: group
   description: >
-    host
+    Host information from vSphere environment.
   release: ga
   fields:
     - name: cpu.used.mhz
       type: long
       description: >
-        Used CPU in Mhz.
+        Used CPU in MHz.
     - name: cpu.total.mhz
       type: long
       description: >
-        Total CPU in Mhz.
+        Total CPU in MHz.
     - name: cpu.free.mhz
       type: long
       description: >
-        Free CPU in Mhz.
+        Free CPU in MHz.
     - name: datastore.names
       type: keyword
       description: >
@@ -50,7 +50,7 @@
     - name: disk.total.bytes
       type: long
       description: >
-        Aggregated disk Input/Output rate.
+        Sum of disk read and write rates each second in bytes.
       format: bytes
     - name: memory.free.bytes
       type: long
@@ -71,10 +71,6 @@
       type: keyword
       description: >
         Host name.
-    - name: network_names
-      type: keyword
-      description: >
-        Network names.
     - name: network.names
       type: keyword
       description: >
@@ -96,7 +92,7 @@
     - name: network.bandwidth.total.bytes
       type: long
       description: >
-        Network utilization (combined transmit-rates and receive-rates).
+        Sum of network transmitted and received rates in bytes during the interval.
       format: bytes
     - name: network.packets.transmitted.count
       type: long

network_names is there twice? also fixed some descriptions.

kush-elastic · 2024-08-22T03:53:04Z

network_names is there twice? also fixed some descriptions.

We need to retain the original network_names field that existed before my changes. However, with the introduction of new fields, we wanted to adopt the ECS (Elastic Common Schema) convention, which groups fields using dots (.). As a result, we temporarily added duplicate fields. The older fields will be deprecated later.

ishleenk17 · 2024-08-22T04:57:15Z

@kush-elastic Can you add all the newly added metrics as part of the associations between host/network/datastore/vm to the metric list in the issue here and in our excel.
I see not all metrics are listed there.

ishleenk17

Looks good!
Please address this. #40429 (comment)

* [vSphere][host] Data collection and Fields mappings * add changelog entry * update github.com/vmware/govmomi version * update go.sum * make check for beats * golint-ci * update the field mappings * mage update * add more UTs * metricbeat: mage update * Resolve review comments * Minor changes - Update interval * add vm and datastore names with count per host * fix and update mapping methods * remove fmt.Errorf (cherry picked from commit e09627f)

#40697) * [vSphere][host] Data collection and Fields mappings * add changelog entry * update github.com/vmware/govmomi version * update go.sum * make check for beats * golint-ci * update the field mappings * mage update * add more UTs * metricbeat: mage update * Resolve review comments * Minor changes - Update interval * add vm and datastore names with count per host * fix and update mapping methods * remove fmt.Errorf (cherry picked from commit e09627f) Co-authored-by: Kush Rana <[email protected]> Co-authored-by: Ishleen Kaur <[email protected]>

[vSphere][host] Data collection and Fields mappings

227eaec

kush-elastic requested a review from a team as a code owner August 5, 2024 05:40

botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 5, 2024

kush-elastic requested review from lalit-satapathy, ishleenk17 and harnish-elastic August 5, 2024 05:40

kush-elastic self-assigned this Aug 5, 2024

kush-elastic added enhancement Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team labels Aug 5, 2024

botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 5, 2024

kush-elastic added Metricbeat Metricbeat Module:beat The beat Metricbeat module labels Aug 5, 2024

kush-elastic added 2 commits August 5, 2024 11:16

add changelog entry

b505cec

update github.com/vmware/govmomi version

8c61c2e

kush-elastic requested a review from a team as a code owner August 5, 2024 06:49

kush-elastic added 2 commits August 5, 2024 15:48

update go.sum

b54e545

make check for beats

ef8dc8c

harnish-elastic requested changes Aug 5, 2024

View reviewed changes

metricbeat/module/vsphere/host/_meta/data.json Outdated Show resolved Hide resolved

kush-elastic and others added 7 commits August 5, 2024 16:22

golint-ci fix

2aa91c3

remove extra fields

492ec96

update the field mappings

dd6e96e

mage update

8af84b0

Merge branch 'main' into obs_16_host_metricset_implementation

8472be9

golanlint-ci fix

4520b63

Merge branch 'obs_16_host_metricset_implementation' of https://github…

ae6f810

….com/kush-elastic/beats into obs_16_host_metricset_implementation

kush-elastic requested a review from harnish-elastic August 7, 2024 06:27

kush-elastic and others added 3 commits August 7, 2024 12:28

fix linting issues

86a9151

add more UTs

2d8997f

Merge branch 'main' into obs_16_host_metricset_implementation

04dd349

kush-elastic requested a review from devamanv August 20, 2024 06:16

fix datastore tests

516221e

harnish-elastic approved these changes Aug 20, 2024

View reviewed changes

devamanv approved these changes Aug 20, 2024

View reviewed changes

kush-elastic and others added 2 commits August 21, 2024 10:11

resolved review comments

af9b424

Merge branch 'main' into obs_16_host_metricset_implementation

be7f3c3

shmsr approved these changes Aug 21, 2024

View reviewed changes

niraj-elastic requested changes Aug 21, 2024

View reviewed changes

CHANGELOG.next.asciidoc Outdated Show resolved Hide resolved

metricbeat/module/vsphere/host/_meta/fields.yml Outdated Show resolved Hide resolved

kush-elastic changed the title ~~[vSphere][host] Implement the beats changes~~ [vSphere][host] Add support for new metrics in host metricset Aug 21, 2024

kush-elastic and others added 3 commits August 21, 2024 14:19

Update CHANGELOG.next.asciidoc

6d45870

Co-authored-by: niraj-elastic <[email protected]>

update field description

432dc60

Merge branch 'obs_16_host_metricset_implementation' of https://github…

3d9c31b

….com/kush-elastic/beats into obs_16_host_metricset_implementation

kush-elastic mentioned this pull request Aug 21, 2024

[vSphere][datastore] Add additional metrics #40441

Merged

6 tasks

kush-elastic requested a review from niraj-elastic August 21, 2024 10:09

update field mappings

29558f0

leehinman approved these changes Aug 21, 2024

View reviewed changes

ishleenk17 approved these changes Aug 22, 2024

View reviewed changes

kush-elastic and others added 2 commits August 22, 2024 11:05

resolved review comments

ddcbd35

Merge branch 'main' into obs_16_host_metricset_implementation

8f7bcce

niraj-elastic approved these changes Aug 22, 2024

View reviewed changes

kush-elastic merged commit e09627f into elastic:main Aug 22, 2024
123 checks passed

shmsr mentioned this pull request Aug 22, 2024

[vSphere][virtualmachine] Add metrics to virtualmachine metricset #40485

Merged

6 tasks

ishleenk17 added the backport-8.15 Automated backport to the 8.15 branch with mergify label Sep 5, 2024

mergify bot mentioned this pull request Sep 5, 2024

[8.15](backport #40429) [vSphere][host] Add support for new metrics in host metricset #40697

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[vSphere][host] Add support for new metrics in host metricset #40429

[vSphere][host] Add support for new metrics in host metricset #40429

kush-elastic commented Aug 5, 2024 •

edited

Loading

mergify bot commented Aug 5, 2024

harnish-elastic left a comment

shmsr commented Aug 20, 2024 •

edited

Loading

shmsr left a comment

niraj-elastic left a comment

shmsr commented Aug 21, 2024

kush-elastic commented Aug 22, 2024

ishleenk17 commented Aug 22, 2024

ishleenk17 left a comment

[vSphere][host] Add support for new metrics in host metricset #40429

[vSphere][host] Add support for new metrics in host metricset #40429

Conversation

kush-elastic commented Aug 5, 2024 • edited Loading

Description

Checklist

Related issues

mergify bot commented Aug 5, 2024

harnish-elastic left a comment

Choose a reason for hiding this comment

shmsr commented Aug 20, 2024 • edited Loading

shmsr left a comment

Choose a reason for hiding this comment

niraj-elastic left a comment

Choose a reason for hiding this comment

shmsr commented Aug 21, 2024

kush-elastic commented Aug 22, 2024

ishleenk17 commented Aug 22, 2024

ishleenk17 left a comment

Choose a reason for hiding this comment

kush-elastic commented Aug 5, 2024 •

edited

Loading

shmsr commented Aug 20, 2024 •

edited

Loading