Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace WMI in cs and os collectors #702

Merged
merged 15 commits into from
Mar 30, 2021

Conversation

benridley
Copy link
Contributor

@benridley benridley commented Jan 20, 2021

This PR addresses #677 by replacing WMI calls within the above collectors with direct winapi calls, perflib calls, and registry reads.

A 'header' folder has been introduced that contains Golang wrappers for parts of the Windows API that the exporter requires for exposing metrics. We figured this would be useful going forward to wrap more of the direct winapi calls as new collectors are added and older ones migrated away from WMI.

Performance benchmarks have shown ~20x improvement on the os collector and ~250x improvement in the cs collector.

We've basically attempted to replicate the WMI metrics 1-to-1 with these changes. Sometimes this meant digging deep into the WinAPI to find where WMI was pulling stuff from, but we believe we've gotten 100% compatibility. I think as we move away from WMI though, there should probably be a rationalization of metrics and collectors now we aren't tied to WMI classes.

@benridley benridley requested a review from a team as a code owner January 20, 2021 00:30
@benridley benridley force-pushed the dev_cs_collector branch 2 times, most recently from 3489796 to fe4fcf5 Compare January 20, 2021 00:32
Copy link
Contributor

@breed808 breed808 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking very good. I'm particularly impressed with the documentation in the comments, well done there!

I'm also rather happy with the introduction of benchmarks for the collectors.
I think this is a good start to introducing benchmarks so we can identify laggard collectors. The question is, do we want to introduce a separate _test.go file for each collector containing benchmarks or centralize them in collector_test.go?

I'd be happy approving this myself, but I'll give @carlpett the opportunity to review & approve; his low-level knowledge of Windows & this exporter surpasses my own.

@carlpett
Copy link
Collaborator

Thank you @benridley and @retryW, this is a great step forward!
From a high level, I think the main thing I'd like to discuss is if we could/should wrap the details a bit more? The Collect functions are now a quite long series of somewhat involved and low-level operations. I'm not completely certain where a good tradeoff lies, so happy to have a discussion there.
For example, we could have the APIs into two layers, one with the one-to-one mapping from Windows structs into Go, and then a a Go-friendly API that is what we actually expose to the rest of the exporter. This would of course mean some amount of more code, but I'm a bit concerned with how complex it looks.
Thoughts?

@retryW
Copy link
Contributor

retryW commented Jan 22, 2021

Cheers for the positive feedback!
I'm not opposed to adding another layer, however feel like for the most part it would add extra obfuscation for little benefit.
Maybe we wrap the windows registry calls up into their own function as they're causing the majority of the clutter/readability issues within the os.go collect() func ?

I had also been tossing and turning between making the winapi func calls when I need them, i.e. above the ch <- channel assignment that actually uses that data, or having all the data pulled first and keep all the channel assignment clean and how it was prior to this PR (which is how we current have it). Any preferences for overall readability and ease of understanding?

@benridley
Copy link
Contributor Author

Thank you both for the feedback!
@breed808 Good question. I think I'm in favour of separate files, only because if we start adding unit tests as well as benchmarks it'd be nice to have unit tests isolated per collector, as a single file could get really messy with multiple tests per collector for all collectors.
@carlpett I definitely see the concern, but I'm not sure how much we can abstract it in a meaningful way. I think a lot of the low level clutter is just a consequence of working with a low-level Windows API, and we shouldn't try and hide that when it's core to how the exporter operates. I am open to the idea if we can find a meaningful abstraction though, but what would that look like? Perhaps commenting might improve readability also without adding a layer of indirection?

@carlpett
Copy link
Collaborator

Extracting a Registry lookup function sounds good @retryW, I think we might have some other places that could use that.

@benridley Agree, there's definitely a fine line between too much and too little "implementation context". I'd subjectively find it easier to read and reason about if we'd separate the "load the data" and "export the data" concerns, and let the loading part scrub away some of the Win32isms (eg the Ull data type prefixes, or the Wki102 one). As an example, we'd then have a new package sysinfoapi wrapping the headers/sysinfo package, with a MemoryStatus struct with a TotalPhysicalMemoryBytes field (and the others needed).
Hiding the mechanisms of this will of course require stepping through a couple of files when understanding the deeper bits, and there will be a bit of boilerplate code writing required.

The main benefit of this would be making the code base a bit more welcoming to new contributors looking to add more functionality in these files, not having to find the right spot to inject more logic (although, writing this I realize there's of course the case to be made that these collectors are probably quite static in features).

What do you think, reasonable?

@benridley
Copy link
Contributor Author

@carlpett Yeah that seems reasonable. We'll start creating some more idiomatic wrappers for the API and see how it looks. Cheers!

@benridley benridley force-pushed the dev_cs_collector branch 2 times, most recently from ac0d351 to ee38481 Compare March 18, 2021 05:18
@benridley
Copy link
Contributor Author

benridley commented Mar 18, 2021

@carlpett @breed808 Apologies for the delay folks!

Let me know what you think now. Basically I've hidden all the nasty low level WinAPI operations in private methods and structs, and the 'header' files expose idiomatic wrappers that can be used externally. Certainly makes the code a lot cleaner to use, although it does involve a bit of double handling as we've kept the private stuff 1-to-1 with WinAPI. @retryW's and my thinking was that this makes it easy to trace back what API's we're consuming and why.

Let me know what you think and if this is what you had in mind, otherwise we can keep iterating :)

benridley and others added 13 commits March 19, 2021 10:13
moved

Signed-off-by: Ben Ridley <[email protected]>
Copy link
Collaborator

@carlpett carlpett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late review, but this is looking good! A few comments/questions below

}

// NetWkstaGetInfo returns information about the configuration of a workstation.
// WARNING: The caller must call netApiBufferFree to free the memory allocated by this function.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is outdated, I assume? Since the free call is made within the function already :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it was more relevant when the function was public. Now that it's not exposed, this should be irrelevant. I've cleaned it up.

Comment on lines 155 to 156
size := 1024
var buffer [4096]uint16
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1kb or 8kb? :)
I this as you mention in the comment that 256 (UTF-16) characters should be the limit of the how much can be returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, I've made size a const 1024 and used that throughout the function. Much clearer now.

collector/os.go Outdated
Comment on lines 184 to 186
if err := memManKey.Close(); err != nil {
return nil, err
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a possible leak, since it isn't executed if GetStringsValue errors. You'll need to put this in a defer :(
As an alternative, perhaps pull the openkey/read value/close into a helper?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch... I got a bit tied up trying to figure out we would handle an error in Close(), because if we defer it will be ignored. They do this in the documentation for the package however, and from reading the winapi stuff it seems it will only error if we try and close a key that's not open/already closed. So I think in our case we can safely ignore it and just defer. Something to be wary of though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, sounds reasonable. For completeness, what we've done in a few other places is essentially

defer func() {
  if err := key.Close(); err != nil {
    // log something
  }
}()

You can change the returned err of the enclosing function by using named returns, if that matters (as mentioned, here it doesn't)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @carlpett, I did find the named returns workaround in my travels but I was cautious because I thought we could end up in a situation where we e.g. fail to open a key, then fail to close it, and the close error overwrites the open failure which would lead to confusion. Directly logging the error seems like a great workaround, so I'll keep that in mind. Cheers!

collector/os.go Outdated
Comment on lines 203 to 205
if err := ntKey.Close(); err != nil {
return nil, err
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, needs a defer to avoid possible leak

- Defer registry close calls
- Ensure size parameter in GetComputerName is properly specified
- Clean up some comments to ensure correctness

Signed-off-by: Ben Ridley <[email protected]>
Copy link
Collaborator

@carlpett carlpett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me! Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants