-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add config changes for UI metrics #8694
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few questions about how these new fields are used, and some suggestions for improving the tests. Our test suite is already quite slow, and if we want to make room for more tests I think we need to be careful about what we are running in each test.
agent/agent_test.go
Outdated
a := NewTestAgent(t, hcl) | ||
defer a.Shutdown() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very expensive test for what is essentially testing that an atomic works. Do we really need a running agent to be able to test this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 This is really a test of the plumbing and glue code around reloading. It follows the pattern of the other reload tests. IIRC for a long time we didn't have tests for much of this stuff and often had bugs where code had been added to a reload method but some part of the plumbing missed.
Sadly since the agent isn't more modular it's pretty hard to test any of the plumbing around one without starting one.
I'm with you on wanting to make tests faster and reduce the number of times we spin up a full agent but that seems like something we need to tackle by actually breaking up the agent rather than just not testing new stuff we do add to it?
Do we really need a running agent to be able to test this?
Maybe not but if we hacked together an agent struct bypassing all the normal startup code and called an internal method to reload a pointer... then it would be only a test of whether the atomic.Value works as expected! If we had a way to start an agent using the normal code paths but skipping some expensive elements like not starting serf or raft maybe that might work?
Overall though, as discussed above I'd rather that this config reload plumbing was all encapsulated in a separate package and not on the agent itself but that's out of scope here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😄 - this comment made me realise that this test was actually failing...
Because I'd forgotten to actually add the Store
in the reloadConfigInternal
method!
Perhaps there is a cheaper way if we change the reload logic a little like you propose in #8623 since then reloaders are not relying on any other state in the agent to be a certain way and we can have a cheaper test that just ensure s all the expected reloaders have indeed been registered after a "normal" startup?
Plumbing tests always seem wasteful since the code is trivial unless it's not there!
// NOTE: Never read from this field directly once the agent has started up | ||
// since the UI config is reloadable. The on in the agent's config field may | ||
// be out of date. Use the agent.getUIConfig() method to get the latest config | ||
// in a thread-safe way. | ||
// | ||
// hcl: ui_config { ... } | ||
UIConfig UIConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The runtime reference on the agent is updated on a reload, so wouldn't it be safe to read it from this field as well?
Edit: I thought it was supposed to be updated, but maybe I'm wrong about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I am wrong about the main config reference. The current ReloadConfig
is a bit confusing because it generates a new consul.Config
based on a.config
, which is not updated... It's not clear to me if delegate.ReloadConfig
actually does anything. It seems like all it does is pass the same original config to the Client and Server. Maybe I am still missing something.
edit: Ah, it's because loadLimits
updates the a.config
. That is super not obvious. I'll look at making this a little more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cleaned this up a bit in #8696
agent/http_test.go
Outdated
a := NewTestAgent(t, ` | ||
ui_config { | ||
enabled = true | ||
metrics_provider_options_json = "{\"foo\": 1}" | ||
} | ||
`) | ||
defer a.Shutdown() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need a full running agent to test the HTTPServer
. I think this test could be a lot less expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it could for sure.
Our current testing methodology in this package with a full agent is a know area we'd like to improve but that seems out of scope for a feature PR.
The HTTPServer
is very limited but most of the methods in it call into the agent state and similar so decoupling the server from the agent state is not trivial. Even in cases like this where we could fake it, I'm not very confident that provides a very robust test. If we had a more holistic refactor - for example make the server only have access to the (reloadable) config from the agent and specific interfaces where it needs access to RPCs and Serf, then we could make tests cheaper without making them brittle but that seems out of scope and fix all the tests not just this one new one!
agent/config/builder.go
Outdated
// ensures we get an object. | ||
var dummyMap map[string]interface{} | ||
err := json.Unmarshal([]byte(rt.UIConfig.MetricsProviderOptionsJSON), | ||
&dummyMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: kind of awkward line wrapping (https://github.com/golang/go/wiki/CodeReviewComments#line-length)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd tend to agree with you. When I started here it was tribal knowledge that there is a preference in our projects to limit lines to 80 chars max and I was asked to stick to this in several PRs.
I suggested at the time that if this is important we codify it as it's not a general Go community thing but that document still doesn't exist. I don't know how widespread the practice is but since I setup those line length guides in my editor I still use them.
I'd be happy if the consensus has moved on from that - I know it's not rigorously adhered to anyway. Not sure if anyone still has strong opinions either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBF this one was super border line anyway so I just fixed it - obviously more readable on one line. 😄
agent/config/builder.go
Outdated
// Attempt to parse the JSON to ensure it's valid, parsing into a map | ||
// ensures we get an object. | ||
var dummyMap map[string]interface{} | ||
err := json.Unmarshal([]byte(rt.UIConfig.MetricsProviderOptionsJSON), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We weren't happy with inlined JSON for connect escape hatches. Is this a similar idea? should we make this opaque config instead and serialize it to the UI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered this during RFC and I'm OK with this as it is for now - it's a highly experimental interface. Somewhat unlike the escape hatches where the JSON is protobuf we decode and merge, this is literally passed through to the front end which passes it directly back to the custom code the operator is using in JavaScript as a parameter. The decode here is just to try to ensure valid formatting so it doesn't just silently break the whole UI if there's a typo.
The UX of writing explicit JSON for a value that is passed to your javascript code doesn't seem as bad to me as writing raw JSON encoded proto3 Envoy config. The complexities of conversions between HCL and JSON seem to potentially make the UX more confusing?
That said, I'm open to iterating on this if it proves its value but I'm not sure the extra complexity of opaque config makes sense for now either from UX or from a "getting an MVP out and seeing if it's useful" perspective?
agent/config/builder.go
Outdated
return fmt.Errorf("ui_config.dashboard_url_templates key names can only "+ | ||
"contain lowercase alphanumeric or _ characters. received: %q", k) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is repeated a couple times. Instead of sharing just the regex, possibly expose the functionality with a validateBasicName(...)
so that others can re-use it without the same duplication?
Thanks @rboyer and @dnephin for the reviews! I think I fixed up a few things including one of the new tests actually failing! I've replied inline to a few comments - I think I agree with all of them at some level but several I'm not sure I see a good solution for in this PR without blowing out the scope. Happy to chat more about that though! I've generally followed an existing convention for most parts of this PR, plenty of room for improvement overall but I think to do so substantially would move into major refactor territory. |
ebec92f
to
5a1269f
Compare
@dnephin I've refactored this a bunch following your lead on
|
{ | ||
// TODO: is this really what we want? It's what we've always done but | ||
// seems a bit odd to not do an actual 301 but instead serve the | ||
// index.html from every path... It also breaks the UI probably. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this is intentional because the UI is a single-page app that parses the path to decide what to render, like when navigating to http://localhost:8500/ui/dc1/services
we don't actually want to redirect, but rather serve the UI code and let JavaScript control rendering the dc1
services "page".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good point. I'll remove the TODO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one question about the error handling
@@ -55,8 +55,6 @@ func (s *HTTPHandlers) rewordUnknownEnterpriseFieldError(err error) error { | |||
return err | |||
} | |||
|
|||
func (s *HTTPHandlers) addEnterpriseUIENVVars(_ map[string]interface{}) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reminder to clean up this line's counterpart.
agent/uiserver/bindata_assetfs.go
Outdated
@@ -15,7 +15,7 @@ | |||
// pkg/web_ui/assets/codemirror/mode/ruby/ruby-ea43ca3a3bdd63a52811e8464d66134b.js |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may need to also edit the CI configs in oss or ent, as this file is specially handled in merges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.circleci/config.yml: - ./agent/bindata_assetfs.go
.circleci/config.yml: name: commit agent/bindata_assetfs.go if there are changes
.circleci/config.yml: if ! git diff --exit-code agent/bindata_assetfs.go; then
.circleci/config.yml: git add agent/bindata_assetfs.go
.circleci/config.yml: git commit -m "auto-updated agent/bindata_assetfs.go from commit ${short_sha}"
GNUmakefile:ASSETFS_PATH?=agent/bindata_assetfs.go
build-support/functions/20-build.sh: local container_id=$(docker create -it -e GIT_COMMIT=${GIT_COMMIT} -e GIT_DIRTY=${GIT_DIRTY} ${image_name} make static-assets ASSETFS_PATH=bindata_assetfs.go)
build-support/functions/20-build.sh: status "Copying back artifacts" && docker cp ${container_id}:/consul/bindata_assetfs.go ${sdir}/agent/uiserver/bindata_assetfs.go
build-support/functions/30-release.sh: git add "${sdir}/agent/bindata_assetfs.go"
build-support/functions/40-publish.sh: # bindata_assetfs.go will make these meaningless
build-support/functions/40-publish.sh: git_diff "$(pwd)" ":!agent/bindata_assetfs.go"|| ret 1
codecov.yml: - "agent/bindata_assetfs.go"
5a1269f
to
2f7b060
Compare
… break tests. Also add tests that do catch that error.
2f7b060
to
3ff5901
Compare
OK... I think this is good to go. The test fail was legitimately a regression although quite an edge case (when using a custom UI dir that didn't have an index.html (which would be somewhat broken anyway because we redirect index requests to index.html). Still this is handled slightly more nicely now - we still serve the custom UI and disable attempts to redirect to a non-existent /index.html. I also added a warning log to make the UX a little nicer. |
agent/uiserver/buffered_file.go
Outdated
"time" | ||
) | ||
|
||
// bufferedFile implements os.File and allows us to modify a file from disk by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more correct to say "implements similar methods as *os.File
" since os.File
is not an interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh.. I meant. Implements http.File
which is what http.FileServer.Open
returns. Thanks!
// have to match the encoded double quotes around the JSON string value that | ||
// is there as a placeholder so the end result is an actual JSON bool not a | ||
// string containing "false" etc. | ||
re := regexp.MustCompile(`%22__RUNTIME_BOOL_[A-Za-z0-9-_]+__%22`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a package constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's not on a hot path (typically called once per agent lifetime) and very unlikely to be useful in any other place do you think it's preferable to define it out of context on the package? I don't mind but my feeling was it's easier to read this code with it inline so you can see what it's doing all at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I don't have a strong opinion here.
One thing that might slightly sway me towards the package constant is that MustCompile
may panic (if for some reason the regex got busted). As long as we have tests that exercise this specific code path then it's fine. If we didn't then moving it to a package constant ensures that any stray changes will totally trigger a panic during development.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
You still have a few dangling references to the old agent/bindata_assetfs.go
name to fixup. The rest of my comments were not blocking.
Oh thanks. I totally missunderstood that in your previous comments - yes this would have broken lots of CI stuff! I've updated, will push up... |
This PR adds new configuration for the UI to support up-coming work on integrating metrics. None of the new configurations do anything yet this is just the wiring for them. Building the features will come in subsequent PRs.
Following the checklist in
/contributing
:Adding a Simple Config Field for Client Agents
agent/config/config.go
.agent/config/runtime.go
.agent/config/builder.go
totranslate.
TestFullConfig
inagent/config/runtime_test.go
, it should fail now, thenadd the same random value to the expected struct in that test so it passes
again.
TestSanitize
in the samefile. (Running the test first gives you a nice diff which can save working
out where etc.)
some cases or with some values (often true).
agent/config/builder.go
.TestConfigFlagsAndEdgeCases
inagent/config/runtime_test.go
.DefaultSource
inagent/config/defaults.go
.TestConfigFlagsAndEdgeCases
inagent/config/runtime_test.go
.any state the feature needs changing. This needs to be added to one or
more of the following places:
ReloadConfig
inagent/agent.go
if it needs to affect the localclient state or another client agent component.
ReloadConfig
inagent/consul/client.go
if it needs to affectstate for client agent's RPC client.
agent/agent_test.go
similar to others with prefixTestAgent_reloadConfig*
.website/source/docs/agent/options.html.md
.TODO
As can be seen above I've not yet added the documentation for these new configurations. I intend to do that in a later PR once more of the feature is working and we can be more accurate about the actual effect in case any of it doesn't work out as planned!