Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add flag to enable logging output in json to consul-k8s #523

Merged
merged 17 commits into from
Jun 29, 2021

Conversation

kschoche
Copy link
Contributor

@kschoche kschoche commented May 25, 2021

Changes proposed in this PR:

  • Adds an optional flag -log-level to all consul-k8s commands that dont already have it which controls the log level.
  • Adds an optional flag -log-json to all consul-k8s commands which controls whether or not the command will log its output in JSON.
  • Disables DevMode for Zap based logging (controller and connect-inject) so that we no longer print stack traces for anything below Error.
    This resolves Allow JSON logging for consul-k8s #516

How I've tested this PR:

Tested using a debug image:

% cat consul-values.yaml
global:
  imageK8S: kschoche/consul-k8s-dev-logging
  image: hashicorp/consul:1.10.0
  logLevel: "debug"
  logJSON: true
  tls:
    enabled: true
  acls:
    manageSystemACLs: true
  metrics:
    enabled: true
controller:
  enabled: true
server:
  replicas: 1
connectInject:
  enabled: true
ui:
  enabled: true
 consul-k8s % k logs consul-consul-connect-injector-webhook-deployment-864bf776gjzzw | head
{"level":"info","ts":1624467429.182265,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:9444"}
{"level":"info","ts":1624467429.1829019,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/mutate"}
{"level":"info","ts":1624467429.2273815,"msg":"attempting to acquire leader lease default/consul-controller-lock...\n"}
{"level":"info","ts":1624467429.2276971,"logger":"controller-runtime.webhook.webhooks","msg":"starting webhook server"}
{"level":"info","ts":1624467429.2277288,"msg":"starting metrics server","path":"/metrics"}

Unit tests pass and when enabled we have logging like this:

=== RUN   TestRun_ServicePollingWithACLsAndTLS/ACLs_enabled,_tls
[INFO] freeport: detected ephemeral port range of [49152, 65535]
[INFO] freeport: reducing max blocks from 30 to 26 to avoid the ephemeral port range
    server.go:253: CONFIG JSON: {"node_name":"node-77eba14f-ba7c-6249-6f7a-22e34bb4ba88","node_id":"77eba14f-ba7c-6249-6f7a-22e34bb4ba88","performance":{"raft_multiplier":1},"bootstrap":true,"server":true,"data_dir":"/var/folders/wf/dsq4425j3gj7j2hbqhg14rrc0000gp/T/TestRun_ServicePollingWithACLsAndTLS_ACLs_enabled,_tls208609805/data","segments":null,"disable_update_check":true,"log_level":"debug","bind_addr":"127.0.0.1","addresses":{},"ports":{"dns":28001,"http":28002,"https":28003,"serf_lan":28004,"serf_wan":28005,"server":28006},"acl":{"enabled":true,"default_policy":"deny","tokens":{"master":"b78d37c7-0ca7-5f4d-99ee-6d9975ce4586"}},"ca_file":"/var/folders/wf/dsq4425j3gj7j2hbqhg14rrc0000gp/T/ca028592136","cert_file":"/var/folders/wf/dsq4425j3gj7j2hbqhg14rrc0000gp/T/cert180506567","key_file":"/var/folders/wf/dsq4425j3gj7j2hbqhg14rrc0000gp/T/key270148218","connect":{"ca_config":{"cluster_id":"11111111-2222-3333-4444-555555555555"},"enabled":true}}
{"@level":"info","@message":"Consul login complete","@timestamp":"2021-05-25T13:06:46.300842-05:00"}
{"@level":"info","@message":"Registered service has been detected","@timestamp":"2021-05-25T13:06:46.302249-05:00","service":"counting-sidecar-proxy"}
{"@level":"info","@message":"Registered service has been detected","@timestamp":"2021-05-25T13:06:46.302270-05:00","service":"counting"}
{"@level":"info","@message":"Connect initialization completed","@timestamp":"2021-05-25T13:06:46.302560-05:00"}

(and when disabled by default)
=== RUN   TestRun_ServicePollingWithACLsAndTLS/ACLs_enabled,_tls
[INFO] freeport: detected ephemeral port range of [49152, 65535]
[INFO] freeport: reducing max blocks from 30 to 26 to avoid the ephemeral port range
    server.go:253: CONFIG JSON: {"node_name":"node-24e952a1-a0a0-5981-ba1b-204d6a7218e1","node_id":"24e952a1-a0a0-5981-ba1b-204d6a7218e1","performance":{"raft_multiplier":1},"bootstrap":true,"server":true,"data_dir":"/var/folders/wf/dsq4425j3gj7j2hbqhg14rrc0000gp/T/TestRun_ServicePollingWithACLsAndTLS_ACLs_enabled,_tls582795832/data","segments":null,"disable_update_check":true,"log_level":"debug","bind_addr":"127.0.0.1","addresses":{},"ports":{"dns":23501,"http":23502,"https":23503,"serf_lan":23504,"serf_wan":23505,"server":23506},"acl":{"enabled":true,"default_policy":"deny","tokens":{"master":"b78d37c7-0ca7-5f4d-99ee-6d9975ce4586"}},"ca_file":"/var/folders/wf/dsq4425j3gj7j2hbqhg14rrc0000gp/T/ca752820279","cert_file":"/var/folders/wf/dsq4425j3gj7j2hbqhg14rrc0000gp/T/cert415427626","key_file":"/var/folders/wf/dsq4425j3gj7j2hbqhg14rrc0000gp/T/key322449793","connect":{"ca_config":{"cluster_id":"11111111-2222-3333-4444-555555555555"},"enabled":true}}
2021-05-25T13:07:38.834-0500 [INFO]  Consul login complete
2021-05-25T13:07:38.835-0500 [INFO]  Registered service has been detected: service=counting
2021-05-25T13:07:38.835-0500 [INFO]  Registered service has been detected: service=counting-sidecar-proxy
2021-05-25T13:07:38.835-0500 [INFO]  Connect initialization completed

How I expect reviewers to test this PR:
code review / check out and run tests with json logging enabled.
A second PR for consul-helm will drive these flags [https://github.com/hashicorp/consul-helm/pull/980/]

Note to reviewers: I'm curious about how to go about unit testing for this PR, thoughts?

Checklist:

  • Tests added
  • CHANGELOG entry added (HashiCorp engineers only, community PRs should not add a changelog entry)

@kschoche kschoche added the type/enhancement New feature or request label May 25, 2021
@kschoche kschoche self-assigned this May 25, 2021
subcommand/connect-init/command.go Outdated Show resolved Hide resolved
subcommand/connect-init/command.go Outdated Show resolved Hide resolved
@kschoche kschoche marked this pull request as ready for review June 22, 2021 18:30
Copy link
Member

@lkysow lkysow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on this. Couple questions.

What about:

  • acl-init
  • tls-init

It looks like zapcore doesn't support trace (see snippet below) so how should we handle that since it's a valid value in the values file? We could make it use debug in that instance?

Other questions

  1. should we switch everything over to using zap logger and ditch hclog?
  2. should we also change where we're using c.UI to use the logger? Otherwise there will be non-json output mixed in I think?
// UnmarshalText unmarshals text to a level. Like MarshalText, UnmarshalText
// expects the text representation of a Level to drop the -Level suffix (see
// example).
//
// In particular, this makes it easy to configure logging levels using YAML,
// TOML, or JSON files.
func (l *Level) UnmarshalText(text []byte) error {
	if l == nil {
		return errUnmarshalNilLevel
	}
	if !l.unmarshalText(text) && !l.unmarshalText(bytes.ToLower(text)) {
		return fmt.Errorf("unrecognized level: %q", text)
	}
	return nil
}

func (l *Level) unmarshalText(text []byte) bool {
	switch string(text) {
	case "debug", "DEBUG":
		*l = DebugLevel
	case "info", "INFO", "": // make the zero value useful
		*l = InfoLevel
	case "warn", "WARN":
		*l = WarnLevel
	case "error", "ERROR":
		*l = ErrorLevel
	case "dpanic", "DPANIC":
		*l = DPanicLevel
	case "panic", "PANIC":
		*l = PanicLevel
	case "fatal", "FATAL":
		*l = FatalLevel
	default:
		return false
	}
	return true
}

subcommand/connect-init/command.go Outdated Show resolved Hide resolved
subcommand/controller/command.go Outdated Show resolved Hide resolved
@kschoche
Copy link
Contributor Author

Nice work on this. Couple questions.

What about:

  • acl-init

@lkysow acl-init does not use the command logger, it just uses c.UI so I didn't do any changes to it.

Which brings us to your other 2 questions:

  1. switching everything to Zap instead of hclog - that wasn't in the scope of this PR but I think it's something we could do in the future. If my understanding of the current code is correct that would be something that is naturally additive to this PR and wouldn't require backing out anything from this one.
  2. In addition to point 1, not using c.UI at all, and using Zap instead everywhere? I think the current approach is to use c.UI for parsing/setup/etc and then to use the logger for anything meaningful/once the command "runs". I think there might be some value in keeping this, but I also see the value in standardizing on a single logging interface! We could discuss this in more depth.
  • tls-init

tls-init should be implemented in here? I see it at least, maybe my 👁️👁️ see into the future?

@kschoche
Copy link
Contributor Author

@lkysow I did a bit of digging regarding your question about switching everything to Zap from hclog:

I dug a bit into converting all hclog into logr.Logger (zap).

  1. I will have to touch every log line in subcommands/
  2. It is mostly convertingc.Log.Error("abcd", err) into c.Log.Error(err, "abcd")
  3. But also lots of this:
    c.log.Error(fmt.Sprintf("Error creating Consul client for addr %q: %s", serverAddr, err))
    to
    c.log.Error(err, "Error creating Consul client for", "addr", serverAddr, "error", err)
  4. We have at least one external package that requires a hclog.Logger interface as a parameter (godiscover)

So I'll open an issue for 1-3 above, and maybe we can try to also solve number 4 somehow else.. And then we'll just go ahead with this PR in its current form after I've fixed the comments!

Copy link
Contributor

@ndhanushkodi ndhanushkodi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR looks great!

I agree with the comments above for moving over to zap in a follow up PR.

For the UI logs, will it break users ci setups if they turn on JSON logging and there are a few interspersed UI logs? Will it just be a few unparseable logs? Just wondering the impact to see whether it should belong in this PR or not. If you think it's reasonable to push that out to a follow up though, that sounds good too.

subcommand/controller/command.go Outdated Show resolved Hide resolved
subcommand/controller/command.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ndhanushkodi ndhanushkodi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(copied from helm, had a thought when reviewing that one) For a command like consul-sidecar for connect-injected pods that doesn't have its logLevel/logJSON set in the helm template, do you need to add something to this PR to invoke that internal command with the right flag based on the global value?

@kschoche
Copy link
Contributor Author

(copied from helm, had a thought when reviewing that one) For a command like consul-sidecar for connect-injected pods that doesn't have its logLevel/logJSON set in the helm template, do you need to add something to this PR to invoke that internal command with the right flag based on the global value?

Oh great catch! I've plumbed that through in the handler now.

subcommand/common/common.go Outdated Show resolved Hide resolved
subcommand/common/common.go Outdated Show resolved Hide resolved
@kschoche kschoche merged commit 7d7ce5d into master Jun 29, 2021
@kschoche kschoche deleted the add_json_loggin_flags branch June 29, 2021 20:59
@david-yu
Copy link
Contributor

david-yu commented Jul 6, 2021

@kschoche I believe the Changelog somehow got placed in the wrong place after merging? It looks like it is now under Improvements under 0.26.0? I don't think was released in 0.26.0 is that correct?

@kschoche
Copy link
Contributor Author

kschoche commented Jul 6, 2021

@kschoche I believe the Changelog somehow got placed in the wrong place after merging? It looks like it is now under Improvements under 0.26.0? I don't think was released in 0.26.0 is that correct?

@david-yu you're not wrong! This was a nasty merge by git.. let me fix that..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow JSON logging for consul-k8s
4 participants