Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Flow Visibility] Add Grafana and Clickhouse deployment file #3063

Merged
merged 3 commits into from
Mar 4, 2022

Conversation

zyiou
Copy link
Contributor

@zyiou zyiou commented Nov 29, 2021

This PR is for adding deployment file and custom plugin for Grafana configuration and setup for Clickhouse, which are the UI solution and database solution for flow visibility long term architecture.

Deployment file:
It adds deployment for Clickhouse and Grafana with provisioning datasource and dashboards configuration.
Grafana sankey plugin:
A panel plugin is initialized from Grafana plugin template with Sankey diagram using react-google-chart.

@codecov-commenter
Copy link

codecov-commenter commented Nov 29, 2021

Codecov Report

Merging #3063 (333076b) into main (3f30d84) will decrease coverage by 6.27%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3063      +/-   ##
==========================================
- Coverage   60.85%   54.57%   -6.28%     
==========================================
  Files         266      373     +107     
  Lines       26520    51224   +24704     
==========================================
+ Hits        16139    27956   +11817     
- Misses       8591    20862   +12271     
- Partials     1790     2406     +616     
Flag Coverage Δ
e2e-tests 53.54% <ø> (?)
integration-tests 36.31% <ø> (?)
kind-e2e-tests 47.67% <ø> (-0.29%) ⬇️
unit-tests 42.35% <ø> (+0.76%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/cniserver/pod_configuration_linux.go 26.31% <0.00%> (-40.36%) ⬇️
pkg/controller/ipam/antrea_ipam_controller.go 48.71% <0.00%> (-31.57%) ⬇️
pkg/controller/networkpolicy/endpoint_querier.go 61.46% <0.00%> (-29.97%) ⬇️
pkg/controller/egress/controller.go 62.19% <0.00%> (-26.26%) ⬇️
.../registry/networkpolicy/clustergroupmember/rest.go 64.28% <0.00%> (-23.95%) ⬇️
pkg/agent/cniserver/ipam/antrea_ipam_controller.go 55.93% <0.00%> (-23.83%) ⬇️
pkg/agent/cniserver/ipam/antrea_ipam.go 55.55% <0.00%> (-23.62%) ⬇️
pkg/agent/util/ethtool/ethtool_linux.go 46.66% <0.00%> (-23.34%) ⬇️
pkg/controller/ipam/validate.go 57.95% <0.00%> (-22.05%) ⬇️
pkg/agent/util/iptables/lock.go 60.00% <0.00%> (-21.82%) ⬇️
... and 340 more

@zyiou zyiou changed the title Add grafana deployment file and custom plugin [Flow Visibility]Add Grafana and Clickhouse deployment file Nov 30, 2021
@zyiou zyiou changed the title [Flow Visibility]Add Grafana and Clickhouse deployment file [Flow Visibility] Add Grafana and Clickhouse deployment file Nov 30, 2021
@zyiou zyiou added the area/flow-visibility Issues or PRs related to flow visibility support in Antrea label Nov 30, 2021
@zyiou
Copy link
Contributor Author

zyiou commented Nov 30, 2021

Hi @antoninbas and @tnqn , this PR is still in progress, but I have a quick question about how/where to serve Grafana custom plugin (currently in build/yamls/flow-visibility/grafana-custom-plugin).
Grafana custom plugin is for creating some custom panels that are not supported by Grafana yet (in our case, sankey diagram for visualizing throughput over a time range). It is generated by grafana toolkit https://grafana.com/tutorials/build-a-panel-plugin/#create-a-new-plugin, so there are several components like license, default settings and configuration for React etc. To serve and use it, we need to sign the plugin, package it and upload with release (sample workflow: https://github.com/zyiou/grafana-custom-plugin/blob/main/.github/workflows/build_plugin.yml)
Since the programming language and layout are quite different from Antrea, I’m considering whether we can serve this in a separate repository. What do you suggest?

@antoninbas
Copy link
Contributor

@zyiou could you put the plugin code under plugins/ maybe?

@zyiou zyiou force-pushed the zyiou/grafana branch 2 times, most recently from 34c8ab6 to 98a716c Compare December 6, 2021 01:51
@zyiou zyiou force-pushed the zyiou/grafana branch 2 times, most recently from 4808cff to 3cbdb3d Compare December 21, 2021 19:38
@zyiou zyiou marked this pull request as ready for review December 21, 2021 19:39
@zyiou zyiou force-pushed the zyiou/grafana branch 3 times, most recently from 68d89f8 to 403a5c0 Compare January 11, 2022 22:23
@heanlan
Copy link
Contributor

heanlan commented Jan 14, 2022

A quick question @zyiou . Do we also need to include the README file in order to deploy the clickhouse operator?

@zyiou
Copy link
Contributor Author

zyiou commented Jan 14, 2022

A quick question @zyiou . Do we also need to include the README file in order to deploy the clickhouse operator?

I'm planning to add all documentation once we have e2e work.

@dreamtalen dreamtalen force-pushed the zyiou/grafana branch 2 times, most recently from 545112f to ed3efe4 Compare January 19, 2022 22:59
@zyiou zyiou force-pushed the zyiou/grafana branch 3 times, most recently from b85c6a2 to 39c056b Compare January 26, 2022 00:35
@heanlan
Copy link
Contributor

heanlan commented Feb 28, 2022

I have another question: once we have the all the flow-visibility refactoring PRs merged, do we still want to keep the ELK related contents in the repo? shall we open another PR to remove all of them for v1.6 release? or postpone it to v1.7?

@heanlan
Copy link
Contributor

heanlan commented Feb 28, 2022

By using "Clickhouse-Grafana" we might indicate it as an alternative to "ELK", so in that way it makes sense. On the other hand, it's also very long.

I have 2 observations:

  • For consistency we might want to keep using "Flow Collector" terminology, at least for this release.
  • Instead of referring to specific products making up the stack I'd just call it "Enhanced Flow Collector"

Thanks, Salvatore. cc @jianjuns @antoninbas for opinions. Thanks!

@dreamtalen
Copy link
Contributor

I have another question: once we have the all the flow-visibility refactoring PRs merged, do we still want to keep the ELK related contents in the repo? shall we open another PR to remove all of them for v1.6 release? or postpone it to v1.7?

I think we could keep ELK related stuff in v1.6 release. In order to fully move to our new flow visibility arch, we also need some changes on Flow Aggregator side (deleting unused fields in new arch, debugging cqe test, etc), which may take a while before finished.

@jianjuns
Copy link
Contributor

I feel it depends on how complete the new implementation is (which you guys know better). If it is not that complete, probably let us still keep ELK support for a while.

For naming, another way is to name it something you guys like, not necessary description of the functionalities.

@heanlan
Copy link
Contributor

heanlan commented Feb 28, 2022

Another alternative I can think of is "Grafana Flow Collector".

  • it's shorter
  • satisfies the "Flow Collector" terminology consistency
  • from the users perspective they will only directly interact with Grafana

I'll go-ahead and do the refactoring if that sounds good to all of us.

@heanlan
Copy link
Contributor

heanlan commented Mar 1, 2022

Hi @antoninbas @jianjuns , would you mind having a look at this PR again?

@wsquan171 wsquan171 mentioned this pull request Mar 2, 2022
wsquan171 added a commit to wsquan171/antrea that referenced this pull request Mar 2, 2022
This change groups files brought from:
[Flow Visibility] Add Grafana and Clickhouse deployment file antrea-io#3063

Additional modifications are added and to be resolved when ^ PR is
merged.
Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not have further comments on the PR, but we need a reviewer with enough knowledge to approve.

wsquan171 added a commit to wsquan171/antrea that referenced this pull request Mar 2, 2022
This change groups files brought from:
[Flow Visibility] Add Grafana and Clickhouse deployment file antrea-io#3063

Additional modifications are added and to be resolved when ^ PR is
merged.
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Would be good if someone else beyond the person working on this PR could validate the installation instructions by trying them out.

build/yamls/flow-visibility/base/kustomization.yml Outdated Show resolved Hide resolved
build/yamls/flow-visibility/base/kustomize-config.yml Outdated Show resolved Hide resolved
build/yamls/flow-visibility/base/grafana.yml Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
ClickHouse as the data storage, and use Grafana as the data visualization and monitoring tool.

#### Deployment Steps

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we encourage users to edit the clickhouse-secret Secret and use non-default credentials? I think that may be a good idea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I added a credential configuration section in the latest commit for review, where I also added the Grafana login credentials configuration. There's one issue: flow-aggregator.yml also has a Secret with the same contents as clickhouse-secret. Their namespaces are different, so they cannot share the same Secret. Users need to make the changes at two different locations.

I feel it can be addressed if we later decide to merge flow-aggregator and flow-visibility namespaces into one. But at the current stage, I'm not sure whether we want to add this customization support. Do you have any opinions on that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds fine for now. I want us to support Helm as an installation method later one, and I believe it will make it easier to keep everything consistent.

docs/network-flow-visibility.md Outdated Show resolved Hide resolved
docs/network-flow-visibility.md Outdated Show resolved Hide resolved
Copy link
Contributor

@yanjunz97 yanjunz97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may need to mention Grafana flow collector in this paragraph

@heanlan
Copy link
Contributor

heanlan commented Mar 3, 2022

We may need to mention Grafana flow collector in this paragraph

Thanks @yanjunz97 , could you please also help to try out the instructions of deployment?

@heanlan heanlan marked this pull request as draft March 3, 2022 20:18
@heanlan heanlan marked this pull request as ready for review March 3, 2022 22:30
1. Add Grafana dashboards
2. Change deployment files arrangements and manifest generation and maintenance
3. Add documentation for flow-visibility solution

Signed-off-by: heanlan <[email protected]>
@yanjunz97
Copy link
Contributor

Overall LGTM. Would be good if someone else beyond the person working on this PR could validate the installation instructions by trying them out.

I have worked on the validation. I found an error in Grafana dashboard when trying to set a customized username and password for Clickhouse. After some discussions and troubleshooting with @heanlan , we found the root cause. Anlan has updated the files to fix it.

After this fix, I tried out all the installation instructions again and validated that they all work as expected.

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@heanlan
Copy link
Contributor

heanlan commented Mar 4, 2022

/test-all

@antoninbas
Copy link
Contributor

@heanlan can I go ahead and merge this PR?

@heanlan
Copy link
Contributor

heanlan commented Mar 4, 2022

Yes, please @antoninbas

@antoninbas antoninbas merged commit 1767a7d into antrea-io:main Mar 4, 2022
@dreamtalen dreamtalen mentioned this pull request Mar 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/flow-visibility Issues or PRs related to flow visibility support in Antrea
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants