Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix and improve the docker image building process of Flow Aggregator #2016

Merged
merged 1 commit into from
Apr 1, 2021

Conversation

dreamtalen
Copy link
Contributor

Fix the error of Flow Aggregator image after PR #2004 changed the base image to scratch.
Change the image name target "flow-aggregator-ubuntu" to "flow-aggregtor-image" in Makefile since ubuntu is not relevant.
Change the base image of Flow Aggregator code coverage image to ubuntu:20.04.

@codecov-io
Copy link

codecov-io commented Mar 31, 2021

Codecov Report

Merging #2016 (d10d3c0) into main (08ea67c) will decrease coverage by 5.82%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2016      +/-   ##
==========================================
- Coverage   65.12%   59.30%   -5.83%     
==========================================
  Files         197      197              
  Lines       17407    17507     +100     
==========================================
- Hits        11336    10382     -954     
- Misses       4883     6011    +1128     
+ Partials     1188     1114      -74     
Flag Coverage Δ
kind-e2e-tests 44.59% <ø> (-11.57%) ⬇️
unit-tests 41.53% <ø> (-0.16%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/apis/controlplane/helper.go 25.00% <0.00%> (-75.00%) ⬇️
pkg/apis/controlplane/v1beta2/helper.go 25.00% <0.00%> (-75.00%) ⬇️
pkg/controller/networkpolicy/mutate.go 0.00% <0.00%> (-71.77%) ⬇️
pkg/controller/networkpolicy/store/group.go 5.00% <0.00%> (-70.00%) ⬇️
pkg/controller/networkpolicy/validate.go 0.00% <0.00%> (-69.21%) ⬇️
pkg/k8s/name.go 33.33% <0.00%> (-66.67%) ⬇️
pkg/agent/controller/networkpolicy/packetin.go 9.93% <0.00%> (-59.01%) ⬇️
pkg/apiserver/handlers/webhook/mutation_crd.go 0.00% <0.00%> (-52.64%) ⬇️
pkg/apiserver/handlers/webhook/validation_crd.go 0.00% <0.00%> (-52.64%) ⬇️
...ver/registry/controlplane/nodestatssummary/rest.go 50.00% <0.00%> (-50.00%) ⬇️
... and 43 more

Copy link
Member

@srikartati srikartati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this.


FROM scratch

LABEL maintainer="Antrea <[email protected]>"
LABEL description="The docker image for the flow aggregator"

USER root
COPY --from=flow-aggregator-build /antrea/bin/flow-aggregator /
COPY --from=flow-aggregator-build /new_tmp /tmp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of tmp?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed for logs. Added a comment here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry could you clarify? which logs?
The flow-aggregator should not emit logs to /tmp, everything should go to stdout / stderr

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I was adding /tmp directory because got this error:

 >>> kubectl logs flow-aggregator-78cc9f6c78-5jwpp -n flow-aggregator
 I0331 20:33:40.099906       1 log_file.go:99] Set log file max size to 104857600
 I0331 20:33:40.099906       1 log_file.go:99] Set log file max size to 104857600
 log: exiting because of error: log: cannot create log: open /tmp/flow-aggregator.flow-aggregator-78cc9f6c78-5jwpp.unknownuser.log.INFO.20210331-203340.1: no such file or directory

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://github.com/vmware-tanzu/antrea/blob/bb6cd1016e2d2863847ce13052b951e27a654b14/build/yamls/flow-aggregator/base/flow-aggregator.yml#L125, the logs should go to /var/log/flowaggregator, where does the /tmp come from?
The intent is to have this directory mounted from the host actually (see YAML for Antrea Controller) which is not currently the case for flow-aggregator.yml. This needs to be addressed as well, there is no need to create the directory in the container image (it will be created dynamically by the volume mount).
The unknownuser is not ideal, I think you can solve it by adding ENV USER root to the Dockerfile

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it seems the /tmp is used by klog. I solved it by adding --logtostderr to let logs go to stderr.

Makefile Outdated
.PHONY: flow-aggregator-ubuntu
flow-aggregator-ubuntu:
.PHONY: flow-aggregator-image
flow-aggregator-image:
@echo "===> Building antrea/flow-aggregator Docker image <==="
ifneq ($(NO_PULL),)
docker build -t antrea/flow-aggregator:$(DOCKER_IMG_VERSION) -f build/images/flow-aggregator/Dockerfile --build-arg OVS_VERSION=$(OVS_VERSION) .
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OVS_VERSION can be removed. Same for code coverage image.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, addressed.

FROM golang:1.15 as flow-aggregator-build

WORKDIR /antrea

COPY . /antrea

RUN make flow-aggregator
RUN CGO_ENABLED=0 make flow-aggregator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment would be informative here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, addressed.

@srikartati srikartati added this to the Antrea v1.0 release milestone Mar 31, 2021
@dreamtalen dreamtalen force-pushed the local-fix-dockerfile branch 2 times, most recently from 06a107f to e19bf9c Compare March 31, 2021 21:41
Comment on lines 127 to 132
- --logtostderr=false
- --log_dir=/var/log/flowaggregator
- --alsologtostderr
- --log_file_max_size=100
- --log_file_max_num=4
- --logtostderr
- --v=0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change may be orthogonal to this PR. What's your take on this @srikartati?
It's fine to keep logging to /var/log/flowaggregator as long we use a hostPath volume, unless there is a bug in klog an it tries to log to /tmp even when --log_dir is provided (which I would find surprising).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think this change seems to be orthogonal.

https://github.com/vmware-tanzu/antrea/blob/main/cmd/flow-aggregator/main.go#L51
We expect log limits here. Not sure what default values or how does this function will react. @dreamtalen do you know what happens?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I quickly tried this patch by adding the missing hostPath volume. I did not get any error that logs are written to /tmp.

Here is my diff on top of this patch:

~/work/antrea_all/antrea$ git diff
diff --git a/build/yamls/flow-aggregator/base/flow-aggregator.yml b/build/yamls/flow-aggregator/base/flow-aggregator.yml
index 5dd33d93..7e7ac18d 100644
--- a/build/yamls/flow-aggregator/base/flow-aggregator.yml
+++ b/build/yamls/flow-aggregator/base/flow-aggregator.yml
@@ -124,7 +124,11 @@ spec:
       - args:
         - --config
         - /etc/flow-aggregator/flow-aggregator.conf
-        - --logtostderr
+        - --logtostderr=false
+        - --log_dir=/var/log/flow-aggregator
+        - --alsologtostderr
+        - --log_file_max_size=100
+        - --log_file_max_num=4
         - --v=0
         name: flow-aggregator
         image: flow-aggregator
@@ -135,9 +139,14 @@ spec:
           name: flow-aggregator-config
           readOnly: true
           subPath: flow-aggregator.conf
+        - mountPath: /var/log/flow-aggregator
+          name: host-var-log-flow-aggregator
       serviceAccountName: flow-aggregator
       volumes:
       - name: flow-aggregator-config
         configMap:
           name: flow-aggregator-configmap
-        
+      - name: host-var-log-flow-aggregator
+        hostPath:
+          path: /var/log/flow-aggregator
+          type: DirectoryOrCreate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking let's go with this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think keeping the same behavior as antrea-controller for logging is better.
@antoninbas Do you know what is the benefit of logging this to a separate directory along with stderr? Where will the directory be used? I see that log collection in e2e tests is also using "kubectl logs.."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to have all the logs in one directory on the Node (/var/log/antrea). But it doesn't make too much sense for the Antrea Controller and the Flow Aggregator which are Deployments, and not DaemonSets. You can either match the Antrea Controller and we can revisit this decision later for both, or you can switch to only logging to stdout / stderr (remove file logging) for the Flow Aggregator (leave the Antrea Controller alone in this PR though). I don't have a strong preference. Maybe the second solution simplifies things and is the right way to go? Since we were not actually logging to the Node filesystem earlier (missing volume mount) there isn't any loss of functionality. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Antonin.
From Jianjun's PR, I see that it can be used for the log bundle. If that is the case, can we use /var/log/antrea instead of /var/log/flow-aggregator here as a solution? Earlier, the missing log functionality in the flow aggregator is overlooked. We can fix that here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes I think you can use /var/log/antrea

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Srikar for helping me try adding the hostpath, updated PR with /var/log/antrea log path.

@srikartati
Copy link
Member

/test-all
/test-ipv6-only-e2e
/test-ipv6-ds-e2e

@srikartati
Copy link
Member

/test-ipv6-e2e

srikartati
srikartati previously approved these changes Apr 1, 2021
Copy link
Member

@srikartati srikartati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

serviceAccountName: flow-aggregator
volumes:
- configMap:
name: flow-aggregator-configmap-hf78268hm6
name: flow-aggregator-config
- hostPath:
path: /var/log/antrea
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to have a sub dir for flow aggregtor, as the same log dir can be used for other components too (agent/controller/ovs).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong preference here. The agent and controller logs are already in the same directory, and the component name is part of the log file name. I am fine with a sub directory though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but today only agent/controller logs are in the top dir; OVS is not. I prefer a sub-dir for flow aggregator too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated the log directory to /var/log/antrea/flow-aggregator

Copy link
Member

@srikartati srikartati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Realized that code-coverage patch needs to be changed as well otherwise code-coverage manifestation will not work.
https://github.com/vmware-tanzu/antrea/blob/main/test/e2e/coverage/flow-aggregator-arg-file

@srikartati srikartati dismissed their stale review April 1, 2021 18:10

needs more changes

Fix the error of Flow Aggregator image after PR antrea-io#2004 changed the base
image to scratch.
Change the image name target "flow-aggregator-ubuntu" to
"flow-aggregtor-image" in Makefile since ubuntu is not relevant.
Change the base image of Flow Aggregator code coverage image to
ubuntu:20.04.
@dreamtalen
Copy link
Contributor Author

Realized that code-coverage patch needs to be changed as well otherwise code-coverage manifestation will not work.
https://github.com/vmware-tanzu/antrea/blob/main/test/e2e/coverage/flow-aggregator-arg-file

Thanks, addressed

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@srikartati
Copy link
Member

/test-all
/test-ipv6-only-e2e
/test-ipv6-e2e

Copy link
Member

@srikartati srikartati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@srikartati
Copy link
Member

/test-conformance
/test-networkpolicy

@srikartati
Copy link
Member

Kind tests with Antrea network policies failed because e2e tests have timed out after 50 mins. This is unrelated to Flow Aggregator. All the other tests have passed. Merging it.

@srikartati srikartati merged commit 117cd94 into antrea-io:main Apr 1, 2021
@zyiou zyiou added area/flow-visibility Issues or PRs related to flow visibility support in Antrea area/flow-visibility/aggregator Issues or PRs related to Flow Aggregator labels Jun 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/flow-visibility/aggregator Issues or PRs related to Flow Aggregator area/flow-visibility Issues or PRs related to flow visibility support in Antrea
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants