-
Notifications
You must be signed in to change notification settings - Fork 370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix and improve the docker image building process of Flow Aggregator #2016
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2016 +/- ##
==========================================
- Coverage 65.12% 59.30% -5.83%
==========================================
Files 197 197
Lines 17407 17507 +100
==========================================
- Hits 11336 10382 -954
- Misses 4883 6011 +1128
+ Partials 1188 1114 -74
Flags with carried forward coverage won't be shown. Click here to find out more.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this.
|
||
FROM scratch | ||
|
||
LABEL maintainer="Antrea <[email protected]>" | ||
LABEL description="The docker image for the flow aggregator" | ||
|
||
USER root | ||
COPY --from=flow-aggregator-build /antrea/bin/flow-aggregator / | ||
COPY --from=flow-aggregator-build /new_tmp /tmp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of tmp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's needed for logs. Added a comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry could you clarify? which logs?
The flow-aggregator should not emit logs to /tmp
, everything should go to stdout / stderr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I was adding /tmp
directory because got this error:
>>> kubectl logs flow-aggregator-78cc9f6c78-5jwpp -n flow-aggregator
I0331 20:33:40.099906 1 log_file.go:99] Set log file max size to 104857600
I0331 20:33:40.099906 1 log_file.go:99] Set log file max size to 104857600
log: exiting because of error: log: cannot create log: open /tmp/flow-aggregator.flow-aggregator-78cc9f6c78-5jwpp.unknownuser.log.INFO.20210331-203340.1: no such file or directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to https://github.com/vmware-tanzu/antrea/blob/bb6cd1016e2d2863847ce13052b951e27a654b14/build/yamls/flow-aggregator/base/flow-aggregator.yml#L125, the logs should go to /var/log/flowaggregator
, where does the /tmp
come from?
The intent is to have this directory mounted from the host actually (see YAML for Antrea Controller) which is not currently the case for flow-aggregator.yml. This needs to be addressed as well, there is no need to create the directory in the container image (it will be created dynamically by the volume mount).
The unknownuser
is not ideal, I think you can solve it by adding ENV USER root
to the Dockerfile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, it seems the /tmp
is used by klog. I solved it by adding --logtostderr
to let logs go to stderr.
Makefile
Outdated
.PHONY: flow-aggregator-ubuntu | ||
flow-aggregator-ubuntu: | ||
.PHONY: flow-aggregator-image | ||
flow-aggregator-image: | ||
@echo "===> Building antrea/flow-aggregator Docker image <===" | ||
ifneq ($(NO_PULL),) | ||
docker build -t antrea/flow-aggregator:$(DOCKER_IMG_VERSION) -f build/images/flow-aggregator/Dockerfile --build-arg OVS_VERSION=$(OVS_VERSION) . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OVS_VERSION can be removed. Same for code coverage image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, addressed.
FROM golang:1.15 as flow-aggregator-build | ||
|
||
WORKDIR /antrea | ||
|
||
COPY . /antrea | ||
|
||
RUN make flow-aggregator | ||
RUN CGO_ENABLED=0 make flow-aggregator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment would be informative here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, addressed.
06a107f
to
e19bf9c
Compare
- --logtostderr=false | ||
- --log_dir=/var/log/flowaggregator | ||
- --alsologtostderr | ||
- --log_file_max_size=100 | ||
- --log_file_max_num=4 | ||
- --logtostderr | ||
- --v=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change may be orthogonal to this PR. What's your take on this @srikartati?
It's fine to keep logging to /var/log/flowaggregator
as long we use a hostPath volume, unless there is a bug in klog an it tries to log to /tmp
even when --log_dir
is provided (which I would find surprising).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think this change seems to be orthogonal.
https://github.com/vmware-tanzu/antrea/blob/main/cmd/flow-aggregator/main.go#L51
We expect log limits here. Not sure what default values or how does this function will react. @dreamtalen do you know what happens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I quickly tried this patch by adding the missing hostPath volume. I did not get any error that logs are written to /tmp.
Here is my diff on top of this patch:
~/work/antrea_all/antrea$ git diff
diff --git a/build/yamls/flow-aggregator/base/flow-aggregator.yml b/build/yamls/flow-aggregator/base/flow-aggregator.yml
index 5dd33d93..7e7ac18d 100644
--- a/build/yamls/flow-aggregator/base/flow-aggregator.yml
+++ b/build/yamls/flow-aggregator/base/flow-aggregator.yml
@@ -124,7 +124,11 @@ spec:
- args:
- --config
- /etc/flow-aggregator/flow-aggregator.conf
- - --logtostderr
+ - --logtostderr=false
+ - --log_dir=/var/log/flow-aggregator
+ - --alsologtostderr
+ - --log_file_max_size=100
+ - --log_file_max_num=4
- --v=0
name: flow-aggregator
image: flow-aggregator
@@ -135,9 +139,14 @@ spec:
name: flow-aggregator-config
readOnly: true
subPath: flow-aggregator.conf
+ - mountPath: /var/log/flow-aggregator
+ name: host-var-log-flow-aggregator
serviceAccountName: flow-aggregator
volumes:
- name: flow-aggregator-config
configMap:
name: flow-aggregator-configmap
-
+ - name: host-var-log-flow-aggregator
+ hostPath:
+ path: /var/log/flow-aggregator
+ type: DirectoryOrCreate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for checking let's go with this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think keeping the same behavior as antrea-controller for logging is better.
@antoninbas Do you know what is the benefit of logging this to a separate directory along with stderr? Where will the directory be used? I see that log collection in e2e tests is also using "kubectl logs.."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was to have all the logs in one directory on the Node (/var/log/antrea
). But it doesn't make too much sense for the Antrea Controller and the Flow Aggregator which are Deployments, and not DaemonSets. You can either match the Antrea Controller and we can revisit this decision later for both, or you can switch to only logging to stdout / stderr (remove file logging) for the Flow Aggregator (leave the Antrea Controller alone in this PR though). I don't have a strong preference. Maybe the second solution simplifies things and is the right way to go? Since we were not actually logging to the Node filesystem earlier (missing volume mount) there isn't any loss of functionality. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Antonin.
From Jianjun's PR, I see that it can be used for the log bundle. If that is the case, can we use /var/log/antrea
instead of /var/log/flow-aggregator
here as a solution? Earlier, the missing log functionality in the flow aggregator is overlooked. We can fix that here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I think you can use /var/log/antrea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Srikar for helping me try adding the hostpath, updated PR with /var/log/antrea
log path.
e19bf9c
to
d10d3c0
Compare
/test-all |
/test-ipv6-e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
build/yamls/flow-aggregator.yml
Outdated
serviceAccountName: flow-aggregator | ||
volumes: | ||
- configMap: | ||
name: flow-aggregator-configmap-hf78268hm6 | ||
name: flow-aggregator-config | ||
- hostPath: | ||
path: /var/log/antrea |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably better to have a sub dir for flow aggregtor, as the same log dir can be used for other components too (agent/controller/ovs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong preference here. The agent and controller logs are already in the same directory, and the component name is part of the log file name. I am fine with a sub directory though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but today only agent/controller logs are in the top dir; OVS is not. I prefer a sub-dir for flow aggregator too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated the log directory to /var/log/antrea/flow-aggregator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Realized that code-coverage patch needs to be changed as well otherwise code-coverage manifestation will not work.
https://github.com/vmware-tanzu/antrea/blob/main/test/e2e/coverage/flow-aggregator-arg-file
Fix the error of Flow Aggregator image after PR antrea-io#2004 changed the base image to scratch. Change the image name target "flow-aggregator-ubuntu" to "flow-aggregtor-image" in Makefile since ubuntu is not relevant. Change the base image of Flow Aggregator code coverage image to ubuntu:20.04.
d10d3c0
to
8c9dcf0
Compare
Thanks, addressed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test-conformance |
Kind tests with Antrea network policies failed because e2e tests have timed out after 50 mins. This is unrelated to Flow Aggregator. All the other tests have passed. Merging it. |
Fix the error of Flow Aggregator image after PR #2004 changed the base image to scratch.
Change the image name target "flow-aggregator-ubuntu" to "flow-aggregtor-image" in Makefile since ubuntu is not relevant.
Change the base image of Flow Aggregator code coverage image to ubuntu:20.04.