KEP: Rebase k8s images to distroless #900

Merged
merged 1 commit into from
Apr 3, 2019

Conversation

yuwenma
Contributor

@yuwenma yuwenma commented Mar 17, 2019

No description provided.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 17, 2019
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 17, 2019
@k8s-ci-robot
Contributor

Hi @yuwenma. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/auth Categorizes an issue or PR as relevant to SIG Auth. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 17, 2019
@yuwenma
Contributor Author

yuwenma commented Mar 17, 2019

/assign tallclair

@justaugustus
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 17, 2019
@tallclair
Member

I don't think sig-auth should own this, but it's not clear to me which sig should. Maybe @kubernetes/sig-release ?

Member

@liggitt liggitt left a comment

> I don't think sig-auth should own this, but it's not clear to me which sig should. Maybe @kubernetes/sig-release ?

that would make sense to me

@k8s-ci-robot k8s-ci-robot added the sig/release Categorizes an issue or PR as relevant to SIG Release. label Mar 18, 2019
@yuwenma
Contributor Author

yuwenma commented Mar 18, 2019

Thanks guys. The sig-arch group also suggests using sig-release. Updated. PTAL

@yuwenma
Contributor Author

yuwenma commented Mar 18, 2019

/assign justaugustus

Member

@johnbelamaric johnbelamaric left a comment

A bunch of minor typo fixes. One question, though: have you estimated what it would take to do kube-proxy too?

keps/sig-release/20190316-rebase-images-to-distroless.md Outdated Show resolved Hide resolved
@yuwenma yuwenma changed the title KEP: Rebase k8s images to distroless [WIP] KEP: Rebase k8s images to distroless Mar 21, 2019
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 21, 2019
@yuwenma
Contributor Author

yuwenma commented Mar 21, 2019

Thanks @tallclair @johnbelamaric and @ixdy for the feedback.

I think I need to provide more implementation details in the proposal and test sections. I noticed that the cloud providers (GCP, Azure, Amazon, etc.) share the script for the base image part (k8s.gcr.io/debian-base --> distroless/static), but each also has its own manifest configuration (including a shell and log redirection to start a container). Therefore, before rebasing the images, the manifests for the different cloud providers need to be updated to avoid unexpected dependencies (use of /bin/sh and the corresponding parameter format changes). I would like to provide more details on what should be changed and how the tests are designed, especially for the core master components, and loop in members from the other cloud provider teams. At a minimum, each cloud provider should be aware of its manifest changes and own at least the testing part.

Change the KEP back to WIP.

@tallclair
Member

> I noticed that the cloud providers (GCP, Azure, Amazon, etc.) share the script for the base image part (k8s.gcr.io/debian-base --> distroless/static), but each also has its own manifest configuration (including a shell and log redirection to start a container)

Can you share some code pointers to clarify? At some point we may just need to document this as "action required".

@kubernetes/sig-cloud-provider

@k8s-ci-robot k8s-ci-robot merged commit 0dfda67 into kubernetes:master Apr 3, 2019
dekkagaijin pushed a commit to GoogleCloudPlatform/k8s-metadata-proxy that referenced this pull request Apr 11, 2019
This is part of the effort described in KEP kubernetes/enhancements#900
yuwenma added a commit to yuwenma/kubernetes that referenced this pull request Apr 12, 2019
This is part of the effort described in KEP kubernetes/enhancements#900
The "pause" image is broadly used by e2e tests (test-infra/kubetest) and tooling under ./hack. Rebasing "pause" to distroless helps improve test coverage and align the development environment with the release environment.
yuwenma added a commit to yuwenma/kubernetes that referenced this pull request Apr 12, 2019
Busybox has more severe CVE issues.
This is part of the effort described in KEP kubernetes/enhancements#900. We can't change the base image to distroless directly because a bash script is used.
Moved to debian-base and listed it as a temporary exception for now.
yuwenma added a commit to yuwenma/kubernetes that referenced this pull request Apr 12, 2019
Busybox has more severe CVE issues.
This is part of the effort described in KEP kubernetes/enhancements#900. We can't change the base image to distroless directly because a bash script is used.
Moved to debian-base and listed it as a temporary exception for now.
losipiuk pushed a commit to losipiuk/autoscaler that referenced this pull request Apr 15, 2019
This is part of the effort described in KEP kubernetes/enhancements#900
This mitigates the effect of CVE issues.
dekkagaijin pushed a commit to dekkagaijin/kubernetes that referenced this pull request May 2, 2019
sunny0826 added a commit to sunny0826/kubernetes that referenced this pull request May 9, 2019
* Refactor PV scheduling library into separate package

* Update scheduler to use new volume scheduling library

To fix scheme issue, use k8s.io/client-go/kubernetes/scheme instead of
legacyscheme.

* kubeadm: do unit testing of actual public function

Even though CreateServiceAccountKeyAndPublicKeyFiles() is
an interface function, it is not unit tested. Instead, it wraps a couple
of internal functions that are used only inside CreateServiceAccountKeyAndPublicKeyFiles(),
and those internal functions are tested.

Rewrite the function to do only what it's intended to do and add unit
tests for it.

* kubeadm: Add certificateKey field to v1beta2 config

This change introduces config fields to the v1beta2 format, that allow
certificate key to be specified in the config file. This certificate key is a
hex encoded AES key, that is used to encrypt certificates and keys, needed for
secondary control plane nodes to join. The same key is used for the decryption
during control plane join.
It is important to note, that this key is never uploaded to the cluster. It can
only be specified on either command line or the config file.
The new fields can be used like so:

---
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
certificateKey: "yourSecretHere"
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
controlPlane:
  certificateKey: "yourSecretHere"
---
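
As a rough illustration of what a hex-encoded AES key looks like, the following Go sketch generates one and validates it on the way back in. The 32-byte key size (AES-256) and the function names `newCertificateKey` and `decodeCertificateKey` are assumptions made for this example; they are not taken from kubeadm's implementation.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// keySize is an assumption for this sketch: 32 bytes corresponds to AES-256.
const keySize = 32

// newCertificateKey returns a random key encoded as a hex string,
// in the form expected by a certificateKey field.
func newCertificateKey() (string, error) {
	buf := make([]byte, keySize)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// decodeCertificateKey turns the hex string back into raw key bytes
// and rejects values that are not hex or have the wrong length.
func decodeCertificateKey(s string) ([]byte, error) {
	key, err := hex.DecodeString(s)
	if err != nil {
		return nil, fmt.Errorf("certificate key is not valid hex: %v", err)
	}
	if len(key) != keySize {
		return nil, fmt.Errorf("certificate key has %d bytes, want %d", len(key), keySize)
	}
	return key, nil
}

func main() {
	key, err := newCertificateKey()
	if err != nil {
		panic(err)
	}
	fmt.Println("certificateKey:", key)
}
```

Under these assumptions, the placeholder value "yourSecretHere" from the snippet above would be rejected by `decodeCertificateKey`, since it is not valid hex.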

Signed-off-by: Rostislav M. Georgiev <[email protected]>

* remove soak test cauldron

* Add quota admission test for decreasing usage without covering quota

* Remove unnecessary custom conversions

* generated

* Fix describe error of Successful Job History Limit

* Use latest etcd from release-3.3 branch for dropping ugorji

Pick up changes from:
etcd-io/etcd#10675

Change-Id: Ic4d6daa3c54824d3d27809a125b798e88db0bf7e

* Add johnbelamaric to sig-network-{reviewers,approvers}

* Bump metadata-proxy image to v0.1.12

Rebases the image on `gcr.io/distroless/static:latest` per kubernetes/enhancements#900

https://github.com/GoogleCloudPlatform/k8s-metadata-proxy/releases/tag/v0.1.12

* Remove encryption via locally stored key.

* Fix golint failures of e2e/framework/providers/gce/recreate_node.go

This fixes golint failures of the following file:

test/e2e/framework/providers/gce/recreate_node.go

And also, replaces functions using gomega with framework functions.

* GCE/Windows: force kill the stackdriver processes when necessary

The StackdriverLogging service sometimes cannot be stopped properly. This
works around the bug by force killing the processes.

* fix golint

* fix golint failures of pkg/apis/scheduling pkg/apis/storage/util pkg/apis/storage/v1/util pkg/apis/storage/v1beta1/util

* Avoid duplicate error reporting in glusterfs

Signed-off-by: Humble Chirammal <[email protected]>

* kubeadm: Fix omitempty in v1beta2

There are a couple of problems with regard to the `omitempty` in v1beta1:

- It is not applied to certain fields. This makes emitting YAML configuration
  files in v1beta1 config format verbose by both kubeadm and third party Go
  lang tools. Certain fields, that were never given an explicit value would
  show up in the marshalled YAML document. This can cause confusion and even
  misconfiguration.

- It can be used in inappropriate places. In this case it's used for fields,
  that need to be always serialized. The only one such field at the moment is
  `NodeRegistrationOptions.Taints`. If the `Taints` field is nil, then it's
  defaulted to a slice containing a single control plane node taint. If it's
  an empty slice, no taints are applied, thus, the cluster behaves differently.
  With that in mind, a Go program, that uses v1beta1 with `omitempty` on the
  `Taints` field has no way to specify an explicit empty slice of taints, as
  this would get lost after marshalling to YAML.

To fix these issues the following is done in this change:

- A whole bunch of additional omitemptys are placed at many fields in v1beta2.
- `omitempty` is removed from `NodeRegistrationOptions.Taints`
- A test, that verifies the ability to specify empty slice value for `Taints`
  is included.
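
The nil-versus-empty distinction described above can be reproduced with a minimal Go sketch. It uses `encoding/json` and two hypothetical structs, `withOmitEmpty` and `withoutOmitEmpty`, as stand-ins for `NodeRegistrationOptions`; the real type and its YAML marshalling path are not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withOmitEmpty and withoutOmitEmpty are hypothetical stand-ins for
// NodeRegistrationOptions; only the struct tag differs.
type withOmitEmpty struct {
	Taints []string `json:"taints,omitempty"`
}

type withoutOmitEmpty struct {
	Taints []string `json:"taints"`
}

// marshal returns the JSON encoding of v as a string.
func marshal(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// With omitempty, a nil slice and an empty slice both disappear,
	// so the reader cannot tell them apart.
	fmt.Println(marshal(withOmitEmpty{Taints: nil}))        // {}
	fmt.Println(marshal(withOmitEmpty{Taints: []string{}})) // {}

	// Without omitempty, the explicit empty slice survives marshalling.
	fmt.Println(marshal(withoutOmitEmpty{Taints: nil}))        // {"taints":null}
	fmt.Println(marshal(withoutOmitEmpty{Taints: []string{}})) // {"taints":[]}
}
```

Because the field is dropped in both `omitempty` cases, a consumer that defaults a missing `Taints` value would apply the default even when the caller asked for no taints.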

Signed-off-by: Rostislav M. Georgiev <[email protected]>

* fix scheduler plugin example

* Use any host that mounts the datastore to create Volume

This change also makes zones work per datacenter and cleans up dummy VMs.
There can be multiple datastores found for a given name, because the datastore name is
unique only within a datacenter. So this commit returns a list of datastores
for a given datastore name in the FindDatastoreByName() method. The callers are
responsible for handling or finding the right datastore to use among those returned.

* Allow to define exec credential plugin config options from kubectl

This commit adds support of setting config options to the exec plugin
from cli.

The following options are added:
  * --exec-command new command for the exec credential plugin
  * --exec-api-version API version of the exec credential plugin.
  * --exec-arg new arguments for the exec credential plugin command
  * --exec-env add, update or remove environment values for the exec credential plugin

* Move auth and network tests to use framework/log

This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages.

* Require version match to special-case status objects

* test/e2e/network: Honor --dns-domain in more places

Try to finish what commit 4c8a65a started; that is, do not assume
cluster.local is a constant base domain, when it is configurable.

This makes DNS e2e tests pass with --dns-domain, which was only being honored
for some tests, not all
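
A minimal sketch of the idea, assuming a hypothetical helper `serviceFQDN` that receives the configured domain instead of hard-coding it. The helper name and the fallback behaviour are illustrative only and are not taken from the e2e framework.

```go
package main

import "fmt"

// defaultDNSDomain is used only when no domain is configured.
const defaultDNSDomain = "cluster.local"

// serviceFQDN builds <service>.<namespace>.svc.<domain>. An empty
// dnsDomain falls back to the default instead of being assumed everywhere.
func serviceFQDN(service, namespace, dnsDomain string) string {
	if dnsDomain == "" {
		dnsDomain = defaultDNSDomain
	}
	return fmt.Sprintf("%s.%s.svc.%s", service, namespace, dnsDomain)
}

func main() {
	fmt.Println(serviceFQDN("kubernetes", "default", ""))
	fmt.Println(serviceFQDN("kubernetes", "default", "example.internal"))
}
```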

Signed-off-by: Tobias Wolf <[email protected]>

* Generate meta/v1 protobuf

* Self nominate cmluciano as a sig-network reviewer

Signed-off-by: Christopher M. Luciano <[email protected]>

* Verify apimachinery protobuf

* Add jan and msau42 as approver for volumemanager

* Expect the correct object type to be removed

* Lock GCERegionalPersistentDisk feature on

* Move node, windows, and autoscaling tests to framework/log

This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages.

* GCE/Windows: send container logs to the proper resource

This PR fixes a bug where all container logs are sent to the "k8s_node"
resource by adding a "match" directive that applies only to container
logs.

* tableprinter: simplifies default printer handler

* Move framework/upgrade_util.go to framework/lifecycle/upgrade.go

Signed-off-by: Jiatong Wang <[email protected]>

* make example plugins conform with the PluginFactory type

* Store runtimeHandler for the PodSandboxStatus in FakeRuntimeService

Include the RuntimeHandler in ListPodSandbox

Signed-off-by: Aldo Culquicondor <[email protected]>

* durable link to cherry pick instructions

* Union all CPUSets in one round

* Refactored framework deployment utils

This is the continuation of the refactoring of framework/deployment_utils.go
into framework/deployment.

Signed-off-by: Jorge Alarcon Ochoa <[email protected]>

* Added function to create kubeconfig for addon-manager

* fix(daemon): create more expectations when skipping pods

* Remove terminated pod from summary api.

Signed-off-by: Lantao Liu <[email protected]>

* Add detacher assert for csiAttacher

* add common func for NewAttacher and NewDetacher

* Create OWNERS in volume scheduling package

* Move scalability, upgrade, and common packages to framework/log

This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages.

* Move storage tests to use the framework/log package

This is part of the transition to using framework/log instead
of the Logf inside the framework package. This will help with
import size/cycles when importing the framework or subpackages.

* Allow to define kubeconfig file for OpenStack cloud provider

Currently, to build a Kubernetes client, the provider uses only the in-cluster config,
which does not work if the kubelet is not running as a pod.

This commit adds an ability to specify a path to the kubeconfig file if
necessary. If no value was provided, then the provider falls back to
in-cluster config.

* Move all private annotations to shared package and update code

* Fix go lint failures in a few packages

- pkg/controller/volume/persistentvolume/testing
- pkg/controller/volume/scheduling

* refer to constant to guarantee constant behavior

* Cleanup the workarounds for augmented NSGs since it has been GA

* replace errors.New(fmt.Sprintf()) with fmt.Errorf()
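
For reference, the two forms produce the same error text. The sketch below uses hypothetical function names and a made-up message to show the before and after side by side.

```go
package main

import (
	"errors"
	"fmt"
)

// verbose builds the message with Sprintf and then wraps it in errors.New.
func verbose(name string, code int) error {
	return errors.New(fmt.Sprintf("volume %q failed with code %d", name, code))
}

// concise produces the same error text in a single call.
func concise(name string, code int) error {
	return fmt.Errorf("volume %q failed with code %d", name, code)
}

func main() {
	a, b := verbose("pv-1", 2), concise("pv-1", 2)
	fmt.Println(a.Error() == b.Error()) // true
}
```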

* fix always printing EventTypeWarning due to err being overridden

* fix golint failures of pkg/kubectl/cmd/help pkg/kubectl/cmd/proxy pkg/kubectl/cmd/util/openapi

* fix increment-decrement lint error

* remove redundant else block

* Enhance the local-cluster-up.sh script to work with docker 19.03.0-beta3

* Modify e2e/kubectl tests to import e2elog.Log

Signed-off-by: Jiatong Wang <[email protected]>

* style: update several golint errors in winkernel

* Revert "github.com/beorn7/perks v0.0.0-20180321164747-3a771d992973"

This reverts commit a2ec981.

* renew-embedded-certs

* autogenerated

* [Distroless] Convert the GCE manifests for master containers.

* Touched containers: kube-apiserver, kube-scheduler,
kube-controller-manager.
* Remove the shell dependencies when starting the containers.
* Reformat the command parameters to ["Exec", "Param1", "Param2"]
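
A small sketch of what the exec-style array form looks like, assuming a hypothetical helper `execForm` plus an example binary path and flags chosen only for illustration. The helper splits naively on whitespace, so flags that contain quoted spaces are outside its scope.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// execForm turns a binary path and a whitespace-separated flag string
// into the ["Exec", "Param1", "Param2"] array form, so no shell is
// needed to parse the command line inside the container.
func execForm(binary, flags string) []string {
	return append([]string{binary}, strings.Fields(flags)...)
}

func main() {
	// The path and flags here are placeholders, not the real manifest values.
	cmd := execForm("/usr/local/bin/kube-scheduler", "--v=2 --leader-elect=true")
	out, err := json.Marshal(cmd)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With the array form, the container runtime executes the binary directly, which is what allows an image without `/bin/sh` to start it.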

* organize sig-net-api-{reviewers,approvers} in OWNERS_ALIASES

Signed-off-by: Christopher M. Luciano <[email protected]>

* kubeadm: upload the `ClusterConfiguration` during the upgrade

During the upgrade process, `kubeadm` will take the current
`ClusterConfiguration`, update the `KubernetesVersion` to the latest
version, and call to `UploadConfiguration`.

This change makes sure that when the mutation happens, not only the
`ClusterStatus` is mutated, but the `ClusterConfiguration` object
inside the `kubeadm-config` ConfigMap as well; it will contain the
new `KubernetesVersion`.

* Staging legacy AWS cloud provider

* feat: move klog from AddUnschedulableIfNotPresent into the call site

* Remove hyperkube short aliases used in local-up-cluster.sh

* Add RemainingItemCount to ListMeta

* Remove unused code from CSI e2e tests

* Split TestLoopbackHostPort into 2 tests

First, split the test into two: TestLoopbackHostPortIPv4 and TestLoopbackHostPortIPv6.
Then improve error handling so the test fails with an explicit error message when run on a host
that does not support IPv6 or IPv4.

* generated

* Treat NoCorrespondingTypeError as MissingVersionError

* Don't use mapfile as it isn't bash 3 compatible

* refactor: use e2elog.Logf instead of framework.Logf

* ensure that kubectl works when the master insecure port is disabled

* disable the apiserver insecure port by default in configure-helper

* fix other add statement

* Do one more level of casting to get the 'assumeCache'

* Unexport PrintTable function

* Bump ip-masq-agent version to v2.3.0. Enable nomasq for reserved IPs.

Added the non-masq ranges to configure-helper.sh so that GCE clusters
will have the non-masq IP ranges aligned with GKE clusters.

* Add Un-reserve extension point for the scheduling framework

* Modify e2e/lifecycle tests to import e2elog.Logf

Signed-off-by: Jiatong Wang <[email protected]>

* Revert "Add better logging when iptables-restore fails"

* Add --chunk-size=0 to disable pagination when listing nodes.

Otherwise the default of 500 is used which started breaking large
cluster tests, e.g.
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-scale-performance/1125672232488538115

* Add Release information to each of the conformance tests.

* Update CHANGELOG-1.15.md for v1.15.0-alpha.3.

* Promote spiffxp to approver, add oomichi as reviewer

* Avoid using tag filters for EC2 API where possible

For very large clusters these tag filters are not efficient within the
EC2 API and will result in rate limiting. Most of these queries have
filters that are targeted narrowly enough that the elimination of the
tags filter will not return significantly more data but will be executed
more efficiently by the EC2 API.

Additionally, some API wrappers did not support pagination despite the
underlying API calls being paginated. This change adds pagination to
prevent truncating the returned results.
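
The pagination point can be sketched generically in Go. The `page` type, the `fetchFunc` signature, and `fakeFetch` below are hypothetical stand-ins for a token-based paginated API; they do not mirror the EC2 client or the wrappers changed in this commit.

```go
package main

import "fmt"

// page is a hypothetical paginated response: some items plus a token
// for the next page (empty when there are no more pages).
type page struct {
	Items     []string
	NextToken string
}

// fetchFunc is a hypothetical stand-in for a paginated API call.
type fetchFunc func(token string) (page, error)

// listAll follows NextToken until it is empty, so results are never
// truncated to the first page.
func listAll(fetch fetchFunc) ([]string, error) {
	var all []string
	token := ""
	for {
		p, err := fetch(token)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Items...)
		if p.NextToken == "" {
			return all, nil
		}
		token = p.NextToken
	}
}

// fakeFetch serves three fixed pages for demonstration.
func fakeFetch(token string) (page, error) {
	pages := map[string]page{
		"":   {Items: []string{"i-1", "i-2"}, NextToken: "t1"},
		"t1": {Items: []string{"i-3"}, NextToken: "t2"},
		"t2": {Items: []string{"i-4"}},
	}
	p, ok := pages[token]
	if !ok {
		return page{}, fmt.Errorf("unknown token %q", token)
	}
	return p, nil
}

func main() {
	items, err := listAll(fakeFetch)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(items), items)
}
```

A wrapper that returned only the first call's `Items` would silently drop everything after the first page, which is the truncation this kind of loop avoids.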

* Staging the GCE Cloud Provider

**What type of PR is this?**
/kind cleanup

**What this PR does / why we need it**:
Staging the GCE Cloud Provider as part of KEP [20190125-removing-in-tree-providers](https://github.com/kubernetes/enhancements/blob/master/keps/sig-cloud-provider/20190125-removing-in-tree-providers.md). Staging repo setup here https://github.com/kubernetes/legacy-cloud-providers
Moves the GCE cloud provider implementation to staging.
This is in preparation for moving the cloud provider code out of tree entirely.
However we need it in staging while the code needs to be consumed both in/out of tree.

**Which issue(s) this PR fixes**:
Fixes #

**Special notes for your reviewer**:

**Does this PR introduce a user-facing change?**:

```
NONE
```

Updated import dependency tracking.
Factored in the cleanup from kubernetes#77412
Minor fix to go.mod.

* Make external driver storage class name generation contain a more random suffix in case of double generation in the same framework context (twice in the same test)

* get node zone info from k8s, added tests

* GCE/Windows: ignore stopping errors for stackdriver

* Move the GCERegionalPersistentDisk feature from cloud-provider directly to pkg/features, since it is no longer used in cloud-provider. This change prevents cloud-provider from bringing in apiserver and component-base (and csi-translation-lib from bringing those two in transitively)

* Remove spurious godeps.json files

* Revert "Add Un-reserve extension point for the scheduling framework"

This reverts commit 8b51825.

* Add initial wrappers for prometheus.Counter and prometheus.CounterVec. Also add wrapper around prometheus.Registry to customize control-flow

* make method names more succinct, improve documentation for posterity

* add version parsing to metrics framework, use build version information for registry version

* swap out internal reference to use unexported registry initializer

* move framework files to subdirectory for isolation

* add additional documentation around exposed functionality

* move files to component-base

* move global registry code into subdirectory 'legacyregistry'

* update dependencies (bring in prometheus and semver)

* Faster scheduler.

* move generic feature gate code from k8s.io/apiserver to k8s.io/component-base

* update import of generic featuregate code from k8s.io/apiserver/pkg/util/feature -> k8s.io/component-base/featuregate

* run update-vendor.sh

Signed-off-by: Andrew Sy Kim <[email protected]>

* remove apiserver deps to k8s.io/cloud-provider in publish-bot verify rules

Signed-off-by: Andrew Sy Kim <[email protected]>

* remove apiserver import restrictions for k8s.io/cloud-provider

Signed-off-by: Andrew Sy Kim <[email protected]>

* Update CHANGELOG-1.13.md for v1.13.6.

* Move framework ssh code to new package

The framework/ssh.go code was heavily used throughout the framework
and could be useful elsewhere but reusing those methods requires
importing all of the framework.

Extracting these methods to their own package for reuse.

Only a few methods had to be copied into this package from the
rest of the framework to avoid an import cycle.

*  fix duplicated imports of k8s code (kubernetes#77484)

* fix duplicated imports of api/core/v1

* fix duplicated imports of client-go/kubernetes

* fix duplicated imports of rest code

* change import name to more reasonable

* handle global registry version loading more than once (with different versions)
tim-smart pushed a commit to arisechurch/autoscaler that referenced this pull request Nov 22, 2022
This is part of the effort described in KEP kubernetes/enhancements#900
This mitigates the effect of CVE issues.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/release Categorizes an issue or PR as relevant to SIG Release. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

10 participants