build: Support Mounted Resource Volumes

Proposal to support mounts for Secrets and ConfigMaps in builds. This augments prior work which let Secrets and ConfigMaps be used as sources in builds. Unlike the current approach, this enhancement leverages buildah's volume mount feature to let content be available only at build time, and not at runtime. This is useful for builds that need to access private artifact repositories or RHEL subscription content. This current proposal builds on top of Ben Parees's original volume mount proposal. Supported volume types will be gated at the API level to ensure the volume type does not pose a security risk and is correctly lifecycled. To distinguish Kubernetes volumes+mounts from buildah's volume mount mechanism, the terms "volume content," "input volume," "buildah volume mount," and "buildah runtime environment" were introduced. The proposed API uses these terms to distinguish build volume mounts from pod volume mounts. Documentation requirements for this feature were added in the "Drawbacks" section of the proposal. Because the build controller uses the privileged security context constraint, build pods to bypass most security features in OpenShift. Therefore, future volume mounts in build pods need to be tightly controlled. Open-ended volume types like CSI volumes could be an attack vector if a developer uses an insecure CSI driver implementation. This proposal establishes graduation criteria for adding new volume source types to builds, including security, testing, lifecycle concerns, failure modes, and feature gating.
openshift · May 6, 2021 · b310d8c · b310d8c
1 parent ec37598
commit b310d8c
Showing 1 changed file with 395 additions and 0 deletions.
diff --git a/enhancements/builds/volume-mounted-resources.md b/enhancements/builds/volume-mounted-resources.md
@@ -0,0 +1,395 @@
+---
+title: volume-mounted-resources
+authors:
+  - @bparees
+  - @adambkaplan
+reviewers:
+  - @derekwaynecarr
+  - @smarterclayton
+  - @deads2k
+approvers:
+  - @bparees
+  - @deads2k
+  - @sttts
+creation-date: 2020-01-08
+last-updated: 2021-05-05
+status: implementable
+---
+
+# Volume Mounted Resources
+
+## Release Signoff Checklist
+
+- [x] Enhancement is `implementable`
+- [x] Design details are appropriately documented from clear requirements
+- [x] Test plan is defined
+- [ ] Operational readiness criteria is defined
+- [x] Graduation criteria for dev preview, tech preview, GA
+- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)
+
+## Summary
+
+Today builds support getting source input from configmaps and secrets, hereafter referred to as "resources".
+When users utilize this feature, the resource is volume-mounted into the build pod and placed in the "build context" within the build's execution environment, alongside other build sources like git source code.
+The next steps depend on whether it is an s2i or dockerfile build.
+
+For s2i builds, the generated Dockerfile contains commands to `ADD` the content at a path specified by the user, the assemble script is invoked, and then the injected content is zeroed out prior to committing the image via a `RUN rm` command added to the Dockerfile.
+
+For dockerfile builds, the user is instructed to add appropriate `ADD` and `RUN rm` commands to their dockerfile to inject the content that is available in the build's working directory (along with their application source, where applicable).
+
+There are a few undesirable aspects to this:
+
+1. In the dockerfile case, the content can still be found in lower layers of the image unless a layer squashing option is selected.
+2. Requires extra work by the user in the Dockerfile, so each Dockerfile must be customized
+
+This enhancement proposes to introduce an option to use buildah's capability to mount a volume at build time.
+The content mounted into the build pod would be then mounted into the container processing the Dockerfile, making that content available within the container so Dockerfile commands could reference it.
+No explicit `ADD` would be required, and since mounted content is not committed to the resulting image, no `RUN rm` equivalent is required to clean up the injected content.
+
+To avoid security and lifecycle concerns, the following volume types will be supported initially:
+
+1. Secrets
+2. ConfigMaps
+
+## Motivation
+
+### Goals
+
+* Simplify how users consume secret + configmap content in builds
+* Increase the security of protected content being injected to images
+* Simplify use cases that require consuming credentials during the build, but need to ensure those credentials do not end up in the output image.
+* Eventually extend this api to allow the mounting of other volumes (such as those backed by persistent storage)
+
+### Non-Goals
+
+* This enhancement should not result in a change of behavior for users of the existing secret/configmap injection api.
+* Provide immediate support for persistent volume claims in builds.
+  This is a long term goal that will be addressed in a future enhancement proposal.
+* Provide support to mount Secrets and ConfigMaps shared by the [projected resource CSI driver](/enhancements/cluster-scope-secret-volumes/csi-driver-host-injections.md).
+  This is a long term goal that will be addressed in a future enhancement proposal.
+
+## Proposal
+
+### User Stories [optional]
+
+The enabled use cases are essentially identical to what can be done with the configmap/secret input api in builds today, but with a better user experience and security as discussed above.
+It does not enable a new use case that is not already possible today, except that layer squashing will not be required.
+
+Future extensions to this enhancement could enable additional use cases, such as:
+
+- Persistent volumes that cache content
+- Shared Secrets and ConfigMaps, such a Simple Content Access certificate used to download RHEL content.
+
+These will be discussed in respective future enhancements.
+
+### Definitions
+
+The folllowing terms will be used in the remainder of the document to clarify the behavior of volume mounts in builds:
+
+- *Volume content*: contents used within a build that are not indended to be present in the resulting container image.
+- *Input Volume*: a Kubernetes Volume that is added to the build Pod, with the intent of being used as a transient bind mount within a buildah build.
+  See buildah's `--volume` [option](https://github.com/containers/buildah/blob/master/docs/buildah-bud.md#options).
+- *Buildah volume mount*: a directory that is added to Buildah's runtime environment as a transient bind mount.
+- *Container volume mount*: a volume that is mounted into a container within a Kubernetes pod.
+- *Buildah runtime environment*: the process that runs the buildah build within the build pod's main container.
+  In OCP 4.6 and higher, buildah uses OCI isolation and effectively runs as a sub-container within the build pod's main container.
+
+### Implementation Details/Notes/Constraints [optional]
+
+We will need to introduce a new mechanism in the build api which allows the user to indicate that they want to inject *volume content* into the build.
+Unlike source content, *volume content* is not intended to be included in the container image produced by an OpenShift build.
+Initially the only allowed volume types will be ConfigMaps and Secrets.
+The API will otherwise be similar to the existing secret/configmap injection api in which users identify the configmap/secret and the target path for injection.
+
+#### Build/BuildConfig API
+
+Volume content can be declared via a new `[]BuildVolume` field that is added to the Source and Docker strategy structs.
+These will define the *input volumes* that are appended to other volumes in the build pod.
+Items in the `[]BuildVolume` array will be translated directly into a pod `Volume`, using Kubernetes mechanisms to set up the volume for consumption by the build pod.
+
+Unlike Kubernetes, the types of volume sources that can be defined will be restricted to those explictly supported by Builds.
+Build pods are created with the privileged security context constraint, which allows a pod to mount any volume type with no SELinux restrictions.
+This SCC is inherited from the build controller.
+Restricting supported volume sources in the API ensures that builds only allow safe volume types to be mounted.
+
+The subset of supported volume sources will be defined by the `BuildVolumeSource` object.
+This will behave like the Kubernetes `VolumeSource` object, with the addition of a type discriminator that is set on Build/BuildConfig admission (if not specified).
+The OpenAPI schema must ensure that one of the volume sources are set.
+Initially only `Secret` and `ConfigMap` will be supported as valid Volume sources.
+The volume sources themselves will use Kubernetes volume source types, as the values in these volume sources will be passed directly to the build pod.
+
+The destination *buildah volume mount* will be controlled by the new `[]BuildVolumeMount` field, which is a subset of the Kubernetes `VolumeMount` API.
+The options in `BuildVolumeMount` will be constrained to those allowed in the *buildah volume mount*:
+
+- `MountPath` - the destination directory to place the *buildah volume mount* within buildah's runtime environment.
+- `ReadOnly` - whether or not the volume mount within buildah's runtime environment is read-only.
+
+Other options like `SubPath` and `MountPropagation` may be considered for support in the future.
+
+When the build pod is constructed, all *input volumes* will be mounted in a subdirectory of a fixed location in the build pod's main container - `/var/run/openshift.io/volumes`.
+The OpenShift builder process will infer the destination directory of these volumes inside the *buildah runtime environment* from the `Build` API object (already available via an environment variable, serialized as JSON).
+The logic that invokes buildah will then pass the buildah volume mounts as transient bind mount arguments.
+
+#### API Example
+
+```go
+// SourceBuildStrategy defines input parameters specific to an Source build.
+type SourceBuildStrategy struct {
+
+  // existing API
+  ...
+  // Note that these fields will also be added to the DockerBuildStrategy API
+
+  // Volumes is a list of input volumes that can be mounted into buildah's runtime environment.
+  // Only a subset of Kubernetes Volume sources are supported by builds.
+  // More info: https://kubernetes.io/docs/concepts/storage/volumes
+  Volumes []BuildVolume
+
+  // VolumeMounts are volumes to mount into buildah's runtime environment at the specified mount path.
+  VolumeMounts []BuildVolumeMount
+}
+```
+
+```go
+// BuildVolume describes a volume that is made available to build pods, such that it can be mounted into buildah's runtime environment.
+// Only a subset of Kubernetes Volume sources are supported by builds.
+type BuildVolume struct {
+
+  // Volume's name.
+	// Must be a DNS_LABEL, unique within the pod, and cannot collide with volumes that are always added by the build controller.
+	// More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
+	Name string 
+
+	// BuildVolumeSource represents the location and type of the mounted volume.
+	BuildVolumeSource `json:",inline" protobuf:"bytes,2,opt,name=volumeSource"`
+}
+
+type BuildVolumeSourceType string
+
+const (
+  BuildVolumeSourceTypeSecret = "Secret"
+  BuildVolumeSourceTypeConfigMap = "ConfigMap"
+)
+
+// Represents the source of a volume to mount.
+// Only one of its members may be specified.
+// This is a subset of the core Kubernetes VolumeSource.
+type BuildVolumeSource struct {
+
+  // Type is the type for the volume source.
+  // Type must match the populated volume source, and if not specified will be inferred on admission.
+  // Only one type of volume source can be specified.
+  Type BuildVolumeSourceType
+
+  // Secret represents a secret that should populate this volume.
+  // More info: https://kubernetes.io/docs/concepts/storage/volumes#secret
+  // +optional
+  Secret *kapi.SecretVolumeSource
+
+  // ConfigMap represents a configMap that should populate this volume
+  // +optional
+  ConfigMap *kapi.ConfigMapVolumeSource
+}
+```
+
+```go
+// BuildVolumeMount describes a mounting of a Volume within buildah's runtime environment.
+type BuildVolumeMount struct {
+  // This must match the Name of a BuildVolume, and cannot collide with volumes that are always added to the build pod by the build controller.
+  Name string
+  // Mounted read-only if true, read-write otherwise (false or unspecified).
+  // Defaults to false.
+  // +optional
+  ReadOnly bool
+  // Path within the build runtime environment at which the volume should be mounted.
+  // Must not contain ':', and cannot collide with a destination path generated by the builder process.
+  MountPath string
+}
+```
+
+Example usage (use of the existing secret/configmap injection api is included for comparison, it is not changing):
+
+```yaml
+apiVersion: v1
+items:
+- apiVersion: build.openshift.io/v1
+  kind: BuildConfig
+  metadata:
+    name: mybuild
+    namespace: p1
+  spec:
+    strategy:
+      sourceStrategy:
+        from:
+          kind: ImageStreamTag
+          name: nodejs:10-SCL
+          namespace: openshift
+        volumes:
+        - name: secret
+          secret:
+            secretName: somesecret
+        - name: config
+          configMap:
+            name: someconfigmap
+            items:
+            - key: somekey
+              path: volume/path/value.txt
+        volumeMounts:
+        - name: config
+          mountPath: /tmp/config
+        - name: secret
+          mountPath: /tmp/secret
+      type: Source
+    source:
+      secrets:
+      - secret: 
+          name: myOtherSecret
+        destinationDir: /tmp/othersecret
+      configMaps:
+      - configMap:
+          name: myOtherConfigMap
+        destinationDir: /tmp/otherconfig
+```
+
+#### Volume Name and Destination Collisions
+
+Builds today have several container volume mounts that are provided by the build controller, including:
+
+- Node pull secrets (host)
+- System configuration (ConfigMap)
+- Certificate authorities for the cluster proxy and internal registry (ConfigMap)
+- Blob metadata cache (host)
+- RHEL entitlements (host via cri-o configuration)
+
+In addition, the builder process generates the following buildah volume mounts:
+
+- RHEL entitlements (destination is `/run/secrets/etc-pki-entitlement`, `/run/secrets/redhat.repo`, and `/run/secrets/rhsm`)
+- Custom PKI trust bundle - destination is `/etc/pki/ca-trust`, added via the Build/BuildConfig's `mountTrustedCA` option.
+
+If an input volume's name collides with a volume created by the build controller, or implied by OpenShift's cri-o configuration, the build should fail to run.
+Automatically avoiding volume name collisions - for example, by adding randomized suffixes - can be addressed in a future enhancement.
+
+If a volume mount path collides with a buildah volume mount generated by the builder process, the build should fail to run.
+This is unavoidable - we cannot have two mount points share the same destination in the build.
+
+#### Failure behavior
+
+A BuildConfig object can reference a Secret or ConfigMap in a volume that does not exist in the BuildConfig's namespace (yet).
+If a Build is generated from a BuildConfig and a referenced Secret or ConfigMap does not exist, the build controller should report a failure status and not create a build pod.
+This behavior should also apply to Secrets or ConfigMaps referenced in the build source array.
+
+### Risks and Mitigations
+
+**Risk:** Build mounts can alter the behavior of the build itself (a security risk).
+
+*Mitigations:*
+
+- Supported volume types are gated by the API.
+  For example, `HostPath` volume mounts are not supported.
+- Input volumes are mounted in a fixed location within the build pod's container.
+  The container volume is created by the build controller and cannot be changed by an end user.
+- Builds fail to run if a volume name collides with a volume generated by the build controller.
+- Builds fail if the a volume mount path collides with a destination generated by the builder process.
+
+**Risk:** Supporting arbitrary volume sources can lead to privilege escalations.
+This is a particular concern for builds since build pods run privileged.
+
+*Mitigations:*
+
+- Only volume sources that are known to provide storage isolated from the host file system will be supported (Secrets and ConfigMaps).
+- Future supported volume types must ensure isolation from the host.
+  Gating must happen at the API or build controller config level, since the build controller uses the privileged SCC to create pods.
+
+**Risk:** Builds can mount content that the user does not have permission to access
+
+*Mitigation:*
+
+Volume mounts for Secrets and ConfigMaps use local object references within the same namespace as the Build.
+The assumption is that a Secret in a namespace is accessible to any service account within the namespace.
+Furthermore, the existing functionality for source Secrets and ConfigMaps allow any valid object of those types in the namespace to be included in the build.
+
+## Design Details
+
+## Open Questions [optional]
+
+1. Can we make this the default or even only behavior for builds?
+   No, need to make it opt-in to avoid potentially breaking existing buildconfigs.
+2. What happens if a user overrides the default volume mounts used by Builds today?
+   We prevent the build from running.
+   Overriding default volume mounts can alter build behavior, which can be a security risk.
+3. Can cluster admins alter the default `mounts.conf` configuration for cri-o?
+   No - `mounts.conf` cannot be configured via the cluster ContainerRuntimeConfig custom resource.
+   Altering `mounts.conf` via MachineConfig is unsupported.
+
+### Test Plan
+
+This feature will need new e2e tests that leverage the new api option.
+We have existing tests for configmap+secret injection would should be able to be copied+adapted to testing this feature relatively easily.
+
+### Graduation Criteria
+
+This should be introduced directly as a GA feature when it is implemented.
+
+Future enhancements can add additional volume source types.
+When adding new source types, the following questions need to be answered:
+
+- Can the build controller support the lifecycle of this volume source?
+- What should happen if the build is not able to mount a volume source? How will the status of the volume mount in the build pod be reflected in the build's status?
+- Does this volume source type introduce security risks?
+  If so, how will this risk be mitigated?
+- How will this volume source be tested?
+- How will this volume source be made available to users and cluster admins?
+  - Will cluster admins want to disable this volume source?
+    Or conversely, will cluster admins need to explicitly enable this volume source as a cluster feature?
+  - Will cluster admins want fine control over how this volume source can be mounted?
+
+#### Dev Preview -> Tech Preview
+
+N/A - this will be introduced as a GA feature.
+
+#### Tech Preview -> GA
+
+N/A - this will be introduced as a GA feature.
+
+#### Removing a deprecated feature
+
+N/A - this is a new feature.
+
+### Upgrade / Downgrade Strategy
+
+This will be added to the Build and BuildConfig APIs, which are provided by the aggregated OpenShift apiserver.
+On downgrade, the volume fields will not be returned but will remain in etcd storage until a subsequent write is made to the Build or BuildConfig object.
+Any Build which is invoked from a BuildConfig after a downgrade will risk losing the volume mount information (ex - via `oc start-build`, an image change trigger, etc.)
+
+### Version Skew Strategy
+
+The `BuildVolumeSource` API will include a type discriminator.
+As new volume source types are added to the API, the discriminator will ensure that the build controller will only mount volumes that it is aware of.
+
+## Implementation History
+
+2020-01-08: Initial proposal by @bparees
+2021-04-21: Revised proposal with edits by @adambkaplan
+2021-05-05: Implementable version
+
+## Drawbacks
+
+This API will overlap with our existing support for injected Secrets and ConfigMaps in build sources.
+The nuance of injecting a Secret or ConfigMap as a *buildah volume mount* can be difficult for end users to comprehend.
+The following critical question can guide users to what is best for their needs:
+
+*Do you want this content in the resulting image?*
+If yes, use build sources.
+If no, use build volumes.
+
+Documentation of this feature should encourage users to mount Secrets using the new Volume API, instead of the current "source" API.
+Users will need to opt-in to this feature - BuildConfigs that inject secrets today will need to be updated to take advantage of this feature.
+
+## Alternatives
+
+Updating the existing injection apis to have a "asVolume" field was considered as it would be a simpler implementation (more code reuse) but it was rejected as there is a long term goal to allow builds to mount traditional volumes as well.
+The existing injection api can't easily be extended to support such a thing, so the design proposed in this enhancement is a better stepping stone to that goal.
+This also puts us on a better path to support the following critical use cases:
+
+- Using volumes for caching build content between build invocations.
+- Using "cluster-scoped" credentials in builds - in particular simple content access certificates used to download RHEL subscription content.