fix(ci): Release 2.8.4 #5934

Merged
merged 60 commits into from
Sep 25, 2024

sakoush
Member

@sakoush sakoush commented Sep 25, 2024

No description provided.

sakoush and others added 30 commits July 17, 2024 15:21
update workflow to commit changelog to v2 branch after releases.
…ux/otelmux (#5758)

Bumps [go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/otelmux](https://github.com/open-telemetry/opentelemetry-go-contrib) from 0.52.0 to 0.53.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go-contrib@zpages/v0.52.0...zpages/v0.53.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/contrib/instrumentation/github.com/gorilla/mux/otelmux
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github.com/confluentinc/confluent-kafka-go/v2](https://github.com/confluentinc/confluent-kafka-go) from 2.4.0 to 2.5.0.
- [Release notes](https://github.com/confluentinc/confluent-kafka-go/releases)
- [Changelog](https://github.com/confluentinc/confluent-kafka-go/blob/master/CHANGELOG.md)
- [Commits](confluentinc/confluent-kafka-go@v2.4.0...v2.5.0)

---
updated-dependencies:
- dependency-name: github.com/confluentinc/confluent-kafka-go/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…5757)

Bumps [go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp](https://github.com/open-telemetry/opentelemetry-go-contrib) from 0.51.0 to 0.53.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go-contrib@zpages/v0.51.0...zpages/v0.53.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…#5756)

Bumps [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go) from 1.27.0 to 1.28.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](open-telemetry/opentelemetry-go@v1.27.0...v1.28.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github.com/confluentinc/confluent-kafka-go/v2](https://github.com/confluentinc/confluent-kafka-go) from 2.4.0 to 2.5.0.
- [Release notes](https://github.com/confluentinc/confluent-kafka-go/releases)
- [Changelog](https://github.com/confluentinc/confluent-kafka-go/blob/master/CHANGELOG.md)
- [Commits](confluentinc/confluent-kafka-go@v2.4.0...v2.5.0)

---
updated-dependencies:
- dependency-name: github.com/confluentinc/confluent-kafka-go/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This allows mounting local directories in a one-node kind cluster. It can be
useful (for example) for being able to load local models rather than requiring
them to be first copied to MinIO or a cloud bucket.

The changes here only add the basic support at the kind-cluster level. A
separate PR will update the helm-charts to allow for Server podSpec changes,
including adding a volume that points to the local folder now mounted in kind.

You can now set the following additional variables (overriding them with -e when
calling ansible-playbook is preferred):

- kind_local_mount (bool, default false): enable the kind extraMounts option when creating the cluster
- kind_host_path (string, default /tmp/kind-cluster): the local path to mount
- kind_container_path (string, default /host-mount): the "containerPath"
extraMounts setting - this (somewhat confusingly) defines the value that should be set as
the path in `volumes[i].hostPath.path` when adding a volume to the pod spec. Example:

```
kind: Pod
...
spec:
  volumes:
    - name: host-models
      hostPath:
        path: {{ kind_container_path }}
  ...
  containers:
    - image: ...
      volumeMounts:
        - name: host-models
          # the actual path inside the container
          mountPath: /mnt/host-models
```
* feat(helm-charts): allow pod spec overrides via values

Updates the helm charts to allow passing dictionary values that update the
podSpec for:

- hodometer
- seldon-scheduler
- seldon-envoy
- seldon-dataflow-engine
- seldon-modelgateway
- seldon-pipelinegateway
- default mlserver & triton Server CRs

The podSpec value can be used to override any of the k8s pod spec configs,
including:
- tolerations
- nodeSelector/nodeAffinity
- volumes

The current operator behaviour when providing list values as part of the podSpec
setting is to append the elements of those lists to already existing elements.

The one exception is the list under podSpec.containers, where the settings for
items (containers) with the same name are *merged*.
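
The append-versus-merge behaviour described above can be sketched roughly as follows. This is not the operator's code: `PodSpec`, `Container` and `ApplyOverride` are simplified, hypothetical stand-ins, and the exact per-field merge rule for same-name containers (non-empty scalars win, lists are appended) is an assumption.

```go
package main

import "fmt"

// Container and PodSpec are simplified, hypothetical stand-ins for the
// Kubernetes types; only the fields needed for the illustration are kept.
type Container struct {
	Name  string
	Image string
	Args  []string
}

type PodSpec struct {
	Tolerations []string
	Containers  []Container
}

// ApplyOverride appends list values from override to those in base, except
// for containers, which are merged when their names match.
func ApplyOverride(base, override PodSpec) PodSpec {
	out := PodSpec{
		Tolerations: append(append([]string{}, base.Tolerations...), override.Tolerations...),
		Containers:  append([]Container{}, base.Containers...),
	}
	for _, oc := range override.Containers {
		merged := false
		for i := range out.Containers {
			if out.Containers[i].Name != oc.Name {
				continue
			}
			// Same name: assumed merge rule, see the note above.
			if oc.Image != "" {
				out.Containers[i].Image = oc.Image
			}
			out.Containers[i].Args = append(append([]string{}, out.Containers[i].Args...), oc.Args...)
			merged = true
			break
		}
		if !merged {
			// No container with this name yet: plain append.
			out.Containers = append(out.Containers, oc)
		}
	}
	return out
}

func main() {
	base := PodSpec{
		Tolerations: []string{"gpu"},
		Containers:  []Container{{Name: "rclone", Image: "rclone:base"}},
	}
	override := PodSpec{
		Tolerations: []string{"spot"},
		Containers: []Container{
			{Name: "rclone", Args: []string{"--verbose"}},
			{Name: "sidecar", Image: "sidecar:dev"},
		},
	}
	fmt.Printf("%+v\n", ApplyOverride(base, override))
}
```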

* add ansible docs for local mounts

- example defining a volume mounted in the rclone container of an inference
server and pointing to a local (host) path
This bumps the grpc library versions across our codebase. It represents the sum of the dependabot-suggested PRs related to the update to google.golang.org/grpc 1.65.0, which was integration-tested:

* Bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc
* Bump google.golang.org/grpc from 1.63.2 to 1.65.0 in /scheduler
* Bump go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc
* Bump google.golang.org/grpc from 1.63.2 to 1.65.0 in /operator
* Bump google.golang.org/grpc from 1.63.2 to 1.65.0 in /hodometer
* Bump google.golang.org/grpc from 1.62.1 to 1.65.0 in /apis/go
* Bump google.golang.org/grpc from 1.62.1 to 1.65.0 in /components/tls
* Bump io.grpc:grpc-{stub, protobuf, netty-shaded} from 1.64.0 to 1.65.0 in scheduler/dataflow

Code updates:
* fix(grpc): move from deprecated .Dial to .NewClient
This means that the underlying grpc connections are done lazily on first RPC.

* fix(tests) grpc tests should use passthrough resolver
Because the default resolver for NewClient is dns:// rather than the old Dial
behaviour (using passthrough://) we need to be explicit that the tests use
passthrough
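
As an illustration of being explicit about the resolver, a small stdlib-only helper can add the passthrough scheme to a bare address before it is handed to `grpc.NewClient`. `WithPassthrough` is a hypothetical helper, not part of this change, and its scheme check is deliberately simplistic.

```go
package main

import (
	"fmt"
	"strings"
)

// WithPassthrough returns addr prefixed with the passthrough scheme so that
// the address is used as-is instead of going through DNS resolution.
// Targets that already carry a "scheme://" prefix are returned unchanged.
// Note: this simple check does not recognise forms such as "unix:path".
func WithPassthrough(addr string) string {
	if strings.Contains(addr, "://") {
		return addr
	}
	return "passthrough:///" + addr
}

func main() {
	fmt.Println(WithPassthrough("bufnet"))
	fmt.Println(WithPassthrough("dns:///scheduler:9004"))

	// In a test, the result would be passed to the client constructor, e.g.
	//   conn, err := grpc.NewClient(WithPassthrough("bufnet"),
	//       grpc.WithTransportCredentials(insecure.NewCredentials()))
	// (requires google.golang.org/grpc and its credentials/insecure package).
}
```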

**Which issue(s) this PR fixes**:
- Fixes INFRA-1073 (internal): Grpc package updates
…5789)

This updates the version for protobuf/grpc codegen tools as follows:
- protoc                        21.10   -> 27.2
- protoc-gen-go plugin          1.28.1  -> 1.34.2
- protoc-gen-go-grpc plugin     1.2.0   -> 1.4.0
- protoc-gen-kotlin-grpc plugin 1.2.1   -> 1.4.1
- protoc-gen-java-grpc plugin   1.45.1  -> 1.65.1
- python grpcio-tools           1.51.3  -> 1.64.1

The python grpcio-tools was not updated to the latest available version (1.65.1) because of
persisting issues with logs being spammed in that release.

**Tests**
The following tests were executed in order to guarantee that this update doesn't introduce
problematic behaviour:

- [x] control-plane and data-plane load tests (k6)
- [x] pipeline smoke tests
- [x] cluster update from 2.8.2
- [x] cluster update from 2.8.3

**Which issue(s) this PR fixes**:
Fixes #INFRA-1076
License exceptions added for transitive dependencies whose license could not be automatically determined:
- hashicorp/vault (BSL-1.1)
- shopspring/decimal (MIT)
Bumps envoyproxy/envoy from v1.30.4 to v1.31.0.

---
updated-dependencies:
- dependency-name: envoyproxy/envoy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubi9/ubi-micro from 9.4-9 to 9.4-13.

---
updated-dependencies:
- dependency-name: ubi9/ubi-micro
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubi9/ubi-micro from 9.4-9 to 9.4-13.

---
updated-dependencies:
- dependency-name: ubi9/ubi-micro
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubi9/ubi-micro from 9.4-9 to 9.4-13.

---
updated-dependencies:
- dependency-name: ubi9/ubi-micro
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps grafana/grafana from 11.1.0 to 11.1.1.

---
updated-dependencies:
- dependency-name: grafana/grafana
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubi9/ubi-minimal from 9.4-1134 to 9.4-1194.

---
updated-dependencies:
- dependency-name: ubi9/ubi-minimal
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.33.1 to 1.34.0.
- [Release notes](https://github.com/onsi/gomega/releases)
- [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
- [Commits](onsi/gomega@v1.33.1...v1.34.0)

---
updated-dependencies:
- dependency-name: github.com/onsi/gomega
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* add max load elapsed time in client settings

* add default max elapsed time of 2 hours

* increase the default timeout for a single load operation to an hour

* adjust test after api change

* remove outdated comment

* add max unload elapsed time, defaulting to 15 minutes including retries.

* add test coverage

* fix fmt

* fix spelling mistake

* reduce the number of retries to 5 by default

* add rename test
Previously, any error occurring after the downloading of the model artifact but
before the model was properly registered in the agent's model repository would
result in the rclone path not being cleaned-up. This leads to the PVC filling up
and requiring end-user manual intervention.

This change fixes this, by cleaning the rclone path even when errors occur.
* cleaning up models that fail to load

* do not use a named return value
dependabot bot and others added 26 commits September 4, 2024 15:26
Bumps [github.com/envoyproxy/go-control-plane](https://github.com/envoyproxy/go-control-plane) from 0.12.0 to 0.13.0.
- [Release notes](https://github.com/envoyproxy/go-control-plane/releases)
- [Changelog](https://github.com/envoyproxy/go-control-plane/blob/main/CHANGELOG.md)
- [Commits](envoyproxy/go-control-plane@v0.12.0...v0.13.0)

---
updated-dependencies:
- dependency-name: github.com/envoyproxy/go-control-plane
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Change the location where scheduler/Makefile downloads golangci-lint. The
previous location conflicted with any previously installed golangci-lint.

Specifically, the Makefile used to install it in
`${GOPATH}/bin/golangci-lint/[VERSION]/`. However, this means that a
`${GOPATH}/bin/golangci-lint` file cannot exist at the same time, and that is
the default install path when installing via `go install`.

The new location for the seldon downloaded golangci-lint is
`${GOPATH}/bin/golangci-lint-versions/[VERSION]/`
New dashboard showing:
 * Filter-able to a given set of models/inference server pods:
   - per (model, inference server pod) throughput and average latency
   - aggregated per model throughput
   - aggregated per inference server pod throughput
 * Filter-able to a given set of inference server pods
   - latency heatmaps (configurable rate interval)
     . agent -> inference srv -> agent
     . inference srv -> model -> inference srv
   - in-flight inference requests
   - CPU usage
* remove envoy route for experiment before adding/updating the new one

* add test for new model version

* update test to check for versions

* remove unnecessary check for routes

* add iris2 as model example
* adding pipeline name validation

* adding some validation for model names

* hyphens not dashes

* modifying the regex

* check for k8s format and existence of .

* fixing the error message

* fixing another error message

* removing grouping

* simplify regex
Bumps ubi9/ubi-micro from 9.4-13 to 9.4-15.

---
updated-dependencies:
- dependency-name: ubi9/ubi-micro
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…nfluentinc/confluent-kafka-go/v2/kafka/splunkkafka (#5881)

Bumps [github.com/signalfx/splunk-otel-go/instrumentation/github.com/confluentinc/confluent-kafka-go/v2/kafka/splunkkafka](https://github.com/signalfx/splunk-otel-go) from 1.16.0 to 1.19.0.
- [Release notes](https://github.com/signalfx/splunk-otel-go/releases)
- [Changelog](https://github.com/signalfx/splunk-otel-go/blob/main/CHANGELOG.md)
- [Commits](signalfx/splunk-otel-go@v1.16.0...v1.19.0)

---
updated-dependencies:
- dependency-name: github.com/signalfx/splunk-otel-go/instrumentation/github.com/confluentinc/confluent-kafka-go/v2/kafka/splunkkafka
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubi9/ubi-micro from 9.4-13 to 9.4-15.

---
updated-dependencies:
- dependency-name: ubi9/ubi-micro
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubi9/ubi-minimal from 9.4-1194 to 9.4-1227.

---
updated-dependencies:
- dependency-name: ubi9/ubi-minimal
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps ubi9/ubi-micro from 9.4-13 to 9.4-15.

---
updated-dependencies:
- dependency-name: ubi9/ubi-micro
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat(env): parameters exposed as env variables

* fix(typo): typo updates
* fix(bug): time units fix

* fix(bug): MaxLoadElapsedTimeMinute is in minutes
… on reconnect (#5893)

* Make servernotify message a list

* update copyright

* Convert operator to use the adjusted ServerNotify api

* Adjust scheduler with ServerNotify changes

* remove dead code from grpc proxy

* Remove unused parameter

* add test for server notify

* make helper func deal with a list of servers

* remove extra condition for 0 servers

* add utility to send servers on reconnect to scheduler

* add handle servers on reconnects

* add test for handler

* lint fixes

* tidy up note

* add mock server

* add test scaffolding for server status

* add server subscribe test

* fix typo
Bumps envoyproxy/envoy from v1.31.0 to v1.31.1.

---
updated-dependencies:
- dependency-name: envoyproxy/envoy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps rclone/rclone from 1.67.0 to 1.68.0.

---
updated-dependencies:
- dependency-name: rclone/rclone
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…ler (#5913)

Bumps ubi9/ubi-minimal from 9.4-1227 to 9.4-1227.1725849298.

---
updated-dependencies:
- dependency-name: ubi9/ubi-minimal
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* adding an envoy dashboard

* adding envoy as a provisioned dashboard

* small updates - mainly to filter by pod
Bumps envoyproxy/envoy from v1.31.1 to v1.31.2.

---
updated-dependencies:
- dependency-name: envoyproxy/envoy
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Start envoy xDS server last

* Organise starter cmd for scheduler

* Add synchroniser interface and simple timerbased impl

* Separate out logic when agent connects

* Integrate simple synchroniser in code

* Add first sync option to ServerNotify

* Adjust controller to set isFirstSync

* Allow servernotify initial sync to start the sync process in scheduler

* Rename start to signal in synchroniser

* Fix test

* Add test for ServerNotify

* Change interface to allow for number of signals to wait for

* Changes from #5886 (to include server events)

* Add testing for servers (and other events) in hub

* Add test for AddServerReplica

* Add server sync impl

* Add test for server sync

* Add more testing coverage

* Add logging

* Wireup server changes in starter cmd

* Add extra logging

* Skip logging if not required

* Start timer from the beginning.

* Tidy up logic and add more tests

* Wire up simple sync for the non k8s case

* Use wait group instead of a channel for sync

* Tidy up log messages

* Lint fixes

* Set default timeout for scheduler readiness in docker-compose setup

* Add explicit envar for scheduler ready timeout (compose)

* Fix lint

* Fix test

* Add architecture design at the start of the file for server sync

* Add new line

* Tidy up name in test

* Add parametrisation for helm for SCHEDULER_READY_TIMEOUT_SECONDS

* Add log message for variable

* Add note why xDS starts last

* Add extra wait for routes to be established.

* Tidy up event hub code

* Tidy up event handling code
…A-based scaling (#5932)

* add(operator): Model selector for scale subresource to enable HPA-based scaling

- updates the Model CRD to contain a pod selector in the scale subresource
- sets the selector to a label `server=[inference-server-name]` matching no actual pods
- docs
* adding a metadata object to seldonconfig components

* adding labels to the servers helm charts

* add metadata to seldonconfig.component

* using sed to template the templates

* add makefile changes

* patching each component with labels and annotations

* add a test

* missing forward slash

* hack labelz

* remove test labels

* add annotationz to servers

* add prometheus annotations to the default seldonconfig

* add a comment and a test
@sakoush sakoush requested a review from lc525 as a code owner September 25, 2024 13:07
Member

@lc525 lc525 left a comment

lgtm

@sakoush sakoush merged commit 978e7f3 into release-2.8 Sep 25, 2024
17 checks passed
@lc525 lc525 added the v2 label Sep 27, 2024
4 participants