Releases: tensorflow/serving
2.17.1
2.18.0
2.18.0-rc0
Major Features and Improvements
- No major features or improvements.
Breaking Changes
- No breaking changes.
Bug Fixes and Other Changes
- Extend GbmcChannel interface to implement redfish channel for TPUs (commit: 683cb64)
- Add tests to validate monitoring states. (commit: fab5c05)
- Disable xnn_enable_avx256vnnigfni (commit: 19f9ccf)
- Reduce duplicate code using a test class (commit: 51cf3a7)
- Define an option to specify different IFRT client. (commit: aca5cfa)
- Add release notes for tf-serving 2.17.0 (commit: b72a86e)
- Avoid SetNumLoadThreads stalling the server by forcing a ThreadPool reset (commit: 6b9cf7c)
- Add max_enqueued_batches option for model servers (commit: 7c99259)
- Remove gpr_set_log_verbosity from grpc_client.cc (commit: 6e05a38)
- Add option to stop retrying on permanent loading errors. (commit: 9ba72fa)
- Add the batch_padding_policy attribute to the TensorFlow Serving API. (commit: ea02141)
- Improve handling of large JSON objects. (commit: 6cb0131)
- Silence warnings from external code (commit: 010d61a)
- Migration of the histogram header and cc code for TSL. Move tsl/lib/histogram to compiler/tsl/lib/histogram and update users. (commit: ab33df4)
- Add hermetic CUDA repository rule calls to TF serving project. (commit: 787c85f)
- Update users of status_test_util to use the new location in xla/tsl (commit: 22b2b1e)
- Bump Bazel version from 6.4.0 to 6.5.0. (commit: 82e532f)
- Provide an option to customize the sort order among servable names (commit: 32a85a8)
- Remove cc_api_version stage 4: deletion where cc_api_version = 2 (commit: 7e0c196)
- Remove cc_api_version stage 4: deletion where cc_api_version = 2 (commit: 48e0f56)
- This is a noop comment update for streaming inputs. (commit: cfac240)
- Add a resource kind for number of LoRA models. (commit: 6b7ba27)
- Disable more warnings to make logs cleaner (commit: 4a830ca)
- Add bool return_single_response field to PredictStreamedOptions. (commit: 648c9ee)
- Use gcc-10 to avoid build issues while building XLA on CI (commit: 8bd1fda)
- Create separate kokoro config (commit: dbc7681)
- Remove top-level .bazelrc settings now that scripts use --config=kokoro (commit: f920b98)
- Update Dockerfile.devel to build with gcc-10 (commit: f9c0262)
- Move tsl/lib/monitoring to xla/tsl/lib/monitoring (commit: cb934df)
- Delete 'enable_lazy_split', since the flag is not used anywhere. The code paths for the flag being false are retained and those for it being true are eliminated. This will make it easier to improve batching. (commit: 873993f)
- BUILD rule fix. (commit: d89b272)
- Automated Code Change (commit: 4decd0a)
- Automated Code Change (commit: 0b05e86)
- Fix build error (commit: d341c34)
- Added capability to use XLA on a GPU. (commit: e5e795f)
- Update version for 2.18.0-rc0 release. (#2258) (commit: d6d4022)
- Mark Tensorflow compatible with Protobuf v26+. (#2261) (commit: 424dba4)
- Update version for 2.18.0-rc0 release. (#2262) (commit: 67f4ee8)
- This release is based on TF version 2.18.0-rc2.
2.17.0
Major Features and Improvements
- No major features or improvements.
Breaking Changes
- No breaking changes.
Bug Fixes and Other Changes
- Add RequestOptions and DeterministicMode options. (commit: a8b200b)
- Remove usages of bridge fallback. (commit: 98570a6)
- Provide a runtime option to lower bound the number of batch threads. (commit: 50b07e4)
- Avoid GetChildren when using Specific servable versions (commit: 6fb9403)
- Add python clif target for prediction_log.proto. (commit: 39ba623)
- Build with --xnn_enable_avx512amx=false (commit: f6c4219)
- Update comment in tfrt_saved_model_factory.h for wrong param name. (commit: 14ce911)
- Upgraded libevent to 2.1.12. Fixed minor bug in EvHTTPServer. (commit: 2cda80a)
- Introduce RequestRecorder in tfrt_servable so that implementation can record customized costs and metrics. (commit: 749007b)
- Integrate TFRT+IFRT with tensorflow serving (commit: a8b64dd)
- Add core selector support for TFRT+IFRT serving on tensorflow serving (commit: 84a71a4)
- Remove GPR_ASSERT . (commit: 2dca3af)
- Add timeout support when waiting on servables to load. (commit: 093d841)
- Build with --xnn_enable_avx512fp16=false (commit: eeac086)
- Support paging in TfrtSavedModelServable. (commit: 993a53c)
- Add max_enqueued_batches option for model servers. (commit: d914192)
- Add max_enqueued_batches option for model servers. (commit: 67a2dcb)
- Update version for 2.17.0 release. (#2225) (commit: 68eda92)
- Include patch files necessary for building at TF 2.17 (commit: 6311b72)
- This release is based on TF version 2.17.0.
2.16.1
2.15.1
2.15.0-rc0
Major Features and Improvements
- No major features or improvements.
Breaking Changes
- No breaking changes.
Bug Fixes and Other Changes
- Move model server TFRT integration code to OSS (commit: 50ebab4)
- Add an option to override the size of GPU system (commit: 445a87b)
- This cl is causing test failures and we are rolling it back. (commit: a39289b)
- Default signature_method_check to false (commit: 4711a8d)
- Add an option to propagate current Context in periodic functions from AspiredVersionsManager. (commit: e4a8a87)
- Refactor Servable::PredictStreamed so that implementations can support bidirectional streaming if needed (commit: a8c3ea6)
- Create Kotlin proto library for the TensorFlow protos. (commit: cae3164)
- Create and use Kotlin proto targets for model.proto and predict.proto (commit: ea9529e)
- Add release notes for tf-serving 2.13.1 (commit: 45fae91)
- Resubmit to move model server TFRT integration code to OSS (commit: eb5b3a5)
- Enable BF16 Automatic Mixed Precision (commit: 970c630)
- Follow expected format (commit: 60a3d73)
- Remove upper_cost_threshold in TFRT serving (commit: 7f8d9d7)
- Build tensorflow_model_server with -rdynamic (commit: fc89240)
- Add peak memory resource kind. (commit: 96e0661)
- Fix typo (commit: c0b35c7)
- Update warmup documentation (commit: 90148d7)
- Implement Freeze() in pathways/tfrt serving. (commit: 0117fd4)
- This CL is a no-op (commit: b75349d)
- OSS remote_op_config_rewriter.proto (commit: ba47377)
- Add release notes for tf-serving 2.14.0-rc0 (commit: 4d5ecfd)
- Add flags for gpu multi-streaming support. (commit: 77cabde)
- Add release notes for tf-serving 2.14.0-rc1 (commit: a3023de)
- Add 3 new resource kinds constants for GPU. (commit: 6b6dea3)
- Add a flag to turn off automatic TPU system initialization on startup. (commit: f83bc0c)
- Add release notes for tf-serving 2.14.0 (commit: 60976ef)
- Annotate which model is missing inputs. (commit: c99b18b)
- ebpf-transport-monitoring adding dependency on net_http. (commit: 152ef4e)
- Add release notes for tf-serving 2.14.1 (commit: 83d9709)
- OSS saved_model_config library, removes saved_model_config_stub/impl, moves GraphRewriter related API from session_bundle_util to graph_rewriter.h. (commit: 7356bbd)
- No-op. (commit: 9d02d89)
- Upgrading Bazel version from 6.1.0 to 6.4.0 (commit: 34521dc)
- Set xnn_enable_avxvnni=false in .bazelrc (commit: 4aed749)
- Add cuda-nvml-dev-11-8 to Dockerfile.gpu (commit: b2def71)
- Revert problem with incorrect Dart build rules and targets. (commit: b6bccce)
- Add cuda-nvml-dev-11-8 to Dockerfile.devel-gpu (and remove from Dockerfile.gpu) (commit: 028aac5)
- OSS tfrt_http_api_handler*. (commit: 8ded4ce)
- Added FileAcl to tsl::FileSystem. (commit: d6c0917)
- Remove metadata size check in GetModelMetadata method in order to be consistent with other servable impl. (commit: a635552)
- Replace the global registration with a registration class so that when we move server_init_internal to OSS we won't run into undetermined global registration sequence issue. (commit: 21d8f88)
- Move TPU runner init stub to tensorflow serving OSS directory. (commit: 2b9e58c)
- Add util function to verify that an override resource has a subset of the device kinds of the base resource. This is not used by OSS. (commit: 06ff18d)
- Add streaming options for predict request. (commit: 8ccd8a5)
- Define how tensors will be split for SPLIT streamed requests. (commit: b581572)
- Add a client_id field for custom servables. (commit: eb57852)
- Add option to configure the name of the input layer of remote model. (commit: f1e1341)
- Added grpc reflection service to the serving binary. (commit: c140e01)
- Add the option to enable GRPC health checking to model_server. This is useful for clients that want to use health checking with load balancing channels (otherwise we get errors on the client side). The current implementation is trivial: once we open our serving port, we assume we will always be healthy. Users may want to tweak this, especially if they need a mandated version, etc. (commit: a9a8e7b)
- Automated Code Change (commit: f761fc7)
- Update description of model versioning. (commit: d820234)
- Exported FindMetaGraphDef function. (commit: 0df0975)
- Automated Code Change (commit: 27923d3)
- Automated Code Change (commit: 704e250)
- If accepting_requests_ is not set Terminate() returns without doing anything. (commit: c45fe14)
- Automated Code Change (commit: fce1804)
- Modify PredictStreamed to return a response or an error. (commit: 5b5d30f)
- Add support to use a MockServable in MockServerCore. (commit: 5b6e0b6)
- Fix OSS cpu build. (commit: 72acbaf)
- Adds functionality to send TSL metrics over model_service RPC. (commit: 9564ef6)
- Add a method in tensorflow::serving::Servable to indicate whether a servable is critical. (commit: 5c0299e)
- Upgrade to CUDA 12.2 and CuDNN 8.9.4 (commit: f82600a)
- Fixes tensorflow_serving continuous build. (commit: a99fb9c)
- Add headers. (commit: fab7271)
- Remove the criticality field in the BatchingSessionTask. (commit: b8663d0)
- Move gpu docker build clang. (commit: 611c5a9)
- Updated Dockerfile.devel-gpu to run setup.sources.sh from repo. (commit: d3102f0)
- Add an interface for all Servables that support paging. (commit: e4716e5)
- Update cuda libraries to match TF (commit: 45446cf)
- Match libraries with Dockerfile.devel-gpu (commit: f6ef270)
- Update version for 2.15.0-rc0 release. (#2209) (commit: 73ba2b9)
- Resolve breakages for 2.15 release. (commit: 3181292)
2.14.1
2.14.0
2.14.0-rc1
Major Features and Improvements
- No major features or improvements.
Breaking Changes
- No breaking changes.
Bug Fixes and Other Changes
- This release is based on TF version 2.14.0-rc1