Releases: HDFGroup/hermes

v1.2.1

16 Aug 05:04
76dd5ae

What's Changed

Full Changelog: v1.2.0...v1.2.1

v1.2.0

08 Feb 18:43
4333b6b

This release contains several important bug fixes. For performance and scalability, a long-standing issue in which nested RPCs caused deadlocks in large-scale workloads has been fixed. Previously, an RPC did not complete until Hermes had executed the entire task it carried. This required at least 1 RPC thread for each node Hermes was running on, which became problematic at scales larger than a few hundred nodes. Now RPCs are used only to transfer tasks and do not wait for their completion, so Hermes can run with a single RPC thread per node, regardless of scale. In addition, we have changed our GitHub Actions workflows to rely on Docker Hub. This improves the performance of the actions dramatically while giving the benefit of a maintained container. Lastly, we have made some changes to the Hermes spack. We now rely on thallium with cereal to maintain compatibility with future mochi releases.
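
The change in RPC behavior can be illustrated with a minimal sketch. This is not Hermes code: the `Node` class, `rpc_submit`, and the single worker thread are hypothetical names chosen for illustration. The handler only enqueues the task and acknowledges it, so the thread serving RPCs is never held for the duration of the task.

```python
import queue
import threading


class Node:
    """Hypothetical stand-in for one node: an RPC entry point plus a worker."""

    def __init__(self):
        self.tasks = queue.Queue()
        self.results = {}
        # The worker executes tasks; the RPC handler never does.
        self.worker = threading.Thread(target=self._run_tasks, daemon=True)
        self.worker.start()

    def rpc_submit(self, task_id, fn, *args):
        """RPC handler: transfer the task and return without waiting for it."""
        self.tasks.put((task_id, fn, args))
        return "accepted"

    def _run_tasks(self):
        while True:
            task_id, fn, args = self.tasks.get()
            self.results[task_id] = fn(*args)
            self.tasks.task_done()


node = Node()
ack = node.rpc_submit("t1", sum, [1, 2, 3])  # returns right away
node.tasks.join()                            # wait only to inspect the result
print(ack, node.results["t1"])
```

Because the acknowledgement is decoupled from execution, a task that itself submits work to another node does not tie up an RPC thread while it waits, which is the property the release note describes.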

What's Changed

Full Changelog: v1.1.1...v1.2.0

v1.1.0

16 Oct 15:43
d38d46d

Hermes 1.1.0 has been released. This release features a major restructuring of Hermes to improve I/O latency and scalability through asynchronous and partial I/O operations. Hermes now follows a task-based design, where I/O commands are converted into tasks and processed asynchronously. We leverage this asynchronicity to improve the performance of metadata and data updates for latency-sensitive producer workloads by nearly 100x over the previous release, while relying on the ordering properties of queues to guarantee strong consistency without locking. In addition, Hermes now supports partial get and put operations. This enables small I/O requests to be merged into the same blob, dramatically reducing the amount of metadata stored in Hermes for latency-sensitive I/O workloads (e.g., deep learning), and eliminates I/O amplification and lock contention for same-blob updates in the Hermes adapters. We have evaluated Hermes 1.1 underneath a variety of real programs; a summary is located here.
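
As a rough illustration of partial put and get, the sketch below merges several small writes into one blob instead of creating one blob per request. The `Blob` class and its `partial_put` / `partial_get` methods are hypothetical and do not reflect the actual Hermes API.

```python
class Blob:
    """Hypothetical in-memory blob that supports partial updates."""

    def __init__(self):
        self.data = bytearray()

    def partial_put(self, offset, payload):
        """Write payload at offset, growing the blob with zero bytes if needed."""
        end = offset + len(payload)
        if end > len(self.data):
            self.data.extend(bytes(end - len(self.data)))
        self.data[offset:end] = payload

    def partial_get(self, offset, size):
        """Read size bytes starting at offset without touching the rest."""
        return bytes(self.data[offset:offset + size])


blob = Blob()
blob.partial_put(0, b"abcd")   # three small requests land in the same blob
blob.partial_put(4, b"efgh")
blob.partial_put(2, b"XY")     # overwrite two bytes in the middle
print(blob.partial_get(2, 4))
```

Only one blob, and therefore one metadata entry, exists after the three writes, and the in-place update does not require reading and rewriting the whole blob.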

What's Changed

Full Changelog: v1.0.5-beta...v1.1.0

v1.0.5

16 Oct 14:10
45522e4

This release contains minor updates to spack, improvements to the Hermes CMake, and minor updates to portability. This is the last release before the task-based design of Hermes is used.

What's Changed

Full Changelog: v1.0.0-beta...v1.0.5-beta

v1.0.0

30 May 13:47
9b0058f
Pre-release

Hermes 1.0.0 has been released. Hermes is a multi-tiered I/O buffering platform which can be used to accelerate data access for large-scale scientific applications. This represents the first feature-complete release of Hermes.

For applications that produce data, Hermes intelligently makes initial data placement decisions using the Data Placement Engine. Hermes supports various data placement policies, each of which weighs hardware characteristics and application I/O patterns differently. For high-bandwidth checkpoint-restart workloads, for example, Hermes can place data in the fastest available tiers. Data can be placed either locally on the node producing data, remotely, or both. After initial data placement, data can be re-shuffled in the hierarchy using the buffer organizer (BORG) and prefetcher. The BORG demotes data based on its observed access frequency and the time it was last accessed. For checkpoint-restart workloads, demoting data can make space available in high-performing tiers to accelerate future checkpoints.
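
One way to picture demotion by access frequency and recency is the toy ranking below. It is a sketch under stated assumptions, not the BORG's actual policy: the `BlobStats` record, the `demotion_order` function, the linear score, and both weights are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class BlobStats:
    name: str
    access_count: int   # observed number of accesses
    last_access: float  # timestamp of the most recent access, in seconds


def demotion_order(blobs, now, recency_weight=1.0, frequency_weight=10.0):
    """Rank blobs so that old, rarely accessed ones are demoted first.

    The score and the default weights are assumptions made for this sketch.
    """
    def score(blob):
        age = now - blob.last_access
        return recency_weight * age - frequency_weight * blob.access_count

    return sorted(blobs, key=score, reverse=True)


blobs = [
    BlobStats("hot_index", access_count=20, last_access=50.0),
    BlobStats("ckpt_new", access_count=1, last_access=90.0),
    BlobStats("ckpt_old", access_count=1, last_access=0.0),
]
order = [b.name for b in demotion_order(blobs, now=100.0)]
print(order)
```

In this example the oldest checkpoint ranks first for demotion, which matches the checkpoint-restart case described above: moving it down frees space in the fast tier for the next checkpoint.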

For workloads which read data, Hermes can accelerate I/O through prefetching and data staging. Hermes provides a policy-based Prefetcher component that promotes data expected to be accessed in the near future. Prefetchers are policy-based in order to represent diverse application behaviors. We currently provide a prefetcher tailored for deterministic I/O workloads, which are fairly common. Deep learning applications, for example, have random seeds which can be used to make I/O behavior completely reproducible. Many scientific analysis codes predictably read a batch of data and then perform analysis. For these cases, Hermes comes equipped with an Apriori Prefetcher which parses and executes a user-defined schema file indicating when and where to prefetch data. In addition to prefetching, Hermes also provides data staging, which can import entire datasets from services external to Hermes (e.g., a PFS) and place them in the hierarchy for analysis.

We have evaluated Hermes 1.0.0 underneath various benchmarks and real applications. The Gray-Scott Model, for example, is a reaction-diffusion code that simulates the chemical reaction between two substances diffusing over time. We found that Hermes can improve I/O performance by 3x by intelligently buffering data in faster tiers and asynchronously flushing during checkpoints. A detailed summary of our benchmarks is located here.

We would like to thank the NSF for supporting our research. We invite the community to try Hermes and contribute. We would love to hear about use cases, desired features, and any improvements that we can make.

v0.9.9

25 Apr 18:35
ee0dc63
Pre-release

This release primarily focuses on changes to CI, portability issues, and bug fixes. We have completely decoupled Hermes from MPI, which provides more portability across HPC and Cloud machines. We have also addressed some metadata performance issues in workloads which produce many small objects.

v0.9.8-beta

06 Mar 17:28
41fe770
Pre-release

Hermes 0.9.8 has been released. This release features tagging. Tags enable users to semantically define associations between blobs and provide an intuitive way of locating blobs which are related. Tags can be used internally by Hermes to provide enhanced data placement decisions based on the logical grouping of data. Traits can be attached to tags to transparently perform a set of operations on a group of related blobs. For example, a combination of tags and traits can indicate that a group of blobs is to be compressed and encrypted.
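
The relationship between tags and traits can be sketched as follows. The `TagStore` class and its methods are hypothetical and are not the Hermes API. A trait is modeled here as a plain function applied to every blob under a tag, with `zlib.compress` from the Python standard library standing in for a compression trait.

```python
import zlib
from collections import defaultdict


class TagStore:
    """Hypothetical blob store where tags group related blobs."""

    def __init__(self):
        self.blobs = {}
        self.tags = defaultdict(set)

    def put(self, name, data, tags=()):
        self.blobs[name] = data
        for tag in tags:
            self.tags[tag].add(name)

    def find(self, tag):
        """Locate all blobs associated with a tag."""
        return sorted(self.tags.get(tag, ()))

    def apply_trait(self, tag, trait):
        """Apply one operation to every blob that carries the tag."""
        for name in self.tags.get(tag, ()):
            self.blobs[name] = trait(self.blobs[name])


store = TagStore()
store.put("step0", b"a" * 1000, tags=["checkpoint"])
store.put("step1", b"b" * 1000, tags=["checkpoint"])
store.put("log", b"c" * 1000, tags=["debug"])

store.apply_trait("checkpoint", zlib.compress)  # compress the whole group
print(store.find("checkpoint"), len(store.blobs["step0"]))
```

The caller names the group once and never iterates over individual blobs, which is the transparency the release note refers to. Blobs outside the tag, such as `log`, are left unchanged.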

This release also features enhanced portability. Hermes no longer requires client programs to follow MPI design patterns. In addition, we have addressed a number of issues which have prevented Hermes from being installed on recent OS versions.

v0.9.5-beta

06 Feb 21:45
d7010a8
Pre-release

This release features enhanced concurrency control and more attention to usability. Hermes now provides dynamic data structures, so users no longer have to configure strict limits on data structures such as Buckets and VBuckets. This simplifies configuration and reduces accidental segfaults due to misconfiguration. In addition, these data structures reduce the complexity of extending Hermes to support new policies, such as Data Placement Engines and Prefetchers.

Hermes 0.9.0-beta

28 Nov 10:26
cd7f515
Pre-release


What's Changed

New Features

  • Provide stage-in / stage-out capabilities

Other Changes

  • Unified the adapter codes to inherit from the same base class (reduce code duplication)
  • Add a new ADAPTER_MODE (WORKFLOW) to enable data to remain in Hermes after the program closes
  • Added an explicit hermes_finalize script to ensure the Hermes daemon is stopped
  • Make it so that IOR does not require transfer size to be a multiple of the file page size
  • Make MinimizeIoTime DPE respect capacity constraints
  • Support for parallel I/O to POSIX / STDIO files

Full Changelog: v0.8.0-beta...v0.9.0-beta

Hermes 0.8.0-beta

25 Aug 18:26
b76aade
Pre-release

What's Changed

New Features

Other Changes

  • Make gflags an optional dependency by @ChristopherHogan in #427
  • Fix GLPK bug when one or more constraints are disabled. by @hyoklee in #430
  • Fix several CI and build-related issues by @ChristopherHogan in #433
    • Update spack to 0.18.0
    • Update hdf5 to 1.13.1
    • Install hdf5 with spack instead of manually.
    • Add IOR so that IOR+VFD tests are run.
    • Add glog as a dependency of libhermes.
  • Update to Catch2 version 3.0.1. by @hyoklee in #437

Full Changelog: v0.7.0-beta...v0.8.0-beta