Releases: broadinstitute/cromwell
33.1
33
33 Release Notes
Query endpoint
Exclude workflows based on Labels
This gives the ability to filter out workflows based on labels. Two new parameters, excludeLabelAnd and excludeLabelOr, can be used for this purpose.
More details on how to use them can be found here.
Include/Exclude subworkflows
Cromwell now supports excluding subworkflows from workflow query results using the includeSubworkflows parameter. By default they are included in the results.
More information can be found at REST API.
Query workflows by Submission time
Cromwell now supports querying workflows by submission time. This helps find workflows that are submitted but not yet started (i.e. workflows which are in the On Hold state). More information can be found here.
Submission time in Workflow Query Response
Submission time of a workflow is now included in WorkflowQueryResult, which is part of the response for workflow query.
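The query parameters described above can be combined in a single request. The following is a minimal sketch that only builds the request URL. The `/api/workflows/v1/query` path, the `key:value` label encoding, and the `submission` parameter name are assumptions to check against the REST API documentation; `build_query_url` is a hypothetical helper, not part of Cromwell.

```python
from urllib.parse import urlencode

# Assumed path of the workflow query endpoint; adjust for your server.
QUERY_PATH = "/api/workflows/v1/query"


def build_query_url(base_url, exclude_label_and=(), include_subworkflows=True,
                    submission=None):
    """Build a GET URL for the workflow query endpoint (hypothetical helper).

    exclude_label_and: iterable of (key, value) label pairs to exclude.
    submission: optional timestamp string used as the submission-time filter.
    """
    params = []
    for key, value in exclude_label_and:
        # Encoding a label as "key:value" is an assumption.
        params.append(("excludeLabelAnd", f"{key}:{value}"))
    params.append(("includeSubworkflows", str(include_subworkflows).lower()))
    if submission is not None:
        # "submission" as the parameter name is an assumption.
        params.append(("submission", submission))
    return f"{base_url}{QUERY_PATH}?{urlencode(params)}"


url = build_query_url(
    "http://localhost:8000",
    exclude_label_and=[("project", "test")],
    include_subworkflows=False,
    submission="2018-06-01T00:00:00.000Z",
)
print(url)
```

The sketch stops at URL construction so it can be run without a Cromwell server; sending the request is left to whichever HTTP client you already use.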
File Localization (NIO) Hint
Tasks in WDL 1.0 can now specify an optimization in their parameter_meta indicating that some File inputs do not need to be localized for the task to run successfully.
Full details are available in the documentation page for this optimization.
Bug Fixes
Workflows which are in 'On Hold' state can now be fetched using the query endpoint.
32
32 Release Notes
Backends
Pipelines API V2
Initial support for Google Pipelines API version 2.
Expect feature parity except for private dockerhub images which are not supported at the moment, but will be in the near future.
Additionally, the "refresh token" authentication mode is NOT supported on PAPI V2.
In addition, the following changes are to be expected:
- Error messages for failed jobs might differ from V1
- The Pipelines API log file content might differ from V1
Important (If you're running Cromwell with a Google backend, read this):
The actor-factory value for the google backend (cromwell.backend.impl.jes.JesBackendLifecycleActorFactory) is being deprecated.
Please update your configuration accordingly.
PAPI Version | actor-factory |
---|---|
V1 | cromwell.backend.google.pipelines.v1alpha2.PipelinesApiLifecycleActorFactory |
V2 | cromwell.backend.google.pipelines.v2alpha1.PipelinesApiLifecycleActorFactory |
If you don't update the actor-factory value, you'll get a deprecation warning in the logs, and Cromwell will default back to PAPI V1.
Task Retries
Cromwell now supports retrying failed tasks up to a specified count by declaring a value for the maxRetries key through the WDL runtime attributes.
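As a rough illustration of what "up to a specified count" means, the sketch below models a retry budget in plain Python. It is not Cromwell's implementation: `run_with_retries` and `FlakyTask` are hypothetical names, and it assumes that a budget of N retries allows at most N + 1 attempts in total.

```python
def run_with_retries(task, max_retries):
    """Run task(); on failure, retry up to max_retries more times.

    Returns (result, attempts_used). Re-raises the last error if every
    attempt fails.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            # Give up once the retry budget is exhausted.
            if attempts > max_retries:
                raise


class FlakyTask:
    """Hypothetical task that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.remaining_failures = failures

    def __call__(self):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise RuntimeError("transient failure")
        return "done"


# Two transient failures fit inside a budget of three retries.
result, attempts = run_with_retries(FlakyTask(failures=2), max_retries=3)
```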
Labels
- Cromwell has removed most of the formatting restrictions from custom labels. Please check the README for more detailed documentation.
- Custom labels won't be submitted to Google backend as they are now decoupled from Google's default labels.
- Cromwell now publishes the labels as soon as the workflow is submitted (whether started or on hold). If the labels are invalid, the workflow will not be submitted and the request will fail.
Scala 2.11 Removed
From version 32 onwards we will no longer be publishing build artifacts compatible with Scala 2.11.
- If you don't import the classes into your own Scala project then this should have no impact on you.
- If you are importing the classes into your own Scala project, make sure you are using Scala 2.12.
Input Validation
Cromwell can now validate that your inputs file does not supply inputs that have no impact on the workflow. Strict validation is disabled by default in WDL draft 2 and CWL but enabled in WDL draft 3. See 'Language Factory Config' below for details.
Language Factory Config
All language factories can now be configured on a per-language-version basis. All languages and versions will support the following options:
- enabled: Defaults to true. Set to false to disallow workflows of this language and version.
- strict-validation: Defaults to true for WDL draft 3 and false for WDL draft 2 and CWL. Specifies whether workflows fail if the inputs JSON (or YAML) file contains values which the workflow did not ask for (and will therefore have no effect). Additional strict checks may be added in the future.
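The strict-validation check can be pictured as a set difference between the keys of the inputs file and the inputs the workflow declares. The sketch below is a toy model under that assumption, not Cromwell's code; `unused_inputs` and `validate_inputs` are hypothetical names.

```python
def unused_inputs(inputs, declared_inputs):
    """Return the input keys that the workflow never asked for, sorted."""
    return sorted(set(inputs) - set(declared_inputs))


def validate_inputs(inputs, declared_inputs, strict_validation):
    """Fail on unused inputs when strict, otherwise just report them."""
    extras = unused_inputs(inputs, declared_inputs)
    if extras and strict_validation:
        raise ValueError("Unexpected inputs: " + ", ".join(extras))
    return extras


# Hypothetical workflow declaring two inputs; the file has a misspelled key.
declared = {"wf.sample", "wf.reference"}
supplied = {"wf.sample": "a.bam", "wf.typo_reference": "ref.fa"}

# Lenient mode returns the unused keys instead of failing.
ignored = validate_inputs(supplied, declared, strict_validation=False)
```

With `strict_validation=True` the same call raises, which is the behaviour a misspelled input key would otherwise hide.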
API
- Cromwell now more accurately returns 503 instead of 500 when it cannot respond in a timely manner.
- Cromwell now allows a user to submit a workflow but in a state where it will not automatically be picked up for execution. This new state is called 'On Hold'. To do this you need to set the parameter workflowOnHold to true while submitting the workflow.
- The API endpoint 'releaseHold' lets the user signal Cromwell that a workflow may start, at which point it will be picked up by the normal execution schemes.
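The two-step On Hold flow (submit with workflowOnHold, then call releaseHold) could be driven from a client roughly as follows. Only `workflowOnHold` and `releaseHold` come from the notes above; the `/api/workflows/v1` prefix and the `workflowSource` form field name are assumptions, and both helper functions are hypothetical.

```python
# Assumed prefix for the workflow endpoints; adjust for your server.
API_PREFIX = "/api/workflows/v1"


def on_hold_submission_fields(workflow_source):
    """Form fields for submitting a workflow that should not start yet."""
    return {
        # "workflowSource" as the field name is an assumption.
        "workflowSource": workflow_source,
        # Ask Cromwell to keep the workflow in the 'On Hold' state.
        "workflowOnHold": "true",
    }


def release_hold_url(base_url, workflow_id):
    """URL to POST to when the held workflow is allowed to start."""
    return f"{base_url}{API_PREFIX}/{workflow_id}/releaseHold"


fields = on_hold_submission_fields("workflow hello {}")
release_url = release_hold_url("http://localhost:8000", "some-workflow-id")
```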
GPU
The PAPI backend now supports specifying GPU through WDL runtime attributes:
runtime {
    gpuType: "nvidia-tesla-k80"
    gpuCount: 2
    zones: ["us-central1-c"]
}
The two types of GPU supported are nvidia-tesla-k80 and nvidia-tesla-p100.
Important: Before adding a GPU, make sure it is available in the zone the job is running in: https://cloud.google.com/compute/docs/gpus/
Job Shell
Cromwell now allows for system-wide or per-backend job shell configuration for running user commands rather than always using the default /bin/bash. To set the job shell on a system-wide basis use the configuration key system.job-shell, or on a per-backend basis use <config-key-for-backend>.job-shell. For example:
# system-wide setting, all backends get this
-Dsystem.job-shell=/bin/sh
# override for just the Local backend
-Dbackend.providers.Local.config.job-shell=/bin/sh
For the Config backend the value of the job shell will be available in the ${job_shell} variable. See Cromwell's reference.conf for an example of how this is used for the default configuration of the Local backend.
Bug Fixes
The imports zip no longer unpacks a single (arbitrary) internal directory if it finds one (or more). Instead, import statements should now be made relative to the base of the import zip root.
Reverting Custom Labels
Reverting to a prior custom label value now works. "Retrieves the current labels for a workflow" will return the most recently summarized custom label value.
The above endpoint may still return the prior value for a short period of time after using "Updated labels for a workflow" until the background metadata summary process completes.
Deleting Duplicate Custom Label Rows
If you never used the REST API to revert a custom label back to a prior value you will not be affected. This only applies to workflows previously updated using "Updated labels for a workflow".
Duplicate rows for any workflow label key will be deleted from the database table storing custom labels. For efficiency purposes the values are not regenerated automatically from the potentially large metadata table.
In rare cases where you tried to revert to a prior custom label value you may continue to see different results depending on the REST API used. After the database update "Retrieves the current labels for a workflow" will return the most-recent-unique value while "Get workflow and call-level metadata for a specified workflow" will return the up-to-date value. For example, if you previously updated a value from "value-1" > "value-2" > "value-3" > "value-2" then the former REST API will return value-3 while the latter will return value-2.
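The difference between the two results in that example can be reproduced with a toy model. It assumes that "most-recent-unique" means the last value left after dropping repeated values, which matches the example but is not taken from the database migration itself; both function names are hypothetical.

```python
def most_recent_unique(history):
    """Last value remaining once repeated values are dropped."""
    seen = set()
    unique = []
    for value in history:
        if value not in seen:
            seen.add(value)
            unique.append(value)
    return unique[-1]


def up_to_date(history):
    """Value written by the most recent update."""
    return history[-1]


# Update history from the example above (must be non-empty).
history = ["value-1", "value-2", "value-3", "value-2"]

labels_endpoint_value = most_recent_unique(history)
metadata_endpoint_value = up_to_date(history)
```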
Workflow options google_project output in metadata
Workflow metadata for jobs run on a Google Pipelines API backend will report the google_project specified via a workflow options json.
31.1
31
31 Release Notes
- Cromwell server
  The Cromwell server source code is now located under server/src. sbt assembly will build the runnable Cromwell JAR in server/target/scala-2.12/ with a name like cromwell-<VERSION>.jar.
- Robustness
  - The rate at which jobs are being started can now be controlled using the system.job-rate-control configuration stanza.
  - A load controller service has been added to allow Cromwell to self-monitor and adjust its load accordingly. The load controller is currently a simple on/off switch controlling the job start rate. It gathers metrics from different parts of the system to inform its decision to stop the creation of jobs. You can find relevant configuration in the services.LoadController section of the cromwell.examples.conf file, as well as in the load-control section in reference.conf. The load level of the monitored sub-systems is instrumented and can be found under the cromwell.load statsD path.
  - The statsD metrics have been re-shuffled a bit. If you had a dashboard you might find that you need to update it. Changes include:
    - Removed artificially inserted "count" and "timing" from the path
    - Added a load section
    - Metrics were prefixed twice with cromwell (cromwell.cromwell.my_metric), now they're only prefixed once
    - Added processed and queue metrics under various metrics, monitoring the throughput and amount of queued work respectively
    - Added a memory metric representing an estimation of the free memory Cromwell thinks it has left
- Added a configuration option under docker.hash-lookup.enabled to disable docker hash lookup. Disabling it will also disable call caching for jobs with floating docker tags.
- Rest API
  - Updated the /query response to include the total number of query results returned. See here for more information.
- Language APIs
  - The WDL library import from Cromwell 30 has split in two and its scala packages have changed.
  - The WDL draft 2 parser is now in cromwell-wdl-model-draft2 and its classes have moved from the wdl4s.parser package to wdl.draft2.parser.
  - The WDL object model is now in cromwell-wdl-model-draft2 and its classes have moved from the wdl package to wdl.draft2.model.
  - The WDL to WOM transform functions are now in cromwell-wdl-transforms-draft2. The functions were removed from their object model classes and are now found in their own objects in wdl.draft2.transforms.wdlom2wom.
30.2
30.1
A set of bug fixes following the migration of Cromwell to WOM (the Workflow Object Model) in version 30.
30
30 Release Notes
Breaking changes
- The customLabels form field for workflow submission has been renamed to labels.
Other changes
- New Cromwell documentation
  Our documentation has moved from our README to a new website: Cromwell Documentation. There are new Tutorials and much of the documentation has been re-written. The source files are in the /docs directory.
- API
  - Cromwell now supports input files in the yaml format (JSON format is still supported).
  - Added a GET version for the labels endpoint which will return current labels for a workflow.
- Database
  You have the option of storing the metadata in a separate SQL database from the database containing the internal engine data. When switching connection information for an existing database containing historical data, the tables should be manually replicated from one database instance to another using the tools appropriate for your specific database types. Cromwell will not move any existing data automatically. This feature should be considered experimental and likely to change in the future. See the Database Documentation or the database section in cromwell.examples.conf for more information.
- StatsD
  Added initial support for StatsD instrumentation. See the Instrumentation Documentation for details on how to use it.
- User Service Account auth mode for Google
  Added a new authentication mode for Google Cloud Platform which will allow a user to supply the JSON key file in their workflow options to allow for per-workflow authentication via service account. This is analogous to the previously existing refresh token authentication scheme. As with the refresh token scheme, it is encouraged that the user_service_account_json workflow option field is added to the encrypted-fields list in the configuration.
- Bugfixes
  Abort of Dockerized tasks on the Local backend should now work as expected. Cromwell uses docker kill to kill the Docker container.
29
Breaking Changes
- Command line
  In preparation for supporting CWL scripts (yes, you read that right!), we have extensively revised the Command Line in Cromwell 29. For more details about the usage changes please see the README. And stay tuned to the WDL/Cromwell blog over the next couple of months for more news about CWL.
- Request timeouts
  Cromwell now returns more specific 503 Service Unavailable error codes on request timeouts, rather than the more generic 500 Internal Server Error. The response for a request timeout will now be plain text, rather than JSON.
- Metadata endpoint
  The response from the metadata endpoint can be quite large depending on your workflow. You can now opt in to have Cromwell gzip your metadata file, in order to reduce file size, by sending the Accept-Encoding: gzip header. By default, responses are no longer gzip-encoded.
- Engine endpoints
  Previously the engine endpoints were available under /api/engine but now the endpoints are under /engine so they don't require authentication. Workflow endpoints are still available under /api/workflows. We also deprecated the setting api.routeUnwrapped as a part of this internal consistency effort.
- Call caching diff
  We updated the response format of the callcaching/diff endpoint.
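For the metadata endpoint change in the list above, a client has to send the Accept-Encoding: gzip header and then decompress the body itself. The following is a minimal sketch using only the Python standard library. The `/api/workflows/v1/<id>/metadata` path is an assumption (the notes only confirm the /api/workflows prefix), and both helpers are hypothetical.

```python
import gzip
import json
from urllib.request import Request


def metadata_request(base_url, workflow_id):
    """Build (but do not send) a metadata request that opts in to gzip."""
    # The "v1/<id>/metadata" part of the path is an assumption.
    url = f"{base_url}/api/workflows/v1/{workflow_id}/metadata"
    return Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(body, content_encoding):
    """Decode a response body, decompressing it first if it was gzipped."""
    if content_encoding == "gzip":
        body = gzip.decompress(body)
    return json.loads(body.decode("utf-8"))


request = metadata_request("http://localhost:8000", "some-workflow-id")

# Simulated compressed response body, so the sketch runs without a server.
compressed = gzip.compress(b'{"id": "some-workflow-id"}')
metadata = decode_body(compressed, "gzip")
```

Passing the response's Content-Encoding header to `decode_body` keeps the same code path working when the server answers without compression.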
Other changes
- Cromwell server
  When running in server mode, Cromwell now attempts to shut down gracefully after receiving a SIGINT (Ctrl-C) or SIGTERM (kill) signal. This means that Cromwell waits for all pending database writes before exiting, as long as you include application.conf at the top of your config file. You can find detailed information about how to configure this feature in the Cromwell Wiki.
- Concurrent jobs
  You can now limit the number of concurrent jobs for any backend. Previously this was only possible in some backend implementations. Please see the README for details.
WDL
- Optional WDL variables
  Empty optional WDL values are now rendered as the null JSON value instead of the JSON string "null" in the metadata and output endpoints. You do not need to migrate previous workflows. Workflows run on Cromwell 28 and prior will still render empty values as "null".
- Empty WDL variables
  Cromwell now accepts null JSON values in the input file and coerces them as an empty WDL value. WDL variables must be declared optional in order to be supplied with a null JSON value.
input.json
{
    "null_input_values.maybeString": null,
    "null_input_values.arrayOfMaybeInts": [1, 2, null, 4]
}
workflow.wdl
workflow null_input_values {
    String? maybeString
    Array[Int?] arrayOfMaybeInts
}
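The distinction between the null JSON value and the JSON string "null" is easy to check from a client. The sketch below parses the input.json shown above with Python's standard json module; it illustrates the JSON side only and makes no claim about how Cromwell performs the coercion.

```python
import json

# Same content as the input.json example above.
INPUT_JSON = """
{
    "null_input_values.maybeString": null,
    "null_input_values.arrayOfMaybeInts": [1, 2, null, 4]
}
"""

inputs = json.loads(INPUT_JSON)

# A JSON null is parsed as None, both at the top level and inside arrays.
maybe_string = inputs["null_input_values.maybeString"]
array_of_maybe_ints = inputs["null_input_values.arrayOfMaybeInts"]

# The new rendering is the null value; the old one was the string "null".
new_rendering = json.dumps(None)
old_rendering = json.dumps("null")
```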