Skip to content

Commit

Permalink
feat!: Custom and protobuf data converters (#477)
Browse files Browse the repository at this point in the history
- Adds custom data converter feature and changes the DataConverter API. Design doc: https://github.com/temporalio/sdk-typescript/tree/main/docs/data-converter.md
  ```ts
  interface DataConverter {
    payloadConverterPath?: string;
    payloadCodec?: PayloadCodec;
  }

  interface PayloadConverter {
    toPayload<T>(value: T): Payload | undefined;
    fromPayload<T>(payload: Payload): T;
  }

  interface PayloadCodec {
    encode(payloads: Payload[]): Promise<Payload[]>;
    decode(payloads: Payload[]): Promise<Payload[]>;
  }
  ```
  Note: Codec is not yet run on Payloads in interceptor headers.
- Unreverts #430 and modified the Protobuf data converter API: https://github.com/temporalio/sdk-typescript/tree/main/docs/protobuf-libraries.md#current-solution
- Separated `common` package into:
  ```
  common
  internal-workflow-common
  internal-non-workflow-common
  ```
  The new `common` only exports things you might want to use in your own common code (shared between client/worker/workflow) like data converters, failures, and errors. The default exports of `common` and `internal-workflow-common` are included in the Workflow bundle.
- Always include `assert` in the Workflow bundle.
- Closes #130
- Closes #237
- Closes #434
  • Loading branch information
lorensr authored Mar 2, 2022
1 parent e1213da commit 82351cc
Show file tree
Hide file tree
Showing 133 changed files with 4,815 additions and 2,483 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,5 @@ build/
commonjs/
es2020/
lerna*.log
packages/test/protos/json-module.js
packages/test/protos/root.d.ts
2 changes: 2 additions & 0 deletions .prettierignore
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@ packages/docs/docs/api
typedoc-sidebar.js
lerna.json
README.md
packages/test/protos/root.d.ts
packages/test/protos/json-module.js

# Ignored since it fails Windows CI
packages/create-project/cli.js
60 changes: 60 additions & 0 deletions docs/data-converter.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
When designing the custom data converter feature, we considered two routes:

- Doing conversion outside of Workflow vm
- Doing conversion inside and outside of Workflow vm

### Outside vm

- Pro: Users can use any node module in their custom data converter code.
- Pro: Methods can be async (users can use Promises).
- Con: Only works because `vm` allows for passing complex objects into/out of vm. If we switch to an isolation method like `isolated-vm` or `rusty_v8`, conversion needs to be done inside.
- Con: `object instanceof Class` doesn't work on object that come from the vm, because the `Class` definition inside the vm is from a different instance of the code. A workaround like this must be used:

```ts
function workflowInclusiveInstanceOf(instance: unknown, type: Function): boolean {
let proto = Object.getPrototypeOf(instance);
while (proto) {
if (proto.constructor?.toString() === type.toString()) return true;
proto = Object.getPrototypeOf(proto);
}
return false;
}
```

## Decision

Given the possibility of switching or adding other isolation methods in future, we opted to convert to/from Payloads inside the vm (`PayloadConverter`). We also added another transformer layer called `PayloadCodec` that runs outside the vm, can use node async APIs (like `zlib.gzip` for compression or `crypto.scrypt` for encryption), and operates on Payloads. A `DataConverter` is a `PayloadConverter` and a `PayloadCodec`:

```ts
export interface DataConverter {
payloadConverterPath?: string;
payloadCodec?: PayloadCodec;
}

export interface PayloadConverter {
toPayload<T>(value: T): Payload | undefined;
fromPayload<T>(payload: Payload): T;
}

export interface PayloadCodec {
encode(payloads: Payload[]): Promise<Payload[]>;
decode(payloads: Payload[]): Promise<Payload[]>;
}
```

### Worker converter flow

`PayloadCodec` only runs in the main thread.

When `WorkerOptions.dataConverter.payloadConverterPath` is provided, the code at that location is loaded into the main thread and the webpack Workflow bundle.

`Worker.create`:
_main thread_

- imports and validates `options.dataConverter.payloadConverterPath`
- passes `payloadConverterPath` to `WorkflowCodeBundler`

`worker-interface.ts#initRuntime`:
_workflow vm_

- Imports `__temporal_custom_payload_converter`, which will either be the code bundled from `payloadConverterPath` or `undefined`. If it's defined, sets `state.payloadConverter`.
122 changes: 122 additions & 0 deletions docs/protobuf-libraries.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,122 @@
Loren & Roey surveyed the available protobuf libraries in Dec '21 for use with our `ProtobufBinaryDataConverter` and `ProtobufJsonDataConverter`. The main criteria was:

- A. TypeScript types for messages
- B. Being able to check at runtime whether an object passed to the SDK as input or returned to the SDK from a workflow/query/activity is meant to be protobuf-serialized, without adding annotations to the functions.
- C. Spec-compliant [proto3 JSON encoding](https://developers.google.com/protocol-buffers/docs/proto3#json) so that the TS SDK is interoperable with the other SDKs

## Options

### protobufjs

A and B, but not C.

- Most popular lib (5M downloads/wk)
- Fairly inactive maintainers (infrequent updates, many open PRs & issues)
- [Non-standard](https://github.com/protobufjs/protobuf.js/issues/1304) JSON serialization
- Message classes with generated types and runtime-checkable instances

### proto3-json-serializer

C

- Adds spec-compliant JSON encoding to protobufjs
- Maintained by responsive Googlers, 900k downloads/wk
- Requires runtime-loaded messages (not compatible with generated classes)

### google-protobuf

B

- Official Google lib, 800k downloads/wk
- No types or JSON encoding
- Compiler installed separately (not on npm)

### ts-proto

A and some of C

- Generates TS interfaces and encoding functions
- Designed for POJOs (no instances of message classes), so can't do B
- JSON encoding is probably [not yet fully spec compliant](https://github.com/stephenh/ts-proto/pull/448#issuecomment-998166664)

### protoc-gen-ts

A and B

- Plugin for Google's `protoc` compiler
- Generated classes extend `google-protobuf`'s Message, but doesn't add JSON
- Maintainer [seems interested in JSON encoding](https://github.com/protocolbuffers/protobuf/issues/4540#issuecomment-915609405), but isn't there yet (only has `to/fromObject` methods—need eg a fromJSON that converts the below base64 to a bytearray, and a toJSON that converts a bytearray to base64)

## Current solution

- Use `protobufjs` with `proto3-json-serializer`
- Have users use runtime-loaded messages (not generated classes) and `Class.create` (not `new Class()`, which doesn't work with runtime-loaded messages)
- Patch `json-module` output (which adds `nested` attributes to lowercase namespaces [which causes a TS error](https://github.com/protobufjs/protobuf.js/issues/1014))

```ts
// json-module.js generated with:
// pbjs -t json-module -w commonjs -o json-module.js *.proto

// protos/root.js
const { patchProtobufRoot } = require('@temporalio/common');
const unpatchedRoot = require('./json-module');
module.exports = patchProtobufRoot(unpatchedRoot);

// root.d.ts generated with:
// pbjs -t static-module *.proto | pbts -o root.d.ts -

// src/data-converter.ts
import { DefaultPayloadConverterWithProtobufs } from '@temporalio/common/lib/protobufs';
import root from '../protos/root';

export const dataConverter = new DefaultPayloadConverterWithProtobufs({ protobufRoot: root });

// src/worker.ts
import { dataConverter } from './data-converter';

const worker = Worker.create({ dataConverter, ... });

// src/client.ts
import { foo } from '../protos/root';
import { dataConverter } from './data-converter';

const client = new WorkflowClient(connection.service, { dataConverter });
await client.start(protoWorkflow, {
args: [foo.bar.ProtoInput.create({ name: 'Proto', age: 1 })], // can't use `new foo.bar.ProtoInput()`
taskQueue: 'tutorial',
workflowId: 'my-business-id',
});

// src/workflows.ts
import { foo } from '../protos/root';

export async function protoWorkflow(input: foo.bar.ProtoInput): Promise<foo.bar.ProtoResult> {
return foo.bar.ProtoResult.create({ sentence: `Name is ${input.name}` });
}
```

We originally were thinking of this, but the namespaces in `json-module.js` get lost through `patchProtobufRoot()`:

```ts
import * as generatedRoot from '../protos/json-module';

const patchProtobufRoot = <T>(x: T): T => x;
const root = patchProtobufRoot(generatedRoot);

function myWorkflowError(input: root.foo.bar.ProtoActivityInput) {
return input.name;
}
```

On root in `root.foo.bar.ProtoActivityInput`, TS errors: `Cannot find namespace 'root'.`

## Future work

If we can get changes merged into `protobufjs` (or want to fork), we can do one or both of the below:

1. Change the `json-module` output to not have `nested` attributes so we don't have to patch
2. Add to the generated classes:

- spec-compliant `to/fromJSON` methods
- `typename` field that includes the namespace (eg `"foo.bar.MyMessage"`)
- "this is a generated file" comment @ top
5 changes: 3 additions & 2 deletions docs/sdk-structure.md
Original file line number Diff line number Diff line change
@@ -1,14 +1,15 @@
# SDK Structure

The TypeScript SDK is developed in a mono-repo consisting of several packages managed with [`lerna`](https://lerna.js.org/), the public packages are listed below.
The TypeScript SDK is developed in a monorepo consisting of several packages managed with [`lerna`](https://lerna.js.org/). The public packages are:

- [`temporalio`](../packages/meta) - Meta package, bundles the common packages for ease of installation.
- [`@temporalio/worker`](../packages/worker) - Communicates with the Temporal service and runs workflows and activities
- [`@temporalio/workflow`](../packages/workflow) - Workflow runtime library
- [`@temporalio/activity`](../packages/activity) - Access to current activity context
- [`@temporalio/client`](../packages/client) - Communicate with the Temporal service for things like administration and scheduling workflows
- [`@temporalio/proto`](../packages/proto) - Compiled protobuf definitions
- [`@temporalio/common`](../packages/common) - Common package for other SDK packages
- [`@temporalio/workflow-common`](../packages/workflow-common) - Code shared between `@temporalio/workflow` and other packages
- [`@temporalio/common`](../packages/common) - All shared code (re-exports everything in `@temporalio/workflow-common`, and also has code shared between packages other than `@temporalio/workflow`)
- [`@temporalio/create`](../packages/create-project) - NPM package initializer

[Repo visualization](https://octo-repo-visualization.vercel.app/?repo=temporalio%2Fsdk-typescript)
Loading

0 comments on commit 82351cc

Please sign in to comment.