Ruby phase 1 #93

cretz · 2024-07-10T18:17:32Z

Summary

This is the proposal for the first phase of the Ruby SDK. This is the second discussion of this proposal, see Ruby SDK - Phase 1 - Initial Comment Period #92 for past discussion.
See it rendered.
Much of this is inspired by the current Temporal Ruby SDK (and its proposals which were moved to a sub directory) and our other SDKs.
Use this PR for comments/discussion. Don't be afraid of making too many comments; if it gets too noisy, I'll just create another PR.
Any points of contention will be discussed in-person by the SDK team and a decision will be made.

For the public, please also join us on #ruby-sdk in Slack.

cretz · 2024-07-12T19:56:12Z

ruby/sdk-phase-1.md

+* Custom metrics and optional metrics buffer for custom processing
+* Typed search attributes, memos, headers, etc
+* API key, RPC metadata, HTTP CONNECT proxy support, etc
+* Loggers with contextual information


This will likely be our own logger abstraction since we need a pluggable logger factory and Ruby stdlib doesn't really offer one

alexevanczuk · 2024-07-16T15:15:09Z

ruby/sdk-phase-1.md

+    but in general Ruby typing right now is a mess and we likely will hit bugs or advanced cases a yard/RBS combination
+    library won't work with.
+    * ❓ How hard do we want to work to avoid duplicate yard/RBS typing?
+  * 💭 Why not Sorbet? Sorbet can come later or provided separately, but RBS seems to be the newer movement.


Hey @cretz , the org where I work is very excited about the official Ruby SDK for temporal!

Just wanted to drop a note that we use Sorbet heavily and it seems like it's the much more popular typing system in the community (despite the introduction of RBS). It won't affect our usage much either way (although first-class sorbet typing would make it slightly easier to integrate), but if you were interested in exploring/chatting more about sorbet, just wanted to offer to jump on a synchronous call to talk through concerns/tradeoffs. I've added sorbet to many gems and applications and would be very happy to chat through things.

Thanks for prioritizing this work!!

Thanks! Yes, I definitely believe RBS is weaker than Sorbet, and less adopted, and Steep is not a great tool, but I am hoping to avoid non-stdlib deps where we can. We considered no typing at all and that still may be an option. But during PoC, RBS was acceptable for this limited use case, but I expect most users to treat this as an untyped library, including Sorbet users. So I definitely agree it won't affect most people's usage either way. At this point with such low RBS adoption, it essentially just becomes a check on ourselves (we did this successfully in the last incarnation at https://github.com/temporalio/sdk-ruby/tree/v0.1.1).

We were just wanting to avoid maintaining Sorbet annotations in repo. But we would totally encourage external ones at https://github.com/Shopify/rbi-central or https://github.com/sorbet/sorbet-typed or wherever.

Would definitely like to chat on this further. If easier, can come join us in #ruby-sdk in Slack. The pinned message in that channel also has a link to a survey w/ a request at the end to setup a meeting to discuss Ruby if that's preferred.

I filled out the survey, thanks!

I commented in that channel in case you wanted to chat through Sorbet more :)

Thanks again!

bdchauvette · 2024-07-16T16:32:23Z

ruby/sdk-phase-1.md

+  * 💭 Why classes instead of methods/functions like every other SDK? There are tradeoffs to both approaches. As
+    classes, they are more easily referenced, a bit more easily typed in RBS, and metadata can be applied more easily at
+    the class level. This comes at a cost of not being able to easily share state across classes, but users can provide
+    something in the constructor multiple activities use. Open to considering activity methods instead.


You might've already discussed this internally, but a middle ground here might be supporting any object that responds to #call rather than #execute.

Using #call would potentially allow simple activities to be implemented as a Proc or Lambda, while still allowing people to use custom classes for more complex scenarios.

In the spirit of "there's more than one way to do it", this would give SDK users multiple options for defining activities, and they could choose whichever one is most appropriate for their circumstances:

module Activities # Constant procs SomeProcActivity = ->(params) { # ... } # Proc factory methods def self.another_proc_activity(config) ->(params) { # can use config to customize proc } } # Maybe some kind of DSL? DSLActivity = Temporalio.activity(options) do |params| # ... end # Child classes for full control ChildClassActivity < Temporalio::Activity def initialize(config) # ... end def call(params) # ... end end end

# Activity registration using a hash, similar to TS API: worker1 = Temporalio::Worker.new( # ... activities: { SomeProcActivity: Activities::SomeProcActivity, AnotherProcActivity: Activities.another_proc_activity(config), DSLActivity: Activities::DSLActivity, ChildClassActivity: Activities::ChildClassActivity, # Even inline activities, which is almost certainly not a good # idea for production code, but could be useful in some situations. InlineActivity: ->(params) { # ... } } # ... )

And possibly a block-based API on the worker:

worker.register_activity('MyActivity') do |params| # ... end

As well as anything that responds to #to_proc:

worker.register_activity('MyActivity', &:my_activity)

All of these could of course also be supported using the #execute method, but then there'd be a discrepancy between the SDK's #execute and the proc's #call.

Some similar APIs:

the Rack specification, which makes heavy use of objects that respond to #call

Logger#formatter, which is documented as supported either a proc or an instance Logger::Formatter (which responds to #call)

AWS SDK uses a Callable interface for various callback-like options in some services

Using #call would potentially allow simple activities to be implemented as a Proc or Lambda, while still allowing people to use custom classes for more complex scenarios.

A couple of important aspects we want for activities are 1) they are easily referenceable by a caller without calling (e.g. Workflow.execute_activity(MyActivityClass, ..., w/ possible generic typing support w/ RBS/Sorbet), and 2) people can use existing things (e.g. mixins) and there is not much magic.

In the spirit of "there's more than one way to do it", this would give SDK users multiple options for defining activities,

We actually try to avoid this in SDKs where we can at declaration time. Sure some SDKs do have a couple of ways, but in general SDKs say "activities are defined as X" without many values for X and without more magic than we have to have. And remember there is the referenceability concern.

A couple of important aspects we want for activities are 1) they are easily referenceable by a caller without calling (e.g. Workflow.execute_activity(MyActivityClass, ..., w/ possible generic typing support w/ RBS/Sorbet), and 2) people can use existing things (e.g. mixins) and there is not much magic.

Got it. If I'm understanding them correctly, those all make sense to me as goals / requirements 👍

For (1), is that referring to the ability to call an activity with something like MyActivity.execute rather than workflow.execute_activity(MyActivity)? If so, then using a method other than #call definitely makes sense to me. IIUC, that'll help prevent users from trying to execute activity classes that haven't been wired up to Temporal.

We actually try to avoid this in SDKs where we can at declaration time. Sure some SDKs do have a couple of ways, but in general SDKs say "activities are defined as X" without many values for X and without more magic than we have to have. And remember there is the referenceability concern.

I totally get this, and I'm personally not a huge fan of how Ruby has adopted Perl's TIMTOWTDI philosophy. But I also think a lot of Rubyists definitely view that as one of Ruby's strengths, so just wanted to put in a plug for that way of thinking 😅

That being said, I don't think anyone will avoid using Temporal just because they can't implement activities as procs, so it's probably not worth the effort to implement (& support!) that level of flexibility.

is that referring to the ability to call an activity with something like MyActivity.execute rather than workflow.execute_activity(MyActivity)?

No, we want the latter. We learned from other SDKs that it is best to not make the remote activity invocation look like a regular local method invocation.

they can't implement activities as procs

Maybe we can have sugar for this, e.g. class ProcActivity < Temporalio::Activity and a class method on Temporalio::Activity that creates it. But it is the same type of sugar a user can write if we make sure that we support activities as classes.

bdchauvette · 2024-07-16T17:01:32Z

ruby/sdk-phase-1.md

+
+### Workflows in Client
+
+Early thoughts on workflows are that they'll be Ractors. For the purposes of this phase, all that may matter is how


I left a comment about this in the Google form as well, but I'm not sure if Ractors would be the right move here. They're still very experimental and still have some fundamental limitations:

https://rubykaigi.org/2023/presentations/ko1.html

Ractor was introduced in Ruby 3.0 as an experimental feature to enable parallel programming in Ruby. Now Ractor is not widely used for many reasons. For example, from the perspective of a Ractor implementer, Ractor does not have enough performance and quality. In this talk, we will explain which features are not enough, why they are difficult, and how to solve these difficulties.

https://rubykaigi.org/2024/presentations/ko1.html

This talk presents recent updates to Ractor, which enables parallel and concurrent programming on Ruby.

Ractor still lacks fundamental features. For example, we cannot use “require” method and “timeout” methods on non-main Ractors because of synchronization and implementation issues. We will discuss such problems and how to solve them. From a performance point of view, we have introduced the M:N thread scheduler in Ruby 3.3 and we will show the performance analysis with recent improvements.

They also still output scary warnings, even in Ruby 3.3:

irb(main):001> Ractor.new { puts RUBY_VERSION } (irb):1: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues. 3.3.4 => #<Ractor:#2 (irb):1 terminated>

I definitely think that Ractors could be a good solution in a future world where Ractors are more polished, but IMO for now it would be safer to follow the coinbase/temporal-ruby SDK and use Fibers as the basis for scaling workflow executions.

One of the biggest challenges in SDK development is sandboxing for determinism. There are two main impediments to determinism: state isolation and restricted API use (e.g. disk IO). Ignoring the latter, if we can get the former it may be worth it. JS is lucky in that it can spin up a new V8 instance. But it is a nightmare for Go and Java and .NET and other devs to remember that they cannot share state. We expended a lot of effort making a Python sandbox just to get this state isolation so that globals don't pollute.

We will have to evaluate performance to see if this can work, and we can provide a way to opt-out of Ractors (for the most part none of the Ractor stuff would be visible to users unless they wanted to do bad things like expose something shareable).

Fibers as the basis for scaling workflow executions.

We will definitely still be using fibers (we will have our own fiber scheduler since we need deterministic coroutines), but if we can get state isolation by default, it may be worth the performance penalty. This is more of a phase 2 thing though, workflows aren't going to be done now.

Interestingly enough, deterministic isolated Temporal workflows is like one of the only good use cases I have seen for Ractors, heh. But it's subject to testing to make sure the good limitations are good and the bad limitations aren't too bad (and of course we can allow it to be disabled). Most users will never know there are Ractors under the hood powering workflows.

They also still output scary warnings, even in Ruby 3.3:

This is scary. Maybe it won't be ready. We will evaluate in phase 2.

Got it, that makes sense! If it's only for workflow-internal things, I think Ractors sound great.

But if the goal is to isolate global state even between activities, I could see that being very challenging with Ractors. For example, I'm not sure if ActiveRecord would would play nicely with Ractors, because (AFAIK) the connection pool wouldn't be able to be shared between Ractors.

Interestingly enough, deterministic isolated Temporal workflows is like one of the only good use cases I have seen for Ractors, heh.

Same, TBH 😅

I want to use them, but haven't really found a use case yet, especially given their current limitations :(

But if the goal is to isolate global state even between activities, I could see that being very challenging with Ractors

Definitely not the goal. Activities will not have anything to do with Ractors at all. Activities are just normal Ruby code where you can do what you want (we will allow different executors so people can choose whether to run fiber/async vs threaded (the default)). In fact, it is a very common use case to share state between activities. For instance, it is very normal to have multiple activity invocations share a database client. This is why we will allow users to instantiate the activity class, meaning the same instance will be called on each attempt (but if you pass the class in, we will instantiate per attempt).

Sounds great! Thanks for helping me understand!

bdchauvette · 2024-07-16T19:43:48Z

ruby/sdk-phase-1.md

+* API key, RPC metadata, HTTP CONNECT proxy support, etc
+* Loggers with contextual information
+* Full error hierarchy with `Temporalio::Error` as the root
+* ❓ Are there any Ruby libraries so widely adopted they might as well be in the standard library and therefore deserve


ActiveSupport is probably the big contender here, but TBH I don't know how much the Temporal would really benefit from having it available.

Optional support for things like ActiveSupport::Duration might be nice for things like start_delay, but that could be a pain to implement (& maintain) alongside normal Numeric arguments.

👍 We definitely plan to have samples showing integrations, but not sure we need an explicit integration for this. Whatever integrations we do have would be optional/additional because it is really important that we have as few dependencies as possible in the primary SDK because our dependency constraints become users' transitive constraints. I believe we can have only 1 dependency (protobuf) at this time. temporalio-opentelemetry is the only additional gem I can think we'll have at first. I expect active support, rails, etc samples to exist though.

bdchauvette · 2024-07-16T19:53:25Z

ruby/sdk-phase-1.md

+* Workflow handles exist with all relevant calls.
+* Note that we use the kwargs constructor + options struct pattern so that a user can access duped options, change one,
+  and reconstruct via `Client.new(**my_new_options.to_h)`
+  * ❓Is there a better way to allow people to easily change an option? We didn't want a `with` pattern because, at


One very common pattern in Ruby is to have config methods (including some initializers) accept an optional block that yields the config object for modification:

class Client def initialize(**options) @config = Configuration.new(**options) yield @config if block_given? # ... end end

client = Client.new do |config| config.namespace = "my-namespace" end

I think that would feel very natural for many Rubyists, but might be a pretty large departure from how the other SDKs are configured?

For client connect and initialization (and some others) we are wanting keyword arguments for all but one or two required positional args since the vast majority are optional and we're not really wanting multiple ways to configure the client (e.g. allow both block and kwargs). In this case it's more like Client.connect("localhost:7233", "my-namespace") which is a shortcut for Client.new(Connection.new("localhost:7233"), "my-namespace"). There is a PR coming either here in a couple of hours or early tomorrow that will show this.

Here is the intentionally-limited PR: temporalio/sdk-ruby#155. Would love feedback there, I sprinkled some PR comments throughout.

bdchauvette · 2024-07-16T20:40:09Z

ruby/sdk-phase-1.md

+* Activity can choose its executor, default is `:default`.
+  * See worker section for more information on executors.
+* Cancellation is reported to the executor.
+  * For thread pool executor (the default), this raises a cancellation exception via `Thread.raise`.


If this means that the executor will raise an exception on the activity by calling e.g. activity_thread.raise CancellationError, that could potentially be really dangerous. Ruby's built-in Timeout module uses this approach, and it can be problematic:

https://headius.blogspot.com/2008/02/rubys-threadraise-threadkill-timeoutrb.html

https://jvns.ca/blog/2015/11/27/why-rubys-timeout-is-dangerous-and-thread-dot-raise-is-terrifying/

https://github.com/zombocom/rack-timeout/blob/main/doc/risks.md

That being said, the SDK would probably be fine >99% of the time by using Thread.handle_interrupt, similar to Sidekiq:

https://github.com/sidekiq/sidekiq/blob/fdd82b797fb07f5c4e797f742b991874734493f7/lib/sidekiq/processor.rb#L181-L182

But even that can be really subtle: sidekiq/sidekiq#6017 😬

Yes, this was a topic of discussion in Python too where this kind of thing is even less common (we have to make a C call to https://docs.python.org/3/c-api/init.html#c.PyThreadState_SetAsyncExc, it's not even in the standard library). We can provide a way to opt-out of this (and we will provide a "shield" way to ensure certain code runs), but we've found it's dangerous to let runaway activities continue to run that cannot get interrupted. If there is some other model in the standard library for propagating cancellation to threaded tasks, we'd like to use it.

I will make a note to make sure we use handle_interrupt for our exception only and make sure to have tests demonstrate other forms of interrupt.

cretz · 2024-08-19T21:22:35Z

ruby/sdk-phase-1.md

+  * 💭 Why no `Temporalio::Common`? Ruby expects to have common things where they make sense instead of under a `Common`
+    module.


We may have to walk this back a bit. It is unreasonable to have Temporalio::WorkflowIDReusePolicy, Temporalio::WorkflowIDConflictPolicy, etc cluttering the top-level (or is it?). However, we don't want RetryPolicy to be in Common. So I guess we may need some kinda hybrid rule that says use the Common module for small or lesser used common classes?

Maybe we should just have common_enums.rb at that defines top-level, common enums, but still under Temporalio module.

cretz · 2024-09-13T22:07:34Z

ruby/sdk-phase-1.md

+  * It has a class method for `current` that returns from thread/fiber-local and class method shortcuts for all instance
+    methods.
+    * 💭 Why class methods equivalents? `Temporalio::Activity::Context.heartbeat` is easier to read than
+      `Temporalio::Activity::Context.current.heartbeat`. In .NET we made them get current instance first because that's
+      normal there, and in Python we only offered package-level methods because that's normal there.


Suggested change

* It has a class method for `current` that returns from thread/fiber-local and class method shortcuts for all instance

methods.

* 💭 Why class methods equivalents? `Temporalio::Activity::Context.heartbeat` is easier to read than

`Temporalio::Activity::Context.current.heartbeat`. In .NET we made them get current instance first because that's

normal there, and in Python we only offered package-level methods because that's normal there.

* It has a class method for `current` that returns from thread/fiber-local.

I think we are going to walk back this decision for also having class methods. I think it's not really common in Ruby to define class and instance methods of the same name, and Singleton and others show that people are ok calling .current or something.

mjameswh · 2024-09-17T15:35:08Z

ruby/sdk-phase-1.md

+
+## Clients
+
+`Temporalio::Client` is a namespace-specific client and `Temporalio::Client::Connection` is a namespace-independent


Temporalio::Client::Connection is a namespace-independent connection

Are you still holding on to this statement, given the semantic change introduced by Cloud's API Keys? I'm just curious, not arguing either way.

Yes. Connection is the Rust Client and the CloudOperationsClient here (as in other SDKs) will leverage the Connection. Can see this separation in practice already at main of https://github.com/temporalio/sdk-ruby (and in Python SDK and .NET SDK).

mjameswh

Looks good to me.

cretz added 4 commits May 24, 2024 10:34

Ruby SDK - Phase 1

8991062

Minor typos

03a5c72

Added worker sample

f59f389

Updated Ruby proposal based on feedback/learnings

42e496f

cretz requested a review from a team July 10, 2024 18:17

cretz mentioned this pull request Jul 10, 2024

Ruby SDK - Phase 1 - Initial Comment Period #92

Closed

cretz commented Jul 12, 2024

View reviewed changes

alexevanczuk reviewed Jul 16, 2024

View reviewed changes

bdchauvette reviewed Jul 16, 2024

View reviewed changes

cretz mentioned this pull request Jul 16, 2024

Scaffolding and bare-bones client temporalio/sdk-ruby#155

Merged

cretz commented Aug 19, 2024

View reviewed changes

cretz commented Sep 13, 2024

View reviewed changes

mjameswh reviewed Sep 17, 2024

View reviewed changes

mjameswh approved these changes Sep 17, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ruby phase 1 #93

Ruby phase 1 #93

cretz commented Jul 10, 2024 •

edited

Loading

cretz Jul 12, 2024 •

edited

Loading

alexevanczuk Jul 16, 2024

cretz Jul 16, 2024 •

edited

Loading

alexevanczuk Jul 16, 2024

bdchauvette Jul 16, 2024

cretz Jul 16, 2024 •

edited

Loading

bdchauvette Jul 16, 2024

cretz Jul 16, 2024 •

edited

Loading

bdchauvette Jul 16, 2024

cretz Jul 16, 2024 •

edited

Loading

bdchauvette Jul 16, 2024

cretz Jul 16, 2024 •

edited

Loading

bdchauvette Jul 16, 2024

bdchauvette Jul 16, 2024

cretz Jul 16, 2024 •

edited

Loading

bdchauvette Jul 16, 2024

cretz Jul 16, 2024 •

edited

Loading

cretz Jul 16, 2024

bdchauvette Jul 16, 2024

cretz Jul 16, 2024 •

edited

Loading

cretz Aug 19, 2024 •

edited

Loading

cretz Sep 13, 2024

mjameswh Sep 17, 2024

cretz Sep 23, 2024

mjameswh left a comment


		### Workflows in Client

		Early thoughts on workflows are that they'll be Ractors. For the purposes of this phase, all that may matter is how

		* 💭 Why no `Temporalio::Common`? Ruby expects to have common things where they make sense instead of under a `Common`
		module.


		## Clients

		`Temporalio::Client` is a namespace-specific client and `Temporalio::Client::Connection` is a namespace-independent

Ruby phase 1 #93

Are you sure you want to change the base?

Ruby phase 1 #93

Conversation

cretz commented Jul 10, 2024 • edited Loading

Summary

cretz Jul 12, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cretz Jul 16, 2024 • edited Loading

Choose a reason for hiding this comment

cretz Aug 19, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mjameswh left a comment

Choose a reason for hiding this comment

cretz commented Jul 10, 2024 •

edited

Loading

cretz Jul 12, 2024 •

edited

Loading

cretz Jul 16, 2024 •

edited

Loading

cretz Jul 16, 2024 •

edited

Loading

cretz Jul 16, 2024 •

edited

Loading

cretz Jul 16, 2024 •

edited

Loading

cretz Jul 16, 2024 •

edited

Loading

cretz Jul 16, 2024 •

edited

Loading

cretz Jul 16, 2024 •

edited

Loading

cretz Jul 16, 2024 •

edited

Loading

cretz Aug 19, 2024 •

edited

Loading