Have you seen the other signals crate? #1

Closed

davidhewitt opened this issue Mar 12, 2018 · 14 comments
@davidhewitt
Contributor

Not sure how it compares (haven't read in much detail) but I just saw that this thing has also sprung into existence on crates.io: https://crates.io/crates/signals

(I found it while looking for this repo after it moved out of dominator!)

@Pauan
Owner

Pauan commented Mar 12, 2018

Indeed I saw it. It was published literally 15 minutes before I tried to publish my signals crate.

It's currently very bare-bones (it's only ~126 lines of code), and it has a lot of overhead (it uses Arc + Mutex + broadcasting for every Signal, etc.), and it also doesn't integrate with Futures or Streams at all.

Of course all of that can be fixed, but I already have a zero-cost full-featured Signals library. So I had offered to collaborate with them, but no reply yet.

For now it's best to use a git dependency for rust-signals.
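For reference, a git dependency goes in Cargo.toml roughly like this (the repository URL and package name here are assumptions, so check the repository for the exact values):

[dependencies]
futures-signals = { git = "https://github.com/Pauan/rust-signals" }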

@davidhewitt
Contributor Author

davidhewitt commented Mar 12, 2018

Ugh; slightly frustrating for the name to go like that! Hopefully the collaboration offer is accepted!

Failing that this crate can always take the name zero_cost_signals 😉

@Pauan
Owner

Pauan commented Mar 12, 2018

That's not a bad name. I've actually spent a non-trivial amount of time trying to come up with an alternate name, just in case the offer isn't accepted. So any more suggestions are appreciated.

@Pauan
Owner

Pauan commented Mar 14, 2018

I think futures-signals is probably the best name. That emphasizes that it is closely related to Futures.

Long-term I would love to have it integrated directly into futures, but that will require a lot of work.

@davidhewitt
Contributor Author

davidhewitt commented Mar 18, 2018

futures-signals is indeed a nice name. I see that's published on crates.io now! 👍

@zutils

zutils commented Oct 23, 2018

Hi! Sorry for the belated response. It's good to hear there are many people working on some type of "signals" system. I am more than happy to collaborate!

The original intent of my signals crate is to replace what I lost with Qt's signals/slots mechanism when moving to Rust. As this was one of my first Rust projects, you are right that there is cost associated with them - a cost I would not mind fixing.

My version of signals MUST be multi-threaded. I want to be able to call a closure whenever a file or other event is received. So far, the only way I've been using the crate is as an easy way to run multi-threaded closures. Do you recommend a better way to collaborate live?

@Pauan
Owner

Pauan commented Oct 25, 2018

I am more than happy to collaborate!

That's great!

My version of signals MUST be multi-threaded.

Could you explain more about what you mean by that? Like specific requirements that you have?

My signals library is fully thread safe: you can send signals between threads, you can broadcast to multiple threads, etc.

I want to be able to call a closure whenever a file or other event is received.

For events I don't recommend using Signals, because Signals are lossy.

The purpose of Signals is to act like a mutable variable which lets you be notified when the value changes. So if your use case fits into that mental model, then Signals work great!

But if instead you want an ordered sequence of multiple values, a Stream is much better (Streams are provided by the futures crate). They maintain an internal queue so that you don't lose any events.

Do you recommend a better way to collaborate live?

I do most of my work on GitHub, I don't really have a faster form of communication.

@zutils

zutils commented Oct 26, 2018

Pauan,
Thanks for getting back to me! You appear to be very knowledgeable, and I hope that I can answer your questions the best I can.

It seems like streams really are something different from my vision. The futures crate's streams look like they require me to poll for data. If any polling is to be done, it will take place "inside" one of my signals as a separate design pattern, though I anticipate streams being used separately from my signals.

Your definition of Signals as "a mutable variable which lets you be notified when the value changes" is very accurate, yet I went a step further. The further step was that when a mutable signal variable is modified, it can map the data and/or call a function before pushing the data to any listening signals.

As far as events are concerned, I believe those would be a separate design pattern, like polling, that would take place in the signal's mappable function.

The reason for multi-threading is so that while the function that maps the data is being called, it doesn't bottleneck the current thread. For example, a signal could handle async HTTP requests, and when the request comes in, that data could be read, mapped, or passed on to one or more handlers.

I've never heard of signals as lossy. Could you point me to a resource?

Please let me know what your thoughts are.

@Pauan
Owner

Pauan commented Oct 26, 2018

It seems like streams really are something different from my vision. The futures crate's streams look like they require me to poll for data.

This is a tricky and nuanced subject, so this is going to be a long post. I apologize in advance for that.

There's been a lot of research done into signals, and there are generally two flavors, push and pull:

  • The push model (which you are using) essentially uses callbacks (or similar). It's quite similar to event listeners.

  • The pull model requires you to regularly poll the signal for changes.

Both systems have advantages and disadvantages. The advantage of pull is that it's very simple to implement, and it works wonderfully with Rust's memory model. The downside is that it's inefficient and has high latency (because it has to constantly poll).

The advantage of push is that it never misses changes, has good performance, low latency, etc. But it can have poor behavior/performance in some common situations. It also has a lot of hidden performance costs in Rust (due to Rust's memory model).

The biggest downside of push is that it can trigger multiple updates for a single change. Consider this hypothetical graph of signals:

  A
 / \
B   C
 \ /
  D

When A changes we want to automatically update B and C, and when either B or C changes, we want to automatically update D.

This sort of diamond pattern happens often with signals, so it's important that it behaves correctly and efficiently.

Let's suppose that A changes. What happens now?

  • With pull, it's quite simple: it polls D, which then polls B and C, then B and C poll A, and everything works great.

  • With push, A will push the change to B, which then pushes the change to D... and then A pushes the change to C, which pushes the change to D.

    Notice that it pushed the change twice to D, which is inefficient (imagine that D is some complex map which does some heavy computation, now that computation will be done twice!).

So there are some tricky trade-offs between push and pull. Based upon my own experiences working on signals implementations (mostly in JavaScript), I believe the best option is a hybrid push + pull system. It uses push to notify that a change has occurred, but it uses pull to actually retrieve the changes. This is a very efficient system which handles every situation correctly.
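To make that concrete, here is a minimal, hypothetical sketch of the notify-then-pull idea (the Dirty, Source, and Map names are invented for illustration; this is not futures-signals' actual API). The push half only sets a "changed" flag, the values themselves are always pulled on demand, so the diamond above is recomputed exactly once per change:

use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Push half: a shared "something changed" flag. No value is sent with it.
#[derive(Clone, Default)]
struct Dirty(Rc<Cell<bool>>);

impl Dirty {
    fn notify(&self) { self.0.set(true); }
    fn take(&self) -> bool { self.0.replace(false) }
}

// Pull half: anything that can be asked for its current value.
trait Pull {
    type Item;
    fn get(&self) -> Self::Item;
}

// A root value: writing pushes a notification, reading pulls the value.
struct Source<T: Copy> {
    value: RefCell<T>,
    dirty: Dirty,
}

impl<T: Copy> Source<T> {
    fn new(value: T) -> Self {
        Source { value: RefCell::new(value), dirty: Dirty::default() }
    }
    fn set(&self, value: T) {
        *self.value.borrow_mut() = value;
        self.dirty.notify(); // push: "something changed", but no value attached
    }
}

impl<'a, T: Copy> Pull for &'a Source<T> {
    type Item = T;
    fn get(&self) -> T { *self.value.borrow() }
}

// A stack-allocated map combinator: no Arc, no Box, no Vec.
struct Map<S, F> { inner: S, f: F }

impl<S: Pull, B, F: Fn(S::Item) -> B> Pull for Map<S, F> {
    type Item = B;
    fn get(&self) -> B { (self.f)(self.inner.get()) }
}

fn main() {
    let a = Source::new(1);
    let dirty = a.dirty.clone();

    // The diamond: b and c both read a, and d is computed from b and c.
    let b = Map { inner: &a, f: |x: i32| x + 1 };
    let c = Map { inner: &a, f: |x: i32| x * 10 };

    a.set(2);
    a.set(3); // the intermediate value 2 is never observed

    // One notification => one pull => the diamond is evaluated exactly once.
    if dirty.take() {
        let d = b.get() + c.get();
        println!("d = {}", d); // prints 34: (3 + 1) + (3 * 10)
    }
}

A real implementation also needs wakers and change IDs to avoid redundant pulls, which is exactly the machinery the futures crate provides.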

The futures crate (which my signals crate is based on) uses a hybrid push + pull system. It's a wonderful, well designed, and very efficient push + pull system (I did not design it, I'm simply piggybacking off of the work already done in the futures crate).

I haven't done any benchmarks, but I suspect that a hybrid push + pull system will be faster than a pure push system, at least in Rust.

Also, it's important to note that it's possible to convert a push-based event into a push+pull Future/Stream/Signal, so even with a push+pull system you can efficiently react to external push-based events.

It's also important to note that all this polling is an internal implementation detail. When a user wishes to use Futures/Streams/Signals, they're not doing any manual polling, instead they're doing something like this:

let some_signal = mutable.signal()
    .map(|x| x + 10)
    .filter(|x| x < 10);

spawn_future(some_signal.for_each(|x| {
    println!("Signal value changed to: {}", x);
    ready(())
}));

In other words, they use handy combinators like map, filter, etc. and when they're done they use for_each + spawn to actually run the Future/Stream/Signal.

The for_each method internally polls, but you don't need to worry about any of that, it's an implementation detail.

If you want, I can go into more detail about the overall design of the futures crate, and also its specific implementation in Rust (and why it's so fast in Rust). It was quite eye-opening to me when I learned it.

The further step was that when a mutable signal variable is modified, it can map the data and/or call a function before pushing the data to any listening signals.

The code will be difficult to understand if you don't understand Future + Pin, but I do suggest taking a look at the code (and in particular the SignalExt methods).

My Signals fully support a wide range of methods, including map, inspect, dedupe, map_future, filter_map, flatten, switch, wait_for, and, not, or, and more.

And all of the various combinators and Signals efficiently support cancellation.

Not only that, but the methods are extremely fast: as an example, map is fully stack allocated (even the closure, input Signal, and output Signal are stack allocated), and it runs in constant time, with a very small constant.

How fast? This line is the entire implementation of map (yes, really). You might think that it's hiding all the complexity somewhere else, but it's not. If you fully expand the code, it becomes this:

match self.signal.poll_change(waker) {
    Poll::Ready(Some(value)) => Poll::Ready(Some((self.callback)(value))),
    a => a,
}

It's just normal pattern matching on an enum. This is about as simple and efficient as it gets.

The reason for multi-threading is so that while the function that maps the data is being called, it doesn't bottleneck the current thread. For example, a signal could handle async HTTP requests, and when the request comes in, that data could be read, mapped, or passed on to one or more handlers.

Okay, I haven't tested it, but I think that can be handled by my library. Mutable internally uses a RwLock, which allows for multiple threads to have simultaneous read-only access to the data.

So the only time when there will be lock contention is when you actually change the value of the Mutable (because that requires write access).
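As a rough, untested sketch of what that looks like (assuming futures-signals' Mutable, wrapped in an Arc here just to share it across plain threads):

use std::sync::Arc;
use std::thread;

use futures_signals::signal::Mutable;

fn main() {
    let state = Arc::new(Mutable::new(0));

    // Several reader threads: reads only take the read half of the RwLock,
    // so they don't block each other.
    let readers: Vec<_> = (0..4).map(|i| {
        let state = state.clone();
        thread::spawn(move || {
            println!("reader {} sees {}", i, state.get());
        })
    }).collect();

    // The only write lock is taken here, when the value actually changes.
    state.set(1);

    for reader in readers {
        reader.join().unwrap();
    }
}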

I've never heard of signals as lossy. Could you point me to a resource?

I don't have any resources I can link to, it's based on my own research. When I say "lossy", I mean something very specific:

When I say that "you should think of Signals as being like mutable variables", I mean it. Let's consider a mutable variable:

let mut foo = 0;

foo = 1;
foo = 2;

println!("{}", foo);

Obviously it will print 2, and the value 1 is completely ignored, as if it never existed. In other words, intermediate values don't matter, only the current value matters.

The same is true with my Signals library:

let foo = Mutable::new(0);

foo.set(1);
foo.set(2);

If you have some Signals which are listening to changes to foo, they will only receive the value 2; as far as they're concerned, the value 1 never even existed.

So that means if you make multiple changes to a Signal, only the last one counts; the intermediate changes are discarded and ignored. Or to put it another way, Signals only care about their current value, not their past values.

It's guaranteed that you will always receive the correct current value, but you cannot rely upon receiving every value (because intermediate values might be ignored).

That has a lot of benefits in terms of performance, and it also keeps the API clean and correct (some combinators like switch or flatten behave weirdly if you keep past values).

But it does mean that Signals are not suitable for events, because events often require you to process every event, so "losing" events is unacceptable.

On the other hand, Streams are an ordered sequence of values, and they internally use a buffer to hold values which haven't been processed yet. So if you use Streams you are guaranteed not to lose events, and the events are guaranteed to arrive in the correct order, which is important for events!

That means that although Signals are quite bad for events, Streams work great! It's easy to convert from an event-based system into a Stream-based system (and then you gain many useful Stream combinators).
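As a hedged sketch of that conversion (assuming futures 0.3: an unbounded channel's receiver is already a Stream, and the sender can be handed to whatever produces the events):

use futures::channel::mpsc;
use futures::executor::block_on;
use futures::stream::StreamExt;

fn main() {
    // The sender goes to the event source; the receiver is a Stream.
    let (sender, receiver) = mpsc::unbounded::<u32>();

    // Pretend these are external events: every one is buffered,
    // none are lost, and they stay in order.
    for event in 0..5 {
        sender.unbounded_send(event).unwrap();
    }
    drop(sender); // closing the sender ends the Stream

    block_on(receiver.for_each(|event| async move {
        println!("got event {}", event);
    }));
}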

Since my Signals are built on top of the futures crate, it has great support for converting to/from Futures and Streams, so you can mix-and-match Futures/Streams/Signals in the same program (I recommend doing this, because each of them is useful in different situations, there isn't a one-size-fits-all data structure!)

@zutils

zutils commented Oct 27, 2018

That is a very long post! You have a lot of good information in there. Have you thought about posting much of that to medium.com or other blog? It seems like you put a lot of work into signals, and you are very knowledgeable!

About push/pull model:
So it could be interesting for you to know that my current system has a push model, yet the pull model will not pull from a root signal. The pull model can only see the value of the signal itself. Using a separate library, a developer could then take that value and add it to a stream to prevent "loss".

About diamond pattern:
This has been planned for my signals crate for a long time. At minimum, it was postponed in favor of restoring some of the functionality that was lost when moving away from other signal systems.

About hybrid system:
I like your idea of a hybrid system. If you say that it's the best way to go, your word sounds good enough to me. The way that you explain the hybrid system is almost what is currently available in my signals crate anyway. The last listening signal (take a look at the "fail" variable name in my complex example) can possibly (up to the developer) have code to notify an event. After that notification, it can be read from. It sounds like the existing system is currently set up with what you refer to as this hybrid push/pull system. Can you tell me a little more about this?

About manual polling:
My system doesn't do manual polling unless it is added using the developer's own polling mechanism.

About the futures crate:
My experience with the futures crate was less than satisfactory. The futures crate wanted to panic on a simple call. When a library panics, and there is nothing the developer can do about it, that is the equivalent of crashing. Perhaps there is a way to stop the panic?

About your SignalExt methods:
I like it... it's simple :) Have you considered adding that example to your readme file when viewing crates.io?

About all your features:
map, inspect, dedupe, map_future, filter_map, flatten, switch, wait_for, and, not, or - These are all great features to have. If it makes sense to you, I like to leave these things up to the developer using my library. Yes it's a little bit more code, but to me it makes the library simple, easy to use, and your function calls can be very short - decreasing technical debt. I DO like your poll_change function! That's pretty clever and interesting :) I like it in its simplicity! Great job!

About cancellation:
Now THIS is something very useful! Cancellation was supposed to be in my crate. It was removed as a feature on the assumption that the developer could add this functionality to any long-running systems manually. It may be a good idea to come up with some examples of where cancellation would benefit on a per-use basis, and add those features to the signals crate at that time.

Multi-threading:
My signals can each be stored on a different thread and talk to each other. I'm interested in knowing the feasibility of your signals in this realm.

About lossy signals:
Ah! This makes perfect sense: if a signal notifies its listeners of a change, and the same signal notifies again before the listener can handle the change, there is data loss before the handling. Fear not! My system is not inherently susceptible to this! The developer's code CAN be susceptible if implemented with my signals incorrectly, and I may be able to create example code of how not to implement it, but then again, the developer can shoot themselves in the foot in ways unrelated to my crate, and should my code be responsible for that too?

Overall, you have done a thorough job with your crate. It would be really nice for everyone if you were to post example code in the readme file! It sounds like the major changes needed for my crate are moving away from Arc<Mutex<>>, handling the diamond pattern, and adding a lot more examples and marketing points to let everyone know what it can really be used for!

You use the words "there isn't a one-size-fits-all" to represent data structures. Perhaps it is meant to represent Rust modules as well. It seems like there is use for BOTH of our crates. That being said, I'm very interested in knowing what more you would hope to see from our collaboration?

@Pauan
Owner

Pauan commented Oct 29, 2018

Have you thought about posting much of that to medium.com or other blog?

I hadn't, but maybe that's a good idea.

So it could be interesting for you to know that my current system has a push model, yet the pull model will not pull from a root signal. The pull model can only see the value of the signal itself. Using a separate library, a developer could then take that value and add it to a stream to prevent "loss".

Could you explain more? I've taken a look at your code, and as far as I can see there's no pull at all, it's pure push.

If you say that it's the best way to go, your word sounds good enough to me.

I don't think you should take anybody's word for it (including mine). Ideas should be believed because they're true and right, not based on who said them.

The way that you explain the hybrid system is almost what is currently available in my signals crate anyway. The last listening signal (take a look at the "fail" variable name in my complex example) can possibly (up to the developer) have code to notify an event. After that notification, it can be read from.

Once again, could you explain more?

It sounds like the existing system is currently set up with what you refer to as this hybrid push/pull system. Can you tell me a little more about this?

From looking at your code, it seems like a pure push implementation, no pull at all. Definitely not a hybrid push+pull.

The way that a hybrid push+pull system works is that it uses push to send change notifications (but it doesn't send the new value!), and then, when the leaf signal receives the notification, it pulls the new value (which recursively pulls from the leaf to the root).

Here is a very simplified working example of a push+pull system. It doesn't handle cancellation, but it does solve the diamond problem (by using a global change ID). It should give you an idea of how push+pull differs from your model.

As you can see, Map and Map2 don't require Arc or Vec or Box at all: they're fully stack allocated. So it only needs to allocate for Mutable and the for_each function. Everything else is extremely fast.

The futures crate (and my signals crate) use an implementation which is similar to the above code (but with a lot of tweaks and improvements).

There is a blog post that goes into more detail about how Futures/Streams/Signals are implemented (and why they're implemented that way!). It's a couple years old, but it's still mostly correct.

There's also an even older blog post that gives a high-level overview of why Futures/Streams are important.

The futures crate wanted to panic on a simple call. When a library panics, and there is nothing the developer can do about it, that is the equivalent of crashing. Perhaps there is a way to stop the panic?

Could you explain more? As far as I know, the only time it should panic is if your code is buggy. Errors are handled with Result, just like normal Rust functions.

Have you considered adding that example to your readme file when viewing crates.io?

There is a tutorial, but it's currently not published to crates.io, because of a bug in the Rust Doc generator.

If it makes sense to you, I like to leave these things up to the developer using my library. Yes it's a little bit more code, but to me it makes the library simple, easy to use, and your function calls can be very short - decreasing technical debt.

It's your library, you can do what you want, but I strongly disagree. Your complex example is written like this:

let root = Signal::new_arc_mutex( |x: &u32| Ok((*x).to_string()) );
let peek = Signal::new_arc_mutex( |x: &String| { println!("Peek: {}", x); Ok(()) } );
let to_i32 = Signal::new_arc_mutex( |x: &String| Ok(x.parse::<i32>()?) );
let inc = Signal::new_arc_mutex( |x: &i32| Ok(*x+1) );
let fail = Signal::new_arc_mutex( |x: &i32| { assert_ne!(*x, 8); Ok(()) } );
 
root.lock().unwrap().register_listener(&peek);
root.lock().unwrap().register_listener(&to_i32);
to_i32.lock().unwrap().register_listener(&inc);
inc.lock().unwrap().register_listener(&fail);
 
root.lock().unwrap().emit(7);

The code is almost entirely sequential (the data flows from root -> to_i32 -> inc -> fail), but that is not obvious at all: I had to spend several minutes trying to figure out how the data was flowing through the signals.

Here is how that example would look with my Signals:

let root = Mutable::new(6);

let future = root.signal()
    .map(|x| x.to_string())
    .inspect(|x| println!("Peek: {}", x))
    .map(|x| x.parse::<i32>())
    .map(|x| x.map(|x| x + 1))
    .for_each(|x| {
        assert_ne!(x.unwrap(), 8);
        ready(())
    });

spawn_local(future);

root.set_neq(7);

The code is much shorter, but more importantly, it's clearer: I can see immediately what the data flow is:

  1. It flows from root into .map(|x| x.to_string())

  2. And then into .inspect(|x| println!("Peek: {}", x))

  3. And then into .map(|x| x.parse::<i32>())

  4. And then into .map(|x| x.map(|x| x + 1))

  5. And lastly it terminates in .for_each(|x| { ... })

The data flow is sequential, and the code itself is also sequential, so it's easy to understand.

And when you do have complex data flow, it stands out, because it looks different from the simple sequential flow. With your system, both the sequential and complex data flow look the same.

It may be a good idea to come up with some examples of where cancellation would benefit on a per-use basis, and add those features to the signals crate at that time.

Cancellation is really fundamental, I don't think it's something that should be "added by developers on a per-use basis". That's been tried with other languages, and the end result is usually awful: because cancellation requires some extra code, developers either forget to add it in, or they are too lazy to add it in.

So you end up with a lot of libraries which don't handle cancellation correctly, which then leads to applications which don't handle cancellation correctly. The end result is slow, bloated programs which have subtle errors and bugs.

Cancellation is very important for dynamically changing signal graphs (such as flatten or switch), because it needs to unregister from the old signals and then re-register with the new signals.

In addition, if you don't have automatic cancellation, that usually leads to memory leaks (because child signals keep listening to the parent signals even though they shouldn't).
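In the futures model this mostly falls out of ownership: dropping or aborting the Future that drives a chain tears the whole chain down. Here is a hedged sketch using futures 0.3's abortable (a stand-in pipeline, not futures-signals specifically):

use futures::executor::block_on;
use futures::future::abortable;
use futures::stream::{self, StreamExt};

fn main() {
    // Stand-in for a long-running pipeline that never finishes on its own.
    let work = stream::pending::<u32>().for_each(|n| async move {
        println!("processing {}", n);
    });

    // Wrapping it makes it cancellable from the outside.
    let (work, handle) = abortable(work);

    // Cancelling stops the polling and drops the whole chain,
    // which unregisters every combinator in it.
    handle.abort();

    assert!(block_on(work).is_err()); // Err(Aborted)
}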

My signals can each be stored on a different thread and talk to each other. I'm interested in knowing the feasibility of your signals in this realm.

Yes, as I said, you can send signals between threads, and they can broadcast to different threads. Everything works across threads.

Perhaps it is meant to represent Rust modules as well. It seems like there is use for BOTH of our crates.

There probably is. I know push-based signals are popular, so I'm sure some users will find your crate useful.

That being said, I'm very interested in knowing what more you would hope to see from our collaboration?

Since our crates have fundamentally different priorities, goals, and implementations, I don't see much opportunity for collaboration, unfortunately.

Of course you can contribute to my crate if you like, and I'll gladly share my knowledge about signals, but that's up to you.

@zutils

zutils commented Oct 29, 2018

I guess it was worth a shot. Thanks for your input over these past few days. It was very helpful. I'm sorry that we couldn't see through the same lens.

@kellytk

kellytk commented Nov 27, 2019

Okay, I haven't tested it, but I think that can be handled by my library. Mutable internally uses a RwLock, which allows for multiple threads to have simultaneous read-only access to the data.

@Pauan Is it possible that a program using Signals could add Tokio for an easy performance improvement by transparently spreading work over multiple threads/cores?

@Pauan
Owner

Pauan commented Nov 27, 2019

@kellytk Sure, Signals was designed so it can be used in any Futures executor. And it was also designed so all the types are Sync and Send, so you can actually use it in a multi-threaded environment.

I just haven't tested that yet, because my focus has been on Wasm (which doesn't have multiple threads).
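As a rough, untested sketch of what that could look like on Tokio (assuming the multi-threaded Tokio runtime and futures-signals' SignalExt::for_each; the exact output depends on timing, since signals are lossy):

use futures_signals::signal::{Mutable, SignalExt};

#[tokio::main]
async fn main() {
    let state = Mutable::new(0);

    // The for_each future is Send, so Tokio may run it on any worker thread.
    let listener = tokio::spawn(state.signal().for_each(|value| async move {
        println!("value changed to {}", value);
    }));

    state.set(1);
    state.set(2);

    // Dropping the Mutable ends the signal, which lets the task finish.
    drop(state);
    listener.await.unwrap();
}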
