
Piping data to functions #2049

Open
andrewmarx opened this issue Jun 27, 2017 · 56 comments
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.

Comments

@andrewmarx

Something I've seen elsewhere is the concept of piping data to functions, which can drastically improve readability in cases where someone would otherwise either use intermediate variables or nest function calls.

Pseudocode for a contrived example:

x = {1,-7,-5,3}

// Without pipes, intermediate variables.

a = sum(x)
b = abs(a)
c = sqrt(b)
result = round(c, 2)


// Without pipes, nesting functions.

result = round(sqrt(abs(sum(x))), 2)


// With pipes, the result on the left of the pipe is passed as the first parameter
// of the function on the right, allowing the code to be read left to right.

result = x %>% sum() %>% abs() %>% sqrt() %>% round(2)


// It's particularly nice for splitting code across lines

result = 
  x %>% 
  sum() %>% 
  abs() %>% 
  sqrt() %>% 
  round(2)

The %>% operator is just an example (from R); it could be something else.

I'm just learning rust, and know nothing about language development, so the complexities involved might make this impractical. At the same time, it would also be really interesting to see how something like this could be incorporated to work with tuples and multiple return values.

@Luthaf

Luthaf commented Jun 27, 2017

If your functions take &self as the first parameter, you can achieve a similar result with dot chaining:

let result = x.sum()
              .abs()
              .sqrt()
              .round(2);

@burdges

burdges commented Jun 27, 2017

I think you're stuck with the reverse Polish dot chaining on &self or self for Rust.

As an aside, if you consider a language like Haskell, Idris, OCaml, etc., where juxtaposition is high-precedence curried function application, then you have both . acting as function composition and $ acting as low-precedence curried function application. I'd consider this the most elegant and expressive way of doing this, and I hope someone eventually develops such a language built on borrowing instead of garbage collection. I do think Rust's decision to resemble C and object-oriented languages makes Rust more approachable for most programmers. It also seemingly made it easier to focus on zero-cost abstraction, for example by simplifying type inference. To be precise, Idris now strikes a nice enough compromise on overloading to suggest that Rust could have looked like a functional language, but those developments happened concurrently with Rust's development, and Idris' type inference used to be incredibly weak, so dealing with this could've been a distraction for Rust.

@oli-obk
Contributor

oli-obk commented Jun 28, 2017

It's actually pretty simple to implement this with a wrapper type: https://is.gd/42O3Ns
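For readers who don't follow the link, a minimal sketch of the wrapper-type approach might look like the following (the names Piped, pipe, and into_inner are invented for illustration; the playground code may differ):

```rust
// Hypothetical newtype wrapper; `Piped` and `pipe` are illustrative
// names, not necessarily what the linked playground uses.
struct Piped<T>(T);

impl<T> Piped<T> {
    // Apply `f` to the wrapped value and re-wrap the result,
    // so calls chain left to right like a pipeline.
    fn pipe<R>(self, f: impl FnOnce(T) -> R) -> Piped<R> {
        Piped(f(self.0))
    }

    // Take the final value back out of the wrapper.
    fn into_inner(self) -> T {
        self.0
    }
}

fn main() {
    let sum = |v: Vec<i32>| v.into_iter().sum::<i32>();
    let result = Piped(vec![1, -7, -5, 3])
        .pipe(sum)
        .pipe(i32::abs)
        .pipe(|x| f64::from(x).sqrt())
        .into_inner();
    println!("{}", result); // sqrt(|1 - 7 - 5 + 3|) = sqrt(8)
}
```

The cost is the explicit wrapping and unwrapping at both ends of the chain.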

But I agree that it's nice that Rust code is readable for people who've never seen a functional language and I wouldn't want to break this for purity's sake.

@nixpulvis

nixpulvis commented Jun 30, 2017

While I'm on a little Rust RFC binge I'll just swing by here and say that I really hope to never see syntax like x %>% y in Rust. One of the things (along with many, many others) that first drew me to Rust was that it's actually pretty syntactically light. Nothing like C99 of course, but much easier to read than Haskell or the like.

@mark-i-m
Member

mark-i-m commented Jul 2, 2017

I think this is an attempt at building convenient function composition. And I would totally support that!

@alilleybrinker

So, translating the pseudocode above into something more Rust-y:

// ... assume various functions are implemented here.

fn main() {
    let x = vec![1, -7, -5, 3];
    let a = sum(x);
    let b = abs(a);
    let c = sqrt(b);
    let result = round(c, 2);
    let result = round(sqrt(abs(sum(x))), 2);
    let result = x %>% sum %>% abs %>% sqrt %>% |x| round(x, 2);
}

I'm not completely opposed to adding an infix function composition operator, but I don't think %>% is the right name for it. Pulling the name from Haskell, you could have either .> (which in Haskell is a convenient name for flip (.)), or you could use >>> from Control.Arrow, which is more general:

// ... assume various functions are implemented here.

fn main() {
    let x = vec![1, -7, -5, 3];
    let a = sum(x);
    let b = abs(a);
    let c = sqrt(b);
    let result = round(c, 2);
    let result = round(sqrt(abs(sum(x))), 2);
    let result = x >>> sum >>> abs >>> sqrt >>> |x| round(x, 2);
}

@mark-i-m
Member

mark-i-m commented Jul 2, 2017 via email

@durka
Contributor

durka commented Jul 2, 2017

See also the pipeline crate which implements this in a macro.
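I haven't verified the crate's exact syntax, but the macro approach can be sketched in a few lines; the macro name and the => separator below are invented for illustration:

```rust
// Illustrative pipeline macro; the real `pipeline` crate's syntax may differ.
// `pipe!(v => f => g)` expands recursively to `g(f(v))`.
macro_rules! pipe {
    ($value:expr) => { $value };
    ($value:expr => $func:expr $(=> $rest:expr)*) => {
        pipe!(($func)($value) $(=> $rest)*)
    };
}

fn sum(v: Vec<i32>) -> i32 { v.into_iter().sum() }
fn abs(x: i32) -> i32 { x.abs() }

fn main() {
    let result = pipe!(vec![1, -7, -5, 3] => sum => abs);
    println!("{}", result); // prints 8
}
```

A macro works on stable Rust today, at the price of a non-operator surface syntax.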

@alilleybrinker

alilleybrinker commented Jul 2, 2017

There is something worth pointing out here, which is the distinction between (.) and (<<<) (and, equivalently, between (.>) and (>>>)). (.) and (.>) are function composition operators, one right-to-left ((.)) and one left-to-right ((.>)). (<<<) and (>>>) are more generic composition operators. In Haskell, they are defined on Category types, where a Category is defined as having an identity "morphism" (id :: Category cat => cat a a, which is like a generic version of the identity function, id :: a -> a), and having morphism composition, (.) :: Category cat => cat b c -> cat a b -> cat a c (yes, this annoyingly overloads the . operator. Like the last operator, this is a generic version of (.) :: (b -> c) -> (a -> b) -> (a -> c)).

(<<<) is equivalent to the Category version of (.), and is defined as (<<<) :: Category cat => cat b c -> cat a b -> cat a c. (>>>) is the same function with the first two arguments flipped (making it left-to-right rather than right-to-left). Its type is (>>>) :: Category cat => cat a b -> cat b c -> cat a c.

I mention this because if we're going to do infix composition, there's the question of whether we'd want to keep it to functions of one argument (like with (.)), or do we want to (can we? not right now, I believe) make it more generic, and akin to the Category typeclass in Haskell?

| Name | Direction | Defined on |
|------|-----------|------------|
| . | right-to-left | functions of one argument (a -> b) |
| .> | left-to-right | functions of one argument (a -> b) |
| <<< | right-to-left | all categories (with functions described above) |
| >>> | left-to-right | all categories (with functions described above) |

@alilleybrinker

alilleybrinker commented Jul 2, 2017

@mark-i-m, that particular overloading seems like it would be confusing. I certainly wouldn't expect bitwise-or on functions to work, much less to provide function composition.

It reminds me a bit of C++ overloading >> and << for I/O, and not just left/right bit shifting.

@nixpulvis

Call me crazy, but one day we won't be limited to shitty ASCII keyboards. Could we start using Unicode like ∘ for this?

Only like 75% joking.

@nixpulvis

In reality foo.bar().baz() should be preferred imo.

@alilleybrinker

@nixpulvis, I mean, APL and APL family languages have all gone down this rabbit hole. In the Haskell world, the proliferation of operators (imo) really hurts readability. I think there's a core set that can be good, but I am generally against going completely crazy with introducing new operators.

Haskell's lens library drives me nuts sometimes, because a lot of the docs use the custom operators rather than the equivalent prefix functions.

@mark-i-m
Member

mark-i-m commented Jul 2, 2017

Yeah, I guess chaining method calls is the primary alternative... Does anybody know if they're equally expressive?

@nixpulvis

They should be, but might require creating a trait.

@burdges

burdges commented Jul 2, 2017

You cannot beat the curried languages for expressiveness: If you consider the original example round(sqrt(abs(sum(x))), 2) then you would write this in Haskell as (`round` 2) . sqrt . abs . sum $ x, which reads "compose all these functions and apply them to the value x". No maze of parentheses. No bizarre operators like the proposed backwards composition %>% or strange round(2). Just ordinary function composition and a low precedence function application. And the (`round` 2) could be written \x -> round x 2 or flip round 2 too.

As I said before, I think people interested in these questions should be asking: how do we bring borrowing, mutability, the Rust ABI, etc., to a curried language? Could it be as thin a layer over Rust as, say, Dogelang is over Python? It's partially that function composition becomes less useful without currying and combinators.
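For comparison, plain function composition is expressible in stable Rust with a small helper that returns a closure, though without currying it stays noisier than the Haskell version (compose here is a made-up helper, not part of std):

```rust
// Hypothetical helper: compose(f, g) builds |x| g(f(x)),
// i.e. left-to-right composition of f then g.
fn compose<A, B, C>(f: impl Fn(A) -> B, g: impl Fn(B) -> C) -> impl Fn(A) -> C {
    move |x| g(f(x))
}

fn main() {
    let sum = |v: Vec<f64>| v.into_iter().sum::<f64>();
    // sum, then abs, then sqrt, read left to right.
    let pipeline = compose(compose(sum, f64::abs), f64::sqrt);
    let result = pipeline(vec![1.0, -7.0, -5.0, 3.0]);
    println!("{}", result); // sqrt(|-8|) ≈ 2.828
}
```

The nested compose(compose(..)) calls are exactly the noise an infix operator would remove.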

@nixpulvis

Upping the level of expressivity isn't always a good thing. I think Haskell itself should be a cautionary tale about what happens when you give something too many expressive forms. Simplicity should be a goal of any practical language; readability then comes naturally. Rust already struggles here due to its complex type system, but I personally believe this is warranted given the static memory guarantees it provides.

@OvermindDL1

Don't forget OCaml; it is as powerful as Haskell but goes through the module system with sane names instead of the operator hell Haskell has. It comes with very few infix operators, one of which is indeed the pipe operator (|>) along with the reverse pipe operator (@@; they would prefer <|, but the associativity does not work out). It does not come with . or <<< or >>> or anything else of the sort, so things remain readable. In my opinion Rust has taken far too much from Haskell as it is, which contributes to some of what I consider the syntax woes in Rust and its compile-speed hit (compare to OCaml, which remains readable (and even more so once implicit module support lands) and compiles to roughly C++ runtime speed, with compile times often measured in under a second).

@mark-i-m
Member

mark-i-m commented Jul 4, 2017

@burdges

You cannot beat the curried languages for expressiveness

Are you making a theoretical statement or an experiential one? Personally, if chaining can be done via some sort of trait impl, I think that would be the most rustic, but it really depends on how much we would lose compared to composition...

@egilburg

@nixpulvis I think that's the direction Julia takes; here are some syntax literals.

@TedDriggs

TedDriggs commented Jul 25, 2017

The only case where I've found myself looking for this was in Iterator, Result, or Option contexts which already encourage a self chain. Having to either break the chain or have some of the items out of order due to a specific function not being a method was painful.

What if we added a trait like the one @oli-obk provided and then implemented that for the three types that encourage the chaining style? That would keep the syntax lightweight but also let those cases use their preferred style. Having it be a trait would also let people implement it for their own types that use the same style.
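Such a trait can even be given a blanket impl, so any value can feed itself into a free function without breaking a method chain; a minimal sketch (the Pipe trait and pipe method names are invented here; restricting the impl to Iterator, Result, and Option as suggested above would just mean dropping the blanket impl):

```rust
// Illustrative trait; a blanket impl gives every sized type a `pipe`
// method that feeds `self` into an arbitrary free function.
trait Pipe: Sized {
    fn pipe<R>(self, f: impl FnOnce(Self) -> R) -> R {
        f(self)
    }
}
impl<T> Pipe for T {}

// A free function that is not a method of Vec.
fn double_all(v: Vec<i32>) -> Vec<i32> {
    v.into_iter().map(|x| x * 2).collect()
}

fn main() {
    // The free function slots into the middle of the chain
    // without forcing an intermediate variable.
    let result = vec![1, -7, -5, 3]
        .pipe(double_all)
        .into_iter()
        .sum::<i32>();
    println!("{}", result); // 2 - 14 - 10 + 6 = -16
}
```

Users would only need to bring the trait into scope to use the method on their own types.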

@Centril
Contributor

Centril commented Sep 11, 2017

@OvermindDL1 Maybe it's just me, but I find Haskell way more readable than Rust or any other language syntactically in the C language family. I find that Haskell's abundance of white space really helps me read the code as opposed to code being littered with terminal symbols < > & { } ( ) everywhere.

A significant advantage of compositional programming which you get with pointfree style is that you get away with a complete lack of imagination for variable names since you have none to name.

@burdges is spot on - function composition is not very ergonomic when you can't box anything on the heap anytime you like and when you have to think about lifetimes.

@mark-i-m
Member

I find that Haskell's abundance of white space really helps me read the code as opposed to code being littered with terminal symbols < > & { } ( ) everywhere

I think this is mostly a matter of taste, though... Personally, I find well-formatted rust rather pleasing to the eye...

when you can't box anything on the heap anytime you like and when you have to think about lifetimes

I'm not sure I follow... as I understand it, a composition feature would be sugar for something you can already write, so how would there be any difference?

@Centril
Contributor

Centril commented Sep 12, 2017

@mark-i-m I guess it's subjective to some degree - and there are worse offenders ;)

Sure, it would be sugar, but getting the same expressive power as Haskell might be hard, as you might have to distinguish between different types of functions and closures. Some of the expressiveness comes from partial application and auto-currying, which might need to transparently box the arguments - which you probably wouldn't like in Rust. All I'm saying is it won't be a trivial task to design this well in Rust.

@Centril Centril added the T-lang Relevant to the language team, which will review and decide on the RFC. label Dec 6, 2017
@elcritch

Occasional Rust user here; I landed here after googling for "rust pipe operator". :-) Given Rust's general railway-oriented programming style, I'm a little surprised it's not in the language already! A pipe operator provides a useful visual cue that some data is being transformed and enables non-self-methods to be used. Programming in Elixir, Julia, and OCaml/F# has given me a taste for the |> syntax, as they all use it. It's simple and straightforward, without the complexity of a full Haskell-style set of operators. There are likely a few functional programmers using Rust for whom a functional map/fold/etc. programming style with a pipe operator would be appreciated. It should be pretty easy for anyone with a Unix shell background to grok, given the syntactic similarity to Unix | pipes.

@sergeysova

Also, JavaScript has a |> pipeline operator proposal 😆

@sergeysova

sergeysova commented Mar 15, 2018

fn main() {
    let result = vec![1, -7, -5, 3]
        |> sum
        |> abs
        |> sqrt
        |> |x| round(x, 2);
      
    let result = round(sqrt(abs(sum(vec![1, -7, -5, 3]))), 2);
}

@scottmcm
Member

scottmcm commented Mar 16, 2018

Rust has a pipe operator, ()., that even works with the functions in the example:

    let result = vec![1.0, -7.0, -5.0, 3.0].into_iter
        (). sum::<f32>
        (). abs
        (). sqrt
        (). round
        ();
    println!("{}", result);

Try it on play: https://play.rust-lang.org/?gist=e00f6a9e9738ee655b23c62866d78653&version=stable

More seriously, should it just be easier to make extension methods?
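Extension methods are already expressible with a user-defined trait; for example, the original pipeline compiles today if f64 is given a two-argument rounding method (the trait and method names below are invented for illustration):

```rust
// Illustrative extension trait adding a `round_to` method to f64.
trait RoundTo {
    fn round_to(self, places: u32) -> f64;
}

impl RoundTo for f64 {
    // Round to `places` decimal places by scaling, rounding, and
    // scaling back.
    fn round_to(self, places: u32) -> f64 {
        let factor = 10f64.powi(places as i32);
        (self * factor).round() / factor
    }
}

fn main() {
    let result = vec![1.0, -7.0, -5.0, 3.0]
        .into_iter()
        .sum::<f64>()
        .abs()
        .sqrt()
        .round_to(2);
    println!("{}", result); // prints 2.83
}
```

The friction is that each free function needs a trait declaration and impl before it can join a chain, which is the boilerplate "easier extension methods" would remove.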

@egilburg

@scottmcm That works only on methods of the receiver, not on an arbitrary function.

@mark-i-m
Member

I don't know how well this would work, but we could have a.b(c) desugar to b(a, c) if the type of a does not have a b method. This would be kind of like an extension of UFCS...
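Today's UFCS only goes the other direction: any method can already be written as a free-function call, while the proposed sugar (a free function callable with method syntax) does not exist. A quick illustration:

```rust
fn main() {
    let s = String::from("pipeline");
    // Method syntax and fully qualified call syntax are
    // interchangeable for existing methods...
    assert_eq!(s.len(), str::len(&s));
    // ...but the reverse, e.g. `s.free_fn()` falling back to
    // `free_fn(s)` when no such method exists, is the part that
    // would need the proposed desugaring. (`free_fn` is a
    // hypothetical name, not a real function.)
    println!("ok");
}
```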

@OvermindDL1

OvermindDL1 commented Mar 19, 2018

I don't know how well this would work, but we could have a.b(c) desugar to b(a, c) if the type of a does not have a b method. This would be kind of like an extension of UFCS...

Sounds similar to C++'s Unified Call Syntax proposal.

However, that doesn't really fix pipelining entirely, as ambiguity can pop up over which function to use with such a syntax.

For note, a big reason for pipelining is to prevent stupid things like Haskell'y operator explosion. Instead of doing things like blah >>> bloop you can instead 'name' the calls via blah |> andThen(bloop) or whatever. It allows you to make DSEL things with a unified syntax, easily and across types that you may or may not control. Traditionally, functional languages pipe into the last position, but many newer ones pipe into the front position so it visually 'flows' properly, i.e. blah |> andThen(bloop) === andThen(blah, bloop), so both forms seem reasonable. However, the default position is not always the correct one for the subject, so some kind of 'put here' form is useful, perhaps like blah |> andThen(_, bloop) being equivalent to the prior when used in a non-default position (bonus points for supporting doing stuff to the _ bit as well, whatever the _ syntax ends up actually being).

@durka
Contributor

durka commented Mar 19, 2018 via email

@Centril
Contributor

Centril commented Mar 19, 2018

@OvermindDL1

For note, a big reason for pipelining is to prevent stupid things like Haskell'y operator explosion.

It is not true that you have an explosion of operators in Haskell -- there are only a few packages that do this (like Lens which is an immensely useful package btw..) and also calling things stupid is not helpful.

Instead of doing things like blah >>> bloop you can instead 'name' the calls via blah |> andThen(bloop) or whatever.

The former as function composition ("pipelining") seems infinitely more ergonomic -- you learn the operator once and then you know what it means. You can also use the reverse, which is just bloop . blah.

@durka Seems to me that b(a, c) would be tried when inherent methods and trait methods and deref conversion doesn't work anymore. The "figure out what code does" objection seems valid and is for sure a trade-off to consider.

Personally, I'd like some general and ergonomic mechanism for function composition tho so I can write pointfree code (I hate naming function parameters and names for lambda binders) so some currying and composition would be nice.

@OvermindDL1

OvermindDL1 commented Mar 19, 2018

and also calling things stupid is not helpful.

Sorry, I just have a bit of a thing against it from having to maintain some other people's Haskell code... ^.^;

The former as function composition ("pipelining") seems infinitely more ergonomic -- you learn the operator once and then you know what it means. You can also use the reverse, which is just bloop . blah.

More ergonomic perhaps, but that also means that the user has to learn those operators, which is still more overhead compared to nice, descriptive names. I do some teaching of programming in real life, so I may be biased toward descriptive function names over non-descriptive operators, as it really helps the teaching process.

@durka Seems to me that b(a, c) would be tried when inherent methods and trait methods and deref conversion doesn't work anymore.

What @durka said is just one of those ambiguities that can pop up that I mentioned, in the C++ world there are even more issues (yay templates...).

Personally, I'd like some general and ergonomic mechanism for function composition tho so I can write pointfree code (I hate naming function parameters and names for lambda binders) so some currying and composition would be nice.

Sounds like you'd prefer Haskell then, I significantly prefer readable code (and no, horrors like Ruby are not readable, though I tend to find most OCaml code readable, unlike Haskell code - i.e. newbies in my classes can actually read OCaml code and understand most of it without needing to learn the language, unlike Haskell, which at times looks Perl-y in its line-noisiness in the 'common standard' code, which yes includes >>>/>>=/./etc...).

Also as for this part of:

I hate naming function parameters and names for lambda binders

That implies that you don't maintain the code in the long term. Proper names significantly (like, amazingly significantly) improve maintainability.

@Centril
Contributor

Centril commented Mar 19, 2018

I do some teaching of programming in real life, so I may be biased toward descriptive function names over non-descriptive operators, as it really helps the teaching process.

Operators should be introduced with care and not willy-nilly, but most experienced Haskell programmers instantly recognize the meaning of ., $, <$>, <*>, *>, <*, >>=, >=>, <>, ++. Just as it is easy to introduce too many operators, it is also very easy to give functions bad names. There is no substitute for programmers making good decisions here.

My time as a TA showed me that students had a hard time understanding Java and that descriptive function names were the least of their problems.

Sounds like you'd prefer Haskell then, I significantly prefer readable code

Again, this is your judgement that Haskell has unreadable code - I don't agree with this at all. The noise-to-signal ratio in Haskell is pretty minimal. Also, I'm perfectly capable of liking more than one language. My preference is towards correctness, polymorphism, and strong type systems, so naturally I gravitate towards languages like Rust, Haskell, and Idris. Syntactically I prefer Haskell, yes - but Rust's syntax is OK, and it matters less to me than the very nice semantic properties of Rust (zero-cost abstractions, low latency, performance, no runtime, ..).

I tend to find most OCaml code readable unlike Haskell code

I take just the opposite view; OCaml is utterly unreadable for me... but I believe this is just about what you are used to... People who are used to C-style syntax will think that Python and Haskell are unreadable. I don't think you can draw any general conclusions about this other than conclusions about the person making these claims.

With this said; hopefully we can get back on topic (and sorry for being a bit hypocritical..) ^,-

@burdges

burdges commented Mar 19, 2018

As an aside, I've never used lens in Haskell, but my impression is lens mostly makes up for the lack of mutability, so rust does not obviously need anything like that. I'll therefore conjecture that a curried language built on Rust, or even an ML, does not benefit nearly as much from lenses as Haskell. I think similar claims can be made about many other operators in Haskell, like monads being replaced with a more detailed effects system in newer languages. I therefore doubt Rust would even suffer the same flavor of operator explosion as Haskell, although entirely new flavors might be possible or worse.

As I said upthread #2049 (comment) there is probably too much strangeness budget cost in making a bunch of new operators that do basically what . and $ do in Haskell. It's better to encourage the functional language researchers to build their research projects on top of Rust, MIR, etc., doing mutability, borrowing, etc. similarly to Rust. We've so much more to gain from that direction, like hybrids between traits and 1ML-style modules.

If otoh you want to allow unicode then maybe we can have an actual function composition operator instead of ., but then nobody can complain when 💩 gets used for certain module or helper trait names. ;)

@OvermindDL1

but most experienced Haskell programmers instantly recognize the meaning

As do I, but that does not make it easier than well named functions as you still have to remember precedence and all.

My time as a TA showed me that students had a hard time understanding Java and that descriptive function names were the least of their problems.

Actually, back when we still taught Java here (thankfully not anymore!), I found the difficult part was not the naming but rather the intense and overwhelming verbosity to get, well, anything done, and that was primarily a (standard) library issue, not necessarily an issue with the language (though the overwhelming OOP-ness of the language design definitely made a lot of things more verbose than they would be in other, better languages).

I take just the opposite view; OCaml is utterly unreadable for me... but I believe this is just about what you are used to... People who are used to C-style syntax will think that Python and Haskell are unreadable. I don't think you can draw any general conclusions about this other than conclusions about the person making these claims.

I find the general OCaml ecosystem (which 'tends' to eschew operator explosion, generally just using |> and && in addition to the standard mathematical operators, and which will be simplified further once implicit module support is added) significantly more readable. Though yes, I did grow up and learn around the C/C++/OCaml/Erlang/Pascal world, a lot of what I say is experience from teaching new programmers. I do not like watching them fight the language ecosystem just because other programmers like short code (which yes even the haskell code tends to be far less maintainable than the equivalent OCaml I've found from a student perspective).

With this said; hopefully we can get back on topic (and sorry for being a bit hypocritical..) ^,-

Heh, I quite enjoy it, plus it is still kind of on topic here: the pipeline operator is easily one of the best operators to come out of any language (which is why it is being added to so many languages), used far, far more than even the mathematical operators, as it makes code easier to read, use, and maintain. :-)

I've never used lens in Haskell, but my impression is lens mostly makes up for the lack of mutability, so rust does not obviously need anything like that.

Eh, lenses have other features, in that it is easier to pass around 'what' to do instead of just how, but with trivial lambdas that's not as big of an issue either. I like lenses, but honestly I never really use them...

like monads being replaced with a more detailed effects system in newer languages.

Oh if you want to see a good effect system, look at what's being designed for OCaml right now (OCaml is fully functional like Haskell, though with a bit more powerful of a typing system and significantly faster compilation; really, it is the fastest optimizing compiler out there and compiles to within an order of magnitude of C speed on fairly generic code). It is amazing looking (and for note, OCaml doesn't really use monads all over; it supports typed mutable cells like Rust does, though with GC rather than an ownership model if you use them).

It's better to encourage the functional language researchers to build their research projects on top of Rust, MIR, etc., doing mutability, borrowing, etc. similarly to Rust.

There are quite a few massive syntax warts I find in Rust that things like OCaml just tend to do better (though again, I admit I'm entirely biased); they seem like things Rust took from Haskell instead of from something like OCaml (OCaml's HPTs are so much more powerful, especially with implicit modules, and faster to compile than Haskell's HKTs and typeclasses, which Rust sadly seems to be trying to mimic...). But that's all another discussion, and not much can be done about it now that Rust has some of those syntax warts pretty permanently baked in.

1ML-style modules

Not just ML-style modules; look at OCaml modules specifically: they are full higher polymorphic types, which are significantly more powerful and better able to represent concepts than standard ML modules or most things in Haskell, with its ton of nonstandard type extensions in GHC's case. :-)

@Centril
Contributor

Centril commented Mar 19, 2018

As an aside, I've never used lens in Haskell, but my impression is lens mostly makes up for the lack of mutability, so rust does not obviously need anything like that.

It can be used that way, and that mostly only describes https://hackage.haskell.org/package/lens-4.15.1/docs/Control-Lens-Setter.html#g:5 , but you can do so much more with Lens. The biplate (Data.Data.Lens, Control.Lens.Plated) allows you to get typesafe SYB programming to walk deep inside ASTs, apply some monadic computation bottom up and then combine all that into one monadic computation.
An example from my bachelors thesis:

normEvery :: (Monad m, Data s, Typeable a, Data a)
          => (a -> m a) -> s -> m s
normEvery = transformMOnOf biplate uniplate

https://github.com/DATX02-17-26/DATX02-17-26/blob/dev/libsrc/Norm/NormM.hs#L366

-- | Flatten blocks
execFlattenBlock :: NormCUA
execFlattenBlock = normEvery $ \case
  Block stmts -> fmap Block $ flip traverseJ stmts $ \case
    SBlock (Block ss) -> change ss
    s                 -> unique [s]
  x -> unique x

https://github.com/DATX02-17-26/DATX02-17-26/blob/dev/libsrc/Norm/ElimRedundant.hs

Prisms, on the other hand, allow you to take any sum type and extract the contents of any variant into Maybe <a tuple of the variant's fields>...

I think similar claims can be made about many other operators in Haskell, like monads being replaced with a more detailed effects system in newer languages.

Monad transformers are not bad -- as I understand it, they are more flexible than Idris' algebraic effects. They also allow you to compose things in different ways; the fact that things do not always commute with monad transformers is not always a bad thing. I think Idris has an effect system thanks to dependent types tho, but Idris also has monads.

As I said upthread #2049 (comment) there is probably too much strangeness budget cost in making a bunch of new operators that do basically what . and $ do in Haskell.

Hopefully we would not need a bunch; just . (with some different token) and some form of currying as syntactic sugar would go a long way ;)

It's better to encourage the functional language researchers to build their research projects on top of Rust, MIR, etc., doing mutability, borrowing, etc. similarly to Rust. We've so much more to gain from that direction, like hybrids between traits and 1ML-style modules.

FP research on top of MIR would be great; Building Haskell's runtime in Rust is also something I'd like to do one day.

@OvermindDL1

As do I, but that does not make it easier than well named functions as you still have to remember precedence and all.

I never had this problem... I use operators in such a way (and they have a fixity in such a way) that I run into precedence issues very seldom.

Actually, back when we still taught Java here (thankfully not anymore!), I found the difficult part was not the naming but rather the intense and overwhelming verbosity to get, well, anything done, and that was primarily a (standard) library issue, not necessarily an issue with the language (though the overwhelming OOP-ness of the language design definitely made a lot of things more verbose than they would be in other, better languages).

The verbosity is/was a very big problem especially pre lambdas in Java 8, and that was due to language features - not library features. But other big problems students faced were arrays, pointers, and mutability. I think pure functions are easier to teach since they are closer to the mathematics computer science students have already learned in discrete maths and such.

I do not like watching them fight the language ecosystem just because other programmers like short code (which yes even the haskell code tends to be far less maintainable than the equivalent OCaml I've found from a student perspective).

It's not so much about short code as it is about pointfree code - I find that Haskellers just love composition and think in those terms -- and we use a small toolkit of infix operators to facilitate this.

Oh if you want to see a good effect system, look at what's being designed for OCaml right now (OCaml is fully functional like Haskell, though with a bit more powerful of a typing system [..]

Are you referring to the 1ML paper here? AFAIK OCaml does not have dependent types which Haskell does (to some extent - and more is being worked on..) but I could be wrong.

[..] most things in Haskell with its ton of nonstandard type extensions in GHC's case. :-)

My only note here is that practically speaking, standardization of Haskell does not matter and GHC == Haskell is in practice true -- so the "nonstandard" extensions are essentially standard for Haskell programmers.

The language I'm being most impressed with these days is Idris.

On a separate note I think Rust could gain an effects system at least in part for const.

@OvermindDL1

I never had this problem... I use operators in such a way (and they have a fixity in such a way) that I run into precedence issues very seldom.

It is a habit thing, and I don't habitually use Haskell enough for it to become engrained as I generally just use it to maintain things and not write new in it, but this is a definite friction point when you don't live Haskell.

The verbosity is/was a very big problem especially pre lambdas in Java 8, and that was due to language features - not library features.

That really was such a nice feature they added in J8! It needed it long before (honestly in a different way internally, but eh), but we'd dropped Java here as something to teach back in the J6 era. ^.^;

It's not so much about short code as it is about pointfree code - I find that Haskellers just love composition and think in those terms -- and we use a small toolkit of infix operators to facilitate this.

But for most people this makes the code exceptionally hard to read, especially when teaching. Point-free coding is a section we've used here (though in OCaml instead but the syntax was similar enough to Haskell) and 99% of students always preferred the code that was actually readable, even if longer. It requires them to hold less 'state' in their head at a given time (and yes that is something you have to exercise to get good at, and even then it does not gain enough to be worth it in my couple-decades of experience).

Are you referring to the 1ML paper here?

Ooo? Something to read? Linkie please? :-)

AFAIK OCaml does not have dependent types which Haskell does (to some extent - and more is being worked on..) but I could be wrong.

It does not, that is one feature that I would quite like, although there are languages with dependent and refined types that do compile 'to' OCaml (OCaml is such a bliss of a language to both compile from it to other things as well as compile 'to' it). Honestly I'd love an OCaml'y language, with implicit modules and algebraic effects and an ownership system with tracked lifetimes, throw in dependent types (preferably in a way that does not impact compile time, one thing I really exceptionally love about OCaml is how blazing fast it compiles, unlike smaller Haskell code I have to maintain that takes ~20m to compile compared to much larger OCaml code that compiles in a couple of seconds and it executes far faster), that would be bliss. Rust is a nice balance between my C++ and ML worlds though, but it is getting too many Haskell'isms in it that I quite think is hurting the syntax and it is soooo dreadfully slow to compile at times like C++/Haskell...

My only note here is that practically speaking, standardization of Haskell does not matter and GHC == Haskell is in practice true -- so the "nonstandard" extensions are essentially standard for Haskell programmers.

True in 'general' but we didn't use GHC here the few times we show other languages like Haskell (not my choice, I would have used GHC, but you know colleges...).

The language I'm being most impressed with these days is Idris.

It does look quite nice, though is also quite slow to compile. I've been playing with making a codegen backend to compile it to the BEAM VM lately actually. :-)

On a separate note I think Rust could gain an effects system at least in part for const.

+1

I don't think it should have full unbounded continuations that are really required for a 'proper' full effect system due to their fairly tremendous runtime costs compared to pure native code, but you can get 95% of the way to full effects while keeping native speed by using delimited continuations ala MC-OCaml or many Lisps. You can get fast unbounded continuations if you don't mind hoisting some of the work to the user though, but definitely not as easy to use then.

@Centril
Contributor

Centril commented Mar 19, 2018

It is a habit thing, and I don't habitually use Haskell enough for it to become engrained as I generally just use it to maintain things and not write new in it, but this is a definite friction point when you don't live Haskell.

I think that holds in the reverse too. I find that once you live in Haskell (and use pointfree code) you get friction when you try to use other things.. It's the old "once you got nice things, you don't want to give them up".

It requires them to hold less 'state' in their head at a given time (and yes that is something you have to exercise to get good at, and even then it does not gain enough to be worth it in my couple-decades of experience).

This is strange... as pointfree code has less points (variable bindings) to pass around, and so it should have less state? For me quux = foo . bar . baz is simpler to think about than quux = \x -> foo $ bar $ baz x.

Ooo? Something to read? Linkie please? :-)

Here you go: https://people.mpi-sws.org/~rossberg/1ml/
I haven't read them myself (but I did read the Algebraic effects paper by Edwin Brady); but I'm reading up on the TaPL again and will read the 1ML paper soon ;) Hopefully there's something there we can use for Rust (in terms of effect polymorphism of some form).

[..] throw in dependent types (preferably in a way that does not impact compile time [..]

That does not sound realistic given that dependent types allow you to do more computations at compile time ^,-

[..] unlike smaller Haskell code I have to maintain that takes ~20m to compile compared to much larger OCaml code that compiles in a couple of seconds and it executes far faster

I think part of the problem is that laziness seemed like a nice idea but makes it very hard to reason about performance... However, I'd like to note that you usually don't compile Haskell code until you actually want to deploy - I usually only live in ghci which is a fantastic REPL and does not take 20m when you :r.

True in 'general' but we didn't use GHC here the few times we show other languages like Haskell (not my choice, I would have used GHC, but you know colleges...).

That's the university's fault :P At Chalmers University of Technology (AKA "Church of Haskell", and where Agda was invented) Haskell is the first course the computer engineering programme uses and it is also taught in one of the most appreciated courses for the IT students.

It does look quite nice, though is also quite slow to compile. I've been playing with making a codegen backend to compile it to the BEAM VM lately actually. :-)

I had thought of compiling Haskell Core to BEAM VM as my masters thesis incidentally =P

@OvermindDL1

I think that holds in the reverse too. I find that once you live in Haskell (and use pointfree code) you get friction when you try to use other things.. It's the old "once you got nice things, you don't want to give them up".

For actively using it yes, but not everyone uses exclusively one language, or even uses it all that often (I touch Haskell to keep certain things updated only a couple times a year at most). And that still ignores the learnability part and later maintainability of the lack of descriptiveness.

This is strange... as pointfree code has less points (variable bindings) to pass around, and so it should have less state?

Less 'program' state, but not mental state: it requires you to worry more about the types and how they are being combined and transformed as they go back and forth across the operators, where a simple pipeline is trivial in comparison. This makes languages a LOT harder to learn for students as well as still requiring more work for those that do know it. In a pipeline you only have to worry about how types transform from one pipe to the next, where in full point-free you have to see how they combine, what new types return from those combinations, how they get called (if not just outright curried), and more.

Here you go: people.mpi-sws.org/~rossberg/1ml
I haven't read them myself (but I did read the Algebraic effects paper by Edwin Brady); but I'm reading up on the TaPL again and will read the 1ML paper soon ;) Hopefully there's something there we can use for Rust (in terms of effect polymorphism of some form).

Awesome, added to my list, thanks! You might be curious to see how multi-core/effectful OCaml is being implemented as well, as they've tried to take everything learned thus far from elsewhere to implement something that compiles to exceedingly efficient code, while remaining blazing fast to compile and still able to handle the 95%+ of effects that are needed. It is quite a fascinating implementation compared to most other languages that incur a massive speed hit either at runtime, compile time, or usually both (GHC likes to have both I've noticed, they are very thunk-friendly...).

I think part of the problem is that laziness seemed like a nice idea but makes it very hard to reason about performance... However, I'd like to note that you usually don't compile Haskell code until you actually want to deploy - I usually only live in ghci which is a fantastic REPL and does not take 20m when you :r.

Even then it still takes a lot more than the split second an OCaml incremental compile takes (either by calling the compiler or recompiling via the REPL), especially when heavy HKT usage is around.

That's the university's fault :P At Chalmers University of Technology (AKA "Church of Haskell", and where Agda was invented) Haskell is the first course the computer engineering programme uses and it is also taught in one of the most appreciated courses for the IT students.

Where here it is generally starting with either SML and C both, or Python (depending on if you are coming from the engineering or business sides respectively), before proceeding to more languages.

I had thought of compiling Haskell Core to BEAM VM as my masters thesis incidentally =P

You so should! That'd get my Idris->BEAM-VM magically done for me via Haskell translation! ^.^

@scottmcm
Member

Trying to get back to the topic at hand...

Honest question: When is this better for a function that wouldn't just have been defined as a method in Rust in the first place? Everything here so far has either been something that's a method already or a nonsense name like blah or bloop.

API design as a whole seems to be moving heavily towards "almost always a method", like how even symmetric things like std::cmp::min are becoming available as methods (see Ord::min) instead.

And one can always add an extension trait to let you call things method-style if you want:

trait Methodize: Sized {
    fn methodize<F, R>(self, f: F) -> R
        where F: FnOnce(Self) -> R
    {
        f(self)
    }
}
impl<T> Methodize for T {}

fn main() {
    println!("{:?}", (1).methodize(std::iter::repeat));
    println!("{:?}", (1).methodize(|x| 9 - x));
}

@Centril
Contributor

Centril commented Mar 20, 2018

@scottmcm Extension traits for just a method to get method call syntax is a pain compared to just writing the function and then using f(a, b) so I don't see it as a viable alternative. It also leads to the Itertools problem. Itertools could have been simpler if you just had iter.tuples() be syntactic sugar for tuples(iter) (the trait would go away and you'd only have free functions) and you'd also gain in the ability to use IntoIterator, so: vec![1, 2, 3, 4].tuples().

@scottmcm
Member

Extension traits for just a method to get method call syntax is a pain

So the problem is "It's too annoying to make extension methods", not "I want to call free functions with pipe syntax instead of using methods"?

the trait would go away

That has its advantages and disadvantages. For example, it would lead to use itertools::prelude::*;, which isn't necessarily any better than use itertools::IterTools;. And once specialization is a thing, the hypothetical function might end up calling a trait anyway so the implementation can be specialized.

and you'd also gain in the ability to use IntoIterator

All iterators are IntoIterator, so the extension trait could have just been on IntoIterator instead the whole time. I don't know why it isn't, or whether that would actually be better, but it totally works:

trait IntoIteratorTools: IntoIterator {
    fn my_sum1(self) -> Option<Self::Item>
        where Self: Sized, Self::Item: std::ops::AddAssign
    {
        let mut it = self.into_iter();
        let mut sum = it.next()?;
        for x in it {
            sum += x;
        }
        Some(sum)
    }
}
impl<T: IntoIterator> IntoIteratorTools for T {}

fn main() {
    let v = vec![1, 2, 3];
    println!("{:?}", v.my_sum1());
    let i: Box<Iterator<Item = i32>> = Box::new(0..10);
    println!("{:?}", i.my_sum1());
}

@TedDriggs

So the problem is "It's too annoying to make extension methods", not "I want to call free functions with pipe syntax instead of using methods"?

I'm not sure I'd go that far. Having extension trait wrappers all over the place would be a high barrier to entry for someone trying to preserve the reading order of a set of function calls.

@I60R

I60R commented May 9, 2018

(already proposed that on internals)


A non-associated function would be called on a value with . syntax, and the argument position where the value goes would be marked with in:

   let result = x
        .sum(in)       // Called on `x`
        .abs(in)       // Called on result of `sum`
        .sqrt(in)      // ...
        .round(in, 2); // Other arguments allowed
    return collection.iter()
        .apply_mapping_combinators(in)
        .apply_logging_combinators(in)
        .collect();

Borrowing and copying should work like this:

    return Type::builder()
        .apply_common_settings(&mut in) // To mutably borrow builder.
        .build();    
    long_name_binding
        .borrowed(&in, &in); // That's allowed for all types.
    long_name_binding
        .copied(in, in); // That's allowed only if `Copy` is implemented.

Alternative placeholders might be: it, that, this, __, |>, (move, ref, mut ref).

@eedahl

eedahl commented May 24, 2018

@I60R That looks pretty clean to me. Is there a nice way to handle cases where a function might return a result, a tuple, etc.? With pattern matching you could do stuff like x.foo(in).|(in, y)| { x }.bas(in) or x.foo(in).( |(in, y)| x ).bas(in), say, or other kinds of destructuring, which also suggests that multiple in parameters may be useful, say, in.n, in[n], _n, |n> or other.

By the way, Rust does already use fn f(x) -> Y to signify a function, so -> would seem a natural fit for a piping operator: foo -> bas when the return value of foo matches the input of bas, and piping through pattern-matching closures or e.g. foo -> bas(in, x). I could see this being very convenient.

@I60R

I60R commented May 24, 2018

@eedahl you can use x.foo(in).1.bas(in) to access a specific index of a tuple, and for Result you can use x.foo(in)?.bas(in) or x.foo(in).unwrap().bas(in).

However, things like x.foo(in.1).bar(in[0]).baz(in.unwrap().clone()) might make sense, but then it would be better to rename the alias to it.

Syntax with -> is interesting, because we might get rid of the placeholder when it's not required, e.g. foo->bar(), since we know that a non-associated function is called and there could be a convention that the value is passed as the last argument.

@scottmcm
Member

Having extension trait wrappers all over the place would be a high barrier

If one needs to write out the whole thing I agree, but one could instead imagine adding a feature such that fn foo(self: i32) was all you needed to write, and it just desugared into the full extension trait.

@sgeos

sgeos commented May 17, 2019

I am coming from Elixir. I frequently use the |> pipe operator to describe functional data transformation pipelines. The goal is essentially to present a relatively easy to understand y = f(x) where f(x) is actually a composition of functions such that f(x) = g(h(i(j(x)))). To repeat a key point, a functional pipeline is described. Every step is a function, not a method. Some of these functions may be user defined. Note that in Elixir x |> f(y, z, ...) is the same as f(x, y, z, ...).

To take a working example from a real production system (as opposed to a polished perfect nitpick-proof example):

defp get_resource_by_partial_key_transaction(input) when is_list(input) do
  input
  |> Enum.map(fn %{platform_id: platform_id, user_id: user_id}=key -> {platform_id, user_id, get_resource(key)} end)
  |> Enum.map(fn {platform_id, user_id, list} -> list |> Enum.map(&({:resource_key, platform_id, user_id, &1})) end)
  |> Enum.reduce([], &Kernel.++/2)
  |> Enum.uniq()
  |> Enum.sort()
  |> Enum.map(&(:mnesia.read({:resource, &1})))
  |> Enum.reduce([], &Kernel.++/2)
  |> Enum.map(&resource_to_map/1)
end
defp get_resource_by_partial_key_transaction(input), do: get_resource_by_partial_key_transaction([input])

The above is a multi-headed function designed to operate on list and single item inputs. The first head performs the desired operation on a list of input items and the second head simply wraps a single input value in a list. To rephrase the body of the above operation in English:

  1. take a regular hash map as input
  2. map it to a top level tuple containing a partial composite key and a list of values
  3. for lookup purposes, map each tuple into an expanded list of tuples containing the partial composite key and a scalar from the list of values; {key, [value]} -> [{key, value}]
  4. flatten the list
  5. remove duplicates
  6. sort
  7. read all resources for each partial composite key
  8. flatten the list
  9. convert each resource into a regular hash map
  10. implicitly return this list of values

The top level y = f(x) is "give me all the resource for these partial composite keys". The intermediate steps involve transforming data with a combination of relatively straightforward mapping, library functions and user defined functions. Some of the steps could be combined, but this would sacrifice readability.

A much simpler but somewhat contrived example using an anonymous function looks like this.

fn r ->
  r
  |> :math.pow(2)
  |> Kernel.*(:math.pi)
end

Take the radius, square it and multiply by pi. In other words, given the radius of a circle, calculate its area.

@nixpulvis

nixpulvis commented May 18, 2019

@sgeos what about

use std::f64::consts::PI;

fn area(r: f64) -> f64 {
    PI * r.powf(2.0)
}

I write Elixir sometimes and I might argue that

"foo bar" |> String.split |> length |> IO.puts

should be

println!("{}", "foo bar".split(" ").count())

But there could be cases it's convenient.

I think I'd be more excited about a proposal that solves the ergonomics in a way that makes it easier to use method call syntax, like

"foo bar".split(" ").count().println("{}")

if you really wanted it to all be consistent within the expression.

@sgeos

sgeos commented May 19, 2019

@nixpulvis
Calculating the area of a circle absolutely does not require the use of a pipeline. It does, however, make for an easy-to-understand toy pipeline with an emphasis on data transformation. For that reason I said that the example was contrived.

Pipelines make sense when you want y = f(x) and you do not care to assign the intermediate g(h(i(j(x)))) steps to variables. Furthermore, there are definitely cases where function composition without pipelines makes sense. Note that it sometimes makes sense to break a pipeline to introduce a named variable for the sake of clarity.

fn f(x: type) -> other_type {
  let named_z = x // because a "named_z" is clearer in context
  |> g()
  |> h(w);
  i(y, named_z) // broken pipeline continues here
  |> j(v)
  |> k()
}

For what it is worth, some people prefer to make intermediate values explicit and this definitely makes sense at times.

fn print_word_count(input: &str, token: &str) {
  let word_count = input.split(token).count();
  println!("{}", word_count);
}

Having said that, bike-shedding the trivial examples is not productive. Skyscrapers are built by appropriately combining a variety of techniques. It appears that method call chains serve the same function as pipelines when they can be used. Furthermore, shadowing could also be used for essentially the same purpose. (It can even make sense to properly name the intermediate steps, but I would argue that you do not really want a pipeline at that point.)

// shoehorning previous function into a bike-shed example
// the goal is to illustrate the technique
fn print_word_count(input: &str, token: &str) {
  let word_count = input;  // to make the "pipeline" longer
  let word_count = word_count.split(token);
  let word_count = word_count.count();
  println!("{}", word_count);
}

To go back to a version of my real world example, what is the idiomatic way to express the following kind of logic in Rust? Would a pipeline add any value above and beyond the existing idiomatic way of expressing the logic?

fn do_something(input: Vec<user defined input type>) -> Vec<user defined output type> {
  input
    .map(anonymous_function {})
    .map(anonymous_function {})
    .reduce(library_function {})
    .library_function()
    .user_defined_function() // because this happens too
    .library_function()
    .map(library_function {})
    .reduce(library_function {})
    .map(user_defined_function {})
}

// same file
fn my_user_defined_functions_probably_live_here(input: type, additional_parameters: other_type, ...) -> different_type {
  // logic ...
  result
}

Note that I often mix user defined functions into my pipelines, either directly or through functions like map. My Rust could be better, but many of these user defined functions would not be methods in the other languages I am more familiar with. Further note that this just happens to be a map heavy example. Plenty of valid pipelines do not require mapping.

@Rudxain

Rudxain commented Apr 28, 2024

@Centril

abundance of white space really helps me read the code as opposed to code being littered with terminal symbols

I partly agree with this. I like how Lisp does it: operators are just fns. This has the nice property of allowing binary operators to be easily extended as variadic. In some stack-oriented langs x y z a b c ...+ is just sugar for x y + z + a + b + c +.

Note

... syntax is made-up, I haven't seen it.
... means "take all values from the locally-scoped stack as arguments"

@Rudxain

Rudxain commented Apr 29, 2024

some people prefer to make intermediate values explicit

Potentially relevant real-life example in TS:

const lb = Math.log2

/**
Integer (truncated) Binary Logarithm.
*/
const ilb = (x: number) => Math.trunc(lb(x)) // I prefer `trunc` over `floor`


/**
Calculates the minimum int payload bits needed to encode `s`
*/
const min_bit_len = (s: string) => {
    // avoid `lb(0)`
    if (s == '') return 0

    /** charset size (unique code-points) of `s` */
    const cs_size = (new Set(s)).size

    /** total number of code-points in `s` */
    const cp_count = [...s].length

    // special case for unary
    if (cs_size == 1) return ilb(cp_count) + 1

    /** bits per character */
    const bpc = Math.ceil(lb(cs_size))

    return cp_count * bpc
}

@burdges

burdges commented Apr 29, 2024

As an aside, the more expressive your language (à la Haskell), the worse its automatic code formatting; rustfmt already screws up attempts to express math significance. Haskell guys have instead promoted AST-aware diff tools as solving the same problem as automatic code formatting.

@kennytm
Member

kennytm commented Apr 29, 2024

if we have postfix-match #3295 the pipeline could be written as

result = x
    .match { x => sum(x) }
    .match { a => abs(a) }
    .match { b => sqrt(b) }
    .match { c => round(c, 2) };
