Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Constraints on renderBytes, renderText #64

Open
drquicksilver opened this issue Oct 29, 2015 · 3 comments
Open

Constraints on renderBytes, renderText #64

drquicksilver opened this issue Oct 29, 2015 · 3 comments

Comments

@drquicksilver
Copy link

It may be that I misunderstand the docs but the types of renderBytes and renderText are unexpected to me following from the phrase "this module does not provide IO and ST variants, since the underlying rendering operations are pure functions":

renderBuilder :: Monad m => RenderSettings -> Conduit Event m Builder Source
renderBytes :: (MonadBase base m, PrimMonad base) => RenderSettings -> ConduitM Event ByteString m () Source
renderText :: (MonadThrow m, MonadBase base m, PrimMonad base) => RenderSettings -> ConduitM Event Text m () Source

The PrimMonad constraints force you to deal quite explicitly with ST or IO to get the answers out. It also seems weird that renderText gets MonadThrow while the other two do not - it's not clear why that one would throw an exception any more than the other two; renderBytes by design can only generate valid UTF8 ByteStrings so conversion to Text can never fail.

@snoyberg
Copy link
Owner

PR welcome to clarify the docs. There is currently no function for utf8
decoding that does not have the MonadThrow constraint, but if you add such
a function, we can use it here.

On Thu, Oct 29, 2015, 6:14 AM drquicksilver [email protected]
wrote:

It may be that I misunderstand the docs but the types of renderBytes and
renderText are unexpected to me following from the phrase "this module
does not provide IO and ST variants, since the underlying rendering
operations are pure functions
":

renderBuilder :: Monad m => RenderSettings -> Conduit Event m Builder Source
renderBytes :: (MonadBase base m, PrimMonad base) => RenderSettings -> ConduitM Event ByteString m () Source
renderText :: (MonadThrow m, MonadBase base m, PrimMonad base) => RenderSettings -> ConduitM Event Text m () Source

The PrimMonad constraints force you to deal quite explicitly with ST or IO
to get the answers out. It also seems weird that renderText gets
MonadThrow while the other two do not - it's not clear why that one would
throw an exception any more than the other two; renderBytes by design can
only generate valid UTF8 ByteStrings so conversion to Text can never fail.


Reply to this email directly or view it on GitHub
#64.

@drquicksilver
Copy link
Author

Happy to put together some better docs. Just trying to get my head around the situation and what is intended here.

For RenderBytes there is a commented-out signature without the MonadBase+PrimMonad constraint. That suggests that at one point the author hoped to implement this without the constraint? But it is an implementation detail of Data.Conduit.Blaze that builderToByteString produces a MonadBase+PrimMonad constraint because that is how it manages buffers to get optimal performance? All the question marks are because I’ve been reading a lot of source code without full confidence that I understood the reasons for every choice.

It seems like a shame because RenderBytes is intuitively a ‘pure conduit’; it corresponds to a well-behaved conduit which batches up the obvious pure function Event -> ByteString.

As it happens I am trying to render a stream of Events into a Text object as part of a pure function and that is why I’m asking all these questions. I’m probably going to be forced to use unsafePerformIO in the end - and maybe that’s the right solution for my use case but it does feel inelegant.

Then the MonadThrow issue is because the decoding happens with the Data.Conduit.Text.decode function which quite rightly has a way to signal errors in general, but in this particular case we know statically that no error can occur because renderBytes can only build correct UTF8 strings. Maybe there should be an unsafeDecode variant which is a Conduit without the MonadThrow constraint and it bundles any conversion errors in with error.

I can’t currently see how to write a general function which strips off a MonadThrow constraint from inside ConduitM. You can do it after you’ve run the conduit by doing

fmap (either (error . show) id) . runExceptT . runConduit

or similar.

One more remark on reading this API: it’s initially counter intuitive that renderText is built on top of renderBytes : it seems more natural to first render Events to Text, and then encode the Text as a ByteString in some particular encoding for writing to file or network. I presume the reason for this choice is that ‘output to UTF-8 encoded bytes’ is considered to be the most common path and that will be fasted by the hand-written blaze byte string builder approach?

@snoyberg
Copy link
Owner

snoyberg commented Nov 1, 2015

This is indeed about efficiency. Creating individual chunks of Text would
be the most elegant API (as you're looking at it), but would be much
slower. You can approach that be turning each Builder into a ByteString and
then decoding it, but performance will be abysmal.

On Thu, Oct 29, 2015, 1:31 PM drquicksilver [email protected]
wrote:

Happy to put together some better docs. Just trying to get my head around
the situation and what is intended here.

For RenderBytes there is a commented-out signature without the MonadBase+
PrimMonad constraint. That suggests that at one point the author hoped to
implement this without the constraint? But it is an implementation detail
of Data.Conduit.Blaze that builderToByteString produces a MonadBase+
PrimMonad constraint because that is how it manages buffers to get
optimal performance? All the question marks are because I’ve been reading a
lot of source code without full confidence that I understood the reasons
for every choice.

It seems like a shame because RenderBytes is intuitively a ‘pure
conduit’; it corresponds to a well-behaved conduit which batches up the
obvious pure function Event -> ByteString.

As it happens I am trying to render a stream of Events into a Text object
as part of a pure function and that is why I’m asking all these questions.
I’m probably going to be forced to use unsafePerformIO in the end - and
maybe that’s the right solution for my use case but it does feel inelegant.

Then the MonadThrow issue is because the decoding happens with the
Data.Conduit.Text.decode function which quite rightly has a way to signal
errors in general, but in this particular case we know statically that no
error can occur because renderBytes can only build correct UTF8 strings.
Maybe there should be an unsafeDecode variant which is a Conduit without
the MonadThrow constraint and it bundles any conversion errors in with
error.

I can’t currently see how to write a general function which strips off a
MonadThrow constraint from inside ConduitM. You can do it after you’ve run
the conduit by doing

fmap (either (error . show) id) . runExceptT . runConduit

or similar.

One more remark on reading this API: it’s initially counter intuitive that
renderText is built on top of renderBytes : it seems more natural to
first render Events to Text, and then encode the Text as a ByteString in
some particular encoding for writing to file or network. I presume the
reason for this choice is that ‘output to UTF-8 encoded bytes’ is
considered to be the most common path and that will be fasted by the
hand-written blaze byte string builder approach?


Reply to this email directly or view it on GitHub
#64 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants