Using LZ4 to send stats #336

Open
andysworkshop opened this issue Jun 20, 2018 · 17 comments

@andysworkshop
Contributor

I have configured a socket for receiving line data compressed with LZ4. The sender is Java. I consulted the LZ4 compatibility page and selected Apache Commons Compress. I tried both block and framed mode (which one is correct?) and neither works. Packets are dropped silently at carbon-c-relay with no logging at all; basically nothing happens. None of the internal counters are updated (monitored with the -d option).

For reference, when I switch to a gzip-wrapped socket it works perfectly, as does plain uncompressed data. Any hints as to where I might be going wrong? carbon-c-relay is built against LZ4 lib version 1.8.2 (with static linking).

My listeners are configured like this:

listen
    type linemode
        2013 proto tcp
    ;

listen
    type linemode transport lz4
        2113 proto tcp
    ;

listen
    type linemode transport gzip
        2213 proto tcp
    ;
@grobian
Owner

grobian commented Jun 20, 2018

I use LZ4_compress_fast_continue to write to the compressed stream, i.e. it (re-)uses dictionaries from previously compressed blocks to gain a better compression ratio. The function belongs to the "Streaming Compression Functions" group in lz4.h. I don't think I really do anything special here.
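
For illustration, a minimal sketch of that block-streaming approach (the helper name and buffer handling are illustrative, not the relay's actual code); each metric line becomes one compressed block, with earlier blocks acting as the dictionary for later ones:

/* Illustrative sketch of LZ4 block-streaming compression; not the relay's code.
 * Per lz4.h, previously compressed source data serves as the dictionary, so the
 * source buffers must remain valid and unmodified between calls (typically via
 * a double-buffer or ring-buffer scheme). ctx comes from LZ4_createStream(). */
#include <lz4.h>

/* hypothetical helper: compress one metric line into dst as a single block;
 * returns the block size to write to the socket, or -1 on failure */
static int compress_metric(LZ4_stream_t *ctx, const char *line, int linelen,
                           char *dst, int dstcap)
{
    int blocksize = LZ4_compress_fast_continue(ctx, line, dst,
                                               linelen, dstcap, 1);
    return blocksize > 0 ? blocksize : -1;
}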

@andysworkshop
Contributor Author

Compression on the relay isn't the problem. The Java client is compressing a metrics batch. The relay decompresses and forwards to carbon cache.

The problem is that the whole compressed packet vanishes at the relay, which makes me think that something is wrong with the compressed data format, since LZ4 does appear to have interoperability issues.

Do you have a sample command line that will generate correct lz4 format that I can pipe into the relay with 'nc' for testing and reference?

@andysworkshop
Contributor Author

andysworkshop commented Jun 21, 2018

Looking briefly at the source code for both the relay and the lz4 library it seems the problem is the use of the block streaming API by the relay.

LZ4_decompress_safe_continue() is not safe to use with partial blocks, such as those returned by a short read from the socket (dispatcher.c). Also, you don't know how the sender is generating LZ4 blocks, so you don't have enough metadata to buffer until you've read a complete block.

I think the answer would be to use the LZ4 frame API, which frames the incoming blocks with a header that describes what's coming.

I've only spent 30 minutes looking at the sources, so I could be wrong here; please let me know if I'm off the mark.
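
To make the suggestion concrete, here is a minimal sketch of frame-based decompression with lz4frame.h (a hypothetical helper, not a patch against dispatcher.c); LZ4F_decompress keeps its own state, so it can be fed whatever bytes a socket read happens to return and resumed on the next read:

/* Hypothetical helper, not relay code: feed one read()'s worth of bytes to the
 * frame decoder. dctx comes from
 * LZ4F_createDecompressionContext(&dctx, LZ4F_VERSION). */
#include <lz4frame.h>
#include <stddef.h>

int decompress_chunk(LZ4F_dctx *dctx,
                     const char *in, size_t inlen,   /* bytes from one read() */
                     char *out, size_t outcap)
{
    const char *src = in;
    size_t srcleft = inlen;

    while (srcleft > 0) {
        size_t dstsize = outcap;   /* in: capacity, out: bytes produced */
        size_t srcsize = srcleft;  /* in: bytes available, out: bytes consumed */
        size_t ret = LZ4F_decompress(dctx, out, &dstsize, src, &srcsize, NULL);
        if (LZ4F_isError(ret))
            return -1;
        /* ... hand dstsize bytes of decompressed metrics to the parser ... */
        src += srcsize;
        srcleft -= srcsize;
    }
    return 0;  /* a partial frame is fine; state is kept inside dctx */
}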

@grobian
Owner

grobian commented Jun 21, 2018

The "safe" here is because the input data is arbitrary; a crash must not happen due to malformed data.

Yesterday I tried to produce a simple example using lz4 util, but failed. So there's something wrong here. I tested this at the time using two relays, but I agree it would be useful to work well with other (standard) tools.

A small problem here is that we don't want to buffer much data before we send it, i.e. we send a block after each metric to push it out. Adding frames and such is probably going to defeat the purpose, but I may be wrong here. Tonight I will look into why the read using the lz4 util fails. You may be right about doing the wrong thing there.

@grobian
Owner

grobian commented Jun 21, 2018

I just tested with two relays, and it works as expected. The problem appears to be that nothing seems to be able to just generate block streams, except the C library itself.

@grobian
Owner

grobian commented Jun 21, 2018

I'm thinking I'd better introduce a "heavy-weight" lz4 mode if you want to send lz4 data from a non-C producer.

@andysworkshop
Contributor Author

I just tested with two relays, and it works as expected. The problem appears to be that nothing seems to be able to just generate block streams, except the C library itself.

Even between relays, it will only work if this socket read returns the entire block, such as when both relays are on a fast network or the same machine. If there is network congestion and the socket read returns, say, 1 byte, as it can, then this call will fail because it must be given an entire block. "Streaming" in LZ4 terminology refers to a stream of blocks, not bytes. The API is much more unfriendly and harder to use than zlib's.
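
A self-contained demonstration of that failure mode (an assumed standalone example, not code from either side): hand the block-streaming decoder only a prefix of a compressed block, as a congested socket read might deliver, and it reports an error instead of partial output:

/* Standalone demo, not relay or client code: LZ4_decompress_safe_continue
 * needs a complete compressed block, so a truncated prefix typically yields a
 * negative (error) return. */
#include <lz4.h>
#include <stdio.h>

int main(void)
{
    const char line[] = "foo.bar.requests 42 1529500000\n";
    char block[LZ4_COMPRESSBOUND(sizeof(line))];
    char out[sizeof(line)];

    LZ4_stream_t *enc = LZ4_createStream();
    int blocksize = LZ4_compress_fast_continue(enc, line, block,
                                               sizeof(line), sizeof(block), 1);

    LZ4_streamDecode_t *dec = LZ4_createStreamDecode();
    /* feed only half the block, as a short socket read might deliver */
    int r = LZ4_decompress_safe_continue(dec, block, out,
                                         blocksize / 2, sizeof(out));
    printf("partial block -> %d (negative means error)\n", r);

    LZ4_freeStream(enc);
    LZ4_freeStreamDecode(dec);
    return 0;
}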

I'm thinking I'd better introduce a "heavy-weight" lz4 mode if you want to send lz4 data from a non-C producer.

I agree. That would be a perfect solution.

@grobian
Owner

grobian commented Jun 22, 2018

Since the read you refer to appends to the buffer, a short read does no harm: decompression will fail, but the next read will complete the block.

@grobian
Owner

grobian commented Aug 10, 2018

Any reason why you need LZ4 here? Would zstd for instance be ok for you too?

@andysworkshop
Contributor Author

Sorry for the late reply, I must have missed the notification for this. zstd looks almost too good to be true. We'd certainly switch over to that if it were available. We don't specifically need LZ4. Anything that achieves similar compression to zlib with lower CPU usage and has java client libs available is great.

@grobian
Owner

grobian commented Sep 4, 2018

Alright. I've just looked into creating the "heavy-weight" lz4 support, but (at least on the writer side) it isn't that simple. It most likely won't compress much because it will run per metric, so you wouldn't benefit that much if I did this. Have you looked at Snappy? It is close to lz4 in compression ratio, although its compression and decompression speeds aren't as stunning. Still, compared to zlib it's much faster. I could add support for it easily, I think. There should be very good Java support for it, since it's used by a lot of big-data formats such as Parquet and Avro.

@andysworkshop
Contributor Author

I'll have a look at the compression performance of Snappy against our payloads. I have a feeling it will be good, because our payloads contain a lot of repeating text, so compression algorithms don't have to try hard to do well.

@andysworkshop
Contributor Author

I tested some of our payloads and found that with zlib at compression level 5 we got 95.5%, and with Snappy it's about 90% using the 'snzip' command-line utility. Quite close, really.

Nobody's complaining yet about the CPU load caused by our use of zlib, so there's no rush to implement Snappy, but if you do it we would probably migrate over to it.

@grobian
Owner

grobian commented Sep 6, 2018

I started on it and got it mostly done, except that it doesn't work yet :) You're right that zlib will compress best, but at the cost of CPU. Compression ratios may differ, though, because the relay does it metric by metric.

grobian added a commit that referenced this issue Sep 7, 2018
This is related to issue #336, where snappy could be an alternative take to lz4.
@grobian grobian closed this as completed Oct 21, 2018
@andysworkshop
Contributor Author

Would you be interested in a pull request that implements the LZ4 framed format? I've got it working here and have tested communication between Java -> relay and relay -> relay.
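
For context, a sketch of the producer side of the frame format (illustrative only, not the PR itself): a one-shot LZ4F_compressFrame call emits a self-describing frame that any frame-aware decoder, whether the patched relay, Apache Commons Compress, or the lz4 CLI, can read:

/* Illustrative one-shot frame compression; not the actual PR code. */
#include <lz4frame.h>
#include <stdlib.h>

/* hypothetical helper: compress a metrics batch into a single LZ4 frame;
 * returns the frame size (bytes to write to the socket), or 0 on failure */
size_t compress_batch(const char *metrics, size_t len, char **framed)
{
    size_t cap = LZ4F_compressFrameBound(len, NULL);

    *framed = malloc(cap);
    if (*framed == NULL)
        return 0;

    size_t framelen = LZ4F_compressFrame(*framed, cap, metrics, len, NULL);
    if (LZ4F_isError(framelen)) {
        free(*framed);
        *framed = NULL;
        return 0;
    }
    return framelen;
}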

@grobian grobian reopened this Nov 12, 2018
@grobian
Owner

grobian commented Nov 12, 2018

Yeah, I think there's no way around this: I need to drop the compatibility, as it will not work reliably without enveloping/framing to ensure the decompressor has everything it needs.
I have no changes to the code currently, so you should have no difficulties getting the code merge-ready.

@andysworkshop
Contributor Author

Cool, I'll leave it running for a bit longer, capturing metrics and forwarding between relays, to make sure it's all good, then work on the PR. I'll branch off your last release tag and make my edits there.
