EMBEDDED COMPRESSION LIBRARY

ECL (Embedded Compression Library) is not only for embedded systems: it is oriented mostly toward small data and has specially optimized low-memory modes for restricted environments.

Language: C

Platforms: any

Endianness: any

Library version: 1.0.3

Tested on

Windows 7: msvc2013, msvc2015, gcc 4.8, gcc 7.2
Mac OS 10.12: clang (Apple LLVM version 8.0.0)
Embedded ARM Cortex-M3: armcc 5.06

COMPRESSORS

Some modes of some compressors use intermediate buffers for compression. They never allocate memory implicitly (unless otherwise specified), so the user chooses how the buffers are allocated. Every compression method that uses a temporary buffer (say, more than 10 bytes) has this specified in the documentation next to the method declaration.

ECL:NanoLZ - a meticulously formatted version of the traditional LZ77 algorithm.

use cases - various, same with other pure LZ algorithms;
takes advantage of repeated sequences as short as 2 bytes;
provides API for adjusting compressed data format (see "schemes") to gain better compression ratio for user's datasets (for advanced users);
can be beneficial for very small amounts of data (e.g. to fit some data into single Bluetooth Low Energy packet);
compression ratio - middle..high;
compression ratio limit - roughly infinite;
compressor complexity - linear..cubic;
compressor performance - low..high, with different modes providing a configurable compression/performance trade-off;
compressor performance for small data payloads - optimized;
decompressor complexity - linear;
decompressor performance - middle..high;
compressor buffer memory consumption - from zero to any, with different modes using 0, 256, 512, ... bytes for temporary buffers;
decompressor buffer memory consumption - zero;
static const memory consumption - low;
stack consumption - normal;
(TBD if needed) stream modes for processing data by chunks;
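
The ratio/speed trade-off of an LZ77-style compressor comes largely from how many earlier positions are examined when looking for a repeated sequence. The block below is a minimal sketch of that idea only: a brute-force match search with a candidate limit. The names `toy_match` and `toy_find_match` and all of their parameters are hypothetical and chosen for illustration; this is not ECL's API, its search algorithm, or its data format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch (not an ECL type). */
typedef struct {
    size_t offset; /* distance back from pos to the match start; 0 = no match */
    size_t length; /* number of matching bytes; 0 = no match */
} toy_match;

/*
 * Brute-force LZ77-style search: starting at `pos`, look back through at most
 * `search_limit` earlier positions (never further than `window` bytes) and
 * return the longest sequence that repeats at `pos`.
 * Matches shorter than `min_length` are ignored.
 */
static toy_match toy_find_match(const uint8_t *data, size_t size, size_t pos,
                                size_t window, size_t search_limit,
                                size_t min_length)
{
    toy_match best = {0, 0};
    size_t tried = 0;

    assert(data != NULL && pos <= size);

    for (size_t dist = 1; dist <= pos && dist <= window; ++dist) {
        if (tried++ >= search_limit) break; /* fewer candidates: faster, weaker */

        const size_t cand = pos - dist;
        size_t len = 0;
        /* The match may run into the bytes at pos, as in classic LZ77. */
        while (pos + len < size && data[cand + len] == data[pos + len]) ++len;

        if (len >= min_length && len > best.length) {
            best.offset = dist;
            best.length = len;
        }
    }
    return best;
}
```

For the 10-byte input "abcabcabcx" at position 3, a call with a window of 255, a limit of 16 candidates and a minimum length of 2 finds a 6-byte match at distance 3; with a limit of 2 candidates the same call finds nothing, which is the kind of trade-off a search limit controls.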

ECL:ZeroDevourer - a diff-oriented compressor that takes advantage of zero bytes (even single ones) in your stream.

use cases - incremental update of a structure FOO where you compress (FOO_before XOR FOO_after) rather than FOO itself, or data with a significant number of zero bytes;
compression ratio - depends roughly linearly on the percentage of zero bytes in your data; high for the target use cases;
compression ratio limit - roughly infinite;
compressor complexity - linear;
compressor performance - high (up to gigabyte per second);
decompressor complexity - linear;
decompressor performance - high (up to several gigabytes per second);
compressor buffer memory consumption - zero;
decompressor buffer memory consumption - zero;
static const memory consumption - 10 bytes;
stack consumption - low;
binary code size - small;
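
The incremental-update use case above can be sketched as follows: XOR the old and new state so that unchanged bytes become zero, hand the mostly-zero difference to a zero-oriented codec, and XOR it back on the receiving side. The helper names (`toy_xor_diff`, `toy_xor_apply`, `toy_count_zeroes`) are hypothetical and no ECL function is called here; the compression step itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write before[i] XOR after[i] into diff[i]; unchanged bytes become 0x00. */
static void toy_xor_diff(const uint8_t *before, const uint8_t *after,
                         uint8_t *diff, size_t size)
{
    assert(before != NULL && after != NULL && diff != NULL);
    for (size_t i = 0; i < size; ++i) diff[i] = before[i] ^ after[i];
}

/* Receiver side: turn the old state into the new one, in place. */
static void toy_xor_apply(uint8_t *state, const uint8_t *diff, size_t size)
{
    assert(state != NULL && diff != NULL);
    for (size_t i = 0; i < size; ++i) state[i] ^= diff[i];
}

/* Count zero bytes, i.e. the part a zero-oriented codec can remove. */
static size_t toy_count_zeroes(const uint8_t *data, size_t size)
{
    size_t n = 0;
    for (size_t i = 0; i < size; ++i) n += (data[i] == 0);
    return n;
}
```

In use, the sender would keep the previously transmitted copy of FOO, compute the diff against the current copy, compress and send the diff, and the receiver would decompress it and apply it to its own copy with `toy_xor_apply`.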

ECL:ZeroEater - a diff-oriented compressor that takes advantage of zero bytes (starting from two bytes in a row) in your stream.

use cases - incremental update of a structure FOO where you compress (FOO_before XOR FOO_after) rather than FOO itself, or data with a significant number of zero bytes;
compression ratio - similar to ZeroDevourer but worse; high for the target use cases;
compression ratio limit - 64;
compressor has dry-run mode;
compressor complexity - linear;
compressor performance - very high (up to gigabyte per second);
decompressor complexity - linear;
decompressor performance - very high (up to several gigabytes per second);
compressor buffer memory consumption - zero;
decompressor buffer memory consumption - zero;
static const memory consumption - zero;
stack consumption - lowest;
binary code size - minimum;
doesn't require "ECL_common.c" file to be compiled;
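
To illustrate why it matters whether a codec gains from single zero bytes or only from runs, here is a minimal sketch of a toy zero-run coder. Its format (a zero run becomes the pair {0x00, run length}) is invented for this example and is not the ZeroEater or ZeroDevourer format; the function names are hypothetical as well. Passing a NULL destination only computes the output size, which is one possible reading of a "dry run" and is an assumption about what that term means.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy encoder: non-zero bytes are copied as-is; a run of 1..255 zero bytes
 * becomes the pair {0x00, run_length}. Returns the encoded size, or SIZE_MAX
 * if `dst` is too small. With dst == NULL nothing is written and only the
 * size is computed.
 */
static size_t toy_zero_rle_encode(const uint8_t *src, size_t src_size,
                                  uint8_t *dst, size_t dst_capacity)
{
    size_t out = 0;
    assert(src != NULL || src_size == 0);

    for (size_t i = 0; i < src_size; ) {
        if (src[i] != 0) {
            if (dst != NULL) {
                if (out >= dst_capacity) return SIZE_MAX;
                dst[out] = src[i];
            }
            out += 1;
            i += 1;
        } else {
            size_t run = 0;
            while (i + run < src_size && src[i + run] == 0 && run < 255) ++run;
            if (dst != NULL) {
                if (dst_capacity - out < 2 || out > dst_capacity) return SIZE_MAX;
                dst[out] = 0;
                dst[out + 1] = (uint8_t)run;
            }
            out += 2;
            i += run;
        }
    }
    return out;
}

/* Toy decoder for the format above. Returns the decoded size or SIZE_MAX. */
static size_t toy_zero_rle_decode(const uint8_t *src, size_t src_size,
                                  uint8_t *dst, size_t dst_capacity)
{
    size_t out = 0;
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < src_size; ) {
        if (src[i] != 0) {
            if (out >= dst_capacity) return SIZE_MAX;
            dst[out++] = src[i++];
        } else {
            if (i + 1 >= src_size) return SIZE_MAX;     /* truncated pair */
            const size_t run = src[i + 1];
            if (run == 0 || run > dst_capacity - out) return SIZE_MAX;
            for (size_t k = 0; k < run; ++k) dst[out++] = 0;
            i += 2;
        }
    }
    return out;
}
```

In this toy format a single zero byte costs two output bytes, two zeroes in a row break even, and only longer runs shrink the data. A real format has to make a similar choice about where the gain starts.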

FORMATS

Compressor formats are described in the "formats/" dir. Some features are shared between compressors (except ZeroEater, which is simple and independent):

"formats/ECL_JH.txt" - format of Jumping Header, main feature and core of the library;
"formats/ECL_E_number_format.txt" - format of 'Extensible' numbers;

EXTRA CONFIGURING

See ECL_config.h for configuration details; most options are controlled by ECL_USE* macros.

you can explicitly specify the bitness of length variables at your discretion (ECL_USE_BITNESS_16 / 32 / 64 macro), default is 32;
you can enable/disable branchless optimizations at your discretion (currently inefficient) - see ECL_USE_BRANCHLESS;
you can enable/disable internal asserts (they work only if the system assert works, e.g. no NDEBUG macro is defined) - see ECL_USE_ASSERT;
you can disable malloc/free in case they cause compilation errors on some restricted platforms - see ECL_DISABLE_MALLOC;
you can exclude memory-demanding functions by defining ECL_EXCLUDE_HIMEM (useful for 16-bit compilers, e.g. the Arduino environment) to fix some warnings;
you can allow all NanoLZ schemes or only a specific one, which lets the compiler inline more for better performance - see ECL_NANO_LZ_ONLY_SCHEME;
to use ECL as a dynamic library - uncomment the define for ECL_USE_DLL; to build the dynamic library - also define ECL_DLL_EXPORT;
in case you don't have uint*_t types defined in stdint.h - define those types there near "user setup part";

PERFORMANCE BENCHMARKS

PC benchmarks are performed on an Intel Core i5-3570K @ 3.4 GHz / Windows 7 64-bit / 16 GB RAM at 1600 MHz. All benchmarks are performed for ECL version 1.0.0, compiled with GCC 7.2.0 using the options -m32 -Wall -Wextra -pedantic -O3.

ECL sources are compiled as single file: "ecl-all-c-included/ECL_all_c_included.c" (and for Embedded benchmarks too);
Speed is in megabytes per second (mb/s);
NanoLZ uses Scheme1 unless otherwise specified;
ECL is built for 32 bits (ECL_USE_BITNESS_32, default option) unless otherwise specified. In most cases the 32-bit build is the most efficient;
The compressor parameter (compressionLevel for LZ4_HC, search_limit for NanoLZ) is referred to below as Param;
Ratio is the size of the compressed data relative to the size of the original data, e.g. compressing 1000 bytes -> 100 bytes corresponds to ratio 0.1.
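
For readers reproducing the tables, the two reported quantities can be computed as in the short sketch below. The helper names are hypothetical, and treating one megabyte as 1,000,000 bytes is an assumption: the text does not say whether the benchmarks used that or 1,048,576 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Ratio as defined above: compressed size divided by original size. */
static double toy_ratio(size_t original_size, size_t compressed_size)
{
    return original_size ? (double)compressed_size / (double)original_size : 0.0;
}

/*
 * Throughput in megabytes per second for `bytes` processed in `seconds`.
 * One megabyte is taken as 1,000,000 bytes here, which is an assumption.
 */
static double toy_speed_mb_s(size_t bytes, double seconds)
{
    assert(seconds > 0.0); /* a zero interval means the timer was too coarse */
    return (double)bytes / 1e6 / seconds;
}
```

With this definition a lower ratio is better, so a ratio of 0.465 means the output is a little under half the size of the input.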

Benchmarking small datasets (PC and Embedded environment)

The samples are around 2 kb each and are processed repeatedly in a large outer loop. The comparison covers:

main comparison - NanoLZ versus LZ4 (which can also be configured for low memory);
for PC: NanoLZ demo scheme (Scheme2) - to show that you can achieve different characteristics within NanoLZ, if needed;
for PC: LZ4 in high-compression mode (LZ4_HC) with large memory buffers;
for PC: NanoLZ with larger memory buffers (mid2min, fast1/fast2 using window_size_bits=11, referred to below as Window) for a detailed codec comparison on small data.

PC environment

Compressor	Param	Ratio	Compression	Decompression	Compressor memory
LZ4_compress_default (v1.8.1)		0.548	625 mb/s	2020 mb/s	256 bytes
ECL_NanoLZ_Compress_mid1min		0.465	249 mb/s	335 mb/s	256 bytes
ECL_NanoLZ_Compress_mid1	2	0.427	114 mb/s	332 mb/s	256 bytes
ECL_NanoLZ_Compress_mid1	3	0.413	81 mb/s	332 mb/s	256 bytes
ECL_NanoLZ_Compress_mid1	4	0.4	65 mb/s	332 mb/s	256 bytes
ECL_NanoLZ_Compress_mid1	5	0.397	56 mb/s	337 mb/s	256 bytes
ECL_NanoLZ_Compress_mid1	10	0.387	39 mb/s	344 mb/s	256 bytes
ECL_NanoLZ_Compress_mid1	50	0.377	24 mb/s	355 mb/s	256 bytes
ECL_NanoLZ_Compress_mid1min Scheme2 (demo)		0.592	228 mb/s	890 mb/s	256 bytes

ECL_NanoLZ_Compress_mid2min		0.465	347 mb/s	334 mb/s	513 bytes
ECL_NanoLZ_Compress_fast1	20	0.383	60 mb/s	340 mb/s	4612 bytes (16bit build)
ECL_NanoLZ_Compress_fast1	20	0.383	66 mb/s	342 mb/s	9224 bytes
ECL_NanoLZ_Compress_fast2	20	0.377	93 mb/s	343 mb/s	135172 bytes (16bit build)
ECL_NanoLZ_Compress_fast2	20	0.377	79 mb/s	350 mb/s	270344 bytes
ECL_NanoLZ_Compress_fast2	100	0.37	54 mb/s	350 mb/s	135172 bytes (16bit build)
ECL_NanoLZ_Compress_fast2	100	0.37	50 mb/s	360 mb/s	270344 bytes

LZ4_compress_HC (v1.8.1)	3	0.49	87 mb/s	2230 mb/s	384 kb
LZ4_compress_HC (v1.8.1)	9	0.49	59 mb/s	2230 mb/s	384 kb
LZ4_compress_HC (v1.8.1)	11	0.49	16 mb/s	2230 mb/s	384 kb
LZ4_compress_HC (v1.8.1)	12	0.49	10 mb/s	2230 mb/s	384 kb

On some other small datasets (highly compressible) NanoLZ:Scheme1 decompression speed exceeded 1000 mb/s, while compression speed of ECL_NanoLZ_Compress_mid2min reached 570 mb/s.

Embedded environment

hardware: ARM Cortex-M3 120 MHz;
compiler: armcc 5.06;
optimization options: -O3.

Compressor	Param	Ratio	Compression	Decompression	Compressor memory
LZ4_compress_default (v1.8.0)		0.53	1.785 mb/s	10.12 mb/s	1024 bytes
ECL_NanoLZ_Compress_mid2min		0.465	1.822 mb/s	2.42 mb/s	513 bytes
ECL_NanoLZ_Compress_mid2	2	0.427	0.865 mb/s	2.42 mb/s	513 bytes
ECL_NanoLZ_Compress_mid2	3	0.413	0.718 mb/s	2.46 mb/s	513 bytes
ECL_NanoLZ_Compress_mid2	4	0.4	0.636 mb/s	2.48 mb/s	513 bytes
ECL_NanoLZ_Compress_mid2	5	0.397	0.584 mb/s	2.54 mb/s	513 bytes
ECL_NanoLZ_Compress_mid2	10	0.387	0.455 mb/s	2.57 mb/s	513 bytes

Benchmarking large datasets (only PC environment)

Big datasets are used here to show performance measured on large data without wrapping it in an outer loop; they also show that NanoLZ is an inappropriate choice for big data. In my measurements LZ4 wins on compression ratio for data larger than around 25 kb (this bound is not accurately measured and no supporting statistics are provided). Again, though, a custom scheme lets you achieve different characteristics.

Benchmarks for data samples from Silesia Corpus (referred to below as Silesia): 12 files of different types, with sizes ranging from 6 to 51 mb:

Compressor	Param	Window	Compression	Decompression	Compressor memory
ECL_NanoLZ_Compress_fast2	1	16	57..158 mb/s	122..310 mb/s	512 kb
ECL_NanoLZ_Compress_fast2	10	16	25..98 mb/s	132..435 mb/s	512 kb
ECL_NanoLZ_Compress_fast2	100	16	8..31 mb/s	164..569 mb/s	512 kb
ECL_NanoLZ_Compress_fast2	10	18	22..97 mb/s	135..436 mb/s	1.3 mb
ECL_NanoLZ_Compress_fast2	100	18	5..23 mb/s	161..586 mb/s	1.3 mb

ECL_ZeroEater_Compress			518..1446 mb/s	1353..7000 mb/s	0
ECL_ZeroDevourer_Compress			341..1456 mb/s	739..10317 mb/s	0

Note that the prodigious decompression speeds of ZeroEater and ZeroDevourer are achieved on files they cannot compress; see the next table for aggregate statistics.

Benchmarks for single Silesia.tar file (202 mb):

Compressor	Param	Window	Ratio	Compression	Decompression	Compressor memory
memcpy			1.000	6300 mb/s	6300 mb/s	0
LZ4_compress_default (v1.8.1)			0.479	437 mb/s	1543 mb/s	16 kb
LZ4_compress_HC (v1.8.1)	3		0.383	74.5 mb/s	1704 mb/s	384 kb
LZ4_compress_HC (v1.8.1)	9		0.367	26.9 mb/s	1783 mb/s	384 kb

ECL_NanoLZ_Compress_fast2	1	16	0.473	92.5 mb/s	192.7 mb/s	512 kb
ECL_NanoLZ_Compress_fast2	10	16	0.407	47.5 mb/s	221.5 mb/s	512 kb
ECL_NanoLZ_Compress_fast2	100	16	0.39	15.3 mb/s	250 mb/s	512 kb
ECL_NanoLZ_Compress_fast2	10	18	0.405	42.7 mb/s	222 mb/s	1.3 mb
ECL_NanoLZ_Compress_fast2	100	18	0.385	10.5 mb/s	254 mb/s	1.3 mb

ECL_ZeroEater_Compress dry			0.948	1136 mb/s		0
ECL_ZeroEater_Compress			0.948	925 mb/s	2940 mb/s	0
ECL_ZeroDevourer_Compress			0.935	696 mb/s	1905 mb/s	0

MULTITHREADING

Codecs don't share any non-const data, so the API is thread-safe.

SAMPLE TESTING PROGRAM

There's a sample program to compress/decompress in NanoLZ:Scheme1 format via the command line: see the "sample/" dir. It's enough to compile the single "sample/sample.cpp" file; some build scripts are provided in the same dir. The program is unable to compress files that are too large.

USAGE

In general, usage is pretty straightforward: you call a *Compress method, you call a *Decompress method, and that's it.

usage samples are present in the header of each codec;
see "sample/sample.cpp" example program to encode/decode files;
you can find examples in tests located in "tests/" dir.

BUILDING

It's enough to build the single "ecl-all-c-included/ECL_all_c_included.c" file, so do that unless you have a reason to compile the minimum amount of code. If you do need to minimize the amount of code to compile, then to make compressor "FOO" available, include "ECL_FOO.h" and compile "ECL_common.c" + "ECL_FOO.c" (ZeroEater does not require "ECL_common.c").

If you need a static/dynamic library instead of adding ECL sources to your project, you will have to compile it yourself (no such scripts are provided here). To build a dynamic library, define the ECL_USE_DLL and ECL_DLL_EXPORT macros; to import it, define only the ECL_USE_DLL macro.

Note that building as C rather than C++ results in higher performance due to the absence of code generated for exception handling, so simply #including the code into one of your C++ source files is not the most efficient option.

TESTS

unit tests are in the "tests/" dir and can be built and launched with the scripts in the same place. Basically, compiling the single "tests/tests.cpp" file is enough;
scripts in the "tests/" dir build and run ECL consecutively for 16, 32 and 64 bits (bitness of ECL_usize);
the test executable takes an optional "depth" parameter on launch: a number between -10 and 1000 (e.g. "tests.exe 50"). It determines test coverage and the time spent on tests (-10 for a fast run, 1000 for a very deep/slow run);
sources of the tests can easily be embedded into another project and run any number of times with any parameters from there, see "tests/tests.cpp".
