feat: implement ALP-RD compression #947

a10y · 2024-09-29T18:08:10Z

Fixes #10: Add ALP-RD compression.

Currently our only floating point compression algorithm is standard ALP, which targets floats/doubles that are originally decimal, and thus have some natural integer they can round to when you undo the exponent.

For science/math datasets, there are a lot of "real doubles", i.e. floating point numbers that use most/all of their available precision. These do not compress with standard ALP. The ALP paper authors had a solution for this called "ALP for 'Real' Doubles" / ALP-RD, which is implemented in this PR.

Basics

The key insight of ALP-RD is that even for dense floating point numbers, values within a column often share their front bits (exponent + first few bits of mantissa). We try to find the best cut-point within the leftmost 16 bits.

There are generally a small number of unique values for the leftmost bits, so you can create a dictionary of fixed size (here we use 8 entries, following the C++ implementation), whose codes naturally bit-pack down to 3 bits. If you compress perfectly without exceptions, you can store 49 bits/value, i.e. ~23% compression. In practice the amount varies. In the comments below you can see a test with the POI dataset referenced in the ALP paper, where we replicate their results of 55 and 56 bits/value for latitude and longitude respectively.
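To make the split concrete, here is a minimal sketch of the left/right cut and the frequency-based dictionary described above. It is not the code from this PR: the function names `split`, `join` and `build_dict` are made up for illustration, and it assumes f64 input with a cut-point between 48 and 63 right bits so the left part fits in a u16.

```rust
use std::collections::HashMap;

/// Split an f64 at `right_bits`: the low `right_bits` bits are the right part,
/// everything above them is the left part.
/// Assumption: 48 <= right_bits <= 63, so the left part fits in a u16.
fn split(value: f64, right_bits: u32) -> (u16, u64) {
    debug_assert!((48..=63).contains(&right_bits));
    let bits = value.to_bits();
    let left = (bits >> right_bits) as u16;
    let right = bits & ((1u64 << right_bits) - 1);
    (left, right)
}

/// Rebuild the original f64 from its two halves.
fn join(left: u16, right: u64, right_bits: u32) -> f64 {
    f64::from_bits(((left as u64) << right_bits) | right)
}

/// Build a dictionary holding the `max_dict` most frequent left-parts.
fn build_dict(values: &[f64], right_bits: u32, max_dict: usize) -> Vec<u16> {
    let mut counts: HashMap<u16, usize> = HashMap::new();
    for &v in values {
        *counts.entry(split(v, right_bits).0).or_insert(0) += 1;
    }
    let mut by_freq: Vec<(u16, usize)> = counts.into_iter().collect();
    // Most frequent first; ties are broken by value so the output is deterministic.
    by_freq.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    by_freq.into_iter().take(max_dict).map(|(left, _)| left).collect()
}
```

With a cut at 52 right bits, the left part is just the sign and exponent, so every value in [0.5, 1) maps to the same left part, 1022. Values whose left part is not among the most frequent ones would have to be stored as exceptions.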

List of changes

Reorganized the vortex-alp crate. I created two top-level modules, alp and alp_rd, and moved the previous implementation into the alp module
Added new ALPRDArray in the alp_rd module. It supports both f32 and f64, and all major compute functions are implemented (save for MaybeCompareFn and the Accessors; I will file an issue to implement these in a FLUP if that's alright, since this PR is already quite large)
Added corresponding ALPRDCompressor and wired the CompressorRef everywhere I could find ALPCompressor
New benchmark for RD compression in the existing ALP benchmarks suite

a10y · 2024-09-30T14:26:22Z

Some Q's:

For ALP-RD, we store a small dictionary (<= 16 bytes) for the left-parts. Should that be stored as a child or as metadata on the array?
ALP-RD from the paper prescribes specific compression algorithms for each of its subcomponents (fused dict+FL for left-parts, FL bit-pack for right-parts). Should we implement those, or should we just let it cascade in the compressor?

Separately just an observation, but ALP from the paper recommends having one pair of exponents per vector, rather than for the entire array like we do now.

EDIT: answers

Store in metadata
Can bit-pack the left/right side explicitly
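For reference, a rough sketch of what bit-packing the dictionary codes means. This uses a naive LSB-first layout, not the FastLanes layout the real encoding relies on, and `pack` / `unpack` are illustrative names only. It assumes a width of at most 8 bits and codes that fit in that width.

```rust
/// Pack each code into `width` bits, LSB-first, with no padding between codes.
/// Assumption: width <= 8 and every code is < 2^width.
fn pack(codes: &[u8], width: u32) -> Vec<u8> {
    let width = width as usize;
    let mut out = vec![0u8; (codes.len() * width + 7) / 8];
    for (i, &code) in codes.iter().enumerate() {
        for b in 0..width {
            if (code >> b) & 1 == 1 {
                let pos = i * width + b;
                out[pos / 8] |= 1 << (pos % 8);
            }
        }
    }
    out
}

/// Inverse of `pack`: read `len` codes of `width` bits each.
fn unpack(packed: &[u8], width: u32, len: usize) -> Vec<u8> {
    let width = width as usize;
    (0..len)
        .map(|i| {
            let mut code = 0u8;
            for b in 0..width {
                let pos = i * width + b;
                code |= ((packed[pos / 8] >> (pos % 8)) & 1) << b;
            }
            code
        })
        .collect()
}
```

At 3 bits per code, 424205 codes need 159077 bytes in this layout.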

a10y · 2024-10-01T18:27:44Z

encodings/alp/src/alp_rd/mod.rs

+        // dict-encode the left-parts, keeping track of exceptions
+        for (idx, left) in left_parts.iter_mut().enumerate() {
+            // TODO: revisit if we need to change the branch order for perf.
+            if let Some(code) = self.codes.iter().position(|v| *v == *left) {


I had originally used a HashMap for this, like the C++ code does, but it turns out that doing a linear search on a small fixed-size array is considerably faster (~5x) than doing hashmap lookups.
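A standalone sketch of the linear-search approach, for readers who want to try it outside the crate. The function name and the exception representation (a list of index/value pairs) are assumptions for illustration; the PR itself stores exceptions in a sparse array.

```rust
/// Dict-encode left-parts against a small dictionary using a linear scan.
/// Returns the codes plus a list of (index, original left-part) exceptions.
fn encode_left_parts(left_parts: &[u16], dict: &[u16]) -> (Vec<u8>, Vec<(usize, u16)>) {
    let mut codes = Vec::with_capacity(left_parts.len());
    let mut exceptions = Vec::new();
    for (idx, left) in left_parts.iter().enumerate() {
        // With at most 8 u16 entries the whole dictionary is 16 bytes,
        // so a scan avoids hashing entirely.
        match dict.iter().position(|v| v == left) {
            Some(code) => codes.push(code as u8),
            None => {
                // Placeholder code; the real value is kept in the exceptions list.
                codes.push(0);
                exceptions.push((idx, *left));
            }
        }
    }
    (codes, exceptions)
}
```

Decoding would look up `dict[code]` for every position and then overwrite the positions listed in the exceptions.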

a10y · 2024-10-01T19:44:51Z

I implemented a small test using the POI Kaggle dataset referenced from the paper, and I was able to replicate the compression ratio results.

/Users/aduffy/.cargo/bin/cargo run --color=always --bin compress_poi --manifest-path /Volumes/Code/vortex/bench-vortex/Cargo.toml
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.16s
     Running `target/debug/compress_poi`
reading with schema: Struct(StructDType { names: ["name", "latitude_radian", "longitude_radian", "num_links", "links", "num_categories", "categories"], dtypes: [Utf8(Nullable), Primitive(F64, Nullable), Primitive(F64, Nullable), Primitive(I64, Nullable), Utf8(Nullable), Primitive(I64, Nullable), Utf8(Nullable)] }, NonNullable)
raw POI dataset
root: vortex.struct(0x04)({latitude_radian=f64?, longitude_radian=f64?}, len=424205) nbytes=6.79 MB (100.00%)
  metadata: StructMetadata { length: 424205, validity: NonNullable }
  "latitude_radian": vortex.primitive(0x03)(f64?, len=424205) nbytes=3.39 MB (50.00%)
    metadata: PrimitiveMetadata { validity: AllValid }
    buffer: 3.39 MB
  "longitude_radian": vortex.primitive(0x03)(f64?, len=424205) nbytes=3.39 MB (50.00%)
    metadata: PrimitiveMetadata { validity: AllValid }
    buffer: 3.39 MB

Compressed POI data
root: vortex.struct(0x04)({latitude_radian=f64?, longitude_radian=f64?}, len=424205) nbytes=5.95 MB (100.00%)
  metadata: StructMetadata { length: 424205, validity: NonNullable }
  "latitude_radian": vortex.alprd(0x1e)(f64?, len=424205) nbytes=2.95 MB (49.66%)
    metadata: ALPRDMetadata { is_f32: false, right_bit_width: 52, dict_len: 8, dict: [1022, 1021, 3069, 1020, 1023, 3070, 3068, 3071], left_parts_dtype: Primitive(U16, Nullable), has_exceptions: true }
    left_parts: fastlanes.bitpacked(0x15)(u16?, len=424205) nbytes=159.08 kB (2.68%)
      metadata: BitPackedMetadata { validity: AllValid, bit_width: 3, offset: 0, length: 424205, has_patches: false }
      buffer: 159.36 kB
    right_parts: fastlanes.bitpacked(0x15)(u64?, len=424205) nbytes=2.76 MB (46.38%)
      metadata: BitPackedMetadata { validity: AllValid, bit_width: 52, offset: 0, length: 424205, has_patches: false }
      buffer: 2.76 MB
    left_parts_exceptions: vortex.sparse(0x08)(u16?, len=424205) nbytes=36.42 kB (0.61%)
      metadata: SparseMetadata { indices_dtype: Primitive(U64, NonNullable), indices_offset: 0, indices_len: 8325, len: 424205, fill_value: Scalar { dtype: Primitive(U16, Nullable), value: Null } }
      indices: fastlanes.bitpacked(0x15)(u64, len=8325) nbytes=19.77 kB (0.33%)
        metadata: BitPackedMetadata { validity: NonNullable, bit_width: 19, offset: 0, length: 8325, has_patches: false }
        buffer: 21.89 kB
      values: vortex.primitive(0x03)(u16?, len=8325) nbytes=16.65 kB (0.28%)
        metadata: PrimitiveMetadata { validity: AllValid }
        buffer: 16.65 kB
  "longitude_radian": vortex.alprd(0x1e)(f64?, len=424205) nbytes=2.99 MB (50.34%)
    metadata: ALPRDMetadata { is_f32: false, right_bit_width: 53, dict_len: 8, dict: [1535, 510, 511, 512, 1536, 509, 1533, 1534], left_parts_dtype: Primitive(U16, Nullable), has_exceptions: true }
    left_parts: fastlanes.bitpacked(0x15)(u16?, len=424205) nbytes=159.08 kB (2.68%)
      metadata: BitPackedMetadata { validity: AllValid, bit_width: 3, offset: 0, length: 424205, has_patches: false }
      buffer: 159.36 kB
    right_parts: fastlanes.bitpacked(0x15)(u64?, len=424205) nbytes=2.81 MB (47.27%)
      metadata: BitPackedMetadata { validity: AllValid, bit_width: 53, offset: 0, length: 424205, has_patches: false }
      buffer: 2.82 MB
    left_parts_exceptions: vortex.sparse(0x08)(u16?, len=424205) nbytes=23.37 kB (0.39%)
      metadata: SparseMetadata { indices_dtype: Primitive(U64, NonNullable), indices_offset: 0, indices_len: 5342, len: 424205, fill_value: Scalar { dtype: Primitive(U16, Nullable), value: Null } }
      indices: fastlanes.bitpacked(0x15)(u64, len=5342) nbytes=12.69 kB (0.21%)
        metadata: BitPackedMetadata { validity: NonNullable, bit_width: 19, offset: 0, length: 5342, has_patches: false }
        buffer: 14.59 kB
      values: vortex.primitive(0x03)(u16?, len=5342) nbytes=10.68 kB (0.18%)
        metadata: PrimitiveMetadata { validity: AllValid }
        buffer: 10.68 kB

We can see that our bits-per-value are roughly 55 for latitude_radian and 56 for longitude_radian:

latitude_radian: 55.6 bits per value
longitude_radian: 56.4 bits per value

This nets us an overall size reduction of ~12.5% (6.79 MB down to 5.95 MB).
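These figures follow directly from the nbytes and len fields in the dump above. A small sketch of the arithmetic, assuming MB means 10^6 bytes (which matches the raw column: 424205 values * 8 bytes = 3.39 MB); the helper names are illustrative:

```rust
/// Average encoded bits per value, given the encoded size in bytes.
fn bits_per_value(nbytes: f64, len: usize) -> f64 {
    nbytes * 8.0 / len as f64
}

/// Fraction of the raw size that compression saved.
fn size_reduction(raw_nbytes: f64, compressed_nbytes: f64) -> f64 {
    1.0 - compressed_nbytes / raw_nbytes
}
```

For example, `bits_per_value(2.95e6, 424205)` gives roughly 55.6 for latitude_radian, and `size_reduction(6.79e6, 5.95e6)` gives roughly 0.124.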

a10y · 2024-10-01T20:14:54Z

bench-vortex/src/reader.rs

@@ -89,7 +89,7 @@ pub async fn rewrite_parquet_as_vortex<W: VortexWrite>(
    Ok(())
 }

-pub fn read_parquet_to_vortex(parquet_path: &Path) -> VortexResult<ChunkedArray> {
+pub fn read_parquet_to_vortex<P: AsRef<Path>>(parquet_path: P) -> VortexResult<ChunkedArray> {
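For context on this signature change: taking `P: AsRef<Path>` lets callers pass a `&str`, `String`, `PathBuf` or `&Path` without converting first. A minimal sketch with a hypothetical helper (`file_extension` is not part of bench-vortex):

```rust
use std::path::Path;

/// Hypothetical stand-in for a reader function: it only inspects the path.
fn file_extension<P: AsRef<Path>>(path: P) -> Option<String> {
    path.as_ref()
        .extension()
        .map(|ext| ext.to_string_lossy().into_owned())
}
```

Both `file_extension("data/poi.parquet")` and `file_extension(std::path::PathBuf::from("data/poi.parquet"))` compile against this signature, whereas the old `&Path` parameter would need an explicit `Path::new(...)` for the string case.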


a10y · 2024-10-01T20:16:09Z

encodings/alp/src/alp/mod.rs

@@ -19,7 +26,14 @@ impl Display for Exponents {
    }
 }

-pub trait ALPFloat: Float + Display + 'static {
+mod private {


in theory this was previously extensible, but we probably want to constrain it

poor f16 not considered in the paper. Anyway this is the right thing to do

ah i too forgot about f16. I suppose that we probably want special compressors for things like bf16

Yes, this is just a comment for future readers. The paper only talked about the common float types
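For future readers, the `mod private` in the diff is the usual sealed-trait pattern. A minimal sketch of how it constrains a trait to f32 and f64; `RealFloat`, `to_u64_bits` and `left_part` are hypothetical names, not the actual items in this crate:

```rust
mod private {
    /// Only types implementing `Sealed` can implement the public trait,
    /// and downstream crates cannot name this trait to implement it.
    pub trait Sealed {}
    impl Sealed for f32 {}
    impl Sealed for f64 {}
}

/// Hypothetical stand-in for the float trait being sealed.
pub trait RealFloat: private::Sealed + Copy {
    /// Total number of bits in the type.
    const BITS: u32;
    /// Raw bit pattern, widened to u64.
    fn to_u64_bits(self) -> u64;
}

impl RealFloat for f32 {
    const BITS: u32 = 32;
    fn to_u64_bits(self) -> u64 {
        self.to_bits() as u64
    }
}

impl RealFloat for f64 {
    const BITS: u32 = 64;
    fn to_u64_bits(self) -> u64 {
        self.to_bits()
    }
}

/// Generic over the two sealed float types only.
fn left_part<T: RealFloat>(value: T, right_bits: u32) -> u64 {
    value.to_u64_bits() >> right_bits
}
```

Supporting f16 or bf16 later would then be a deliberate change inside the crate rather than something a downstream crate could do by accident.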

a10y · 2024-10-01T20:17:41Z

encodings/alp/src/alp_rd/mod.rs

+    }
+}
+
+// Only applies for F64.


old comment. replace with real doc comment

robert3005 · 2024-10-02T12:46:46Z

one note - don't bother with accessors, we decided that we likely need to change them

a10y · 2024-10-02T13:27:52Z

encodings/alp/Cargo.toml

@@ -17,6 +17,8 @@ readme = { workspace = true }
 workspace = true

 [dependencies]
+fastlanes = { workspace = true }


robert3005

some small changes

bench-vortex/src/lib.rs

vortex-dtype/src/dtype.rs

vortex-sampling-compressor/src/compressors/alp_rd.rs

encodings/alp/src/alp_rd/compute/slice.rs

encodings/alp/src/alp_rd/compute/filter.rs

encodings/alp/src/alp_rd/array.rs

lwwmanning · 2024-10-02T16:16:05Z

🥳

a10y force-pushed the aduffy/alp-rd branch from faae112 to 65c64ee Compare September 29, 2024 18:11

a10y force-pushed the aduffy/alp-rd branch 2 times, most recently from e293a50 to c51ecc4 Compare October 1, 2024 02:08

a10y added 9 commits October 1, 2024 15:46

beginnings of ALP-RD

42301b9

impl f64

9029c5a

impl ScalarAtFn

00280a6

make ALP-RD work for f32, more compute fns

abbc5eb

move ALP into separate module in same crate

f686600

f32/f64 tests for all compute fns

2d0cb2f

add benchmarks -> improve perf a lot

6aab7c4

hook up to SamplingCompressor

72bd8ab

update for SparseArray change

7cb597c

a10y force-pushed the aduffy/alp-rd branch from 281ea7b to 7cb597c Compare October 1, 2024 19:49

a10y marked this pull request as ready for review October 1, 2024 19:49

remove unused

6dfa5b3

a10y force-pushed the aduffy/alp-rd branch from 25488da to 6dfa5b3 Compare October 1, 2024 19:51

a10y changed the title ~~WIP: ALP-RD~~ feat: implement ALP-RD compression Oct 1, 2024

robert3005 approved these changes Oct 2, 2024

a10y added 2 commits October 2, 2024 10:37

cleanups

bc46516

fix bit-width calc

7ce1aad

a10y added 2 commits October 2, 2024 11:42

comments + some more

a732275

docs and some other cleanup

daccef7

a10y enabled auto-merge (squash) October 2, 2024 15:58

a10y merged commit 389e6a4 into develop Oct 2, 2024
5 checks passed

a10y deleted the aduffy/alp-rd branch October 2, 2024 16:07
