apacheGH-17682: [Format] Add Bool8 Canonical Extension Type (apache#43234)

### Rationale for this change

Closes: apache#17682

Arrow Boolean arrays store values as individual bits, which is a very compact representation but does not match the layout of many systems with which Arrow interoperates. Adding an 8-bit Boolean extension type improves zero-copy compatibility with those systems at the cost of a larger physical representation.
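
To make the trade-off concrete, here is a minimal sketch in plain Python that contrasts the two layouts. The helper names `pack_bits` and `as_bytes` are hypothetical and exist only for this illustration; the sketch ignores buffer padding and validity bitmaps, and only assumes Arrow's least-significant-bit-first ordering for bit-packed booleans.

```python
def pack_bits(values):
    """Pack booleans into bytes, least-significant bit first (hypothetical helper)."""
    out = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def as_bytes(values):
    """Store one byte per boolean: 0 for false, 1 for true (hypothetical helper)."""
    return bytes(1 if v else 0 for v in values)


values = [True, False, True, True, False, False, False, True, True, False]
packed = pack_bits(values)      # bit-packed layout: 2 bytes for 10 values
unpacked = as_bytes(values)     # byte-per-value layout: 10 bytes for 10 values
```

The byte-per-value buffer is several times larger, but a system that already stores booleans one per byte can share it without repacking.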

Go implementation: apache#43323
C++ / Python implementation: apache#43488

### What changes are included in this PR?

Proposal and documentation for `Bool8` canonical extension type.

### Are these changes tested?

N/A

### Are there any user-facing changes?

N/A

* GitHub Issue: apache#17682

Lead-authored-by: Joel Lubinitsky <[email protected]>
Co-authored-by: Joel Lubinitsky <[email protected]>
Co-authored-by: Felipe Oliveira Carvalho <[email protected]>
Signed-off-by: Joel Lubinitsky <[email protected]>
joellubi and felipecrv authored Aug 8, 2024
1 parent ee3273e commit c537700
Showing 1 changed file with 22 additions and 0 deletions.
22 changes: 22 additions & 0 deletions docs/source/format/CanonicalExtensions.rst
@@ -393,6 +393,28 @@ Examples:

{"type_name": "OTHER", "vendor_name": "JDBC driver name"}

8-bit Boolean
=============

Bool8 represents a boolean value using 1 byte (8 bits) to store each value instead of only 1 bit as in
the original Arrow Boolean type. Although less compact than the original representation, Bool8 may have
better zero-copy compatibility with various systems that also store booleans using 1 byte.

* Extension name: ``arrow.bool8``.

* The storage type of this extension is ``Int8`` where:

  * **false** is denoted by the value ``0``.
  * **true** can be specified using any non-zero value. Preferably ``1``.

* Extension type parameters:

  This type does not have any parameters.

* Description of the serialization:

  Metadata is an empty string.
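
The storage rules above can be sketched in plain Python. This is an illustration only, not part of any Arrow implementation: `decode_bool8` and `encode_bool8` are hypothetical names, the standard-library `array` type with typecode ``"b"`` stands in for an ``Int8`` buffer, and null handling is left out.

```python
from array import array


def decode_bool8(storage):
    """Interpret Int8 storage values as booleans: 0 is false, non-zero is true."""
    return [value != 0 for value in storage]


def encode_bool8(values):
    """Build Int8 storage from booleans, writing the preferred 1 for true."""
    return array("b", (1 if v else 0 for v in values))


# Any non-zero value, including negative ones, reads as true.
storage = array("b", [0, 1, -1, 127, 0])
decoded = decode_bool8(storage)
roundtrip = encode_bool8(decoded)
```

Decoding accepts every non-zero value, while encoding always writes ``1``, so a round trip normalizes the storage without changing its logical content.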

=========================
Community Extension Types
=========================