Vector is a collection of efficient Int-indexed array implementations:
boxed, unboxed, storable, and primitive vectors
(all can be mutable or immutable). The package features a generic API,
polymorphic in vector type, and implements stream fusion,
a powerful optimisation framework that can help eliminate intermediate data structures.
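To illustrate what "polymorphic in vector type" means in practice, here is a minimal sketch that assumes the generic interface is imported from Data.Vector.Generic. The function name mean is hypothetical and chosen only for this example; the same definition is applied to a boxed and to an unboxed vector.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import qualified Data.Vector as V
import qualified Data.Vector.Generic as G
import qualified Data.Vector.Unboxed as U

-- Hypothetical helper: works for any vector type v that can hold Double.
mean :: G.Vector v Double => v Double -> Double
mean xs = G.sum xs / fromIntegral (G.length xs)

main :: IO ()
main = do
  print (mean (V.fromList [1, 2, 3, 4]))  -- boxed vector
  print (mean (U.fromList [1, 2, 3, 4]))  -- unboxed vector
```

Both calls print 2.5; only the concrete vector type differs.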
A beginner-friendly tutorial for vectors can be found on MMHaskell.
If you have already started your adventure with vectors, the tutorial on the Haskell Wiki covers more ground.
Arrays are data structures that store many elements and allow constant-time access to each of them. However, they are often seen as legacy constructs that are rarely used in modern Haskell. Even though Haskell has a built-in Data.Array module, arrays can be overwhelming to use because of their complex API. Vectors, by contrast, combine the array’s O(1) element access with the much friendlier API of lists. Because they also support optimisation via stream fusion, vectors remain efficient while keeping a rich interface. Unless you’re confident with arrays, it’s advisable to use vectors when looking for similar functionality.
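As a small illustration of the list-like API combined with O(1) indexing, the following sketch uses boxed vectors from Data.Vector. The name squares is made up for the example.

```haskell
import qualified Data.Vector as V

-- Build a vector from a list and transform it with a list-like function.
squares :: V.Vector Int
squares = V.map (\x -> x * x) (V.fromList [1 .. 5])

main :: IO ()
main = do
  print (squares V.! 2)                     -- O(1) indexing; prints 9
  print (squares V.!? 10)                   -- safe indexing; prints Nothing
  print (V.toList (V.filter even squares))  -- prints [4,16]
```

The qualified import is a common convention here, since many vector functions share names with Prelude list functions.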
Lazy boxed vectors (Data.Vector) store each of their elements as a
pointer to a heap-allocated value. Because of indirection, lazy boxed vectors
are slower in comparison to unboxed vectors.
Strict boxed vectors (Data.Vector.Strict) contain elements that are
strictly evaluated.
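The difference between lazy and strict elements can be seen in a short sketch. It uses only the lazy Data.Vector module; the name lazyVec is hypothetical. A strict boxed vector would be expected to force every element on construction, so the same list would presumably raise an error there.

```haskell
import qualified Data.Vector as V

-- The middle element is an unevaluated thunk that would fail if forced.
lazyVec :: V.Vector Int
lazyVec = V.fromList [1, undefined, 3]

main :: IO ()
main = do
  print (V.length lazyVec)  -- prints 3: the length does not depend on element values
  print (lazyVec V.! 2)     -- prints 3: only the requested element is evaluated
```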
Unboxed vectors (Data.Vector.Unboxed) determine an array's representation
from its elements' type. For example, a vector of a primitive type (e.g. Int) will be
backed by a primitive array, while a vector of a product type will be backed by a structure of arrays.
They are quite efficient due to the unboxed representation they use.
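The structure-of-arrays representation for product types can be sketched with a vector of pairs. The name points and its contents are invented for the example; splitting the vector with unzip yields one vector per component.

```haskell
import qualified Data.Vector.Unboxed as U

-- A vector of pairs; each component is kept in its own unboxed array.
points :: U.Vector (Int, Double)
points = U.fromList [(1, 0.5), (2, 1.5), (3, 2.5)]

main :: IO ()
main = do
  let (ids, weights) = U.unzip points  -- split into the component vectors
  print (U.sum ids)                    -- prints 6
  print (U.sum weights)                -- prints 4.5
```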
Storable vectors (Data.Vector.Storable) are backed by pinned memory, i.e.,
they cannot be moved by the garbage collector. Their primary use case is C FFI.
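A rough sketch of how a storable vector exposes its memory: unsafeWith passes a raw pointer to an IO action, which is where a foreign C function would normally be called. To keep the example self-contained, no C function is imported; the pointer is read back with peekElemOff instead. The names samples and readViaPointer are hypothetical.

```haskell
import qualified Data.Vector.Storable as VS
import Foreign.C.Types (CInt)
import Foreign.Storable (peekElemOff)

samples :: VS.Vector CInt
samples = VS.fromList [10, 20, 30]

-- Read one element through the raw pointer, as C code receiving it might.
readViaPointer :: Int -> IO CInt
readViaPointer i = VS.unsafeWith samples (\ptr -> peekElemOff ptr i)

main :: IO ()
main = readViaPointer 1 >>= print  -- prints 20
```

The pointer should only be used inside the callback, and the memory should not be modified through it, because the vector is immutable.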
Primitive vectors (Data.Vector.Primitive) are backed by a simple byte array and
can store only data types that are represented in memory as a sequence of bytes without
a pointer, i.e., types that belong to the Prim type class, such as Int, Double, etc.
It's advised to use unboxed vectors if you're looking for the performance of primitive vectors
but with more versatility.
An optimisation framework used by vectors, stream fusion is a technique that merges
several functions into one and prevents the creation of intermediate data structures. For example,
the expression sum . filter g . map f won't allocate temporary vectors if
compiled with optimisations.
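A concrete instance of such a pipeline might look like the sketch below, where the function name sumOfOddSquares is made up for illustration. Whether the intermediate vectors are actually eliminated depends on the compiler and its flags (for instance, building with GHC's -O or -O2), so treat this as an illustration rather than a guarantee.

```haskell
import qualified Data.Vector.Unboxed as U

-- sum . filter odd . map square over 1..n, written as one pipeline.
sumOfOddSquares :: Int -> Int
sumOfOddSquares n =
  U.sum (U.filter odd (U.map (\x -> x * x) (U.enumFromTo 1 n)))

main :: IO ()
main = print (sumOfOddSquares 10)  -- prints 165
```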