brainandforce · brainandforce · Nov 18, 2023 · Nov 18, 2023 · Nov 18, 2023 · Nov 18, 2023
diff --git a/docs/make.jl b/docs/make.jl
@@ -15,7 +15,9 @@ makedocs(
     warnonly = is_ci_env,
     pages = [
         "Home" => "index.md",
-        "Types" => "types.md",
+        "Lattices" => "lattices.md",
+        "Atoms and crystals" => "atoms.md",
+        "Data grids" => "grids.md",
         "File formats" => "filetypes.md",
         "API" => Any[
             "Lattices" => "api/lattices.md"

diff --git a/docs/src/atoms.md b/docs/src/atoms.md
@@ -0,0 +1,109 @@
+# Atoms
+
+Once a lattice is defined, we can include other data, and atomic coordinates are one of the most
+common datasets encountered.
+
+## `NamedAtom`
+
+A `NamedAtom` contains two pieces of information: an atom name and the associated atomic number.
+The atom name is a string no longer than 15 codepoints (internally, we use 
+`InlineStrings.InlineString15` to store this data), and the atomic number is an `Int`.
+
+### Constructors
+
+The default constructor for `NamedAtom` is `NamedAtom(::AbstractString, ::Integer)`. The atom name
+can be inferred and specified automatically as the associated symbol if only a number is given.
+
+In the case of only a string being provided, the atomic number may be inferred. This is done by
+matching the first letters up to a non-letter character of the string with a known symbol. If no
+match is found, the atomic number will be 0, corresponding to a dummy atom (see below).
+
+!!! note
+    Support for inferring from symbols representing unnamed elements or atomic numbers of those
+    elements (for instance, element 119, Uue) is not yet implemented.
+
+### Sorting
+
+`NamedAtom` types are sorted by atomic number first, then by name.
+
+### Dummy atoms
+
+Sometimes, it is useful to include _dummy atoms_: sites which do not correspond to a real atomic
+position, but are useful to track in certain circumstances. Dummy atoms have atomic number zero.
+
+If no string is provided, `NamedAtom(0)` returns `NamedAtom("dummy", 0)`, as does any argument not
+between 1 and 118.
+
+## Atomic positions
+
+### Atom position data
+
+Atomic positions are given by a `CartesianAtomPosition{D}` for Cartesian coordinates, or a
+`FractionalAtomPosition{D}` for fractional coordinates with respect to a lattice. This information
+consists of a `NamedAtom`, a coordinate in real space as an `SVector{D,Float64}`, and optional
+occupancy data for partially occupied sites - if not specified, it defaults to 1 (fully occupied).
+
+### Lists of atomic positions
+
+The `AbstractAtomList{D}` type represents a list of atoms.
+
+`AtomList{D}` is a wrapper for `Vector{CartesianAtomPosition{D}}`, and represents atomic positions
+in an arbitrary `D`-dimensional space.
+
+`PeriodicAtomList{D}` wraps `Vector{FractionalAtomPosition{D}}`, but also includes the basis vectors
+of the lattice associated with fractional coordinates.
+
+The two types can be interconverted with the appropriate constructors. An `AtomList` can be
+constructed from a `PeriodicAtomList` directly, but for the reverse conversion, the basis vectors of
+the new `PeriodicAtomList` must be specified.
+
+### Sorting
+
+Structures often have many identical `NamedAtom` instances, so sorting of `FractionalAtomPosition`
+and `CartesianAtomPosition` is accomplished by whichever has the lowest initial coordinate, moving
+to the next coordinate in the case of ties.
+
+### Constructing supercells
+
+In many cases, it is useful to construct a supercell from a unit cell. This can be done to convert
+a primitive cell to a conventional representation, or to a cell of a larger size or different shape.
+
+The `supercell(::PeriodicAtomList, ::AbstractMatrix{<:Integer})` function takes an
+`AbstractMatrix{<:Integer}`, which transforms the lattice basis vectors by right multiplication, and
+fills the new lattice with copies of atoms from the original cell. Scalar arguments (calling
+`supercell(::PeriodicAtomList, ::Integer)`) scale each direction by the identity matrix multiplied
+by the scalar, and vector arguments (calling `supercell(::PeriodicAtomList, ::Integer)`) convert the
+vector to a diagonal matrix.
+
+Internally, this is done using the [Smith normal form][1] of the transformation matrix, using the
+diagonal factors to extend the lattice the correct number of times along specific dimensions, and
+moving them into the cell with the help of the unimodular factors associated with the Smith normal
+form.
+
+```@docs; canonical=false
+Electrum.supercell
+```
+## Crystals
+
+The `Crystal{D}` represents a generating set of atomic positions for a periodic structure. Along
+with a `PeriodicAtomList`, a `Crystal{D}` contains the space group and origin setting associated
+with the data, as well as an integer matrix transformation describing the unit cell generated by
+the given cell. The `set_transform!` function allows for the transformation matrix to be altered 
+after the creation of a `Crystal{D}` object.
+
+This data can be converted to a `PeriodicAtomList{D}` with a constructor or `Base.convert` call, and
+this uses the `supercell` function to construct the full list of atoms from the generating set.
+
+!!! warning
+    At the moment, the use of space group data to generate atomic sites is not yet implemented.
+    A space group number and setting may be provided, but this will not be used to generate any
+    atomic sites. 
+
+Additionally, we provide the `CrystalWithDatasets{D,K,V}`, which associates a `Crystal{D}` with a
+`Dict{K,V}` containing datasets associated with the structure. This data structure may be indexed
+like the underlying dictionary.
+
+!!! note
+    `CrystalWithDatasets{D,K,V}` will likely be changed significantly or removed in Electrum 0.2.
+
+[1]: https://en.wikipedia.org/wiki/Smith_normal_form
diff --git a/docs/src/grids.md b/docs/src/grids.md
@@ -0,0 +1,68 @@
+# Data grids
+
+The output of many quantum chemistry packages are datasets defined on a spatial grid in real or
+reciprocal space. Theoretical tools may want to perform mathematical operations on this data, but
+operating on bare arrays containing the dta requires tracking properties associated with each array,
+such as the basis vectors of the lattice
+
+## `DataGrid`, `RealDataGrid`, and `ReciprocalDataGrid`
+
+An `DataGrid{D,B<:Electrum.LatticeBasis,S<:AbstractVector{<:Real},T}` contains data defined in a
+crystal lattice of `D` with basis vectors of type `B`, a shift parameter of type `S`, and elements
+of type `T`, either in real space (`RealDataGrid`) or in reciprocal space (`ReciprocalDataGrid`).
+These aliases are defined as follows:
+
+```julia
+const RealDataGrid{D,T} = DataGrid{D,RealBasis{D,Float64},SVector{D,Float64},T}
+const ReciprocalDataGrid{D,T} = DataGrid{D,ReciprocalBasis{D,Float64},KPoint{D},T}
+```
+!!! note
+    Look closely at the above definition: the shift data type for `RealDataGrid{D}` is
+    `SVector{D,Float64}`, but the shift data type for `ReciprocalDataGrid{D}` is `KPoint{D}`. When
+    the `DataGrid` constructor without the shift type parameter is invoked, the basis type is used
+    to infer the appropriate shift type so that a `RealDataGrid` or `ReciprocalDataGrid` is
+    constructed.
+
+`DataGrid` uses zero-based, periodic indexing: the first index of an `AbstractDataGrid{D}` is
+`zero(NTuple{D,Int})`, and indices whose moduli with respect to size along that dimension are
+identical will reference the same element: for instance, for `g::AbstractDataGrid{3}` with size
+`(10, 10, 10)`, `g[69, 420, 1337] === g[9, 0, 7]`. Encountering a `BoundsError` is not possible
+when indexing an `DataGrid`.
+
+The basis of an `DataGrid` can be recovered with `basis(::DataGrid)`, which will be of the type
+specified by the type parameter. 
+
+## Broadcasting and mathematical operations
+
+Broadcasting is defined for `DataGrid` with a custom `Base.Broadcast.BroadcastStyle` subtype:
+```julia
+Electrum.DataGridStyle{D,B,S} <: Broadcast.AbstractArrayStyle{D}
+```
+This allows `DataGrid` instances to operated on with dot syntax. However, they must share lattice
+basis vectors and shift values. If they do not match, an `Electrum.LatticeMismatch` exception will
+be thrown.
+
+!!! info
+    Although `Base.Broadcast.ArrayStyle` is usually overridden by other subtypes of 
+    `Base.Broadcast.AbstractArrayStyle`, it does not override `Electrum.DataGridStyle`. Adding a
+    `DataGrid` to an `Array` returns an `Array`, and adding a `DataGrid` to other `AbstractArray`
+    subtypes returns the `AbstractArray` subtype defined by the `Broadcast.BroadcastStyle`. In the 
+    case of a dimension mismatch, the broadcast style wll be `Broadcast.ArrayConflict` - the
+    operation will throw a `DimensionMismatch`.
+
+The `+` and `-` operators are defined for `DataGrid` instances, and they are faster than the
+broadcasted `.+` and `.-` equivalents. As with the broadcasted versions, checks are implemented to
+ensure that the lattice basis vectors and shifts match.
+
+Similarly, the `*`, `/`, and `\` operators are defined for pairs of `DataGrid` and `Number`
+instances, and again, are faster than their broadcasted equivalents.
+
+### Fourier transforms
+
+The Fourier transform and its inverse are available through an overload of `FFTW.fft()` and
+`FFTW.ifft()`. The transforms are normalized with respect to the basis vectors of the space, so
+for `g::DataGrid`, `ifft(fft(g)) ≈ g` (to within floating point error).
+
+`fft` and `ifft` defined for `DataGrid` generally, but it is critical to note that in the vast
+majority of cases, you will want to call `fft` on `RealDataGrid` instances and `ifft` on
+`ReciprocalDataGrid` instances.
diff --git a/docs/src/lattices.md b/docs/src/lattices.md
@@ -0,0 +1,142 @@
+# Lattices
+
+The starting point for solid state structures is a periodic lattice. Electrum provides convenient
+types for working with lattices of arbitrary dimension
+
+## Real and reciprocal space traits
+
+The `Electrum.BySpace{D}` supertype contains two types, `Electrum.ByRealSpace{D}` and
+`Electrum.ByReciprocalSpace{D}`, where `D` is the number of dimensions associated with the dataset.
+These types are used to denote whether data is associated with real space (e.g. electron density) or 
+reciprocal space (e.g. the Fourier transform of the electron density). When working with lattices,
+it is important to distinguish the two types of lattice: this is the primary reason why bare
+`SMatrix{D,D,T}` instances are not used in this package.
+
+## `Electrum.LatticeBasis` and methods
+
+The `Electrum.LatticeBasis{S<:Electrum.BySpace,D,T}` data type is a wrapper for an
+`SMatrix{D,D,T,D^2}` which represents the real or reciprocal space basis vectors of a lattice.
+
+Electrum does not export `Electrum.LatticeBasis`, but instead provide the following aliases. This
+allows developers to alter the implementation of `Electrum.LatticeBasis` without breaking the API:
+```julia
+const RealBasis = Electrum.LatticeBasis{ByRealSpace}
+const ReciprocalBasis = Electrum.LatticeBasis{ByReciprocalSpace}
+const AbstractBasis = Electrum.LatticeBasis{<:BySpace}
+```
+!!! warning
+    Note that `RealBasis{D} === Electrum.LatticeBasis{ByRealSpace,D}`, and not
+    `Electrum.LatticeBasis{ByRealSpace{D},D}`. While the latter is a valid type and can be used to
+    store data, the result of `Electrum.DataSpace(::Electrum.LatticeBasis{ByRealSpace{D},D})` is
+    an error!
+
+The units of `RealBasis` are bohr, and those of `ReciprocalBasis` are radians over bohr,
+corresponding with the convention that the dot product of a real basis vector with a corresponding
+reciprocal basis vector is 2π.
+
+### Construction
+
+A `RealBasis` or `ReciprocalBasis` can be constructed from an `AbstractMatrix`, `Tuple`, or iterator
+of the correct size.
+
+In the case of a `StaticMatrix`, the size and element type are already known, so the constructors
+can simply be called as `RealBasis` or `ReciprocalBasis`:
+```
+julia> RealBasis(SMatrix{3,3}(1, 0, 0, 0, 2, 0, 0, 0, 3))
+Electrum.LatticeBasis{Electrum.ByRealSpace, 3, Int64}:
+    a: [  1.000000   0.000000   0.000000 ]   (1.000000 bohr)
+    b: [  0.000000   2.000000   0.000000 ]   (2.000000 bohr)
+    c: [  0.000000   0.000000   3.000000 ]   (3.000000 bohr)
+```
+However, in the case of a `Matrix` or other dynamically sized data, the dimension is not known and
+should be supplied to avoid an exception being thrown. This is to avoid type instability arising
+from determining the size at runtime:
+```
+julia> RealBasis{3}([1 0 0; 0 2 0; 0 0 3])
+Electrum.LatticeBasis{Electrum.ByRealSpace, 3, Int64}:
+    a: [  1.000000   0.000000   0.000000 ]   (1.000000 bohr)
+    b: [  0.000000   2.000000   0.000000 ]   (2.000000 bohr)
+    c: [  0.000000   0.000000   3.000000 ]   (3.000000 bohr)
+```
+In either case, you can supply the element type (though it must be proceeded by the dimension in all
+cases) to convert the input elements to the desired type (here, it's `Float32`):
+```
+julia> RealBasis{3,Float32}(SMatrix{3,3}(1, 0, 0, 0, 2, 0, 0, 0, 3))
+Electrum.LatticeBasis{Electrum.ByRealSpace, 3, Float32}:
+    a: [  1.000000   0.000000   0.000000 ]   (1.000000 bohr)
+    b: [  0.000000   2.000000   0.000000 ]   (2.000000 bohr)
+    c: [  0.000000   0.000000   3.000000 ]   (3.000000 bohr)
+```
+### Conversion
+
+A `RealBasis` can be converted to a `ReciprocalBasis` via `Base.convert` or their constructors,
+and vice versa:
+```julia
+b = ReciprocalBasis(a)
+c = convert(RealBasis, b)
+```
+!!! tip
+    Avoid needless back-and-forth conversions between `RealBasis` and `ReciprocalBasis` to avoid
+    numerical instabilities.
+
+### Mathematical operations
+
+The `RealBasis` and `ReciprocalBasis` types support the majority of common operations used in 
+solid-state chemistry, including addition, subtraction, multiplication, left division, and right
+division. 
+
+Importantly, it supports the QR decomposition provided by `LinearAlgebra.qr`. This decomposition is
+useful in that it generates a `Q`` factor, which is an orthogonal matrix (representing a Euclidean
+point isometry - compositions of rotations and reflections) and an ``R`` factor, which is an upper
+triangular matrix. This operation is useful in converting lattices to a standard orientation: in the
+case of a QR decomposition, the $R$ factor places the first basis vector of the lattice along the
+first basis vector of space. In 3D, this means that ``\vec{a}`` is collinear with ``\vec{x}``.
+
+Calling `LinearAlgebra.qr(::AbstractBasis{D,T})` returns a
+`StaticArrays.QR{SMatrix{D,D,T,D^2}, SMatrix{D,D,T,D^2}, SVector{D,Int}}`, so the ``Q`` and ``R``
+factors are bare ``SMatrix`` instances. While this is fine for the ``Q`` matrix, the ``R`` matrix
+represents a basis, and we expect an `AbstractBasis` return value. The `triangularize` function
+returns the ``R`` factor of a QR decomposition as an `AbstractBasis`, discarding the ``Q`` factor.
+
+```@docs; canonical=false
+Electrum.triangularize
+```
+!!! note
+    Support for the corresponding LQ decomposition (where ``L`` is a lower triangular matrix) is not
+    yet implemented.
+
+### Implementation details
+
+The `SMatrix{D1,D2,T,L}` type requires four type parameters - `D1` and `D2` are each of the matrix
+dimensions, and `T` is the element type of the matrix. However, one last parameter is needed: `L`, 
+the length of the `NTuple` that backs the `SMatrix`.
+
+Julia currently does not allow for the calculation of type parameters from other type parameters,
+which poses a problem in the declaration of structs. Technically, a type like `SMatrix{3,3,Float64}`
+is an abstract type, as the `L` parameter is undeclared, even though the value of `L` is always
+`D1 * D2`. By declaring a struct to have this type, there seems to be a performance drop.
+
+The `LatticeBasis` types wrap an `SVector{D,SVector{D,Float64}}`. This only requires a single 
+type parameter `D`, and allows for fully concrete struct declarations. The `:matrix` property
+converts this data to an `SMatrix{D,D,T,D^2}` instance, and methods needing to access something that
+looks like a matrix reference this property.
+
+## Basis vectors in composite types
+
+Some types (such as `Electrum.DataGrid`) store a set of basis vectors as part of the dataset. The
+`basis` function allows a user to access that set of basis vectors. By default, `basis(x)` returns
+`x.basis`, so defining a field `basis::RealBasis{...}` or `basis::ReciprocalBasis{...}` implements
+these functions automatically.
+
+!!! warning
+    `basis(x)` returns an `AbstractBasis`, but there is no guarantee whether the return type is
+    `RealBasis` or `ReciprocalBasis`. Use `convert(::Type{<:AbstractBasis}, basis(x))` to ensure
+    that the return type is what you expect.
+
+The type of basis vectors stored also allows for inference of the data space trait, as the default
+definition of `DataSpace(::Type{T})` is `fieldtype(T, :basis)`. We encourage you to choose your
+basis vector type to match the data you wish to represent.
+
+!!! note
+    For performance reasons, we encourage you to define struct fields with concrete types, and use 
+    type parameters in your struct definitions to ensure that the type is concrete.