labels: experimental
- learn a singular basis over the data (e.g. via SVD)
- treat the singular values (or their squares, the covariance eigenvalues up to scaling) as relative sampling weights, sample one basis dimension/component each batch
- sample a batch of data, using each datum's value in the sampled component as its relative likelihood (threshold this)
- incorporate a temperature term so we sample disproportionately from the important components in early training, then progressively increase the weight of the tail probabilities
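
A rough sketch of the steps above, to make the idea concrete. Everything here is an assumption rather than a settled design: the function names (`fit_basis`, `component_probs`, `sample_batch`, `temperature_at`) are hypothetical, the basis comes from a plain batch SVD of the centered data, component weights are tempered as `p_k ∝ s_k ** (1 / T)`, "value in the sampled component" is read as the absolute projection, "threshold this" is read as zeroing scores below a quantile, and the temperature schedule is linear from a low to a high value.

```python
import numpy as np


def fit_basis(X):
    """Center X and return (mean, singular values, right singular vectors)."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, s, vt


def component_probs(s, temperature):
    """Tempered component weights: p_k proportional to s_k ** (1 / temperature).

    Low temperature concentrates mass on the leading components;
    high temperature flattens the distribution toward the tail.
    """
    logits = np.log(s + 1e-12) / temperature
    logits -= logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()


def temperature_at(step, total_steps, t_start=0.25, t_end=4.0):
    """Assumed linear schedule: start sharp, end flat."""
    frac = min(step / max(total_steps, 1), 1.0)
    return t_start + frac * (t_end - t_start)


def sample_batch(X, mean, s, vt, batch_size, temperature, threshold_q, rng):
    """Pick one component, then sample data weighted by |projection| on it."""
    k = rng.choice(len(s), p=component_probs(s, temperature))
    scores = np.abs((X - mean) @ vt[k])
    # Assumed reading of "threshold this": drop data below a score quantile.
    cutoff = np.quantile(scores, threshold_q)
    weights = np.where(scores >= cutoff, scores, 0.0)
    total = weights.sum()
    # Fall back to uniform sampling if every score is zero.
    p = weights / total if total > 0 else np.full(len(X), 1.0 / len(X))
    # Sampling with replacement, so small surviving sets cannot fail.
    idx = rng.choice(len(X), size=batch_size, replace=True, p=p)
    return k, idx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 8))
    mean, s, vt = fit_basis(X)
    for step in range(0, 100, 25):
        T = temperature_at(step, 100)
        k, idx = sample_batch(X, mean, s, vt, 32, T, 0.5, rng)
        print(step, round(T, 2), k, idx[:5])
```

This version recomputes nothing between batches and needs the full dataset in memory for the SVD, so it does not address the online question below.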
can we learn this online?