Roadmap #7
Comments
Great ideas, Matt!
Julia could definitely use a perfect-hashing "dictionary" type. If anyone tackles this, please make it a standalone package rather than burying it in some other package. There would be many users.
+1
FWIW in biology it is not uncommon to talk about categories of thousands of species, tens of thousands of gene families. In medicine O(10^5) identifiers for diagnoses, similar for patients.
See also work on new packages inspired by AxisArrays: JuliaCollections/AxisArraysFuture#1
We're getting to the point where the core API is stabilizing. We can now start making things fancier and building out interfaces to base methods and other packages (like #5). Here's a current brain-dump of some of my thoughts. Additions, critiques and comments are very welcome.
Remaining core infrastructure
- `setindex!` (mirroring `getindex`'s capabilities). (WIP: adds AxisArray setindex! methods #11)
- `Axis` types to specify the dimension names. It would be an interesting experiment to store the dimension names as `Axis` types instead of symbols. I'm not sure if that'd make things simpler or more convoluted. It may be a mixed bag.

Possible additions to the core infrastructure
- Add a third flavor of axis trait for Dimensional axes with elements of a discrete step-like type. The key defining characteristic of this element type is that its StepRanges must enumerate all values between the endpoints, allowing us to provide sensible indexing directly with a StepRange. This also means that there are no issues along the lines of floating-point instability, so we can also allow indexing directly with single elements of this type (and don't need to force the use of Intervals). The main use case I see here is for `Date`. Are there other types that satisfy these criteria? What is a sensible name for this trait?
- `Axis` type (Move more logic into Axis type? #15). Ensuring that large, non-Range Dimensional axes are monotonically increasing can take a long time (we may be able to speed this up some with a special Ordering type, but it's still O(n)). Similarly, ensuring that elements in categorical vectors are unique requires hashing all elements (which could be used for the above hashmap).
- An `eachslice` iterator would be nice and useful in and of itself, and if we allow the same sort of syntax and semantics as `mapslices`, it can serve as the building block for augmenting that Base function.
- `Signals.window`. I was thinking of allowing windowing as an indexing operation (Interval types mbauman/Signals.jl#10), but constructing vectors of interval types with deferred promotion (to allow, e.g., windows specified in time about integer indices) has been very challenging.

Extensions to Base
- `sum`, `mean`, `maximum`, `mapslices`, etc. We can also return AxisArrays with the properly reduced axis set, dropping a dimension and eliminating the type-unstable squeezes.

Extensions to other packages
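For the reductions listed under "Extensions to Base" above, here is a minimal sketch of returning the reduced data together with the surviving dimension name. `sum_over` and the tuple-of-symbols representation are hypothetical stand-ins, not AxisArrays API, and the code assumes Julia 1.x (the `dims` keyword, and `dropdims` in place of the older `squeeze`). It only illustrates the bookkeeping and makes no attempt at type stability.

```julia
# Hypothetical helper, not part of AxisArrays: sum a matrix over the dimension
# called `name` and return the reduced data plus the remaining dimension name.
function sum_over(data::AbstractMatrix, names::NTuple{2,Symbol}, name::Symbol)
    d = findfirst(==(name), names)
    d === nothing && throw(ArgumentError("no dimension named $name"))
    # `sum(...; dims=d)` keeps a singleton dimension; `dropdims` removes it.
    reduced = dropdims(sum(data; dims=d); dims=d)
    return reduced, (names[3 - d],)
end

data = [1 2; 3 4]
summed, remaining = sum_over(data, (:time, :channel), :time)
# `summed` is a Vector along :channel; the :time dimension is gone.
```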
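As a sketch of the `Date` axis idea from "Possible additions to the core infrastructure": a `StepRange` of `Date`s with a one-day step enumerates every value between its endpoints, so both range and single-element lookups are exact, unlike lookups on floating-point axes. The variable names are illustrative, and the lookups use plain Base functions rather than any AxisArrays method.

```julia
using Dates

# Illustrative axis: one value per day in January 2015.
axis = Date(2015, 1, 1):Day(1):Date(2015, 1, 31)
data = collect(1:length(axis))   # stand-in data, one entry per axis value

# Indexing with a StepRange of Dates: every Date in the window is on the axis.
window = Date(2015, 1, 10):Day(1):Date(2015, 1, 12)
positions = findall(in(window), axis)
selected = data[positions]

# Indexing with a single Date is an exact comparison.
single = findfirst(==(Date(2015, 1, 20)), axis)

# Floating-point values offer no such guarantee, which is why Intervals are
# needed for those axes.
float_mismatch = 0.1 + 0.2 != 0.3
```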