[WIP] First DataStreams version #13
I think this also comes close to the comment at https://github.com/FugroRoames/PointClouds.jl:
julia> s = LasIO.Source("test/srs.las")
julia> d = Data.stream!(s, DataFrame)
julia> d = Data.close!(d)
julia> d[:intensity]
10-element Array{UInt16,1}:
0x0104
0x0118
0x0118
0x0118
0x0104
0x00f0
0x00f0
0x0118
0x0118
0x0104
Not sure about this clashing with #16; these are two separate approaches in my opinion.
What do you mean by two separate approaches? As in, we should have one or the other? I thought we could have both, right? In any case it might be good to try to get LAS 1.3 and 1.4 support in first, as it is becoming increasingly common.
Here is a WIP for implementing DataStreams. Still rough around the edges, but I'd like some feedback for the overall direction/API.
What
DataStreams seems to be the way to go in the Julia data ecosystem. It enables streaming conversions between, for example, CSV and SQLite, or DataFrames and DataTables. With it, users can easily read LAS files into DataFrames and back, without needing to know any raw header/point information.
Why
This addresses most of the comments from @c42f in #4 for a new API and v0.1
LasIO.Source provides only the header, points on request with stream!, so no tuple (header, points) anymore
LasIO.Source works without FileIO
TODO
For using the Source:
stream!
For using the Sink
Discussion
For using the Sink we need some discussion. Writing currently only works with Source columns that match a LasPoint perfectly. Do we fill these gaps, and thus allow for an invalid point type? Also, the xyz coordinates will come in as floats, which we need to scale/offset to the integers stored in the file. Doing this afterwards is detrimental to performance. For float input I would propose:
LasIO.Sink(filename, bbox; precision=2, crs=nothing)
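To make the scale/offset step concrete, here is a rough sketch of the arithmetic such a Sink would have to do per coordinate. It assumes precision means the number of decimal places (so precision=2 gives a scale of 0.01) and that the offset is taken from the minimum corner of bbox. The helper names scale_for, quantize and dequantize and the example values are made up for illustration and are not part of LasIO.

```julia
# Hypothetical helpers, not part of LasIO.
# LAS stores each coordinate as an Int32:
#   stored = round((coord - offset) / scale)
#   coord  = stored * scale + offset

# Scale factor for a number of decimal places, e.g. precision=2 gives 0.01.
scale_for(precision::Integer) = 10.0^(-precision)

# Quantize a float coordinate to the Int32 kept in the point record.
# round(Int32, ...) throws an InexactError if the value does not fit,
# which is why a sensible offset matters.
quantize(coord::Float64, scale::Float64, offset::Float64) =
    round(Int32, (coord - offset) / scale)

# Recover the float coordinate from the stored integer.
dequantize(raw::Int32, scale::Float64, offset::Float64) = raw * scale + offset

# Assumed layout: offsets come from the minimum corner of the bounding box.
bbox_min = (150000.0, 420000.0, -5.0)
scale = scale_for(2)

x = 150123.456
raw = quantize(x, scale, bbox_min[1])          # 12346
x_back = dequantize(raw, scale, bbox_min[1])   # about 150123.46
```

Doing this while streaming means each float is touched once on its way into the point record, instead of materialising a float column first and converting it in a second pass.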
Further improvements can be made to the point types. Since these are hardly used by this implementation (only for looking up attributes and types for the Schema creation), we could explode some attributes, such as the flag_byte, into their individual components for better accessibility. I'm not sure what this would do to the performance though.
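As an illustration of what exploding the flag_byte could look like, the sketch below unpacks the bit layout used by LAS point formats 0 to 5 (return number in bits 0-2, number of returns in bits 3-5, scan direction flag in bit 6, edge of flight line in bit 7). The accessor names are hypothetical and LasIO may name or implement them differently. The newer LAS 1.4 point formats use a different layout that this does not cover.

```julia
# Hypothetical accessors for the flag byte of LAS point formats 0 to 5.
return_number(flag::UInt8)       = flag & 0x07          # bits 0-2
number_of_returns(flag::UInt8)   = (flag >> 3) & 0x07   # bits 3-5
scan_direction(flag::UInt8)      = (flag & 0x40) != 0   # bit 6
edge_of_flight_line(flag::UInt8) = (flag & 0x80) != 0   # bit 7

# Example: return 1 of 2, positive scan direction, not on a flight line edge.
flag = 0b01010001

rn   = return_number(flag)        # 0x01
nr   = number_of_returns(flag)    # 0x02
sd   = scan_direction(flag)       # true
edge = edge_of_flight_line(flag)  # false
```

Each accessor is a mask and at most one shift, so exposing these as separate columns should be cheap per point. The cost would more likely come from the extra columns in the Schema.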
Demo