Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Special condensed packing for point data #33

Open
mourner opened this issue Dec 10, 2020 · 4 comments
Open

Special condensed packing for point data #33

mourner opened this issue Dec 10, 2020 · 4 comments
Labels
enhancement New feature or request

Comments

@mourner
Copy link
Owner

mourner commented Dec 10, 2020

I wonder if it's possible to use twice less memory specifically for point data, where we know that maxX and maxY duplicate minX and minY. This could look like an option in the constructor (onlyPoints = false), and then adding special case handling for indexed leaves.

@mourner mourner added the enhancement New feature or request label Dec 10, 2020
@bjornharrtell
Copy link

This has also crossed my mind a few times. :)

@kylebarron
Copy link
Contributor

kylebarron commented Oct 9, 2023

Is there still interest in this? If it would be considered, I'd consider attempting a PR. My general thoughts would be:

  • Have a flag on the class for whether the class is point-only or not.
  • Change the computed nodeByteSize to be 2-per-node for point data
    const nodesByteSize = numNodes * 4 * this.ArrayType.BYTES_PER_ELEMENT;
  • Add an addPoint method that takes only x and y as arguments
  • Update add to throw if this.onlyPoints is true (or maybe only throw if minX !== maxX and minY !== maxY).
  • Updates to finish to ensure correct sorting for point data. (I know... but seems doable at a glance and left to explore fully in a pr)
  • Bump the serialized format version number
    const VERSION = 3; // serialized format version
    . We might also have to allocate an extra byte in the header so that there's a way to recollect whether the specified buffer only contains points or not (or if we assume the buffer is always valid input, I suppose you could check whether the length of the buffer matches what you'd expect with either 2-item or 4-item boxes?)

@kylebarron
Copy link
Contributor

(I know... but seems doable at a glance and left to explore fully in a pr)

I had some time on a flight, tried to implement this, and learned why drawing the rest of the owl is non-trivial 😉. It took me embarrassingly long to realize that only the leaf nodes contain a single x and y, because all intermediate nodes contain more than one point 🙈.

So it seems like the complexity is mostly in the structure of the boxes and the use of << 2 to navigate the tree. We wouldn't be able to use << 2 directly anymore, because the boxes at the beginning of the tree only take up half the length as expected.

It would increase complexity and might have a small performance regression for the non-point case, but it seems like we could work around this by having a subtraction offset (computed based on the number of items in the tree) and based on the tree level.

This is an interesting problem, and it would be really awesome to have a single tree structure that works equally well for points and non-points, so I think I'll still try to get a solution in the near future

@mourner
Copy link
Owner Author

mourner commented Oct 10, 2023

@kylebarron awesome, thanks for working on this! I attempted this in the past but run into the same issue — it demanded a significant increase in complexity and branching not just in indexing but search methods too, and at some point it seemed that it would be easier to just fork the project with a separate point-based version rather than implement it all in one class. But if you find an elegant way to incorporate this, I'd be happy to review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants