Optimize COUNT( DISTINCT ...) for strings (up to 9x faster) #8849

Merged 37 commits on Jan 29, 2024

Commits on Jan 16, 2024

  1. chkp

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    9c44d04
  2. chkp

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    6cb8bbe
  3. draft

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    9d662a7
  4. iter done

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    1744cb3
  5. short string test

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    e3b0568
  6. add test

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    12cf50c
  7. remove unused

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    4f9a3f0
  8. to_string directly

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    626b1cb
  9. rewrite evaluate

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    2e80cb7
  10. return Vec<String>

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    d2d1d6d
  11. fmt

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    ebb8726
  12. add more queries

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 16, 2024
    98a9cd1

Commits on Jan 17, 2024

  1. add group by query and rewrite evalute with state()

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 17, 2024
    07831fa
  2. move evaluate back

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 17, 2024
    62c8084
  3. upd test

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 17, 2024
    e3b65c8
  4. add row sort

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 17, 2024
    3f0e9a9

Commits on Jan 20, 2024

  1. 4bc483a
  2. 0475687
  3. Rework set to avoid copies

    alamb committed Jan 20, 2024
    a764e99
  4. Merge branch 'bytes-distinctcount' of github.com:jayzhan211/arrow-datafusion into bytes-distinctcount

    alamb committed Jan 20, 2024
    bde49c6
  5. Simplify offset construction

    alamb committed Jan 20, 2024
    a101b62
  6. fmt

    alamb committed Jan 20, 2024
    0f2fa02

Commits on Jan 21, 2024

  1. Improve comments

    alamb committed Jan 21, 2024
    489e130
  2. Improve comments

    alamb committed Jan 21, 2024
    c39988a

Commits on Jan 22, 2024

  1. add fuzz test

    Signed-off-by: jayzhan211 <[email protected]>
    jayzhan211 committed Jan 22, 2024
    0e33b12
  2. b3bcc68
  3. Merge branch 'bytes-distinctcount' of github.com:jayzhan211/arrow-datafusion into bytes-distinctcount

    alamb committed Jan 22, 2024
    d7efcf6
  4. refine fuzz test

    alamb committed Jan 22, 2024
    a80b39c
  5. Add tests for size accounting

    alamb committed Jan 22, 2024
    3e9289a
  6. Split into new module

    alamb committed Jan 22, 2024
    7b9d067

Commits on Jan 24, 2024

  1. d405744
  2. Remove use of Mutex

    alamb committed Jan 24, 2024
    3a6a066

Commits on Jan 25, 2024

  1. f177aed
  2. revert changes

    alamb committed Jan 25, 2024
    8640907

Commits on Jan 27, 2024

  1. 214ba5b

Commits on Jan 28, 2024

  1. 1e10b9c
  2. f5e268d