Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: prune row groups correctly for columns with the same name #3802

Merged
merged 3 commits into from
Apr 26, 2024

Conversation

evenyag
Copy link
Contributor

@evenyag evenyag commented Apr 25, 2024

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

This PR fixes an issue that the parquet reader might prune row groups unexpectedly if two different columns are using the same name. This is possible if we drop a column and then add a column with the same name.

This PR passes the metadata of the region to the RowGroupPruningStats so it can get the correct column id to read.

For example, if we have the following table:

CREATE TABLE test(i TIMESTAMP TIME INDEX, j INTEGER, k INTEGER NOT NULL);
INSERT INTO test(i, j, k) VALUES (1, 11, 5), (2, 12, 5);
SELECT FLUSH_TABLE('test');
ALTER TABLE test DROP COLUMN j;
ALTER TABLE test ADD COLUMN j INTEGER DEFAULT 0;
INSERT INTO test(i, j, k) VALUES (3, 0, 6);
INSERT INTO test VALUES (4, 7, 0);

Before this PR:

SELECT * FROM test order by i;
+-------------------------+---+---+
| i                       | k | j |
+-------------------------+---+---+
| 1970-01-01T00:00:00.001 | 5 | 0 |
| 1970-01-01T00:00:00.002 | 5 | 0 |
| 1970-01-01T00:00:00.003 | 6 | 0 |
| 1970-01-01T00:00:00.004 | 7 | 0 |
+-------------------------+---+---+

SELECT * FROM test WHERE j = 0 order by i;
+-------------------------+---+---+
| i                       | k | j |
+-------------------------+---+---+
| 1970-01-01T00:00:00.003 | 6 | 0 |
| 1970-01-01T00:00:00.004 | 7 | 0 |
+-------------------------+---+---+

After this PR:

SELECT * FROM test WHERE j = 0 order by i;
+-------------------------+---+---+
| i                       | k | j |
+-------------------------+---+---+
| 1970-01-01T00:00:00.001 | 5 | 0 |
| 1970-01-01T00:00:00.002 | 5 | 0 |
| 1970-01-01T00:00:00.003 | 6 | 0 |
| 1970-01-01T00:00:00.004 | 7 | 0 |
+-------------------------+---+---+

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR does not require documentation updates.

@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Apr 25, 2024
@evenyag evenyag marked this pull request as ready for review April 26, 2024 03:08
@evenyag evenyag requested review from v0y4g3r, waynexia and a team as code owners April 26, 2024 03:08
Copy link

codecov bot commented Apr 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.30%. Comparing base (2d0f493) to head (5f58c33).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3802      +/-   ##
==========================================
- Coverage   85.62%   85.30%   -0.32%     
==========================================
  Files         947      953       +6     
  Lines      161925   162886     +961     
==========================================
+ Hits       138647   138957     +310     
- Misses      23278    23929     +651     

Copy link
Collaborator

@fengjiachun fengjiachun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@waynexia waynexia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@waynexia waynexia added this pull request to the merge queue Apr 26, 2024
Merged via the queue into GreptimeTeam:main with commit 77fc1e6 Apr 26, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-not-required This change does not impact docs.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants