Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allows hex() to work on type of UUID #32170

Merged
merged 13 commits into from
Jan 22, 2022
Merged

Conversation

FrankChen021
Copy link
Contributor

@FrankChen021 FrankChen021 commented Dec 3, 2021

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added UUID data type support for functions hex, bin.

To simplify the conversion of UUID to string.

And lower(hex(uuid)) equals to replace(toString(uuid), '-', '')

And bin/hex share the same code abstraction, the change in this patch also allows bin() to work on type of UUID.

Signed-off-by: frank chen <[email protected]>
Signed-off-by: frank chen <[email protected]>
Signed-off-by: frank chen <[email protected]>
@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label Dec 3, 2021
@alesapin
Copy link
Member

alesapin commented Dec 3, 2021

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Dec 3, 2021

update

✅ Branch has been successfully updated

@kitaisreal kitaisreal self-assigned this Dec 3, 2021
@kitaisreal
Copy link
Collaborator

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Dec 3, 2021

update

✅ Branch has been successfully updated

@@ -11,3 +11,13 @@ with generateUUIDv4() as uuid,
identity(lower(hex(reverse(reinterpretAsString(uuid))))) as str,
reinterpretAsUUID(reverse(unhex(str))) as uuid2
select uuid = uuid2;

with generateUUIDv4() as uuid,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FrankChen021 better to add new test, without modifying old one and fix it. Please check CI report.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me check why it fails. The test passed before I opened the PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the answer. Original hex/bin functions don't output the leading zero for integers, so if each 64bit of UUID has leading zeros, they're also skipped. This might lead to test failure because UUID is generated randomly, and in that case, I think the generated UUID contains leading zeros.

For UUID's hex output, the leading zero should also be output.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FrankChen021 can't we use string implementation instead of trying to reuse integer implementations ?
Also new changes in tests should be moved to new tests, that way it is easier to track where tests start to fail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't use the string implementation. Because we want to get the result in big-endian order, while the string implementation still generates a string in little-endian order.

Copy link
Collaborator

@kitaisreal kitaisreal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move new test cases to new test file.

Signed-off-by: frank chen <[email protected]>
@FrankChen021
Copy link
Contributor Author

Following Fast test fails: 01053_window_view_proc_hop_to_now | FAIL

Following Stateless tests fails: 00993_system_parts_race_condition_drop_zookeeper | FLAKY

They seem have nothing to do with the changes in this PR.

@alexey-milovidov
Copy link
Member

But we still need green CI to do merge.
Please merge with upstream to fix the fast test.

@FrankChen021
Copy link
Contributor Author

@alexey-milovidov I understand it. But my question is even the current master branch does not pass all the CI checks, how could I ensure the checks on my branch will be passed after I merge the master into my branch?

@alexey-milovidov
Copy link
Member

alexey-milovidov commented Dec 13, 2021

10 days ago one test in the master branch was broken and quickly fixed.
While it is a terrible incident and we admit how miserable we are,
we expect that the master branch should be always green.

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Dec 13, 2021
@FrankChen021
Copy link
Contributor Author

@alexey-milovidov Some checks failed because they didn't successfully start. What should I do now?

@alexey-milovidov
Copy link
Member

@Mergifyio update

@mergify
Copy link
Contributor

mergify bot commented Dec 23, 2021

update

✅ Branch has been successfully updated

select str1 = str2;

-- hex on UUID always generate 32 characters even there're leading zeros
select lower(hex(toUUID('00000000-80e7-46f8-0000-9d773a2fd319')));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer if the order of output characters will be exactly the same as in UUID:

0000000080e746f800009d773a2fd319

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is how toUUID treats the input string. toUUID treats the input string as a little-endian order string.

@alexey-milovidov
Copy link
Member

It will be very strange if hex/unhex for UUID will use different byte order that UUID has in its text representation.

@FrankChen021
Copy link
Contributor Author

@alexey-milovidov Maybe you're right. Let me take a consideration.

Signed-off-by: frank chen <[email protected]>
@kitaisreal kitaisreal self-assigned this Jan 18, 2022
@FrankChen021
Copy link
Contributor Author

@alexey-milovidov @kitaisreal Your comments have been addressed in the latest commit. Please take a look at it at your
convenience.

@alexey-milovidov alexey-milovidov merged commit 8cabd37 into ClickHouse:master Jan 22, 2022
@FrankChen021 FrankChen021 deleted the hex branch January 22, 2022 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
can be tested Allows running workflows for external contributors pr-improvement Pull request with some product improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants