Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(hogql): asterisk expander #14271

Merged
merged 99 commits into from
Feb 22, 2023
Merged

feat(hogql): asterisk expander #14271

merged 99 commits into from
Feb 22, 2023

Conversation

mariusandra
Copy link
Collaborator

@mariusandra mariusandra commented Feb 16, 2023

Problem

The query select * from anywhere would fail.

Changes

  • Automatically expands * fields on tables:

image

  • Also removes some "legacy" handling from the "query_events_list" query. More specifically, selecting * would also select a few "person" fields from a virtual subtable. This "virtual subtable" doesn't work for most users on cloud (why the event explorer flag is still disabled) as most users aren't using "persons on events". Thus to keep things simple, I made the asterisk "*" only expand the actual fields on the table, excluding those person fields. I also removed the virtual "person" column/tuple that the "query_events_list" used if you selected just "person". It has also been removed in the interface for a while. All of this will be (already is) fixed in the followup PR feat(hogql): Events table based on hogql #14315

How did you test this code?

Wrote tests. Checked in the browser, both by making queries and using the "old" data explorer.

@mariusandra mariusandra changed the title feat(hogql): splotch desplotcher feat(hogql): asterisk expander Feb 21, 2023
@mariusandra mariusandra marked this pull request as ready for review February 21, 2023 10:19
Comment on lines +213 to +214
if expr == "*":
expr = f'tuple({", ".join(SELECT_STAR_FROM_EVENTS_FIELDS)})'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The events query expects a tuple when it gets a "*" field in its select array. This will all be refactored in #14315

Base automatically changed from hogql-node to master February 22, 2023 09:25
@neilkakkar
Copy link
Collaborator

Give whoever convinced you to change the name from splotch desplotcher to asterisk expander a medal 😂 (will review in a bit!)

@mariusandra
Copy link
Collaborator Author

That has to be @Twixes :D

@mariusandra
Copy link
Collaborator Author

Found one more case that wasn't covered: select * from (select * from event). Works now, and ready for a real review.

Copy link
Contributor

@thmsobrmlr thmsobrmlr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@mariusandra
Copy link
Collaborator Author

Thanks @thmsobrmlr ! I'll merge it in, since * support is quite useful when browsing unknown databases... Any late 👀 are always welcome to add comments here or in the next PR: #14286

@mariusandra mariusandra merged commit 036c1cf into master Feb 22, 2023
@mariusandra mariusandra deleted the hogql-transforms branch February 22, 2023 14:34
@@ -40,6 +41,8 @@ def print_ast(

# modify the cloned tree as needed
if dialect == "clickhouse":
expand_asterisks(node)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can remove the pass below now

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

def get_asterisk(self) -> List[str]:
list: List[str] = []
for field in self.__fields__.values():
def get_asterisk(self) -> Dict[str, DatabaseField]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will need to support person on events conditionally when it's enabled 😬 (i.e. if there's a person.properties, make a decision whether to join persons table or go to the person_properties column instead).

Unclear how this should work yet, but not blocking now!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also food for thought: Many of our query optimisations are based on programmatically choosing which columns to select (example: choosing event.properties can OOM if it's not actually necessary for the subsequent query because these are big json payloads).

Would be nice if hogql could work with that logic, even for random client queries.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, person properties will get their own handling. Part of it is happening in #14286 . Right now we've come sort of full circle. In that PR you can only get person properties from a joined table, not from the person_properties field :D. Since PoE is off for most everyone, I'm trying to get this working first. It should be easy to say "don't join a table, take a field instead"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The person property "expansion" is implemented in the next PR as a separate "LazyTableResolver" step precisely so we would know and select only the fields we need. Very soon also only the properties we need.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants