sql: improve pg_table_is_visible performance #59880

Merged
merged 1 commit into from
Feb 9, 2021

Collaborator

@rafiss rafiss commented Feb 5, 2021

This is motivated by a query that Django makes in order to retrieve all
the tables visible in the current session. The query uses the
pg_table_is_visible function. Previously this was implemented by using
the internal executor to inspect the pg_catalog. This was expensive
because the internal executor does not share the same descriptor
collection as the external context, so it ended up fetching every single
table descriptor one-by-one.

Now, we avoid the internal executor and look up the table descriptor
directly.
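
For illustration only, here is a minimal sketch of what a direct lookup could look like. It is not the code in this PR: `tableDesc` and `descCollection` are hypothetical stand-ins for the real descriptor types, and the search path is assumed to be a plain slice of schema names.

```go
package main

import "fmt"

// tableDesc is a hypothetical, simplified stand-in for a table descriptor.
type tableDesc struct {
	ID         int64
	Name       string
	SchemaName string
}

// descCollection is a hypothetical in-memory descriptor cache keyed by table ID.
type descCollection map[int64]tableDesc

// isTableVisible looks the table up by ID directly and reports whether its
// schema appears on the search path. Unknown IDs are reported as not visible.
func isTableVisible(descs descCollection, searchPath []string, tableID int64) bool {
	desc, ok := descs[tableID]
	if !ok {
		return false
	}
	for _, scName := range searchPath {
		if scName == desc.SchemaName {
			return true
		}
	}
	return false
}

func main() {
	descs := descCollection{
		52: {ID: 52, Name: "users", SchemaName: "public"},
		53: {ID: 53, Name: "audit", SchemaName: "internal"},
	}
	searchPath := []string{"pg_catalog", "public"}
	fmt.Println(isTableVisible(descs, searchPath, 52))
	fmt.Println(isTableVisible(descs, searchPath, 53))
	fmt.Println(isTableVisible(descs, searchPath, 99))
}
```

The point of the sketch is that the check needs only one descriptor lookup plus a scan of the search path, with no nested SQL query.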

There are other builtins that use the internal executor that may face
the same problem. Perhaps an approach for the future is to allow builtin
functions to be implemented using user-defined scalar functions.

I added a benchmark that shows the original Django query performing a
constant number of descriptor lookups with this new implementation.
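
As a rough toy model of what such a benchmark measures (this is not the benchmark added here), the sketch below counts simulated roundtrips for two strategies: one fetch per table versus a single batch load that later checks read from memory. The `store` type, its counter, and the batch `fetchAll` call are all assumptions made for the example.

```go
package main

import "fmt"

// store is a hypothetical backing store that counts simulated roundtrips.
type store struct {
	schemaByTable map[int64]string // table ID -> schema name
	roundtrips    int
}

// fetchOne costs one roundtrip per descriptor.
func (s *store) fetchOne(id int64) (string, bool) {
	s.roundtrips++
	name, ok := s.schemaByTable[id]
	return name, ok
}

// fetchAll returns a copy of every descriptor in a single roundtrip.
func (s *store) fetchAll() map[int64]string {
	s.roundtrips++
	out := make(map[int64]string, len(s.schemaByTable))
	for id, name := range s.schemaByTable {
		out[id] = name
	}
	return out
}

// onPath reports whether schema is one of the search path entries.
func onPath(searchPath []string, schema string) bool {
	for _, sc := range searchPath {
		if sc == schema {
			return true
		}
	}
	return false
}

// countVisibleUncached checks each table with its own fetch.
func countVisibleUncached(s *store, searchPath []string, ids []int64) int {
	n := 0
	for _, id := range ids {
		if schema, ok := s.fetchOne(id); ok && onPath(searchPath, schema) {
			n++
		}
	}
	return n
}

// countVisibleCached loads descriptors once, then answers from memory.
func countVisibleCached(s *store, searchPath []string, ids []int64) int {
	cache := s.fetchAll()
	n := 0
	for _, id := range ids {
		if schema, ok := cache[id]; ok && onPath(searchPath, schema) {
			n++
		}
	}
	return n
}

// newStore builds a store with numTables tables in the "public" schema.
func newStore(numTables int) (*store, []int64) {
	s := &store{schemaByTable: make(map[int64]string)}
	ids := make([]int64, 0, numTables)
	for i := 0; i < numTables; i++ {
		id := int64(100 + i)
		s.schemaByTable[id] = "public"
		ids = append(ids, id)
	}
	return s, ids
}

func main() {
	searchPath := []string{"public"}
	for _, n := range []int{10, 100, 1000} {
		a, ids := newStore(n)
		countVisibleUncached(a, searchPath, ids)
		b, _ := newStore(n)
		countVisibleCached(b, searchPath, ids)
		fmt.Printf("tables=%d uncached=%d cached=%d\n", n, a.roundtrips, b.roundtrips)
	}
}
```

In this model the per-table strategy grows linearly with the number of tables while the batch strategy stays at one roundtrip, which is the shape of result the benchmark is meant to demonstrate.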

fixes #57924

Release note (performance improvement): Improve performance of the
pg_table_is_visible builtin function.

@rafiss rafiss requested review from jordanlewis, otan, a team and miretskiy and removed request for a team February 5, 2021 23:49
@rafiss rafiss force-pushed the django-query-bench branch 2 times, most recently from b758054 to d593785 Compare February 7, 2021 17:56
"SELECT nspname FROM pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n ON c.relnamespace=n.oid "+
"WHERE c.oid=$1 AND nspname=ANY(current_schemas(true))", oid)
oid := tree.MustBeDOid(args[0])
isVisibe, err := ctx.Planner.IsTableVisible(ctx.Context, ctx.SessionData.SearchPath, int64(oid.DInt))
Contributor

VISIBE!!!

Collaborator Author

fixd

@@ -205,6 +206,31 @@ func (p *planner) CommonLookupFlags(required bool) tree.CommonLookupFlags {
}
}

// IsTableVisible is part of the tree.EvalDatabase interface.
func (p *planner) IsTableVisible(
ctx context.Context, searchPath sessiondata.SearchPath, tableId int64,
Contributor

nit: tableID

Collaborator Author

fix'd

return false, nil
}
iter := searchPath.Iter()
for scName, ok := iter.Next(); ok; scName, ok = iter.Next() {
Contributor

do we need to check database too? what if db is in dropping state?

Contributor

uhh i guess not since it relies on search path.

Collaborator Author

hmm actually my code won't work because you could have two schemas in different databases, but with the same name.

Collaborator Author

i guess the old code had the same bug..

Collaborator Author

HMMM though actually when cross-database references are fully banned, then we don't need to worry about it... (#55791)
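
To make the failure mode in this thread concrete, here is a toy sketch (hypothetical types, not the code in this PR). It assumes each table records the database and schema it lives in, and compares a schema-name-only check against one that also requires the table to be in the current database.

```go
package main

import "fmt"

// tableDesc is a hypothetical descriptor carrying its parent database
// and schema names.
type tableDesc struct {
	ID       int64
	Database string
	Schema   string
}

// visibleBySchemaOnly matches on schema name alone, so a same-named schema
// in another database is treated as a match.
func visibleBySchemaOnly(t tableDesc, searchPath []string) bool {
	for _, sc := range searchPath {
		if sc == t.Schema {
			return true
		}
	}
	return false
}

// visibleInDatabase additionally requires the table to live in the
// current database before consulting the search path.
func visibleInDatabase(t tableDesc, curDB string, searchPath []string) bool {
	if t.Database != curDB {
		return false
	}
	return visibleBySchemaOnly(t, searchPath)
}

func main() {
	searchPath := []string{"public"}
	other := tableDesc{ID: 7, Database: "db2", Schema: "public"}

	// The name-only check accepts a table from db2 while connected to db1.
	fmt.Println(visibleBySchemaOnly(other, searchPath))
	// The database-aware check rejects it.
	fmt.Println(visibleInDatabase(other, "db1", searchPath))
}
```

Whether a table from another database should count as "not visible" or be handled some other way is a design decision that this sketch does not settle.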


// IsTableVisible checks if the table with the given ID belongs to a schema
// on the given sessiondata.SearchPath.
IsTableVisible(ctx context.Context, searchPath sessiondata.SearchPath, tableId int64) (bool, error)
Contributor

nit: tableID

Collaborator Author

rafiss commented Feb 8, 2021

RFAL -- i updated to explicitly account for cross-db references (and match the PG behavior)

@rafiss rafiss removed the request for review from miretskiy February 8, 2021 20:15
This is motivated by a query that Django makes in order to retrieve all
the tables visible in the current session. The query uses the
pg_table_is_visible function. Previously this was implemented by using
the internal executor to inspect the pg_catalog. This was expensive
because the internal executor does not share the same descriptor
collection as the external context, so it ended up fetching every single
table descriptor one-by-one.

Now, we avoid the internal executor and look up the table descriptor
directly.

There are other builtins that use the internal executor that may face
the same problem. Perhaps an approach for the future is to allow builtin
functions to be implemented using user-defined scalar functions.

I added a benchmark that shows the original Django query performing a
constant number of descriptor lookups with this new implementation.
Previously, the same benchmark would make hundreds of roundtrips.

Release note (performance improvement): Improve performance of the
pg_table_is_visible builtin function.
Collaborator Author

rafiss commented Feb 9, 2021

tftr! bors r=otan

Collaborator Author

rafiss commented Feb 9, 2021

i mean

bors r=otan

Contributor

craig bot commented Feb 9, 2021

Build succeeded:

Development

Successfully merging this pull request may close these issues.

sql: Django's table introspection query is slow (seconds to minutes, depending on number of tables)