Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add query.max-run-time.hard-limit #15115

Merged
merged 1 commit into from
Nov 30, 2022

Conversation

nevillelyh
Copy link
Member

@nevillelyh nevillelyh commented Nov 19, 2022

Description

Add query.max-run-time.hard-limit.
The maximum allowed time for a query may be raised on a per-query basis using the query_max_run_time session property, but may not exceed this hard limit if it is specified.

Additional context and related issues

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Section
* Add `query.max-run-time.hard-limit` configuration property

@findepi
Copy link
Member

findepi commented Nov 21, 2022

@nevillelyh Bounds is what we have validation for -- see validation param in PropertyMetadata factory methods.
Does this work for you?

@nevillelyh nevillelyh changed the title Support session property lower & upper bounds Support query.max-run-time upper bound Nov 21, 2022
@nevillelyh
Copy link
Member Author

nevillelyh commented Nov 21, 2022

@nevillelyh Bounds is what we have validation for -- see validation param in PropertyMetadata factory methods. Does this work for you?

Yeah I'm aware of the validation param. Hard to apply that generically to all properties though as it just get baked into decoder. Anyway irrelevant now that I reworked the PR.

Not sure about the query.max-run-time.allow-overrides name though, since overrides are still technically allowed, just not greater than the default value. @electrum thoughts?

@findepi
Copy link
Member

findepi commented Nov 22, 2022

Not sure about the query.max-run-time.allow-overrides name though, since overrides are still technically allowed, just not greater than the default value. @electrum thoughts?

Fair point. Let's have some English native person speak here.

@electrum @martint @alexjo2144 @erichwang

@alexjo2144
Copy link
Member

alexjo2144 commented Nov 22, 2022

Is there really a use-case for turning this off at the cluster level? I could see granting individual users, or an admin role permission to do it, but for all users in a cluster I'm not so sure.

As far as the naming goes, maybe query.allow-additional-run-time? I don't love that either

@nevillelyh
Copy link
Member Author

Feels like access control & upper bound are orthogonal.
ACL grants someone the permission to change the value, it can't check whether it exceeds cluster level.
OTOH one might want to reduce the query.max-run-time for a query to fail fast, or increase it beyond cluster level for a resource intensive query to finish. This flag determines whether the latter case should be allowed.

query.max-run-time.extendable maybe?

@findepi
Copy link
Member

findepi commented Nov 23, 2022

I can imagine a use-case where query.max-run-time is by default set to small value, e.g. 10 minutes, but users are allowed to tune it up when needed.
This also means we maybe don't want a boolean flag, but simply a default value (what we have currently) and a hard limit, which users can't exceed.

Also, a session property can be set by users directly, but it can also be set via a Session Property Manager installed on the coordinator. Such use also would be better served with a hard limit.

Sorry that I didn't think about this before. @nevillelyh can you adjust?

@nevillelyh
Copy link
Member Author

Sounds good. That actually seems to be a better solution. Something like query.max-run-time.hard-limit that defaults to 100d assuming no existing users have query.max-run-time higher than that.

@nevillelyh nevillelyh changed the title Support query.max-run-time upper bound Add query.max-run-time.hard-limit Nov 23, 2022
@hashhar
Copy link
Member

hashhar commented Nov 23, 2022

You can leave default for query.max-run-time.hard-limit unset (or infinite) and only admins who want to enforce a limit can set a value.

@alexjo2144
Copy link
Member

Or should the default be set to be the same as the query max run time and admins have to opt-in to allow users to extend it?

@nevillelyh
Copy link
Member Author

nevillelyh commented Nov 23, 2022

You can leave default for query.max-run-time.hard-limit unset (or infinite) and only admins who want to enforce a limit can set a value.

Duration type doesn't have a notion of infinite. I can make it Optional that defaults to empty() i.e. no enforcement.

Or should the default be set to be the same as the query max run time and admins have to opt-in to allow users to extend it?

While this feel like the more desirable default, it's a breaking change and might be hard to explain, i.e. max-run-time can't be exceeded unless hard-limit is explicitly set higher?

@hashhar I went with your suggestion.

@hashhar
Copy link
Member

hashhar commented Nov 23, 2022

Duration type doesn't have a notion of infinite.

Duration (airlift duration) supports max unit as days so Double.MAX_VALUE with days as unit is the max value. 😄
Optional is indeed better however.

@nevillelyh
Copy link
Member Author

@findepi @hashhar @electrum PTAL again?

allowed time for a query may be raised for a session using the
``query_max_run_time`` session property, but must not exceed this hard limit
if it is specified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @colebow @mosabua for this file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Late to the party, but it looks good.

@nevillelyh
Copy link
Member Author

@findepi addressed all comments

@findepi findepi merged commit 5f7ae28 into trinodb:master Nov 30, 2022
@findepi findepi mentioned this pull request Nov 30, 2022
@github-actions github-actions bot added this to the 404 milestone Nov 30, 2022
@martint
Copy link
Member

martint commented Dec 1, 2022

I'm not sure I understand the use case for this. In the past, we've actively discouraged from adding more knobs and configuration options like these (if I recall, we even considered something like this specific option but concluded it wasn't necessary). The existing option is already a hard limit, and can only be changed by users that have permissions to do so. If someone wants to run a query with a lower time bound, they can kill the query when the desired time elapses. There might be other options, too, like allowing the session property to be lowered but not increased beyond the value of the config option.

My concerns are:

  • We've changed the meaning of the existing option, which was supposed to be a hard limit
  • We've introduced yet another option, which makes understanding how to configure a cluster even more complicated
  • There are probably other mechanisms to execute a query with a lower time bound than what's specified by the configured limit that weren't considered.

@nevillelyh nevillelyh deleted the neville/session-bounds branch December 2, 2022 00:42
@electrum
Copy link
Member

electrum commented Dec 2, 2022

If someone wants to run a query with a lower time bound, they can kill the query when the desired time elapses.

I don't think this is a reasonable argument. We've had the time limit session property for years as a convenient and often used way to automatically limit the runtime. You're telling administrators that if they want to limit the upper bound for the max runtime, their users must now stop using this feature and write custom code to externally kill the query after a time limit,(which is especially difficult if executing the query using a normal tool where the query ID isn't exposed).

We've changed the meaning of the existing option, which was supposed to be a hard limit

The existing config property query.max-run-time is not a hard limit. Rather, it is the default value for the session property.

There might be other options, too, like allowing the session property to be lowered but not increased beyond the value of the config option.

This a potentially good option. The problem is that the security check checkCanSetSystemSessionProperty simply takes the property name. We could potentially have a variant that either takes the new value, and possibly the existing value, or a flag that indicates if the value is being lowered.

This might be less flexible than a hard-limit system, which lets an administrator specify both a default and an upper bound. Query maximum memory is the other use case that comes to mind. The admin might want to have a default of say 100GB but have an upper bound of 1TB for users who need it. We have no way to allow this type of configuration today.

@martint
Copy link
Member

martint commented Dec 2, 2022

The existing config property query.max-run-time is not a hard limit. Rather, it is the default value for the session property.

It's a hard limit in that queries won't exceed that time. If it's protected with ACLs, then no one other than authorized users should be able to change it.

In any case, I don't think it's tenable to add a "hard-limit" version for every knob that already exists. Aside from memory limits, there are options for max analysis time, max execution time, max planning time, max CPU time and max data scanned. We need a better way. Since the motivation for this change is not clear, and the discussion in the comments seems to be very speculative (about what someone might want to do), I'd prefer to err on the side of not adding the new option and doing it in a more principled way. Once the option is there, it's harder to remove later due to backward compatibility issues.

@findepi
Copy link
Member

findepi commented Dec 5, 2022

The use case is described in the proposed release notes in a more end-user friendly manner #15058 (comment)

General
Provide an optional limit for query_max_run_time session property values.

Previously it wasn't possible to configure a system so that users can set query_max_run_time but within some allowed range.
Why allowed range is important? Obviously for cluster stability etc (this is where permissions / AC could be employed as well). But also for node draining for graceful shutdown (this is where having it configurable irrespective of AC is imperative).

@hashhar
Copy link
Member

hashhar commented Dec 6, 2022

If it's protected with ACLs, then no one other than authorized users should be able to change it.

Seems to be the root of the underlying issue. Can https://trino.io/docs/current/security/file-system-access-control.html#session-property-rules not solve this and other related issues?

@nevillelyh
Copy link
Member Author

It can address the original motivation of this PR, which was to disallow changing query-max-run-time for cluster stability, drainage, etc. The question is whether we should allow lowering this value per session, which can't be done with AC.

@martint
Copy link
Member

martint commented Dec 6, 2022

But also for node draining for graceful shutdown (this is where having it configurable irrespective of AC is imperative).

How would this help for draining?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

7 participants