Add Manual host banning to PgCat #340

Merged (8 commits) on Mar 6, 2023

@drdrsh drdrsh commented Mar 3, 2023

Sometimes an admin needs to ban a host for a period of time to route traffic away from it, for reasons such as partial outages, replication lag, or scheduled maintenance.

We can achieve this today with a configuration update, but a quicker approach is to send a control command to PgCat that bans the replica for a specified duration.

This command does not change the current banning rules, such as:

  • Primaries cannot be banned
  • When all replicas are banned, all replicas are unbanned
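
The two rules above can be sketched as a small, self-contained model. This is only an illustration, not PgCat's implementation: the `BanList` type, its fields, and the choice to apply the "unban everything" rule at ban time are assumptions made for the example.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Role {
    Primary,
    Replica,
}

/// Hypothetical per-pool ban list keyed by host name (not PgCat's real type).
struct BanList {
    roles: HashMap<String, Role>,
    banned: HashSet<String>,
}

impl BanList {
    fn new(hosts: &[(&str, Role)]) -> Self {
        BanList {
            roles: hosts.iter().map(|(h, r)| (h.to_string(), *r)).collect(),
            banned: HashSet::new(),
        }
    }

    /// Tries to ban `host`; returns true if the host is banned afterwards.
    fn ban(&mut self, host: &str) -> bool {
        // Rule 1: primaries (and unknown hosts) are never banned.
        if self.roles.get(host) != Some(&Role::Replica) {
            return false;
        }
        self.banned.insert(host.to_string());

        // Rule 2: if every replica is now banned, unban all of them so
        // that read traffic still has somewhere to go.
        let replicas = self.roles.values().filter(|r| **r == Role::Replica).count();
        if self.banned.len() == replicas {
            self.banned.clear();
            return false;
        }
        true
    }

    fn is_banned(&self, host: &str) -> bool {
        self.banned.contains(host)
    }
}
```

With one primary and two replicas, banning the primary is refused, banning the first replica succeeds, and banning the second replica clears both bans.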

Commands added

BAN <host> <duration_seconds>;
BAN localhost 10;
     db     |     user      |  role   |   host
------------+---------------+---------+-----------
 sharded_db | other_user    | replica | localhost
 sharded_db | other_user    | replica | localhost
 sharded_db | other_user    | replica | localhost
 simple_db  | simple_user   | replica | localhost
 sharded_db | sharding_user | replica | localhost
 sharded_db | sharding_user | replica | localhost
 sharded_db | sharding_user | replica | localhost


SHOW BANS;
     db     |     user      |  role   |   host    |    reason    |          ban_time          | ban_duration_seconds | ban_remaining_seconds
------------+---------------+---------+-----------+--------------+----------------------------+----------------------+-----------------------
 sharded_db | sharding_user | replica | localhost | AdminBan(10) | 2023-03-03 21:58:20.258965 | 10                   | 8
 sharded_db | sharding_user | replica | localhost | AdminBan(10) | 2023-03-03 21:58:20.259333 | 10                   | 8
 sharded_db | sharding_user | replica | localhost | AdminBan(10) | 2023-03-03 21:58:20.259635 | 10                   | 8
 sharded_db | other_user    | replica | localhost | AdminBan(10) | 2023-03-03 21:58:20.259851 | 10                   | 8
 sharded_db | other_user    | replica | localhost | AdminBan(10) | 2023-03-03 21:58:20.262950 | 10                   | 8
 sharded_db | other_user    | replica | localhost | AdminBan(10) | 2023-03-03 21:58:20.267296 | 10                   | 8
 simple_db  | simple_user   | replica | localhost | AdminBan(10) | 2023-03-03 21:58:20.267537 | 10                   | 8


UNBAN <host>;
UNBAN localhost;
     db     |     user      |  role   |   host
------------+---------------+---------+-----------
 sharded_db | sharding_user | replica | localhost
 sharded_db | sharding_user | replica | localhost
 sharded_db | sharding_user | replica | localhost
 sharded_db | other_user    | replica | localhost
 sharded_db | other_user    | replica | localhost
 sharded_db | other_user    | replica | localhost
 simple_db  | simple_user   | replica | localhost
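
For illustration, the three commands above could be tokenized roughly as follows. This is a sketch under assumptions, not the code in `src/admin.rs`: the `AdminCommand` enum, the `parse_admin_command` function, the requirement that the duration is present and positive, and every error message except the hostname one quoted in the review are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum AdminCommand {
    Ban { host: String, duration_seconds: i64 },
    Unban { host: String },
    ShowBans,
}

/// Parses `BAN <host> <duration_seconds>;`, `UNBAN <host>;` and `SHOW BANS;`.
fn parse_admin_command(query: &str) -> Result<AdminCommand, String> {
    // Drop the trailing semicolon and split on whitespace.
    let tokens: Vec<&str> = query
        .trim()
        .trim_end_matches(';')
        .split_whitespace()
        .collect();

    // Keywords are matched case-insensitively; host names are left as typed.
    let keyword = tokens.first().map(|t| t.to_ascii_uppercase());
    match keyword.as_deref() {
        Some("BAN") => {
            let host = tokens
                .get(1)
                .ok_or("BAN command requires a hostname to ban")?;
            let duration_seconds = tokens
                .get(2)
                .ok_or("BAN command requires a duration in seconds")?
                .parse::<i64>()
                .map_err(|_| "BAN duration must be a whole number of seconds".to_string())?;
            if duration_seconds <= 0 {
                return Err("BAN duration must be positive".to_string());
            }
            Ok(AdminCommand::Ban { host: host.to_string(), duration_seconds })
        }
        Some("UNBAN") => {
            let host = tokens
                .get(1)
                .ok_or("UNBAN command requires a hostname to unban")?;
            Ok(AdminCommand::Unban { host: host.to_string() })
        }
        Some("SHOW") if tokens.get(1).map_or(false, |t| t.eq_ignore_ascii_case("BANS")) => {
            Ok(AdminCommand::ShowBans)
        }
        _ => Err(format!("unsupported admin command: {}", query.trim())),
    }
}
```

Parsing into an enum first keeps validation errors (missing host, malformed duration) separate from the step that actually touches the pools.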

@drdrsh drdrsh changed the title from "Mostafa manual ban support" to "Add Manual host banning to PgCat" Mar 3, 2023
src/errors.rs Outdated
MessageReceiveFailed,
FailedCheckout,
StatementTimeout,
ManualBan,
Contributor

Suggested change
- ManualBan,
+ AdminBan,

src/admin.rs Outdated
{
    let host = match tokens.get(1) {
        Some(host) => host,
        None => return error_response(stream, "BAN command requires a hostname to ban").await,
Contributor

Non-blocking, can be done in a follow-up: do we want to accept a duration string for how long to ban the host?
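
As one possible shape for such a duration string, the sketch below accepts a bare number of seconds or a number with an `s`, `m`, or `h` suffix. The syntax and the `parse_ban_duration` helper are hypothetical; nothing in this PR defines them, and the merged command takes a plain `<duration_seconds>` value.

```rust
use std::time::Duration;

/// Parses "10", "30s", "5m" or "2h" into a Duration (hypothetical syntax).
fn parse_ban_duration(input: &str) -> Option<Duration> {
    let input = input.trim();
    // The last character decides the unit; a trailing digit means seconds.
    let (digits, multiplier) = match input.chars().last()? {
        's' => (&input[..input.len() - 1], 1),
        'm' => (&input[..input.len() - 1], 60),
        'h' => (&input[..input.len() - 1], 3600),
        c if c.is_ascii_digit() => (input, 1),
        _ => return None,
    };
    let value: u64 = digits.parse().ok()?;
    // checked_mul guards against overflow on absurdly large inputs.
    value.checked_mul(multiplier).map(Duration::from_secs)
}
```

Returning `Option` keeps the sketch short; an admin command would more likely return an error message that names the rejected input.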

@drdrsh drdrsh marked this pull request as ready for review March 3, 2023 21:53
@drdrsh drdrsh requested a review from levkk March 4, 2023 02:04
res.put(row_description(&columns));

for (id, pool) in get_all_pools().iter() {
    for address in pool.get_addresses_from_host(host) {
Contributor

@levkk levkk Mar 5, 2023

Do you think we should use the address name here instead to make this more like the other admin commands?

Collaborator Author

@drdrsh drdrsh Mar 5, 2023

The common use case for admin banning (in my experience) is when a database is in a degraded state or is about to undergo maintenance. Using the hostname for admin banning in these situations makes more sense than doing an extra lookup to figure out the address name that corresponds to the host.

Contributor

As long as we show that host name somewhere in our stats, so the user can find it without guessing. For example, sometimes people use IP addresses, sometimes they use DNS names, and sometimes both refer to the same place.
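
To make the concern concrete, here is a minimal sketch of host-based lookup by exact string comparison. The `Address` struct and the free function `addresses_from_host` are simplified stand-ins invented for this example; they are not the signature of the `get_addresses_from_host` method in the diff.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Address {
    host: String,
    port: u16,
    database: String,
}

/// Returns every configured address whose host string equals `host` exactly.
fn addresses_from_host<'a>(addresses: &'a [Address], host: &str) -> Vec<&'a Address> {
    // No DNS resolution happens here: "localhost" and "127.0.0.1" are
    // treated as different hosts even if they point at the same server.
    addresses.iter().filter(|a| a.host == host).collect()
}
```

Under this assumption, an admin has to type the host exactly as it appears in the configuration, which is why showing that string in the stats output matters.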

src/admin.rs Outdated
    _ => pool.settings.ban_time,
};
let remaining = ban_duration - (now - ban_time.timestamp());
if remaining <= 0 || address.role == Role::Primary {
Contributor

@levkk levkk Mar 5, 2023

Primary should never be added to this data structure. If it was, we may want to know.
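
One way to act on this, sketched under assumptions, is to classify each ban entry instead of silently skipping primaries. The `BanRow` enum and `classify_ban` function are hypothetical; only the arithmetic `ban_duration - (now - ban_time)` comes from the diff above, with all times taken as Unix seconds.

```rust
#[derive(Debug, PartialEq)]
enum Role {
    Primary,
    Replica,
}

#[derive(Debug, PartialEq)]
enum BanRow {
    /// The ban is still in effect and should be listed.
    Active { remaining_seconds: i64 },
    /// The ban has run out and can be dropped from the listing.
    Expired,
    /// A primary was found in the ban list, which should not happen.
    UnexpectedPrimary,
}

/// Classifies one ban entry; `ban_time` and `now` are Unix timestamps in seconds.
fn classify_ban(role: &Role, ban_time: i64, ban_duration: i64, now: i64) -> BanRow {
    // Surface a banned primary explicitly so the caller can log it.
    if *role == Role::Primary {
        return BanRow::UnexpectedPrimary;
    }
    let remaining = ban_duration - (now - ban_time);
    if remaining <= 0 {
        BanRow::Expired
    } else {
        BanRow::Active { remaining_seconds: remaining }
    }
}
```

A 10-second ban checked 2 seconds after it started yields 8 remaining seconds, matching the `SHOW BANS` example in the description.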

src/admin.rs Outdated

for (id, pool) in get_all_pools().iter() {
    for address in pool.get_addresses_from_host(host) {
        if !pool.is_banned(&address) && address.role != Role::Primary {
Contributor

Ideally the primary check should be handled by the pool, as I believe it is now?

Contributor

@levkk levkk left a comment

Nice!

@drdrsh drdrsh merged commit 2cc6a09 into postgresml:main Mar 6, 2023
levkk pushed a commit that referenced this pull request Mar 9, 2023