Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pageserver: add mgmt API to scan layers for disposable keys #9393

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

problame
Copy link
Contributor

This PR adds a pageserver mgmt API to scan a layer file for disposable keys.

It hooks it up to the sharding compaction test, demonstrating that we're not filtering out all disposable keys.

This is extracted from PGDATA import (#9218)
where I do the filtering of layer files based on is_key_disposable.

@problame problame requested a review from a team as a code owner October 15, 2024 08:27
@problame problame requested review from skyzh and jcsp and removed request for skyzh October 15, 2024 08:27
.with_context(|| format!("get layer from key: {key}"))
.expect("not found")
.clone()
}

pub(crate) fn try_get_from_key(&self, key: &PersistentLayerKey) -> Option<&Layer> {
// The assumption for the `expect()` is that all code maintains the following invariant:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment appears to have been copied out of context

@@ -2645,6 +2649,43 @@ def layer_exists(
layers = self.list_layers(tenant_id, timeline_id)
return layer_name in [parse_layer_file_name(p.name) for p in layers]

def timeline_scan_no_disposable_keys(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function would benefit from a docstring ("disposable keys" is a bit of an obscure concept when one isn't actively working on sharding stuff)

@@ -1210,6 +1210,99 @@ async fn layer_map_info_handler(
json_response(StatusCode::OK, layer_map_info)
}

#[instrument(skip_all, fields(tenant_id, shard_id, timeline_id, layer_name))]
async fn timeline_layer_file_scan_disposable_keys(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "file" is a bit redundant in names like this (we can just say "layer" rather than "layer file")

@@ -3047,6 +3140,10 @@ pub fn make_router(
"/v1/tenant/:tenant_shard_id/timeline/:timeline_id/layer/:layer_file_name",
|r| api_handler(r, evict_timeline_layer_handler),
)
.post(
"/v1/tenant/:tenant_shard_id/timeline/:timeline_id/layer/:layer_name/scan_disposable_keys",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume this is not a testing-only API because some piece of automation in the field is going to use it? Can you add a comment to the handler definition that mentions the use case?

impl From<ImageLayerName> for PersistentLayerKey {
fn from(image_layer_name: ImageLayerName) -> Self {
Self {
key_range: image_layer_name.key_range,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird that we never needed this before -- are there some locations that do this by hand which could be updated to use the From?

Copy link

0 tests run: 0 passed, 0 failed, 0 skipped (full report)


Test coverage report is not available

The comment gets automatically updated with the latest test results
ad9134b at 2024-10-15T09:31:33.891Z :recycle:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants