Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-11205. Implement a search feature for users to locate keys pending Deletion within the OM Deleted Keys Insights section #6969

Merged
merged 22 commits into from
Oct 18, 2024

Conversation

ArafatKhan2198
Copy link
Contributor

@ArafatKhan2198 ArafatKhan2198 commented Jul 19, 2024

What changes were proposed in this pull request?

This pull request enhances the search functionality for deleted pending keys in the Deleted Keys Insights section. The previous implementation was limited to a subset of the available data returned by the API. This update ensures accurate retrieval of specific key information by improving pagination and handling of both prefix-based searches and general search scenarios.

The patch implements the ability to search for deleted keys using a specified prefix, efficiently returning results that match the prefix across both FSO and Object Store layouts. The logic has been refactored to the OMDBInsightEndpoint, unifying the method for handling all cases where the startPrefix is provided or not.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11205

How was this patch tested?

  • Unit tests were created and updated to cover various scenarios, including pagination, searches with and without startPrefix, and for both FSO and non-FSO layouts.
  • Manual testing was conducted to ensure that the search functionality works as expected across different scenarios.
  • We can run the following command on docker container running OM.
// Use the Freon tool to create 5 directories with 10 keys each 
for i in {1..5}; do
  ozone freon dfsg --path=ofs://om/s3v/fso-bucket/ -s=10000 -n=10
done

// Delete all the directories under the fso-bucket which will result in records being added to deletedTable 
ozone fs -rm -r -skipTrash ofs://om/s3v/fso-bucket
  • Using the given API endpoint which will list out all the deleted keys under the bucket directory 1inkjkisd which is under fso-bucket.
http://localhost:9888/api/v1/keys/deletePending?startPrefix=/s3v/fso-bucket/1inkjkisd
image

…ng Deletion within the OM Deleted Keys Insights section.
@ArafatKhan2198 ArafatKhan2198 marked this pull request as ready for review July 20, 2024 06:44
@ArafatKhan2198
Copy link
Contributor Author

@devmadhuu @dombizita Please take a look!

@errose28
Copy link
Contributor

For FSO buckets, how does the search work if the parent directories or even the volume and bucket have been deleted already? The key may still be pending deletion but the prefix will have no matches once the parents are cleared.

Copy link
Contributor

@dombizita dombizita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @ArafatKhan2198, overall it looks good to me, I had one idea, let me know what you think of it. Also could you answer to @errose28's comment?

keyIter.seek(startPrefix);
@GET
@Path("/deletePending/search")
public Response searchDeletedKeys(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The beginning and ending of this method have a lot in common with searchOpenKeys.

@GET
@Path("/open/search")
public Response searchOpenKeys(
@DefaultValue(DEFAULT_START_PREFIX) @QueryParam("startPrefix")
String startPrefix,
@DefaultValue(RECON_OPEN_KEY_DEFAULT_SEARCH_LIMIT) @QueryParam("limit")
int limit,
@DefaultValue(RECON_OPEN_KEY_SEARCH_DEFAULT_PREV_KEY) @QueryParam("prevKey") String prevKey) throws IOException {

Can we extract them? Do you think it would make sense?

@devmadhuu devmadhuu self-requested a review September 23, 2024 07:24
@ArafatKhan2198
Copy link
Contributor Author

Thanks for the comment @errose28

The DeletedTable has a structure similar to that of the key table, following this pattern:

volumeName/bucketName/directoryName/KeyName/ObjectID

When a key is deleted and placed inside the DeletedTable, it is stored as a key-value pair where:

  • The key follows the structure shown above, since the entire path is being mentioned as the key we can use that to our advantage.
  • The value is a RepeatedOmKeyInfo object.

The search operation leverages the .seek() method provided by RocksDB tables. Users provide a prefix path (complete or partial), and the seek operator fetches the first record that matches this prefix in the DeletedTable. It then iterates and collects all subsequent records matching this prefix, returning them in the appropriate format.

Even if the bucket or directory is deleted, it does not impact the search because we are only seeking values in the DeletedTable that match the given prefix.

Additionally, I have verified how each record key is stored in the DeletedTable. For example: ➖

image

For example, if the user provides a search prefix like volume/bucket1/dir1/, the following record shown in the image, along with any other record that matches this prefix, will be collected and returned.

@ArafatKhan2198
Copy link
Contributor Author

@dombizita @devmadhuu @errose28
Could you please take a look.

@devmadhuu
Copy link
Contributor

Pls fix the checkstyle issue.

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArafatKhan2198 for improving the patch. Changes LGTM +1. Just fix your checkstyle issue.

Copy link
Contributor

@devabhishekpal devabhishekpal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArafatKhan2198 for the changes, looks good to me.

@ArafatKhan2198
Copy link
Contributor Author

Thanks @devabhishekpal @devmadhuu @sumitagrawl for the review and comments on this PR please consider the latest changes :-

@ArafatKhan2198
Copy link
Contributor Author

ArafatKhan2198 commented Oct 10, 2024

Thanks for the review @devmadhuu @sumitagrawl @devabhishekpal please consider the latest commit having the final changes as discussed offline

In the latest commit, we made significant improvements to the search functionality for deleted keys:

  • Unified Search Logic: The search logic has been refactored and moved to OMDBInsightEndpoint. This ensures that both cases (with and without startPrefix) are handled by the same method, simplifying the code.

  • Pagination Handling: We refined the logic to ensure accurate pagination in all scenarios. Now, searches correctly handle the prevKey and return the appropriate records, whether a startPrefix is provided or not.

  • Expanded Unit Testing: Added new unit tests in TestOmDBInsightEndPoint and TestDeletedKeysSearchEndpoint to cover the four key search scenarios:

    • prevKey provided, startPrefix empty.
    • prevKey empty, startPrefix empty.
    • startPrefix provided, prevKey empty.
    • startPrefix provided, prevKey provided.

Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArafatKhan2198 for improving the patch. Few comments left.

@ArafatKhan2198
Copy link
Contributor Author

Please take another look @devmadhuu

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArafatKhan2198 for improving the patch. Just few nits and one old comment not yet addressed.

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor comment.

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls check , some inconsistency for blank value check of startPrefix at some places but not a some places.

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls check my comment.

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comment.

@ArafatKhan2198
Copy link
Contributor Author

@ArafatKhan2198 I understand that why we are assigning the startPrefix as empty. My original question was not about why you are assigning the startPrefix as empty. Rather the point is why there is a. need to validate that if API is already setting default as EMPTY String. That check is redundant.

@devmadhuu
The reason for adding a check to see if the startPrefix is default (empty) is to handle cases where both startPrefix and prevKey are empty. In such scenarios, we need to iterate through all the records in the table.

Without this null check, validateStartPrefix would return false because it mistakenly interprets the empty startPrefix as a volume-level search, which isn't allowed. By adding a check for an empty string, we explicitly allow this condition where startPrefix is "". Without this, for the default value of "", we would encounter an error and the following would be logged:

return createBadRequestResponse("Invalid startPrefix: Path must be at the bucket level or deeper.");

This adjustment ensures the search behaves correctly when startPrefix is empty, avoiding unnecessary errors.

Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArafatKhan2198 for improving the patch. Changes LGTM +1. Kindly check that API 404 error when we landup on the keyPendingForDeletion page in Recon UI. And I understand that for openKeys search API, you will be fixing things in follow up PR. Rest all looks good.

@devmadhuu devmadhuu self-requested a review October 18, 2024 13:55
Copy link
Contributor

@devmadhuu devmadhuu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM now. Thanks @ArafatKhan2198

@devmadhuu devmadhuu merged commit 86b7aae into apache:master Oct 18, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants