Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-10650. Delete hsync key info from openFileTable while deleting directory recursively. #6495

Merged
merged 11 commits into from
Apr 24, 2024

Conversation

ashishkumar50
Copy link
Contributor

What changes were proposed in this pull request?

When volume or bucket or directory is deleted recursively and if corresponding hsync keys are present in openFileTable, then we should delete those keys as well.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10650

How was this patch tested?

New integration test

@kerneltime kerneltime requested a review from errose28 April 8, 2024 16:21
@jojochuang jojochuang added the hbase HBase on Ozone support label Apr 9, 2024
Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ashishkumar50 the fix. Patch looks straightforward to me. This is what #6079 had missed.

lgtm. just one nit above.

@ashishkumar50
Copy link
Contributor Author

Thanks @ashishkumar50 the fix. Patch looks straightforward to me. This is what #6079 had missed.

lgtm. just one nit above.

Thanks @smengcl for the review, updated above suggestion.

Copy link
Contributor

@ChenSammi ChenSammi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not delete them directly from openKeyTable here. Direct delete will result in blocks not released, causing orphan blocks. We can mark these hsynced open key delectable, and modify openKeyCleanupService to delete these keys and release the blocks.

@ashishkumar50
Copy link
Contributor Author

ashishkumar50 commented Apr 10, 2024

We should not delete them directly from openKeyTable here. Direct delete will result in blocks not released, causing orphan blocks. We can mark these hsynced open key delectable, and modify openKeyCleanupService to delete these keys and release the blocks.

@ChenSammi Yes It seems the last block may be orphan on DN if we delete directly from table. I think this PR also need to avoid directly deleting and should use openKeyCleanupService @smengcl

@smengcl
Copy link
Contributor

smengcl commented Apr 10, 2024

We should not delete them directly from openKeyTable here. Direct delete will result in blocks not released, causing orphan blocks. We can mark these hsynced open key delectable, and modify openKeyCleanupService to delete these keys and release the blocks.

@ChenSammi Good point.

A simple solution could be to update deletedKey with the up-to-date block list from the open key before putting it to deletedTable? If the goal is to prevent orphan blocks that should do it.

OpenKeyCleanup should also work. But one small problem with OpenKeyCleanupService is that it is only triggered every 24 hours by default, rather than KeyDeletingService's 60 seconds. In edge cases we might hear admins complain about why the cluster space is not freed upon key deletion after a few hours.

@ashishkumar50 Yes we now need to update OMKeyDeleteRequest and all its variants to contain the correct block list for omKeyInfo. Or take the OpenKeyCleanup approach.

Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marked as blocker for 1.4.1 Previously I mistakenly thought HDDS-9930 (which could lead to orphan blocks in the case Sammi mentioned above) was in 1.4.0. I have removed 1.4.1 as a target version.

@ashishkumar50
Copy link
Contributor Author

@ChenSammi Which approach do you suggest, Whether we should use OpenKeyCleanupService or through KeyDeletingService?
I think for OpenKeyCleanupService we can add metadata in the openKeyTable as deleted during key deletion and use that in OpenKeyCleanupService to identify whether key is deleted.
For KeyDeletingService approach we need to read blocks from openKeyTable and use that block for deletion from SCM/DN.
And remove those keys from openKeyTable as done in this PR.

@ashishkumar50
Copy link
Contributor Author

@ChenSammi @smengcl Handled using OpenKeyCleanupService. PTAL thanks.

Comment on lines +123 to +125
OmKeyInfo openKeyInfo = omMetadataManager.getOpenKeyTable(getBucketLayout()).get(dbOpenKey);
if (openKeyInfo != null) {
openKeyInfo.getMetadata().put(DELETED_HSYNC_KEY, "true");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But what happens when the same client tries to create a key at the same path?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tell me if this would happen:

  • When the same client (with the same client ID, thus same dbOpenKey) is allowed to write to the same path, the openKeyTable (openFileTable) entry will be overwritten, still leading to orphan blocks.

On the other hand, it also doesn't make sense to block the same client from writing to the same key path, just because the key is deleted by another client or because the key is errorneously deleted without being close()d first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this problem exist even for non hsync keys as well.
When Client1 is writing a key, and some other client comes and deletes the directory or bucket and recreate again. Client1 will fail to commit the key and also will not able to write same key till openKey is cleaned up or it may overwrite.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will not able to write same key till openKey is cleaned up

Got it. So at least in this edge case it won't cause new orphan blocks. But we should revisit if this is the desired behavior.

Do we have CLI command to manually trigger a run of OpenKeyCleanup?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked but don't find any CLI to manually trigger OpenKeyCleanup.

import static org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.Status.OK;
import static org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.Status.PARTIAL_DELETE;

/**
* Response for DeleteKey request.
*/
@CleanupTableInfo(cleanupTables = {KEY_TABLE, OPEN_KEY_TABLE, DELETED_TABLE, BUCKET_TABLE})
@CleanupTableInfo(cleanupTables = {KEY_TABLE, DELETED_TABLE, BUCKET_TABLE})
Copy link
Contributor

@ChenSammi ChenSammi Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why OPEN_KEY_TABLE is removed here? Could both OPEN_KEY_TABLE and OPEN_FILE_TABLE included here, and other involved response classes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we are not deleting from OPEN_KEY_TABLE here which was added in #6079. Instead using OpenKeyCleanupService to do the same. Still it is required?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This @CleanupTableInfo is to cleanup the table cache after the corresponding transactions flushed to RocksDB. In the delete, we put the empty entry in openKeyTable cache to represent key is deleted from the RocksDB.

   // Remove the open key by putting a tombstone entry
      openKeyTable.addCacheEntry(dbOpenKey, trxnLogIndex);

But now we alter the key metadata instead of delete the key. So shall we still put an empty entry here? This also reminds me we may need check the output of CLI command which list the openKeys, we need to tell if the key is marked for deleted or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed adding empty cache entry here.
Okay I will raise Jira to update CLI for displaying whether openKey is marked for deletion or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashishkumar50 , I don't mean remove the cache entry adding, instead I think we should add the cache entry with update the keyInfo, instead of an empty entry currently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added with updated keyInfo.

@@ -180,7 +185,7 @@ public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager, TermIn
omClientResponse = new OMKeyDeleteResponse(
omResponse.setDeleteKeyResponse(DeleteKeyResponse.newBuilder())
.build(), omKeyInfo, ozoneManager.isRatisEnabled(),
omBucketInfo.copyObject(), dbOpenKey);
omBucketInfo.copyObject(), openKeyInfoMap);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please pass openKeyInfo instead of a Map here, and below OMKeyDeleteRequestWithFSO.java.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (!openKeyInfoMap.isEmpty()) {
for (Map.Entry<String, OmKeyInfo> entry : openKeyInfoMap.entrySet()) {
omMetadataManager.getOpenKeyTable(getBucketLayout()).putWithBatch(
batchOperation, entry.getKey(), entry.getValue());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashishkumar50 , since the file is kept in openKeyTable for lazy deletion, we shall also need to check this "DELETED_HSYNC_KEY" flag in allocateBlock, commitKey and recoverLease request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@jojochuang jojochuang requested a review from smengcl April 18, 2024 07:48
Copy link
Contributor

@ChenSammi ChenSammi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last patch LGTM. Thanks @ashishkumar50 .

@smengcl , would you like to take another look?

Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ashishkumar50 . The latest revision lgtm

@ChenSammi ChenSammi merged commit fd188d1 into apache:HDDS-7593 Apr 24, 2024
38 checks passed
@smengcl
Copy link
Contributor

smengcl commented Apr 29, 2024

Hi @ashishkumar50 , do we need to cherry-pick this to master branch as well?

chungen0126 pushed a commit to chungen0126/ozone that referenced this pull request May 3, 2024
chungen0126 pushed a commit to chungen0126/ozone that referenced this pull request May 3, 2024
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hbase HBase on Ozone support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants