Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip unsupported object names in list output #1876

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

gargnitingoogle
Copy link
Collaborator

@gargnitingoogle gargnitingoogle commented Apr 29, 2024

Description

Skip unsupported directory names in list prefixes. Warn on unsupported object names returned on listing.

GCS supports objects of path-type <bucket>/<path1>//<path2>
and treats it different from <bucket>/<path1>/<path2>, while they are both the same in linux filesystem.
GCSFuse being a POSIX-compliant file-system only support the latters and throws error on former. This error can disallow the listing of all directories in the parent directory i.e. <path1>.
The current change ignores the listing of prefixes (directory names) which are empty such
as "/" above to ignore the error and logs the above event as a warning.

This is followed up in

  1. Add unsupported directory substrings . and .. #2001, which adds . and .. and their derivative object names to the list of unsupported object names, And in
  2. Add e2e tests for fix for unsupported directory list #2011 which adds e2e tests for this change.

Link to the issue in case of a bug fix.

NA

Testing details

  1. Manual -
    • GCS bucket structure: 1. <bucket>//hello.txt, 2. <bucket>/a/hello.txt
    • GCSFuse mount command: gcsfuse --implicit-dirs --log-file=$logfile --debug_fuse --log-format=text $bucket $mountpath
    • Without the fix,
      ls $mountpath
      ls: reading directory $mountpath: Input/output error
      cat $logfile
      - No warning/error log -
    • With the fix,
          ls $mountpath
          a
          cat $logfile
          ...
          time=... severity=WARNING message="Ignoring unsupported prefix \"/\""
          ...
  2. Unit tests - NA
  3. Integration tests - NA

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 3d1d4bf to dfcbf92 Compare April 29, 2024 11:44
@gargnitingoogle gargnitingoogle added the execute-integration-tests Run only integration tests label Apr 29, 2024
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch 2 times, most recently from ed0d51d to 9024dc6 Compare April 30, 2024 09:11
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 9024dc6 to c1195c4 Compare May 4, 2024 20:45
internal/util/util_test.go Outdated Show resolved Hide resolved
internal/util/util_test.go Outdated Show resolved Hide resolved
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch 4 times, most recently from 02488f1 to cca204a Compare May 6, 2024 09:56
@gargnitingoogle gargnitingoogle marked this pull request as ready for review May 6, 2024 10:00
@gargnitingoogle gargnitingoogle requested review from ashmeenkaur and a team as code owners May 6, 2024 10:00
@gargnitingoogle gargnitingoogle changed the title Skip empty-directory names in listed prefixes Skip unsupported names in listed prefixes May 6, 2024
@gargnitingoogle
Copy link
Collaborator Author

Presubmit tests passed: logs

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 80306c2 to fbbc0b1 Compare May 8, 2024 09:18
Copy link
Collaborator

@ashmeenkaur ashmeenkaur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to take a look at the tests

internal/util/util.go Outdated Show resolved Hide resolved
internal/util/util.go Outdated Show resolved Hide resolved
internal/storage/bucket_handle.go Outdated Show resolved Hide resolved
internal/util/util.go Outdated Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from fbbc0b1 to dc55c22 Compare May 9, 2024 07:24
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from dc55c22 to 1d76261 Compare May 31, 2024 06:35
@gargnitingoogle gargnitingoogle removed the execute-integration-tests Run only integration tests label Jun 6, 2024
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch 2 times, most recently from 4acab17 to 2a971eb Compare June 11, 2024 04:17
Copy link

codecov bot commented Jun 11, 2024

Codecov Report

Attention: Patch coverage is 95.12195% with 2 lines in your changes missing coverage. Please review.

Project coverage is 78.39%. Comparing base (3b6804a) to head (5fb1408).

Files with missing lines Patch % Lines
internal/util/gcs_util.go 92.30% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1876      +/-   ##
==========================================
+ Coverage   78.38%   78.39%   +0.01%     
==========================================
  Files         107      108       +1     
  Lines       11793    11833      +40     
==========================================
+ Hits         9244     9277      +33     
- Misses       2058     2064       +6     
- Partials      491      492       +1     
Flag Coverage Δ
unittests 78.39% <95.12%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch 3 times, most recently from 827980a to cc6edfe Compare June 15, 2024 19:03
Copy link
Collaborator Author

@gargnitingoogle gargnitingoogle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added self-review comments.

internal/fs/inode/dir.go Outdated Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
internal/fs/inode/dir.go Outdated Show resolved Hide resolved
internal/storage/bucket_handle_test.go Show resolved Hide resolved
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from c5ced39 to 2fa79ce Compare June 17, 2024 09:31
@gargnitingoogle
Copy link
Collaborator Author

@ashmeenkaur @sethiay I've resolved all your comments. Please review again.

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 7348976 to 376c68e Compare June 18, 2024 10:13
internal/fs/inode/dir.go Show resolved Hide resolved
internal/util/gcs_util_test.go Show resolved Hide resolved
internal/storage/bucket_handle_test.go Show resolved Hide resolved
internal/fs/inode/dir.go Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
internal/fs/implicit_dirs_test.go Outdated Show resolved Hide resolved
AssertEq(4, len(entries))
verifyFileInfo(entries[0], "4", false)
verifyFileInfo(entries[1], "a", true)
verifyFileInfo(entries[2], "bar", true)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this expected? There is no valid object having "bar" directory. I think this will cause issues in case of RemoveDir operation as the object won't be listed during the operation. Please check.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this expected? There is no valid object having "bar" directory.

Yes. This is the cost of using MaxResults=1 for lookUpInode.

If we have objects bar//1, and bar/2 in a bucket, gcs list call during lookUpInode with MaxResults=1 will return bar//1 . If we reject unsupported object bar//1 at the time of lookup-inode, then bar will never be identified as a directory by GCSFuse, and thus bar/2 will never be accessible through GCSFuse, which is no better than the current bug. Now, even if there is no object named bar/2, we can not know from gcs list call with MaxResults=1, and we end up having to identify bar as a directory, albeit empty.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will cause issues in case of RemoveDir operation as the object won't be listed during the operation

I'm not sure I understand what you mean by issues. Are you saying that bar//1 will not be deleted, or that it'll be deleted without being listed ?

I'll test out RemoveDir manually to see what the behaviour is.

Copy link
Collaborator Author

@gargnitingoogle gargnitingoogle Jun 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll test out RemoveDir manually to see what the behaviour is.

I tested this out manually, in the above example, if I try and delete/rename bar in the gcsfuse mount, it only deletes/renames bar/2 and has no impact on bar//1 in GCS . This looks like the correct behaviour to me as we intend to have bar//1 be invisible to a GCSFuse mount.

@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch 2 times, most recently from 179725a to 1d4da0e Compare June 18, 2024 16:38
@gargnitingoogle
Copy link
Collaborator Author

This PR is causing a problem with rm -r on a nested directory, so marking it as draft until that issue is resolved.

@gargnitingoogle gargnitingoogle marked this pull request as draft June 24, 2024 05:16
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from c18507e to b867458 Compare September 30, 2024 05:43
@@ -234,7 +234,7 @@ func (bm *bucketManager) SetUpBucket(
}

// Periodically garbage collect temporary objects
go garbageCollect(bm.gcCtx, bm.config.TmpObjectPrefix, sb)
//go garbageCollect(bm.gcCtx, bm.config.TmpObjectPrefix, sb)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undo this.

@gargnitingoogle gargnitingoogle removed the execute-integration-tests Run only integration tests label Sep 30, 2024
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch 3 times, most recently from 28c8bfe to 00ff9c4 Compare October 3, 2024 10:50
@ashmeenkaur ashmeenkaur removed their request for review October 4, 2024 05:50
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 00ff9c4 to 5fb1408 Compare October 7, 2024 10:05
@gargnitingoogle gargnitingoogle force-pushed the gargnitin/fix-empty-directory-list-issue branch from 5fb1408 to 8635cfb Compare October 7, 2024 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants