Skip to content
This repository has been archived by the owner on Jul 24, 2024. It is now read-only.

storage: fix s3 walk directory #750

Merged
merged 6 commits into from
Feb 25, 2021
Merged

storage: fix s3 walk directory #750

merged 6 commits into from
Feb 25, 2021

Conversation

glorv
Copy link
Collaborator

@glorv glorv commented Feb 24, 2021

What problem does this PR solve?

close #749

What is changed and how it works?

#713 tried to fix the s3 WalkDir prefix issue in the wrong way and introduced another bug.

The original issue is that if prefix and subDir are both empty, the prefix parameter in the ListObject request then is "/", which causes s3 to return empty results.

The PR avoids adding the "/" postfix for the s3Storage object if the original prefix is an empty string.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

Side effects

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release Note

@@ -279,8 +279,10 @@ func newS3Storage(backend *backup.S3, opts *ExternalStorageOptions) (*S3Storage,
return nil, errors.Annotatef(berrors.ErrStorageInvalidConfig, "Bucket %s is not accessible: %v", qs.Bucket, err)
}
}
if len(qs.Prefix) > 0 {
qs.Prefix += "/"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if there is a '/' already?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we just path.Clean it 🤷

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Prefix is parsed from url it can't contains the "/" suffix. See:

br/pkg/storage/parse.go

Lines 69 to 70 in 578be7f

prefix := strings.Trim(u.Path, "/")
s3 := &backup.S3{Bucket: u.Host, Prefix: prefix}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just not want the bare "/" prefix. for other non-empty prefixes, the suffix "/" is needed.
BTW, maybe we shouldn't depend on the Prefix filed to contain the "/" suffix, but in this way, we need to change more code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the user wants to load all data from one bucket?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original pr is try to resolve this issue then. If the input path is the bucket name, we should leave the prefix parameter empty in the ListObject request, otherwise, s3 will return empty result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

	if len(prefix) > 0 && !strings.HasSuffix(prefix, "/") {

What about checking suffix too? In case the API is called from other projects.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if len(prefix) > 0 && !strings.HasSuffix(prefix, "/") {
prefix += "/"
}
prefix := rs.options.Prefix + opt.SubDir
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about path.Join?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path.Join cannot work on windows since it separator is not "/". And options.Prefix already contains the suffix "/"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glorv path.Join always use /. you're thinking of filepath.Join 😛

@overvenus overvenus added this to the v4.0.12 milestone Feb 24, 2021
@glorv glorv changed the title fix s3 walk directory storage: fix s3 walk directory Feb 24, 2021
@kennytm
Copy link
Collaborator

kennytm commented Feb 24, 2021

@kennytm
Copy link
Collaborator

kennytm commented Feb 24, 2021

/run-integration-test

@3pointer
Copy link
Collaborator

https://internal.pingcap.net/idc-jenkins/blue/organizations/jenkins/br_ghpr_unit_and_integration_test/detail/br_ghpr_unit_and_integration_test/4889/pipeline

[2021-02-24T03:59:46.545Z] TEST: [br_full_ddl] fail due to stats are not equal

😥

#690 fix this test, we can merge it first

@@ -881,6 +881,31 @@ func (s *s3Suite) TestWalkDir(c *C) {
}, nil
}).
After(secondCall)
forthCall := s.s3.EXPECT().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

Suggested change
forthCall := s.s3.EXPECT().
fourthCall := s.s3.EXPECT().

Contents: contents[4:],
}, nil
}).
After(forthCall)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
After(forthCall)
After(fourthCall)

@@ -279,8 +279,10 @@ func newS3Storage(backend *backup.S3, opts *ExternalStorageOptions) (*S3Storage,
return nil, errors.Annotatef(berrors.ErrStorageInvalidConfig, "Bucket %s is not accessible: %v", qs.Bucket, err)
}
}
if len(qs.Prefix) > 0 {
qs.Prefix += "/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the user wants to load all data from one bucket?

Copy link
Contributor

@lichunzhu lichunzhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@ti-srebot ti-srebot added the status/LGT1 LGTM1 label Feb 24, 2021
Copy link
Member

@overvenus overvenus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

@@ -279,8 +279,10 @@ func newS3Storage(backend *backup.S3, opts *ExternalStorageOptions) (*S3Storage,
return nil, errors.Annotatef(berrors.ErrStorageInvalidConfig, "Bucket %s is not accessible: %v", qs.Bucket, err)
}
}
if len(qs.Prefix) > 0 {
qs.Prefix += "/"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

	if len(prefix) > 0 && !strings.HasSuffix(prefix, "/") {

What about checking suffix too? In case the API is called from other projects.

@overvenus
Copy link
Member

LGTM

@ti-srebot ti-srebot removed the status/LGT1 LGTM1 label Feb 25, 2021
@ti-srebot ti-srebot added the status/LGT2 LGTM2 label Feb 25, 2021
@glorv
Copy link
Collaborator Author

glorv commented Feb 25, 2021

/merge

@ti-srebot
Copy link
Contributor

/run-all-tests

@ti-srebot ti-srebot merged commit 577dc24 into pingcap:master Feb 25, 2021
ti-srebot pushed a commit to ti-srebot/br that referenced this pull request Feb 25, 2021
@ti-srebot
Copy link
Contributor

cherry pick to release-4.0 in PR #756

@glorv glorv deleted the fix-walk branch February 25, 2021 10:43
ti-srebot added a commit that referenced this pull request Mar 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Import from S3 seems to scan parent/root of given path.
6 participants