Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lakectl local: implement init #6280

Merged
merged 11 commits into from
Aug 2, 2023
Merged

lakectl local: implement init #6280

merged 11 commits into from
Aug 2, 2023

Conversation

N-o-Z
Copy link
Member

@N-o-Z N-o-Z commented Jul 30, 2023

Linked Issue

Closes #6241


Change Description

Background

Implement first lakectl local command which initializes a local folder with a lakefs paths

New Feature

Part of task #6239

Testing Details

Unit testing for tools, lakectl tests will be added for esti upon feature completion

Breaking Change?

No


@N-o-Z N-o-Z added exclude-changelog PR description should not be included in next release changelog lakectl-local labels Jul 30, 2023
@N-o-Z N-o-Z requested review from nopcoder and a team July 30, 2023 14:07
@N-o-Z N-o-Z self-assigned this Jul 30, 2023
@N-o-Z N-o-Z force-pushed the task/lakectl-local-init-6241 branch from 77a59e8 to 9f86e01 Compare July 30, 2023 14:08
@github-actions
Copy link

github-actions bot commented Jul 30, 2023

🎊 PR Preview ed2c277 has been successfully built and deployed to https://treeverse-lakeFS-preview-pr-6280.surge.sh

🕐 Build time: 0.021s

🤖 By surge-preview

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We have pkg/fileutil for this kind of helpers
  2. Not sure we need this one - the code uses it to validate if path (string) is directory but we actually need to verify if there is a file there or something else which means that we eat the error and assume it is a file.

Copy link
Member Author

@N-o-Z N-o-Z Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to fileutil
We check if the path is an existing dir in order to add the /* prefix to it.
This is a best effort as we cannot tell for the path string itself if it is a directory. In any other case (including if it doesn't exist) we want to return false.
I've update the code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General suggestion - package name is git so we can drop it from a lot of the function names.

pkg/git/git.go Outdated
Comment on lines 21 to 37
func git(dir string, args ...string) (int, string) {
stdout := new(strings.Builder)
stderr := new(strings.Builder)
cmd := exec.Command("git", args...)
cmd.Dir = dir
cmd.Stdout = stdout
cmd.Stderr = stderr
err := cmd.Run()
if err != nil {
var exitError *exec.ExitError
if errors.As(err, &exitError) {
return exitError.ExitCode(), stderr.String()
}
return -1, err.Error()
}
return 0, stdout.String()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cmd.CombinedOutput() should give you all you need - it will capture the stderr and stdout and we can use err is indicator of success as we do not associate any specific code to an error.

pkg/git/git.go Outdated
Comment on lines 39 to 43
// IsGitRepository Return true if dir is a path to a directory in a git repository, false otherwise
func IsGitRepository(dir string) bool {
rc, _ := git(dir, "rev-parse", "--is-inside-work-tree")
return rc == 0
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we use it for testing only?
lakectl use it before calling git.Ignore but git.Ignore should verify the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be used in future PRs

pkg/git/git.go Outdated
}

// GetGitRepositoryPath Returns the git repository root path if dir is a directory inside a git repository, otherwise returns error
func GetGitRepositoryPath(dir string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RepositoryPath

cmd/lakectl/cmd/local_init.go Show resolved Hide resolved
cmd/lakectl/cmd/local_init.go Outdated Show resolved Hide resolved
if err != nil {
DieErr(err)
}
Fmt("location added to %s\n", ignoreFile)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current location? we print out the ignore file which is already the same as our repository, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - we don't know where the ignore file is relative to where we executed the command. It could be anywhere on the tree or it could have been created by us

cmd/lakectl/cmd/local_init.go Outdated Show resolved Hide resolved
pkg/git/git.go Outdated
Comment on lines 97 to 148
markerLine := "# " + marker
info, err := os.Stat(ignoreFilePath)
switch {
case err == nil: // ignore file exists
mode = info.Mode()
ignoreFile, err := os.Open(ignoreFilePath)
if err != nil {
return "", err
}
fileScanner := bufio.NewScanner(ignoreFile)
for fileScanner.Scan() {
line := strings.TrimSpace(fileScanner.Text())
ignoreContent = append(ignoreContent, line)
if line == markerLine {
found = true
for fileScanner.Scan() {
line = strings.TrimSpace(fileScanner.Text())
if line == "" {
ignoreContent = append(ignoreContent, "")
break
}
if !slices.Contains(ignoreEntries, line) {
ignoreContent = append(ignoreContent, line)
}
}
ignoreContent = append(ignoreContent, ignoreEntries...)
}
}

if !found { // Add the marker and ignore list to the end of the file
ignoreContent = append(ignoreContent, "")
ignoreContent = append(ignoreContent, markerLine)
ignoreContent = append(ignoreContent, ignoreEntries...)
}

err = ignoreFile.Close()
if err != nil {
return "", err
}
case !os.IsNotExist(err):
return "", err
default: // File doesn't exist
ignoreContent = append(ignoreContent, markerLine)
ignoreContent = append(ignoreContent, ignoreEntries...)
}

buffer := strings.Join(ignoreContent, "\n") + "\n"
if err = os.WriteFile(ignoreFilePath, []byte(buffer), mode); err != nil {
return "", err
}

return ignoreFilePath, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we split into 'readIgnore', 'updateIgnore' and 'writeIgnore'?
Think we should not mix the marker update and reading.

}

if git.IsGitRepository(localPath) {
ignoreFile, err := git.Ignore(localPath, []string{localPath, IndexFileName}, []string{IndexFileName}, IgnoreMarker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a flag to skip update of gitignore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No - why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll open a different issue to explain the use-case I had in mind

pkg/git/git.go Outdated
// If section exists, it will append paths to the given section, otherwise writes the sections at the end of the file.
// File paths must be absolute.
// Creates the .ignore file if it doesn't exist.
func Ignore(dir string, ignorePaths, excludePaths []string, marker string) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

general note - if we use io/fs (passing fs.FS to this call) we can abstract the os level and have the code test better without doing actual os level calls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can pass an FS object since the dir passed might not be the git repository root

@N-o-Z N-o-Z requested a review from nopcoder July 31, 2023 14:10
type Index struct {
root string
PathURI string `yaml:"src"`
Head string `yaml:"at_head"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Head string `yaml:"at_head"`
AtHead string `yaml:"at_head"`

cmd/lakectl/cmd/local_init.go Show resolved Hide resolved
}

if git.IsGitRepository(localPath) {
ignoreFile, err := git.Ignore(localPath, []string{localPath, IndexFileName}, []string{IndexFileName}, IgnoreMarker)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll open a different issue to explain the use-case I had in mind

// Index defines the structure of the lakefs local reference file
// consisting of the information linking local directory with lakefs path
type Index struct {
root string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
root string
root string `yaml:"-"`

return false
default:
DieErr(err)
return false // go fmt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go fmt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's going to change in the next PR

Comment on lines 14 to 15
initMinArgs = 1
initMaxArgs = 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generic name - need to add localInit or it may conflict with other geneic name soon

dir = args[1]
}
flagSet := cmd.Flags()
force := MustBool(flagSet.GetBool("force"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can start using the generic Must

pkg/git/git.go Outdated
Comment on lines 70 to 92
func updateIgnoreFileSection(contents []byte, marker string, entries []string) []byte {
var newContent []byte
scanner := bufio.NewScanner(bytes.NewReader(contents))
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
newContent = append(newContent, []byte(line+"\n")...)
if line == marker {
for scanner.Scan() {
line = strings.TrimSpace(scanner.Text())
if line == "" {
break
}
if !slices.Contains(entries, line) {
newContent = append(newContent, []byte(line+"\n")...)
}
}
buffer := strings.Join(entries, "\n") + "\n"
newContent = append(newContent, buffer...)
}
}

return newContent
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to use API which uses the OS's new-line or use the the right value for new-line when we write to a file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did my best - golang doesn't have a definition for new line. Instead I used fmt.Sprintln()

@N-o-Z N-o-Z requested a review from nopcoder August 1, 2023 18:42
@N-o-Z N-o-Z force-pushed the task/lakectl-local-init-6241 branch from 0a85eae to 305d1ec Compare August 2, 2023 06:25
Copy link
Contributor

@nopcoder nopcoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added optional suggestions.
My only concern is the marker range in the gitignore - we should consider using different marker for start and another for end to mark the section.
A user can update the ignore file and remove the blank line the next update will move part that is not of the marker.

DieErr(err)
}
if IndexExists(localPath) && !force {
Die(fmt.Sprintf("directory '%s' already linked to a lakefs path, run command with --force to overwrite", localPath), 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DieFmt

Comment on lines 51 to 55
if err != nil && !errors.Is(err, git.ErrNotARepository) {
DieErr(err)
} else if err == nil {
fmt.Println("location added to", ignoreFile)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative indent vs less 'if'

Suggested change
if err != nil && !errors.Is(err, git.ErrNotARepository) {
DieErr(err)
} else if err == nil {
fmt.Println("location added to", ignoreFile)
}
if err != nil {
if !errors.Is(err, git.ErrNotARepository) {
DieErr(err)
}
fmt.Println("location added to", ignoreFile)
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly, but modified similarly

Comment on lines +70 to +92
func updateIgnoreFileSection(contents []byte, marker string, entries []string) []byte {
var newContent []byte
scanner := bufio.NewScanner(bytes.NewReader(contents))
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
newContent = append(newContent, []byte(fmt.Sprintln(line))...)
if line == marker {
for scanner.Scan() {
line = strings.TrimSpace(scanner.Text())
if line == "" {
break
}
if !slices.Contains(entries, line) {
newContent = append(newContent, []byte(fmt.Sprintln(line))...)
}
}
buffer := strings.Join(entries, fmt.Sprintln("")) + fmt.Sprintln("")
newContent = append(newContent, buffer...)
}
}

return newContent
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest the following to update/append the content:

func updateIgnoreFileSection(contents []byte, marker string, entries []string) ([]byte, error) {
	// Read content and remove/collect entries under marker
	var (
		result   bytes.Buffer
		inMarker bool
		scanner  = bufio.NewScanner(bytes.NewReader(contents))
	)
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		switch {
		case inMarker && line == "":
			inMarker = false
		case inMarker && !slices.Contains(entries, line):
			entries = append(entries, line)
		case !inMarker && line == marker:
			inMarker = true
		default:
			_, _ = fmt.Fprintln(&result, line)
		}
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}

	// Render marker and entries at the end
	_, _ = fmt.Fprintln(&result, marker)
	for _, entry := range entries {
		_, _ = fmt.Fprintln(&result, entry)
	}
	_, _ = fmt.Fprintln(&result)

	return result.Bytes(), nil
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would prefer not to rewrite it to the end of the file for now. I will open another PR to also add an end marker

@N-o-Z N-o-Z enabled auto-merge (squash) August 2, 2023 08:14
@N-o-Z N-o-Z merged commit 8a5596b into master Aug 2, 2023
36 of 37 checks passed
@N-o-Z N-o-Z deleted the task/lakectl-local-init-6241 branch August 2, 2023 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exclude-changelog PR description should not be included in next release changelog lakectl-local
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement lakectl local init
2 participants