sync: the include and exclude options consistent with rsync behavior #1554

Merged
merged 12 commits into from
Mar 18, 2022
3 changes: 3 additions & 0 deletions cmd/sync.go
@@ -281,6 +281,9 @@ func isS3PathType(endpoint string) bool {
 
 func doSync(c *cli.Context) error {
 	setup(c, 2)
+	if c.IsSet("include") && !c.IsSet("exclude") {
+		logger.Warnf("The include option needs to be used with the exclude option, so the result of the current sync may not match your expectations")
+	}
 	config := sync.NewConfigFromCli(c)
 	go func() { _ = http.ListenAndServe(fmt.Sprintf("127.0.0.1:%d", config.HTTPPort), nil) }()
 
10 changes: 9 additions & 1 deletion docs/en/reference/command_reference.md
@@ -512,6 +512,10 @@ The format of both the source and destination paths is `[NAME://][ACCESS_KEY:SEC
 - `BUCKET[.ENDPOINT]`: The access address of the data storage service, the format may be different for different storage types, please refer to [document](how_to_setup_object_storage.md#supported-object-storage).
 - `[/PREFIX]`: Optional, a prefix for the source and destination paths that can be used to limit the synchronization to only data in certain paths.
 
+:::note
+To refer to a directory in `SRC` or `DST`, make sure the path ends with `/` or `\`; otherwise it is treated as a prefix of the object name.
+:::
+
 #### Options
 
 `--start KEY, -s KEY`<br />
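The note added in this hunk distinguishes a directory path from an object-name prefix. A minimal sketch of the difference, assuming keys are selected by a plain string-prefix comparison (`matchesPrefix` and the sample keys are hypothetical and not part of JuiceFS):

```go
package main

import (
	"fmt"
	"strings"
)

// matchesPrefix is a hypothetical helper (not JuiceFS code): it reports
// whether an object key would be selected by a plain prefix comparison.
func matchesPrefix(key, prefix string) bool {
	return strings.HasPrefix(key, prefix)
}

func main() {
	// Sample keys invented for this illustration.
	keys := []string{"data/a.txt", "data2/b.txt", "database.sql"}
	for _, prefix := range []string{"data", "data/"} {
		for _, key := range keys {
			fmt.Printf("prefix %q, key %q: selected=%v\n", prefix, key, matchesPrefix(key, prefix))
		}
	}
}
```

Under this assumption, `data` selects all three keys, while `data/` selects only `data/a.txt`.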
@@ -551,7 +555,11 @@ delete extraneous objects from destination (default: false)
 exclude keys containing PATTERN (POSIX regular expressions)
 
 `--include PATTERN`<br />
-only include keys containing PATTERN (POSIX regular expressions)
+must be used together with `--exclude PATTERN`; keys matching PATTERN (POSIX regular expressions) are not excluded
+
+:::tip
+The order in which `--exclude` and `--include` are given affects the result. Each object is matched against the two options in the order they appear on the command line. Once an option's pattern matches, the object is excluded or included according to that option, and later options are not tried. An object that matches neither option is included by default.
+:::
 
 `--manager value`<br />
 manager address
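The first-match-wins rule described in the tip can be sketched as follows. This is an illustration under stated assumptions, not the JuiceFS implementation: `rule`, `newRule`, and `shouldSync` are hypothetical names, and Go's `regexp` package stands in for whatever pattern matching the tool uses.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is a hypothetical type for this sketch: one --include or --exclude
// option, kept in the order it appeared on the command line.
type rule struct {
	include bool
	pattern *regexp.Regexp
}

// newRule builds a rule from a pattern string; it panics on an invalid pattern.
func newRule(include bool, expr string) rule {
	return rule{include: include, pattern: regexp.MustCompile(expr)}
}

// shouldSync applies first-match-wins: the first rule whose pattern matches
// the key decides the outcome, and a key that matches no rule is included.
func shouldSync(key string, rules []rule) bool {
	for _, r := range rules {
		if r.pattern.MatchString(key) {
			return r.include
		}
	}
	return true
}

func main() {
	inc := newRule(true, `^ab1$`)
	exc := newRule(false, `^ab`)
	for _, key := range []string{"ab1", "ab2", "c1"} {
		fmt.Printf("%s: include first=%v, exclude first=%v\n", key,
			shouldSync(key, []rule{inc, exc}), shouldSync(key, []rule{exc, inc}))
	}
}
```

With these sample patterns, `ab1` is synced only when the include rule comes first, `ab2` is excluded in both orders, and `c1` matches nothing and is synced in both orders.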
12 changes: 11 additions & 1 deletion docs/zh_cn/reference/command_reference.md
@@ -510,6 +510,10 @@ juicefs sync [command options] SRC DST
 - `BUCKET[.ENDPOINT]`: The access address of the data storage service. The format may differ between storage types; for details, please refer to the [document](how_to_setup_object_storage.md#支持的存储服务).
 - `[/PREFIX]`: Optional, a prefix for the source and destination paths that can be used to limit the synchronization to only data in certain paths.
 
+:::tip
+To refer to a directory in `SRC` or `DST`, make sure the path ends with `/` or `\`; otherwise it is treated as a prefix of the object name.
+:::
+
 #### Options
 
 `--start KEY, -s KEY`<br />
@@ -549,7 +553,13 @@ juicefs sync [command options] SRC DST
 skip object names containing PATTERN (POSIX regular expressions)
 
 `--include PATTERN`<br />
-only sync object names containing PATTERN (POSIX regular expressions)
+must be used together with `--exclude`; files matching PATTERN (POSIX regular expressions) are not excluded
+
+
+:::tip
+The order in which `--exclude` and `--include` are given affects the result. Each object is matched against the two options in the order they appear. Once an option's PATTERN matches, the object is excluded or included according to that option, and later options are not tried. An object that matches neither option is included by default.
+:::
+
 
 `--manager value`<br />
 manager address
42 changes: 34 additions & 8 deletions pkg/sync/sync.go
@@ -686,20 +686,46 @@ func filter(keys <-chan object.Object, include, exclude []string) <-chan object.
 	inc := compileExp(include)
 	exc := compileExp(exclude)
 	r := make(chan object.Object)
+
+	var includeBeforeExclude bool
+	for _, arg := range os.Args {
+		if arg == "--include" || arg == "-include" {
+			includeBeforeExclude = true
+			break
+		} else if arg == "--exclude" || arg == "-exclude" {
+			break
+		}
+	}
+
 	go func() {
 		for o := range keys {
 			if o == nil {
 				break
 			}
-			if findAny(o.Key(), exc) {
-				logger.Debugf("exclude %s", o.Key())
-				continue
-			}
-			if len(inc) > 0 && !findAny(o.Key(), inc) {
-				logger.Debugf("%s is not included", o.Key())
-				continue
+			// Consistent with rsync behavior, the matching order is adjusted according to the order of the "include" and "exclude" options
+			if includeBeforeExclude {
+				if len(inc) > 0 && findAny(o.Key(), inc) {
+					logger.Debugf("%s is included", o.Key())
+					r <- o
+					continue
+				}
+				if findAny(o.Key(), exc) {
+					logger.Debugf("exclude %s", o.Key())
+					continue
+				}
+				r <- o
+			} else {
+				if findAny(o.Key(), exc) {
+					logger.Debugf("exclude %s", o.Key())
+					continue
+				}
+				if len(inc) > 0 && findAny(o.Key(), inc) {
+					logger.Debugf("%s is included", o.Key())
+					r <- o
+					continue
+				}
+				r <- o
 			}
-			r <- o
 		}
 		close(r)
 	}()
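The order detection in `filter` compares each argument for exact equality with the flag names. A minimal sketch of that loop as a standalone function, assuming the arguments are passed in as a slice rather than read from `os.Args` (`includeComesFirst` is a hypothetical name used only here):

```go
package main

import "fmt"

// includeComesFirst mirrors the loop in filter: it reports whether an include
// flag appears before any exclude flag in args. Taking args as a parameter
// instead of reading os.Args is a choice made for this sketch only.
func includeComesFirst(args []string) bool {
	for _, arg := range args {
		switch arg {
		case "--include", "-include":
			return true
		case "--exclude", "-exclude":
			return false
		}
	}
	// Neither flag was found as a separate argument.
	return false
}

func main() {
	fmt.Println(includeComesFirst([]string{"sync", "--include", "a.*", "--exclude", "ab.*"}))
	fmt.Println(includeComesFirst([]string{"sync", "--exclude", "ab.*", "--include", "a.*"}))
	fmt.Println(includeComesFirst([]string{"sync", "--include=a.*", "--exclude", "ab.*"}))
}
```

Because the comparison is exact, an argument written as `--include=a.*` is not recognized as the include flag, so the third call reports exclude first. Passing the slice explicitly also lets a test set the order without assigning to the global `os.Args`.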
51 changes: 49 additions & 2 deletions pkg/sync/sync_test.go
@@ -18,6 +18,7 @@ package sync
 import (
 	"bytes"
 	"math"
+	"os"
 	"reflect"
 	"testing"
 
@@ -83,7 +84,11 @@ func deepEqualWithOutMtime(a, b object.Object) bool {
 }
 
 // nolint:errcheck
-func TestSync(t *testing.T) {
+func TestSyncExcludeBeforeInclude(t *testing.T) {
+	defer func() {
+		_ = os.RemoveAll("/tmp/a")
+		_ = os.RemoveAll("/tmp/b")
+	}()
 	config := &Config{
 		Start:     "",
 		End:       "",
@@ -98,7 +103,7 @@ func TestSync(t *testing.T) {
 		Verbose:   false,
 		Quiet:     true,
 	}
-
+	os.Args = append([]string{}, "--exclude", "--include")
 	a, _ := object.CreateStorage("file", "/tmp/a/", "", "")
 	a.Put("a", bytes.NewReader([]byte("a")))
 	a.Put("ab", bytes.NewReader([]byte("ab")))
@@ -159,3 +164,45 @@ func TestSync(t *testing.T) {
 		t.Fatalf("sync: %s", err)
 	}
 }
+
+// nolint:errcheck
+func TestSyncIncludeBeforeExclude(t *testing.T) {
+	defer func() {
+		_ = os.RemoveAll("/tmp/a")
+		_ = os.RemoveAll("/tmp/b")
+	}()
+	config := &Config{
+		Start:     "",
+		End:       "",
+		Threads:   50,
+		Update:    true,
+		Perms:     true,
+		Dry:       false,
+		DeleteSrc: false,
+		DeleteDst: false,
+		Exclude:   []string{"ab.*"},
+		Include:   []string{"[a|b].*", "c1"},
+		Verbose:   false,
+		Quiet:     true,
+	}
+	os.Args = append([]string{}, "--include", "--exclude")
+	a, _ := object.CreateStorage("file", "/tmp/a/", "", "")
+	a.Put("a1", bytes.NewReader([]byte("a1")))
+	a.Put("b1", bytes.NewReader([]byte("b1")))
+	a.Put("ab1", bytes.NewReader([]byte("ab1")))
+	a.Put("ab2", bytes.NewReader([]byte("ab2")))
+	a.Put("c1", bytes.NewReader([]byte("c1")))
+	a.Put("c2", bytes.NewReader([]byte("c2")))
+
+	b, _ := object.CreateStorage("file", "/tmp/b/", "", "")
+	b.Put("a1", bytes.NewReader([]byte("a1")))
+
+	// Now a: {"a1", "b1", "ab1", "ab2", "c1", "c2"}, b: {"a1"}
+	// Copy: "b1", "ab1", "ab2", "c1", "c2"
+	if err := Sync(a, b, config); err != nil {
+		t.Fatalf("sync: %s", err)
+	}
+	if c := copied.Current(); c != 5 {
+		t.Fatalf("should copy 5 keys, but got %d", c)
+	}
+}
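
The expected count of 5 in `TestSyncIncludeBeforeExclude` can be checked by hand with a small model of the filter. This is a sketch, not the test's code: `compile` and `findAny` are stand-ins for helpers whose bodies are not shown in this diff, Go's `regexp` package is assumed for matching, and a key already present in the destination (`a1`) is assumed not to be copied again.

```go
package main

import (
	"fmt"
	"regexp"
)

// compile is a stand-in for compileExp, whose body is not shown in the diff.
func compile(exprs ...string) []*regexp.Regexp {
	res := make([]*regexp.Regexp, 0, len(exprs))
	for _, e := range exprs {
		res = append(res, regexp.MustCompile(e))
	}
	return res
}

// findAny is a stand-in for the helper of the same name: it reports whether
// any pattern matches s.
func findAny(s string, pats []*regexp.Regexp) bool {
	for _, p := range pats {
		if p.MatchString(s) {
			return true
		}
	}
	return false
}

// keep models the two branches of filter for a single key.
func keep(key string, inc, exc []*regexp.Regexp, includeFirst bool) bool {
	if includeFirst && findAny(key, inc) {
		return true
	}
	// In exclude-first order the include check cannot change the outcome,
	// because keys that match nothing are passed through anyway.
	return !findAny(key, exc)
}

// countCopies counts source keys that pass the filter and are not already in
// the destination, using the patterns and keys from the test.
func countCopies(includeFirst bool) int {
	inc := compile("[a|b].*", "c1")
	exc := compile("ab.*")
	src := []string{"a1", "b1", "ab1", "ab2", "c1", "c2"}
	dst := map[string]bool{"a1": true}
	n := 0
	for _, key := range src {
		if keep(key, inc, exc, includeFirst) && !dst[key] {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countCopies(true), countCopies(false))
}
```

With the include option first, `[a|b].*` matches `ab1` and `ab2` before `ab.*` is tried, so five keys are copied. With the exclude option first, `ab1` and `ab2` are dropped and only `b1`, `c1`, and `c2` remain, giving three.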