Synced data always has either missing or extra rows #86

Open
ysq5202121 opened this issue Jul 18, 2023 · 2 comments

Comments

@ysq5202121

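I am syncing data to ClickHouse through the SQL Server CDC connector.
ClickHouse always ends up with either more or fewer rows than the source. (The target is a ReplacingMergeTree table with a unique ID, and the two sides still do not match after running a merge.)
'url' = '',
'database-name' = '',
'table-name' = '**',
'sink.batch-size' = '10000',
'use-local' = 'true',
'sink.flush-interval' = '120s',
'sink.max-retries' = '3'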

@itinycheng
Owner

itinycheng commented Jul 19, 2023

@ysq5202121
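If the table has a primary key, the sink runs in upsert mode.
The data produced by CDC includes update and delete records. These are converted to ClickHouse ALTER statements at sink time, and when they are written to ClickHouse in batches, the problem you describe can occur.
Take a look at sink.update-strategy and sink.ignore-delete; they may help.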
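
For illustration, the two options could be set on the sink table roughly as follows. This is a minimal sketch, not a confirmed fix: the table name `clickhouse_sink`, the columns, and the angle-bracket placeholders are hypothetical, and the values `'insert'` for sink.update-strategy and `'true'` for sink.ignore-delete are assumptions that should be checked against the connector README for the version in use. The remaining options are copied from the original post.

```sql
-- Hypothetical sink definition; columns, URL, database and table are placeholders.
CREATE TABLE clickhouse_sink (
    id BIGINT,
    name STRING,
    PRIMARY KEY (id) NOT ENFORCED  -- a primary key puts the sink in upsert mode
) WITH (
    'connector' = 'clickhouse',
    'url' = 'clickhouse://<host>:<port>',
    'database-name' = '<database>',
    'table-name' = '<table>',
    'use-local' = 'true',
    'sink.batch-size' = '10000',
    'sink.flush-interval' = '120s',
    'sink.max-retries' = '3',
    -- Assumed value: write update records as plain INSERTs and let
    -- ReplacingMergeTree collapse rows that share a sorting key,
    -- rather than issuing ALTER ... UPDATE statements.
    'sink.update-strategy' = 'insert',
    -- Assumed value: drop delete records instead of turning them
    -- into ALTER ... DELETE statements.
    'sink.ignore-delete' = 'true'
);
```

With these settings, rows deleted in SQL Server would remain in ClickHouse, so the counts can still differ by the number of source deletes. Duplicates also remain visible until a merge runs, unless the query deduplicates at read time, for example with FINAL.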

@ysq5202121
Author

@itinycheng This is baffling. It feels like this kind of sync is not really usable... I have no control over it. Sigh.


2 participants