Use go-tpc instead of benchmarksql #1

Merged
merged 9 commits into from
Mar 17, 2020

Conversation

yeya24
Collaborator

@yeya24 yeya24 commented Mar 6, 2020

Signed-off-by: yeya24 [email protected]

This PR uses go-tpc to replace the previous benchmarksql.

Tasks:

  • Update code
  • Manual tests
  • Update comments and docs

@yeya24 yeya24 requested a review from XuHuaiyu March 6, 2020 20:36
Signed-off-by: yeya24 <[email protected]>

update tables params

Signed-off-by: yeya24 <[email protected]>
@yeya24 yeya24 changed the title Use go-tpc instead of benchmarksql WIP: Use go-tpc instead of benchmarksql Mar 7, 2020
Collaborator

@XuHuaiyu XuHuaiyu left a comment

In addition to the existing comments, two more points:

  1. Add a flag to launch the tpcc test
  2. Add a readme to the repo

main.go Outdated
// empty means generating all tables
specifiedTables = []string{""}
case 2:
specifiedTables = []string{"--csv.tables customer", "--csv.tables stock,orders"}
Collaborator

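  1. With 3 lightning instances, split the tables as follows:
  • stock
  • order_line
  • customer, config, district, history, item, new_order, oorder, warehouse
  2. With 2 lightning instances, split them as follows:
  • stock
  • order_line, customer, config, district, history, item, new_order, oorder, warehouse
  3. The current lightning import bottleneck is the stock table (about 2 hours), followed by the order_line table (about 1h30m); the remaining tables together take about 1h. So when there are more than 3 lightning machines, only 3 need to be used. Adding more lightning instances will not shorten the overall import time.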

Collaborator Author

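@XuHuaiyu In go-tpc, generating order_line depends on the orders table. So with 3 lightning instances, order_line and orders have to be generated on the same machine. How should we adjust for that? Shall we just generate order_line and orders on the second machine?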
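The split discussed in this thread could be sketched roughly as below. This is only an illustration, not the code merged in this PR: the function names `tableGroups` and `csvTablesArg` are hypothetical, the table names are assumed to follow go-tpc's TPC-C schema (`orders` rather than `oorder`, no `config` table), and `order_line` is placed with `orders` as proposed above. The `--csv.tables` argument format is taken from the main.go snippet under review.

```go
package main

import (
	"fmt"
	"strings"
)

// restTables lists the smaller tables. The names are an assumption based on
// go-tpc's TPC-C schema, not taken from the merged code.
var restTables = []string{"customer", "district", "history", "item", "new_order", "warehouse"}

// tableGroups returns one table list per lightning instance that is actually
// used. order_line always stays with orders, because go-tpc generates
// order_line from orders.
func tableGroups(lightningCount int) [][]string {
	rest := append([]string(nil), restTables...) // copy so callers cannot mutate restTables
	switch {
	case lightningCount <= 1:
		// A nil group means "generate every table on the single instance".
		return [][]string{nil}
	case lightningCount == 2:
		return [][]string{
			{"stock"},
			append([]string{"orders", "order_line"}, rest...),
		}
	default:
		// More than 3 instances would not shorten the import, so cap at 3.
		return [][]string{
			{"stock"},
			{"orders", "order_line"},
			rest,
		}
	}
}

// csvTablesArg renders a group as the --csv.tables argument used in main.go.
// An empty group yields an empty string, meaning all tables.
func csvTablesArg(tables []string) string {
	if len(tables) == 0 {
		return ""
	}
	return "--csv.tables " + strings.Join(tables, ",")
}

func main() {
	for i, g := range tableGroups(3) {
		fmt.Printf("lightning %d: %q\n", i, csvTablesArg(g))
	}
}
```

Keeping the grouping in one function makes it easy to change the layout later if import timings shift between tables.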

}
if _, _, err = runCmd("bash", "-c", fmt.Sprintf(`mysql -h %s -u root -P %s -e "create database tpcc"`, tidbIP, tidbPort)); err != nil {
func genSchema(tidbIP, tidbPort, lightningIP, dbName string) (err error) {
if _, _, err = runCmd("bash", "-c", fmt.Sprintf(`mysql -h %s -u root -P %s -e "drop database if exists %s"`, tidbIP, tidbPort, dbName)); err != nil {
Collaborator

Let's do both database creation and table creation through /tmp/go-tpc tpcc schema -U root -H %s -P %s -D %s. The user's server may not have the mysql client installed.

Collaborator Author

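The schema command has been removed now, and both the database and the tables are created automatically. But dropping an existing database is not supported. What should we do about that? Should we still drop it with mysql?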

Collaborator

Why not drop it in code with the driver?

@XuHuaiyu
Collaborator

XuHuaiyu commented Mar 8, 2020

@yeya24 PTAL

Signed-off-by: yeya24 <[email protected]>
Signed-off-by: yeya24 <[email protected]>
Signed-off-by: yeya24 <[email protected]>
@yeya24 yeya24 changed the title WIP: Use go-tpc instead of benchmarksql Use go-tpc instead of benchmarksql Mar 10, 2020
@yeya24
Collaborator Author

yeya24 commented Mar 10, 2020

I also updated the readme. PTAL @XuHuaiyu

main.go Outdated
errCh <- err
return
}
fmt.Println("Download go-tpc binary successfully!")
}
if _, _, err = runCmd("ssh", ip, fmt.Sprintf("mkdir -p %s", lightningDirs[i])); err != nil {
Collaborator

loop variable i captured by func literal

Collaborator Author

Good catch! Will update soon

Collaborator

-       threads   = flag.Int64(nmThreads, 40, "number of threads")
+       threads   = flag.Int64(nmThreads, 40, "number of threads of go-tpc")

        //ansibleDir = flag.String(nmAnsibleDir, "", "ansible directory path")

@@ -170,12 +170,11 @@ func getLightningIPsAndDataDirs() (lightningIPs, dataDirs []string, err error) {
 > wget -O /tmp/go-tpc binary_url; chmod +x /tmp/go-tpc
 */
 func fetchTpcc(lightningDirs []string, lightningIPs []string, binaryURL string, skipDownloading bool) (err error) {
-       errCh := make(chan error, 3)
+       errCh := make(chan error, len(lightningIPs)*10) // 3 should be enough, each runs 3 cmds at most
        wg := &sync.WaitGroup{}
-       for i, lightningIP := range lightningIPs {
-               ip := lightningIP
+       for i := 0; i < len(lightningIPs); i++ {
                wg.Add(1)
-               go func() {
+               go func(ip string, dir string) {
                        defer wg.Done()
                        if !skipDownloading {
                                if _, _, err = runCmd("ssh", ip, `rm -f /tmp/go-tpc`); err != nil {
@@ -189,16 +188,18 @@ func fetchTpcc(lightningDirs []string, lightningIPs []string, binaryURL string,
                                }
                                fmt.Println("Download go-tpc binary successfully!")
                        }
-                       if _, _, err = runCmd("ssh", ip, fmt.Sprintf("mkdir -p %s", lightningDirs[i])); err != nil {
+                       if _, _, err = runCmd("ssh", ip, fmt.Sprintf("mkdir -p %s", dir)); err != nil {
                                errCh <- err
                                return
                        }
-               }()
+               }(lightningIPs[i], lightningDirs[i])
        }
+
        go func() {
                wg.Wait()
                close(errCh)
        }()
+

It also looks like the push into errCh may block when there are multiple lightning instances. Have we tested it with multiple lightning instances?
I made a change but do not have permission to push.

1. download [go-tpc](https://github.com/pingcap/go-tpc) binary
2. generate csv data
3. import csv data through [tidb-lightning](https://github.com/pingcap/tidb-lightning)
Collaborator

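I suggest making this more detailed, for example by mentioning that it modifies the tidb-lightning configuration file.

Also explain that lightning must be deployed before use, and that wget, mysql, etc. are required (plus anything else, if there is any).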

Collaborator Author

Thanks! Add to my todo list now. Will do tomorrow
