feat: Improve error messages when a private build machine does not claim tasks #5806

Closed
6 tasks done
irwinsun opened this issue Dec 15, 2021 · 0 comments
Labels
area/ci/backend: CI backend issue
done: Production environment in tencent has been deploy
for gray: UAT environment in tencent has been deploy
for test: Test environment in tencent has been deploy
grayed: UAT environment test passed / test passed for uat stage
kind/enhancement: feature enhancement
kind/feat/tech: technical feature
streams/done: stream production deployment succeeded
streams/for gray: stream gray environment deployment succeeded
streams/for test: stream test environment deployment succeeded
tested: test environment passed / test passed for test stage

Comments


irwinsun commented Dec 15, 2021

What would you like to be added:

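  • 1. Basic environment check improvements: the log message that workerBuildFinish shows about the cause of worker-agent.jar problems needs to be more detailed:

    • [Self-healing improvement] Detect automatically: if an upgrade file is found under tmp, try to replace worker-agent.jar with it so that the worker recovers.
    • [Self-service troubleshooting improvement] If that fails, prompt the user to go to the specified directory and repair it manually.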
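The detect, replace, and fall-back-to-manual flow in item 1 could look roughly like the sketch below. It is only an illustration in Python, not the agent's actual code: the function name `try_restore_worker_jar`, the directory arguments, and the message wording are all assumptions, and it assumes the upgrade file under tmp is itself named worker-agent.jar.

```python
import shutil
from pathlib import Path

JAR_NAME = "worker-agent.jar"


def try_restore_worker_jar(agent_dir: Path, tmp_dir: Path) -> str:
    """Try to replace agent_dir/worker-agent.jar with the copy found in tmp_dir.

    Returns a message intended for the build log. Hypothetical helper:
    the names and the message text are illustrative only.
    """
    candidate = tmp_dir / JAR_NAME
    target = agent_dir / JAR_NAME

    # No upgrade file available: nothing to self-heal with, ask for manual repair.
    if not candidate.is_file():
        return f"No upgrade file found in {tmp_dir}; please repair {target} manually."

    try:
        # Overwrite the (possibly broken) jar with the upgrade file.
        shutil.copyfile(candidate, target)
    except OSError as exc:
        # Replacement failed: point the user at the directory to fix by hand.
        return f"Automatic repair failed ({exc}); please replace {target} manually."

    return f"Replaced {target} with the upgrade file from {tmp_dir}."
```

Returning a message rather than raising keeps the caller free to forward the text straight into the build log, which is the self-service half of the item.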
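  • 2. [fix] Resolve the problems of the temporary directory being cleaned up by the system and of excessive memory usage: devopsAgent adds the following parameters to the command that starts worker-agent.jar:

    • -Djava.io.tmpdir=<current agent directory>/build_tmp
    • -Xmx2g (this also caps the memory usage of the worker-agent.jar process right at launch, to avoid failures caused by excessive memory usage)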
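As a rough sketch of item 2, the launch command could be assembled as below. This is Python for illustration only and not how devopsAgent is implemented; the function name `build_worker_command` and its parameters are assumptions. The two JVM options are the ones listed above, and they have to come before `-jar` for the JVM to treat them as options rather than program arguments.

```python
from pathlib import Path
from typing import List


def build_worker_command(agent_dir: Path, java: str = "java", max_heap: str = "2g") -> List[str]:
    """Build the argv used to start worker-agent.jar (hypothetical helper)."""
    # Keep temp files inside the agent directory so that system-level cleanup
    # of the default temp directory does not remove them.
    tmp_dir = agent_dir / "build_tmp"
    tmp_dir.mkdir(parents=True, exist_ok=True)

    return [
        java,
        f"-Djava.io.tmpdir={tmp_dir}",  # JVM system property for the temp directory
        f"-Xmx{max_heap}",              # cap the maximum heap size at launch
        "-jar",
        str(agent_dir / "worker-agent.jar"),
    ]
```

The resulting list can be passed directly to `subprocess.Popen` without shell quoting concerns.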
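  • 3. [Self-service troubleshooting] When a build machine does not claim tasks because its concurrency limit has been reached, the build log needs to show a clearer hint. Print a link to the build machine details page: /console/environment/{projectId}/nodeDetail/{nodeHashId}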
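A minimal sketch of how the hint in item 3 might be produced, assuming the path template above is filled in with plain string substitution. The helper names and the wording of the log line are hypothetical; only the path template comes from this issue.

```python
from urllib.parse import quote


def node_detail_path(project_id: str, node_hash_id: str) -> str:
    """Fill in the path template quoted in the issue text."""
    return (
        f"/console/environment/{quote(project_id, safe='')}"
        f"/nodeDetail/{quote(node_hash_id, safe='')}"
    )


def concurrency_full_hint(project_id: str, node_hash_id: str, running: int, limit: int) -> str:
    """Build a build-log line explaining why no task was claimed (wording is illustrative)."""
    return (
        f"Build machine is running {running} of {limit} allowed jobs and is not "
        f"claiming new tasks. Details: {node_detail_path(project_id, node_hash_id)}"
    )
```

Percent-encoding the two identifiers is a defensive choice here; whether real project or node IDs can contain characters that need it is not stated in the issue.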

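  • 4. [fix] Resolve the issue where a worker that is too old, or in an abnormal state, cannot report its version number and therefore cannot self-heal or upgrade. #5045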

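  • 5. [Self-service troubleshooting improvement] Add build log output and pass it back to the agent, which reports it into the build log so that users know the cause of the error. #4943

    • [Self-healing] When there has been no response for more than 3 minutes, attempt recovery: print the corresponding message as shown in the image below and try to repair.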

image

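  • 6. [fix] Problems such as a full disk leave the build machine Agent stuck in the upgrading state, which prevents it from claiming build tasks. #5172
    • [Self-healing] If the pre-check fails, end the upgrade flow; if the upgrade conditions are not met, skip the upgrade so that the agent can keep running.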
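The pre-check in item 6 can be sketched as a guard that runs before any upgrade step. The example below is an illustrative Python sketch, not the agent's implementation: it uses free disk space as the only upgrade condition, and the function names, the `required_bytes` threshold, and the `free_bytes` override (there to make the check testable) are all assumptions.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def can_upgrade(agent_dir: Path, required_bytes: int, free_bytes: Optional[int] = None) -> bool:
    """Pre-check: is there enough free disk space to stage the upgrade?"""
    if free_bytes is None:
        # Ask the OS how much space is left on the volume holding agent_dir.
        free_bytes = shutil.disk_usage(agent_dir).free
    return free_bytes >= required_bytes


def run_upgrade_cycle(
    agent_dir: Path,
    required_bytes: int,
    do_upgrade: Callable[[], None],
    free_bytes: Optional[int] = None,
) -> str:
    """Run one upgrade attempt; skip it when the pre-check fails."""
    if not can_upgrade(agent_dir, required_bytes, free_bytes):
        # End the upgrade flow here so the agent leaves the upgrading state
        # and can go back to claiming build tasks.
        return "skipped"
    do_upgrade()
    return "upgraded"
```

The point of the design is that a failed pre-check is a normal outcome rather than an error: the caller gets "skipped" back and carries on with its regular loop.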

Why is this needed:
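1. Self-service: makes it easier for users to troubleshoot problems on their own
2. Self-healing: problems that can be resolved are resolved automatically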

@irwinsun irwinsun added the kind/enhancement (feature enhancement) label Dec 15, 2021
@irwinsun irwinsun added this to the v1.8 milestone Dec 15, 2021
@irwinsun irwinsun added area/ci/backend (CI backend issue) kind/feat/tech (technical feature) labels Dec 15, 2021
@irwinsun irwinsun self-assigned this Dec 15, 2021
@irwinsun irwinsun added the for test Test environment in tencent has been deploy label Mar 22, 2022
@irwinsun irwinsun added the tested (test environment passed / test passed for test stage) label Mar 22, 2022
mingshewhe added a commit that referenced this issue Mar 22, 2022
feat: Improve error messages when a private build machine does not claim tasks #5806
@bkci-bot bkci-bot added the for gray UAT environment in tencent has been deploy label Mar 22, 2022
irwinsun added a commit that referenced this issue Mar 22, 2022
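feat: Improve error messages when a private build machine does not claim tasks #5806
@irwinsun irwinsun added grayed (UAT environment test passed / test passed for uat stage) streams/grayed (stream gray environment test passed) and removed streams/grayed (stream gray environment test passed) labels Mar 24, 2022
@bkci-bot bkci-bot added streams/for gray (stream gray environment deployment succeeded) streams/done (stream production deployment succeeded) labels Mar 25, 2022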
@bkci-bot bkci-bot added the done Production environment in tencent has been deploy label Mar 31, 2022
@bkci-bot bkci-bot added the streams/for test (stream test environment deployment succeeded) label May 30, 2022