Update doc #1735

Merged
merged 6 commits into from
Apr 18, 2022

Conversation

TeslaZhao (Collaborator) commented Mar 27, 2022

Add doc/Offical_Docs/3-0_QuickStart_Int_CN.md
Add doc/Offical_Docs/9-0_Kubernetes_Int_CN.md
Add doc/Offical_Docs/7-0_Python_Pipeline_Int_CN.md
Add doc/Offical_Docs/7-1_Python_Pipeline_Basic_CN.md
Add doc/Offical_Docs/7-2_Python_Pipeline_Usage_CN.md
Add doc/Offical_Docs/7-3_Python_Pipeline_Senior_CN.md
Add doc/Offical_Docs/7-4_Python_Pipeline_Optimize_CN.md

@@ -0,0 +1,374 @@
# Python Pipeline Basic Features

Designing a general-purpose, end-to-end framework for composing multiple models poses the following 4 challenges:

Review comment:

Is the design section related to the basic features? If there is no direct connection, I suggest deleting it or moving it into a separate chapter.

Collaborator Author:

Added the design goal: "to solve the complexity of composing multiple deep learning models".

| local_service_handler | (object) | Local predictor handler; assigned from an Op init() argument or created inside Op init() |


**3. Channel Design and Implementation**

Review comment:

Line-break problem

Collaborator Author:

done

<img src='../images/pipeline_serving-image3.png' height = "500" align="middle"/>
</div>

**4. Custom Development**

Review comment:

Line-break problem

Collaborator Author:

done

Developers are offered three kinds of custom development interfaces: the inference OP interface, the RequestOp interface, and the ResponseOp interface.


1. Inference OP custom development

ZihanDong commented Apr 2, 2022:

Custom behavior of the inference OP

The inference OP exposes 3 external function interfaces to developers:

| Variable or interface | Description |
| :----------------------------------------------: | :----------------------------------------------------------: |

Review comment:

I suggest splitting process, postprocess, preprocess, and init into four parts and describing each separately:
1.1 1.2 1.3 1.4


**Note** that in the thread-based OP, this function is called only once per OP, so any resources it loads must be thread-safe.
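The thread-safety requirement above can be illustrated with a minimal sketch. It assumes a resource that is created once and then used concurrently by several worker threads. The names `SharedCounterResource` and `run_workers` are hypothetical and are not part of Paddle Serving; the sketch only shows one common way (a lock around shared mutable state) to make such a resource safe.

```python
import threading


class SharedCounterResource:
    """Hypothetical resource that is loaded once and shared by all worker threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._hits = {}

    def record(self, token):
        # The read-modify-write on the dict is guarded so that concurrent
        # threads cannot lose updates.
        with self._lock:
            self._hits[token] = self._hits.get(token, 0) + 1

    def count(self, token):
        with self._lock:
            return self._hits.get(token, 0)


def run_workers(resource, num_threads=4, calls_per_thread=1000):
    """Simulate several worker threads that all use the single shared resource."""

    def worker():
        for _ in range(calls_per_thread):
            resource.record("hello")

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return resource.count("hello")
```

With the lock in place, `run_workers(SharedCounterResource())` counts every call made by every thread. A resource without such protection may need one instance per thread instead.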

2. RequestOp custom development

Review comment:

Are the request and response customizations both about tailoring data structures? If so, could a separate chapter be written with that as its goal?

Review comment:

Consider treating request and response as advanced features.


Advanced usage targets complex scenarios and provides more customization, including skipping an OP in the DAG, custom data transfer structures, and multi-GPU inference.

## Skipping an OP in the DAG

ZihanDong commented Apr 2, 2022:

How to implement branching logic, plus a diagram that illustrates it

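- Inference takes noticeably longer than pre- and post-processing, so merging the data of multiple requests into a single inference call improves throughput and GPU utilization
- The data of the merged requests must have the same shape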
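The two conditions above can be sketched in a few lines of plain Python. This is a simplified illustration and not the Paddle Serving implementation. `shape_of`, `batched_predict`, and the `predict_batch` callback are hypothetical names, and samples are modeled as nested lists rather than tensors.

```python
def shape_of(sample):
    """Return the nested-list shape of a sample, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    shape = []
    while isinstance(sample, list):
        shape.append(len(sample))
        sample = sample[0] if sample else None
    return tuple(shape)


def batched_predict(requests, predict_batch):
    """Merge same-shaped request samples into one batch and run inference once."""
    shapes = {shape_of(r) for r in requests}
    if len(shapes) != 1:
        # Batching is only possible when every request has the same shape.
        raise ValueError(f"cannot batch requests with different shapes: {sorted(shapes)}")
    outputs = predict_batch(requests)  # a single inference call for the whole batch
    if len(outputs) != len(requests):
        raise RuntimeError("model must return one output per request")
    return outputs  # outputs[i] belongs to requests[i]


# Toy "model" that records how often it is called and sums each sample.
call_sizes = []


def toy_model(batch):
    call_sizes.append(len(batch))
    return [sum(sample) for sample in batch]


results = batched_predict([[1, 2, 3], [4, 5, 6]], toy_model)
```

Here two requests are served by one call to the toy model. Requests whose shapes differ are rejected instead of being padded, which is a design choice of this sketch.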

| Interface | Description |

ZihanDong commented Apr 2, 2022:

Is "interface" the right word here? Isn't the table describing how the different modes are implemented? If so, that should be explained to users.


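### 4.6 Single Machine, Multiple GPUs
For multi-GPU inference on a single machine, M OP processes are bound to N GPU cards. Three parameters in `config.yml` are involved: process mode must be selected, the concurrency value is the number of processes, and devices lists the GPU card IDs. Processes are bound by cycling through the GPU card IDs as they start. For example, if 7 OP processes are started and `config.yml` sets devices:0,1,2, then the 1st, 4th, and 7th processes are bound to card 0, the 2nd and 5th to card 1, and the 3rd and 6th to card 2.
- Process ID: 0 is bound to GPU card 0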
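The round-robin binding described above amounts to taking the process ID modulo the number of cards. A minimal sketch follows, assuming process IDs are assigned from 0 in start order; `bind_processes_to_gpus` is a hypothetical helper, not a Paddle Serving function.

```python
def bind_processes_to_gpus(num_processes, devices):
    """Map each process ID to a GPU card ID by cycling through the device list."""
    if not devices:
        raise ValueError("devices must contain at least one GPU card ID")
    return {pid: devices[pid % len(devices)] for pid in range(num_processes)}


# 7 OP processes with devices 0,1,2, as in the example above.
binding = bind_processes_to_gpus(7, [0, 1, 2])
for pid, gpu in binding.items():
    print(f"process ID {pid} is bound to GPU card {gpu}")
```

For 7 processes and devices 0,1,2 this gives process IDs 0, 3, 6 on card 0, IDs 1, 4 on card 1, and IDs 2, 5 on card 2.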

Review comment:

How exactly are process mode and concurrency set in the config?

Is it processes or threads?

Review comment:

XPU support
How to obtain the device ID

Then recompile the Serving Server.


## Custom URL

ZihanDong commented Apr 2, 2022:

The description here does not feel clear enough.

Review comment:

Can brpc not be customized?


## Customizing the Request and Response Structures in proto

When the default proto structure does not meet business needs, modify the Request and Response message structures in the proto of both of the following 2 files and keep them consistent.

Review comment:

The description here is not clear enough.

@@ -0,0 +1,245 @@
# Python Pipeline Usage Examples

Review comment:

These are the examples that correspond to advanced usage. Could the examples be placed last and also demonstrate what the earlier documents cover:

  1. The methods for basic features and advanced usage
  2. The optimization methods

@@ -0,0 +1,96 @@
# Python Pipeline Advanced Usage

Review comment:

Could the features be grouped along lines such as data customization, execution customization, and communication customization?
With the current basic/advanced split, might users be unable to find the chapter they need?

server.run_server()
```

## Start the Service and Verify

Review comment:

A text explanation is missing.

Review comment:

The standard output is missing.

```

## Create config.yaml
This example uses the brpc client connection type; grpc or local_predictor can also be chosen.

ZihanDong commented Apr 2, 2022:

Where does the config differ for grpc? Where is this explained in detail?

@@ -0,0 +1,245 @@
# Python Pipeline Usage Examples

Deploying the Python Pipeline usage example takes 4 steps: downloading the model, configuration, writing code, and inference testing.

Review comment:

The step names here do not match the text below.


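## Writing the Server Code

In the code example, pay particular attention to the preprocess and postprocess handling of the 3 custom Ops, and to the Combin Op initialization argument input_ops=[bow_op, cnn_op], which sets the list of Ops that precede the Combin Op.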
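To show what a predecessor list such as input_ops=[bow_op, cnn_op] expresses, here is a toy sketch in plain Python. `ToyOp` is a hypothetical stand-in and not the Paddle Serving Op class, and the lambda "models" and their scores are made up for illustration. The point is that an Op runs only after every Op in its `input_ops`, and receives their outputs keyed by Op name.

```python
class ToyOp:
    """Toy stand-in for a pipeline Op; not the Paddle Serving class."""

    def __init__(self, name, input_ops=(), fn=None):
        self.name = name
        self.input_ops = list(input_ops)  # Ops that must run before this one
        self.fn = fn

    def run(self, request, cache=None):
        cache = {} if cache is None else cache
        if self.name not in cache:
            if self.input_ops:
                # Run all predecessors first and collect their outputs by name.
                inputs = {op.name: op.run(request, cache) for op in self.input_ops}
            else:
                inputs = request
            cache[self.name] = self.fn(inputs)
        return cache[self.name]


read_op = ToyOp("read", fn=lambda req: req["words"])
bow_op = ToyOp("bow", input_ops=[read_op],
               fn=lambda d: 0.75 if "good" in d["read"] else 0.25)
cnn_op = ToyOp("cnn", input_ops=[read_op],
               fn=lambda d: 0.5 if "good" in d["read"] else 0.25)
# The combining Op averages the scores of its two predecessors.
combine_op = ToyOp("combine", input_ops=[bow_op, cnn_op],
                   fn=lambda d: (d["bow"] + d["cnn"]) / 2)

score = combine_op.run({"words": ["a", "good", "movie"]})
```

The shared `cache` ensures the common predecessor `read_op` runs only once even though both branches depend on it.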

ZihanDong commented Apr 2, 2022:

Explain the differences


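Paddle Serving implements Python Pipeline, a general-purpose programming framework for multi-model composition services. It not only resolves the pain points above, but also substantially improves GPU utilization and is easy to develop and maintain.

Read the following to learn the Python Pipeline framework's basic features, design, usage guide, and more.

Review comment:

A summary of the main content of each subpage is missing.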

analyst.save_trace(trace_filename)
```

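Steps: open the Chrome browser, enter `chrome://tracing/` in the address bar to go to the tracing page, click the load button, and open the saved `trace` file to visualize the timing of each stage of the prediction service.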
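For readers unfamiliar with what the tracing page loads, the sketch below writes a small file in the Chrome Trace Event JSON format. It illustrates the general file format only and is not the output of the timeline tool quoted above. The stage names, timings, and the helpers `to_trace_events` and `save_trace_file` are hypothetical.

```python
import json


def to_trace_events(stage_timings, pid=1):
    """Convert (op_name, stage, start_us, end_us) tuples into Trace Event JSON."""
    tids = {}
    events = []
    for op_name, stage, start_us, end_us in stage_timings:
        tid = tids.setdefault(op_name, len(tids))  # one timeline row per OP
        events.append({
            "name": stage,
            "cat": op_name,
            "ph": "X",                 # "complete" event: start time plus duration
            "ts": start_us,            # timestamps are in microseconds
            "dur": end_us - start_us,
            "pid": pid,
            "tid": tid,
        })
    return {"traceEvents": events}


def save_trace_file(path, stage_timings):
    """Write the trace so that it can be opened with the load button."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(to_trace_events(stage_timings), f)


# Made-up timings for two OPs, in microseconds.
example_timings = [
    ("bow", "preprocess", 0, 120),
    ("bow", "process", 120, 900),
    ("cnn", "process", 100, 1500),
]
trace = to_trace_events(example_timings)
```

Each event becomes one bar on the timeline, so long stages stand out visually.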

Review comment:

Add a sample figure of the visualization.

# Python Pipeline Optimization Guide


## How to Optimize with the Timeline Tool

Review comment:

The basic steps need to be described: save and export, analyze, then optimize.

## Analysis Method
Using the time spent in each stage as recorded in the `pipeline.tracer` log, apply the following formulas step by step to find the stage that dominates the latency.
```
Time of a single OP:

ZihanDong commented Apr 2, 2022:

Analysis of the different metrics


## Optimization Approach

Apply different optimization methods depending on which stage accounts for the long latency.

Review comment:

Should links be added here to take users directly to the corresponding documents?

- Process ID: 5 is bound to GPU card 2
- Process ID: 6 is bound to GPU card 0

Hardware configuration in `config.yml`:

Review comment:

Please add the three parameters.


Steps: open the Chrome browser, enter `chrome://tracing/` in the address bar to go to the tracing page, click the load button, and open the saved `trace` file to visualize the timing of each stage of the prediction service.

## Printing Profile Information on the Client Side

Review comment:

This section and the preceding one both cover how to export profiler information. They should be sub-steps of step one, and that needs to be stated.

@TeslaZhao TeslaZhao merged commit ee8b5e2 into PaddlePaddle:develop Apr 18, 2022
2 participants