Studio supports allocation of agent server #351

Merged
merged 76 commits into from
Aug 6, 2024
Commits
0ba7f32
check agent server alive
pan-x-c May 20, 2024
beb7b71
refactor rpc agent proto
pan-x-c May 21, 2024
eca9592
update rpc protoc of agent server
pan-x-c May 21, 2024
d82c4df
support upload source code into agent server
pan-x-c May 21, 2024
485537e
update agent server management test
pan-x-c May 21, 2024
c993e85
update tutorial
pan-x-c May 21, 2024
a3330b6
add psutils to rpc dependency
pan-x-c May 21, 2024
34c09f5
merge main
pan-x-c May 24, 2024
1b40ee4
support download file from agent server
pan-x-c May 24, 2024
8f1a303
add download file test
pan-x-c May 27, 2024
afa0a2f
add get agent memory and set model configs
pan-x-c May 27, 2024
7d592cc
update agent server management tutorial
pan-x-c May 27, 2024
eed97e6
update tutorial
pan-x-c May 27, 2024
f36f803
fix typo
pan-x-c May 28, 2024
ad0a998
fix typo
pan-x-c May 28, 2024
b8fa907
Merge branch 'main' into feature/pxc/agent_server_enhance
pan-x-c May 28, 2024
7fb3f3b
fix pre-commit
pan-x-c May 28, 2024
7dd1421
automatically download files from remote server when update placeholder
pan-x-c May 30, 2024
ca43187
fix conflict
pan-x-c Jun 11, 2024
f1a9835
Merge branch 'main' into feature/pxc/agent_server_enhance
pan-x-c Jun 11, 2024
a212a4f
add stop agent server client side method
pan-x-c Jun 11, 2024
b257af4
add stop server test
pan-x-c Jun 11, 2024
8850fc6
set rpc max meessage size to 32MB
pan-x-c Jun 17, 2024
0c3c337
fix tests
pan-x-c Jun 17, 2024
25b8416
merge main
pan-x-c Jun 18, 2024
80279a5
merge main
pan-x-c Jun 27, 2024
23b954b
Merge branch 'main' into feature/pxc/agent_server_enhance
pan-x-c Jun 27, 2024
351f195
add server studio page and refactor rpc proto
pan-x-c Jun 28, 2024
2627baf
finish agentscope server manager page
pan-x-c Jul 8, 2024
6dd5468
fix server manager page
pan-x-c Jul 8, 2024
c9580d2
remove upload source code
pan-x-c Jul 10, 2024
b85d30b
update tutorial
pan-x-c Jul 10, 2024
a7a8eaa
update test
pan-x-c Jul 10, 2024
380fbd6
mv svg out of html
pan-x-c Jul 12, 2024
64186c6
add delete all agent function
pan-x-c Jul 12, 2024
11c1ff5
add agent memory
pan-x-c Jul 15, 2024
b5876a2
fix comments
pan-x-c Jul 15, 2024
1616356
fix unittest
pan-x-c Jul 15, 2024
f289860
fix comments
pan-x-c Jul 15, 2024
373024f
fix update placeholder error
pan-x-c Jul 15, 2024
8b73ceb
fix simulation
pan-x-c Jul 15, 2024
96b74e1
first balance
pan-x-c Jul 17, 2024
12b945f
merge main
pan-x-c Jul 17, 2024
ee60f3b
finish auto alloc server
pan-x-c Jul 18, 2024
4858165
fix typo
pan-x-c Jul 18, 2024
1cfdd07
fix lazy_launch
pan-x-c Jul 18, 2024
202a550
fix pre-commit
pan-x-c Jul 18, 2024
9bd9bad
servers support load agent dir
pan-x-c Jul 19, 2024
06ed1dd
fix agent dir is none
pan-x-c Jul 19, 2024
d3e41d1
update docs
pan-x-c Jul 19, 2024
490ac1a
update tutorial for agent server manager
pan-x-c Jul 19, 2024
e22e58a
fix dist turtorial
pan-x-c Jul 24, 2024
4575889
Merge branch 'main' into feature/pxc/server_balance
pan-x-c Jul 24, 2024
e811315
update tutorial
pan-x-c Jul 29, 2024
c963c69
update req
pan-x-c Jul 29, 2024
38c6e45
update req
pan-x-c Jul 29, 2024
8cb07cd
update req
pan-x-c Jul 29, 2024
1de6dc6
update req
pan-x-c Jul 29, 2024
626545c
update req
pan-x-c Jul 29, 2024
c1b56b3
update req
pan-x-c Jul 29, 2024
09f0c0b
update req
pan-x-c Jul 29, 2024
b99fc99
update wheel
pan-x-c Jul 29, 2024
bf20616
update ffmpy
pan-x-c Jul 29, 2024
b7b5041
update ffmpy
pan-x-c Jul 29, 2024
d2a392d
update ffmpy
pan-x-c Jul 29, 2024
b498c6a
update ffmpy
pan-x-c Jul 29, 2024
4dd5249
update python
pan-x-c Jul 29, 2024
8ff32bc
update wheel
pan-x-c Jul 29, 2024
7801975
merge main
pan-x-c Jul 29, 2024
cce467a
update workflow
pan-x-c Jul 29, 2024
b50f2ee
fix typo
pan-x-c Aug 5, 2024
4fbc2ba
merge main
pan-x-c Aug 5, 2024
0c36ecc
enhance check port
pan-x-c Aug 5, 2024
57363b8
fix pre-commit
pan-x-c Aug 5, 2024
8ae467d
add security warning in tutorial
pan-x-c Aug 6, 2024
5a1ec2a
add test for auto allocated server not alive
pan-x-c Aug 6, 2024
79 changes: 73 additions & 6 deletions docs/sphinx_doc/en/source/tutorial/208-distribute.md
Expand Up @@ -82,11 +82,16 @@ server.launch()
server.wait_until_terminate()
```

> For similarity, you can run the following command in your terminal rather than the above code:
>
> ```shell
> as_server --host ip_a --port 12001 --model-config-path model_config_path_a
> ```
For simplicity, you can run the following command in your terminal rather than the above code:

```shell
as_server --host ip_a --port 12001 --model-config-path model_config_path_a --agent-dir parent_dir_of_agent_a_and_b
```

> Note:
> The `--agent-dir` field is used to specify the directory where your customized agent classes are located.
> Please make sure that all custom Agent classes are located in `--agent-dir`, and that the custom modules they depend on are also located in the directory.
> Additionally, because the above command will load all Python files in the directory, please ensure that the directory does not contain any malicious files to avoid security risks.
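
The security warning above follows from how directory loading generally works: importing a Python file executes all of its top-level code. The sketch below is a minimal illustration of that idea, not AgentScope's actual loader. The function name `load_agent_classes` and the rule "a class with a callable `reply` attribute counts as an agent" are assumptions made for this example.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_agent_classes(agent_dir):
    """Import every .py file in agent_dir and collect agent-like classes.

    Hypothetical helper: a class is treated as an agent if it is defined in
    the loaded file and has a callable `reply` attribute.
    """
    found = {}
    for path in sorted(Path(agent_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        # exec_module runs the file's top-level code, which is why the
        # directory must not contain untrusted files.
        spec.loader.exec_module(module)
        for name, obj in vars(module).items():
            is_local_class = isinstance(obj, type) and obj.__module__ == module.__name__
            if is_local_class and callable(getattr(obj, "reply", None)):
                found[name] = obj
    return found


# Demonstration with a throwaway directory containing one agent and one helper.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "agent_a.py").write_text(
        "class AgentA:\n"
        "    def reply(self, x=None):\n"
        "        return x\n"
    )
    Path(tmp, "utils.py").write_text("class Helper:\n    pass\n")
    classes = load_agent_classes(tmp)
    class_names = sorted(classes)
```

Both files are executed during loading, even though only `AgentA` is collected, so a stray script in the directory would run as well.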

Then put your model config file accordingly in `model_config_path_b`, set environment variables, and run the following code on `Machine2`.

Expand All @@ -112,7 +117,7 @@ server.wait_until_terminate()
> Similarly, you can run the following command in your terminal to set up the agent server:
>
> ```shell
> as_server --host ip_b --port 12002 --model-config-path model_config_path_b
> as_server --host ip_b --port 12002 --model-config-path model_config_path_b --agent-dir parent_dir_of_agent_a_and_b
> ```

Then, you can connect to the agent servers from the main process with the following code.
Expand All @@ -139,6 +144,9 @@ And developers just need to write the application flow in a centralized way in t

### Step 2: Orchestrate Distributed Application Flow

> Note:
> Currently, the distributed version of an Agent only supports calls through the `__call__` method (i.e. `agent(x)`); calling other methods or reading/writing attributes is not supported.
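
To make the restriction concrete, here is a minimal sketch of a call-only handle. It is a stand-in written for this note, not the class AgentScope uses: `RemoteStub` and `echo_transport` are hypothetical names, and the transport is a local function rather than an RPC channel.

```python
class RemoteStub:
    """Hypothetical handle to a remote agent that forwards only __call__."""

    def __init__(self, send):
        # `send` is an assumed transport callable: message in, reply out.
        self._send = send

    def __call__(self, x=None):
        # The one supported operation: forward the message and return the reply.
        return self._send(x)

    def __getattr__(self, name):
        # Reached only for attributes that are not defined on the stub itself.
        raise AttributeError(
            f"distributed agents only support agent(x); '{name}' is unavailable",
        )


def echo_transport(x):
    """Local stand-in for the remote side."""
    return {"name": "A", "content": f"echo: {x}"}


agent = RemoteStub(echo_transport)
reply = agent("hello")
```

Calling `agent("hello")` works, while something like `agent.memory` raises `AttributeError`, which mirrors the limitation described in the note.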

In AgentScope, the orchestration of distributed application flow is exactly the same as non-distributed programs, and developers can write the entire application flow in a centralized way.
At the same time, AgentScope allows the use of a mixture of locally and distributed deployed agents, and developers do not need to distinguish which agents are local and which are distributed.

Expand Down Expand Up @@ -315,6 +323,65 @@ When running large-scale multi-agent applications, it's common to have multiple
ok = client.delete_all_agent()
```

#### Connecting to AgentScope Studio

The agent server process can be connected to [AgentScope Studio](#209-gui-en) at startup, so that `to_dist` calls in subsequent distributed applications need no parameters: Studio assigns an agent server process automatically.

If you start the agent server process from Python code, pass `studio_url` in the initialization parameters of `RpcAgentServerLauncher`. The URL must be correct and reachable over the network; for example, the default URL of Studio is `http://127.0.0.1:5000`.

```python
# import some packages

# register models which can be used in the server
agentscope.init(
    model_configs=model_config_path_a,
)
# Create an agent service process
server = RpcAgentServerLauncher(
    host="ip_a",
    port=12001,  # choose an available port
    custom_agent_classes=[...],  # register your customized agent classes
    studio_url="http://studio_ip:studio_port",  # connect to AgentScope Studio
)

# Start the service
server.launch()
server.wait_until_terminate()
```
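
Because the launcher depends on a correct, reachable `studio_url`, a quick pre-flight check can catch typos before the server starts. The helpers below are a sketch using only the standard library; `parse_studio_url` and `studio_is_reachable` are hypothetical names and are not part of AgentScope.

```python
import socket
from urllib.parse import urlsplit


def parse_studio_url(url):
    """Return (host, port) if url looks like an http(s) URL, else None."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    try:
        port = parts.port
    except ValueError:  # non-numeric or out-of-range port
        return None
    if port is None:
        port = 443 if parts.scheme == "https" else 80
    return parts.hostname, port


def studio_is_reachable(url, timeout=3.0):
    """Try a plain TCP connection to the host and port named in url."""
    target = parse_studio_url(url)
    if target is None:
        return False
    try:
        with socket.create_connection(target, timeout=timeout):
            return True
    except OSError:
        return False


default_target = parse_studio_url("http://127.0.0.1:5000")
placeholder_target = parse_studio_url("http://studio_ip:studio_port")
```

The placeholder URL from the example is rejected because `studio_port` is not a number, which is a reminder to substitute real values. A successful TCP connection only shows that something is listening, not that it is Studio.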

If you start the agent server with the `as_server` command, pass the `--studio-url` option.

```shell
as_server --host ip_a --port 12001 --model-config-path model_config_path_a --agent-dir parent_dir_of_agent_a_and_b --studio-url http://studio_ip:studio_port
```

After running the code or command above, open the Server Manager page of AgentScope Studio to check whether the connection succeeded. If it did, the agent server process appears in the table on that page, where you can monitor its running status and resource usage and use the other advanced features that AgentScope Studio provides. This section focuses on how connecting to AgentScope Studio affects the `to_dist` method; for details on using the page itself, see [AgentScope Studio](#209-gui-en).

Once the agent server process is connected to Studio, pass the `studio_url` of that Studio to `agentscope.init`. The `to_dist` method then no longer needs the `host` and `port` arguments and automatically selects an agent server process that is connected to Studio.

```python
# import some packages

agentscope.init(
    model_configs=model_config_path_a,
    studio_url="http://studio_ip:studio_port",
)

a = AgentA(
    name="A"
    # ...
).to_dist()  # automatically select an agent server

# your application code
```

> Note:
>
> - The Agent used in this method must be registered at the start of the agent server process through `custom_agent_classes` or `--agent-dir`.
> - When using this method, make sure that the agent server process connected to Studio is still running normally.

After the application starts running, the Server Manager page of Studio shows which agent server process each Agent is running on. When the application has finished, you can also delete the Agent from the Server Manager page.
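
The tutorial does not say how Studio chooses among connected servers. Purely as an illustration, the sketch below assumes a policy of "pick the running server with the lowest CPU usage". `ServerInfo`, its fields, and `pick_server` are hypothetical names, and the real allocation logic may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerInfo:
    """Hypothetical record of an agent server registered with Studio."""

    host: str
    port: int
    status: str  # e.g. "running" or "dead"
    cpu_usage: float  # percent


def pick_server(servers: List[ServerInfo]) -> Optional[ServerInfo]:
    """Assumed policy: choose the least-loaded server that is still running."""
    running = [s for s in servers if s.status == "running"]
    if not running:
        return None
    return min(running, key=lambda s: s.cpu_usage)


servers = [
    ServerInfo("ip_a", 12001, "running", 73.0),
    ServerInfo("ip_b", 12002, "running", 12.5),
    ServerInfo("ip_c", 12003, "dead", 0.0),
]
chosen = pick_server(servers)
```

Whatever the actual policy is, filtering out servers that are no longer running is the reason the note asks you to keep the connected agent server processes alive.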

## Implementation

### Actor Model
Expand Down
Binary file modified docs/sphinx_doc/en/source/tutorial/209-gui.md
Binary file not shown.
79 changes: 73 additions & 6 deletions docs/sphinx_doc/zh_CN/source/tutorial/208-distribute.md
Expand Up @@ -80,11 +80,16 @@ server.launch()
server.wait_until_terminate()
```

> To simplify things further, you can run the following command in your terminal instead of the code above:
>
> ```shell
> as_server --host ip_a --port 12001 --model-config-path model_config_path_a
> ```
To simplify things further, you can run the following command in your terminal instead of the code above:

```shell
as_server --host ip_a --port 12001 --model-config-path model_config_path_a --agent-dir parent_dir_of_agent_a_and_b
```

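> Note:
> `--agent-dir` specifies the directory where your custom Agent classes are located.
> Please make sure that all custom Agent classes are located under the directory given by `--agent-dir`, and that the custom modules they depend on are also in that directory.
> In addition, because the above command loads all Python files in the directory, make sure before running it that the directory contains no malicious files, to avoid security problems.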

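Run the following code on `Machine2`. Here too, make sure that the model config file has been placed at `model_config_path_b` and that the environment variables are set, so that the Agents running on this machine can access the model.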

Expand All @@ -110,7 +115,7 @@ server.wait_until_terminate()
> Here too, you can use the following command instead of the code above.
>
> ```shell
> as_server --host ip_b --port 12002 --model-config-path model_config_path_b
> as_server --host ip_b --port 12002 --model-config-path model_config_path_b --agent-dir parent_dir_of_agent_a_and_b
> ```

Then, you can connect to the two agent server processes from the main process with the following code.
Expand All @@ -137,6 +142,9 @@ b = AgentB(

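### Step 2: Orchestrate Distributed Application Flow

> Note:
> The current distributed version of an Agent only supports calls through the `__call__` method (i.e. `agent(x)`); calling other methods or reading/writing attributes is not supported.

In AgentScope, orchestrating a distributed application flow is exactly the same as for a non-distributed program, and developers can write the entire application flow in a centralized way.
At the same time, AgentScope allows locally deployed and distributed agents to be mixed, and developers do not need to distinguish which agents are local and which are distributed.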

Expand Down Expand Up @@ -312,6 +320,65 @@ b = AgentB(
ok = client.delete_all_agent()
```

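#### Connecting to AgentScope Studio

The agent server process can connect to [AgentScope Studio](#209-gui-zh) at startup, so that the `to_dist` method in distributed applications built afterwards no longer needs any parameters; instead, Studio automatically allocates an agent server process for it.

If you start the agent server process from Python code, you only need to pass `studio_url` in the initialization parameters of `RpcAgentServerLauncher`. Make sure the URL is correct and reachable over the network; for example, the URL of a Studio started with default settings is `http://127.0.0.1:5000`.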

```python
# import some packages

# register models which can be used in the server
agentscope.init(
    model_configs=model_config_path_a,
)
# Create an agent service process
server = RpcAgentServerLauncher(
    host="ip_a",
    port=12001,  # choose an available port
    custom_agent_classes=[...],  # register your customized agent classes
    studio_url="http://studio_ip:studio_port",  # connect to AgentScope Studio
)

# Start the service
server.launch()
server.wait_until_terminate()
```

If you use the `as_server` command, you likewise only need to pass the `--studio-url` parameter on the command line.

```shell
as_server --host ip_a --port 12001 --model-config-path model_config_path_a --agent-dir parent_dir_of_agent_a_and_b --studio-url http://studio_ip:studio_port
```

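After running the code or command above, you can open the Server Manager page of AgentScope Studio to check whether the connection succeeded. If it did, the agent server process is shown in the table on that page, where you can observe its running status and resource usage, and you can then use the advanced features that AgentScope Studio provides. This section focuses on how AgentScope Studio affects the `to_dist` method; for the usage of the page itself, see [AgentScope Studio](#209-gui-zh).

Once the agent server process has connected to Studio, you only need to pass the `studio_url` of that Studio to the `agentscope.init` method. Subsequent `to_dist` calls then no longer need the `host` and `port` fields and automatically select an agent server process that is connected to Studio.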

```python
# import some packages

agentscope.init(
    model_configs=model_config_path_a,
    studio_url="http://studio_ip:studio_port",
)

a = AgentA(
    name="A"
    # ...
).to_dist()  # automatically select an agent server

# your application code
```

> Note:
>
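> - The Agent used with this method must already be registered when the agent server process starts, through `custom_agent_classes` or `--agent-dir`.
> - When using this method, make sure that the agent server processes connected to Studio are still running normally.

After the application starts running, you can see on the Server Manager page of Studio which agent server process the Agent is running on; after the application finishes, you can also delete the Agent through the Server Manager page.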

## Implementation

### Actor Model
Expand Down
62 changes: 59 additions & 3 deletions docs/sphinx_doc/zh_CN/source/tutorial/209-gui.md
Expand Up @@ -6,9 +6,8 @@ AgentScope Studio 是一个开源的 Web UI 工具包,用于构建和监控多

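- **Dashboard**: a user-friendly interface where you can monitor running applications and view the run history.
- **Workstation**: a powerful interface for building multi-agent applications by **dragging and dropping**.
- **Server Manager**: an easy-to-use monitoring and management tool for large-scale distributed multi-agent applications.
- **Gallery**: coming soon!
- **Server Management**: coming soon!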


## Start AgentScope Studio

Expand Down Expand Up @@ -91,7 +90,7 @@ agentscope.studio.init(
)
```

## About Workstation
## Workstation

Workstation is designed for zero-code users and lets you build multi-agent applications by **dragging and dropping**.

Expand All @@ -117,6 +116,7 @@ AgentScope Studio中,拖过点击 workstation 图标进入 Workstation 界面
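#### Building an Application

To build an application, follow these steps:

- **Select and drag components**: choose the components you want from the sidebar and drag them into the central workspace.
- **Connect nodes**: most nodes have input and output points. Click the output point of one component and drag it to the input point of another to create a message flow pipeline. This lets different nodes pass messages to each other.
- **Configure nodes**: after dropping a node into the workspace, click it to fill in its configuration. You can customize prompts, parameters, and other properties.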
Expand Down Expand Up @@ -148,3 +148,59 @@ as_workflow config.json --compile ${YOUR_PYTHON_SCRIPT_NAME}.py
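- Required field validation: all required fields must be filled in, so that each node has the parameters it needs to run correctly.
- Consistent configuration naming: the "Model config name" used by an Agent node must correspond to a "Config Name" defined in a Model node.
- Correct node nesting: nodes such as ReActAgent should contain only tool nodes. Similarly, Pipeline nodes such as IfElsePipeline should contain the correct number of elements (no more than 2), while ForLoopPipeline, WhileLoopPipeline, and MsgHub should follow the one-element rule (the child must be a SequentialPipeline).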

## Server Manager

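> Reading this section requires an understanding of the basic concepts and usage of AgentScope [distribution](#208-distribute-zh).

Server Manager is a graphical interface for monitoring and managing AgentScope agent server processes (Servers) and large-scale distributed applications.

### Registering a Server Process

To register a Server process, pass the `studio_url` parameter when initializing `RpcAgentServerLauncher`.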

```python
# import some packages
server = RpcAgentServerLauncher(
    # ...
    studio_url="http://studio_ip:studio_port",  # connect to AgentScope Studio
)
```

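For more details on registration, see the *Connecting to AgentScope Studio* part of [distribution](#208-distribute-zh).

### Managing Server Processes

You can open the Server Manager page from the Server Manager button on the AgentScope Studio home page or in the sidebar.
The Server Manager page currently consists of three parts: the Servers list, the Agents list, and the Memory list.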

<h1 align="center">
<img src="https://gw.alicdn.com/imgextra/i2/O1CN01zvhoVE1MMrmbvu4mU_!!6000000001421-0-tps-3204-1854.jpg" width="600" alt="agentscope-manager">
</h1>

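#### Servers List

Every agent server process (Server) registered with Studio is shown in the Servers list on the Server Manager page. The list shows not only the `ID`, `Hostname`, `Port`, and `Created Time` of each Server, but also its state and compute resource usage, including `Status`, `CPU Usage`, and `Memory Usage`.

`Status` takes one of the following values:
- `running`: the Server is running.
- `dead`: the Server has stopped running.
- `unknown`: the Studio service cannot currently be reached.

CPU and memory usage are shown only for Servers in the `running` state. You can click the refresh button on the left of the Servers bar to refresh the Servers list, and click the delete button on the right of the Servers bar to remove all Servers in the `dead` state at once.

The last column of each row in the Servers list provides a delete button that shuts down and deletes the Server. Note that this operation cannot be undone, so use it with care.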

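#### Agents List

Clicking the row of any Server in the `running` state expands the Agents list on the page. It shows all Agents on that Server, with the `ID`, `Name`, `Class`, `System Prompt`, and `Model` of each Agent.

You can likewise refresh the Agents list with the refresh button on the left of the Agents list bar. You can delete a single Agent with the delete button at the far right of its row, or delete all Agents on the Server in bulk with the delete button on the right of the Agents list bar. These delete operations cannot be undone, so use them with care.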

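#### Memory List

Clicking the row of any Agent expands the Memory list on the page. It shows all messages in that Agent's memory; each message shows its `Name` and `Role` values on the left, and clicking a message shows its full content on the right of the list.
Here too, you can click the refresh button on the left of the Memory list bar to refresh the current Memory list.

[[Return to the top]](#209-gui-zh)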
18 changes: 9 additions & 9 deletions src/agentscope/agents/agent.py
Expand Up @@ -64,7 +64,7 @@ def __call__(cls, *args: tuple, **kwargs: dict) -> Any:
),
max_timeout_seconds=to_dist.pop( # type: ignore[arg-type]
"max_timeout_seconds",
1800,
7200,
),
local_mode=to_dist.pop( # type: ignore[arg-type]
"local_mode",
Expand Down Expand Up @@ -100,9 +100,9 @@ def __init__(
host: str = "localhost",
port: int = None,
max_pool_size: int = 8192,
max_timeout_seconds: int = 1800,
max_timeout_seconds: int = 7200,
local_mode: bool = True,
lazy_launch: bool = True,
lazy_launch: bool = False,
):
"""Init the distributed configuration.

Expand All @@ -113,12 +113,12 @@ def __init__(
Port of the rpc agent server.
max_pool_size (`int`, defaults to `8192`):
Max number of task results that the server can accommodate.
max_timeout_seconds (`int`, defaults to `1800`):
max_timeout_seconds (`int`, defaults to `7200`):
Timeout for task results.
local_mode (`bool`, defaults to `True`):
Whether the started rpc server only listens to local
requests.
lazy_launch (`bool`, defaults to `True`):
lazy_launch (`bool`, defaults to `False`):
Only launch the server when the agent is called.
"""
self["host"] = host
Expand Down Expand Up @@ -424,9 +424,9 @@ def to_dist(
host: str = "localhost",
port: int = None,
max_pool_size: int = 8192,
max_timeout_seconds: int = 1800,
max_timeout_seconds: int = 7200,
local_mode: bool = True,
lazy_launch: bool = True,
lazy_launch: bool = False,
launch_server: bool = None,
) -> AgentBase:
"""Convert current agent instance into a distributed version.
Expand All @@ -441,15 +441,15 @@ def to_dist(
The max number of agent reply messages that the started agent
server can accommodate. Note that the oldest message will be
deleted after exceeding the pool size.
max_timeout_seconds (`int`, defaults to `1800`):
max_timeout_seconds (`int`, defaults to `7200`):
Only takes effect when `host` and `port` are not filled in.
Maximum time for reply messages to be cached in the launched
agent server. Note that expired messages will be deleted.
local_mode (`bool`, defaults to `True`):
Only takes effect when `host` and `port` are not filled in.
Whether the started agent server only listens to local
requests.
lazy_launch (`bool`, defaults to `True`):
lazy_launch (`bool`, defaults to `False`):
Only takes effect when `host` and `port` are not filled in.
If `True`, launch the agent server when the agent is called,
otherwise, launch the agent server immediately.
Expand Down
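
The hunks above change two defaults: `max_timeout_seconds` goes from 1800 to 7200 and `lazy_launch` from `True` to `False`. The sketch below restates that with a stand-in class that copies the parameter list shown in the diff; `DistConfSketch` is a hypothetical name and is not imported from `agentscope`.

```python
class DistConfSketch(dict):
    """Stand-in that mirrors the parameters and new defaults shown in the diff."""

    def __init__(
        self,
        host="localhost",
        port=None,
        max_pool_size=8192,
        max_timeout_seconds=7200,  # was 1800 before this change
        local_mode=True,
        lazy_launch=False,  # was True before this change
    ):
        super().__init__(
            host=host,
            port=port,
            max_pool_size=max_pool_size,
            max_timeout_seconds=max_timeout_seconds,
            local_mode=local_mode,
            lazy_launch=lazy_launch,
        )


# New defaults after the change.
new_defaults = DistConfSketch()

# Passing the old values explicitly keeps the previous behaviour.
old_behaviour = DistConfSketch(max_timeout_seconds=1800, lazy_launch=True)
```

Going by the `to_dist` signature in the diff, the same two keyword arguments should be accepted there as well, so callers that relied on the old defaults can pass them explicitly.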