Add DashScope API based multimodal service functions in AgentScope library #255

Merged
merged 11 commits into from
Jun 5, 2024
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -127,6 +127,7 @@ the following libraries.
- Code Execution
- File Operation
- Text Processing
- Multi Modality

**Example Applications**

Expand Down
1 change: 1 addition & 0 deletions README_ZH.md
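Expand Up @@ -109,6 +109,7 @@ AgentScope supports quickly deploying local model services using the following libraries.
- Code Execution
- File Operation
- Text Processing
- Multi-Modal Generation

**Example Applications**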

Expand Down
61 changes: 32 additions & 29 deletions docs/sphinx_doc/en/source/tutorial/204-service.md
Expand Up @@ -12,35 +12,38 @@ AgentScope and how to use them to enhance the capabilities of your agents.

The following table outlines the various Service functions by type. These functions can be called using `agentscope.service.{function_name}`.

| Service Scene | Service Function Name | Description |
|-----------------------------|-----------------------|----------------------------------------------------------------------------------------------------------------|
| Code | `execute_python_code` | Execute a piece of Python code, optionally inside a Docker container. |
| Retrieval | `retrieve_from_list` | Retrieve a specific item from a list based on given criteria. |
| | `cos_sim` | Compute the cosine similarity between two different embeddings. |
| SQL Query | `query_mysql` | Execute SQL queries on a MySQL database and return results. |
| | `query_sqlite` | Execute SQL queries on a SQLite database and return results. |
| | `query_mongodb` | Perform queries or operations on a MongoDB collection. |
| Text Processing | `summarization` | Summarize a piece of text using a large language model to highlight its main points. |
| Web | `bing_search` | Perform a Bing search. |
| | `google_search` | Perform a Google search. |
| | `arxiv_search` | Perform an arXiv search. |
| | `download_from_url` | Download file from given URL. |
| | `load_web` | Load and parse the web page of the specified url (currently only supports HTML). |
| | `digest_webpage` | Digest the content of an already loaded web page (currently only supports HTML). |
| | `dblp_search_publications` | Search for publications in the DBLP database. |
| | `dblp_search_authors` | Search for author information in the DBLP database |
| | `dblp_search_venues` | Search for venue information in the DBLP database |
| File | `create_file` | Create a new file at a specified path, optionally with initial content. |
| | `delete_file` | Delete a file specified by a file path. |
| | `move_file` | Move or rename a file from one path to another. |
| | `create_directory` | Create a new directory at a specified path. |
| | `delete_directory` | Delete a directory and all its contents. |
| | `move_directory` | Move or rename a directory from one path to another. |
| | `read_text_file` | Read and return the content of a text file. |
| | `write_text_file` | Write text content to a file at a specified path. |
| | `read_json_file` | Read and parse the content of a JSON file. |
| | `write_json_file` | Serialize a Python object to JSON and write to a file. |
| *More services coming soon* | | More service functions are in development and will be added to AgentScope to further enhance its capabilities. |
| Service Scene | Service Function Name | Description |
|-----------------------------|----------------------------|----------------------------------------------------------------------------------------------------------------|
| Code | `execute_python_code` | Execute a piece of Python code, optionally inside a Docker container. |
| Retrieval | `retrieve_from_list` | Retrieve a specific item from a list based on given criteria. |
| | `cos_sim` | Compute the cosine similarity between two different embeddings. |
| SQL Query | `query_mysql` | Execute SQL queries on a MySQL database and return results. |
| | `query_sqlite` | Execute SQL queries on a SQLite database and return results. |
| | `query_mongodb` | Perform queries or operations on a MongoDB collection. |
| Text Processing | `summarization` | Summarize a piece of text using a large language model to highlight its main points. |
| Web | `bing_search` | Perform a Bing search. |
| | `google_search` | Perform a Google search. |
| | `arxiv_search` | Perform an arXiv search. |
| | `download_from_url` | Download file from given URL. |
| | `load_web` | Load and parse the web page of the specified url (currently only supports HTML). |
| | `digest_webpage` | Digest the content of an already loaded web page (currently only supports HTML). |
| | `dblp_search_publications` | Search for publications in the DBLP database. |
| | `dblp_search_authors` | Search for author information in the DBLP database |
| | `dblp_search_venues` | Search for venue information in the DBLP database |
| File | `create_file` | Create a new file at a specified path, optionally with initial content. |
| | `delete_file` | Delete a file specified by a file path. |
| | `move_file` | Move or rename a file from one path to another. |
| | `create_directory` | Create a new directory at a specified path. |
| | `delete_directory` | Delete a directory and all its contents. |
| | `move_directory` | Move or rename a directory from one path to another. |
| | `read_text_file` | Read and return the content of a text file. |
| | `write_text_file` | Write text content to a file at a specified path. |
| | `read_json_file` | Read and parse the content of a JSON file. |
| | `write_json_file` | Serialize a Python object to JSON and write to a file. |
| Multi Modality | `dashscope_text_to_image` | Convert text to image using the DashScope API. |
| | `dashscope_image_to_text` | Convert image to text using the DashScope API. |
| | `dashscope_text_to_audio` | Convert text to audio using the DashScope API. |
| *More services coming soon* | | More service functions are in development and will be added to AgentScope to further enhance its capabilities. |

For detailed information about each service function, see the
[API documentation](https://modelscope.github.io/agentscope/).
Expand Down
11 changes: 7 additions & 4 deletions docs/sphinx_doc/zh_CN/source/tutorial/204-service.md
Expand Up @@ -13,7 +13,7 @@
|------------|-----------------------|-----------------------------------------|
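| Code | `execute_python_code` | Execute a piece of Python code, optionally inside a Docker <br/>container. |
| Retrieval | `retrieve_from_list` | Retrieve specific items from a list based on given criteria. |
| | `cos_sim` | Compute the cosine similarity between two embeddings. |
| | `cos_sim` | Compute the cosine similarity between two embeddings. |
| SQL Query | `query_mysql` | Execute SQL queries on a MySQL database and return results. |
| | `query_sqlite` | Execute SQL queries on a SQLite database and return results. |
| | `query_mongodb` | Perform queries or operations on a MongoDB collection. |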
Expand All @@ -24,9 +24,9 @@
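| | `download_from_url` | Download a file from the specified URL. |
| | `load_web` | Crawl and parse the specified web page link (currently only supports HTML pages). |
| | `digest_webpage` | Generate a summary of an already crawled web page (currently only supports HTML pages). |
| | `dblp_search_publications` | Search for publications in the dblp database. |
| | `dblp_search_authors` | Search for authors in the dblp database. |
| | `dblp_search_venues` | Search for journals, conferences, and workshops in the dblp database. |
| | `dblp_search_publications` | Search for publications in the dblp database. |
| | `dblp_search_authors` | Search for authors in the dblp database. |
| | `dblp_search_venues` | Search for journals, conferences, and workshops in the dblp database. |
| File | `create_file` | Create a new file at a specified path, optionally with initial content. |
| | `delete_file` | Delete the file specified by a file path. |
| | `move_file` | Move or rename a file from one path to another. |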
Expand All @@ -37,6 +37,9 @@
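| | `write_text_file` | Write text content to a file at a specified path. |
| | `read_json_file` | Read and parse the content of a JSON file. |
| | `write_json_file` | Serialize a Python object to JSON and write it to a file. |
| Multi Modality | `dashscope_text_to_image` | Generate an image from text using the DashScope API. |
| | `dashscope_image_to_text` | Generate text from an image using the DashScope API. |
| | `dashscope_text_to_audio` | Generate audio from text using the DashScope API. |
| *More services coming soon* | | More service functions are in development and will be added to AgentScope to further enhance its capabilities. |

For detailed parameters, expected input formats, and return types, see the [API documentation](https://modelscope.github.io/agentscope/).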
Expand Down
8 changes: 4 additions & 4 deletions examples/conversation_with_customized_services/main.ipynb
Expand Up @@ -241,7 +241,7 @@
" weather_data = weather.run(f\"{city},{country}\")\n",
" return ServiceResponse(ServiceExecStatus.SUCCESS, weather_data)\n",
" except Exception as e:\n",
" return ServiceResponse(ServiceExecStatus.FAILURE, str(e))"
" return ServiceResponse(ServiceExecStatus.ERROR, str(e))"
]
},
{
Expand Down Expand Up @@ -408,7 +408,7 @@
" return ServiceResponse(ServiceExecStatus.SUCCESS, {\"urls\": urls})\n",
" else:\n",
" err_msg = f\"status_code: {response.status_code}, code: {response.code}, message: {response.message}\"\n",
" return ServiceResponse(ServiceExecStatus.FAILURE, err_msg)"
" return ServiceResponse(ServiceExecStatus.ERROR, err_msg)"
]
},
{
Expand Down Expand Up @@ -457,7 +457,7 @@
" return ServiceResponse(ServiceExecStatus.SUCCESS, description)\n",
" else:\n",
" err_msg = f\"status_code: {response.status_code}, code: {response.code}, message: {response.message}\"\n",
" return ServiceResponse(ServiceExecStatus.FAILURE, err_msg) \n",
" return ServiceResponse(ServiceExecStatus.ERROR, err_msg) \n",
" "
]
},
Expand Down Expand Up @@ -500,7 +500,7 @@
" f.write(result.get_audio_data())\n",
" return ServiceResponse(ServiceExecStatus.SUCCESS, 'output.wav')\n",
" else:\n",
" return ServiceResponse(ServiceExecStatus.FAILURE, \"Failed to generate audio file\")"
" return ServiceResponse(ServiceExecStatus.ERROR, \"Failed to generate audio file\")"
]
},
{
Expand Down
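The notebook hunks above all make the same substitution: the error branch returns `ServiceExecStatus.ERROR` instead of `ServiceExecStatus.FAILURE`. The sketch below shows that return pattern in isolation. `ExecStatus`, `Response`, and `safe_divide` are hypothetical stand-ins defined here for illustration and are not AgentScope's classes. That the real enum has no `FAILURE` member is an assumption inferred from the diff, not something the diff states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class ExecStatus(Enum):
    """Stand-in status enum with only the two members used in the diff."""

    SUCCESS = "success"
    ERROR = "error"


@dataclass
class Response:
    """Stand-in for a service response: a status plus a payload or message."""

    status: ExecStatus
    content: Any


def safe_divide(a: float, b: float) -> Response:
    """Hypothetical service function that reports failures via the status."""
    try:
        return Response(ExecStatus.SUCCESS, a / b)
    except Exception as e:
        # Failures are returned to the caller instead of being raised.
        return Response(ExecStatus.ERROR, str(e))


ok = safe_divide(6, 3)
failed = safe_divide(1, 0)

# Referencing a member that the enum does not define raises AttributeError,
# which is how a stale name such as FAILURE would surface at call time.
has_failure_member = hasattr(ExecStatus, "FAILURE")
```

Because the caller receives a status rather than an exception, it can branch on `status` without wrapping each call in its own `try`/`except`.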
12 changes: 7 additions & 5 deletions src/agentscope/models/dashscope_model.py
Expand Up @@ -606,7 +606,9 @@ def __call__(
messages=messages,
**kwargs,
)

# Unhandled code path: when streaming is enabled, `response` may be a
# generator rather than a single response object, so it has no
# `status_code` attribute. Consider adding a type check here.
if response.status_code != HTTPStatus.OK:
error_msg = (
f" Request id: {response.request_id},"
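The review comment in this hunk notes that `response` may be a generator when streaming is enabled, in which case reading `response.status_code` directly would fail. One possible guard is sketched below. The helper names `resolve_response` and `ensure_ok`, the `fake_stream` generator, and the policy of keeping only the last chunk are all assumptions made for illustration. The diff does not show how the DashScope SDK actually delivers streamed responses.

```python
from http import HTTPStatus
from types import GeneratorType, SimpleNamespace
from typing import Any


def resolve_response(response: Any) -> Any:
    """Return one response object, consuming a generator if one was given."""
    if isinstance(response, GeneratorType):
        final = None
        for final in response:  # keep only the last chunk
            pass
        if final is None:
            raise RuntimeError("The stream produced no chunks.")
        return final
    return response


def ensure_ok(response: Any) -> Any:
    """Raise if the resolved response does not report HTTP 200."""
    response = resolve_response(response)
    if response.status_code != HTTPStatus.OK:
        raise RuntimeError(
            f"Request id: {response.request_id}, "
            f"status code: {response.status_code}",
        )
    return response


def fake_stream():
    """Hypothetical two-chunk stream standing in for a streamed reply."""
    yield SimpleNamespace(status_code=HTTPStatus.OK, request_id="r1")
    yield SimpleNamespace(status_code=HTTPStatus.OK, request_id="r2")


single = ensure_ok(SimpleNamespace(status_code=HTTPStatus.OK, request_id="a"))
streamed = ensure_ok(fake_stream())
```

Draining the stream is only one policy. A caller that wants incremental output would check each chunk's status as it arrives.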
Expand Down Expand Up @@ -770,7 +772,7 @@ def format(
for i, unit in enumerate(input_msgs):
if i == 0 and unit.role == "system":
# system prompt
content = self._convert_url(unit.url)
content = self.convert_url(unit.url)
content.append({"text": _convert_to_str(unit.content)})

messages.append(
Expand All @@ -785,7 +787,7 @@ def format(
f"{unit.name}: {_convert_to_str(unit.content)}",
)
# image and audio
image_or_audio_dicts.extend(self._convert_url(unit.url))
image_or_audio_dicts.extend(self.convert_url(unit.url))

dialogue_history = "\n".join(dialogue)

Expand All @@ -808,7 +810,7 @@ def format(

return messages

def _convert_url(self, url: Union[str, Sequence[str], None]) -> List[dict]:
def convert_url(self, url: Union[str, Sequence[str], None]) -> List[dict]:
"""Convert the url to the format of DashScope API. Note for local
files, a prefix "file://" will be added.

Expand Down Expand Up @@ -841,7 +843,7 @@ def _convert_url(self, url: Union[str, Sequence[str], None]) -> List[dict]:
elif isinstance(url, list):
dicts = []
for _ in url:
dicts.extend(self._convert_url(_))
dicts.extend(self.convert_url(_))
return dicts
else:
raise TypeError(
Expand Down
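The hunks above rename `_convert_url` to `convert_url`, which makes the helper part of the class's public interface. Its body is mostly elided in the diff, so the standalone function below is only a sketch of the behavior the docstring and the visible branches describe: `None` yields an empty list, a string yields a single dict, a list is flattened recursively, and local files get a `file://` prefix. The extension tuples, the error messages, and the acceptance of tuples as well as lists are assumptions made for illustration, not the library's implementation.

```python
import os
from typing import List, Sequence, Union

# Assumed extension lists, for illustration only.
_IMAGE_EXT = (".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp")
_AUDIO_EXT = (".wav", ".mp3", ".flac", ".ogg")


def convert_url(url: Union[str, Sequence[str], None]) -> List[dict]:
    """Convert one URL, or a list of URLs, into a list of single-key dicts."""
    if url is None:
        return []
    if isinstance(url, str):
        lowered = url.lower()
        if lowered.endswith(_IMAGE_EXT):
            kind = "image"
        elif lowered.endswith(_AUDIO_EXT):
            kind = "audio"
        else:
            raise TypeError(f"Unsupported file type: {url}")
        # Local files get a "file://" prefix; remote URLs pass through.
        if os.path.exists(url):
            url = "file://" + url
        return [{kind: url}]
    if isinstance(url, (list, tuple)):
        dicts: List[dict] = []
        for item in url:
            dicts.extend(convert_url(item))
        return dicts
    raise TypeError(f"Unsupported url type: {type(url)}")


mixed = convert_url(
    ["https://example.com/a.png", "https://example.com/b.wav"],
)
```

The string case must be tested before the sequence case, because a `str` is itself a sequence of characters.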
8 changes: 8 additions & 0 deletions src/agentscope/service/__init__.py
Expand Up @@ -26,6 +26,11 @@
dblp_search_authors,
dblp_search_venues,
)
from .multi_modality.dashscope_services import (
dashscope_image_to_text,
dashscope_text_to_image,
dashscope_text_to_audio,
)
from .service_response import ServiceResponse
from .service_toolkit import ServiceToolkit
from .service_toolkit import ServiceFactory
Expand Down Expand Up @@ -78,6 +83,9 @@ def get_help() -> None:
"dblp_search_publications",
"dblp_search_authors",
"dblp_search_venues",
"dashscope_image_to_text",
"dashscope_text_to_image",
"dashscope_text_to_audio",
# to be deprecated
"ServiceFactory",
]