[paddle-pipelines] FAQ, semantic search, question answering README #8292

Merged: 3 commits, Apr 21, 2024
5 changes: 4 additions & 1 deletion pipelines/examples/FAQ/Install_windows.md
@@ -58,7 +58,7 @@ xpack.security.enabled: false
#### 1.4.2 Write document data to the ANN index
```
# Build the ANN index, using the DuReader-Robust dataset as an example
python utils/offline_ann.py --index_name insurance --doc_dir data/insurance --split_answers --delete_index
python utils/offline_ann.py --index_name insurance --doc_dir data/insurance --split_answers --delete_index --query_embedding_model rocketqa-zh-nano-query-encoder --passage_embedding_model rocketqa-zh-nano-para-encoder --embedding_dim 312
```
Parameter descriptions
* `index_name`: name of the index
@@ -71,6 +71,9 @@ python utils/offline_ann.py --index_name insurance --doc_dir data/insurance --sp
After the run finishes, you can view the data in Kibana

#### 1.4.3 Start the RestAPI model service

**Note** The retrieval model in dense_faq.yaml must be the same one used earlier when building the index with offline_ann.py

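A quick pre-flight check can catch a mismatch between the indexing arguments and the pipeline YAML before the service starts. The sketch below is illustrative only: `read_scalar_settings` and `find_mismatches` are hypothetical helpers, not part of PaddleNLP pipelines, and it assumes the YAML file stores `query_embedding_model`, `passage_embedding_model`, and `embedding_dim` as plain `key: value` lines, which this diff does not show.

```python
import re


def read_scalar_settings(yaml_text, keys):
    """Collect the first `key: value` scalar for each key from YAML-like text."""
    found = {}
    for key in keys:
        # Match "key: value" at any indentation; the key must start the line.
        match = re.search(
            rf"^\s*{re.escape(key)}\s*:\s*(\S+)", yaml_text, re.MULTILINE
        )
        if match:
            found[key] = match.group(1).strip("'\"")
    return found


def find_mismatches(yaml_text, index_args):
    """Return {key: (value used for indexing, value in YAML or None)} for every difference."""
    actual = read_scalar_settings(yaml_text, index_args.keys())
    return {
        key: (str(value), actual.get(key))
        for key, value in index_args.items()
        if actual.get(key) != str(value)
    }


# Values passed to offline_ann.py in the command above.
index_args = {
    "query_embedding_model": "rocketqa-zh-nano-query-encoder",
    "passage_embedding_model": "rocketqa-zh-nano-para-encoder",
    "embedding_dim": 312,
}

# Assumed YAML layout, for illustration only.
sample_yaml = """
components:
  - name: Retriever
    params:
      query_embedding_model: rocketqa-zh-base-query-encoder
      passage_embedding_model: rocketqa-zh-nano-para-encoder
      embedding_dim: 312
"""

for key, (expected, got) in find_mismatches(sample_yaml, index_args).items():
    print(f"{key}: indexed with {expected}, YAML has {got}")
```

In practice `yaml_text` would be read from the file named by `PIPELINE_YAML_PATH`. A real YAML parser would be more robust than the regular expression used here, which only looks at the first occurrence of each key.
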
```bash
# Specify the YAML configuration file for the FAQ question-answering system
$env:PIPELINE_YAML_PATH='rest_api/pipeline/dense_faq.yaml'
9 changes: 8 additions & 1 deletion pipelines/examples/FAQ/README.md
@@ -113,7 +113,11 @@ curl http://localhost:9200/_aliases?pretty=true
python utils/offline_ann.py --index_name insurance \
--doc_dir data/insurance \
--split_answers \
--delete_index
--delete_index \
--query_embedding_model rocketqa-zh-nano-query-encoder \
--passage_embedding_model rocketqa-zh-nano-para-encoder \
--embedding_dim 312

```
Parameter descriptions
* `index_name`: name of the index
@@ -134,6 +138,9 @@ curl http://localhost:9200/insurance/_search
```

#### 3.4.3 Start the RestAPI model service

**Note** The retrieval model in dense_faq.yaml must be the same one used earlier when building the index with offline_ann.py

```bash
# Specify the YAML configuration file for the FAQ question-answering system
export PIPELINE_YAML_PATH=rest_api/pipeline/dense_faq.yaml
5 changes: 4 additions & 1 deletion pipelines/examples/question-answering/Install_windows.md
@@ -62,7 +62,7 @@ xpack.security.enabled: false
#### 1.4.2 Write document data to the ANN index
```
# Build the ANN index, using the Baike (encyclopedia) city data as an example
python utils/offline_ann.py --index_name baike_cities --doc_dir data/baike
python utils/offline_ann.py --index_name baike_cities --doc_dir data/baike --query_embedding_model rocketqa-zh-nano-query-encoder --passage_embedding_model rocketqa-zh-nano-para-encoder --embedding_dim 312
```
Parameter descriptions
* `index_name`: name of the index
@@ -82,6 +82,9 @@ Updating embeddings: 10000 Docs [00:16, 617.76 Docs/s]
After the run finishes, you can view the data in Kibana

#### 1.4.3 Start the RestAPI model service

**Note** The retrieval model in dense_qa.yaml must be the same one used earlier when building the index with offline_ann.py

```bash
# Specify the YAML configuration file for the question-answering system
$env:PIPELINE_YAML_PATH='rest_api/pipeline/dense_qa.yaml'
8 changes: 7 additions & 1 deletion pipelines/examples/question-answering/README.md
@@ -111,7 +111,10 @@ curl http://localhost:9200/_aliases?pretty=true
# Build the ANN index, using the Baike (encyclopedia) city data as an example
python utils/offline_ann.py --index_name baike_cities \
--doc_dir data/baike \
--delete_index
--delete_index \
--query_embedding_model rocketqa-zh-nano-query-encoder \
--passage_embedding_model rocketqa-zh-nano-para-encoder \
--embedding_dim 312
```
Parameter descriptions
* `index_name`: name of the index
@@ -138,6 +141,9 @@ curl -XGET http://localhost:9200/baike_cities/_count
```

#### 3.4.3 Start the RestAPI model service

**Note** The retrieval model in dense_qa.yaml must be the same one used earlier when building the index with offline_ann.py

```bash
# Specify the YAML configuration file for the question-answering system
export PIPELINE_YAML_PATH=rest_api/pipeline/dense_qa.yaml
3 changes: 3 additions & 0 deletions pipelines/examples/semantic-search/Install_windows.md
@@ -71,6 +71,9 @@ python utils/offline_ann.py --index_name dureader_robust_query_encoder --doc_dir
After the run finishes, you can view the data in Kibana

#### 1.4.3 Start the RestAPI model service

**Note** The retrieval model in semantic_search.yaml must be the same one used earlier when building the index with offline_ann.py

```bash
# Specify the YAML configuration file for the semantic search system
$env:PIPELINE_YAML_PATH='rest_api/pipeline/semantic_search.yaml'
8 changes: 7 additions & 1 deletion pipelines/examples/semantic-search/Multi_Recall.md
@@ -123,7 +123,10 @@ python examples/semantic-search/multi_recall_semantic_search_example.py --device
python utils/offline_ann.py --index_name dureader_robust_query_encoder \
--doc_dir data/dureader_dev \
--search_engine elastic \
--delete_index
--delete_index \
--query_embedding_model rocketqa-zh-base-query-encoder \
--passage_embedding_model rocketqa-zh-base-para-encoder \
--embedding_dim 768
```
You can view the data with the following command:

@@ -141,6 +144,9 @@ curl http://localhost:9200/dureader_robust_query_encoder/_search
* `delete_index`: whether to delete the existing index and its data, used to clear the data in ES; defaults to false

#### 3.4.2 Start the RestAPI model service

**Note** The retrieval model in multi_recall_semantic_search.yaml must be the same one used earlier when building the index with offline_ann.py

```bash
# Specify the YAML configuration file for the semantic search system
export PIPELINE_YAML_PATH=rest_api/pipeline/multi_recall_semantic_search.yaml
8 changes: 7 additions & 1 deletion pipelines/examples/semantic-search/README.md
@@ -122,7 +122,10 @@ python utils/offline_ann.py --index_name dureader_robust_query_encoder \
--doc_dir data/dureader_dev \
--search_engine elastic \
--embed_title True \
--delete_index
--delete_index \
--query_embedding_model rocketqa-zh-base-query-encoder \
--passage_embedding_model rocketqa-zh-base-para-encoder \
--embedding_dim 768
```
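
The commands in this diff pair each model size with one embedding dimension: 312 for the `rocketqa-zh-nano-*` encoders and 768 for the `rocketqa-zh-base-*` encoders. A minimal sketch of keeping those three flags in step, assuming a hypothetical helper `build_index_command` (not part of the repository) and covering only the two sizes that appear here:

```python
import shlex

# Dimensions as used by the commands in this diff; other model sizes are not covered.
EMBEDDING_DIMS = {"nano": 312, "base": 768}


def build_index_command(index_name, doc_dir, size, delete_index=True):
    """Build an offline_ann.py argument list with matching encoders and dimension."""
    if size not in EMBEDDING_DIMS:
        raise ValueError(f"unknown model size: {size!r}")
    argv = [
        "python", "utils/offline_ann.py",
        "--index_name", index_name,
        "--doc_dir", doc_dir,
        "--query_embedding_model", f"rocketqa-zh-{size}-query-encoder",
        "--passage_embedding_model", f"rocketqa-zh-{size}-para-encoder",
        "--embedding_dim", str(EMBEDDING_DIMS[size]),
    ]
    if delete_index:
        argv.append("--delete_index")
    return argv


# Print a shell-ready command line for the semantic search example.
print(shlex.join(build_index_command(
    "dureader_robust_query_encoder", "data/dureader_dev", "base")))
```

Deriving the encoder names and the dimension from a single `size` argument means they cannot drift apart; flags specific to one example, such as `--search_engine` or `--embed_title`, would still need to be appended by hand.
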
You can view the data with the following command:

@@ -147,6 +150,9 @@ curl -XDELETE http://localhost:9200/dureader_robust_query_encoder
```

#### 3.4.3 Start the RestAPI model service

**Note** The retrieval model in semantic_search.yaml must be the same one used earlier when building the index with offline_ann.py

```bash
# Specify the YAML configuration file for the semantic search system
export PIPELINE_YAML_PATH=rest_api/pipeline/semantic_search.yaml