[paddle-pipelines] faq semantic search question answering readme (#8292)
* Update dense qa readme

* Update readme

* Update semantic search readme
w5688414 authored Apr 21, 2024
1 parent 3bb4bb7 commit 4039897
Showing 7 changed files with 40 additions and 6 deletions.
5 changes: 4 additions & 1 deletion pipelines/examples/FAQ/Install_windows.md
@@ -58,7 +58,7 @@ xpack.security.enabled: false
 #### 1.4.2 Write document data to the ANN index
 ```
 # Build an ANN index using the DuReader-Robust dataset as an example
-python utils/offline_ann.py --index_name insurance --doc_dir data/insurance --split_answers --delete_index
+python utils/offline_ann.py --index_name insurance --doc_dir data/insurance --split_answers --delete_index --query_embedding_model rocketqa-zh-nano-query-encoder --passage_embedding_model rocketqa-zh-nano-para-encoder --embedding_dim 312
 ```
 Parameter descriptions
 * `index_name`: name of the index
@@ -71,6 +71,9 @@ python utils/offline_ann.py --index_name insurance --doc_dir data/insurance --sp
 After the run finishes, you can inspect the data in Kibana

 #### 1.4.3 Start the RestAPI model service
+
+**Note**: the retrieval model in dense_faq.yaml must be the same retrieval model that was used when building the index with offline_ann.py above
+
 ```bash
 # Specify the YAML config file for the FAQ question answering system
 $env:PIPELINE_YAML_PATH='rest_api/pipeline/dense_faq.yaml'
9 changes: 8 additions & 1 deletion pipelines/examples/FAQ/README.md
@@ -113,7 +113,11 @@ curl http://localhost:9200/_aliases?pretty=true
 python utils/offline_ann.py --index_name insurance \
                             --doc_dir data/insurance \
                             --split_answers \
-                            --delete_index
+                            --delete_index \
+                            --query_embedding_model rocketqa-zh-nano-query-encoder \
+                            --passage_embedding_model rocketqa-zh-nano-para-encoder \
+                            --embedding_dim 312
 ```
 Parameter descriptions
 * `index_name`: name of the index
@@ -134,6 +138,9 @@ curl http://localhost:9200/insurance/_search
 ```

 #### 3.4.3 Start the RestAPI model service
+
+**Note**: the retrieval model in dense_faq.yaml must be the same retrieval model that was used when building the index with offline_ann.py above
+
 ```bash
 # Specify the YAML config file for the FAQ question answering system
 export PIPELINE_YAML_PATH=rest_api/pipeline/dense_faq.yaml
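
As context for the note added above: the new `--query_embedding_model`, `--passage_embedding_model`, and `--embedding_dim` flags must agree with the retriever configured in `dense_faq.yaml`. A minimal sketch of the matching entries, assuming the pipelines-style document store and `DensePassageRetriever` component layout (the actual structure of `dense_faq.yaml` may differ):

```yaml
# Hypothetical excerpt from rest_api/pipeline/dense_faq.yaml; component names and
# fields are assumptions based on the pipelines convention -- verify against the file.
components:
  - name: DocumentStore
    type: ElasticsearchDocumentStore
    params:
      host: localhost
      index: insurance
      embedding_dim: 312                                      # must match --embedding_dim
  - name: Retriever
    type: DensePassageRetriever
    params:
      document_store: DocumentStore
      query_embedding_model: rocketqa-zh-nano-query-encoder   # must match --query_embedding_model
      passage_embedding_model: rocketqa-zh-nano-para-encoder  # must match --passage_embedding_model
```

If the index was built with different encoders (or a different `embedding_dim`), queries at serving time would be embedded into a space that does not match the stored vectors, so retrieval quality silently degrades.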
5 changes: 4 additions & 1 deletion pipelines/examples/question-answering/Install_windows.md
@@ -62,7 +62,7 @@ xpack.security.enabled: false
 #### 1.4.2 Write document data to the ANN index
 ```
 # Build an ANN index using the Baike city data as an example
-python utils/offline_ann.py --index_name baike_cities --doc_dir data/baike
+python utils/offline_ann.py --index_name baike_cities --doc_dir data/baike --query_embedding_model rocketqa-zh-nano-query-encoder --passage_embedding_model rocketqa-zh-nano-para-encoder --embedding_dim 312
 ```
 Parameter descriptions
 * `index_name`: name of the index
@@ -82,6 +82,9 @@ Updating embeddings: 10000 Docs [00:16, 617.76 Docs/s]
 After the run finishes, you can inspect the data in Kibana

 #### 1.4.3 Start the RestAPI model service
+
+**Note**: the retrieval model in dense_qa.yaml must be the same retrieval model that was used when building the index with offline_ann.py above
+
 ```bash
 # Specify the YAML config file for the question answering system
 $env:PIPELINE_YAML_PATH='rest_api/pipeline/dense_qa.yaml'
8 changes: 7 additions & 1 deletion pipelines/examples/question-answering/README.md
@@ -111,7 +111,10 @@ curl http://localhost:9200/_aliases?pretty=true
 # Build an ANN index using the Baike city data as an example
 python utils/offline_ann.py --index_name baike_cities \
                             --doc_dir data/baike \
-                            --delete_index
+                            --delete_index \
+                            --query_embedding_model rocketqa-zh-nano-query-encoder \
+                            --passage_embedding_model rocketqa-zh-nano-para-encoder \
+                            --embedding_dim 312
 ```
 Parameter descriptions
 * `index_name`: name of the index
@@ -138,6 +141,9 @@ curl -XGET http://localhost:9200/baike_cities/_count
 ```

 #### 3.4.3 Start the RestAPI model service
+
+**Note**: the retrieval model in dense_qa.yaml must be the same retrieval model that was used when building the index with offline_ann.py above
+
 ```bash
 # Specify the YAML config file for the question answering system
 export PIPELINE_YAML_PATH=rest_api/pipeline/dense_qa.yaml
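
The same consistency rule applies to the question answering pipeline; a hypothetical retriever entry in `rest_api/pipeline/dense_qa.yaml` (structure assumed, as in the sketch above) would mirror the flags used for the `baike_cities` index:

```yaml
# Hypothetical excerpt from rest_api/pipeline/dense_qa.yaml (structure assumed)
components:
  - name: DocumentStore
    type: ElasticsearchDocumentStore
    params:
      index: baike_cities
      embedding_dim: 312   # the nano encoders pair with 312-dim vectors in this commit
  - name: Retriever
    type: DensePassageRetriever
    params:
      document_store: DocumentStore
      query_embedding_model: rocketqa-zh-nano-query-encoder
      passage_embedding_model: rocketqa-zh-nano-para-encoder
```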
3 changes: 3 additions & 0 deletions pipelines/examples/semantic-search/Install_windows.md
@@ -71,6 +71,9 @@ python utils/offline_ann.py --index_name dureader_robust_query_encoder --doc_dir
 After the run finishes, you can inspect the data in Kibana

 #### 1.4.3 Start the RestAPI model service
+
+**Note**: the retrieval model in semantic_search.yaml must be the same retrieval model that was used when building the index with offline_ann.py above
+
 ```bash
 # Specify the YAML config file for the semantic search system
 $env:PIPELINE_YAML_PATH='rest_api/pipeline/semantic_search.yaml'
8 changes: 7 additions & 1 deletion pipelines/examples/semantic-search/Multi_Recall.md
@@ -123,7 +123,10 @@ python examples/semantic-search/multi_recall_semantic_search_example.py --device
 python utils/offline_ann.py --index_name dureader_robust_query_encoder \
                             --doc_dir data/dureader_dev \
                             --search_engine elastic \
-                            --delete_index
+                            --delete_index \
+                            --query_embedding_model rocketqa-zh-base-query-encoder \
+                            --passage_embedding_model rocketqa-zh-base-para-encoder \
+                            --embedding_dim 768
 ```
 You can inspect the data with the following command:

@@ -141,6 +144,9 @@ curl http://localhost:9200/dureader_robust_query_encoder/_search
 * `delete_index`: whether to delete the existing index and data (used to clear the Elasticsearch data); defaults to false

 #### 3.4.2 Start the RestAPI model service
+
+**Note**: the retrieval model in multi_recall_semantic_search.yaml must be the same retrieval model that was used when building the index with offline_ann.py above
+
 ```bash
 # Specify the YAML config file for the semantic search system
 export PIPELINE_YAML_PATH=rest_api/pipeline/multi_recall_semantic_search.yaml
8 changes: 7 additions & 1 deletion pipelines/examples/semantic-search/README.md
@@ -122,7 +122,10 @@ python utils/offline_ann.py --index_name dureader_robust_query_encoder \
                             --doc_dir data/dureader_dev \
                             --search_engine elastic \
                             --embed_title True \
-                            --delete_index
+                            --delete_index \
+                            --query_embedding_model rocketqa-zh-base-query-encoder \
+                            --passage_embedding_model rocketqa-zh-base-para-encoder \
+                            --embedding_dim 768
 ```
 You can inspect the data with the following command:

@@ -147,6 +150,9 @@ curl -XDELETE http://localhost:9200/dureader_robust_query_encoder
 ```

 #### 3.4.3 Start the RestAPI model service
+
+**Note**: the retrieval model in semantic_search.yaml must be the same retrieval model that was used when building the index with offline_ann.py above
+
 ```bash
 # Specify the YAML config file for the semantic search system
 export PIPELINE_YAML_PATH=rest_api/pipeline/semantic_search.yaml
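
The semantic search pipelines switch to the base encoders, so the dimension changes as well. A minimal sketch, again assuming the same component layout (for the multi-recall pipeline, only the dense-retriever entry in `multi_recall_semantic_search.yaml` would be affected; that file likely also contains a sparse retriever, which has no embedding model to match):

```yaml
# Hypothetical excerpt from rest_api/pipeline/semantic_search.yaml (structure assumed)
components:
  - name: DocumentStore
    type: ElasticsearchDocumentStore
    params:
      index: dureader_robust_query_encoder
      embedding_dim: 768   # the base encoders pair with 768-dim vectors in this commit
  - name: Retriever
    type: DensePassageRetriever
    params:
      document_store: DocumentStore
      query_embedding_model: rocketqa-zh-base-query-encoder   # base encoder, not nano
      passage_embedding_model: rocketqa-zh-base-para-encoder
```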
