✨ feat: support user-defined embedding model #4208

cookieY · 2024-09-29T06:40:02Z

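💻 Change Type

🔀 Description of Change

This change makes the embedding model configurable.

A new environment variable, DEFAULT_EMBEDDING_MODEL, lets users choose the embedding model. The provider must come from the existing list of model providers.

Example:

The value uses / as the separator: openai is the model provider and text-embedding-3-small is the embedding model.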

DEFAULT_EMBEDDING_MODEL=openai/text-embedding-3-small
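
A minimal sketch of how a `provider/model` value like the one above could be split. The names `EmbeddingModelRef` and `parseEmbeddingModel` are hypothetical and are not taken from this PR; splitting on the first `/` is an assumption, chosen so that a model id containing further slashes stays intact.

```typescript
// Hypothetical shape for a parsed "provider/model" value; not the PR's actual type.
interface EmbeddingModelRef {
  provider: string;
  model: string;
}

// Split on the first "/" only. Returns undefined for missing or malformed input
// so the caller can decide which default to apply.
function parseEmbeddingModel(raw: string | undefined): EmbeddingModelRef | undefined {
  if (!raw) return undefined;
  const separator = raw.indexOf('/');
  if (separator === -1) return undefined;
  const provider = raw.slice(0, separator).trim();
  const model = raw.slice(separator + 1).trim();
  if (!provider || !model) return undefined;
  return { provider, model };
}
```

Under these assumptions, `parseEmbeddingModel('openai/text-embedding-3-small')` yields the provider `openai` and the model `text-embedding-3-small`, while a value without a `/` yields `undefined`.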

📝 Additional Information

Embedding models from the openai, bedrock, ollama, and zhipu providers are currently supported.

To support more embedding models, implement the embeddings method for the relevant provider under agent-runtime.

lobehubbot · 2024-09-30T08:50:54Z

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿

The index and object attributes are being stored in the data table.

Index and object should not be stored in the data table, right? Is there a problem with the implementation?

From what I have seen so far, these two values do not receive any special logical processing. If that is the case, I think it is best to remove them. At present, ollama and bedrock do not return these two attributes, so to stay compatible with the existing methods their results have to be iterated over again and wrapped, which wastes performance.

So I suggest returning the array of embeddings directly, without the object and index fields. I recommend first opening a refactor PR that changes the existing implementation to a version without object and index.


cookieY · 2024-10-01T05:00:43Z

@arvinxx The requested format is now supported:
DEFAULT_FILES_CONFIG="embedding_model=openai/embedding-text-3-large,reranker_model=cohere/rerank-english-v3.0,query_mode=full_text"
When the user leaves it unset, openai/text-embedding-3-small is used by default.
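
As an illustration of the DEFAULT_FILES_CONFIG format above, here is a minimal parsing sketch. The type and function names (`FilesConfig`, `parseFilesConfig`, `parseModelRef`) are hypothetical and do not come from the PR; the only behavior taken from this thread is the comma-separated `key=value` layout and the fallback to openai/text-embedding-3-small.

```typescript
// Hypothetical types for illustration only.
interface ModelRef {
  provider: string;
  model: string;
}

interface FilesConfig {
  embeddingModel: ModelRef;
  rerankerModel?: ModelRef;
  queryMode?: string;
}

// Default named in this thread for the case where nothing is configured.
const DEFAULT_EMBEDDING: ModelRef = { provider: 'openai', model: 'text-embedding-3-small' };

// Parse "provider/model", splitting on the first "/".
function parseModelRef(value: string | undefined): ModelRef | undefined {
  if (!value) return undefined;
  const slash = value.indexOf('/');
  if (slash <= 0 || slash === value.length - 1) return undefined;
  return { provider: value.slice(0, slash), model: value.slice(slash + 1) };
}

// Parse "key=value,key=value" and fall back to the default embedding model.
function parseFilesConfig(raw: string | undefined): FilesConfig {
  const entries = new Map<string, string>();
  for (const pair of (raw ?? '').split(',')) {
    const eq = pair.indexOf('=');
    if (eq <= 0) continue; // skip empty or malformed segments
    entries.set(pair.slice(0, eq).trim(), pair.slice(eq + 1).trim());
  }
  return {
    embeddingModel: parseModelRef(entries.get('embedding_model')) ?? DEFAULT_EMBEDDING,
    rerankerModel: parseModelRef(entries.get('reranker_model')),
    queryMode: entries.get('query_mode'),
  };
}
```

Unknown keys are ignored in this sketch; a stricter version might reject them instead.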

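I'd like to check whether some parts of the code could be done better. Currently, embeddingChunks, runRecordEvaluation, semanticSearch, and semanticSearchForChat each obtain the embedding model information with the following lines. Is there a cleaner approach?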

const model = getServerGlobalConfig().defaultEmbed?.embedding_model?.model ?? DEFAULT_EMBEDDING_MODEL.model;
const provider = getServerGlobalConfig().defaultEmbed?.embedding_model?.provider ?? DEFAULT_EMBEDDING_MODEL.provider;
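
One possible way to avoid repeating those two lines in four places is a small helper that resolves both values once. This is only a sketch: `resolveEmbeddingModel` and `ServerGlobalConfigLike` are hypothetical names, the config shape is inferred from the two lines above, and the default constant here is a local stand-in for the project's DEFAULT_EMBEDDING_MODEL.

```typescript
// Hypothetical types inferred from the snippet above; the real config type may differ.
interface ModelRef {
  provider: string;
  model: string;
}

interface ServerGlobalConfigLike {
  defaultEmbed?: { embedding_model?: Partial<ModelRef> };
}

// Local stand-in for the project's DEFAULT_EMBEDDING_MODEL constant.
const DEFAULT_EMBEDDING_MODEL: ModelRef = {
  provider: 'openai',
  model: 'text-embedding-3-small',
};

// Resolve model and provider in one place, falling back per field as the original lines do.
function resolveEmbeddingModel(config: ServerGlobalConfigLike): ModelRef {
  const configured = config.defaultEmbed?.embedding_model;
  return {
    model: configured?.model ?? DEFAULT_EMBEDDING_MODEL.model,
    provider: configured?.provider ?? DEFAULT_EMBEDDING_MODEL.provider,
  };
}
```

Each caller would then reduce to something like `const { model, provider } = resolveEmbeddingModel(getServerGlobalConfig());`. Note that the per-field fallback can pair a configured provider with the default model when only one field is set; falling back as a whole pair may be safer.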

The logic that builds the output contains this snippet:

const items: NewEmbeddingsItem[] =
  embeddings?.map((e) => ({
    chunkId: chunks[e.index].id,
    embeddings: e.embedding,
    fileId: input.fileId,
    model: model,
  })) || [];

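The index field in the returned embedding data is currently used to look up the matching chunk. Whether or not the format changes, some providers' responses will need a second wrapping pass (if unchanged, bedrock/ollama need wrapping; if changed, zhipu/openai need wrapping). My suggestion is to leave it as is for now.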
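
To make the trade-off concrete, the two wrapping directions could look roughly like this. It is a sketch under assumptions: `IndexedEmbedding`, `toIndexed`, and `toPlain` are hypothetical names, and the indexed shape simply follows the `index`, `object`, and `embedding` fields discussed in this thread.

```typescript
// Hypothetical item shape carrying the index/object fields discussed in this thread.
interface IndexedEmbedding {
  embedding: number[];
  index: number;
  object?: string;
}

// Keeping the current format: providers that return bare vectors get wrapped
// so that each item carries its position.
function toIndexed(vectors: number[][]): IndexedEmbedding[] {
  return vectors.map((embedding, index) => ({ embedding, index, object: 'embedding' }));
}

// Changing the format: providers that return indexed items get unwrapped into
// bare vectors, ordered by index so position i still matches chunk i.
function toPlain(items: IndexedEmbedding[]): number[][] {
  return items
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}
```

Either direction costs one extra pass over the results for two of the four providers, which is the point made above.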


arvinxx · 2024-10-14T03:20:58Z

@cookieY Please rebase and resubmit.

lobehubbot · 2024-10-14T03:21:11Z

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿

@cookieY Rebase and resubmit

cookieY · 2024-10-14T03:59:20Z

@arvinxx Give me a moment; I'll reset and resubmit on a new branch.

lobehubbot · 2024-10-14T03:59:31Z

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿

@arvinxx Wait for me to reset and resubmit a new branch.

cookieY · 2024-10-14T07:06:16Z

@arvinxx #4370

lobehubbot · 2024-10-14T07:06:28Z

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿

@arvinxx #4370

cookieY and others added 30 commits July 31, 2024 11:09

🔨 chore: support bedrock Claude 3.x function calling

3c3645a

Merge branch 'main' of https://github.com/lobehub/lobe-chat

f0cb696

Merge branch 'main' of https://github.com/lobehub/lobe-chat

1cdfbe3

Merge branch 'main' of https://github.com/lobehub/lobe-chat

29544f3

Merge branch 'lobehub:main' into main

a6f3432

Merge branch 'main' of https://github.com/lobehub/lobe-chat

58a0eef

Merge branch 'main' of https://github.com/lobehub/lobe-chat

4ec751d

Merge branch 'main' of https://github.com/lobehub/lobe-chat

3fbc9d0

Merge branch 'main' of https://github.com/lobehub/lobe-chat

b73b17e

Merge branch 'main' of https://github.com/lobehub/lobe-chat

b28a632

Merge branch 'main' of https://github.com/lobehub/lobe-chat

733225d

Merge branch 'main' of https://github.com/lobehub/lobe-chat

0c60b29

Merge branch 'main' of https://github.com/lobehub/lobe-chat

4e432e5

Merge branch 'main' of https://github.com/lobehub/lobe-chat

95b5597

Merge branch 'main' of https://github.com/lobehub/lobe-chat

3a7b4d7

Merge branch 'main' of https://github.com/lobehub/lobe-chat

22045ad

Merge branch 'main' of https://github.com/lobehub/lobe-chat

bb5d0b1

Merge branch 'main' of https://github.com/lobehub/lobe-chat

ffdfea4

Merge branch 'main' of https://github.com/lobehub/lobe-chat

f726d81

Merge branch 'main' of https://github.com/lobehub/lobe-chat

fa69df0

Merge branch 'main' of https://github.com/lobehub/lobe-chat

968e90c

Merge branch 'main' of https://github.com/lobehub/lobe-chat

e9987bc

Merge branch 'main' of https://github.com/lobehub/lobe-chat

30a90c6

Merge branch 'main' of https://github.com/lobehub/lobe-chat

d0b9457

Merge branch 'main' of https://github.com/lobehub/lobe-chat

92b4192

Merge branch 'main' of https://github.com/lobehub/lobe-chat

d7c143e

Merge branch 'main' of https://github.com/lobehub/lobe-chat

972ca7d

Merge branch 'lobehub:main' into main

5ceb612

📝 docs(bot): Auto sync agents & plugin to readme

7eda9b6

Merge branch 'main' of https://github.com/lobehub/lobe-chat

aee362f

cookieY and others added 5 commits September 30, 2024 18:29

Not finished changing

5874d0b

Merge branch 'main' of github.com:cookieY/lobe-chat

844b6a2

* 'main' of github.com:cookieY/lobe-chat: 🔖 chore(release): v1.20.8 [skip ci] 👷 build: remove unnecessary SSL judgments (lobehub#4219) 🔨 chore: upgrade `shiki` to 1.21.0 (lobehub#4218)

Merge branch 'main' of https://github.com/lobehub/lobe-chat

f8b2b9b

🔨 chore: embedding model use DEFAULT_FILES_CONFIG env

ab38b2a

Merge branch 'main' of https://github.com/lobehub/lobe-chat

83ead4d

actions-user and others added 15 commits October 1, 2024 06:14

Merge branch 'main' of https://github.com/lobehub/lobe-chat

04189c7

Merge branch 'main' of https://github.com/lobehub/lobe-chat

b6f9a67

Merge branch 'main' of https://github.com/lobehub/lobe-chat

0f2f2ba

Merge branch 'main' of https://github.com/lobehub/lobe-chat

90ff382

Merge branch 'main' of https://github.com/lobehub/lobe-chat

780b1a6

Merge branch 'main' of https://github.com/lobehub/lobe-chat

ffd3a3b

Merge branch 'main' of https://github.com/lobehub/lobe-chat

8250a5e

add rerank todo

eee97f0

Merge branch 'main' of https://github.com/lobehub/lobe-chat

c1de72c

Merge branch 'main' of https://github.com/lobehub/lobe-chat

ba4c819

Merge branch 'main' of https://github.com/lobehub/lobe-chat into dev

80c6011

# Conflicts: # src/libs/agent-runtime/zhipu/index.ts # src/server/routers/async/file.ts # src/server/routers/async/ragEval.ts # src/server/routers/lambda/chunk.ts

1. Fix the local proxy being unable to fetch dalle3 images

5ba5688

2. Add csv chunking

Merge branch 'lobehub:main' into main

c2b4a1f

Merge branch 'main' of https://github.com/lobehub/lobe-chat

5eab845

Merge branch 'lobehub:main' into main

3da23ea

cookieY closed this Oct 12, 2024

harrypd mentioned this pull request Oct 23, 2024

[Request] Could you provide an example of a custom bedrock embedding model? #4457

Closed
