Feature: Search capability #22

suryyyansh · 2024-08-27T22:55:50Z

This PR aims to add search capabilities to the RAG API Server. This functionality will be enabled through the search feature. It will also leverage the llamaedge-query-server to deliver nuanced search queries and provide the RAG API Server something to fall back on in case the RAG fails due to no RAG query matches.

juntao · 2024-08-27T22:55:53Z

Hello, I am a code review agent on flows.network. Here are my reviews of changed source code files in this PR.

Cargo.toml

Potential issues

Issue: Version locking for all dependencies is not recommended as it can lead to incompatibilities and missed out on security updates. Consider using a version range or the latest compatible version for most dependencies, except for those you have explicitly patched (e.g., "socket2", "reqwest", "hyper", "tokio").
Issue: The "llama-core" dependency is pinned to a specific version ("=0.17.0") and has the "logging" feature enabled, which might not be necessary or desirable in all deployment scenarios. Consider using a newer compatible version of "llama-core" if available, and make the usage of the "logging" feature optional or configurable based on the deployment environment.

Summary of changes

Addition of the "search" feature: The main change is the addition of a new optional feature named "search". This may potentially allow for improved search functionality, but without more context or information about its purpose and implementation, it's hard to provide a detailed explanation.
Updated Git branch reference for 'tokio': The git URL and branch used for the tokio dependency has been updated from an unspecified version to the "v1.3" branch of the "second-state/wasi_tokio" repository.
No other major changes were detected in the patch, such as removal or modification of existing features.

src/backend/ggml.rs

Potential issues

Based on the given Rust code, I have identified three major coding issues that need to be addressed:

Error Handling in models_handler function: The error handling for serializing the model list result is not wrapped within a match statement, which could lead to unhandled panics if the serialization fails.
Error Handling in embeddings_handler and rag_query_handler functions: These two functions have similar issues where error handling for creating HTTP response objects is not properly handled inside a match statement. This can cause unhandled panics if the response creation fails.
Error Handling in files_handler function: The error handling for reading buffer from request body using to_bytes function and converting it into Multipart object is not wrapped within a match statement, which could lead to unhandled panics if any of these operations fails.

Summary of changes

Key Changes:

Added Conditional Compilation (#cfg) for "search" feature to selectively include/exclude sections of code based on the feature flag during compilation. This is likely a new functionality that can perform web searches when enabled.
Introduced web_search_allowed variable under the "search" feature which will be set to true when no points are retrieved, thus enabling web search.
Added logic to send a request to an external LlamaEdge query server for web searches when web_search_allowed is true and the "search" feature flag is enabled. This includes parsing query, building requests, making HTTP requests, handling responses, error checking, and injecting search summary into conversation context if the decision from query server is successful.

src/main.rs

Potential issues

Based on a review of the provided source code for an LlamaEdge-RAG API Server, I've identified three major coding issues:

Error Handling and Logging Issue (Line 286): The error message being logged when parsing the socket address may not be clear to developers or users. It would be beneficial to provide a more descriptive error message that suggests potential solutions, such as "Invalid socket address format. Please use the IP:PORT syntax."
Validation Missing for URL (Line 154): There is no validation check in place for the Qdrant REST API URL provided by the user through the command line arguments. Adding a URL validity check could prevent unnecessary errors and improve the overall reliability of the server.
Server Configuration Issue (Line 208-214): The application does not validate whether the prompt template for chat models supports system messages while the RAG policy is set to "system_message." This can lead to runtime errors, especially if developers or users use chat models that do not support system messages in their templates. Adding a validation check here would ensure consistency and prevent unexpected behavior at runtime.

Summary of changes

Key Changes:

Added a new module utils and imported it. This suggests that the project might include additional utility functions or types.
Introduced a new feature flag "search". Under this flag, added imports for SearchArguments from the utils module and a static variable SEARCH_ARGUMENTS. These changes likely indicate the addition of search functionality to the project.
Modified the CLI arguments by adding two new options: api-key, used to supply an API key for the endpoint, and query-server-url, which is a required URL for the LlamaEdge query server. Also added a new option search-backend with a default value "tavily" that requires the query-server-url. These changes indicate the extension of functionality to include internet searches.

src/utils.rs

Potential issues

In the is_valid_url function, there's no error handling for when parsing a URL fails. It would be beneficial to have some way of indicating that an invalid URL was provided, such as by returning a Result instead of a boolean.
The conversion from LogLevel::Critical to log::LevelFilter::Error in the implementation of From<LogLevel> for log::LevelFilter may not be intentional. It's worth reviewing this mapping to ensure it meets the project's requirements correctly.
The SearchArguments struct uses a string (search_backend) to represent an enumeration of possible search backends. Using an enumeration would make the code safer and more maintainable, as it would prevent invalid values from being used at compile-time.

Summary of changes

Addition of a new struct, "SearchArguments", which is used to handle additional parameters for internet search in the context of a feature called "search".
Introduction of three new fields within "SearchArguments": 'api_key', 'query_server_url', and 'search_backend'. This enables support for different search backends and query servers.
The 'gen_chat_id' function is unaffected by this change as it remains unchanged, aside from potential usage of the new "SearchArguments" struct if it's enabled with the feature flag.

juntao · 2024-08-28T17:46:01Z

@suryyyansh Please fix the CI failure. Thank you!

Cargo.toml

src/main.rs

src/utils.rs

src/backend/ggml.rs

alabulei1 · 2024-09-04T11:03:36Z

Hi @suryyyansh

Could you please write documentation on how to use the search API server? Thanks. https://github.com/LlamaEdge/docs

Is there a GitHub repo called search-api-server? If so, could you please send us your repo link and add README.md about the project? Thanks.

apepkuss · 2024-09-13T01:21:40Z

@suryyyansh Could you please fix the conflicts? Thanks a lot.

… no results Signed-off-by: suryyyansh <[email protected]>

Signed-off-by: suryyyansh <[email protected]>

apepkuss · 2024-10-18T02:17:51Z

@suryyyansh Could you please rebase this PR onto the latest code. Thanks a lot!

suryyyansh marked this pull request as ready for review August 29, 2024 21:53

apepkuss reviewed Sep 2, 2024

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

apepkuss requested changes Sep 2, 2024

View reviewed changes

Cargo.toml Show resolved Hide resolved

src/main.rs Show resolved Hide resolved

src/main.rs Outdated Show resolved Hide resolved

src/utils.rs Show resolved Hide resolved

apepkuss requested changes Sep 4, 2024

View reviewed changes

src/backend/ggml.rs Outdated Show resolved Hide resolved

suryyyansh mentioned this pull request Sep 13, 2024

Redundant RUSTFLAGS in release.yml? #23

Open

suryyyansh and others added 11 commits September 14, 2024 19:32

basic search functionality added. Triggers when the RAG query returns…

8957840

… no results Signed-off-by: suryyyansh <[email protected]>

Added search functionality with llamaedge query server

fd54590

Signed-off-by: suryyyansh <[email protected]>

fix CI warnings

9f2c172

Signed-off-by: suryyyansh <[email protected]>

updated search_backend definition

1077a91

Signed-off-by: suryyyansh <[email protected]>

updated summarization endpoint

e48939f

Signed-off-by: suryyyansh <[email protected]>

updated dependencies

9af759b

Signed-off-by: suryyyansh <[email protected]>

Update README.md with search usage instructions

8b3d4bf

Update README.md

94a4c09

Update README.md

727a306

annotated SEARCH_ARGUMENTS and fixed typo

6467af5

Signed-off-by: suryyyansh <[email protected]>

Made search fallback to RAG incase the query server is inaccessible

de6684f

Signed-off-by: suryyyansh <[email protected]>

suryyyansh force-pushed the search-feature branch from 9b21ac6 to de6684f Compare September 14, 2024 15:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feature: Search capability #22

Feature: Search capability #22

suryyyansh commented Aug 27, 2024

juntao commented Aug 27, 2024 •

edited

Loading

juntao commented Aug 28, 2024

alabulei1 commented Sep 4, 2024

apepkuss commented Sep 13, 2024

apepkuss commented Oct 18, 2024

Feature: Search capability #22

Are you sure you want to change the base?

Feature: Search capability #22

Conversation

suryyyansh commented Aug 27, 2024

juntao commented Aug 27, 2024 • edited Loading

Cargo.toml

Potential issues

Summary of changes

src/backend/ggml.rs

Potential issues

Summary of changes

src/main.rs

Potential issues

Summary of changes

src/utils.rs

Potential issues

Summary of changes

juntao commented Aug 28, 2024

alabulei1 commented Sep 4, 2024

apepkuss commented Sep 13, 2024

apepkuss commented Oct 18, 2024

juntao commented Aug 27, 2024 •

edited

Loading