Problem Statement
The current implementation of the ML Inference Search Response Processor in OpenSearch mishandles custom prompts that include placeholders for lists or arrays. When a list or array is passed as a parameter, its string representation is not properly escaped, so an incorrect or invalid prompt is sent to the machine learning model.
For example, consider the following scenario:
```json
POST /_plugins/_ml/models/2SwoD5EB6KAJXDLxezto/_predict
{
  "parameters": {
    "prompt": "\n\nHuman: You are a professional data analyst. You will always answer questions based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${parameters.context}. \n\n Human: please summarize the documents \n\n Assistant:",
    "context": ["Dr. Eric Goldberg is a fantastic doctor who has correctly diagnosed every issue that my wife and I have had. Unlike many of my past doctors, Dr. Goldberg is very accessible and we have been able to schedule appointments with him and his staff very quickly. We are happy to have him in the neighborhood and look forward to being his patients for many years to come."]
  }
}
```
In this example, the context parameter is a list containing a single string. When the prompt is built from the ${parameters.context} placeholder, the list is substituted without escaping, so any quotes, newlines, or list brackets in the value corrupt the request body and an invalid prompt is sent to the model.
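To make the failure concrete, here is a small standalone sketch of what naive list substitution does. The class name is hypothetical and this is not ML Commons code; it only illustrates why the default string form of a list breaks the surrounding JSON:

```java
import java.util.List;

public class NaiveSubstitution {
    public static void main(String[] args) {
        List<String> context = List.of("He said \"great\" service.");
        // Java's default List.toString() drops JSON quoting and escaping:
        String naive = context.toString();
        // naive is: [He said "great" service.]
        // Splicing that into a JSON string literal leaves the inner quotes
        // unescaped, so the request body below is not valid JSON.
        String body = "{\"prompt\": \"Context: " + naive + "\"}";
        System.out.println(body);
    }
}
```

The inner double quotes terminate the JSON string early, which is exactly the malformed-prompt symptom described above.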
Solution Proposal
To address this issue, we propose adding a toString() method to the HTTP connector in the ML Commons project. This method will be responsible for properly escaping and converting lists or arrays to their string representation when used as placeholders in custom prompts.
The proposed solution involves the following changes:
1. Modify the `HttpConnector` class in the ML Commons project to introduce a new `toString()` method.
2. The `toString()` method should handle the conversion of lists or arrays to their string representation, ensuring that the elements are properly escaped and formatted as a valid JSON string.
3. Update the `MLInferenceSearchResponseProcessor` class in the OpenSearch project to use the `toString()` method when substituting placeholders for lists or arrays in custom prompts.
4. Update the documentation and examples to reflect the usage of the `toString()` method for handling custom prompts with lists or arrays.
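A minimal sketch of the escaping behavior such a `toString()` could implement is shown below. The class and method names are hypothetical, not the actual ML Commons API; the point is only that each element is rendered as a properly escaped JSON string inside a JSON array literal:

```java
import java.util.List;

public class PromptEscaper {
    // Hypothetical sketch: render a list as a JSON array literal whose
    // elements are escaped JSON strings (quotes, backslashes, control chars).
    static String toJsonArrayString(List<String> values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append('"');
            for (char c : values.get(i).toCharArray()) {
                switch (c) {
                    case '"':  sb.append("\\\""); break;
                    case '\\': sb.append("\\\\"); break;
                    case '\n': sb.append("\\n");  break;
                    case '\r': sb.append("\\r");  break;
                    case '\t': sb.append("\\t");  break;
                    default:   sb.append(c);
                }
            }
            sb.append('"');
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        System.out.println(toJsonArrayString(List.of("He said \"great\" service.")));
        // ["He said \"great\" service."]
    }
}
```

A production implementation would likely delegate to an existing JSON library rather than hand-roll the escaping, but the contract is the same: the placeholder value must remain a valid fragment of the surrounding JSON request body.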
With this change, users can write custom prompts with placeholders for lists or arrays without hitting escaping or formatting errors. The toString() method in the HTTP connector ensures that lists and arrays are correctly converted to their string representation, so prompts arrive at the machine learning model well-formed.
Example usage:
```json
PUT /_search/pipeline/my_pipeline_request_review_llm
{
  "response_processors": [
    {
      "ml_inference": {
        "tag": "ml_inference",
        "description": "This processor runs an LLM over each review",
        "model_id": "cf46K5EBoVpekzRp8x_3",
        "function_name": "REMOTE",
        "input_map": [
          {
            "context": "review"
          }
        ],
        "output_map": [
          {
            "llm_response": "response"
          }
        ],
        "model_config": {
          "prompt": "\n\nHuman: You are a professional data analyst. You will always answer questions based on the given context first. If the answer is not directly shown in the context, you will analyze the data and find the answer. If you don't know the answer, just say I don't know. Context: ${parameters.context.toString()}. \n\n Human: please summarize the documents \n\n Assistant:"
        },
        "ignore_missing": false,
        "ignore_failure": false
      }
    }
  ]
}
```
In this example, the ${parameters.context.toString()} placeholder will be replaced with the properly escaped and formatted string representation of the context parameter, ensuring that the prompt is correctly constructed and sent to the machine learning model.
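Putting the two halves together, the processor-side substitution could look roughly like the sketch below. This is a hypothetical illustration, not the actual `MLInferenceSearchResponseProcessor` code; `escape` and `listToJson` stand in for whatever escaping the connector's `toString()` provides:

```java
import java.util.List;

public class PlaceholderDemo {
    // Minimal JSON-string escaping (illustrative only).
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    static String listToJson(List<String> values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) sb.append(", ");
            sb.append('"').append(escape(values.get(i))).append('"');
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        String template = "Context: ${parameters.context.toString()}. Please summarize.";
        List<String> context = List.of("Dr. Goldberg is a \"fantastic\" doctor.");
        // Substitute the placeholder with the escaped representation:
        String prompt = template.replace("${parameters.context.toString()}", listToJson(context));
        System.out.println(prompt);
        // Context: ["Dr. Goldberg is a \"fantastic\" doctor."]. Please summarize.
    }
}
```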
Do you have any additional context?
[META Issue] #2839
[RFC for ML Inference Processors] #2173
mingshl changed the title from "[RFC] Introducing toString() method in HTTPConnector for handling custom prompts with lists/arrays in ML Inference Search Response Processor" to "[RFC] Introducing toString() method in HTTPConnector for handling custom prompts with lists/arrays" on Sep 5, 2024