Function calling results in bad state for all LLM models #2293
I tried your first example on
I had to add this to the yaml file...
That helped a little, but then I ran into another issue...
I didn't really know what this meant either, so I checked the logs; it seems it's throwing an error now...
It would seem that function calls are still not working correctly. edit: AFAIK this model supports functions, but I can also try something like
Cheers
Thanks for testing @bunder2015. @mudler merged some changes that enable better grammar management (#2328), and I've been testing them. However, I'm running into some issues, so I'm documenting them here.

Model: huggingface://TheBloke/NeuralHermes-2.5-Mistral-7B-GGUF/neuralhermes-2.5-mistral-7b.Q8_0.gguf

```yaml
function:
  # disable injecting the "answer" tool
  disable_no_action: true
  # This allows the grammar to also return messages
  grammar_message: true
  # Prefix to add to the grammar
  grammar_prefix: '<tool_call>\n'
  return_name_in_function_response: true
  # Without grammar, uncomment the lines below
  # Warning: this relies only on the capability of the
  # LLM model to generate the correct function call.
  #no_grammar: true
  # json_regex_match: "(?s)<tool_call>(.*?)</tool_call>"
  replace_results:
    "<tool_call>": ""
    "\'": "\""
    "Processing user message.": ""
    """: "\""
    ": True": ": \"True\""
    ": False": ": \"False\""
```

First issue is around unmarshalling the returned JSON object; it seems to be a bit fragile:
The second issue is that in your commit you suggest using the regex replacement of
Claude Sonnet suggests this change to the grammar to support this:
I believe the above grammar will also solve the last error I'm seeing:
I'm working around this issue right now with these replace strings:

```yaml
": True": ": \"True\""
": False": ": \"False\""
```
Here's a suggestion from Claude Sonnet around catching the unmarshalling error and trying to unmarshal the result into an array instead. This doesn't solve the problem of being tolerant towards mildly malformed JSON; I think that would require using another library like

=====================

To make the code more robust and handle the different cases where the LLM result can be a single object or an array of objects, we can modify the function as follows:

```go
func ParseFunctionCall(llmresult string, functionConfig FunctionsConfig) []FuncCallResults {
	log.Debug().Msgf("LLM result: %s", llmresult)

	for k, v := range functionConfig.ReplaceResults {
		log.Debug().Msgf("Replacing %s with %s", k, v)
		llmresult = strings.ReplaceAll(llmresult, k, v)
	}

	log.Debug().Msgf("LLM result(processed): %s", llmresult)

	// multipleResults := functionConfig.ParallelCalls // unused in this revision; commented out so the function compiles
	useGrammars := !functionConfig.NoGrammar

	functionNameKey := "function"
	if functionConfig.FunctionName {
		functionNameKey = "name"
	}

	results := []FuncCallResults{}

	returnResult := func(s string) (name, arguments string, e error) {
		// As we have to change the result before processing, we can't stream the answer token-by-token (yet?)
		var ss map[string]interface{}
		// This prevents newlines from breaking JSON parsing for clients
		s = utils.EscapeNewLines(s)
		err := json.Unmarshal([]byte(s), &ss)
		if err != nil {
			log.Warn().Err(err).Str("escapedLLMResult", s).Msg("unable to unmarshal llm result")
		}
		log.Debug().Msgf("Function return: %s %+v", s, ss)

		// The grammar defines the function name as "function", while OpenAI returns "name"
		func_name, ok := ss[functionNameKey]
		if !ok {
			return "", "", fmt.Errorf("unable to find function name in result")
		}
		// Similarly, while here arguments is a map[string]interface{}, OpenAI actually wants a stringified object
		args, ok := ss["arguments"] // arguments needs to be a string, but we return an object from the grammar result (TODO: fix)
		if !ok {
			return "", "", fmt.Errorf("unable to find arguments in result")
		}
		d, _ := json.Marshal(args)
		funcName, ok := func_name.(string)
		if !ok {
			return "", "", fmt.Errorf("unable to cast function name to string")
		}

		return funcName, string(d), nil
	}

	// if no grammar is used, we have to extract function and arguments from the result
	if !useGrammars {
		// the response is a string that we have to parse
		result := make(map[string]string)

		if functionConfig.ResponseRegex != "" {
			// We use named regexes here to extract the function name and arguments;
			// obviously, this expects the LLM to be stable and return correctly formatted JSON
			// TODO: optimize this and pre-compile it
			var respRegex = regexp.MustCompile(functionConfig.ResponseRegex)
			match := respRegex.FindStringSubmatch(llmresult)
			for i, name := range respRegex.SubexpNames() {
				if i != 0 && name != "" && len(match) > i {
					result[name] = match[i]
				}
			}

			// TODO: open point about multiple results and/or mixed with chat messages
			// This is not handled for now; we only expect one function call per response
			functionName := result[functionNameKey]
			if functionName == "" {
				return results
			}
		} else if functionConfig.JSONRegexMatch != "" {
			// re := regexp.MustCompile(`(?s)<tool_call>(.*?)</tool_call>`)
			// m := re.FindStringSubmatch(`<tool_call>{ foo barr }</tool_call>`)

			// We use a regex to extract the JSON object from the response
			var respRegex = regexp.MustCompile(functionConfig.JSONRegexMatch)
			match := respRegex.FindStringSubmatch(llmresult)
			if len(match) < 2 {
				return results
			}

			funcName, args, err := returnResult(match[1])
			if err != nil {
				return results
			}

			return append(results, FuncCallResults{Name: funcName, Arguments: args})
		} else {
			funcName, args, err := returnResult(llmresult)
			if err != nil {
				return results
			}

			return append(results, FuncCallResults{Name: funcName, Arguments: args})
		}

		return append(results, FuncCallResults{Name: result[functionNameKey], Arguments: result["arguments"]})
	}

	// with grammars:
	// handle the case where the LLM result is a single object or an array of objects
	var ss []map[string]interface{}
	s := utils.EscapeNewLines(llmresult)
	err := json.Unmarshal([]byte(s), &ss)
	if err != nil {
		// If the LLM result is a single object, try unmarshalling it into a single map
		var singleObj map[string]interface{}
		err = json.Unmarshal([]byte(s), &singleObj)
		if err != nil {
			log.Warn().Err(err).Str("escapedLLMResult", s).Msg("unable to unmarshal llm result")
			return results
		}
		ss = []map[string]interface{}{singleObj}
	}

	for _, s := range ss {
		func_name, ok := s[functionNameKey]
		if !ok {
			continue
		}
		args, ok := s["arguments"]
		if !ok {
			continue
		}
		d, _ := json.Marshal(args)
		funcName, ok := func_name.(string)
		if !ok {
			continue
		}
		results = append(results, FuncCallResults{Name: funcName, Arguments: string(d)})
	}

	return results
}
```

The main changes are:
With these changes, the code should be more robust and able to handle cases where the LLM result is a single object or an array of objects.
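For reference, here's a rough, self-contained sketch (my own illustration, not LocalAI's actual code; the function names and payloads are invented) combining the two fixes discussed in this thread: the replace_results rewrite of Python-style booleans, and the array-first/single-object-second unmarshalling fallback:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// replaceResults mimics the replace_results config above: rewrite Python-style
// booleans into quoted strings so encoding/json will accept them.
func replaceResults(s string) string {
	for k, v := range map[string]string{
		": True":  `: "True"`,
		": False": `: "False"`,
	} {
		s = strings.ReplaceAll(s, k, v)
	}
	return s
}

// parseObjects mirrors the suggested fallback: try an array of objects first,
// then fall back to a single object wrapped in a one-element slice.
func parseObjects(s string) ([]map[string]interface{}, error) {
	var many []map[string]interface{}
	if err := json.Unmarshal([]byte(s), &many); err == nil {
		return many, nil
	}
	var one map[string]interface{}
	if err := json.Unmarshal([]byte(s), &one); err != nil {
		return nil, err // neither an array nor a single object
	}
	return []map[string]interface{}{one}, nil
}

func main() {
	inputs := []string{
		// single object, as produced by a grammar-constrained model
		`{"function": "get_weather", "arguments": {"city": "Seattle"}}`,
		// array of objects, as produced with parallel calls
		`[{"function": "a", "arguments": {}}, {"function": "b", "arguments": {}}]`,
		// Python-style boolean: invalid JSON until replaceResults rewrites it
		`{"function": "toggle_light", "arguments": {"on": True}}`,
	}
	for _, in := range inputs {
		objs, err := parseObjects(replaceResults(in))
		fmt.Printf("objects: %d, err: %v\n", len(objs), err)
	}
}
```

Running it yields one object for the first and third inputs and two for the second; without the replaceResults pass, the third input would fail to unmarshal at all.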
Taking another look at this; I haven't had a look at the new changes yet. I was able to cobble together a small coder chatbot in C++ and started adding functions to it. When I tested sending the functions in the JSON request, I got a blank response to my first prompt, and the second prompt ran the function for no reason, stuffing the description of the function call into the arguments. 🤣 (edit: or maybe that's why the first one returned blank?)
Give me a few more days and I might pull git head and see if it's any better. Cheers
Fix was to add
LocalAI version:
v2.13.0-cublas-cuda12-ffmpeg
Environment, CPU architecture, OS, and Version:
kubernetes helm release: https://github.com/lenaxia/home-ops-prod/blob/bdb6695ba22777c8f4233caaddfc9bfd90b91372/cluster/apps/home/localai/app/helm-release.yaml
Describe the bug
Doing regular chatting poses no problem; however, any time a function call is defined, the chat very quickly goes into a bad state where the LLM just repeats back to me what I type in or, very rarely, stops responding entirely. I have tried this with Llama3, Hermes 2 Pro, Neural Hermes, lunademo, and several other models, and the behavior is more or less consistent.
To Reproduce
env vars:
Deploy helm release: https://github.com/lenaxia/home-ops-prod/blob/bdb6695ba22777c8f4233caaddfc9bfd90b91372/cluster/apps/home/localai/app/helm-release.yaml
Load either Neural Hermes: https://github.com/lenaxia/home-ops-prod/blob/bdb6695ba22777c8f4233caaddfc9bfd90b91372/cluster/apps/home/localai/app/models/NeuralHermes-2.5-Mistral-7b.yaml
Or Llama3 Instruct: https://github.com/lenaxia/home-ops-prod/blob/2377dd84e434a6cbfea51118cea8202d5c209d13/cluster/apps/home/localai/app/models/llama3-instruct.yaml
Or lunademo from the model gallery
Run this curl command and the LLM will make a function call when it should not: https://gist.github.com/lenaxia/388082e0e98beb91f2447073d0d6cd63
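For context, the gist follows the standard OpenAI chat-completions shape with a tool definition attached. The request below is not the gist's actual payload, just an invented minimal example of the kind of request involved (model name, message, and function schema are placeholders):

```sh
curl $LOCALAI/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "NeuralHermes-2.5-Mistral-7b",
  "messages": [{"role": "user", "content": "Hi, how are you today?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for a city",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }
  }],
  "tool_choice": "auto"
}' | jq
```

A greeting like this defines a tool but gives the model no reason to call it, which is exactly the situation where the bug produces a spurious function call.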
Expected behavior
I expect the model to properly answer the conversation and be able to maintain conversation on an ongoing basis.
Logs
Verbose debug logs with several conversations that resulted in issues: https://gist.github.com/lenaxia/4b02a9cdd72470370b37a33b998b4b42
For easier reproduction I've extracted several raw requests that caused issues (including the one above in the how to reproduce section).
Example requests that caused issues. You can run these by putting them into a JSON file and running:

```sh
curl $LOCALAI/v1/chat/completions -H "Content-Type: application/json" -d @<filename> | jq
```
Neural Hermes through MemGPT #1: makes a function call when it should not
Lunademo through Home Assistant Assist: results in the LLM generating an entire conversation stream
Neural Hermes through MemGPT #2: results in the LLM just repeating back what the user sent
Additional context