
[Advanced Paste > Paste with AI] Custom Model / Endpoint Selection #32960

Open
nathancartlidge opened this issue May 22, 2024 · 34 comments
Labels
Idea-Enhancement (New feature or request on an existing product) · Product-Advanced Paste (Refers to the Advanced Paste module) · Tracker (Issue that is used to collect multiple sub-issues about a feature)

Comments

@nathancartlidge
Contributor

Description of the new feature / enhancement

It should be possible to configure the model used (currently fixed as gpt-3.5-turbo) and the endpoint (currently fixed as OpenAI's) to arbitrary values.

Scenario when this would be used?

Sending requests to an alternative AI endpoint (e.g. a local model, internal company-hosted models, or alternative AI providers), or ensuring higher-quality conversions (e.g. by pointing requests at gpt-4o).

Supporting information

Microsoft's documentation appears to suggest that the underlying library used for AI completions supports other endpoints; it just needs to be provided with one.

The currently used model is a hardcoded string in this repository

@nathancartlidge added the Needs-Triage (For issues raised to be triaged and prioritized by internal Microsoft teams) label on May 22, 2024
@htcfreek added the Idea-Enhancement (New feature or request on an existing product) and Product-Advanced Paste (Refers to the Advanced Paste module) labels on May 22, 2024
@minzdrav

minzdrav commented May 23, 2024

It would be nice to have local models too.
For example: https://ollama.com/
It supports Llama 3, Phi-3, and a lot of other models: https://ollama.com/library
C# client: https://github.com/awaescher/OllamaSharp

@nathancartlidge
Contributor Author

@minzdrav This would be enabled by my proposed change: Ollama provides partial support for the OpenAI API schema, so you'd be able to point the plugin at your local model.
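
For illustration, this is roughly what that looks like from the client side: a minimal Python sketch using the openai package against Ollama's OpenAI-compatible layer (the model name is whatever you've pulled locally; Ollama ignores the API key, but the client library requires one).

from openai import OpenAI

# Ollama exposes a partial OpenAI-compatible API at /v1 on its default port.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3",  # any model pulled locally, e.g. via `ollama pull llama3`
    messages=[{"role": "user", "content": "Convert this to plain text: ..."}],
)
print(response.choices[0].message.content)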

@wcwong

wcwong commented May 23, 2024

In particular, supporting an Azure OpenAI endpoint would be a great first implementation. It would be even better if the Azure implementation supported Managed Identities, so we don't end up with the unmanageable mess of API key distribution and rotation.

@htcfreek htcfreek added the Tracker Issue that is used to collect multiple sub-issues about a feature label May 24, 2024
@joadoumie joadoumie removed the Needs-Triage For issues raised to be triaged and prioritized by internal Microsoft teams label Jun 7, 2024
@AmirH-Amini

Supporting Groq would be nice too.

@htcfreek
Collaborator

IMPORTANT
Regarding the planned custom AI model option: we should make sure that companies can (still) force an opt-out using Group Policies. And I think it would be great if companies could enforce a list of supported endpoints by Group Policy.

@wellmorq

wellmorq commented Jul 3, 2024

bump...

@alexonpeace

bump

@tjtanaa

tjtanaa commented Jul 19, 2024

Has anyone started working on this item?

@nathancartlidge
Contributor Author

nathancartlidge commented Jul 19, 2024

Has anyone started working on this item?

To my knowledge, no

The basics should be pretty easy to implement, though! All you'd need to do to allow a different API-compatible host and model is add two text fields to the settings page (model, URL) and wire them up in exactly the same way that the ChatGPT token field is currently wired into the app (as far as I know, they are just additional inputs to the same function in the associated library).

Obviously, making it "Microsoft-quality" will require more work on documentation and integration; see the points @htcfreek has raised in this thread for examples of these.

I'd be happy to take a look, but I won't be able to for at least a week so you may be better placed than me.
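
To make the shape of that change concrete, here is a minimal sketch (in Python rather than the project's C#, and with invented setting names) of the two extra settings flowing into the same request that the token already feeds:

from dataclasses import dataclass

@dataclass
class AdvancedPasteSettings:
    # "api_token" mirrors the existing setting; "endpoint_url" and
    # "model_name" are the two proposed additions (names invented here).
    api_token: str
    endpoint_url: str = "https://api.openai.com/v1"  # current fixed endpoint
    model_name: str = "gpt-3.5-turbo"                # current hardcoded model

def build_completion_request(settings: AdvancedPasteSettings, prompt: str) -> dict:
    # The endpoint and model become two more inputs to the request that the
    # token already parameterizes; nothing else about the flow changes.
    return {
        "url": f"{settings.endpoint_url}/chat/completions",
        "headers": {"Authorization": f"Bearer {settings.api_token}"},
        "json": {
            "model": settings.model_name,
            "messages": [{"role": "user", "content": prompt}],
        },
    }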

@htcfreek
Collaborator

@nathancartlidge , @tjtanaa
Directly started to implement this feature? No. But @CrazeXD and @joadoumie are working on #33109, and I imagine their plans also include this issue, or at least depend on it.

@nathancartlidge

Obviously, making it "Microsoft-quality" will require more work on documentation and integration; see the points @htcfreek has raised in this thread for examples of these.

Are you referring to my comment regarding the Group Policies above?

@nathancartlidge
Contributor Author

Yeah, that's what I was referring to! It's a great addition, but also the kind of thing I'd completely overlook when building this sort of feature :)

I hadn't seen that thread before, thanks for bringing it up. From a cursory reading it does look like their work could currently be independent from this, as it seems to cover exclusively non-AI features; however, I agree that it could make sense to combine them for the sake of reduced development overhead.

@tjtanaa

tjtanaa commented Jul 19, 2024

Has anyone started working on this item?

To my knowledge, no

The basics should be pretty easy to implement, though! All you'd need to do to allow a different API-compatible host and model is add two text fields to the settings page (model, URL) and wire them up in exactly the same way that the ChatGPT token field is currently wired into the app (as far as I know, they are just additional inputs to the same function in the associated library).

Obviously, making it "Microsoft-quality" will require more work on documentation and integration; see the points @htcfreek has raised in this thread for examples of these.

I'd be happy to take a look, but I won't be able to for at least a week so you may be better placed than me.

Thank you very much for the suggestions, @nathancartlidge.

I have a prototype version, which leads me to think there are some changes worth making; it would be great if I could get some input. I am planning to target the local-LLM use case on PCs without a dedicated GPU (in most cases there are only enough resources to host one model at a time).

  1. I found that the Azure OpenAIClient is not that compatible with some OpenAI-compatible APIs. I am thinking of implementing a simple class that invokes /v1/completions or /v1/chat/completions directly.
  2. Moreover, many open-source models are mainly chat/instruct models, and the chat completion endpoint handles the prompt template for the model. Thus, I am thinking of adding an additional function (private Response<ChatCompletions> GetAIChatCompletion(string systemInstructions, string userMessage)) to the AICompletionHelper class that calls the chat completion endpoint instead for custom endpoints.
  3. Within private Response<Completions> GetAICompletion(string systemInstructions, string userMessage), the model is discovered automatically through the /v1/models endpoint.
  4. On the settings page, users are able to see which endpoint the module is pointing at (defaulting to the OpenAI service endpoint). A rough sketch of points 1 to 3 follows after this list.
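
Here is that sketch (Python with requests for brevity; BASE_URL stands in for whatever OpenAI-compatible server the user configures, and this illustrates the endpoint shapes rather than the actual PowerToys code):

import requests

BASE_URL = "http://localhost:8000"  # placeholder OpenAI-compatible server

def discover_model() -> str:
    # Point 3: /v1/models lists what the server hosts; on a single-model
    # local server, the first entry is the one to use.
    models = requests.get(f"{BASE_URL}/v1/models").json()["data"]
    return models[0]["id"]

def legacy_completion(model: str, prompt: str) -> str:
    # Point 1: /v1/completions takes a raw prompt string...
    r = requests.post(f"{BASE_URL}/v1/completions",
                      json={"model": model, "prompt": prompt})
    return r.json()["choices"][0]["text"]

def chat_completion(model: str, system: str, user: str) -> str:
    # Point 2: ...while /v1/chat/completions takes role-tagged messages,
    # letting the server apply the model's chat/instruct prompt template.
    r = requests.post(f"{BASE_URL}/v1/chat/completions",
                      json={"model": model,
                            "messages": [{"role": "system", "content": system},
                                         {"role": "user", "content": user}]})
    return r.json()["choices"][0]["message"]["content"]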

Other feature improvements would be adding some common use cases as quick-access items on the menu, such as

  • Explain
  • Summarise
  • Keypoint

I also saw that there is a branch dev/crloewen/advancedpaste-v2improvements that has been adding more features, and that this feature has been previewed on the official channel (e.g. I saw a YouTube video about it). But the branch seems to have been stale for two months.
If I am going to start working on this, should I start from that branch?

I am new to Group Policies. How is this feature implemented?

@htcfreek
Collaborator

htcfreek commented Jul 19, 2024

@nathancartlidge , @tjtanaa
I think we should ask the core team (@crutkas , @ethanfangg, @jaimecbernardo) and of course @craigloewen-msft whether it makes sense for you to spend time on it or whether they are already working on it.

@tjtanaa
We can assist later with implementing the Group Policies. In the end you have to define them in XML files, read the registry value using /common/utils/gpo.h, and act based on the value in the module code. As there already exists a policy to disable Paste with AI, you can look at its implementation.
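
For orientation: in PowerToys the policy checks are done in C++ via /common/utils/gpo.h, but conceptually a check is just a registry read under the Policies hive. A minimal Python sketch, assuming the standard PowerToys policy key and a hypothetical value name:

import winreg  # Windows-only standard library module

def read_policy_value(name: str):
    """Return a machine-wide PowerToys policy value, or None if not configured."""
    try:
        # HKLM\SOFTWARE\Policies\PowerToys is where PowerToys group policies
        # land; the value name passed in below is hypothetical, for illustration.
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE,
                            r"SOFTWARE\Policies\PowerToys") as key:
            value, _type = winreg.QueryValueEx(key, name)
            return value
    except FileNotFoundError:
        return None  # key or value missing: the policy is not configured

# The module would then act on the value, e.g. enforcing an endpoint allow-list:
allowed_endpoints = read_policy_value("AllowedAdvancedPasteEndpoints")  # hypothetical
if allowed_endpoints is not None:
    print("Endpoint allow-list enforced by policy:", allowed_endpoints)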

@tjtanaa

tjtanaa commented Jul 23, 2024

Tuning in here! I started taking a swing at it and have it pretty much working well with Ollama. I used Phi-3 mini and the results are great on my Nvidia 4090.

Happy to share my results if interested.

Is the development in your fork?

@CrazeXD

CrazeXD commented Jul 23, 2024

Tuning in here! I started taking a swing at it and have it pretty much working well with Ollama. I used Phi-3 mini and the results are great on my Nvidia 4090.

Happy to share my results if interested.

Once this is done, we can begin merging the idea of custom presets discussed in #33109 with the different AI models, as well as the offline features that are directly baked in.

@vkulk094

vkulk094 commented Aug 2, 2024

I tried adding my Google Gemini API key to the AI paste feature but it does not work. I might just try the OpenAI key for 5 bucks and see how this feature works.

Really enjoying PowerToys thus far, thank you!

@CrazeXD

CrazeXD commented Aug 3, 2024

I tried adding my Google Gemini API key to the AI paste feature but it does not work. I might just try the OpenAI key for 5 bucks and see how this feature works.

Really enjoying PowerToys thus far, thank you!

This feature has not been added yet.

@CrazeXD

CrazeXD commented Aug 10, 2024

@elebumm Could you please share your code? I was hoping to start working on some of the functionality mentioned in #33109.

@Chen1Plus

As the clipboard may contain sensitive data, such as account names and passwords, a local model should be the default option. Hoping to see this feature.

@geekloper

Is there any update on this feature? I’m really looking forward to it! 😊

@Aniket-Bhat

Wouldn't it be easier to add support for OpenRouter? That should cover most of the popular AI models, and make things easier on the integration too, yes?

@CrazeXD

CrazeXD commented Sep 11, 2024

Using OpenRouter requires you to use their credit platform, I believe. This would not be useful to people who wish to use their own API keys.

@Zaazu

Zaazu commented Sep 12, 2024

I'd like to also advocate for Ollama support.

@k10876

k10876 commented Sep 23, 2024

Hi there,

I've personally made a version which pins the AI to Aliyun Qwen, one of the AI services available in China. Please note that my edits still do NOT offer custom options, which is somewhat beyond my capabilities.

Since I'm not very experienced, I'll not be making a pull request, as it's far from Microsoft's standards. I look forward to expert developers helping with this issue.

If you are interested in my edits, please go to https://github.com/k10876/PowerToys/tree/development-qwen (I'll offer a binary installer if I have time). The interface shows OpenAI, but it's actually Qwen-based. Simply plug in your API keys and get ready to enjoy!

@nightkall

nightkall commented Oct 23, 2024

Google Gemini 1.5 Flash (Sep) is the fastest AI and has a free API.

It would also be nice to add some AI actions by default, or functionality like in Writing Tools (screenshot omitted), where you select some text, choose or write a task for the AI, and it then replaces the selected text with the AI-processed text automatically (instead of copying and choosing the option to paste the LLM-processed text). You can also translate text by prompting 'in Spanish' (or whatever you want).

@Kiansjet

Kiansjet commented Oct 26, 2024

+1 on this. The quickest, minimal-effort change, since the team is probably busy, is to just let the user override the API root URL: a single settings entry, some concatenation, and maybe a bit more robust error handling since the endpoint isn't fixed anymore.

Then, if the user wants to redirect the request to a local LLM server, or to a middleman script that proxies the request to a different model or whatever, they can.

Over time, though, what everyone else above said.


As a temporary solution, I found that software like Fiddler Classic can be used to manually redirect calls to OpenAI's API anywhere you want. I'm not good with regex and I rushed it, but this works (Fiddler rule screenshot omitted).

Runs just fine on a local gemma-2 9B IQ4_XS on LM Studio Server, but I found some other models may have issues complying with the instruction to not write too much garbage. Llama in particular just kept trying to write me Python to complete the task after it had already completed the task.

@bj114514

I also need this feature very much. My network does not allow me to use OpenAI services, and I often have to work offline. I think we could add API address and model name options so that I can use models from Ollama and other service providers.

@riedel

riedel commented Nov 3, 2024

Hi, I just want to chime in. Cool feature that I unfortunately cannot use right now...

Having OpenAI endpoints as the only option is, IMHO, a red flag for many commercial users in Europe due to GDPR concerns (especially when coupling something to copy & paste).

I also wonder if such bundling is not rather anti-competitive in the end. Similar to the availability of different models in GitHub Copilot, it would seem good practice to offer different endpoints.

@HaoTian22

HaoTian22 commented Nov 6, 2024

I also noticed that Advanced Paste is incompatible with some alternative endpoint formats.
Some APIs need to be sent "messages": [{"role": "user", "content": "xxxx"}] instead of "prompt": ["xxxxx"] (otherwise the server might respond with a 500), and they respond with "message" where the client expects "text".

I wrote a script for mitmproxy, with the help of an LLM. Maybe it can work around this issue.

Start command: mitmdump --mode local:PowerToys.AdvancedPaste.exe --mode upstream:http://proxy.server -s route.py

import json
from mitmproxy import http

def request(flow: http.HTTPFlow) -> None:
    if flow.request.host == "api.openai.com" and flow.request.method == "POST" and flow.request.headers.get("content-type", "").startswith("application/json"):
        try:
            # Redirect the call to your alternative endpoint
            flow.request.host = "your.alternative.endpoint"
            flow.request.path = "/v1/chat/completions"
            # flow.request.headers["authorization"] = "Bearer sk-xxx"  # token for the alternative endpoint
            flow.request.headers["http-referer"] = "http://localhost:8080/my_great_app"

            request_data = json.loads(flow.request.get_text())

            request_data["model"] = "gpt-4o-mini"
            # API compatibility: convert the legacy "prompt" field into chat-style "messages"
            if "prompt" in request_data:
                request_data["messages"] = [{"role": "user", "content": request_data["prompt"][0]}]
                del request_data["prompt"]

            flow.request.set_text(json.dumps(request_data))
        except json.JSONDecodeError:
            pass

# Disable streaming so the full response body is buffered and can be rewritten
def responseheaders(flow: http.HTTPFlow) -> None:
    flow.response.stream = False

def response(flow: http.HTTPFlow) -> None:
    # Process your.alternative.endpoint's response
    if flow.request.host == "your.alternative.endpoint" and flow.response.headers.get("content-type", "").startswith("application/json"):
        try:
            response_data = json.loads(flow.response.get_text())

            # API compatibility: move chat-style "message" content into the legacy "text" field
            if "choices" in response_data:
                for choice in response_data["choices"]:
                    if "message" in choice:
                        choice["text"] = choice["message"].get("content", "")
                        del choice["message"]

            # Provide a dummy usage block in case the client expects one
            response_data["usage"] = {"prompt_tokens": 88, "completion_tokens": 27, "total_tokens": 115}

            flow.response.set_text(json.dumps(response_data))
        except json.JSONDecodeError:
            pass

@an303042

an303042 commented Nov 7, 2024

+1 for Ollama support on localhost, or via an IP on the local network.
