Feat: Add support for multi languages #69
src/utils.ts
Outdated
@@ -64,12 +64,26 @@ const sanitizeMessage = (message: string) => message.trim().replace(/[\n\r]/g, '

const promptTemplate = 'Write an insightful but concise Git commit message in a complete sentence in present tense for the following diff without prefacing it with anything:';

const getTranslatedPrompt = (lang: string) => {
	// List obtained by asking chatGPT
We'll need to reproduce it in the future. What prompt did you use?
This is the prompt: "Give me a list from the supported langs of GPT-3. The list must be in ISO with 2 chars in format of a JS Array". I also added it as a comment in the code.
I'm not sure if this is reliable. I get a completely different response:
["af", "sq", "am", "ar", "hy", "az", "eu", "be", "bn", "bs", "bg", "ca", "ceb", "ny", "zh", "zh-TW", "co", "hr", "cs", "da", "nl", "eo", "et", "tl", "fi", "fr", "fy", "gl", "ka", "de", "el", "gu", "ht", "ha", "haw", "he", "hi", "hmn", "hu", "is", "ig", "id", "ga", "it", "ja", "jv", "kn", "kk", "km", "rw", "ko", "ku", "ky", "lo", "la", "lv", "lt", "lb", "mk", "mg", "ms", "ml", "mt", "mi", "mr", "mn", "my", "ne", "no", "or", "ps", "fa", "pl", "pt", "pa", "ro", "ru", "sm", "gd", "sr", "st", "sn", "sd", "si", "sk", "sl", "so", "es", "su", "sw", "sv", "tg", "ta", "tt", "te", "th", "tr", "tk", "uk", "ur", "ug", "uz", "vi", "cy", "xh", "yi", "yo", "zu"]
I see 🤔 Maybe we can use a short version of the supported langs, a more complete list than the first version, while we figure out how to get an official list.
For example:
['en', 'zh', 'es', 'de', 'fr', 'ja', 'it', 'ru', 'hi', 'pt', 'ko', 'ar']
I asked it what languages it's proficient in and it seems like the answer is always "a little of everything":
As an AI language model, I have been trained on a massive amount of text data in many different languages, so I can understand and generate text in many different languages to some extent.
However, it's important to note that my proficiency and accuracy in a specific language can depend on various factors, including the amount and quality of the training data available for that language, the specific task at hand, and the complexity of the language itself.
Overall, I would say that I am most proficient and accurate in English, as it is the language that I have been trained on the most and is the language in which I have the largest amount of data available for training. However, I can also understand and generate text in many other languages to some extent, including but not limited to Spanish, French, German, Italian, Portuguese, Russian, Chinese, Japanese, and Korean.
Maybe we can move forward without a list. Since users are using their own keys, it's not like there's anything to be exploited or reverse-engineered.
I agree. Let me make the changes to remove it.
src/utils.ts
Outdated
@@ -64,12 +64,28 @@ const sanitizeMessage = (message: string) => message.trim().replace(/[\n\r]/g, '

const promptTemplate = 'Write an insightful but concise Git commit message in a complete sentence in present tense for the following diff without prefacing it with anything:';
Convert this to an arrow function that accepts the language and the diff
Done!
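A minimal sketch of what the requested arrow function might look like. The name `getPrompt` and the wording of the language instruction are assumptions, not taken from the PR; the base sentence is the `promptTemplate` string from the diff. No supported-language list is checked, in line with the decision to move forward without one.

```typescript
// Base sentence copied from the existing promptTemplate (without the trailing colon).
const basePrompt = 'Write an insightful but concise Git commit message in a complete sentence in present tense for the following diff without prefacing it with anything';

// Hypothetical helper: the name and the "Respond in..." wording are assumptions.
// It accepts the language code and the diff, and returns the full prompt.
const getPrompt = (lang: string, diff: string): string =>
	`${basePrompt}. Respond in the language with code "${lang}":\n${diff}`;
```

Because the language is passed through as-is, any code the model understands would work without a hard-coded list.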
Sorry, thinking about it more... I don't think people will want to pass in the language every time. We should make this a config instead. Let me get this in first: #71
Don't worry 👍 when that PR is merged, I'll do the necessary changes
Closing in favor of #96
I decided to create a new PR to have a cleaner commit history. Last PR was closed: #48
This PR resolves #30
The supported languages are: 'en', 'es', 'jp', 'zh', 'de', 'fr', 'it', 'pt'. More languages can be added according to the support that GPT offers.
English by default:
Using the --lang option:
Using the -l option:
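As a rough illustration of the behaviour described (English unless `--lang` or `-l` is given), the language could be resolved from the arguments along these lines. The helper `resolveLang` is hypothetical and hand-rolled for the example; it is not the PR's actual implementation or the project's argument parser.

```typescript
// Hypothetical helper: returns the language code from CLI arguments,
// falling back to English when neither --lang nor -l is present.
const resolveLang = (argv: string[], fallback = 'en'): string => {
	for (let i = 0; i < argv.length; i += 1) {
		const arg = argv[i];

		// Form: --lang=es
		if (arg.startsWith('--lang=')) {
			return arg.slice('--lang='.length) || fallback;
		}

		// Forms: --lang es, -l es
		if ((arg === '--lang' || arg === '-l') && i + 1 < argv.length) {
			return argv[i + 1];
		}
	}

	return fallback;
};

// In a Node.js CLI this would typically be called as:
// resolveLang(process.argv.slice(2))
```

The value is returned without validation, so a code outside the listed set would simply be passed on to the prompt.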