Query about ideal drive:\location for install to reduce conflicts #49

Magenta-6 · 2023-06-25T02:01:31Z

This is not an issue as such but a request for advice about where to set up an installation.

Firstly, your idea to create a cross-platform TTS tool is extremely valuable, especially with a one-click installer. However,
I am hesitant to just set up a folder in my C:\Users\NAME directory and expect everything to go flawlessly.

Over the past year, the advent of AI-generated TTI and LLMs has been an exciting journey.
I am one of those people who has gone through a steep learning curve getting to grips with virtual Python environments, and I do not fully understand the intricacies of how the multitude of modules inter-relate.
I do know enough to know that conflicts can occur between them and that a lot of time can be spent un-installing and reinstalling them.

At present I have successfully set up several tts applications: Coqui, Silero, & Bark.
I have also attempted to set up Tortoise and AudioCraft, but have failed to troubleshoot installation errors. (Note1. below)

I first started using Silero in oobabooga, which worked fine.
However the voices were v. limited so I set up coqui and bark to get a better variety of voices and accents.
And the possibility of voice cloning/training is extremely appealing also.

All that is a long way to ask the question:

Is there an ideal place to install your tts-webui that will not create conflicts with other installations?
Should I un-install all the other applications first?
Is it likely that the installation will add duplicate versions of torch, conda and other dependencies that are already installed?

BACKGROUND
I am using Windows 10, RTX 4070Ti, CUDA 11.7, Anaconda3, conda 22.9.0, Torch 2.0.1
2023-06-25_pip list.txt

Here is a list of the folders where I have set up various applications:
Initially I installed tts so that they could be used with oobabooga and later Silly Tavern
C:\SuperStableDiffusion2.0\stable-diffusion-webui
C:\SuperStableDiffusion2.0\oobabooga-windows
C:\SuperStableDiffusion2.0\Bark\bark-gui
C:\SuperStableDiffusion2.0\CoquiTTS\TTS

Later I began setting up applications in the User\ directory
C:\Users\ABC\Audiocraft\audiocraft-main
C:\Users\ABC\Bark-tts\bark_win\bark-gui
C:\Users\ABC\Silero-tts
C:\Users\ABC\coqui-tts
C:\Users\ABC\tortoise-tts

(Note 1.) Issues Raised:
neonbjb/tortoise-tts#468
facebookresearch/audiocraft#123

rsxdalv · 2023-06-25T10:41:33Z

Ok so, short answer - it should not matter, I'd say C:\SuperStableDiffusion2.0\ is a good directory to keep it organized, for example, if you want to start/stop using some of them.

The technical background is - the one click installer is related[1] to oobabooga's one click installer, and it installs everything. Since I have some development tools installed on my machine I might have a blind spot, but I know that it installs 1. python 2. conda 3. all the packages, including torch and drivers, 4. git (not sure) and everything is contained within the directory you chose.

If there are conflicts - that would be a bug. The installer is optimized for being standalone and non-conflicting, hence why the installation is slower and bulkier; however, it ought to be more stable and robust.

As for the pip list, these pip packages shouldn't affect the internal virtual environment.

[1] - Although I saw that there's a second installer and that they aren't always using this one, it's a bit confusing, but the original target was to install oobabooga's UI.
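
On the point that system-wide pip packages shouldn't affect the internal environment: one way to check this is to print which interpreter and environment prefix are actually in use when the app runs. The following is a minimal sketch, not part of the installer; `describe_environment` and `is_inside` are hypothetical helper names, and the install folder you pass in is whatever directory you chose.

```python
import sys
from pathlib import Path


def describe_environment() -> dict:
    """Report which Python interpreter and environment prefix are in use."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
    }


def is_inside(install_dir: str, prefix: str = sys.prefix) -> bool:
    """Return True if the environment prefix lives under install_dir."""
    root = Path(install_dir).resolve()
    env = Path(prefix).resolve()
    return env == root or root in env.parents


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If this is run with the Python that the start script uses and `is_inside` returns True for the install folder, the packages listed by a system-wide `pip list` are not the ones being loaded.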

Magenta-6 · 2023-06-26T02:44:53Z

@rsxdalv - Much appreciate the advice.
Having mucked up things previously, I've become a little wary of simply "pip installing" everything on offer.

Install went absolutely perfectly!
Took about 15 mins, but not a single problem so far.
You have made an exceptional one stop shop with this!!

THANK YOU !!

Kind regards
Magenta-6

rsxdalv · 2023-06-26T06:39:34Z

What you say is also correct: simply pip installing will usually result in problems, which is why there's almost always a virtual environment.

Magenta-6 · 2023-06-27T04:30:47Z

@rsxdalv had a great day yesterday looking at the functional aspects of the various programs.
Everything seemed to work fine.
However, today I cannot get it to run. I am simply double-clicking the start_windows.bat file.

Error as below:
++++++++++++++++++++++++++++++++++++++++++++++++++
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
Loading extensions:
Loaded extension: callback_save_generation_ffmpeg
Loaded extension: callback_save_generation_musicgen_ffmpeg
Loaded extension: empty_extension
Loaded 2 callback_save_generation extensions.
Loaded 1 callback_save_generation_musicgen extensions.
Loading Bark models
- Text Generation: GPU: Yes, Small Model: Yes
- Coarse-to-Fine Inference: GPU: Yes, Small Model: Yes
- Fine-tuning: GPU: Yes, Small Model: No
- Codec: GPU: Yes
2023-06-27 16:19:08 | INFO | fairseq.tasks.text_to_speech | Please install tensorboardX: pip install tensorboardX
2023-06-27 16:19:08 | WARNING | xformers | Triton is not available, some optimizations will not be enabled.
This is just a warning: No module named 'triton'
Traceback (most recent call last):
  File "C:\SuperStableDiffusion2.0\TTS-4.0\tts-generation-webui\server.py", line 87, in <module>
    history_tab(register_use_as_history_button)
  File "C:\SuperStableDiffusion2.0\TTS-4.0\tts-generation-webui\src\history_tab\main.py", line 56, in history_tab
    return history_content(
  File "C:\SuperStableDiffusion2.0\TTS-4.0\tts-generation-webui\src\history_tab\main.py", line 81, in history_content
    history_list_as_gallery = gr.Gallery(value=get_wav_files_img(directory))
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\components.py", line 4403, in __init__
    IOComponent.__init__(
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\components.py", line 215, in __init__
    else self.postprocess(initial_value)
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\components.py", line 4468, in postprocess
    file_path = self.make_temp_copy_if_needed(img)
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\components.py", line 259, in make_temp_copy_if_needed
    temp_dir = self.hash_file(file_path)
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\components.py", line 223, in hash_file
    with open(file_path, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'outputs\2023-06-26_16-33-06__bark__de_speaker_0.png\2023-06-26_16-33-06__bark__de_speaker_0.png.png'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\SuperStableDiffusion2.0\TTS-4.0\tts-generation-webui\server.py", line 66, in <module>
    with gr.Blocks(
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\blocks.py", line 1411, in __exit__
    self.config = self.get_config_file()
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\blocks.py", line 1378, in get_config_file
    props = block.get_config() if hasattr(block, "get_config") else {}
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\components.py", line 4433, in get_config
    "value": self.value,
AttributeError: 'Gallery' object has no attribute 'value'

Done!
Press any key to continue . . .
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

From what I've read elsewhere the Triton thing is not a problem,
but I don't know how to tackle the errors in the Traceback.
Should I simply delete the TTS-4.0 directory and re-install, or something else?

rsxdalv · 2023-06-27T06:40:11Z

The outputs seem to have a bug, I will try and fix it. For now you can just delete/backup the "outputs" folder.

rsxdalv · 2023-06-27T06:42:24Z

Actually, to be safe - create an empty outputs folder after removing the current outputs.
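
The workaround above (move the current outputs folder aside, then create an empty one) can be sketched in a few lines. This is only an illustration, assuming the folder is named "outputs" and sits directly inside the app directory; `reset_outputs` is a hypothetical helper name, not something shipped with the web UI.

```python
import time
from pathlib import Path


def reset_outputs(app_dir, name="outputs"):
    """Move an existing outputs folder to a timestamped backup, then create an empty one.

    Returns the backup path, or None if there was nothing to back up.
    """
    target = Path(app_dir) / name
    backup = None
    if target.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = target.with_name(f"{name}_backup_{stamp}")
        target.rename(backup)  # nothing is deleted, only renamed
    target.mkdir()
    return backup
```

Renaming rather than deleting keeps the generated files available for later sorting.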

rsxdalv · 2023-06-27T10:59:02Z

The error happens because the output folder got named "2023-06-26_16-33-06__bark__de_speaker_0.png". When I tried a simple generation, I got the proper folder name "2023-06-27_13-25-59__bark__de_speaker_0".

Did anything unusual happen?
I even tried changing my installation directory and it still didn't have this issue.

For now I also isolated the issue so that it only affects the history tab, and you can still boot up the app. Until a future upgrade I cannot make a stable solution. To get the latest changes run the "update" script.
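
To look for the kind of folder described above (a sub-folder of outputs whose name ends in a file extension such as ".png"), a small scan like the following could be used. It is a sketch under assumptions: the list of extensions is taken from the file types mentioned in this thread, and `find_suspicious_folders` is a hypothetical name.

```python
from pathlib import Path

# File extensions mentioned in this thread; a folder ending in one of these is suspect.
FILE_SUFFIXES = {".png", ".wav", ".ogg", ".npz", ".json"}


def find_suspicious_folders(outputs_dir):
    """Return sub-folders of outputs_dir whose names end in a known file extension."""
    root = Path(outputs_dir)
    return sorted(
        entry
        for entry in root.iterdir()
        if entry.is_dir() and entry.suffix.lower() in FILE_SUFFIXES
    )
```

Any folder it reports can be renamed or moved out of outputs by hand.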

Magenta-6 · 2023-06-27T22:51:07Z

Hey thanks.
I replaced the [outputs] folder with a new empty folder and it worked a treat.

The cause of the error may be this:
After generating heaps of stuff in stable diffusion I usually manually go through the outputs using windows explorer to review and delete the rubbish and move the usable content into a folder with a date and a description. This helps me to theoretically find content later and it reduces the content on the hard drive.

Using the same idea I adopted this practice with TTS-4.0.
I started to extract the .wavs from each of the sub-folders and left them in the main "outputs" folder to aggregate them into a new folder called [2023-06-26_TestPrompt-01].
All the other files, the .png's, the .oggs, .npz .json and the sub-folder itself were deleted.

I'm guessing that in this process a folder must have accidentally got re-named
[2023-06-26_16-33-06__bark__de_speaker_0.png]
I cannot find this folder to confirm, but it seems the most likely cause of the glitch.

I appreciate the time you put into handling these queries and in getting back to me with appropriate advice. I've dropped a copy of the file generated in the hope it might give you a chuckle. I'm finding that some of the non-english voices generate really good accented english content. Bark voice: de_speaker_0 is one of my faves!

I did this before I realized that there is a handy tab within the api for reviewing and deleting content. There are also tools for creating favourites and collections, which I have yet to utilize.

With all of these Ai Gen Tools, (as with digital photography), a good file management workflow is essential for managing and curating digital assets. I can see that your "One Stop Shop" approach to TTS + Audio gen is similar to Adobe Lightroom in bringing workflow and content creation together. It should get a lot of attention.

As an aside there is a pretty powerful image management tool called "breadboard" which reads metadata within images and can sort by text tags.
Link: https://github.com/cocktailpeanut/breadboard

2023-06-26_16-33-06__bark__de_speaker_0.zip

rsxdalv · 2023-06-27T22:56:25Z

Thank you! I'll reply longer later but for now I'll just add this: theoretically the ogg files contain everything with the most space efficiency. There are a few considerations to make but generally just keeping ogg files for a "frozen" archive is good, I'll just need to add more UI to handle them specifically.

rsxdalv · 2023-06-28T22:59:34Z

That's an interesting project! Yes, I have been using Stable Diffusion and I saw a similar issue of collections being a necessity and a bottleneck.
For Bark files, something like this can be ported to run on local files: https://rsxdalv.github.io/bark-speaker-directory/voice-drafts

Magenta-6 · 2023-06-29T21:25:15Z

@rsxdalv that card index for voices is an awesome add on.
I really like the way it is set up with a pic, voice sample and tags.
I assumed that "ported" means it can be brought into TTS-4.0 as an extension, in the same way that oobabooga works.

However, my trial and error has got me into trouble again.
I copy/pasted your web URL: https://rsxdalv.github.io/bark-speaker-directory/ into the bottom field of the gradio settings page.
I think it was called Directories.
It crashed the app . . . and caused the following error:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Traceback (most recent call last):
  File "C:\SuperStableDiffusion2.0\TTS-4.0\tts-generation-webui\server.py", line 126, in <module>
    demo.queue(
  File "C:\SuperStableDiffusion2.0\TTS-4.0\installer_files\env\lib\site-packages\gradio\blocks.py", line 1757, in launch
    raise ValueError("allowed_paths must be a list of directories.")
ValueError: allowed_paths must be a list of directories.

Done!
Press any key to continue . . .
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
I ran the update_windows.bat hoping it might clear the field, but that was a bit optimistic.
I did see quite a few updates come in though so still worth doing.

If you could let me know whether I should replace a "particular_file.py" it would help me.
I have Visual Studio so I could even manually delete the url from the appropriate file, if I knew where to look.

PS: The card system is super useful too so any tips on doing it the right way would be appreciated.
Sorry to waste your time on this.

Cheers from NZ.

rsxdalv · 2023-06-29T22:00:06Z

If you create an unrecoverable issue in the settings, you can just delete or backup config.json and it will get recreated.
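
A small sketch of the backup option, for anyone who would rather not delete the file outright. It assumes config.json sits directly in the app directory (adjust the path if it lives elsewhere); `backup_config` is a hypothetical helper name.

```python
import time
from pathlib import Path


def backup_config(app_dir, filename="config.json"):
    """Rename the config file to a timestamped backup so the app can recreate it.

    Returns the backup path, or None if no config file was found.
    """
    config = Path(app_dir) / filename
    if not config.is_file():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = config.with_name(f"{config.stem}_backup_{stamp}{config.suffix}")
    config.rename(backup)
    return backup
```

The old settings stay available in the backup file in case only one field needs to be copied back.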

Magenta-6 · 2023-07-01T02:26:14Z

Thanks - Opened the config.json and removed the link.
Perfect again.
Still interested in your voice card idea, but unsure how to go about porting them.

Magenta-6 · 2023-07-01T02:28:11Z

Just found your readme.

rsxdalv · 2023-07-01T06:17:42Z

There are two types of voice cards - "profiles" with names and pictures, and a "tree of voices" where the voices are grouped together based on the relations between them. To be honest, I'm not yet sure which use case you are looking for. Do you want to automatically generate voice profiles from your user-made voices? Or do you want to have a voice selector that utilizes voice profiles? Or do you want the outputs/history view to have the "tree of voices" style display?

Magenta-6 · 2023-07-02T04:55:47Z

@rsxdalv - Thanks for asking.
I think the voice selector is what I like, but I can see that a tree could be useful if the hierarchy can be figured out.

What would be great is if the Voice Tab in your webui could be linked to the Voice Cards as well as to the .npz files.

As an example of a similar situation, below is a screenshot of Textual Inversion (T.I.) cards in Automatic1111's Stable diffusion webui. [2023-07-04, image deleted and replaced with a screenshot of the folder contents showing pairs of embeddings.pt files and .png files]

Each of the images was generated using a particular Textual Inversion.
The images in .png format simply got dropped into the [embeddings] folder with the T.I. files.
If the file name of the T.I. is marsattacks3.pt, then an image called marsattacks3.png gets automatically pulled into the right slot.

(There is a way of embedding (steganizing) data into images, which might be able to be used but I don't know how to do that). - [2023-07-04, See Later post below]

As people start generating content with multiple voices and the number of voices starts to increase, a Voice Selector with fields for #hashtags, etc. will be an extremely useful way of setting up collections, families and characters.
A card system is also a great way of sharing voices with others.

[2023-07-04, image replaced]

rsxdalv · 2023-07-03T17:11:13Z

Just to be safe let's censor the image

rsxdalv · 2023-07-03T17:12:40Z

As for that, yes, I need to see if I can somehow get image generation, and then I could write a plugin that saves voices as images. Currently my approach is that I want to keep the "core" simpler and then enhance it with plugins, which could eventually become part of the "core".

Magenta-6 · 2023-07-03T20:11:56Z

A plugin sounds good. A modular approach around a central core seems like the way to go, then others with special skillsets can create add-ons. I guess that's part of the charm of Github and open source.

PS. I've deleted the screenshot on the previous post.
Below is an example of a .png file with image-style data written on the sides as a QR type code.
At least that's what it looks like to me.

Inspecting the file info in photoshop revealed almost no metadata apart from image size and format.

rsxdalv · 2023-07-16T14:26:09Z

I added the initial basic version of this in #78, where if a voice and an image have the same filename you can see the image in the UI.
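
The same-filename idea discussed in this thread (for example marsattacks3.pt picking up marsattacks3.png) can be illustrated as follows. This is not the code from #78, only a sketch of the matching rule; the function name and the default ".npz"/".png" extensions are assumptions for illustration.

```python
from pathlib import Path


def pair_voices_with_images(folder, voice_ext=".npz", image_ext=".png"):
    """Map each voice file name to the image with the same base name, or None if missing."""
    root = Path(folder)
    pairs = {}
    for voice in sorted(root.glob(f"*{voice_ext}")):
        image = voice.with_suffix(image_ext)  # same name, different extension
        pairs[voice.name] = image.name if image.is_file() else None
    return pairs
```

A voice with no matching picture maps to None, so a UI could fall back to a placeholder card.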

Magenta-6 · 2023-07-17T03:00:47Z

Thanks - Love the way you coded it to automatically rename the image file. Works a treat.

rsxdalv · 2023-07-26T11:53:49Z

#98
ok, now it will rename both automatically, and also you can select voices from the gallery

Magenta-6 closed this as completed Jul 11, 2023
