Releases: NaruseMioShirakana/DragonianVoice
DragonianVoice - 1.0.1 - Native api
- Most code that calls platform-specific system APIs has been separated from the rest of the library.
- The Tensor library and ONNX library have been decoupled, allowing you to choose which one to compile.
- Smart pointers are now used instead of raw pointers.
- The Logger system has been rewritten. The current Logger allows you to specify the LoggerID and LoggerLevel. The log text format has been updated.
- The exception structure has been rewritten. Exceptions can now track, to a certain extent, the call stack and the location where they were thrown.
- The Encoder (Hubert), Vocoder (Hifigan), and PE (F0Extractor: RMVPE, FCPE) models are now held as global, reference-counted instances. The same model can be shared across different Svc models, which avoids loading it more than once and lets you swap only the Svc model. The loading API returns a smart pointer, and the unloading API only decrements the global reference count of the specified model by one. Calling the unloading API while an Svc model is still in use is therefore safe: the shared model is actually released only after every Svc model that uses it has also been unloaded.
- The default audio format has been changed to PCM-float32-le, but you can still use the API with the I16 suffix to use audio in PCM-int16-le format.
- Function parameter types have been replaced with distinctly named empty structures so that the C/C++ type system and IDEs can type-check arguments.
- Shallow diffusion must now be invoked explicitly by calling the corresponding function. Setting the corresponding item to true in the inference parameters and adding the pointer to the corresponding model no longer triggers it.
- Vocoder enhancement must now be invoked explicitly by calling the corresponding function. Setting the corresponding item to true in the inference parameters and adding the pointer to the corresponding model no longer triggers it.
- This repository is currently only used for releases. For source code, see: Source code of DragonianLib
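The shared-model behaviour described in the notes above can be illustrated with a minimal C++ sketch. It is not the DragonianLib implementation: `Model`, `ModelRegistry`, and `SvcModel` are hypothetical names, and the sketch only assumes that the global cache and each Svc model hold a `std::shared_ptr` to the same object.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a loaded encoder/vocoder/F0 model (not a DragonianLib type).
struct Model {
    std::string path;
};

// Hypothetical global cache keyed by model path.
class ModelRegistry {
public:
    // Returns the cached instance if the path is already loaded, otherwise creates it.
    std::shared_ptr<Model> load(const std::string& path) {
        auto it = cache_.find(path);
        if (it != cache_.end()) return it->second;
        auto model = std::make_shared<Model>(Model{path});
        cache_.emplace(path, model);
        return model;
    }

    // Drops only the registry's own reference; other holders keep the model alive.
    void unload(const std::string& path) { cache_.erase(path); }

private:
    std::map<std::string, std::shared_ptr<Model>> cache_;
};

// Hypothetical Svc model that shares its encoder with other Svc models.
struct SvcModel {
    std::shared_ptr<Model> encoder;
};
```

In this sketch, two `SvcModel` objects that load the same path receive the same pointer, and calling `unload` while they are alive does not destroy the encoder; it is destroyed when the last `SvcModel` releases it.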
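For callers that still hold 16-bit audio, converting between the two sample formats mentioned above might look like the following sketch. The function names are hypothetical, and the scaling convention (divide by 32768 on the way in, multiply by 32767 with clamping on the way out) is an assumption; the library's I16 functions may use a different one. Byte order is not handled here because the sketch works on in-memory samples.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: int16 PCM samples -> float32 samples in [-1.0, 1.0).
std::vector<float> pcm_i16_to_f32(const std::vector<std::int16_t>& in) {
    std::vector<float> out;
    out.reserve(in.size());
    for (std::int16_t s : in) {
        out.push_back(static_cast<float>(s) / 32768.0f);
    }
    return out;
}

// Hypothetical helper: float32 samples -> int16 PCM, clamping out-of-range values.
std::vector<std::int16_t> pcm_f32_to_i16(const std::vector<float>& in) {
    std::vector<std::int16_t> out;
    out.reserve(in.size());
    for (float s : in) {
        const float clamped = std::max(-1.0f, std::min(1.0f, s));
        out.push_back(static_cast<std::int16_t>(std::lround(clamped * 32767.0f)));
    }
    return out;
}
```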
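The idea of using empty structures as parameter types can be sketched as follows. All names here are hypothetical and the real API may be organised differently; the point is only that pointers to two distinct empty struct types do not convert to each other implicitly, so passing a vocoder handle where an Svc model handle is expected fails to compile instead of failing at run time.

```cpp
#include <type_traits>

// Hypothetical empty tag types; each one exists only to give its handle a distinct type.
struct SvcModelTag {};
struct VocoderTag {};
using SvcModelHandle = SvcModelTag*;
using VocoderHandle = VocoderTag*;

namespace detail {
// Hypothetical internal object hidden behind the Svc handle.
struct SvcImpl {
    int sampling_rate;
};
}  // namespace detail

// Creates the internal object and returns it as an opaque, typed handle.
SvcModelHandle create_svc(int sampling_rate) {
    return reinterpret_cast<SvcModelHandle>(new detail::SvcImpl{sampling_rate});
}

// Reads a value through the handle by casting back to the internal type.
int svc_sampling_rate(SvcModelHandle handle) {
    return reinterpret_cast<detail::SvcImpl*>(handle)->sampling_rate;
}

// Destroys the internal object behind the handle.
void destroy_svc(SvcModelHandle handle) {
    delete reinterpret_cast<detail::SvcImpl*>(handle);
}

// The two handle types are unrelated, so mixing them up is a compile-time error.
static_assert(!std::is_convertible<VocoderHandle, SvcModelHandle>::value,
              "a vocoder handle must not convert to an Svc model handle");
```

Compared with passing `void*` everywhere, this costs nothing at run time and lets the compiler and IDE flag mismatched arguments.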
libsvc - 0.0.9 - Native api
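Fixed some known bugs and added support for the CUDA execution provider (CUDA EP).
The precompiled onnxruntime.dll in this release currently supports CPU, DML, and CUDA 12 + cuDNN 9.
Source code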
libsvc - 0.0.8 - Native api
Performance optimizations.
Most dependencies are now statically compiled.
libsvc - 0.0.7 - Native & .Net api
ShallowDiffusion Bug Fix
Infer Pcm Data
libsvc - 0.0.6 - Native & .Net api
Format Code
Bug Fix
libsvc - 0.0.5 - Native & .Net api
Vocoder BUG Fix
Support Write PCM Data
libsvc - 0.0.4 - Native & .Net api
C# example
using LibSvcApi;
LibSvc.LibSvcHparams Config = new();
Config.TensorExtractor = "DiffusionSvc";
Config.SamplingRate = 44100;
Config.HopSize = 512;
Config.HubertPath = "hubert\\vec-768-layer-12.onnx";
Config.SpeakerCount = 2;
Config.HiddenUnitKDims = 768;
Config.EnableCharaMix = 1;
Config.EnableVolume = 1;
Config.MelBins = 128;
Config.DiffusionSvc.After = "Models\\ShallowDiffusion\\ShallowDiffusion_after.onnx";
Config.DiffusionSvc.Alpha = "Models\\ShallowDiffusion\\ShallowDiffusion_alpha.onnx";
Config.DiffusionSvc.Encoder = "Models\\ShallowDiffusion\\ShallowDiffusion_encoder.onnx";
Config.DiffusionSvc.Denoise = "Models\\ShallowDiffusion\\ShallowDiffusion_denoise.onnx";
Config.DiffusionSvc.Naive = "Models\\ShallowDiffusion\\ShallowDiffusion_naive.onnx";
Config.DiffusionSvc.Pred = "Models\\ShallowDiffusion\\ShallowDiffusion_pred.onnx";
void PrintProgress(ulong arg1, ulong arg2)
{
    Console.WriteLine(arg1 * 100.0 / 10);
}
LibSvc.CallbackProgress Callback = new LibSvc.CallbackProgress(PrintProgress);
UnionModel Model = LibSvc.Factory.LoadUnionSvcModel(
    ref Config, ref Callback,
    0, 0, 8
);
string AudioPath = "input.wav";
Int16Vector Audio = LibSvc.Factory.ReadAudio(ref AudioPath, 48000);
Console.WriteLine(Audio.Size());
LibSvc.SlicerSettings slicerSettings = new();
UInt64Vector SlicePos = LibSvc.Factory.SliceAudio(ref Audio, ref slicerSettings);
Console.WriteLine(SlicePos.Size());
Slices slices = LibSvc.Factory.Preprocess(ref Audio, ref SlicePos);
Console.WriteLine(slices.Size());
string VocoderPath = "hifigan\\nsf_hifigan.onnx";
VocoderModel Vocoder = LibSvc.Factory.LoadVocoderModel(ref VocoderPath);
LibSvc.Params _params = new();
_params.SetVocoder(ref Vocoder);
ulong Proc = 0;
Slice slice = slices[0];
Audio = Model.Inference(slice, ref _params, ref Proc);
Console.WriteLine((double)slice.SrcLength() * Config.SamplingRate / slicerSettings.SamplingRate);
Console.WriteLine(Audio.Size());
GC.KeepAlive(Callback);
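The second-to-last `Console.WriteLine` in the example prints the expected output length: the slice's source length scaled by the ratio of the model's sampling rate to the slicer's sampling rate. The same arithmetic as a small C++ sketch, where the function name is hypothetical and the rates used in the note below are illustrative values rather than library defaults:

```cpp
#include <cstdint>

// Hypothetical helper mirroring the arithmetic in the C# example:
// source length scaled by (target rate / source rate).
double expected_output_length(std::uint64_t src_length,
                              std::uint32_t src_rate,
                              std::uint32_t dst_rate) {
    return static_cast<double>(src_length) * dst_rate / src_rate;
}
```

For instance, one second of audio sliced at 48000 Hz (48000 samples) would be expected to produce 44100 samples from a model running at 44100 Hz.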
MoeVoiceStudio - 0.1.3
Ver - 0.1.3
UI changes
- Major changes
  - Added CrashHandler
Core changes
- Major changes
  - Added support for ReflowSVC
MoeVoiceStudio - 0.1.2
Fixed some bugs.
MoeVoiceStudio - TTS - 0.1.4
Improved the code structure.
Fixed several bugs.