DragonianVoice - 1.0.1 - Native api

Most platform-specific system API calls have been isolated from the rest of the code.
The Tensor library and ONNX library have been decoupled, allowing you to choose which one to compile.
Smart pointers are now used instead of raw pointers.
The Logger system has been rewritten. The current Logger allows you to specify the LoggerID and LoggerLevel. The log text format has been updated.
The exception structure has also been rewritten. Exceptions can now track, to a limited extent, the call stack and the location where they were thrown.
Encoder (Hubert), Vocoder (Hifigan), and PE (F0Extractor - RMVPE, FCPE) models are now held through global references. This lets several Svc models share the same model instance, so it is not loaded multiple times and does not need reloading when you only switch the Svc model. The loading API for these models returns a smart pointer, and the unloading API only decrements the global reference count of the specified model by one. You can therefore call the unloading API while an Svc model is still in use: the shared model is actually unloaded only after every Svc model that uses it has also been unloaded.
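The sharing behaviour described above can be illustrated with a minimal C++ sketch built on std::shared_ptr. The names Model, ModelRegistry, Load and Unload are hypothetical and are not the library's actual API; the sketch only shows why unloading from the global table is safe while another owner still holds the pointer.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a loaded Hubert / Hifigan / F0 model.
struct Model {
    explicit Model(std::string p) : path(std::move(p)) {}
    std::string path;
};

// Hypothetical global table: one shared_ptr per model path.
class ModelRegistry {
public:
    // Returns the cached instance if the path is already loaded,
    // otherwise creates it and stores one reference in the table.
    std::shared_ptr<Model> Load(const std::string& path) {
        auto it = models_.find(path);
        if (it != models_.end()) return it->second;
        auto model = std::make_shared<Model>(path);
        models_.emplace(path, model);
        return model;
    }

    // Drops only the table's own reference. Any Svc model that still
    // holds a shared_ptr keeps the instance alive until it is released.
    void Unload(const std::string& path) { models_.erase(path); }

private:
    std::map<std::string, std::shared_ptr<Model>> models_;
};
```

In this sketch, two callers that load the same path receive the same instance, and the instance is destroyed only once the table and every caller have released it.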
The default audio format has been changed to PCM-float32-le, but you can still use the API with the I16 suffix to use audio in PCM-int16-le format.
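If existing code still produces PCM-int16 buffers, one option is to convert them before calling the float32 API. The sketch below uses a common scaling convention (divide by 32768, multiply by 32767 with clamping); the release notes do not state which convention the library uses internally, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Scale int16 samples into the range [-1.0, 1.0).
std::vector<float> Int16ToFloat32(const std::vector<std::int16_t>& pcm) {
    std::vector<float> out(pcm.size());
    std::transform(pcm.begin(), pcm.end(), out.begin(),
                   [](std::int16_t s) { return static_cast<float>(s) / 32768.0f; });
    return out;
}

// Clamp float samples to [-1.0, 1.0] and scale them back to int16.
std::vector<std::int16_t> Float32ToInt16(const std::vector<float>& pcm) {
    std::vector<std::int16_t> out(pcm.size());
    std::transform(pcm.begin(), pcm.end(), out.begin(), [](float s) {
        const float clamped = std::max(-1.0f, std::min(1.0f, s));
        return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
    });
    return out;
}
```

Clamping before the cast avoids integer overflow when a float sample lies outside the nominal range.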
Function parameter types have been replaced with distinctly named empty structures, so that the C/C++ type checker and IDEs can detect mismatched arguments.
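A rough illustration of why empty tag structures help: pointers to distinct struct types do not convert to one another, so passing the wrong handle becomes a compile-time error instead of a runtime crash. All identifiers below (VocoderModelTag, SvcModelTag, SetVocoder) are hypothetical and chosen for the example only.

```cpp
#include <type_traits>

// Empty structs that exist only to give each handle its own type.
struct VocoderModelTag {};
struct SvcModelTag {};

using VocoderModel = VocoderModelTag*;
using SvcModel = SvcModelTag*;

// With void* parameters any pointer would be accepted in either position.
// With tagged handles, swapping the two arguments does not compile.
inline bool SetVocoder(SvcModel model, VocoderModel vocoder) {
    return model != nullptr && vocoder != nullptr;
}

// The compiler itself confirms that the handle types are not interchangeable.
static_assert(!std::is_convertible<VocoderModel, SvcModel>::value,
              "distinct handle types must not convert implicitly");
```

IDEs benefit from the same information: completion and parameter hints can show the expected handle type rather than a generic pointer.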
Shallow diffusion now requires an explicit call to the corresponding function. It is no longer enabled by setting the corresponding item to true in the inference parameters and adding a pointer to the corresponding model.
Vocoder enhancement now requires an explicit call to the corresponding function. It is no longer enabled by setting the corresponding item to true in the inference parameters and adding a pointer to the corresponding model.
This repository is currently only used for releases. For source code, see: Source code of DragonianLib

10 Aug 05:02

NaruseMioShirakana

lib-0.0.9

d3efc14

libsvc - 0.0.9 - Native api

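Fixed some known bugs and added support for the CUDA execution provider (CUDA EP).
The precompiled onnxruntime.dll in this release currently supports CPU, DML, and CUDA 12 + cuDNN 9.
Source code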

13 Jun 07:30

NaruseMioShirakana

lib-0.0.8

e9cfa5b

libsvc - 0.0.8 - Native api

Performance optimizations.
Most dependencies are now statically compiled.

02 Jun 13:06

NaruseMioShirakana

lib-0.0.7

7597c82

libsvc - 0.0.7 - Native & .Net api

ShallowDiffusion Bug Fix
Infer Pcm Data

01 Jun 15:53

NaruseMioShirakana

lib-0.0.6

ce07b45

libsvc - 0.0.6 - Native & .Net api

Format Code
Bug Fix

23 May 12:56

NaruseMioShirakana

lib-0.0.5

f5ee2a6

libsvc - 0.0.5 - Native & .Net api

Vocoder BUG Fix
Support Write PCM Data

21 May 07:44

NaruseMioShirakana

lib-0.0.4

ec928cf

libsvc - 0.0.4 - Native & .Net api

C# example

using LibSvcApi;


// Model hyperparameters and ONNX file paths for a DiffusionSvc model.
LibSvc.LibSvcHparams Config = new();
Config.TensorExtractor = "DiffusionSvc";
Config.SamplingRate = 44100;
Config.HopSize = 512;
Config.HubertPath = "hubert\\vec-768-layer-12.onnx";
Config.SpeakerCount = 2;
Config.HiddenUnitKDims = 768;
Config.EnableCharaMix = 1;
Config.EnableVolume = 1;
Config.MelBins = 128;
Config.DiffusionSvc.After = "Models\\ShallowDiffusion\\ShallowDiffusion_after.onnx";
Config.DiffusionSvc.Alpha = "Models\\ShallowDiffusion\\ShallowDiffusion_alpha.onnx";
Config.DiffusionSvc.Encoder = "Models\\ShallowDiffusion\\ShallowDiffusion_encoder.onnx";
Config.DiffusionSvc.Denoise = "Models\\ShallowDiffusion\\ShallowDiffusion_denoise.onnx";
Config.DiffusionSvc.Naive = "Models\\ShallowDiffusion\\ShallowDiffusion_naive.onnx";
Config.DiffusionSvc.Pred = "Models\\ShallowDiffusion\\ShallowDiffusion_pred.onnx";
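// Progress callback; the arguments are assumed to be (current step, total steps).
void PrintProgress(ulong current, ulong total)
{
    if (total == 0) return;
    Console.WriteLine(current * 100.0 / total);
}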

LibSvc.CallbackProgress Callback = new LibSvc.CallbackProgress(PrintProgress);

UnionModel Model = LibSvc.Factory.LoadUnionSvcModel(
    ref Config, ref Callback,
    0, 0, 8
);

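// Read the input audio as 16-bit PCM at the given sampling rate.
string AudioPath = "input.wav";
Int16Vector Audio = LibSvc.Factory.ReadAudio(ref AudioPath, 48000);
Console.WriteLine(Audio.Size());

// Find the slice positions with the default slicer settings.
LibSvc.SlicerSettings slicerSettings = new();
UInt64Vector SlicePos = LibSvc.Factory.SliceAudio(ref Audio, ref slicerSettings);
Console.WriteLine(SlicePos.Size());

// Cut the audio at those positions and preprocess each slice.
Slices slices = LibSvc.Factory.Preprocess(ref Audio, ref SlicePos);
Console.WriteLine(slices.Size());

// Load the vocoder and attach it to the inference parameters.
string VocoderPath = "hifigan\\nsf_hifigan.onnx";
VocoderModel Vocoder = LibSvc.Factory.LoadVocoderModel(ref VocoderPath);

LibSvc.Params _params = new();
_params.SetVocoder(ref Vocoder);

// Run inference on the first slice only.
ulong Proc = 0;
Slice slice = slices[0];
Audio = Model.Inference(slice, ref _params, ref Proc);

// Expected output length after resampling, followed by the actual length.
Console.WriteLine((double)slice.SrcLength() * Config.SamplingRate / slicerSettings.SamplingRate);
Console.WriteLine(Audio.Size());

// Keep the delegate alive so that native code can still call it.
GC.KeepAlive(Callback);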