sample rate converter #9

Open
dromer opened this issue Aug 5, 2021 · 10 comments

Comments

@dromer

dromer commented Aug 5, 2021

At the moment the processing is fixed at 44.1kHz, but many systems run at 48kHz (or even higher). See #2 (comment).

It would be great to be able to run at other sample rates. Does this require training with other datasets?

@GuitarML
Owner

GuitarML commented Aug 5, 2021

Yes, that is something I'm interested in implementing. Right now I believe you would need to train using a 48k dataset (have not tested this), but I'm currently looking into using r8brain for internal samplerate conversion. The idea is if you change the samplerate in the plugin to anything other than 44.1k, it would downsample to 44.1k for processing the neural net model, then upsample back to the output samplerate. I need to verify the latency in doing that.
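A minimal sketch of the wrapper idea described above: convert the host block to the model rate, run the model, then convert back. This is only an illustration, not the plugin's code. The names `resample_linear`, `process_at_host_rate` and `model` are hypothetical, and the linear interpolation is a crude stand-in for a proper band-limited resampler such as r8brain, which would also add the filter latency that still needs to be verified.

```python
from math import gcd

MODEL_RATE = 44100  # rate the models currently run at, per this thread


def resample_ratio(src_rate, dst_rate):
    """Return the (up, down) integer factors for a rational rate change."""
    g = gcd(src_rate, dst_rate)
    return dst_rate // g, src_rate // g


def resample_linear(x, src_rate, dst_rate):
    """Crude linear-interpolation resampler (assumption: stand-in only)."""
    if src_rate == dst_rate or len(x) < 2:
        return list(x)
    n_out = len(x) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source block
        k = int(pos)
        frac = pos - k
        nxt = x[min(k + 1, len(x) - 1)]    # clamp at the end of the block
        out.append(x[k] * (1.0 - frac) + nxt * frac)
    return out


def process_at_host_rate(block, host_rate, model):
    """Downsample/upsample to MODEL_RATE, run the model, convert back."""
    at_model_rate = resample_linear(block, host_rate, MODEL_RATE)
    processed = model(at_model_rate)
    return resample_linear(processed, MODEL_RATE, host_rate)
```

For a 48kHz host, `resample_ratio(48000, 44100)` gives the factors `(147, 160)`, so a 480-sample block becomes 441 samples at the model rate and 480 again on the way out.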

@dromer
Author

dromer commented Aug 5, 2021

Hmmm, it sounds to me like having a 48k-trained model would certainly be the more performant option.
(Many of us are looking at running this on embedded targets, where resources are more constrained.)

@GuitarML
Owner

GuitarML commented Aug 5, 2021

I agree, no samplerate conversion would be ideal. I'll do some testing for both options and share the results here.

@MaxPayne86

@GuitarML what's the decrease in performance from 44.1 to 48 on Stateful LSTM? For example, would it make sense to run NeuralPi at 24kHz and put downsample/upsample blocks around it? Guitar is pretty dead at 12kHz...

@GuitarML
Owner

GuitarML commented Aug 9, 2021

@MaxPayne86 I wouldn't want to go below 44.1kHz, just to keep sound quality high. I'm not opposed to testing out 24kHz though, more information on how the models perform at different samplerates would be good. I haven't tested 48kHz models on the raspberry pi hardware yet, but I'll post the results here when I do. On the Rpi4, sushi is reporting 16% cpu usage running one neural net model at 44.1k.
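As a rough back-of-the-envelope estimate while real measurements are pending: if the per-sample cost of the model stays constant (an assumption, not something measured in this thread), CPU load should scale about linearly with sample rate. The helper name `estimate_cpu_percent` is hypothetical.

```python
def estimate_cpu_percent(measured_percent, measured_rate, target_rate):
    """Scale a measured CPU load linearly with sample rate.

    Assumption: the model does a fixed amount of work per sample,
    so the load is proportional to samples per second.
    """
    return measured_percent * target_rate / measured_rate


# Starting from the 16% reported by sushi at 44.1kHz on the Rpi4:
at_48k = estimate_cpu_percent(16.0, 44100, 48000)  # about 17.4
at_24k = estimate_cpu_percent(16.0, 44100, 24000)  # about 8.7
```

These numbers ignore the cost of any resampling stages, so they are a lower bound for the 24kHz case with downsample/upsample blocks.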

@MaxPayne86

@GuitarML I didn't mention aliasing, sorry. So we could low-pass the input at 12kHz and run the network at 48kHz, i.e. at a sample rate 4x the cutoff frequency. Does that sound good to you? It seems lowering the sample rate to 24kHz is not a good idea... I'm curious how commercial neural network plugins handle that; I didn't find anything in the available literature. What do you think?
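To make the aliasing concern concrete, here is a small sketch that computes where a frequency lands after sampling. A nonlinear model generates harmonics above the input band, and anything above half the sample rate folds back down. The function name `alias_frequency` is hypothetical, and the 10kHz test tone is just an example value.

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Return the frequency at which freq_hz appears after sampling."""
    folded = freq_hz % sample_rate_hz
    # Components above Nyquist mirror back into the 0..fs/2 band.
    return min(folded, sample_rate_hz - folded)


# Harmonics of a 10kHz tone produced by a nonlinearity:
second, third = 20000, 30000

print(alias_frequency(second, 48000))  # 20000: below Nyquist, no aliasing
print(alias_frequency(third, 48000))   # 18000: folds back
print(alias_frequency(second, 24000))  # 4000: folds back into the guitar band
print(alias_frequency(third, 24000))   # 6000: folds back into the guitar band
```

With the input limited to 12kHz, running at 48kHz keeps every second harmonic at or below Nyquist, while at 24kHz the harmonics of anything above 6kHz fold back into the audible band.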

@mishushakov

@MaxPayne86 Neural DSP uses something they call "anti-derivative trigonometric interpolation": they run the neural model at a fixed sample rate and use that algorithm to convert to the desired sample rate.

source: https://neuraldsp.com/news/a-new-audio-engine-powering-neural-dsp-plugins

@MaxPayne86

MaxPayne86 commented Aug 10, 2021

@mishushakov @GuitarML okay, so the processing chain could be something like

upsample -> (lowpass) -> neural @??? -> downsample

For the upsample/downsample blocks, zita-resampler is one option; I don't know how it performs against JUCE's own implementation.

NOTE: if input is 44.1, then first stage is doing 44.1 -> 48. If input is 96, then first stage is a downsample block and the second one an upsample
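The stage selection in the note above can be sketched as a small function. The name `plan_stages` is hypothetical, and the model rate is left as a parameter because the chain above does not fix it; 48000 in the usage below is only the value implied by the "44.1 -> 48" example.

```python
def plan_stages(host_rate, model_rate):
    """Pick the resampling stages placed before and after the neural model.

    Returns a (first_stage, second_stage) tuple, or (None, None) when the
    host already runs at the model rate and no conversion is needed.
    """
    if host_rate == model_rate:
        return None, None
    if host_rate < model_rate:
        # e.g. 44.1kHz host: go up to the model rate, come back down after
        return "upsample", "downsample"
    # e.g. 96kHz host: come down to the model rate, go back up after
    return "downsample", "upsample"


print(plan_stages(44100, 48000))  # ('upsample', 'downsample')
print(plan_stages(96000, 48000))  # ('downsample', 'upsample')
print(plan_stages(48000, 48000))  # (None, None)
```

Skipping both stages when the rates match avoids the extra latency and CPU cost entirely, which matters on embedded targets.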

@smallbutfine

smallbutfine commented Sep 27, 2022

Nyquist alone doesn't cut it; that is a common misunderstanding. For complex signals there is a real-world advantage to running natively at 88, 96 or even 192kHz sample rates due to microdynamics, not just oversampling for anti-aliasing. Believe it or not, things sound a lot better. (Otherwise we would still be listening to non-HD media and there would be no advantage to high sample rates at all, considering no one older than 18 can hear anything above 22kHz.) Still, 48kHz is considerably higher quality than 44.1, and the difference is even more noticeable than the next step from 48kHz to 88. German public broadcasters now archive analog media of all kinds at 192kHz, and believe me, they would not if it had no benefit; just think of the cost of the extra storage and power needed for that insane amount of data. There is some serious academic knowledge behind that. In other domains of sampling, rates higher than what is "sufficient considering the Nyquist frequency" are also very common, for good reasons.
I also would have a blast using the plugin for more delicate tasks like mic pre models for vocals. Just my ideas regarding this topic.

@smallbutfine

PS: maybe Neural DSP uses something esoteric internally? I remember reading a paper by audio converter guru Frederic Forsell in which he argued for a 60kHz sample rate as a reasonable theoretical optimum for sampling and processing audio, back when high sample rates meant much higher build costs for high-quality converters...


5 participants