
How would *you* approach the implementation of a classic realtime phase vocoder?

0 votes

These two Max tutorials are pretty well known:
part 1
part 2

I have been trying to do this in realtime for some time now (years?) and I always run into a problem at the integration stage. In Kyma, the add-and-wrap (pseudo) integrator is a delay line running at framerate/hopsize with the feedback set to 1. As an accumulator this is problematic, because once started it never clears again, and I think the integrator needs to clear on each frame. On a scope, the result of the unclearable accumulator looks like animation software that doesn't clear the background: the image just keeps drawing on top of itself, and zero is not an 'erase', so a lot of non-zero information piles up. In the context of FFT phase information, this means it is always 'dirty', and the phase relationships are always smeared.
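To illustrate what I mean (this is just a Python/numpy sketch of the behaviour, not Kyma patching — the `accumulate` function and its `reset` input are my own hypothetical model of a feedback-1 delay), compare the accumulator with and without a per-frame clear:

```python
import numpy as np

def accumulate(deltas, reset=None):
    # Minimal model of the integrator: a one-frame delay with feedback = 1,
    # fed one phase increment per frame for a single bin.
    # `reset` (optional bool array) clears the state before a frame is
    # added -- the 'erase' that feedback = 1 alone never provides.
    state = 0.0
    out = []
    for i, d in enumerate(deltas):
        if reset is not None and reset[i]:
            state = 0.0
        state += d            # feedback = 1: the state never decays on its own
        out.append(state)
    return np.array(out)

deltas = np.full(8, 0.25)
print(accumulate(deltas))                                  # grows forever
print(accumulate(deltas, reset=(np.arange(8) % 4 == 0)))   # cleared every 4 frames
```

Without the reset the state just keeps growing; with it, the accumulation starts fresh on the chosen frames.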

This article by Stephan Bernsee is also well respected and has been implemented in many other environments, but I have not been able to get a circuit like that to work in Kyma either.

Has anyone else explored phase vocoders in Kyma?

Also, why is it that we are limited to the most basic FFT resolution, 2x overlap? Everyone in DSP seems to recommend a minimum of 4x, and at least 8x for decent results... couldn't there just be a switch on the FFT / iFFT that gets us some oversampling?

asked Jul 13, 2019 in Sound Design by cristian-vogel (Master) (8,370 points)
I don't understand why you would want to clear the accumulator - the whole point is creating a 'running' phase, isn't it?
You can set the accumulator to any value if you feed it the difference between the target value and its current value btw
That's because you haven't seen my context. I need to clear it, because some processing smears the integration forever, and I want to be able to reset it after some processing but not others. I currently use two integration circuits and keep one 'clean', but this feels like a workaround, and I am looking for useful suggestions.
OK, how about feeding it the difference between the target value and the current value then? That definitely works.
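The trick in the comment above - feeding the accumulator the difference between a target value and its current state - can be sketched in a couple of lines (again a hypothetical Python model, not Kyma; `step` just stands in for a delay with feedback 1):

```python
# A plain accumulator: new_state = state + input (delay with feedback = 1).
def step(state, x):
    return state + x

state = 0.7              # whatever phase has built up so far
target = 0.0             # where we want the accumulator to land

# Feed it (target - current state): the accumulator lands exactly on target.
# With target = 0 this acts as a 'clear' without any dedicated reset input.
state = step(state, target - state)
print(state)             # 0.0
```

Since `state + (target - state) == target`, this works for any target, not just zero.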

1 Answer

0 votes

Also, why is it that we are limited to the most basic FFT resolution, 2x overlap?

When they talk about overlapping FFTs, all they mean is that there is more than one FFT and that each FFT is operating on a different time window.

To do a 4X overlap, you could create a second FFT module (which already does 2X overlap), and delay its input by 1/4 of the FFT length. Now you have the spectrum at 4 different points in time.

Next take two iFFT modules, delay the output of one relative to the other by 1/4 the FFT length and add their outputs to recover the time domain signal.

To do more than 4X overlap, just repeat the process, using 1/OverlapNbr of the FFT length as the delay.
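As a sanity check of the scheme above, here is a numpy sketch of 4X overlap-add analysis and resynthesis: one FFT per quarter-frame hop, which is equivalent to two 2X-overlap FFT pairs with one path delayed by 1/4 of the FFT length. The periodic Hann window and the 1.5 normalization constant are my assumptions for perfect reconstruction, not anything Kyma-specific:

```python
import numpy as np

N = 1024
hop = N // 4                          # 4x overlap = hop of 1/4 the FFT length
# periodic Hann: its square summed over the 4 overlapping hops is a constant 1.5
win = 0.5 * (1.0 - np.cos(2.0 * np.pi * np.arange(N) / N))

rng = np.random.default_rng(1)
x = rng.standard_normal(N * 8)

# analysis: one windowed FFT every hop samples
frames = [np.fft.rfft(win * x[i:i + N]) for i in range(0, len(x) - N, hop)]

# resynthesis: window the iFFTs again and overlap-add at the same hop
y = np.zeros(len(x))
for k, F in enumerate(frames):
    y[k * hop:k * hop + N] += win * np.fft.irfft(F)

y /= 1.5                              # undo the constant window-squared sum
err = np.max(np.abs(y[N:-N] - x[N:-N]))   # ignore the un-overlapped edges
print(err < 1e-10)                    # True: reconstruction to float precision
```

Doubling the number of delayed analysis/resynthesis paths (halving the hop) gives 8X overlap in the same way, at the cost of the extra wiring discussed below.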

answered Jul 13, 2019 by ssc (Savant) (120,590 points)
That's true, but as soon as you want to do something with the data in between, it gets very hairy because of the parallel signal flow. It's impossible to process frame after frame.
I agree with Kymaguy... The issue is that FFT processing needs to happen only 'in the wire', and we cannot use Capytalk or any type of scripting. So as soon as I try to do any type of advanced processing inside the FFT block, it means doubling all the wiring of the graph to feed 4x - in stereo it gets really busy and difficult to debug.

8x overlap is very difficult to do on Pacarana because of this.
It's not only that. The real problem is you can't process frame after frame unless you increase the samplerate (or processing speed) between FFT and iFFT.