Zoom Audio Buffering Emulator / by Les Stuck


In 2020 we spent a lot of time in Zoom meetings, and I was intrigued by the way audio degraded when the network connection was bad. It was obviously an FFT-based algorithm, but done well. I liked the way it would freeze frames, and when the connection improved, it would play all the delayed audio really fast to catch up.

zoomba is a phase vocoder delay: it records FFT frames at the regular FFT rate, but you decide how fast those frames are played back. The playback rate can be beat-synced or set as a frequency. At the initial value of 0, the device should pass reasonably clean audio, playing frames back in sync with the FFT. A slow LFO plays the frames back more slowly, building up a temporal backlog in memory. Clicking the “catch up” button plays the backlog back quickly. When the LFO is very slow, you can also vary the maximum frame duration to choose between legato (a long maximum duration) and staccato (a short one).
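
As a rough illustration of the record/playback split described above, here is a toy Python model. It is only a sketch, not the device's actual implementation: frames are stand-in values rather than real FFT data, and the names `FrameDelay`, `advance_every`, `max_hold`, and `catch_up_speed` are hypothetical. It also assumes that a frame falls silent once it has been held longer than the maximum frame duration, which is just one reading of the legato/staccato behavior.

```python
from collections import deque


class FrameDelay:
    """Toy model: one frame is recorded per tick; playback advances at its own rate."""

    def __init__(self, advance_every=1, max_hold=None, catch_up_speed=4):
        self.backlog = deque()                # recorded frames not yet played
        self.advance_every = advance_every    # ticks between playback steps (1 = in sync)
        self.max_hold = max_hold              # ticks a frame may sound (None = no limit)
        self.catch_up_speed = catch_up_speed  # max frames played per tick while catching up
        self.catching_up = False
        self.current = None                   # frame currently being held
        self.held = 0                         # ticks the current frame has been held

    def catch_up(self):
        """Start draining the backlog quickly (the assumed 'catch up' behavior)."""
        self.catching_up = True

    def tick(self, frame):
        """Record one incoming frame; return the list of frames sounded this tick."""
        self.backlog.append(frame)
        if self.catching_up:
            # Play several backlog frames in one tick until nothing is left.
            n = min(self.catch_up_speed, len(self.backlog))
            out = [self.backlog.popleft() for _ in range(n)]
            self.current, self.held = out[-1], 1
            self.catching_up = bool(self.backlog)
            return out
        if self.current is None or self.held >= self.advance_every:
            # The playback clock fired: move on to the oldest unplayed frame.
            self.current, self.held = self.backlog.popleft(), 0
        self.held += 1
        if self.max_hold is not None and self.held > self.max_hold:
            return []  # frame exceeded its maximum duration: silence (staccato)
        return [self.current]


if __name__ == "__main__":
    delay = FrameDelay(advance_every=3)      # slow playback: each frame freezes for 3 ticks
    for i in range(6):
        print(i, delay.tick(i), "backlog:", len(delay.backlog))
    delay.catch_up()
    for i in range(6, 8):
        print(i, delay.tick(i), "backlog:", len(delay.backlog))
```

With `advance_every=1` the model passes frames straight through. Larger values freeze each frame for several ticks while the backlog grows, and `catch_up()` drains that backlog several frames at a time. Setting `max_hold` lower than `advance_every` leaves gaps between frames, which is the staccato case under the assumption stated above.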

Try it with speech: increase the latency by decreasing the LFO rate, try catching up, and try small and large maximum frame durations. It's a simple effect, but it may help us remember a year in which much of the human speech we heard was passed through a nice phase vocoder delay.