Waveform Autoencoding at the Edge of Perceivable Latency

Franco Caspe; Andrew McPherson; Mark Sandler

Format: poster
Session: posters-1
Paper PDF link
Presence: in person
Duration: 5
Type: short

Abstract:

We introduce an audio plugin implementation of BRAVE, a waveform autoencoder presented recently, that affords Neural Audio Synthesis with low latency and jitter. As a redesign of the well-known RAVE model, BRAVE introduces a series of architectural modifications for supporting instrumental interaction with almost imperceptible latency (<10 ms) and jitter (~ 3 ms). By comparing both designs, we highlight key architectural differences between the models that impact their instrumental performance capability, arguing that no model fits all purposes, and calling for their careful selection for each interactive design. Finally, we discuss challenges and opportunities for leveraging low-latency waveform autoencoders to develop interactive systems, such as Digital Musical Instruments, that can foster control intimacy through enhanced responsiveness and space for nuance.