There is a workaround to prevent sound glitching. It requires dumping the frame numbers in a specific manner.
Each checkpoint is fixed by loading a state from some frames back, as usual. However, that state is loaded NOT at the frame where it was saved (NOT when the movie gets to that frame), but only after the capture has already reached the last sync frame. eternalSPU then gets some extra frames to figure out how all the new data must be processed, and the frames where the sound is still glitched are cut out of the captured video.
For example: the desync is at frame 415. We save the tasSPU state at frame 400. We capture the movie up to frame 415 with eternalSPU. Then we LOAD the tasSPU state from frame 400. The movie replays the 15 frames from 400 onward, so in the encode, frame 430 will be equal to our movie frame 415. Right after the tasSPU state (from frame 400) is loaded (at frame 415), there is a sound glitch, so these replayed frames are the extra frames we give eternalSPU to recover.
But while doing this, we must dump another list: a list of AviSynth Trim commands with the corresponding frames. There's no command in AviSynth to cut out a bunch of frames, only one to pick out a segment, so we need to join all the segments we pick with Trim. But what frames to trim at?
As I said, we already have the properly sounding frames 0-415 from eternalSPU, so the first Trim command uses this segment. Then we shift the frame number forward by the difference between 400 and 415 (the extra frames offset, 15), which makes the second segment start at frame 430. Let's assume it needs to end at frame 810. At frame 810 we load the state from frame 800 (this means we have a new offset of 10 frames), so the next segment starts at video frame 820, and so on.
So here is the code that must be dumped to a separate text file. It can then be copy-pasted directly into our usual AviSynth script and used for encoding.
Language: avisynth
Trim(0, 415) + \
Trim(430, 810) + \
Trim(820, 1000)
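
The offset arithmetic behind this list can be sketched in a few lines of Python. This is only an illustration and not part of any emulator or plugin: the function name `trim_commands` and its input format are made up here. It assumes each checkpoint is given as a pair of captured-video frame numbers, (the last good frame before the state load, how many frames back the loaded state was saved), and that the capture ends at frame 1000 as in the list above.

```python
def trim_commands(checkpoints, last_frame):
    """Build the AviSynth Trim list as one string.

    checkpoints: list of (segment_end, rewind) pairs (hypothetical format).
      segment_end - last captured-video frame kept before a state load
      rewind      - how many frames back the loaded state was saved
    last_frame: final frame of the captured video.
    """
    segments = []
    start = 0
    for segment_end, rewind in checkpoints:
        segments.append((start, segment_end))
        # The frames replayed after the state load are glitched, so the
        # next good segment starts `rewind` frames after the load point.
        start = segment_end + rewind
    segments.append((start, last_frame))
    # Join with " + \" so the result can be pasted into an AviSynth script.
    return " + \\\n".join(f"Trim({a}, {b})" for a, b in segments)


# Checkpoints from the example: load at 415 (state from 15 frames back),
# then load at 810 (state from 10 frames back).
print(trim_commands([(415, 15), (810, 10)], 1000))
```

With the two checkpoints from the example, this prints the same three Trim lines as above. Every further checkpoint just adds one more pair to the list.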