Post subject: VBA Rerecording Sound Synchronization for Gameboy
Former player
Joined: 9/3/2012
Posts: 40
Location: boston
I am curious how Gameboy videos are currently encoded, due to my recent experience with my Pokemon Yellow TAS.

When I created a demo encode for YouTube, I used my own special-purpose code to generate the final rendered file. This was a very simple implementation which output one PNG file for each frame of video and wrote all sound to a target WAV file. Internal emulation time was dilated to allow me to write these files regardless of the speed of my computer. The PNG files and WAV file were then muxed and encoded to video using ffmpeg.

My initial attempts were met with failure because the audio would gradually go out of sync with the video. Initially I thought that this was because the simulated Gameboy did not use exactly 60 fps as its framerate, so I tried various other plausible framerates. Eventually I found to my surprise that there is no constant framerate which could work.

So, I rewrote my A/V rendering code to record the exact amount of time that had elapsed according to the recorded sound, and to adaptively drop and duplicate frames to keep the frames synced to the sound at a constant framerate of 60 fps, within an imperceptible drift tolerance. I found that there are different conditions during simulation which cause the frames to go out of sync with the sound. During restarts and the bootup sequence, many frames must be added, and during normal operation, frames must occasionally be dropped, at a rate of about one frame every three to four seconds.

Source for this program is here: http://hg.bortreb.com/vba-clojure/file/aeb4b676ba8b/clojure/com/aurellem/run/final_cut.clj

After making this modification to my A/V rendering code, I was able to achieve a perfect encode of my 12 minute video using my rather under-powered laptop over the course of a few hours.

Listen to the encode of my TAS that was generated by my program, especially the beginning of the "My Little Pony" theme song at 12:20. Notice how there are no pops in any of the notes.
http://www.youtube.com/watch?v=p5T81yHkHtI

Now, listen to the official encode of my TAS, especially around 12:25, where the first note of the song plays.
http://www.youtube.com/watch?v=aYQpl8Jj6Yg

You will notice some pops in the audio when the "My Little Pony" theme song is played. You can also hear these pops during the "Pallet Town" song that plays at the start of each video. Listening to several other Gameboy encodes, I notice similar pops a few times a minute.

So, my questions are:
How are Gameboy TAS encodes rendered?
Why are there pops in the sound for many Gameboy TASs?
Is there something wrong with vba-rerecording itself that is creating these pops?
Would it be useful to add a command line option to vba-rerecording that would render a vbm file to a directory of images and a sound file? Something like:

    vba-rerecording <rom> --rendermovie <vbm> \
        --png-dir=<output> --audio-file=<audio>

This would of course do automatic frame dropping and duplication to make everything stay in sync.

As always, it's a pleasure to work with this great TAS community.
--Robert
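The drop/duplicate approach described in the post above could be sketched roughly as follows. This is not the Clojure code linked in the post, only a minimal Python illustration of the idea. It assumes the emulator can report how many audio samples it produced during each emulated frame (the hypothetical `samples_per_frame` list) and that the sound is recorded at 44100 Hz; neither detail comes from the original program.

```python
def plan_frame_repeats(samples_per_frame, sample_rate=44100, fps=60):
    """Decide how many times to write each emulated frame to the output.

    samples_per_frame: audio samples produced during each emulated frame
                       (hypothetical input, assumed available from the emulator).
    Returns a list of counts: 0 = drop the frame, 1 = keep it,
    2 or more = duplicate it.
    """
    repeats = []
    emitted = 0          # output frames written so far
    total_samples = 0    # audio samples recorded so far
    for n in samples_per_frame:
        total_samples += n
        # Output frames that should exist at this point in audio time,
        # i.e. round(total_samples * fps / sample_rate) in integer math.
        target = (2 * total_samples * fps + sample_rate) // (2 * sample_rate)
        count = max(0, target - emitted)
        repeats.append(count)
        emitted += count
    return repeats


if __name__ == "__main__":
    # A frame lasting twice as long as normal (in audio time) is duplicated.
    print(plan_frame_repeats([735, 1470, 735]))
```

Because the decision is driven by the accumulated audio time rather than by a fixed framerate, the video never drifts more than half an output frame from the sound under these assumptions. When every frame carries exactly 735 samples (44100 / 60), each frame is written exactly once.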
Post subject: Re: VBA Rerecording Sound Synchronization for Gameboy
Emulator Coder, Skilled player (1113)
Joined: 5/1/2010
Posts: 1217
bortreb wrote:
My initial attempts were met with failure because the audio would gradually go out of sync with the video. Initially I thought that this was because the simulated Gameboy did not use exactly 60 fps as its framerate, so I tried various other plausible framerates. Eventually I found to my surprise that there is no constant framerate which could work.
IIRC, the problem is caused by the Gameboy allowing games to turn the screen off. VBA then changes the framerate in unpredictable ways. There was also an issue where one run caused the emulator to hang when dumping. My guess at the cause is that the game first turns the screen off and then locks up, never turning the screen back on.
bortreb wrote:
How are Gameboy TAS encodes rendered?
From the internal VBA dumper. Interesting about the pops in the sound. I guess VBA isn't just duplicating frames to maintain A/V sync...
bortreb wrote:
This would of course do automatic frame dropping and duplication to make everything stay in sync.
Better to dump at the correct framerate, duplicating frames if needed (dropping should not be needed).
Joined: 10/3/2004
Posts: 138
I don't have an answer for the pops, but I do remember that back in the FCEU days here, I was experimenting with making DVD encodes of runs for my own personal viewing. I had found, with Bisqwit's input, that the framerate was ever so slightly less than 60 fps, yet due to the nature of AVI (or at least due to the emulator, I never figured out which was the case), the video was playing back at exactly 60 fps, causing the video to lead the audio, IIRC.

The solution I came up with was to determine the exact framerate that the emulator was actually running at, calculate the sample rate at which the audio would play back in sync, use Avisynth to make the audio play at that sample rate, and then resample to 48 kHz.

Perhaps something similar could be done with your PNG output technique to ensure the audio and video stay in sync without dropping/duping frames? Certainly a minimal pitch change from the forced sample rate and a minimal quality drop from the 48 kHz resample will be less jarring than even a single dropped/duped frame.

Slightly off-topic: I've not watched a lot of long Gameboy runs, but I would assume that if this had been an issue for standard video dumping, there would already be a solution. The reason I ask this here is that I'm planning on doing a non-TAS versus run with my girlfriend soon using VBA-rr as my emulator of choice (and including our live commentary as we play), and I want to make sure that I won't have any sync issues with a 30-35 minute segment length.
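The sample-rate adjustment described in this post comes down to one ratio, sketched below in Python. The 59.9 fps and 44100 Hz figures are placeholders chosen for illustration, not measured values for FCEU or VBA, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def synced_sample_rate(recorded_rate, true_fps, playback_fps=60):
    """Sample rate to declare for the audio so it stays in sync when the
    video is played at playback_fps instead of the emulator's true_fps.

    The audio has to be sped up by the same ratio as the video,
    so the declared rate is recorded_rate * playback_fps / true_fps.
    """
    return Fraction(recorded_rate) * Fraction(playback_fps) / Fraction(true_fps)


def pitch_shift_cents(true_fps, playback_fps=60):
    """Pitch change caused by that speed-up, in cents (100 cents = 1 semitone)."""
    return 1200 * math.log2(Fraction(playback_fps) / Fraction(true_fps))


if __name__ == "__main__":
    # Placeholder numbers: audio recorded at 44100 Hz, emulator at 59.9 fps.
    rate = synced_sample_rate(44100, "59.9")
    print(float(rate), pitch_shift_cents("59.9"))
```

The audio would then be declared to play at the computed rate and resampled to 48 kHz as a separate step. Note that this only helps when the emulator's framerate is constant; it cannot correct the varying drift described in the first post.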