Drag the video into AvsP (an AviSynth script editor) and add a new line containing "info.Crop(0, 0, 400, 300)". Press F5; the preview should show the top-left corner of the video with the clip information overlaid on it. Check whether the audio length differs greatly from the video length, or whether the video has an unusual FPS.
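For reference, the whole script might look roughly like the sketch below. The file name is made up, and AviSource is only an assumption about which source filter fits your capture; AvsP normally writes that line for you when you drop the video in.

```avisynth
# Hypothetical capture file; replace with your own.
AviSource("capture.avi")
# Overlay the clip properties (frame rate, lengths, audio format) on the picture.
Info()
# Keep only the top-left 400x300 region so the text is easy to read.
Crop(0, 0, 400, 300)
```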
To spot audio sync issues, the AviSynth plugin AudioGraph may be useful: use "ConvertToYUY2.AudioGraph(1)", scroll to the end of the video, and check whether the waveform matches the action on screen.
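A minimal sketch of such a check script follows. The plugin path and file name are hypothetical, and the LoadPlugin line is only needed if AudioGraph is not in your plugin autoload folder.

```avisynth
# Hypothetical plugin location; skip this line if the plugin autoloads.
LoadPlugin("C:\plugins\AudioGraph.dll")
# Hypothetical capture file.
AviSource("capture.avi")
# Convert to YUY2 first, as suggested above, then draw the waveform over the video.
ConvertToYUY2()
AudioGraph(1)
```

Jump to a spot near the end with an obvious sound (a hit, a menu beep) and see whether the spike in the waveform lines up with the frame where it happens.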
To fix sync issues, use the "TimeStretch" function with its "rate" parameter set slightly above or below 100 (the value is a percentage, so 100 leaves the audio unchanged). Testing in real time might be difficult, though, if your system can't handle playback.
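If you'd rather not guess the rate, here is a rough sketch that derives it from the two lengths. It assumes the drift is uniform over the whole clip (the audio simply runs a bit too long or too short) and not a constant offset; the file name and variable names are made up for illustration.

```avisynth
# Hypothetical capture file.
src = AviSource("capture.avi")
# Durations in seconds, taken from the clip properties.
video_seconds = src.FrameCount / src.FrameRate
audio_seconds = src.AudioLengthF / src.AudioRate
# A rate above 100 plays the audio faster (shorter); below 100 plays it slower (longer).
new_rate = 100.0 * audio_seconds / video_seconds
# Stretch or squeeze the audio so it ends together with the video.
TimeStretch(src, rate=new_rate)
```

For example, with 600 seconds of video and 603 seconds of audio this gives a rate of 100.5. Note that "rate" shifts the pitch along with the speed, which shouldn't be audible for corrections this small.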
N64 games should be fine at 640x480. Record to a lossless format for editing. For lossless RGB use the CamStudio codec (fast) or the ZMBV codec (small files, doesn't like 24-bit sources). For lossless YV12 use FFDShow's HuffYUV ("HFYU", "YV12", "Median", "adaptive tables" checked) or the original Huffyuv codec. (Emulators output in RGB.)
For downsizing, try the GaussResize function (BilinearResize should be fine for upscaling). Test a few values for the "p" parameter, which controls sharpness.
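One way to compare values of p is to put two versions side by side, something like the sketch below. The 320x240 target size, the two p values, and the file name are only examples, not recommendations.

```avisynth
# Hypothetical capture file.
src = AviSource("capture.avi")
# Lower p gives a softer picture, higher p a sharper one (the default is 30).
soft  = src.GaussResize(320, 240, p=20).Subtitle("p=20")
sharp = src.GaussResize(320, 240, p=60).Subtitle("p=60")
# Show both results next to each other for comparison.
StackHorizontal(soft, sharp)
```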
For the final encode I'd use x264 with a constant ratefactor (not constant quantizer). A value of 16 should look almost like the original; larger values compress more. For audio you can use the LAME codec in VirtualDubMod, or extract the audio to a WAV file, encode it on the command line, and then merge video and audio with MKVMerge GUI.