I strongly doubt that VirtualBox (or similar VM software) would be suitable for general-purpose TASing. In typical use it acts more like a hypervisor than a true emulator: guest code runs directly on the host CPU, so you're subject to the host machine's timing issues. I imagine it'd work in the same situations as Hourglass does.
VirtualBox does have software emulation support borrowed from QEMU, which would be a better starting point for a serious effort. QEMU is definitely not cycle-accurate, and from what I gather it's not always deterministic either. It looks like MARSSx86 is an attempt at a cycle-accurate system, leveraging code from QEMU.
Bochs is another x86 emulator that lacks many of QEMU's optimizations. That relative simplicity might make it easier to turn into a TAS-capable program, at the expense of slow playback. I can't find good discussion about how accurate and deterministic it really is.
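One way to get a feel for the determinism question would be to replay the same input log twice and compare a hash of the machine state after every frame. Here's a toy sketch of that idea in Python. The "machine" below is entirely made up and has nothing to do with the real Bochs or QEMU code; with a real emulator you'd presumably hash a savestate or memory dump instead. The `use_host_clock` flag is a hypothetical stand-in for any place where host timing leaks into the guest.

```python
import hashlib
import time


def run(inputs, use_host_clock=False):
    """Step a toy machine one frame per input; return a per-frame hash trace."""
    state = 0
    trace = []
    for frame, buttons in enumerate(inputs):
        # Hypothetical nondeterminism source: reading the host clock
        # instead of a virtual, frame-based clock.
        tick = time.perf_counter_ns() if use_host_clock else frame
        state = (state * 1103515245 + buttons + tick) & 0xFFFFFFFF
        trace.append(hashlib.sha256(state.to_bytes(4, "little")).hexdigest())
    return trace


def first_desync(trace_a, trace_b):
    """Return the first frame index where two traces differ, or None."""
    for frame, (a, b) in enumerate(zip(trace_a, trace_b)):
        if a != b:
            return frame
    return None


inputs = [0, 1, 1, 2, 0, 3]  # made-up input log, one value per frame
print(first_desync(run(inputs), run(inputs)))  # None: the two runs agree
```

Calling `run(inputs, use_host_clock=True)` twice and feeding both traces to `first_desync` would be expected to report a desync almost immediately, which is roughly the failure mode a hypervisor-style VM risks. Passing this kind of check only shows repeatability on one machine, not cycle accuracy.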