TLDR; By pressing buttons really fast, we can write and execute our own code and win the game. The code we can write is very limited, but we have more than enough tools at our disposal to make some magic happen.
You can use the comma and period keys to frame advance inside the youtube video, if you want to see it frame by frame.


  • Complete the game as fast as possible.
  • Load the credits without any visual bugs.

Additional Comments

This run is 28 frames faster than the previous run of 47 frames, and has been console verified by Bigbass.
The huge time save was achieved by setting up the various requirements without jumping to address $8FE3, which took up the majority of the previous TAS.
I created this run with the SubNesHawk core of Bizhawk 2.6.3, using the Trace Logger to know exactly where the program counter was at the end of each frame.

The DPCM Bug

The NES had a hardware issue called the DPCM Bug. You can read more about the bug here, but in summary, a hardware error existed that could drop a single bit when reading the button inputs.
To work around this bug, the game reads button inputs in a loop until two consecutive inputs are the same. Normally inputs are read twice in a frame, but should the DPCM Bug happen, the inputs would be read 3 times.
To exploit the workaround, we could press different buttons every time the inputs are polled. This would cause the input loop to happen as many times as we want! But “what good does that do us?” you might ask. Let's take a look at how my run exploits this, and the technical details behind it that allow for such a run to exist.

How the run works

(Heads up, there’s a decent amount of jargon in the rest of the author’s comments. This assumes you’re already familiar with a decent amount of 6502 Assembly)
The Non Maskable Interrupt (NMI) happens at the start of V-Blank, and the programmable interrupt (IRQ) is set to fire just a few scanlines above that. If we’re still executing code inside the NMI when the IRQ fires, the wrong PRG banks are currently loaded for the IRQ code. Address $101 is used as an “IRQ Mode” to determine which branches to take inside the IRQ code. In this case, the IRQ Mode is 0x20, which expects Bank 0x18 to be loaded in, but the NMI was using bank 0x1A to make some various graphical updates so the IRQ jumps us to the wrong place! This leads to an RTS instruction, except we jumped here using an interrupt, which should use RTI to properly return. Because of this, the wrong bytes are pulled off the stack as a return address, which leads the program counter to address $0001.
This part of RAM is filled with BRK instructions, which jumps to the programmable interrupt code. Every time this happens, since the IRQ Mode is still 0x20, the same problems occur. We enter the code as an interrupt, but exit the code with RTS, sending us back to $0001. This might sound like it’s an infinite loop, but luckily for us, the IRQ Mode is at address $101, which happens to be inside the stack! With each BRK leading to an RTS, the stack pointer is moving downwards 6 bytes. This will eventually reach the IRQ Mode, overwriting it with 0xB4. Now when the programmable interrupt is executed, it properly reaches an RTI instruction, so the stack pointer stops shifting around with every BRK, and we can continue to execute bytes after address $0001.
The NMI continues to happen every frame, and the buttons we press during the NMI are stored at addresses $17, $18, $F5, $F6, $F7, and $F8. Through careful planning, these bytes are all we need to win the game.

The limitations

Every frame we can choose what buttons to hold down on both controllers, and these fill in the bytes listed above in a specific manner. Let's look at addresses $17 and $18 first.
$17 is the total held buttons on controller 1
$18 is the newly pressed buttons on controller 1 for this frame.
For instance, suppose I wish to use these addresses to write A9 20. Let’s understand how button presses are used to write these hexadecimal numbers. Each button corresponds to a single binary bit of this number: (Button names have been shortened to a single character. Start is represented by a capital S; Select with a lowercase s)

A: 0x80
B: 0x40
s: 0x20
S: 0x10
U: 0x08
D: 0x04
L: 0x02
R: 0x01
So if I wish to hold down 0xA9, I need 0x80 + 0x20 + 0x08 + 0x01. (A + Select + Up + Right)
To write A9 20, I need to take two frames. For the first frame, I’ll push all the buttons except the ones required for address $18. That will allow me to press the new buttons on the next frame, and write out the full desired operation. So for the first frame, I would only press 0x80 + 0x08 + 0x01. That makes the following:
$17: 89 (Total held)
$18: 89 (New buttons)
And by continuing to hold down all of those previous buttons, and pressing 0x20 on the next frame, I can write:
$17: A9 (Total Held)
$18: 20 (New buttons)
You might see the problem already. Suppose I want to write A9 10. It’s not possible to hold down 0xA9 when the newly pressed buttons write 0x10, as that would make the total buttons 0x80 + 0x20 + 0x10 + 0x08 + 0x01 which adds up to 0xB9. The bits that compose address $18 are required to fit within the bits composing address $17.
Now let’s say I want to write BD BD 00. Since the third byte is 00 I don’t need to change that byte, so this could be written with just the two addresses, right? Well, let’s see. To write this, I need to press A, Select, Start, Up, Down, and Right. Super Mario Bros. 3 masks away Up + Down, as well as Left + Right, so the result would actually be 0xB1, as the Up and Down presses get removed. This limits what bytes we can create, as any bytes ending in 3, 7, B, C, D, E, or F are impossible to write with button presses.
Moving on to addresses $F5 through $F8
$F5: New buttons pressed for controller 1
$F6: New buttons pressed for controller 2
$F7: Total held buttons for controller 1
$F8: Total held buttons for controller 2
Just like the previous addresses, the bits that compose $F5 need to be contained within the bits composing $F7. The same applies for $F6 and $F8. This allows us to write much more than what can be written with addresses $17 and $18, though it too has its restrictions. Suppose I want to write 0A 20 5A B8. Like before, this takes two frames.
Frame 1:
$F5: 50 (New 1)
$F6: 98 (New 2)
$F7: 50 (Total 1)
$F8: 98 (Total 2)
Frame 2:
$F5: 0A (New 1)
$F6: 20 (New 2)
$F7: 5A (Total 1)
$F8: B8 (Total 2)
However, suppose I want to write A2 20 86 F9. This cannot be done, as A2 contains bits that 86 does not, when both need to be written with controller 1.

The win condition

The goal is to reach the princess cutscene and then the credits. Jumping to the princess cutscene is surprisingly easy, though the credits are where the issues come in. Like the IRQ, the NMI has a byte that determines which branches the code should take. If the NMI Mode isn’t set to 0x20, the credits fail to load properly, resulting in a black screen. This has been demonstrated by Masterjun, who was able to load the princess cutscene in 13 frames. That run doesn’t meet the win condition unfortunately, as the credits fail to load.
In order to win the game with the proper credits sequence, we need to meet 6 requirements.
  1. The $C000 bank needs to be 0x19
  2. The $A000 bank needs to be 0x18
  3. The PPU Control Register Copy (Address 0xFF) must be 0xA8
  4. The Stack Pointer must be greater than 0x30
  5. The NMI Mode (Address $100) must be 0x20
  6. Jump to $B85A
It’s worth mentioning that by the first frame we gain control, the first 3 requirements are met and will remain that way so long as we don’t change the PRG banks or affect the PPU Control Register. Let’s focus on the other 3 requirements. After the first frame with inputs, the stack pointer is at 0x03, and the NMI mode is 0xB4. As Masterjun and ais523 pointed out in their submission, there exists a location in ROM that sets requirements 5 and 6 for us! Address $8FE3, which is where the game prepares the princess cutscene under normal gameplay.
This leaves us with two options:
  1. Correct the stack pointer and jump to $8FE3
  2. Correct the stack pointer and set up requirements 5 and 6 manually.
While it may be trivial to jump to $B85A, jumping to $8FE3 is complicated. Let’s take a look at why we can’t simply jump to $8FE3.
Recall from the limitations, the game will mask away conflicting left + right, or up + down inputs. To write JSR $8FE3, we need to be holding down 0x8F on one controller, and 0xE3 on another. 0x8F requires holding down all direction buttons, and 0xE3 requires left + right. We could substitute 0xE3 for 0xE1, as $8FE1 would read the same code (only if the zero flag is not set), which fulfills requirement 5 and 6, but that still leaves us with the problem of 0x8F. On top of that, we need to consider the opcode, JSR, which if written using the new inputs would conflict with 0x8F, as holding down 0x20 would change the value to 0xAF. Comparing that with the much easier $B85A, with no conflicting inputs, it might take more time setting up a jump to $8FE3 than it would to meet the requirements ourselves.

The problems

  1. We need to be able to run through the button inputs in RAM more than once, so we’re going to need to write a jump to $0000. That would take up one of the bytes we could use from $F5 through $F8.
  2. Address $16 holds a value of 0x1E, which is a 3 byte opcode. Executing that address will skip right over $17 and $18, so our button presses won't be usable.
  3. Most of the space here is filled with 0x00, which writes BRK, a 2 byte long opcode. That makes us execute on either even or odd addresses. When executing around $16, we need to make sure we’re executing odd addresses to avoid skipping our button presses. This also assumes $15 is a 2 byte opcode. We don’t need to worry about that for our button presses at $F5, as the 0xFF opcode sitting at address $F0 will always align us with odd bytes.
  4. Addresses $10 and $15 are both used as timers, and change every frame. $10 counts down, while $15 counts up. Different values create opcodes of varying lengths, which could offset our code. On top of that, some opcodes crash the NES, so we need to avoid executing these addresses when they hold a problematic value.
  5. We’re unable to control the A register. Multiple bytes that execute between our stored button inputs can change A in unhelpful ways, or simply in a manner we can’t manipulate.
  6. We’re unable to control the Y register. Addresses $8D and $8E both hold 0xA0, which corresponds to LDY Immediate. This will guarantee that between the bytes we can control at address $18 and $F5, the Y register will get reset to either 0xA0 or 0x00.
  7. We still have to work with the limitations, which prevents us from writing any 3 byte long opcodes with our button presses.
  8. Any instructions that work on the zero page with offsets, such as STA $FF,X, are truncated to always remain within the zero page, so we can’t simply store our desired value into $100 by offsetting it past address $FF.

The solutions

We may not be able to use the A or Y registers, but we have full control over X. We could load 0x20 into X and store it to create a jump to $0000, so we no longer need to worry about the jump every time we reach $F5.
If we’re about to execute $15 when it’s going to crash the NES, we could simply stall inside the input loop for so long that we don’t execute address $15 until the following frame, where it will hold a different value.
Problem 2 wasn’t relevant due to the short length of the run, as address $15 only ever created 2 byte long instructions when I needed to use the button presses on $17 and $18.
I can utilize the stack to push 0x20 into $100 instead of writing there directly.

The execution

Instead of jumping to $8FE3 to fix the NMI mode, I simply push the number 0x20 on the stack when the stack pointer is in the right place. I can also adjust the stack pointer by executing JSR, or PHA, so I can easily manipulate the stack pointer to line this up right.
Bytes Instruction Description
A2 20 LDX #20 X now holds a value of 0x20, which is used to write JSR
(06 10) 86 F9 (ASL $10) STX $F9 This stores 0x20 right after the button presses at address $F9. When executed, this will bring the program counter back to $0000, while also pushing two bytes on the stack. The stack pointer is now 0x01.
(0A) 48 8A 48 (ASL A) PHA TXA PHA This pushes 2 bytes to the stack. First it pushes 0x00 at $101, transfers X (0x20) to A, and then pushes 0x20 at $100, which completes requirements 4 and 5, as the stack pointer is now 0xFF. The JSR back to $0000 happens again, pushing the stack pointer to 0xFD.
20 5A B8 JSR $B85A This jumps to the credits, completing requirement 6 and winning the game.
To reiterate, when I’m “stalling” inside the input loop, I’m switching between pressing the A button and no inputs between every instance where the controllers are polled.
The first frame of input is spent stalling inside the NMI, and can’t be used to set controller 1’s button presses.
A frame by frame explanation is as follows:
  • The game takes 10 frames to set up the title screen.
  1. Stall inside the NMI until the IRQ fires
  2. Stall to write the first half of LDX 0x20
  3. Stall to avoid a game crash at address $15
  4. LDX 0x20
  5. STX $F9
  6. Stall to write the first half of the next frame.
  8. Stall to write the first half of the next frame.
  9. JSR $B85A to win the game.

Special thanks to

  • Bigbass, who verified the run on console.
  • DwangoAC, who helped out with any TASVideos related questions I had.
  • Masterjun and ais523, who created the TAS in 00:00:00.78, which was an incredibly useful resource for learning about runs utilizing the SubNesHawk core.
  • My friend Tony, who listened to my ramblings in a discord call for several hours over multiple days as I pieced this run together.

Suggested Screenshot

Samsara: File replaced, fixing the VBlankCount to provide an accurate time. Oh, and judging.
feos: Fixed game hashsum in the movie.

Samsara: I'm stunned at the fact that you were able to improve the published run at all, let alone by this much from where it already was. This is the absolute pinnacle of technical TASing in my mind, the idea of having so much power over a single game that it's over literally before your hand even moves from the NES's power button. The current publication was more than halved. We've gone from "less than one second" to "less than a third of a second", now. According to a quick Google search, a human blink can last up to 400 milliseconds, meaning you have quite literally beaten Super Mario Bros. 3 in the blink of an eye. Someone can literally blink and miss all of the input in this TAS. There is a 5.263% chance that someone could guess my favorite frame in this run on their first try. This isn't just breaking the limit of what's possible with TASing, this is breaking breaking the limit... And you did this with your very first submission to the site, too (please please please let it be the first of many). I could come up with more ridiculous statistics, but I don't want to match the previous submission's judgement length. Excellent, excellent, excellent work!
On that note... I apologize for taking away from the submission by doing this, but I'd like to briefly address the previous judgement in order to clear something up. I'm admittedly not a very technical Judge - hell, whenever someone asks me about ASM, I give them a blank stare before responding with "31/f/cali" - but even if I was, I'd still think the previous submission's judgement was unnecessarily long and complicated. I'd like to think the audience reading these judgements is the same way, ideally down to also being 31/f/cali, ladies. It uses over 20000 characters to come to the conclusion that DPCM glitch TASes are only barely acceptable, and what I want to clear up is that barely is not the case. They are completely acceptable. I don't want the previous judgement to make TASers afraid of submitting DPCM glitch runs for fear of breaking some arcane rule etched in stone by a wizard. These runs are no different than any other "game end glitch". They'll be processed normally and fairly as they come in, and they're definitely something TASvideos wants to see more of.
With all that said, I'm happily accepting this technical masterpiece as an improvement to the published run!
Reminder to Publishers: This run has been console verified. Don't forget to flag it!
Spikestuff: Imagine if they go 25/any/vic and all I can say to that is Publishing.

Joined: 8/3/2004
Posts: 15177
Player (246)
Joined: 6/16/2020
Posts: 28
This might be an issue with the .bk2's metadata, but the run is supposed to be 19 frames with a time of 00:00.316. Is there any way to fix that? Edit: Thanks for fixing that Samsara!
Site Admin, Skilled player (1242)
Joined: 4/17/2010
Posts: 11373
Location: RU
Try replaying it until the last frame and saving the movie.
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Senior Judge, Site Admin, Expert player (2137)
Joined: 11/13/2006
Posts: 2799
Location: Northern California
User movie #75515128403815194 Seems like you can also change the VBlankCount in Header.txt to correct the time? Not sure if this is functionally the same as what feos suggested, I've never worked with this core before now. EDIT: Yeah, it seems functionally the same. Saving on final frame just sets VBlankCount to 19. I'll go ahead and replace the file. EDIT 2: Done.
TASvideos Admin and acting Senior Judge 💙 Currently on indefinite hiatus, taking care of family. Very infrequently posting on Cohost
warmCabin wrote:
You shouldn't need a degree in computer science to get into this hobby.
Player (5)
Joined: 8/30/2020
Posts: 42
Location: United Kingdom
lol wow Voting yes.
Player (51)
Joined: 4/1/2016
Posts: 285
Location: Cornelia Castle
Nice improvement, easy yes vote!
DJ Incendration Believe in Michael Girard and every speedrunner and TASer!
Editor, Player (172)
Joined: 4/7/2015
Posts: 331
Location: Porto Alegre, RS, Brazil
Quite a surprise seeing this category being improved, hell yes!
Games are basically math with a visual representation of this math, that's why I make the scripts, to re-see games as math. My things: YouTube, GitHub, Pastebin, Twitter
Skilled player (1667)
Joined: 2/23/2016
Posts: 1045
Didn't expect to see an improvement for this category. Impressive work! Yes vote.
My homepage --Currently not much motived for TASing as before...-- But I'm still working.
Joined: 3/10/2010
Posts: 899
Location: Sweden
I blinked. Then the game was over.
Editor, Publisher, Expert player (2411)
Joined: 10/12/2011
Posts: 6390
Location: The land down under.
digging up their subtitle stuff in the background Look it's already hard enough to figure out which frame should be the screenshot when there's a lot of similar gameplay between the two. But you managed to save that time by grabbing more red clouds. Anyways, nice improvement I guess? Here's my yes vote.
WebNations/Sabih wrote:
+fsvgm777 never censoring anything.
Disables Comments and Ratings for the YouTube account. Something better for yourself and also others.
Joined: 10/2/2005
Posts: 4032
I didn't realize this run *was* improvable and I'm delighted. Excellent work.
My Chiptune music, made in Famitracker: My twitch. I stream mostly shmups & rhythm games My youtube, again shmups and rhythm games and misc stuff:
Joined: 2/2/2021
Posts: 168
Location: Midwest
Definitely a yes vote from me. It's not only extremely fast, but also technically impressive. And it was a real thrill to see it run on console.
TAS Verifications | Mastodon | Github | Discord: @bigbass
Joined: 9/15/2021
Posts: 6
Location: france
In terms of technique, that's great For entertainement, it's zero ... Voted Meh ,almost voted No because we clearly see nothing of the game ... There are many game ending glitches where you actually see a bit of gameplay , seeing something happening (I'll take Megaman as an example) to trigger it .... Here, nothing ...
Expert player (2466)
Joined: 12/23/2007
Posts: 822
Awesome TAS, just from a technical standpoint. Few people would expect to see game end in 0.32 seconds. Sorry that I don't know much about the technical details, but are there lots of NES games can be done this way just like SMB3? What about SMB1? Thanks.
Recent projects: SMB warpless TAS (2018), SMB warpless walkathon (2019), SMB something never done before (2019), Extra Mario Bros. (best ending) (2020).
Site Admin, Skilled player (1242)
Joined: 4/17/2010
Posts: 11373
Location: RU
HappyLee wrote:
Sorry that I don't know much about the technical details, but are there lots of NES games can be done this way just like SMB3? What about SMB1? Thanks.
Not many:
Masterjun wrote:
jlun2 wrote:
How many games out of the NES library have the DPCM bug?
Every game on a NES console has the DPCM bug. Not all games implement a DPCM bug workaround. And even less games have a DPCM bug workaround that can be exploited. And in most of the cases where you can exploit from the first frame of input, you're just going to end up with a crash you can't manipulate. The principle applied here is not a rare way to trigger the credits, but a rare way to trigger a crash.
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Post subject: The Fastest TAS, Even Faster
Joined: 1/4/2019
Posts: 78
Location: Canada
Okay, seriously, this was already the shortest published category on TASVideos, did we really have to make it shorter?!
Former player
Joined: 6/30/2010
Posts: 1101
Location: Zurich, Switzerland
This is a very impressive technical achievement, but from the perspective of a viewer, I can only vote "no". There is no entertainment here because there is no visible gameplay whatsoever.
Current project: Gex 3 any% Paused: Gex 64 any% There are no N64 emulators. Just SM64 emulators with hacky support for all the other games.
Editor, Expert player (3716)
Joined: 11/30/2014
Posts: 2778
Location: US
Nice improvement, great work and yes vote. The previous run actually relied on a emulation issue that has since been fixed, so it's nice to have a consistent run for this category. (The previous run could still be console verified since the emulation error (MMC3 IRQ timing) could coincidentally be cancelled out by the variability of NES power up state.)
Skilled player (1720)
Joined: 9/17/2009
Posts: 4957
Location: ̶C̶a̶n̶a̶d̶a̶ "Kanatah"
I never knew the previous run was even improveable, nice work! You seem to have knowledge on the NES. Do you want to try the DPCM bug run for SMB2? There was a concept video but never went further.
Player (26)
Joined: 8/29/2011
Posts: 1206
Location: Amsterdam
Lol, I misread that as "32 seconds". And then before I noticed, the run was already done. Yes vote, this is funny stuff.
Editor, Player (44)
Joined: 7/11/2010
Posts: 1025
HappyLee wrote:
Awesome TAS, just from a technical standpoint. Few people would expect to see game end in 0.32 seconds. Sorry that I don't know much about the technical details, but are there lots of NES games can be done this way just like SMB3? What about SMB1? Thanks.
This sort of run exploits a bug in the game's controller read routines. Different games use different read routines – some of them are exploitable, some of them aren't. SMB1's controller read routine is very simple and doesn't have this sort of bug in. (The reason games might want to use a more complex read routine is to work around a bug in the NES in which the DPCM unit occasionally causes incorrect data to be read from the controller, but if I remember correctly, SMB1 doesn't use the DPCM unit and thus it has no need to try to "fix" the controller data – it'll always be read correctly first time. The DPCM unit has no effect on this glitch itself; the DPCM glitch is the reason games use complex and potentially buggy controller read routines, but not the cause of the bug exploited here. That bug being exploited here is a different bug which was accidentally introduced by the SMB3 programmers while trying to fix the DPCM bug, as opposed to being the DPCM bug itself.) Even when the controller read routines are buggy, the bug doesn't necessarily do anything useful. With SMB3, we're kind-of lucky that the effect of the bug is to start running zero page (which contains some important registers like the controller registers) as a program, but other games might just crash, and it isn't always possible to manipulate the crash.
Joined: 7/15/2017
Posts: 199
Location: Kentucky
This better not just get Vault/Standard like the last iteration did.
I like to comment on submissions and look around the site. You have probably seen me before (if you have been around for a while) either on the site, Discord, or any other social media. I recently took up making temporary encodes for new submissions. Also, I never forget to greet Tompa wherever I find him! "when resyncing stuff sucks it's called Resuccing" - EZGames69 “If an emulator stops being accepted to the site it should be called an emuLAMEr” - EZGames69 "oh no discord, everything I say will now be logged forever, sdfsdf, time to hide" - Masterjun "just had to give therapy to a taxi with daddy issues" - psx Current Projects: Mother 3 (75% complete)
Publisher, Reviewer, Expert player (4187)
Joined: 5/29/2017
Posts: 2730
Location: Michigan
xxezrabxxx wrote:
This better not just get Vault/Standard like the last iteration did.
The Class system isn't based on entertainment like Tiers were. Standard is given to all any% and 100% movies (with some middle ground categories but that discussion is still in progress). This movie falls under the any% category so if accepted, it will be in standard. This debate over entertainment value is exactly why the tier system was overhauled.
[14:15] <feos> WinDOES what DOSn't 12:33:44 PM <Mothrayas> "I got an oof with my game!" Mothrayas Today at 12:22: <Colin> thank you for supporting noble causes such as my feet MemoryTAS Today at 11:55 AM: you wouldn't know beauty if it slapped you in the face with a giant fish [Today at 4:51 PM] Mothrayas: although if you like your own tweets that's the online equivalent of sniffing your own farts and probably tells a lot about you as a person MemoryTAS Today at 7:01 PM: But I exert big staff energy honestly lol Samsara Today at 1:20 PM: wouldn't ACE in a real life TAS just stand for Actually Cease Existing
Joined: 6/8/2020
Posts: 19
Location: São Paulo, Brazil
Nice improvement. Voting "yes".
A more stupid human being than you think.
Judge, Skilled player (1303)
Joined: 9/12/2016
Posts: 1645
Location: Italy
Just like for the previous submission, I voted No. For me entertainment means to get my attention glued to the screen for at least some time, closely following gameplay that is visible on screen and that displays multiple different situations. I agree that it's also funny to see a game getting beaten before it starts, but for me it's not remotely enough to make me content. I'd be much more happy to work on making such a TAS, rather than watching it. Or maybe watching the reaction of other people watching it for the first time... after which they will never have the same reaction again.
my personal page - my YouTube channel - my GitHub - my Discord: thunderaxe31 <Masterjun> if you look at the "NES" in a weird angle, it actually clearly says "GBA"