Submission #6274: Lord_Tom, Maru & Tompa's NES Super Mario Bros. 3 "warps" in 10:24.34

Nintendo Entertainment System
FCEUX 2.2.3
Submitted by Maru on 2/17/2019 6:19:31 PM
Submission Comments
Well, we missed submitting this by the 29th anniversary of the game for the U.S., but it is an improvement nonetheless! This TAS (10:24.34) is 36 frames faster than our published run of 10:24.94 and 76 frames faster than Lord Tom, Mitjitsu, and Tompa's run from 2010. All of the new improvements are from the World 8-Airship boss fight and on. This TAS has been console verified by dwangoAC. Go TASBot, go!
Since this run is essentially a continuation of our previous effort, and a lot of input is reused from that TAS, we have decided to include comparisons against both the published run (the 2018 TAS) and the 2010 run.
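As a rough sanity check on the frame counts quoted above, the gap between two final times can be converted back into frames. This is a minimal sketch, assuming the commonly cited NTSC NES frame rate of about 60.0988 frames per second; the helper names are hypothetical and not part of FCEUX or any TAS tool.

```python
# Assumption: NTSC NES video runs at roughly 60.0988 frames per second.
NES_FPS = 60.0988


def time_to_seconds(stamp: str) -> float:
    """Convert a 'minutes:seconds.hundredths' time such as '10:24.34' to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def frame_gap(slower: str, faster: str) -> int:
    """Approximate number of frames separating two final times."""
    return round((time_to_seconds(slower) - time_to_seconds(faster)) * NES_FPS)


# Published run (10:24.94) versus this submission (10:24.34).
print(frame_gap("10:24.94", "10:24.34"))  # 36
```

The times are only given to hundredths of a second, so this recovers the frame gap approximately; the movie files themselves are the authoritative source for exact frame counts.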
World 1:
World 1 is exactly the same as the published run. But to recap, we saved 14 frames over the 2010 run.
In 1-1, we had saved six frames by using GlitchMan's improved mushroom grab. This mushroom grab is faster because it grabs the mushroom with enough speed ($bd) to clear the pit without slowing down. While it was possible to remove the lag frame by using a tweak on the old mushroom grab, doing so would not improve the fastest overall time.
It was possible to save a frame in 1-1 by grabbing the mushroom with a higher speed unit ($bd = 9 instead of $bd = 8), but that frame is eaten up by hammer brother movements after 1-2. Therefore, it is not implemented in this run.
2010 TAS: 6 frames ahead 2018 TAS: 0 frames ahead
In 1-2, we saved three frames over the 2010 TAS by not having to delay frames to manipulate the hammer brothers.
2010 TAS: 9 frames ahead (+3 frames) 2018 TAS: 0 frames ahead
In 1-3, we saved four frames over the 2010 TAS with two improvements: 1) clipping into the block to grab the leaf and 2) RAT926's improved jump to the white block. The leaf grab saves two frames, and it would be even faster if not for the need to preserve P-Speed to the white block. Additional frames were saved by using a duck jump to the white block, which does not cancel the tail flip and allows for the white block timer for ducking on the block ($570) to be started sooner.
Note that 1-3 can be improved by an additional four frames. The leaf grab can be improved by one frame through performing a corner clip through one of the wood blocks before clipping into the leaf powerup block. RAT926 also found that it was slightly faster to use two small jumps to the white block instead of one big jump to the white block. In addition, the published run delayed one frame in the toad house because of hammer brother manipulation, but finishing the level three frames faster gives a good hammer brother movement (four frames faster did not give a good hammer brother movement, so one frame had to be lost).
2010 TAS: 13 frames ahead (+4 frames) 2018 TAS: 0 frames ahead
In 1-Fortress, we had saved 1 frame over the 2010 TAS through better movement under the roto-disc. Note that there is a three frame speed/entertainment trade-off in this level. The stuttering movement up until the door leading to the warp whistle cost about three frames, but it is more entertaining than simply running towards the door.
2010 TAS: 14 frames ahead (+1 frame) 2018 TAS: 0 frames ahead
While it was possible to save 4 frames in World 1 (three frames faster in 1-3 and one frame faster in 1-Fortress due to better hammer brother luck) or get a good hammer brother movement from saving three frames in 1-F, these savings ultimately do not amount to anything. In 8-Tank, Mario has to fight a boomerang bro, and in order for the fight to be completed as fast as possible, the boomerang bro needs to move in a specific way. Getting the next optimal boomerang bro pattern would involve starting the boss fight in 8-T at least five frames faster. However, that is a little problematic because further improvements to World 1 give a series of bad hammer brother movements. In order to save time in 8-Tank, World 1 would need to be completed at least 13 frames faster than our run.
We did attempt removing the lag frame from 1-1 by using a tweak on the old mushroom grab and incorporating our 1-3 and 1-F improvements. However, this only saved 13 frames in World 1 compared to the 2010 TAS (1 lag frame + 12 non-lag frames), which was not enough to reach the optimal boomerang bro pattern.
Score Manipulation:
As in the published run, score manipulation is used throughout this TAS to ditch lag. We have copied and pasted an explanation of how it works:
"We had noticed some variation in the number of lag frames required to enter the pipe at the end of the various autoscrollers: sometimes it'd be 9, sometimes 10. It didn't seem to be affected by enemy position, etc, but Tompa noticed on 8-Navy that by waiting atop the pipe, the amount of lag would change every 20ish frames: 9, 10, 9, 10, in an endless cycle. By dumping traces and comparing instruction counts, Lord Tom found the cyclical phenomenon seemed related to the sound code. This made sense, because the sound code executes during lag frames -- otherwise the music would lag and would sound awful.
This insight didn't prove useful, but led to one that did. Using the same technique to investigate an extra lag frame at the start of 8-Tank2, we found a quite unexpected cause: the score display!
The score is stored as 3 bytes at $715-7. But converting those bytes to a six-digit, base-10 number (plus a zero on the end) is actually quite a chore for the NES's little 6502 processor. Basically, a loop runs for each digit in the display -- the higher the digit, the more iterations of the loop. Those extra instructions can add up; for frames where it's a close call whether or not lag will occur, having a good (or bad) score can make the difference.
A good score is one where the sum of each digit is low, so 9999990 is the worst and 0 is the best, and 1110000 is much better than 9990. To check a given score, we'd just divide by 10 and convert to hex, e.g. for 224,030 we'd google '22,403 in hexadecimal', then use the hex editor to enter 00 57 83 starting at $715 and see if it lagged. It was surprisingly painful to aim for the scores we wanted, chained from level to level as the autoscrollers have hundreds of individual scoring events, the escalation of values as you stomp enemies make certain scores much easier to achieve than others, and the timer countdown at the end of each level affects the 10's place, which we can otherwise only change via coins (50) and block-breaking (10)."
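The score check described in the quote can be sketched in a few lines. This is only an illustration, assuming the three bytes at $715-$717 hold the displayed score divided by 10 in big-endian order (which is what the "00 57 83" example implies); the function names are hypothetical, and the digit sum is just the rough "good score" proxy from the quote, not a model of the game's actual instruction counts.

```python
def score_to_bytes(displayed_score: int) -> bytes:
    """Return the three bytes to enter at $715-$717 for a displayed score.

    Assumption: the game stores displayed_score // 10, most significant
    byte first, as in the quoted example (224,030 -> 00 57 83).
    """
    if displayed_score % 10 != 0:
        raise ValueError("the displayed score always ends in a fixed zero")
    return (displayed_score // 10).to_bytes(3, "big")


def digit_sum(displayed_score: int) -> int:
    """Rough lag proxy from the quote: a lower digit sum means fewer loop iterations."""
    return sum(int(digit) for digit in str(displayed_score))


print(score_to_bytes(224030).hex(" "))  # 00 57 83
print(digit_sum(1110000))               # 3
print(digit_sum(9990))                  # 27
```

Under this proxy, the score of 100,000 used when entering the 8-Tank pipe has a digit sum of 1, which fits the description of a "nice round score".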
World 8:
In 8-Tank, we saved three frames over the 2010 TAS. Two frames were saved by not having to delay time to get the optimal boomerang bro routine, and another frame was saved with score manipulation, entering the pipe with a nice round score of 100,000.
2010 TAS: 17 frames ahead (+3 frames) 2018 TAS: 0 frames ahead
One thing to note about score manipulation to reduce lag leading to Boom-Boom boss fights is that it only reduces lag frames. It actually costs one non-lag frame. If you enter the pipe with an "unoptimal" score, it is possible to manipulate the y-subpixel so Boom-Boom charges earlier, knocking off one frame from the boss fight. The 8-Airship boss fight is an exception to this because subpixel manipulation can save a frame on the Boom-Boom boss fight regardless of lag.
In 8-Navy, we did not save any time compared to the 2010 and 2018 TASes.
2010 TAS: 17 frames ahead 2018 TAS: 0 frames ahead
Unfortunately, this TAS and the published run had worse luck with the Hands compared to the 2010 TAS. A hand grabs you based on RNG from $781-9. We tested saving one non-lag frame during the Boom-Boom boss fight in 8-Navy and saving a second frame in the underground warp zone to the Hands, but we had to lose both frames in order to not be grabbed by the first hand. The luck with the hands was rather unpleasant.
2010 TAS: 12 frames ahead (-5 frames) 2018 TAS: 0 frames ahead
In 8-Airship, we saved eight frames over the 2010 TAS. Six frames were saved by digging further into the right edge of the screen, which we did by building up P-Speed and making as large a jump as possible from the last platform before the pipe. Two additional frames were saved through a combination of score manipulation and better subpixel management with the Boom-Boom boss fight (one frame at the start of the level and another frame entering the pipe).
2010 TAS: 20 frames ahead (+8 frames) 2018 TAS: 1 frame ahead (+1 frame)
In 8-1, we saved three frames over the 2010 TAS and one frame over the 2018 TAS. Two frames came from a type of corner boost that was not present in the 2010 TAS along with not having to manipulate the card because we had two flowers instead of two stars. The next frame was hard-won. Tompa was able to maneuver around the bullet bill cannon and blue wall faster with precise subpixel management. This was ultimately the key because we no longer needed to sacrifice pixels to perform the corner boost. This also allowed us to incorporate an additional corner clip through one of the bullet bill cannons. These things allowed us to save a third frame in 8-1.
2010 TAS: 23 frames ahead (+3 frames) 2018 TAS: 2 frames ahead (+1 frame)
In 8-2, we managed to save six frames over both the 2010 and 2018 TASes. The piranha plants during the second part of the level were the limiting factor for how fast we could complete 8-2, and past attempts at improving this level had not yielded anything. What we did know was that if we could reach the top of the slope two frames faster, framerules would allow us to get speed 62 from the slope instead of speed 61, which would save another four frames (six frames in total). Maru noticed a couple of things that would make this improvement possible. First, it was possible to save pixels near the piranha plants by ducking for a frame, as ducking reduces Mario's hitbox size. This alone, however, did not do any good. Eventually, Maru realized that it was possible to despawn one of the piranha plants at the start of the level, which would allow Mario to run down one of the slopes longer. The frame that was saved from the slope boost, combined with the improved maneuvering around the piranha plants (hitbox abuse), allowed Mario to reach the slope two frames faster, get speed 62 from the slope, and save six frames overall in this level.
2010 TAS: 29 frames ahead (+6 frames) 2018 TAS: 8 frames ahead (+6 frames)
In 8-Fortress, 22 frames were saved over the 2010 TAS using the magnificent star route that Lord Tom discovered. The following is copied and pasted from the 2018 TAS to recap the logistics of the route:
"This submission contains the first significant route change to 8-F since Morimoto's landmark TAS in 2003. Lord Tom proposed it early in our work on this run, but all of us worked heavily in optimizing it. There's a lot going on, so it's worth breaking the route down step by step.
1. 1st Star
The major challenge here is to grab the star as quickly as possible while still being able to clip through the blocks to the right. Performing the clip requires a duck-jump at a certain horizontal speed (33+). Unfortunately, while it is possible to use a duck jump to grab the star, it cannot be done with a high enough speed to perform the clip - it isn't even close. As a result, the star is grabbed from the block's left side. As well, the star grab and clip must be performed without touching ground for more than a frame at a time - else more P-meter power is lost performing the clip, along with several frames afterward.
2. 2nd Star
One might wonder why we grab the first star at all if there's another one just down the way. Well, the reason is that if you don't have star power active when you hit the 2nd star's block you'll get a coin rather than a star! This grab was perhaps the most laborious portion of the route to optimize. Again, to preserve P-speed we need to avoid spending more than one frame on the ground prior to grabbing the star. As with the first star, the game's collision detection makes it impossible to grab a power-up via a duck jump with appreciable horizontal velocity, so we stand up for a frame as we leave the block to take advantage of standing-Mario's bigger hit box. After grabbing the star, we need enough speed (49+) to clip through the blocks ahead, which is just possible.
3. P-Switch
Like the 1st star, hitting the P-switch might seem a needless detour, but the door to BoomBoom only exists when a P-switch is active, so it can't be skipped. A shout-out here to Tompa for finding a type of block clip we hadn't seen before - to go through the wall we duck-jump to fall into the block that's sticking out, then "land" as if to do a wall-jump against the wall beyond it. Now, since we've landed but are also ducking inside a block, we clip through the wall without having to stand up. With optimal conditions, there's just enough time to get through the wall, hit the P-switch and jump up to it before losing additional P-meter.
4. Spike Room
We tried soooo many strats in here; this was the fastest we found. Like the published run, we go through the door as far left as is useful to let us accelerate, jumping coming out of the door to preserve P-speed. Since we have the star, though, we can duck-jump under the spikes! At least, we can until the star runs out, which we manage to delay long enough to slide out without losing P-speed. Unlike the published run, our P-switch is still active at this point, which means we don't get extra speed from the moving walkway, but it's still faster to run right before firing our fireballs.
5. Boom Boom
Once BoomBoom is active, the moving walkway starts again; we stay on the ground as long as possible to take advantage. BoomBoom is defeated 28 frames faster. Despite this, there's less time left on the level timer (378 vs 380) because we used fewer doors, during which the timer doesn't run. Unfortunately, the level timer counts down digit by digit, though. It takes 7 more frames to count down from 378 than it does from 380, so we lose those frames after the ? ball is grabbed and finish the level 21 frames faster."
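The timer arithmetic in the quoted step 5 can be illustrated with a toy model. This is a sketch under a loud assumption: that the "digit by digit" countdown costs one frame per unit of each timer digit, so that the cost is proportional to the digit sum. The source does not describe the routine in that detail, so `countdown_cost` is hypothetical; it is used here only because it reproduces the differences quoted in these notes.

```python
def countdown_cost(timer: int) -> int:
    """Toy model (assumption): one frame per unit of each timer digit."""
    return sum(int(digit) for digit in str(timer))


# 8-Fortress: 378 left on the timer instead of 380.
extra_frames = countdown_cost(378) - countdown_cost(380)
print(extra_frames)       # 7

# BoomBoom dies 28 frames sooner, minus the longer countdown.
print(28 - extra_frames)  # 21
```

Only the differences between two timer values are meaningful in this model. It also gives a one-frame difference between 376 and 375, which matches the frame reported lost in Bowser's Castle, but that agreement does not confirm how the game actually implements the countdown.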
Because of the precision required for this route to work, it is very difficult to find improvements to this level. The first room requires precise subpixels to maneuver around the roto-discs without losing time. Furthermore, the star grabs required precision down to subpixels. In fact, the wall clip after grabbing the first star is only possible with a single subpixel. Finally, acceleration framerules ($55d) needed to line up for the star grabs to work and also to quickly rebuild P-Speed after hitting the P-Switch.
So, Maru decided to look at the spike room again. He was able to save a frame in the spike room by manipulating the y-subpixel ($75f) upon entering the door to give one more frame of airtime. That one extra frame of airtime is one less frame spent ducking on the conveyor belt.
2010 TAS: 51 frames ahead (+22 frames) 2018 TAS: 9 frames ahead (+1 frame)
In 8-Tank2, we did not save any time compared to the 2010 or 2018 TASes. We did, however, have to manipulate the score to create lag in order to save one non-lag frame on the Boom-Boom fight. RNG does not change during lag frames, so this was necessary for this improvement to happen.
2010 TAS: 51 frames ahead 2018 TAS: 9 frames ahead
In 8-Bowser Castle, we did not have to delay any frames to manipulate the Bowser pattern. The Bowser pattern is the ultimate limiting factor for these TASes because Bowser needs to hop six times for us to kill him with the Quick Drop before he makes his big jump. We tried to squeeze as much as we could out of Bowser's Castle, and our final result was 25 frames faster than the 2010 TAS and 27 frames faster than the 2018 TAS. At first, we had only managed to save 26 frames compared to the published run. Maru was able to save a frame clipping into the thwomp room wall, but that frame was eaten up by acceleration framerules ($55d). If just half of a pixel were saved, it would be possible to make the duck jump in the statue room and land on the checkered floor to gain P-Speed a frame earlier. Maru was able to save that half of a pixel by clipping into the wall with a lower subpixel value, which allowed Mario to gain a small boost after duck jumping off the wall. One frame was lost because we had 376 on the timer instead of 375, but the next Bowser pattern was achieved nonetheless!
2010 TAS: 76 frames ahead (+25 frames) 2018 TAS: 36 frames ahead (+27 frames)
The next six-hop Bowser pattern after this would involve having to start the Bowser fight 25 frames ahead of this TAS. We hope that it will happen someday. Because we were able to reach the next Bowser pattern, there are no speed/entertainment tradeoffs with the 8-Fortress star route anymore. Hooray!
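For readers who want to follow the running tallies, the per-level "frames ahead" lines can be turned back into per-level savings. This is a minimal sketch: the numbers are copied from the tallies in these notes, while the table layout and the function name are hypothetical conveniences.

```python
# (level, frames ahead of the 2010 TAS, frames ahead of the 2018 TAS),
# copied from the running tallies in the notes above.
RUNNING_TOTALS = [
    ("1-1", 6, 0),
    ("1-2", 9, 0),
    ("1-3", 13, 0),
    ("1-Fortress", 14, 0),
    ("8-Tank", 17, 0),
    ("8-Navy", 17, 0),
    ("Hands", 12, 0),
    ("8-Airship", 20, 1),
    ("8-1", 23, 2),
    ("8-2", 29, 8),
    ("8-Fortress", 51, 9),
    ("8-Tank2", 51, 9),
    ("8-Bowser Castle", 76, 36),
]


def per_level_deltas(totals, column):
    """Turn a cumulative 'frames ahead' column into per-level gains or losses."""
    deltas = {}
    previous = 0
    for row in totals:
        deltas[row[0]] = row[column] - previous
        previous = row[column]
    return deltas


vs_2010 = per_level_deltas(RUNNING_TOTALS, 1)
vs_2018 = per_level_deltas(RUNNING_TOTALS, 2)
print(sum(vs_2010.values()), sum(vs_2018.values()))  # 76 36
```

The two sums match the 76-frame and 36-frame improvements stated at the top of the submission.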
Lord Tom's Comments:
I was mainly a cheerleader for this second effort, but in a lot of ways this is the SMB3 Warps run I've been waiting for since I started with this game, one where the sum total of lots of great improvements come together -- and then DON'T get hammered by bad RNG at the end. Maru did such a great job conjuring frames, then Tompa finished it off finding that last frame in 8-1. Enjoy!
Maru's Comments:
It's good that we were able to take SMB3 one step closer to perfection again. I spent a lot of time searching for improvements and nearly gave up, but the perseverance paid off. I have a lot of fun TASing this game, and I look forward to working on some more projects, especially SMB3Mix any%.
Tompa's Comments:
This improvement came a lot quicker than expected. Maru has done an excellent job at finding all these improvements. I just haven't had any ideas or motivation to explore myself. Though when Maru said that only a single frame remained, it got my heart pumping so I gave a final effort. 8-1 was a level I always felt could be pushed even further. I gave a quick look, saw a potential timesaver and here we are.
Special thanks:
GlitchMan - for the 1-1 mushroom grab that was necessary to save 14 frames in World 1.
RAT926 - for the 1-3 improved jump to the white block and for finding a lot of other glitches in this game.
Southbird - for his disassembly that was very useful for understanding how score affected lag.

Nach: I found the submission notes here highly confusing, and wasn't sure what this run was supposed to be about. It refers to past runs, but does not link to any of them. It keeps comparing against a TAS from "2010" even though it's trying to obsolete a better TAS in the same branch from "2018". I'm a smart guy, but I had a hard time following what this submission was about from reading its notes.
I gave this run some time in the queue to collect some comments, but there really wasn't anything of note. This may be because SMB3 "Warps" has been published just recently, or after so many submissions, there's nothing really more to say. Or maybe comments often have some correlation with the submission notes, and the notes here aren't as clear as they should be. But enough about the notes, judgments are not about judging the notes but the submitted run itself.
Overall, there wasn't much feedback on this run, aside from a lot of positive voting. The movie looks like a decent improvement to the published run. Accepting to Stars because that's the tier of the previous run, and the run itself is really polished.
fsvgm777: Processing.
Last Edited by adelikat on 10/31/2023 1:47 AM