Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
With the way my code was set up, the first input in a frame determined whether I wasted a single CPU cycle to prevent the race condition. This was the one part of my input-generating code that I wasn't able to take care of while generating the millions of inputs, so I wrote a Lua script that, while I watched the TAS, automatically changed the first input in the most recent frame whenever a lag frame followed it. (In this context, a lag frame indicates that the race condition happened.)
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
Ooh, I honestly got so wrapped up with other ACE tomfoolery that I hadn't considered resubmitting that TAS. I had help optimizing that one, and I think it would be a pretty good idea to submit it again!
It would be very cool to make cart swapping a real feature of BizHawk, though my current implementation is buggy at times and definitely not ready for a pull request. Part of me wants to wait for cart swapping to be a real feature of the emulator before resubmitting that TAS, so I don't need to start from a savestate. That being said, I do not know if I'll actually be able to get cart swapping to be a perfectly stable feature. I have some other cart swapping shenanigans that I want to explore too... One project at a time.
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
Oh that's lovely. It's even better considering the NMI routine doesn't disable future NMIs, which makes overflowing the stack trivial. You couldn't ask for a better exploit!
Great discovery, great TAS.
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
What's with the final time of 00:00.00? You say you're using 9999 FPS because that's the fastest libTAS will allow for ScummVM, but what benefit does that have for this TAS? Is this using 9999 FPS for a similar "key queuing trick" as this TAS of Mari0, packing all the inputs into a fraction of a second (where I assume the game responds to each input one at a time at a more leisurely pace)? If so, I find it less entertaining, as that would allow any TAS of an ADRIFT game to have an arbitrarily short time-of-final-input using the exact same method.
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
The bytes that form your inventory being read as code is always a fascinating optimization problem, and I love how in your author comments you talk about where you started and the ways you reduced the cost of items to achieve the same desired effect.
In your author comments, you mention where the item jumps the PC, but you don't mention where the inventory is stored in WRAM, which would really help clear things up for those of us who don't know much about the game but are trying to follow along with what the ACE is doing. You share a tracelog where $CABB is a jump instruction. I assume the end result makes this a jump to the inventory (presumably at $F00A). At the end of the comments, consider showing the full tracelog from using the item to the end of your payload. I think it would help me follow along.
Great discovery, and nice TAS!
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
I like the comparison to the subframe TASes. I'll admit, I'm unfamiliar with libTAS, but being able to arbitrarily change the "framerate" in order to queue up 24 inputs is a pretty unique exploit I've not seen elsewhere. The "subframe inputs" are certainly being used for different reasons than the subframe TAS you linked (Queuing up a series of actions to all be performed in one frame, versus manipulating the program counter to execute manipulated RAM) though it's an interesting use for a similar trick.
To build off of the comparison to subframe TASes, this TAS has a similar fate where the technical achievement of pulling this off is much more interesting than watching the end result. My initial thought as I watched it was a sarcastic "Seriously? Okay, sure, why not? The level was edited so he just teleports to the axe, I guess." I admit I was a bit skeptical when I saw the final time of 0:00.01 and skimmed the phrase 1,000,000,000 FPS, but as I read through the full author notes I found it to be more interesting than it initially appears. And while I think editing the levels with a built-in level editor is a pretty boring way to beat the game extra fast, I think combining it with the billion FPS, the buggy return to the title screen, and the continued exploitation to allow the game to begin in 8-4 makes this run a perfect fit to be published in the playground.
Cool run, and great write-up in the author notes!
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
This is fantastic! I also really appreciate the visuals in the video. Seeing the RAM and the cursor's position helped a lot to understand what's happening.
I'm a big fan of both ACE exploits and Crash Bash. You absolutely have my yes vote.
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
The issue with the game crash has been solved, and confirmed to be a malfunction of BizHawk's open bus behavior for the SRE instruction.
In the submitted TAS, open bus executes as follows:
We enter open bus through an indirect jump, leading to address $53AE. The Y register holds a value of $0A.
SRE ($53), Y.
This is an indirect instruction, so the first step is to find the address we're going to execute the instruction on. The operand was $53, so let's take a look at addresses $0053 and $0054. Both of these addresses have a value of $00, which forms a pointer to address $0000. This particular instruction also adds the Y register as an offset, which has a value of $0A, so we're going to perform the SRE on address $0A.
In the submitted TAS, address $0A holds a value of $40. In the emulator, the SRE instruction takes this value and leaves it on the bus unchanged. Therefore the next instruction to execute will use the value of $40 as the opcode.
$40 is the opcode for an RTI instruction, and that takes us to the payload.
However, that is incorrect behavior, as demonstrated by Alyosha's console verification. The correct behavior would take the value of $40, bit-shift it to the right, and leave the result on the open bus. To correct this, I made sure address $0A had a value of $80 for the next console verification. When performed on console, $80 is bit shifted to become $40, and so the following instruction is RTI. That leads to the payload that was written, and the run is verified!
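To make the difference concrete, here is a rough Python sketch of the two behaviors described above. It is a simplified model and not BizHawk's code: `sre_indirect_y`, `next_opcode`, and the `console_accurate` flag are names made up for illustration, the open bus is assumed to simply hold whichever value SRE left on it, and the EOR that SRE also performs on the accumulator is left out.

```python
# Opcodes that come up in this discussion (standard 6502 encodings).
OPCODES = {0x00: "BRK", 0x20: "JSR", 0x40: "RTI"}

def sre_indirect_y(ram, operand, y):
    """Model the memory side of SRE (operand),Y: shift the target byte right by one."""
    lo = ram[operand & 0xFF]            # pointer low byte from zero page
    hi = ram[(operand + 1) & 0xFF]      # pointer high byte from zero page
    addr = (((hi << 8) | lo) + y) & 0xFFFF
    original = ram[addr]
    shifted = original >> 1
    ram[addr] = shifted                 # the shifted value is the last thing written
    return original, shifted

def next_opcode(value_at_0a, console_accurate):
    """Return the mnemonic that open bus would execute after the SRE (hypothetical helper)."""
    ram = bytearray(0x10000)            # $0053/$0054 stay $00, so the pointer is $0000
    ram[0x0A] = value_at_0a
    original, shifted = sre_indirect_y(ram, 0x53, 0x0A)
    bus = shifted if console_accurate else original
    return OPCODES.get(bus, "unknown")

emulator_result = next_opcode(0x40, console_accurate=False)   # the submitted TAS in the emulator
console_result = next_opcode(0x40, console_accurate=True)     # the same TAS on console
fixed_result = next_opcode(0x80, console_accurate=True)       # after changing $0A to $80
```

Under those assumptions, the emulator case decodes to RTI, the original console case decodes to JSR, and the corrected value of $80 decodes to RTI on console.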
Thanks for the help, everyone! Especially Alyosha and BigBass for help with console verification, and Mizumaririn and SeraphmIII for help optimizing the SMB1 part of the run.
Link to video
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
As a brief update, the cause of the crash is now known, but not for the reason I initially thought. To figure out if the crash was due to a BRK instruction, I created a ROM hack with a simple crash handler. When a BRK instruction executes, instead of entering an infinite loop, the hack shows the address of the BRK instruction on screen, along with a frame counter to verify that the game crashes on the expected frame. I sent the TAS with the ROM hack over to BigBass to verify:
It would appear the cause was indeed a BRK instruction due to unexpected open bus behavior. Remember, the indirect jump is to address $53AE, which means the SRE instruction never executed, and a BRK executed instead. This seemed rather fishy, as my knowledge of open bus behavior contradicts that outcome. I discussed it with some members of the NesDev Discord server, and apparently EverDrives have wildly inaccurate open bus behavior. My current theory is that the cause of the crash wasn't actually a BRK when Alyosha tested it, and the crash handler ROM I put together ended up as a red herring.
After some more conversation, it would appear the real culprit was the SRE instruction after all! BizHawk leaves a value of $40 on the bus, though the SRE instruction should leave the bit-shifted result on the bus instead, which in this case was $20. That creates JSR $2020, which leads to, you guessed it, open bus. An infinite loop of JSR $2020 occurs, thus the game crashes.
After I came to the conclusion of "The mystery BRK is the cause of the crash!" I decided to mark the submission as "Cancelled", though perhaps that was done a little too hastily. If anybody wants to verify the run, use this movie for the SMB1 portion, and it must be done without an EverDrive: https://tasvideos.org/UserFiles/Info/638169302314508374
The SMB3 portion is still the movie that can be found here: https://tasvideos.org/UserFiles/Info/638160503431898737
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
Interesting. I probably should've gone into more detail about what the SMB3 portion is doing in my author notes, but the April first deadline was swiftly approaching, and I put too much effort into writing bad jokes.
In SMB3, the final instructions to execute are what set up the variables to start SMB1 in world 'N', so it would seem everything is being executed, though it's certainly possible something is going wrong during the setup.
I wrote a loop that's essentially:
    LDY #PayloadSize
Loop:
    ; -- wait for next frame, snip --
    LDA Controller1
    ORA Controller2
    STA (pointer),Y
    DEY
    BNE Loop
    ; -- Set up SMB1 bytes for starting in world 'N', snip --
By running that, I write the payload (in reverse order) at the pointer (in this case, $0180). It's possible the payload is being written incorrectly. I don't know if this would affect anything, but while making the TAS I placed inputs in controller 2 during a read of controller 1, just for visual clarity of "Here's where stage 2 of my code is happening" (line 1393 in TAStudio). Perhaps I could've used a marker. If that throws off the bytes controller 2 is pressing during the replay device playback, then it would explain why you still start in world 'N' but the payload doesn't work. Then again, it could be something else, like the unofficial opcode. More specifically, how the result of that opcode affects open bus.
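Here is a rough Python model of that loop, to show why the payload lands in reverse order and how stray controller 2 bits could corrupt it. It is only a sketch: `write_payload` and the example bytes are made up for illustration, each controller read is collapsed into a single byte per frame, and whether the decorative controller 2 inputs actually overlap the loop in the real movie is not something this model can tell.

```python
def write_payload(controller1, controller2, pointer, payload_size):
    """Mimic the LDY / LDA / ORA / STA (pointer),Y / DEY / BNE loop.

    Assumes 1 <= payload_size <= 255 and one input byte per controller per frame.
    """
    ram = bytearray(0x800)                 # 2 KiB of NES work RAM
    y = payload_size                       # LDY #PayloadSize
    frame = 0
    while True:
        a = controller1[frame] | controller2[frame]   # LDA Controller1 / ORA Controller2
        ram[pointer + y] = a               # STA (pointer),Y
        frame += 1                         # "wait for next frame"
        y = (y - 1) & 0xFF                 # DEY
        if y == 0:                         # BNE Loop falls through once Y reaches 0
            break
    return ram

# Hypothetical 3-byte payload, fed one byte per frame on controller 1.
payload = [0xA9, 0x01, 0x60]
clean = write_payload(payload, [0x00, 0x00, 0x00], 0x0180, len(payload))
# Same inputs, but controller 2 has one stray button bit on the second frame.
stray = write_payload(payload, [0x00, 0x10, 0x00], 0x0180, len(payload))
```

In this model the first byte sent ends up at the highest address ($0183) and the last at $0181, and a single extra bit on controller 2 turns the $01 at $0182 into $11.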
It's worth noting that if a BRK instruction were to execute instead of the payload, the audio that plays is identical to Alyosha's recording, though unlike the recording the background moves away from the bridge. (That could happen if the entire stack was cleared when the game booted, though I don't believe any version of the game does that. Or could open bus have behaved differently from the emulator?)
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
It shouldn't be too difficult. The SMB3 TAS intentionally ends with a HLT instruction, so you could take the time for a human to swap out the cartridges.
I can separate the SMB3 TAS into its own file and send over both halves. I'm currently working with some folks in the Retro Mario Speedrunning Discord to optimize it. (My SMB1 gameplay was fairly suboptimal.)
I'll go ahead and upload both halves of the TAS as user files:
The SMB3 TAS that runs first: https://tasvideos.org/UserFiles/Info/638160503431898737
The SMB1 TAS that follows: https://tasvideos.org/UserFiles/Info/638160458975160308
A more optimized TAS is in the works, but these files are from the TAS that was submitted.
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
I watched this run side by side with the previous run. 8-5 was quite spectacular!
Yes vote.
Edit: you should probably include Hartmann as an additional author, since everything up until 8-T appeared to be the exact same inputs.
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
I didn’t assume the rerecord count was inaccurate, especially since some form of brute forcing was involved. I just think it’s hilariously large for a game that looks as simple as possible. Especially with the knowledge that it plays itself if you let it. It definitely goes to show, “don’t judge a book by its cover.” Or in this case I guess, “Don’t judge a TAS by how trivial the game appears.”
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
I'm genuinely impressed that there was time to save, and the five digit rerecord count is some next level comedy. Additionally, I love how one of the methods for achieving a faster time was brute forcing.
Edit: I found myself re-reading through the submission text after all these years, and I've found a new appreciation for this TAS. It's wild how many optimizations there are that are invisible to the uninformed eye. The lag reduction by putting the game into autoplay mode is clever. Decelerating in the air to start walkin' with the maximum speed is unintuitive at first, but also really clever. I didn't give it the appreciation it deserved when it was initially being published. I'm a few years wiser now. I'd vote yes.
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
This might be an issue with the .bk2's metadata, but the run is supposed to be 19 frames with a time of 00:00.316. Is there any way to fix that?
Edit: Thanks for fixing that, Samsara!
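For reference, the 00:00.316 figure is consistent with 19 frames at the NES NTSC frame rate. The sketch below assumes roughly 60.0988 FPS (an assumption, since the platform isn't stated in this post), and `movie_time` is a made-up helper, not how the site computes times.

```python
NES_NTSC_FPS = 60.0988   # assumed frame rate; not stated in the post

def movie_time(frames, fps=NES_NTSC_FPS):
    """Format a frame count as MM:SS.mmm, rounded to the nearest millisecond."""
    total_ms = round(frames / fps * 1000)
    minutes, rem = divmod(total_ms, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{minutes:02d}:{seconds:02d}.{ms:03d}"

expected = movie_time(19)             # 19 frames at ~60.0988 FPS
at_flat_60 = movie_time(19, fps=60)   # for comparison, a flat 60 FPS
```

At the assumed NTSC rate, 19 frames comes out to 00:00.316, while a flat 60 FPS would give 00:00.317.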
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
It can also be a bit misleading, as there are several different cheep cheep objects, and only one of them has the odd bump behavior.
Here's the table from my video (with some slight adjustments) highlighting the sprites that get bumped before/without detecting the tile beneath them.
Experienced Forum User, Published Author, Player
(246)
Joined: 6/16/2020
Posts: 28
I don't think you got anything wrong. The video I made on Remote Sprite Bumping has some outdated information though. To add to the list of enemies, Boomerang Bros can also be remote / instant bumped. The descriptions you wrote for each of them describe it perfectly!