Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
I want to start focusing more heavily on console verification efforts for the NES. I want to do this in a more structured way, so I'll be using this thread and this first post to keep things organized. If anyone has anything they want to add or suggest, feel free to post.
Status: Power-up timing is verified to a high degree of confidence, and many timing-sensitive games now sync to console. DMC timing is verified with new tests, and DMC-sensitive games now work on console. Runs with resets are the biggest remaining open cases, but I have no way to test them.
Testing: Nothing.
Current Test Runs: None presently. No runs are currently known to desync due to emulation errors, but few runs sync on current builds either. Testing can continue as new runs are made.
Editor, Player (68)
Joined: 1/18/2008
Posts: 663
It will be important to note, based on experience and your testing list:
* How many controllers should be connected, and/or whether the count is known to matter
* What specific issue is being targeted in the run, for example DMC timing, so the bot can be set appropriately
* Whether the robot should, on the console, react to every poll exactly as the emulator does (DMC timing test) or whether we should use a frame window mode
* Whether the run starts from reset
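One way to keep these per-run details together is a small structured record. The sketch below is only an illustration: the class name, the field names, and the mode labels are hypothetical and are not taken from any existing replay bot or BizHawk format.

```python
from dataclasses import dataclass
from enum import Enum


class ReplayMode(Enum):
    """How the replay device feeds input (labels are hypothetical)."""
    POLL_EXACT = "poll_exact"      # answer every controller poll exactly as the emulator did
    FRAME_WINDOW = "frame_window"  # latch one input per frame inside a timing window


@dataclass(frozen=True)
class RunTestInfo:
    """Details to note for each console verification attempt."""
    game: str
    controllers_connected: int
    controller_count_matters: bool
    targeted_issue: str          # e.g. "DMC timing"
    replay_mode: ReplayMode
    starts_from_reset: bool

    def summary(self) -> str:
        """One-line description suitable for a testing list."""
        start = "reset" if self.starts_from_reset else "power on"
        return (f"{self.game}: {self.controllers_connected} controller(s), "
                f"target={self.targeted_issue}, mode={self.replay_mode.value}, "
                f"start={start}")
```

Keeping the record immutable (`frozen=True`) means an entry in the testing list cannot be changed by accident between attempts.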
true on twitch - lsnes windows builds 20230425 - the date this site is buried
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
^ For now I will not be considering runs that use reset or turn on the DMC channel, so everything will be from power on and the number of controllers won't matter (except obviously for 2 player runs). But yeah, for clarity I will add that info for each run. I plan on starting with easy runs like Streemerz as general tests that don't do anything too crazy, then moving on to others as I become more confident in the basics of the emulator.
Editor, Player (68)
Joined: 1/18/2008
Posts: 663
I will test Streemerz and any other games I have this weekend. If any other runs for games I do not have are posted but I have donor carts for them, I will test them if my EPROM eraser was not stolen.
EDIT: Alyosha should have some info to help track down problems now.
- Streemerz had somewhat deterministic desyncs.
- Battletoads acted strangely, with nondeterministic desyncs, though the most common desync point was at the stage 1 boss. I can't remember this game desyncing nondeterministically with other runs.
- MM4 was extremely nondeterministic but had a finite number of desync modes. If we can find the culprit then maybe we can take a guess as to what the "perfect" NES has. If it is console SRAM (doubtful), then I can clear the console on run attempts.
- MM5 desynced once, but that could have been the bot. It synced two other times just fine.
true on twitch - lsnes windows builds 20230425 - the date this site is buried
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
Thanks to True's testing I now have some real leads on things that need investigating. Mega Man 5 and 6 are the most promising to sync, so I'll try to get longer runs of those, maybe even ones that beat the game. Mega Man 4 is pretty random but synced once, so maybe there is some potential there. Battletoads performed weirdly. I still think it should sync but will hold off on that one for now. Streemerz gave some very real data points to look at. Comparing the VODs of True's stream to BizHawk shows some visible differences that I hope will lead me in some new directions. This should be a much better approach than the random guessing I was doing with Kirby, so I'm optimistic.
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
I'm making progress in understanding True's console tests. Several of the runs experienced 2 unique desync modes. As it turns out, I can recreate most of these modes by changing whether or not the VBlank flag is set at power on. This is known to be basically random. I also reverted BizHawk's start up state back to a previous one similar to FCEUX. With this combined with the random VBlank flag, I got precisely the same desyncs as True got on MM4 and Streemerz Superb Joe. I also got the same Battletoads desyncs, but it required a few more tweaks than just that. The only desync I have never seen is the second one from Streemerz, Streemerz mode, so that is what I am currently investigating. The search now is basically for a valid startup state that matches all the desyncs when the VBlank flag is changed.
Site Admin, Skilled player (1234)
Joined: 4/17/2010
Posts: 11251
Location: RU
Alyosha wrote:
I also reverted BizHawk's start up state back to a previous one similar to FCEUX. With this combined with the random VBlank flag, I got precisely the same desyncs as True got on MM4 and Streemerz Superb Joe. I also got the same Battletoads desyncs, but it required a few more tweaks than just that.
That's... What!?
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
^ Yeah sorry, it's not a very good explanation. Here's the order of events that led to this point:
1. Pass test ROMs. These make sure the emulator is working right but don't offer any insight into start up behaviour.
2. Make a 2 player run of Battletoads sync. With everything else being correct, this game syncs entirely based on an accurate start up state. At the time, this state was borrowed from FCEUX, which has one extra dead frame in the PPU and slightly different initial conditions.
3. Match the run that syncs to the start up state here: https://wiki.nesdev.com/w/index.php/PPU_power_up_state Surprisingly, this worked out quite well.
4. Make other test runs. They failed to sync in what should be the accurate start up state, the one in the current NESHawk release.
5. I checked against the old start up state and the behaviour matched the console tests.
So, something is wrong somewhere. I'm not sure if it's in the NESDev link, my implementation, or something else. True's tests are pretty definitive that if I change the VBlank flag at power on I should get one of the 2 desyncs he got. The goal now is to make that happen with a consistent start up state.
Site Admin, Skilled player (1234)
Joined: 4/17/2010
Posts: 11251
Location: RU
I didn't mean that I didn't understand. I'm following everything you guys are doing. This was just completely unexpected: that FCEUX startup RAM is more accurate in the end.
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
It's not related to the RAM at all. It's related to where in the frame the CPU starts relative to the PPU and the exact cycle alignment between the two. I've tried different RAM states for these runs and it is irrelevant. And the only reason that even matters is that OAM DMA takes one cycle longer depending on whether it starts on an even or odd cycle, and that VBlank can be delayed by a bit depending on exactly when it is triggered.
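The even/odd dependence can be illustrated with a small sketch. It assumes the commonly documented figure of 513 CPU cycles for an OAM DMA, plus one extra alignment cycle when it begins on an odd CPU cycle. The function names and the parity convention are illustrative assumptions, not NESHawk's actual code.

```python
OAM_DMA_BASE_CYCLES = 513  # 1 wait cycle + 256 read/write pairs


def oam_dma_cycles(start_cycle: int) -> int:
    """CPU cycles the CPU is stalled by an OAM DMA that begins on start_cycle.

    Assumption: an odd starting cycle needs one extra alignment cycle.
    """
    return OAM_DMA_BASE_CYCLES + (start_cycle & 1)


def cycle_after_dma(start_cycle: int) -> int:
    """CPU cycle on which normal execution resumes after the DMA."""
    return start_cycle + oam_dma_cycles(start_cycle)
```

Under this model, two start-ups that differ by one CPU cycle can either end up on the same cycle after the DMA or end up two cycles apart, which is one way a tiny alignment difference turns into a visible timing difference later in the frame.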
Patashu
He/Him
Joined: 10/2/2005
Posts: 4000
If it's essentially random whether VBlank starts on or off, and other such things related to startup state, should a NESHawk TAS be able to specify the starting states of such things (similar to how you can specify the starting state of RAM)?
My Chiptune music, made in Famitracker: http://soundcloud.com/patashu My twitch. I stream mostly shmups & rhythm games http://twitch.tv/patashu My youtube, again shmups and rhythm games and misc stuff: http://youtube.com/user/patashu
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
Patashu wrote:
If it's essentially random whether VBlank starts on or off, and other such things related to startup state, should a NESHawk TAS be able to specify the starting states of such things (similar to how you can specify the starting state of RAM)?
Probably not. All games should clear the flag first before testing anything, and actually this is what Streemerz and Battletoads are both doing. It turns out I had made an error in my original testing, so it's not the initial condition of the VBlank flag that is the problem here; it seems to be purely a start up timing issue. I cleaned up the code a bit, but as far as I can tell there are no further errors. Both games will desync if start up is different by a single PPU tick. It could be that something is just not quite deterministic here (besides the known case of PPU-CPU alignment). I wonder if the 10NES chip is throwing off the start up timing slightly?
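A rough sketch of why a single PPU tick can matter: on NTSC the PPU runs 3 dots per CPU cycle, and the VBlank flag is set at dot 1 of scanline 241. The model below is a deliberate simplification and not the real power-up sequence. It assumes the PPU counts dots from scanline 0, dot 0, and that `ppu_head_start` (a hypothetical knob) is the number of dots the PPU has already completed when CPU cycle 0 begins.

```python
DOTS_PER_SCANLINE = 341
DOTS_PER_CPU_CYCLE = 3   # NTSC ratio
VBLANK_SCANLINE = 241
VBLANK_DOT = 1


def first_cpu_cycle_with_vblank(ppu_head_start: int) -> int:
    """First CPU cycle at whose start the VBlank flag is already set.

    ppu_head_start is the number of dots the PPU has completed before
    CPU cycle 0 (a hypothetical stand-in for start-up timing).
    """
    flag_dot = VBLANK_SCANLINE * DOTS_PER_SCANLINE + VBLANK_DOT
    # At the start of CPU cycle n the PPU has completed
    # ppu_head_start + 3 * n dots; the flag is visible once that
    # count exceeds flag_dot.
    return (flag_dot - ppu_head_start) // DOTS_PER_CPU_CYCLE + 1
```

With these assumptions, head starts of 0 and 1 dots give different CPU cycles, while 1, 2, and 3 dots give the same one, so a one-dot shift only sometimes changes what the CPU observes.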
Editor, Player (68)
Joined: 1/18/2008
Posts: 663
I do have a toploader that I could use to verify that, but would need to set up a TV again. It's not convenient but will see what I can do.
true on twitch - lsnes windows builds 20230425 - the date this site is buried
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
It's also possible that the probability of success for runs like these is just very low. MM4 only synced once. I'm able to get almost every desync observed in the console testing with only slight variations in start up timing. The only one I have still never seen is the second pie hit in Streemerz mode. That one really is a mystery. Anyway, I'll be putting some time into MM5 now and hopefully have a complete run soon if RNG is friendly. I am curious about the top loader tests though, True. Good luck! EDIT: @True: I also added a Mega Man 6 run, made from Shinryu's obsolete movie (the published run desyncs much earlier). Shinryu's run desynced in the same spot you mentioned in the verification thread. I resynced it and beat that stage's boss, so that should be a good test. The run seems to desync pretty badly thereafter, so I didn't go any further, but if it gets through the lag and RNG nightmare in that section, a complete run would probably finish.
Experienced player, Moderator, Senior Ambassador (897)
Joined: 9/14/2008
Posts: 1007
For the most part I'm focusing on my family for the time being and I don't want to dedicate a lot of time to do tests myself but I *do* have a remote development environment for console verification I can set back up. I gave a presentation on how it works at NBLUG that's worth skipping through: Link to video It's admittedly a bit odd working remotely but I have all the pieces to allow Alyosha or anyone else who wants it to control console power, send input, physically see what buttons are being pressed, and see the console output. I'm also very fortunate to have a local collector who has almost 700 NES games, although I can't meet up with him all that often so we'd want to batch them up. For a first-pass it may make sense to get a flash cart which will admittedly be using fake mappers but will otherwise exercise the console properly and will allow testing any game loaded on it without physical interaction. This is just something I'm throwing out there. I have the spare console and I'm not currently working on NES projects so I can set this up if it sounds intriguing. Thoughts?
I was laid off in May 2023 and could use support via Patreon or onetime donations as I work on TASBot Re: and TASBot HD. I'm dwangoAC, part of the senior staff of TASVideos as the Senior Ambassador and BDFL of the TASBot community; I post TAS content on YouTube.com/dwangoAC based on livestreams from Twitch.tv/dwangoAC.
Site Admin, Skilled player (1234)
Joined: 4/17/2010
Posts: 11251
Location: RU
dwangoAC wrote:
I'm also very fortunate to have a local collector who has almost 700 NES games, although I can't meet up with him all that often so we'd want to batch them up.
That's INSANELY AWESOME!!!
dwangoAC wrote:
For a first-pass it may make sense to get a flash cart which will admittedly be using fake mappers but will otherwise exercise the console properly and will allow testing any game loaded on it without physical interaction.
I dunno if PowerPak can return to the same game after the reset, but Everdrive can, and it's also more accurate than PowerPak. Though no flash cart is as precise as a real cart, so absolute levels of accuracy shouldn't be tested against them.
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
dwangoAC wrote:
This is just something I'm throwing out there. I have the spare console and I'm not currently working on NES projects so I can set this up if it sounds intriguing. Thoughts?
This sounds amazing! I would love to have the opportunity to use this environment, thanks for putting it out there. :) I'm a bit short of time at the moment, so I can't put too much thought into it yet, but I'm already excited about the possibilities!
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
I've made some progress lately in analyzing why and how games sync. I improved NMI behaviour in the emulator and have now managed to see a behaviour that had never occurred before in the emulator but happened about 50% of the time in True's console testing. Unfortunately, in doing so I lost sync in Battletoads 2p warps, which did sync on console, so it must in some way have represented a valid console state. The complicating factor here is that PPU/CPU alignment allows for some pretty difficult to capture changes in emulation (without emulating at another, lower level). So the current build must be emulating a different alignment than the release build. Also, NMI timings are not thoroughly tested by test ROMs. But if this Streemerz run in my test build syncs, it should mean it's close.
Warepire
He/Him
Editor
Joined: 3/2/2010
Posts: 2174
Location: A little to the left of nowhere (Sweden)
This is crazy, I am deeply impressed by your progress here. Well done.
Editor, Player (68)
Joined: 1/18/2008
Posts: 663
Well I can do some tests today, just let me know what you want me to test. (Also, idle on IRC more - makes communications much easier.)
true on twitch - lsnes windows builds 20230425 - the date this site is buried
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
Warepire wrote:
This is crazy, I am deeply impressed by your progress here. Well done.
Thanks! But so far pretty much all of this is stuff that is already known. Putting it all together into something complete and internally consistent is the challenge now. Thanks to True's console testing I've got a lot of new data to work with. Battletoads one player warpless made it further than ever, but only once, so that needs investigating. A new sync record for Streemerz was also set, but it desynced at a room with tons of clowns running around, and in several different ways at that, so there's lots to look at there. Perhaps the most important results were from the read2004 test ROM. As suspected, 4 seems to be a typical alignment at power on, so that is good to know; 7 is as well. Unexpectedly, values ranging from 0-12 of initial 'FF' were observed, which needs to be studied. Overall I think I need to pick an easier game than the hardcore, demanding Streemerz and Battletoads, so the hunt is on for that. The unpredictability of NES power on makes it difficult to test sensitive games and identify problems.
Editor, Player (68)
Joined: 1/18/2008
Posts: 663
Values seen in rough order of chance: 4 (when cold), 7 (reset or when hot), 8, 6, 11, 12, 0, 10, 1, 2 Need to test this against more consoles, test when cold, test when hot, and test with various cooldown times between power cycles or resets. Need to test with Famicom.
true on twitch - lsnes windows builds 20230425 - the date this site is buried
Site Admin, Skilled player (1234)
Joined: 4/17/2010
Posts: 11251
Location: RU
Where's the test ROM?
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
feos wrote:
Where's the test ROM?
https://github.com/nwidger/nintengo/blob/master/samples/other/read2004.nes Here it is. So far this is the best test I've seen for characterizing NES power up, so it's pretty neat. (However, if you happen to have a very old NES it will just give garbage, since $2004 is not readable.)
Alyosha
He/Him
Editor, Expert player (3514)
Joined: 11/30/2014
Posts: 2713
Location: US
Brief update here. It seems like I'm running into a wall again. I tested Bionic Commando, as True reports that it desyncs on console, but it syncs just fine in BizHawk. Looking at the trace logger it seems like there are 2 possibilities.
1. Emulation is wrong. It desyncs at a lag frame where only one or two extra instructions could change the outcome, but so far I haven't found a way to change this result.
2. The bot isn't picking up an input poll. On this particular frame input is polled at the very end of the frame, only two dozen instructions from vblank. Maybe this is throwing something off in the replay?
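To look into the second possibility, one option is to scan a per-frame record of when input was polled and flag frames where the poll lands close to the end of the frame. The sketch below assumes such a record can be extracted, for example from a trace log. The function name, the input format, and the default margin are all hypothetical, and the frame length is only the approximate NTSC figure.

```python
NTSC_CPU_CYCLES_PER_FRAME = 29781  # approximate; the real average is about 29780.5


def frames_with_late_polls(poll_cycles_by_frame, margin_cycles=100):
    """Return frame numbers whose last input poll is near the frame's end.

    poll_cycles_by_frame maps frame number -> list of CPU cycle offsets
    (counted from the start of that frame) at which the controller was read.
    Frames with no polls (lag frames) are skipped.
    """
    late = []
    for frame, polls in sorted(poll_cycles_by_frame.items()):
        if not polls:
            continue
        if NTSC_CPU_CYCLES_PER_FRAME - max(polls) <= margin_cycles:
            late.append(frame)
    return late
```

Frames flagged this way would be the first candidates to compare against the bot's latch timing on console.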