Post subject: Mass glitch hunting through Regex searches?
Jigwally
He/Him
Active player (418)
Joined: 3/11/2012
Posts: 119
Hi, I'm still very new to this kind of thing, so I apologize for how amateurish this entire post probably looks, but I wanted to run a general concept by you guys.

I wanted a way to search for shared code (such as the functions I worked out while disassembling Little Mermaid), so I converted the entire NES/SNES libraries into txt files containing their equivalent hex strings & started looking for games with the same structures by searching for the hex opcodes. Unsurprisingly I got several hits for other Capcom games from the same time frame.

But then I was thinking: if you had a general idea of a glitch & how it happens in code, could you not do a mass search for other games with similar code to discover more games with the same glitch? At first this seemed kind of impossible to me because of how many different ways code can be written and how much you have to know about the game code already to know if a glitch can be performed, much like the theorizing about making a TAS by brute forcing every input. But then I learned about regular expression searches, which let me search for templates of code rather than exact code, & I actually managed to get some successful hits lately.

The first glitch I tried to find was simultaneous L+R/U+D causing odd behavior. I know this is usually because the game attempts to index a value from outside an intended table, like how the U+D climbing trick in SMB2 is due to the game using an opcode as a velocity value. I tried to examine multiple ways this could be written into code. I see that when games retrieve input they vary in whether the directional values appear in the high or low nybble of memory, so I tested for both.

My first "successful" hit was actually for Super Mario Bros 3, for a function that indexes an incorrect value on simultaneous L+R/U+D presses. But on closer inspection it was part of a leftover debug thing disconnected from the rest of the game.
But then I tried a combination I hadn't before, 29 C0 2A 2A (AND #$C0, ROL, ROL) [or 29 30 2A 2A]. I haven't tested every game in the list to figure out what each hit was for exactly & whether it was meaningful, but several of the games I tested had obvious effects:

Back to the Future - Pressing U+D causes Marty to trip on nothing; pressing L+R+U causes him to shoot backward (I guess this was already used in the TAS but I didn't know until I got a hit for it)
Shadow of the Ninja - Causes a moonwalk animation (probably already known)
Captain America and The Avengers - Pressing U+D on the pause screen causes a graphical glitch
Spelunker 2 - When pressed during gameplay you reset back to the title
Championship Pool - This one is the most interesting. L+R/U+D (on either controller) seems to completely crash the game after 8 frames. Specifically, this glitch causes stack addresses to be overwritten. It is too chaotic for me to understand yet, but through random testing I was able to skip from the title screen to a menu you don't normally go to, so I think there's some kind of glitch TAS potential here.

I haven't investigated every game on the list, but the fact that I got so many hits for this specific glitch in a single search makes me think I'm on to something and that I'll find other games doing something interesting if I dig deeper.

Because I know the Kirby Super Star ending skip TAS is built on this premise, I also tried a similar search through the SNES library. I got a hit for shared code in KDL3, but because that game has no ladders I'm not sure where, if at all, that code is executed. I'm very new to SNES disassembly & I don't understand SA1 tracing, so I don't know how to approach that yet.
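For readers who want to try the search described above, here is a minimal Python sketch of the 29 C0 2A 2A / 29 30 2A 2A pattern. It is not the script used in the post: it assumes you scan the raw ROM bytes directly rather than a hex text dump (which also avoids matches that start halfway through a byte), and the names `PATTERN` and `find_hits` are made up for illustration.

```python
import re

# AND #imm (opcode 29) with a direction mask of $C0 or $30,
# followed by ROL A twice (opcode 2A, 2A).
PATTERN = re.compile(rb"\x29[\xC0\x30]\x2A\x2A")


def find_hits(rom):
    """Return the byte offset of every pattern match in a raw ROM image."""
    return [m.start() for m in PATTERN.finditer(rom)]
```

To run it over a library, read each file with `open(path, "rb").read()` and pass the bytes to `find_hits`. A hit only says the byte sequence is present somewhere in the file; whether it is code that actually runs still has to be checked in a debugger.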
greysondn
They/Them
Joined: 4/29/2018
Posts: 44
jlun2 asked about this in the discord - whether it was actually a workable/practical thing or not. I'm going to quote my response there as it may have value as you start to do something like this.
It's a correlation/causation thing. It is very dangerous to assume that combo is a code segment just because it appears in hex in the game ROM. Such combinations can also arise from, for example, graphics and text data. If I read him right, "no, it's not impossible; however it should be applied with caution as it's not 100% foolproof. It will yield a place worth investigating if done right... that does not mean it will yield something worthwhile." For shared code, if you can find the specific blob and just search for exactly that, you would probably locate shared code segments. If we know some exploit for that code segment, that may be actionable. But be aware that "we know of a glitch for a game that has this code segment" and "we know a glitch for this code segment" are two very different things.
Beyond that, impressive consideration of a toolkit! I'm usually the one looking for patterns and even I don't recall thinking to ask "is there a pattern we could detect in these older games?"
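One cheap way to act on the caution about graphics data, sketched in Python: if the dumps are iNES files, hits that land in CHR (graphics) data can be dropped by keeping only offsets inside the PRG-ROM region. This assumes plain iNES headers (NES 2.0 extended sizes are not handled), and the function names are hypothetical. It does nothing about data tables or text stored inside PRG-ROM itself.

```python
INES_MAGIC = b"NES\x1a"


def prg_rom_range(rom):
    """Return (start, end) file offsets of PRG-ROM in an iNES image."""
    if len(rom) < 16 or rom[:4] != INES_MAGIC:
        raise ValueError("not an iNES image")
    trainer = 512 if rom[6] & 0x04 else 0   # optional 512-byte trainer
    start = 16 + trainer                    # the header is 16 bytes
    end = start + rom[4] * 16384            # byte 4 = PRG size in 16 KB units
    return start, min(end, len(rom))


def hits_in_prg(rom, hits):
    """Keep only the hit offsets that fall inside PRG-ROM."""
    start, end = prg_rom_range(rom)
    return [h for h in hits if start <= h < end]
```

Anything that survives the filter is still only "a place worth investigating", as the quoted reply puts it.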
Post subject: Re: Mass glitch hunting through Regex searches?
Kles
Joined: 7/28/2005
Posts: 339
Jigwally wrote:
I got a hit for shared code in KDL3, but because that game has no ladders I'm not sure where, if at all, that code is executed. I'm very new to SNES disassembly & I don't understand SA1 tracing so I don't know how to approach that yet.
If you can load the code into IDA or some other visual disassembler (I haven't used any other ones; sorry), you can get a list of code that links to it. I've never tried to disassemble SNES games using IDA, though, so I dunno how much of a pain in the ass it is to load. I think gocha wrote something to make it easier? https://github.com/gocha/ida-snes-ldr
MESHUGGAH
Other
Skilled player (1889)
Joined: 11/14/2009
Posts: 1349
Location: 𝔐𝔞𝔤𝑦𝔞𝔯
I see some potential in this, but also many problems (TL;DR: nearly the same as greysondn's response). The problems I currently see:
- False positives -> "But on closer inspection it was part of a leftover debug thing disconnected from the rest of the game."
- Doesn't work for unknown glitches the first time, only after discovering one and working out how to search for it (NES DPCM)
- Obfuscated code

Regarding L+R/U+D: an instant crash on NES is a common issue, especially (edit) in less famous games. Sometimes you need 3 directions at once, because of differences in how it is implemented.

edit: I would add that this post was written based on my J2ME experience, as in porting the .jad file back to .jar and examining the converted code. There is probably no code obfuscation on old platforms. Also, I avoided timing-based glitches for the obvious reason that they can't be read out from source/compiled code.
PhD in TASing 🎓 speedrun enthusiast ❤🚷🔥 white hat hacker ▓ black box tester ░ censorships and rules...
Post subject: Re: Mass glitch hunting through Regex searches?
Jigwally
He/Him
Active player (418)
Joined: 3/11/2012
Posts: 119
Yeah, on closer inspection a lot of the hits in the list really were false positives. I still think this has potential but I have to be more intelligent about how I do it. For example I was previously attempting to use regular expressions that identified the memory address containing input based on its proximity to a load from $4016, put it into a capture group, & looked for another function that isolated directional data & transferred it to X or Y (for indexing from a table). My "holy grail" was to find a case where the game indexed a jump table with this data, & therefore I could get it to jump to a glitched address which could potentially be leveraged into an ending skip or even ACE. But I'm not sure such a glitch exists in any NES games.
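A rough Python sketch of what such a capture-group search could look like, for anyone who wants to experiment. It is only one guess at the shape of the code and not the expression actually used: it assumes the input byte lives in zero page and is written with STA or ROL near an LDA $4016, and the 48/256/4-byte windows are arbitrary. `INPUT_PATTERN` and `find_input_indexing` are made-up names.

```python
import re

INPUT_PATTERN = re.compile(
    rb"\xAD\x16\x40"     # LDA $4016 (controller port read)
    rb".{0,48}?"         # short stretch of the read loop (assumed window)
    rb"[\x85\x26](.)"    # STA zp or ROL zp: capture the zero-page address
    rb".{0,256}?"        # unrelated code in between (assumed window)
    rb"\xA5\1"           # LDA from the same zero-page address
    rb"\x29(.)"          # AND #mask: capture the mask byte
    rb".{0,4}?"          # a few shift/rotate bytes (assumed window)
    rb"[\xAA\xA8]",      # TAX or TAY
    re.DOTALL,           # let '.' match every byte value, including 0x0A
)


def find_input_indexing(rom):
    """Return (offset, zero_page_address, mask) for each candidate match."""
    return [
        (m.start(), m.group(1)[0], m.group(2)[0])
        for m in INPUT_PATTERN.finditer(rom)
    ]
```

Filtering the results to masks such as $C0, $30, $0C or $03 would narrow things to directional bits, though which bits hold directions depends on how each game assembles its input byte.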
Kles wrote:
If you can load the code into IDA or some other visual disassembler (I haven't used any other ones; sorry), you can get a list of code that links to it. I've never tried to disassemble SNES games using IDA, though, so I dunno how much of a pain in the ass it is to load. I think gocha wrote something to make it easier? https://github.com/gocha/ida-snes-ldr
Yeah the disassembly methods I've been doing are probably super inefficient. I don't use any kind of disassembler or even bother to assign symbols to anything in the emulator debugger. I probably need to learn how to rewrite everything as an annotated .asm file if I want it to be useful to other people.
Personman
Other
Joined: 4/20/2008
Posts: 465
This is a pretty brilliant idea, even if it mostly produces false positives in its current incarnation — even if just that single discovery in Championship Pool turns out to be real, it'll have proved its worth. It also made me wonder if anyone has tried fuzzing old games. It would take some work to define what an interesting outcome looks like, and you probably wouldn't get great results if you started from power-on — but what if you took an existing TAS as a source of valid states, and started from those?
A warb degombs the brangy. Your gitch zanks and leils the warb.