Introduction
Pokemon Crystal is known to have a trick to clone Pokemon. If the system loses power as the game saves, after it updates stored Pokemon but before it updates party Pokemon, some data can end up in both locations. Crystal accepts this despite a 16-bit checksum in the save because the checksum does not cover auxiliary data like stored Pokemon. The checksum exists to keep undetected corruption in the main data unlikely. “Unlikely.”
Categories
- Heavy luck manipulation
- Heavy glitch abuse
- Corrupts save data
Used emulator: BizHawk 2.3
Objectives
Save corruption
Crystal stores a 16-bit checksum of the main save data in the save to try to detect corruption and load the backup save if necessary. While it is unlikely that this value matches by accident, it is not impossible. When save data is cleared from the title screen, all of save memory is set to $00. Undetected save corruption can occur here if the values written to the main save data before power off still sum to $0000 and match the zeroed checksum. As the checksum is 16 bits, the theme becomes “push the checksum over $ffff to overflow to $0000.” This can be done and controlled through box names, which let the player enter up to 8 characters to label boxes of stored Pokemon. This movie names additional boxes past those required for the arbitrary code execution to increase the checksum and cause the collision. Thank you to luckytyphlosion, who asked again if a checksum collision could be set up.
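The collision can be illustrated with a minimal sketch. It assumes the checksum is a plain byte sum truncated to 16 bits, which is what "sum to $0000" suggests; the buffer sizes and the helper names `checksum16` and `padding_needed` are made up for illustration and are not taken from the game.

```python
def checksum16(data: bytes) -> int:
    """Sum every byte and keep only the low 16 bits (assumed checksum scheme)."""
    return sum(data) & 0xFFFF


def padding_needed(current_sum: int) -> int:
    """How much more the byte sum must grow to wrap around to $0000."""
    return (-current_sum) & 0xFFFF


# A cleared save is all $00, so its stored checksum is $0000.
cleared = bytes(2048)  # hypothetical size, not the real save layout
print(f"${checksum16(cleared):04x}")

# Partially written data whose bytes add up to exactly $10000 also sums to $0000.
partial = bytes([0xFF] * 257 + [0x01])  # 257 * 255 + 1 = 65536
print(f"${checksum16(partial):04x}")

# Example: a sum of $fb94 still needs $046c more before it overflows to $0000.
print(f"${padding_needed(0xFB94):04x}")
```

High-valued characters in extra box names are one way to supply that remaining amount.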
This movie wants to corrupt the save to have a Cyndaquil with no string terminator in its nickname. All this requires is a reset after Cyndaquil's data saves and before its nickname's terminator saves, but BizHawk movies are only designed to reset on frame boundaries. The subframe precision required can still be found with the equal frame lengths (EFL) setting. A Game Boy frame completes every 70224 cycles while the LCD is on, but the time between frames is arbitrary while the LCD is off. The default setting syncs frame boundaries to frame completions, whereas the EFL setting places a frame boundary every 70224 cycles regardless of when frames complete. Over the course of this movie, the Game Boy frame timing is shifted relative to the EFL frame boundary by the variable lengths of LCD off periods. This allows a reset between when the L and the terminator in Cyndaquil are written to the save data.
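The timing argument can be sketched as simple modular arithmetic. This is only a model: it assumes a reset takes effect exactly on an EFL boundary, and the cycle numbers in the usage lines are invented rather than measured from the movie. The function names are hypothetical.

```python
CYCLES_PER_FRAME = 70224  # EFL frame boundary spacing, from the text above


def boundary_between(t_first: int, t_second: int):
    """Return the index of a frame boundary B with t_first < B <= t_second, else None."""
    k = t_second // CYCLES_PER_FRAME
    return k if k * CYCLES_PER_FRAME > t_first else None


def shift_needed(t_first: int, t_second: int) -> int:
    """Smallest delay (in cycles) applied to both writes so a boundary lands between them."""
    if boundary_between(t_first, t_second) is not None:
        return 0
    return (-t_second) % CYCLES_PER_FRAME


# Invented example: two save writes 100 cycles apart, early in a frame.
first_write, second_write = 100, 200
delay = shift_needed(first_write, second_write)
print(delay)
print(boundary_between(first_write + delay, second_write + delay))
```

In the movie the delay is not added directly; it accumulates from LCD off periods of varying length.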
15 00 or <DAY>
To summarize the link, if the text engine reads the control character $15 and it is followed by a $00, the program counter will end up at $cd52, an address near manipulable memory. This movie has the text engine try to read Cyndaquil's unterminated nickname to display it on the STATS screen. The nickname is copied to a string buffer which the text engine reads past because of the lack of a terminator. An item quantity buffer is not far from this string buffer, and the Antidote quantity scrolled to in the Cherrygrove Mart menu provides a $15 there for the text engine to read. A max item quantity buffer is the next byte and is set to $63 at the Antidotes, but is then set to $00 when CANCEL is used to leave the mart menu.
Once the text engine reads the $15 $00, code execution begins at $cd52 and the address that follows the $15 $00 is on the stack. That address is the location of the last loaded Pokemon data. One step up after the mart window places ret nz at $cd70, memory of pointers for the background map, which is reached without issue and sends execution to the Pokemon data. The manipulated trainer ID and DV bytes in Cyndaquil's data there are interpreted as the instructions below. This is enough to jump into the memory for box names where more values can be controlled. The box names require a couple lower bytes in addresses to be character values, and the number of entry points from here is already limited.
; af = $0100
; bc = $c569
; de = $00ff
; hl = $002b from ld hl, $002b in moves
ld h, $36
add hl, bc ; hl = $fb94, the 5th character of Box 4's name
jp hl
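As a check on the listing above, here is a small sketch of how the manipulated values line up with those instructions. It assumes the trainer ID and DVs are stored high byte first and only models the three listed instructions; the other bytes of the Pokemon data that sit between them are left out. The opcode values are standard Game Boy CPU encodings ($26 is ld h, n; $09 is add hl, bc; $e9 is jp hl).

```python
TRAINER_ID = 0x2636  # manipulated at NEW GAME
DVS = 0x09E9         # manipulated on Cyndaquil

id_bytes = TRAINER_ID.to_bytes(2, "big")  # 26 36 -> ld h, $36
dv_bytes = DVS.to_bytes(2, "big")         # 09 -> add hl, bc ; e9 -> jp hl


def run_entry_stub(hl: int, bc: int) -> int:
    """Model ld h, n / add hl, bc and return the jp hl target."""
    assert id_bytes[0] == 0x26                 # ld h, n opcode
    hl = (id_bytes[1] << 8) | (hl & 0x00FF)    # h = $36, l unchanged
    assert dv_bytes == bytes([0x09, 0xE9])     # add hl, bc ; jp hl
    return (hl + bc) & 0xFFFF                  # 16-bit add


target = run_entry_stub(hl=0x002B, bc=0xC569)
print(f"${target:04x}")  # matches the 5th character of Box 4's name in the listing
```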
Arbitrary code execution
This movie uses a modified version of MrWint's box names that execute code input from the joypad. Aside from address changes between the games and setups, there are two small changes: or [hl] is used to reset the carry flag and the jp nc jumps before the first of the two boxes. I thought or [hl] was faster to input before I learned the mechanics below, but it's the same, yet the higher value still contributes to the checksum. The jump before the first box is fast to input and does not cause any issues with a clever box name there. That box is covered after the two primary boxes. Note that it is faster to scroll the list to boxes before they are named. Holding to scroll, as opposed to pressing and releasing repeatedly, breaks even for Box 4 and is faster for Box 5 and up.
- Box names
- One input every other frame. The first input cannot be A B START or SELECT. The same button cannot be pressed two inputs in a row (or it's a hold). Directionals can be pressed consecutively if a new button or directional is also pressed. Example: Up 2 can be done in two inputs with UP -> UP|LEFT. Priority for two directionals at once is UP > DOWN > LEFT > RIGHT. If A and a direction are input together the A is processed then the cursor moves.
Also thank you to MrWint's submission for how to format this section.
Bytes | Instruction | Comment |
---|---|---|
Box 5 | | |
aa | xor d | d stores last joypad input: find out differences to current input |
ea a1 fb | ld [$fba1], a | Write difference; will be executed as opcode later in the cycle |
aa | xor d | Restore current joypad input value |
f5 | push af | Copy current joypad input from a... |
d1 | pop de | ... to d (store it as last joypad input) |
f1 | pop af | Restore a and f from the previous cycle |
50 ($fba1) | (any) | Execute opcode written earlier this cycle |
Box 6 | | |
f5 | push af | Save a and f for next cycle |
b6 | or [hl] | Clears carry flag, needed for the jump |
fa a4 ff | ld a, [$ffa4] | Reads current joypad inputs into a |
d2 95 fb | jp nc, $fb95 | Loop back to Box 4; carry will never be set |
Box 4 has two more goals in addition to high values for the checksum. The 6th character is where Box 6 loops back to, because it was fast to input. However, the string terminator opcode is ld d, b and the box name program relies on d to store the last joypad input. The 5th character is where the 15 00 setup used in this movie jumps to, because of the limited entry point options. The slide into Box 5 needs to avoid an unsafe opcode from xor d. With the initial values of a=$02 and d=$c5 (from ld d, b) the result would be rst $00, an instruction that resets the game. Box 4 is named in such a way that the two jump locations decode different instructions to satisfy those goals.
Bytes | Instruction | Comment |
---|---|---|
Box 4 | | (from 5th character) |
f6 fe | or $fe | Sets a to $fe |
f6 fe | or $fe | |
50 | ld d, b | Sets d to $c5 |
Bytes | Instruction | Comment |
---|---|---|
Box 4 | | (from 6th character) |
fe f6 | cp $f6 | Preserves a |
fe 50 | cp $50 | Preserves d |
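The two readings of Box 4 can be reproduced with a toy decoder. This sketch only knows the three opcodes that appear in the tables above, and the `decode` helper is hypothetical. It also checks the xor d values mentioned in the text: $c7 is rst $00, and $3b is dec sp on the Game Boy CPU.

```python
# opcode -> (mnemonic template, instruction length in bytes)
OPCODES = {
    0xF6: ("or ${:02x}", 2),
    0xFE: ("cp ${:02x}", 2),
    0x50: ("ld d, b", 1),
}


def decode(data: bytes) -> list:
    """Decode a byte string using only the opcodes listed in OPCODES."""
    out, i = [], 0
    while i < len(data):
        template, size = OPCODES[data[i]]
        out.append(template.format(data[i + 1]) if size == 2 else template)
        i += size
    return out


# Characters 5 to 8 of Box 4's name followed by the $50 string terminator.
tail = bytes([0xF6, 0xFE, 0xF6, 0xFE, 0x50])

print(decode(tail))      # entry from the 5th character
print(decode(tail[1:]))  # entry from the 6th character

# The first xor d in Box 5 produces the opcode that is written to $fba1.
print(f"${0x02 ^ 0xC5:02x}")  # unsafe case from the text: rst $00
print(f"${0xFE ^ 0xC5:02x}")  # with a = $fe after or $fe
```

Starting one byte later shifts every operand into an opcode position, which is why the same name can set a and d on entry yet preserve them on every loop.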
Below are the joypad inputs and commands executed to reach the usual end. Refer to MrWint's submission for the nuances of this process. It accomplishes many of the same things as that submission so the comments are mimicked for comparison. One difference here is the warp data pointer is set to the player coordinates and the map values are written in that area. Warp data entries start with coordinates that need to match to warp, and this way there is already a match. An idea from luckytyphlosion was to end input early through Crystal's auto input system, which uses a pointer to run-length encoded inputs. For this movie, there happens to be a (properly terminated!) series of bytes in memory that plays out the rest almost perfectly.
Joypad Command
97 69 (ld l, c) // setup cycle
fd 6a (ld l, d)
ae 53 (ld d, e) // setup cycle
dc 72 (ld [hl], d) // fbfd <- dc; Set warp pointer to our fake warp data
f1 2d (dec l)
c5 34 (inc [hl]) // setup cycle
b7 72 (ld [hl], d) // fbfc <- b7; ^
34 83 (add e) // setup cycle
5e 6a (ld l, d)
6b 35 (dec [hl]) // fb5e <- 43; Enable Red in Mt. Silver
d2 b9 (cp c) // setup cycle
b8 6a (ld l, d)
9c 24 (inc h)
b6 2a (ld a, [hli])
94 22 (ld [hli], a) // fcb9 <- 03; (Glitch) map entrance $03, close to Red
b6 22 (ld [hli], a) // fcba <- 03; Map group
3e 88 (adc b) // setup cycle
4c 72 (ld [hl], d) // fcbb <- 4c; Map index: In Mt. Silver
bd f1 (pop af) // setup cycle
d7 6a (ld l, d)
e2 35 (dec [hl]) // fcd7 <- 00; Having no Pokémon in your party wins all battles instantly
1a f8 (ld hl, sp-$0b)
b7 ad (xor l) // setup cycle
dd 6a (ld l, d)
24 f9 (ld sp, hl) // Fix the stack pointer*; leave directly to overworld
8e aa (xor d) // setup cycle
e4 6a (ld l, d)
b6 52 (ld d, d) // setup cycle
d4 62 (ld h, d)
0a de (sbc $f5) // setup cycle; *after this skips a push
78 72 (ld [hl], d) // d4e4 <- 78; Change tile the character stands on, needed for warping
a0 d8 (ret c) // setup cycle
c2 62 (ld h, d)
ad 6f (ld l, a) // setup cycle
c7 6a (ld l, d)
f2 35 (dec [hl]) // c2c7 <- ff; Enable auto input
de 2c (inc l)
9e 40 (ld b, b) // setup cycle
ec 72 (ld [hl], d) // c2c8 <- ec; Set auto input pointer
c0 2c (inc l)
62 a2 (and d)
40 22 (ld [hli], a) // c2c9 <- 60; ^
77 37 (scf) // setup cycle
05 72 (ld [hl], d) // c2ca <- 05; Set auto input bank
29 2c (inc l)
1d 34 (inc [hl]) // c2cb <- 01; Delay first auto input
d4 c9 (ret) // Return to normal game code
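The relationship between the two columns above follows from the xor d in Box 5: each executed opcode is the current joypad value XOR the previous one. A minimal sketch, with hypothetical function names. The starting value of $fe for d is inferred from the first row ($97 XOR $69), so treat it as an assumption.

```python
def commands_from_joypad(joypad, initial_d):
    """Opcode executed each cycle = current joypad value XOR previous joypad value."""
    out, d = [], initial_d
    for value in joypad:
        out.append(value ^ d)
        d = value  # push af / pop de stores the current input as the last input
    return out


def joypad_for_commands(opcodes, initial_d):
    """Inverse: the joypad values needed to execute a desired opcode sequence."""
    out, d = [], initial_d
    for op in opcodes:
        d ^= op
        out.append(d)
    return out


first_rows = [0x97, 0xFD, 0xAE, 0xDC, 0xF1, 0xC5, 0xB7]  # Joypad column, first 7 rows
executed = commands_from_joypad(first_rows, initial_d=0xFE)
print(" ".join(f"{op:02x}" for op in executed))
```

This is also why a "setup cycle" is sometimes needed: the next opcode depends on the previous input, so a harmless instruction is executed to reach a joypad value from which the wanted opcode is one XOR away.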
Route
Intro
- Save data is cleared because it's the totally moral thing to do before a save corruption TAS. Gambatte fills uninitialized save memory with $ff bytes so the clear sets the values to something objective. A setup with Gambatte's $ff bytes may exist.
- Options are not set as text can print at the fast speed when A or B is held anyway.
- The trainer ID is manipulated to be $2636 for the 15 00 setup. This is done with 23 expected frames of delay between power on and NEW GAME. Two other 16-bit values are rolled at the same time that affect the checksum, but are secondary in priority.
- The player is selected to be the girl. A few values in memory like a sprite index are higher with her. This helps avoid an extra box name for the checksum in combination with the player name and is worth the frames to move the cursor down.
- She is named five multiplication symbols. Each character has the value of $f1 and is in memory twice for the player name and Cyndaquil's original trainer name. This is an efficient way to increase the checksum despite the extra frames it takes to display.
New Bark Town
- Mom is talked to directly to avoid the exclamation point animation that plays if the player tries to walk past. It is only a couple frames faster to do this and some luck manipulations may take the exclamation point for the variance it provides.
- Cyndaquil is manipulated to have $09e9 DVs for the 15 00 setup. The actual delay used here is difficult to estimate because of “lag frame rules” when sprites are decompressed relative to the timer interrupt, but it is around 10-13 frames.
Cherrygrove City
- The PC is opened with a buffered A press from the left side. In general, buffered menus from the overworld will open before sprites update and possibly deload. An extra sprite stays in memory from this side and increases the checksum at no cost.
- Boxes 1 through 8 are named for the arbitrary code execution and checksum. The 9 character has the highest value at $ff, but the filler boxes input a 1 on the way to save an input in exchange for a slightly lower character value.
- The game is saved from the PC through the change box action. It is a couple seconds faster than a save from the menu in the overworld, though a save from where the PC is misses out on a few more sprites loaded in memory to increase the checksum.
- An almost-purchase of 21 Antidotes in Cherrygrove Mart completes the 15 00 setup.
- The auto input that takes over from the arbitrary code execution presses UP x4, RIGHT x9, A x12 to end the movie. Note each character in the textboxes consumes an input so at least 11 inputs are needed to reach the credits from next to Red.
ThunderAxe31: Not only does this TAS introduce a new glitch that breaks the game with unprecedented speed, but it's also highly optimized. A possible faster setup has been theorized, but after some discussion and research, it seemed to me that it would not be verifiable on real console because it would rely on non-deterministic data. In any case, such a hypothetical improvement would require remaking the run from scratch, and it wouldn't save much time compared to this submission, nor would it prove the setup used for this submission to be poorly executed. Clearing the data at the beginning was the best choice.
Regarding branching, there are two separate aspects to take into account: similarity between games and similarity between movies. Since Pokémon Crystal is a Gen II Pokémon game, it can obsolete and be obsoleted by Pokémon Gold and Silver, even though the glitch abused in this movie is only available in Crystal. Since the current fastest-completion movie features much unique and entertaining content, it shouldn't be obsoleted by this submission; however, I have to note that we have an obsoleted movie with the "save glitch" label, and even though it's much longer and uses a different glitch, its movie goal is the same as the one for this submission, so the obsoletion chain will be changed in order for this submission to obsolete it.