TASVideos

Tool-assisted game movies
When human skills are just not enough

Submission #6108: gifvex's GBC Pokémon: Crystal Version in 04:12.68

Console: Game Boy Color
Game name: Pokémon: Crystal Version
Game version: USA/Europe v1.0
ROM filename: Pokemon - Crystal Version (UE) (V1.0) [C][!].gbc
Branch:
Emulator: BizHawk 2.3
Movie length: 04:12.68
FrameCount: 15092
Re-record count: (unknown)
Author's real name: AM
Author's nickname: gifvex
Submitter: gifvex
Submitted at: 2018-09-25 07:36:04
Text last edited at: 2018-09-29 05:57:24
Text last edited by: ThunderAxe31
Download: Download (4924 bytes)
Status: judging underway
Author's comments and explanations:

Introduction

Pokemon Crystal is known to have a trick to clone Pokemon. If the system loses power as the game saves, after it updates stored Pokemon but before it updates party Pokemon, some data can end up in both locations. Crystal accepts this despite a 16-bit checksum in the save because the checksum does not cover auxiliary data like stored Pokemon. The checksum is there to keep undetected corruption in the main data unlikely. “Unlikely.”

Categories

  • Heavy luck manipulation
  • Heavy glitch abuse
  • Corrupts save data

Used emulator: BizHawk 2.3

Objectives

Save corruption

Crystal stores a 16-bit checksum of the main save data in the save to try to detect corruption and load the backup save if necessary. While it is unlikely that this value matches accidentally, it is not impossible. When save data is cleared from the title screen, all of save memory is set to $00. Undetected save corruption can occur here if the values written to the main save data before power off still sum to $0000 and match the zeroed checksum. As the checksum is 16 bits, the theme becomes “push the checksum over $ffff to overflow to $0000.” This can be done and controlled through box names, which let the player enter up to 8 characters to label boxes of stored Pokemon. This movie names additional boxes past those required for the arbitrary code execution to increase the checksum and cause the collision. Thank you to luckytyphlosion, who asked again if a checksum collision could be set up.
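
The collision idea can be sketched in a few lines. This is only an illustration and assumes the checksum is a plain sum of the main save bytes truncated to 16 bits, which is what "sum to $0000" suggests; the function names, the buffer size, and the sample data are hypothetical and are not the real save layout.

```python
def checksum16(data: bytes) -> int:
    """Sum all bytes and keep the low 16 bits (assumed checksum rule)."""
    return sum(data) & 0xFFFF


def passes_against_cleared_checksum(main_data: bytes) -> bool:
    """A cleared save stores $0000 as the checksum, so partially written
    main data is accepted only if its byte sum wraps around to $0000."""
    return checksum16(main_data) == 0x0000


# Hypothetical buffer: a freshly cleared save is all $00 and matches trivially.
cleared = bytes(2048)

# 257 bytes of $ff sum to $ffff; one more unit pushes the sum over to $0000.
almost = b"\xff" * 257
wrapped = almost + b"\x01"
```

Here `passes_against_cleared_checksum(wrapped)` is true while `passes_against_cleared_checksum(almost)` is false, which is the "push it over $ffff" theme in miniature.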

This movie wants to corrupt the save to have a Cyndaquil with no string terminator in its nickname. All this requires is a reset after Cyndaquil's data saves and before its nickname's terminator saves, but BizHawk movies are only designed to reset on frame boundaries. The subframe precision required can still be found with the equal frame lengths (EFL) setting. A Game Boy frame completes every 70224 cycles while the LCD is on, but the time between completions is arbitrary while the LCD is off. The default setting syncs frame boundaries to frame completions, whereas EFL places a frame boundary every 70224 cycles regardless of when frames complete. Over the course of this movie, the Game Boy frame timing is shifted relative to the EFL frame boundary through the variable lengths of LCD off periods. This allows a reset between when the L and the terminator in Cyndaquil are written to the save data.
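
The timing argument can be modeled with modular arithmetic. The sketch below is a simplification, not how BizHawk or Gambatte implement it: it treats time as a single cycle counter, takes the gaps between consecutive frame completions as input, and uses made-up cycle values. Only the 70224 constant comes from the text above; the function names are hypothetical.

```python
CYCLES_PER_FRAME = 70224  # EFL frame boundaries fall on multiples of this


def completion_phase(gaps):
    """Offset in cycles of the latest frame completion relative to the
    EFL boundary grid, given the gaps between consecutive completions.
    Gaps of exactly 70224 (LCD on) leave the phase unchanged; a longer
    or shorter gap across an LCD off period shifts it."""
    return sum(gaps) % CYCLES_PER_FRAME


def boundary_between(t_first, t_second):
    """True if an EFL boundary falls in (t_first, t_second], i.e. a reset
    on that boundary happens after the first write but not after the second."""
    return t_second // CYCLES_PER_FRAME > t_first // CYCLES_PER_FRAME
```

For example, `completion_phase([70224, 70224 + 1000, 70224])` gives 1000: a single irregular gap moves everything after it by 1000 cycles relative to the boundaries, and enough such shifts can place a boundary between two chosen writes.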

15 00 or <DAY>

see also: https://forums.glitchcity.info/index.php?topic=7706.msg203310#msg203310

To summarize the link, if the text engine reads the control character $15 and it is followed by a $00, the program counter will end up at $cd52, an address near manipulable memory. This movie has the text engine try to read Cyndaquil's unterminated nickname to display it on the STATS screen. The nickname is copied to a string buffer which the text engine reads past because of the lack of a terminator. An item quantity buffer is not far from this string buffer, and the Antidote quantity scrolled to in the Cherrygrove Mart menu provides a $15 there for the text engine to read. A max item quantity buffer is the next byte and is set to $63 at the Antidotes, but is then set to $00 when CANCEL is used to leave the mart menu.

Once the text engine reads the $15 $00, code execution begins at $cd52 and the address that follows the $15 $00 is on the stack. That address is the location of the last loaded Pokemon data. One step up after the mart window places ret nz at $cd70, memory of pointers for the background map, which is reached without issue and sends execution to the Pokemon data. The manipulated trainer ID and DV bytes in Cyndaquil's data there are interpreted as the instructions below. This is enough to jump into the memory for box names where more values can be controlled. The box names require a couple lower bytes in addresses to be character values, and the number of entry points from here is already limited.

  ; af = $0100
  ; bc = $c569
  ; de = $00ff
  ; hl = $002b from ld hl, $002b in moves

  ld h, $36

  add hl, bc ; hl = $fb94, the 5th character of Box 4's name
  jp hl
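
The arithmetic in the listing above can be checked with a small sketch. It models only the three instructions shown, using the register values from the comments; the byte encodings come from the trainer ID $2636 and the DVs $09e9, and the function name is made up for illustration.

```python
TRAINER_ID = bytes([0x26, 0x36])  # decodes as: ld h, $36
DVS = bytes([0x09, 0xE9])         # decodes as: add hl, bc ; jp hl


def jump_target(bc: int, hl: int) -> int:
    """Return the address that jp hl reaches after the three instructions."""
    low = hl & 0xFF
    hl = (TRAINER_ID[1] << 8) | low  # ld h, $36 replaces the high byte
    hl = (hl + bc) & 0xFFFF          # add hl, bc (16-bit wraparound)
    return hl                        # jp hl


target = jump_target(bc=0xC569, hl=0x002B)
print(f"${target:04x}")
```

With bc = $c569 and hl = $002b this prints `$fb94`, matching the comment on add hl, bc.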

Arbitrary code execution

see also: #4233: MrWint's GBC Pokémon: Silver Version in 30:39.49

This movie uses a modified version of MrWint's box names that execute code input from the joypad. Aside from address changes between the games and setups, there are two small changes: or [hl] is used to reset the carry flag, and the jp nc jumps to before the first of the two boxes. I thought or [hl] was faster to input before I learned the mechanics below; it takes the same time, but its higher value still contributes to the checksum. The jump to before the first box is fast to input and does not cause any issues with a clever box name there. That box is covered after the two primary boxes. Note that it is faster to scroll the list to boxes before they are named. Holding to scroll, as opposed to pressing and releasing repeatedly, breaks even for Box 4 and is faster for Box 5 and up.

Box names
One input every other frame. The first input cannot be A B START or SELECT. The same button cannot be pressed two inputs in a row (or it's a hold). Directionals can be pressed consecutively if a new button or directional is also pressed. Example: Up 2 can be done in two inputs with UP -> UP|LEFT. Priority for two directionals at once is UP > DOWN > LEFT > RIGHT. If A and a direction are input together the A is processed then the cursor moves.

Also thank you to MrWint's submission for how to format this section.

  Bytes       Instruction      Comment
  Box 5
  aa          xor d            d stores last joypad input: find out differences to current input
  ea a1 fb    ld [$fba1], a    Write difference; will be executed as opcode later in the cycle
  aa          xor d            Restore current joypad input value
  f5          push af          Copy current joypad input from a...
  d1          pop de           ... to d (store it as last joypad input)
  f1          pop af           Restore a and f from the previous cycle
  50 ($fba1)  (any)            Execute opcode written earlier this cycle
  Box 6
  f5          push af          Save a and f for next cycle
  b6          or [hl]          Clears carry flag, needed for the jump
  fa a4 ff    ld a, [$ffa4]    Reads current joypad inputs into a
  d2 95 fb    jp nc, $fb95     Loop back to Box 4; carry will never be set

Box 4 has two more goals in addition to high values for the checksum. The 6th character is where Box 6 loops back to, because that address was fast to input. However, the string terminator decodes as ld d, b, and the box name program relies on d to store the last joypad input. The 5th character is where the 15 00 setup used in this movie jumps to, because of the limited entry point options. The slide into Box 5 needs to avoid an unsafe opcode from xor d. With the initial values of a=$02 and d=$c5 (from ld d, b) the result would be rst $00, an instruction that resets the game. Box 4 is named so that the two entry points decode the same bytes as different instructions, which satisfies both goals.

  Bytes       Instruction      Comment
  Box 4 (from 5th character)
  f6 fe       or $fe           Sets a to $fe
  f6 fe       or $fe
  50          ld d, b          Sets d to $c5

  Bytes       Instruction      Comment
  Box 4 (from 6th character)
  fe f6       cp $f6           Preserves a
  fe 50       cp $50           Preserves d
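
The overlap between the two entry points can be reproduced with a tiny decoder. This is a sketch that knows only the three opcodes in the tables above and works on the last five bytes of Box 4 (characters 5 to 8 plus the terminator); the names are hypothetical, and the first four characters of Box 4 are not modeled.

```python
# Mnemonic template and size in bytes for the opcodes used in Box 4's tail.
OPCODES = {
    0xF6: ("or ${:02x}", 2),
    0xFE: ("cp ${:02x}", 2),
    0x50: ("ld d, b", 1),
}

# Characters 5-8 of Box 4 followed by the string terminator.
BOX4_TAIL = bytes([0xF6, 0xFE, 0xF6, 0xFE, 0x50])


def decode(code: bytes, start: int) -> list:
    """Decode instructions from `start` to the end of `code`."""
    result = []
    i = start
    while i < len(code):
        template, size = OPCODES[code[i]]
        operand = code[i + 1] if size == 2 else 0
        result.append(template.format(operand))
        i += size
    return result


from_5th = decode(BOX4_TAIL, 0)  # entry used by the 15 00 setup
from_6th = decode(BOX4_TAIL, 1)  # entry used by the loop from Box 6

# The first xor d in Box 5: a=$02, d=$c5 would produce $c7 (rst $00).
unsafe = 0x02 ^ 0xC5
# After or $fe sets a to $fe, the same xor produces a different byte.
after_or = 0xFE ^ 0xC5
```

Starting one byte later turns the terminator into an operand of cp, so ld d, b never runs on the loop path and d keeps the last joypad input.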

Below are the joypad inputs and commands executed to reach the usual end. Refer to MrWint's submission for the nuances of this process. It accomplishes many of the same things as that submission so the comments are mimicked for comparison. One difference here is the warp data pointer is set to the player coordinates and the map values are written in that area. Warp data entries start with coordinates that need to match to warp, and this way there is already a match. An idea from luckytyphlosion was to end input early through Crystal's auto input system, which uses a pointer to run-length encoded inputs. For this movie, there happens to be a (properly terminated!) series of bytes in memory that plays out the rest almost perfectly.

  Joypad     Command
  97         69 (ld l, c)      // setup cycle
  fd         6a (ld l, d)
  ae         53 (ld d, e)      // setup cycle
  dc         72 (ld [hl], d)   // fbfd <- dc; Set warp pointer to our fake warp data
  f1         2d (dec l)
  c5         34 (inc [hl])     // setup cycle
  b7         72 (ld [hl], d)   // fbfc <- b7; ^
  34         83 (add e)        // setup cycle
  5e         6a (ld l, d)
  6b         35 (dec [hl])     // fb5e <- 43; Enable Red in Mt. Silver
  d2         b9 (cp c)         // setup cycle
  b8         6a (ld l, d)
  9c         24 (inc h)
  b6         2a (ld a, [hli])
  94         22 (ld [hli], a)  // fcb9 <- 03; (Glitch) map entrance $03, close to Red
  b6         22 (ld [hli], a)  // fcba <- 03; Map group
  3e         88 (adc b)        // setup cycle
  4c         72 (ld [hl], d)   // fcbb <- 4c; Map index: In Mt. Silver
  bd         f1 (pop af)       // setup cycle
  d7         6a (ld l, d)
  e2         35 (dec [hl])     // fcd7 <- 00; Having no Pokémon in your party wins all battles instantly
  1a         f8 (ld hl, sp-$0b)
  b7         ad (xor l)        // setup cycle
  dd         6a (ld l, d)
  24         f9 (ld sp, hl)    // Fix the stack pointer*; leave directly to overworld
  8e         aa (xor d)        // setup cycle
  e4         6a (ld l, d)
  b6         52 (ld d, d)      // setup cycle
  d4         62 (ld h, d)
  0a         de (sbc $f5)      // setup cycle; *after this skips a push
  78         72 (ld [hl], d)   // d4e4 <- 78; Change tile the character stands on, needed for warping
  a0         d8 (ret c)        // setup cycle
  c2         62 (ld h, d)
  ad         6f (ld l, a)      // setup cycle
  c7         6a (ld l, d)
  f2         35 (dec [hl])     // c2c7 <- ff; Enable auto input
  de         2c (inc l)
  9e         40 (ld b, b)      // setup cycle
  ec         72 (ld [hl], d)   // c2c8 <- ec; Set auto input pointer
  c0         2c (inc l)
  62         a2 (and d)
  40         22 (ld [hli], a)  // c2c9 <- 60; ^
  77         37 (scf)          // setup cycle
  05         72 (ld [hl], d)   // c2ca <- 05; Set auto input bank
  29         2c (inc l)
  1d         34 (inc [hl])     // c2cb <- 01; Delay first auto input
  d4         c9 (ret)          // Return to normal game code
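
The relation between the two columns follows from the xor d in Box 5: each executed opcode is the current joypad byte xor the previous one. The sketch below checks this against the first rows of the table. The starting value $fe for the previous input is an inference from Box 4 setting a to $fe, not something the table states, and the function names are made up.

```python
def commands(joypad_bytes, previous):
    """Opcode executed each cycle: current joypad value xor the last one."""
    result = []
    for current in joypad_bytes:
        result.append(current ^ previous)
        previous = current
    return result


def inputs_for(opcodes, previous):
    """Inverse direction: joypad values needed to execute given opcodes."""
    result = []
    for opcode in opcodes:
        previous ^= opcode
        result.append(previous)
    return result


JOYPAD = [0x97, 0xFD, 0xAE, 0xDC, 0xF1, 0xC5, 0xB7]    # first table rows
EXPECTED = [0x69, 0x6A, 0x53, 0x72, 0x2D, 0x34, 0x72]  # their commands

derived = commands(JOYPAD, previous=0xFE)  # $fe is an inferred start value
```

Because a wanted opcode is not always reachable with a valid button combination in one step, some rows act only as setup cycles that move the joypad value to a useful place for the next command.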

Route

Intro

  • Save data is cleared because it's the totally moral thing to do before a save corruption TAS. Gambatte fills uninitialized save memory with $ff bytes so the clear sets the values to something objective. A setup with Gambatte's $ff bytes may exist.
  • Options are not set as text can print at the fast speed when A or B is held anyway.
  • The trainer ID is manipulated to be $2636 for the 15 00 setup. This is done with 23 expected frames of delay between power on and NEW GAME. Two other 16-bit values are rolled at the same time that affect the checksum, but are secondary in priority.
  • The player is selected to be the girl. A few values in memory like a sprite index are higher with her. This helps avoid an extra box name for the checksum in combination with the player name and is worth the frames to move the cursor down.
  • She is named five multiplication symbols. Each character has the value of $f1 and is in memory twice for the player name and Cyndaquil's original trainer name. This is an efficient way to increase the checksum despite the extra frames it takes to display.

New Bark Town

  • Mom is talked to directly to avoid the exclamation point animation that plays if the player tries to walk past. It is only a couple frames faster to do this and some luck manipulations may take the exclamation point for the variance it provides.
  • Cyndaquil is manipulated to have $09e9 DVs for the 15 00 setup. The actual delay used here is difficult to estimate because of “lag frame rules” when sprites are decompressed relative to the timer interrupt, but it is around 10-13 frames.

Cherrygrove City

  • The PC is opened with a buffered A press from the left side. In general, buffered menus from the overworld will open before sprites update and possibly deload. An extra sprite stays in memory from this side and increases the checksum at no cost.
  • Boxes 1 through 8 are named for the arbitrary code execution and checksum. The 9 character has the highest value at $ff, but the filler boxes input a 1 on the way to save an input in exchange for a slightly lower character value.
  • The game is saved from the PC through the change box action. It is a couple seconds faster than a save from the menu in the overworld, though a save from where the PC is misses out on a few more sprites loaded in memory to increase the checksum.
  • An almost-purchase of 21 Antidotes in Cherrygrove Mart completes the 15 00 setup.
  • The auto input that takes over from the arbitrary code execution presses UP x4, RIGHT x9, A x12 to end the movie. Note each character in the textboxes consumes an input so at least 11 inputs are needed to reach the credits from next to Red.


ThunderAxe31: This is going to require some time. Judging.

