Player (77)
Location: Cornelia Castle
Joined: 4/1/2016
Posts: 316
I know that the Stairs Glitch is where you climb 70 stairs and certain things happen. For a credits warp, you need to get 56 houses, 57 heals and 201 Pures (the way to get 201 is to underflow the pure-count and drink 54 in battle). Use down to 32 houses, save only with the last one and jump to your inventory (with a code) to get the credits rolling.
DJ Incendration Believe in Michael Girard and every speedrunner and TASer!
Active player (472)
🇨🇦 Canada
Joined: 9/27/2004
Posts: 650
Location: 🇨🇦 Canada
DJ Incendration wrote:
I know that the Stairs Glitch is where you climb 70 stairs and certain things happen. For a credits warp, you need to get 56 houses, 57 heals and 201 Pures (the way to get 201 is to underflow the pure-count and drink 54 in battle). Use down to 32 houses, save only with the last one and jump to your inventory (with a code) to get the credits rolling.
Sounds like you have the tools already with the pastebin and the notes and stuff. Here's a video of an RTA playthrough of the credits warp route so you can visualize the setups. https://www.youtube.com/watch?v=eo0CtjU_T_c
Location: Waterford, MI
Joined: 9/12/2014
Posts: 557
FatRatKnight wrote:
Not sure how interesting this restriction will be, but it is a thought. "No run away." For the sake of rules interpretation, selecting run is allowed, but successfully running is not. The save-reset trick would sidestep the proposed rule in an effort to avoid encounters without technically running away. It only works on the overworld, at least, so I'll leave it up to others to decide on whether to include "no reset" on top of "no run away." I'm curious what the implications are for never running away. Sure, you'll get more levels. You'll probably manipulate fewer enemies where possible for faster fights. How avoidable is damage with the limited RNG? Will it be worth it to have more than one standing character? Does this restriction give better variation for this game? At the moment, I find this thought interesting. I have no intention of following through with the thought, but I would share it in case someone else finds it interesting as well.
I'd like to see a run with those restrictions. Having those restrictions would make money routing more interesting too. As for the length, I believe it would take less than 2h30m. It would also make any RPG run more action-packed.
Skilled player (1851)
🇯🇵 Japan
Joined: 5/7/2008
Posts: 187
Location: 🇯🇵 Japan
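It has long been known that opening the menu at a stair depth of 32 freezes the game. If you simply open the menu after going up and down the stairs 32 times, execution hits "40:RTI", so it cannot reach $0312.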
      00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
0300: 10 05 7A 10 06 00 07 0C 05 06 40 40 04 04 30 40
0303:10 06     BPL $030B
030B:40        RTI
However, if you have already opened the menu somewhere beforehand, the overwrite of $0300-$030F runs to completion, and execution can reach $0312.
      00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
0300: 10 05 7A 10 06 00 01 13 02 00 80 7A 80 7A 80 00
0303:10 06     BPL $030B
030B:7A        NOP
030C:80 7A     NOP #$7A
030E:80 00     NOP #$00
0310:01 00     ORA ($00,X) @ $0000 = #$00
To move into the ending sequence we need to switch to bank:0D. Fortunately, that switch exists in bank:0F, inside the loop that runs after the Chaos battle.
0F:C93A:20 2B E9  JSR $E92B
0F:C93D:A9 0D     LDA #$0D
0F:C93F:20 03 FE  JSR $FE03
0F:C942:20 03 B8  JSR $B803
0F:C945:4C 3A C9  JMP $C93A
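If we can jump anywhere into this loop, we can finish the game without ever calling up Chaos. That is very difficult, though, because the characters 0xC9 and 0x44 and below cannot be entered in a name, and only 7 bytes are freely usable.
So instead we reuse a 0xC9 that is already on the stack: if we push a value in 0x38-0x44 with "48:PHA", then "60:RTS" returns into the ending sequence. Meeting those conditions takes 3 bytes to set A = X = 0x38 or 0x3F, 1 byte for "9A:TXS(SP=X)", 1 byte for "48:PHA", and, because 0x5C-0x6F are unusable characters, 3 bytes for a "4C:JMP $****" to an address that contains "60:RTS". That is 8 bytes in total.
Unfortunately that is 1 byte too many, so we have to squeeze one out somehow. A subroutine that starts with "48:PHA" and ends with "60:RTS" saves 1 byte, and searching for one turned up two candidate names. The 4th character's name can be either "ちごあゆ" or "ちごにむ". The former is easier to enter, but it needs extra input at the end of the movie, so it is not suited to a TAS.
The 2nd character's name uses the illegal opcode "AF:LAX $****". This gives 37*2 candidate names, from which we can pick the one with the fewest cursor movements. "よるえき" needs only 10 cursor movements, so it is probably the fastest.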
      00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
0300: 10 05 7A 10 06 00 01 13 02 00 80 7A 80 7A 80 00 
0310: 01 00 AF B2 8D 90 15 0C 13 06 B0 40 12 04 A0 40 
0320: 02 00 4C 51 50 55 07 18 05 12 40 A0 04 10 30 A0
0330: 03 00 9A 4C 9F AA 15 18 13 12 B0 A0 12 10 A0 A0
$0310:01 00     ORA ($00,X) @ $0000 = #$00 A:05 X:05 Y:0E S:11 P:nvubdIzc
$0312:AF B2 8D  LAX $8DB2 = #$38           A:05 X:05 Y:0E S:11 P:nvubdIzc -- A = X = #$38 
$0315:90 15     BCC $032C                  A:38 X:38 Y:0E S:11 P:nvubdIzc 
$032C:04 A0     NOP $A0                    A:38 X:38 Y:0E S:11 P:nvubdIzc
$032E:30 A0     BMI $02D0                  A:38 X:38 Y:0E S:11 P:nvubdIzc
$0330:03        ASO ($00,X) @ $010B = #$00 A:38 X:38 Y:0E S:11 P:nvubdIzc -- * Be careful *
$0332:9A        TXS                        A:38 X:38 Y:0E S:11 P:nvubdIzc -- SP = X
$0333:4C 9F AA  JMP $AA9F                  A:38 X:38 Y:0E S:38 P:nvubdIzc
$AA9F:48        PHA                        A:38 X:38 Y:0E S:38 P:nvubdIzc
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$FE2D:60        RTS                        A:00 X:EF Y:00 S:37 P:nvubdIZC
$C939:0E 20 2B  ASL $2B20 = #$08           A:00 X:EF Y:00 S:39 P:nvubdIZC -- Wrong address
$C93C:E9 A9     SBC #$A9                   A:00 X:EF Y:00 S:39 P:nvubdIZc
$C93E:0D 20 03  ORA $0320 = #$02           A:56 X:EF Y:00 S:39 P:nvubdIzc
$C941:FE 20 03  INC $0320,X @ $040F = #$00 A:56 X:EF Y:00 S:39 P:nvubdIzc
$C944:B8        CLV                        A:56 X:EF Y:00 S:39 P:nvubdIzc
$C945:4C 3A C9  JMP $C93A                  A:56 X:EF Y:00 S:39 P:nvubdIzc

$C93A:20 2B E9  JSR $E92B                  A:56 X:EF Y:00 S:39 P:nvubdIzc -- Correct address
$C93D:A9 0D     LDA #$0D                   A:FF X:00 Y:00 S:39 P:nvubdIZC 
$C93F:20 03 FE  JSR $FE03                  A:0D X:00 Y:00 S:39 P:nvubdIzC -- Bank Change
$C942:20 03 B8  JSR $B803                  A:00 X:00 Y:00 S:39 P:nvubdIZC -- Ending Sequence
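One thing to be careful about here: if the 4th character's class is Red Mage [03], A becomes the logical OR of A and the value at $010B, and $010B is #$00 only because of the emulator's initialization pattern. The initial value of $010B differs from console to console, so the glitch may fail unless the class is White Mage [04].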
Stack 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
0100: FF 00 00 00 FF FF FF FF 00 00 00 00 FF FF 68 E1
0110: 02 03 0E BA 62 B8 EB AD 45 CA F6 C8 92 C9 01 18
0120: 00 0B 05 92 C9 01 08 00 0B 05 92 C9 01 18 00 0B
0130: 05 92 C9 01 08 00 0B 05 92 C9 01 18 00 0B 05 92
Character List
8A 8B 8C 8D 8E 48 49 4A 4B 4C
 あ い う え お   が ぎ ぐ げ ご
8F 90 91 92 93 4D 4E 4F 50 51
 か き く け こ   ざ じ ず ぜ ぞ
94 95 96 97 98 52 53 54 55 56
 さ し す せ そ   だ ぢ づ で ど
99 9A 9B 9C 9D 57 58 59 5A 5B
 た ち つ て と   ば び ぶ べ ぼ
9E 9F A0 A1 A2 70 71 72 73 74
 な に ぬ ね の   ぱ ぴ ぷ ぺ ぽ
A3 A4 A5 A6 A7 7C 7D 7E 7F B9
 は ひ ふ へ ほ   ゃ ゅ ょ っ 。
A8 A9 AA AB AC 80 81 82 83 84
 ま み む め も   0 1 2 3 4
AD AE AF B0 B1 85 86 87 88 89
 ら り る れ ろ   5 6 7 8 9
B2 B3 B4 B5 B6 C2 C4 C5 C3 FF
 や ゆ よ わ ん   - ! ? ‥  
Player (77)
Location: Cornelia Castle
Joined: 4/1/2016
Posts: 316
Inzult wrote:
DJ Incendration wrote:
I know that the Stairs Glitch is where you climb 70 stairs and certain things happen. For a credits warp, you need to get 56 houses, 57 heals and 201 Pures (the way to get 201 is to underflow the pure-count and drink 54 in battle). Use down to 32 houses, save only with the last one and jump to your inventory (with a code) to get the credits rolling.
Sounds like you have the tools already with the pastebin and the notes and stuff. Here's a video of an RTA playthrough of the credits warp route so you can visualize the setups. https://www.youtube.com/watch?v=eo0CtjU_T_c
So, the run has been taken down. Where's the pastebin again?
DJ Incendration Believe in Michael Girard and every speedrunner and TASer!
Post subject: Corrected JP character table
Sand
He/Him
Player (147)
Joined: 6/26/2018
Posts: 208
pirohiko wrote:
Character List
I discovered some errors in the character table. Here is a corrected table. It is found at 0xa013 in the JP ROM. The corresponding table is at 0xa011 in the NA ROM, and is called lut_NameInput in the disassembly.
8A 8B 8C 8D 8E 48 49 4A 4B 4C
あ い う え お が ぎ ぐ げ ご
8F 90 91 92 93 4D 4E 4F 50 51
か き く け こ ざ じ ず ぜ ぞ
94 95 96 97 98 52 53 54 55 56
さ し す せ そ だ ぢ づ で ど
99 9A 9B 9C 9D 57 58 59 5A 5B
た ち つ て と ば び ぶ べ ぼ
9E 9F A0 A1 A2 70 71 72 73 74
な に ぬ ね の ぱ ぴ ぷ ぺ ぽ
A3 A4 A5 A6 A7 7D 7E 7F 7C B9
は ひ ふ へ ほ ゃ ゅ ょ っ 。
A8 A9 AA AB AC 80 81 82 83 84
ま み む め も 0 1 2 3 4
B0 B1 B2 B3 B4 85 86 87 88 89
ら り る れ ろ 5 6 7 8 9
AD AE AF B5 B6 C2 C4 C5 C3 FF
や ゆ よ わ ん - ! ?  ‥  
The corrections are:
  • ゃゅょっ is 7D 7E 7F 7C, not 7C 7D 7E 7F.
  • らりるれろやゆよ is B0 B1 B2 B3 B4 AD AE AF, not AD AE AF B0 B1 B2 B3 B4.
Here it is as a Python table, for convenience:
{
    0x8a: "あ", 0x8b: "い", 0x8c: "う", 0x8d: "え", 0x8e: "お", 0x48: "が", 0x49: "ぎ", 0x4a: "ぐ", 0x4b: "げ", 0x4c: "ご",
    0x8f: "か", 0x90: "き", 0x91: "く", 0x92: "け", 0x93: "こ", 0x4d: "ざ", 0x4e: "じ", 0x4f: "ず", 0x50: "ぜ", 0x51: "ぞ",
    0x94: "さ", 0x95: "し", 0x96: "す", 0x97: "せ", 0x98: "そ", 0x52: "だ", 0x53: "ぢ", 0x54: "づ", 0x55: "で", 0x56: "ど",
    0x99: "た", 0x9a: "ち", 0x9b: "つ", 0x9c: "て", 0x9d: "と", 0x57: "ば", 0x58: "び", 0x59: "ぶ", 0x5a: "べ", 0x5b: "ぼ",
    0x9e: "な", 0x9f: "に", 0xa0: "ぬ", 0xa1: "ね", 0xa2: "の", 0x70: "ぱ", 0x71: "ぴ", 0x72: "ぷ", 0x73: "ぺ", 0x74: "ぽ",
    0xa3: "は", 0xa4: "ひ", 0xa5: "ふ", 0xa6: "へ", 0xa7: "ほ", 0x7d: "ゃ", 0x7e: "ゅ", 0x7f: "ょ", 0x7c: "っ", 0xb9: "。",
    0xa8: "ま", 0xa9: "み", 0xaa: "む", 0xab: "め", 0xac: "も", 0x80: "0", 0x81: "1", 0x82: "2", 0x83: "3", 0x84: "4",
    0xb0: "ら", 0xb1: "り", 0xb2: "る", 0xb3: "れ", 0xb4: "ろ", 0x85: "5", 0x86: "6", 0x87: "7", 0x88: "8", 0x89: "9",
    0xad: "や", 0xae: "ゆ", 0xaf: "よ", 0xb5: "わ", 0xb6: "ん", 0xc2: "-", 0xc4: "!", 0xc5: "?", 0xc3:  "‥", 0xff: " ",
}
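As a quick sanity check on the corrected table, it can be inverted to turn a name into the bytes it leaves in memory. This is only a minimal sketch: `NAME_ROWS`, `CHAR_TO_BYTE`, and `encode_name` are names made up for illustration, and the table is re-entered compactly (one screen row per entry) so that the snippet runs on its own.

```python
# Corrected JP name-entry table: one (byte codes, characters) pair per screen row.
NAME_ROWS = [
    ([0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x48, 0x49, 0x4a, 0x4b, 0x4c], "あいうえおがぎぐげご"),
    ([0x8f, 0x90, 0x91, 0x92, 0x93, 0x4d, 0x4e, 0x4f, 0x50, 0x51], "かきくけこざじずぜぞ"),
    ([0x94, 0x95, 0x96, 0x97, 0x98, 0x52, 0x53, 0x54, 0x55, 0x56], "さしすせそだぢづでど"),
    ([0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x57, 0x58, 0x59, 0x5a, 0x5b], "たちつてとばびぶべぼ"),
    ([0x9e, 0x9f, 0xa0, 0xa1, 0xa2, 0x70, 0x71, 0x72, 0x73, 0x74], "なにぬねのぱぴぷぺぽ"),
    ([0xa3, 0xa4, 0xa5, 0xa6, 0xa7, 0x7d, 0x7e, 0x7f, 0x7c, 0xb9], "はひふへほゃゅょっ。"),
    ([0xa8, 0xa9, 0xaa, 0xab, 0xac, 0x80, 0x81, 0x82, 0x83, 0x84], "まみむめも01234"),
    ([0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0x85, 0x86, 0x87, 0x88, 0x89], "らりるれろ56789"),
    ([0xad, 0xae, 0xaf, 0xb5, 0xb6, 0xc2, 0xc4, 0xc5, 0xc3, 0xff], "やゆよわん-!?‥ "),
]

# Invert the table: character -> byte value.
CHAR_TO_BYTE = {
    ch: code
    for codes, chars in NAME_ROWS
    for code, ch in zip(codes, chars)
}


def encode_name(name):
    """Return the bytes that a name occupies in memory (hypothetical helper)."""
    return [CHAR_TO_BYTE[ch] for ch in name]


for name in ("よるえき", "ちごにむ"):
    print(name, " ".join(f"{b:02x}" for b in encode_name(name)))
```

With the corrected table, the two names mentioned earlier in the thread come out as af b2 8d 90 and 9a 4c 9f aa, which agrees with the bytes at $0312 and $0332 in pirohiko's memory dump.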
Post subject: Name entry cursor movement
Sand
He/Him
Player (147)
Joined: 6/26/2018
Posts: 208
I haven't seen documentation of optimized cursor movement on the name entry menu, even though some published runs take advantage of it. I've posted new notes at Wiki: GameResources/NES/FinalFantasy1?revision=3#NameEntry. In short, the name entry of [2079] NES Final Fantasy by TheAxeMan in 1:09:57.70 could be done up to 5 frames faster; the name entry in [4468] NES Final Fantasy "game end glitch" by AmaizumiUni, Spikestuff & DJ Incendration in 01:36.47 is already optimized.
Naively, to move the cursor from one point to another on the name entry screen, you would alternate between 1 frame of pressing a directional button and 1 frame of no input. When you need to do a diagonal movement, you save time by alternating the two directional buttons you need to press. For example, moving 2 spaces down and 3 spaces right (2D3R), a total distance of 5 spaces, can be done in 5 frames by alternating Right and Down inputs: ...R .D.. ...R .D.. ...R.
But you can do even better than that. The name entry menu reacts to simultaneous directional inputs in unintuitive ways. For example, if you were pressing ...R on the previous frame, and press U..R on the current frame, the cursor moves 1 space down, despite the Down button never being pressed. Only U... and ..L. can move the cursor up or left, but there are many ways to move the cursor down or right, which means that you can often move at the maximum speed of 1 space per frame, even when moving straight horizontally or vertically.
A B C D E F G H I J
K L M N O P Q R S T
U V W X Y Z ’ , .  
0 1 2 3 4 5 6 7 8 9
a b c d e f g h i j
k l m n o p q r s t
u v w x y z - ‥ ! ?
[2079] NES Final Fantasy by TheAxeMan in 1:09:57.70 (and in turn namegenerator/inputgeneration.py from lightwarriorcode-v1.0.zip) applies the diagonal-movement optimization but not the simultaneous-buttons optimization. For example, the diagonal 2D2R movement from 'A' to 'W' in the name "Wedg" is already as fast as it can be:
|..|...R....| 'B'
|..|.D......| 'L'
|..|...R....| 'M'
|..|.D......| 'W'
|..|.......A|
But the 3R horizontal movement from 'd' to 'g' takes 5 frames when it could take only 3 frames:
|..|...R....| 'e'
|..|........|
|..|...R....| 'f'
|..|........|
|..|...R....| 'g'
|..|.......A|
You can move from 'd' to 'g' in only 3 frames by alternating ...R and ..LR:
|..|...R....| 'e'
|..|..LR....| 'f'
|..|...R....| 'g'
|..|.......A|
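The frame counts in these examples follow a simple pattern, sketched below. This is a rough model under stated assumptions, not the shortest-path search itself: it assumes a plain button press cannot register for the same direction on two consecutive frames, it ignores any cursor wraparound, and it does not count the final A press. The function names are made up for illustration.

```python
def naive_frames(right, down):
    """Press, idle, press, idle, ...: n moves take 2n - 1 frames."""
    moves = right + down
    return max(0, 2 * moves - 1)


def alternating_frames(right, down):
    """Diagonal-movement optimization only.

    Right and Down presses are interleaved; idle frames are needed only
    when one axis has more moves than the other can separate.
    """
    if right + down == 0:
        return 0
    return max(right + down, 2 * max(right, down) - 1)


def lower_bound_frames(right, down):
    """Best case with the simultaneous-buttons trick: 1 space per frame.

    The text says this is often, not always, reachable for down/right moves.
    """
    return right + down


print(alternating_frames(2, 2))  # 'A' to 'W' (2D2R): 4
print(alternating_frames(3, 0))  # 'd' to 'g' (3R): 5
print(lower_bound_frames(3, 0))  # 'd' to 'g' with ...R ..LR ...R: 3
```

Whether the lower bound is actually reachable for a given pair of cells depends on the full movement table on the wiki page, which this sketch does not model.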
User movie #638766622812817201 is a demonstration of entering the "Wedg", "Bigs", "Axe ", "Viks" names from [2079] NES Final Fantasy by TheAxeMan in 1:09:57.70 5 frames faster. The inputs were computed using a shortest-path Lua script (a further development of the savestate manipulation technique from Post #532299) using the movement table from Wiki: GameResources/NES/FinalFantasy1#NameEntry. User movie #638766593655640386 is a partial resync of the original .fm2 to BizHawk 2.10, for easier comparison.
あいうえおがぎぐげご
かきくけこざじずぜぞ
さしすせそだぢづでど
たちつてとばびぶべぼ
なにぬねのぱぴぷぺぽ
はひふへほゃゅょっ。
まみむめも01234
らりるれろ56789
やゆよわん-!?‥ 
[4468] NES Final Fantasy "game end glitch" by AmaizumiUni, Spikestuff & DJ Incendration in 01:36.47 already applies both the diagonal-movement optimization and the simultaneous-buttons optimization. The only place the latter optimization is needed is in the name of the 4th party member, "ちごにむ". I believe the optimization was applied by Spikestuff in Post #505707 and User movie #71345684705421713. (The original submission in #7120: AmaizumiUni, Spikestuff & DJ Incendration's NES Final Fantasy "game end glitch" in 01:36.47 was replaced so I can't be sure.) To move 3D1R from 'あ' to 'ち', the movie uses a frame of U.L. to move the cursor down immediately after another down movement:
|..|.D......|........| 'か'
|..|...R....|........| 'き'
|..|.D......|........| 'し'
|..|U.L.....|........| 'ち'
|..|.......A|........|
The same trick is used to move 4D2R from 'ご' to 'に'. The 2D1R movement from 'に' to 'む' uses .D.. ...R U.L., but in this case the simultaneous buttons are unnecessary; .D.. ...R .D.. would also have worked.
Post subject: An unsuccessful search for faster name payloads for game end glitch
Sand
He/Him
Player (147)
Joined: 6/26/2018
Posts: 208
I experimented with many different options for party member names in an attempt to find a faster credits warp than [4468] NES Final Fantasy "game end glitch" by AmaizumiUni, Spikestuff & DJ Incendration in 01:36.47. Long story short, I didn't find any. The names よるえき [0xaf 0xb2 0x8d 0x90] and ちごにむ [0x9a 0x4c 0x9f 0xaa] for party members 2 and 4 are the fastest of many alternatives I tried.
Let's first walk through the exploit. Though 4468M is for the Japanese release, I'll use labels from the USA disassembly. The NES CPU's stack occupies 256 bytes from 0x0100 to 0x01ff. The ultimate cause of the game end glitch is the fact that the variable tmp_hi is located inside the stack area, at address 0x0110. tmp_hi is used during string formatting to store a pointer into a text buffer. Repeatedly climbing the stairs causes the stack to grow until it overlaps tmp_hi.
When you press Start after filling the stack, the text formatting for the pause menu ends up storing the value 0x0302 in tmp_hi, overwriting the return address of an active stack frame. In this case, 0x0302 is a pointer into str_buf (which happens to overlap with the data structure ptygen that stores information about the party during party selection). When the stack frame returns, because of the overwritten return address, the CPU starts executing code at 0x0303. This address would be part of party member 1's name, except that the data for party member 1 in ptygen has been overwritten, having been used as scratch space for string formatting. That scratch string data executes as harmless code and eventually reaches the part of ptygen reserved for party member 2 at 0x0310, which is where the first part of the exploit payload is stored.
ptygen is not where party information is permanently stored. Once the game starts, that memory may be reused for other purposes, including string formatting, as we have seen.
At the time of the exploit, the data for party member 1 has been overwritten, but the data for party members 2–4 is still intact. Some of the memory is controllable (party member classes and names), and some is not (e.g. screen coordinates). Here's what ptygen looks like at the time:
        00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
0300             10 06 .. .. .. .. .. .. 7a 80 7a 80 00
0310    L2 00 N2 N2 N2 N2 15 .. .. .. .. .. .. .. .. ..
0320    .. .. .. .. .. .. .. .. .. .. .. .. 04 10 30 a0
0330    L4 00 N4 N4 N4 N4 15 18
  • .. bytes are omitted because they always get jumped over.
  • Ln and Nn bytes are the bytes we have control over:
    • Ln is a party member's class, 0x00 to 0x05,
    • Nn are the 4 bytes of a party member's name.
  • All other bytes are fixed.
The first few instructions starting at 0x0303 are some harmless branches and nops:
0303    BPL 0x030b
030B    NOP
030C    NOP #0x7a
030E    NOP #0x00
The first byte we have control over is L2 at 0x0310. The second byte is always 0x00 and cannot be changed, but then we get to execute the 4 bytes of party member 2's name:
0310    L2
0311    0x00
0312    N2
0313    N2
0314    N2
0315    N2
Starting at 0x0316, there are some non-modifiable instructions that halt the CPU. No matter what alignment we start with, we will hit a 0x12 at address 0x031c which encodes a JAM instruction. Therefore, we are obligated to use the final N2 byte for a branch instruction (either BCC or BVC) to take the next byte 0x15 as an operand and jump ahead 21 bytes, to arrive at 0x032c. At 0x032c, there is a nop and a branch that doesn't get taken, and then we get to execute the class and name of party member 4:
032C    NOP 0x10
032E    BMI 0x02d0
0330    L4
0331    0x00
0332    N4
0333    N4
0334    N4
0335    N4
That's the extent of our control, the 10 bytes representing the class and name of party members 2 and 4. The L2 and L4 bytes are limited because they can only represent one of the six classes using byte values 0x00 to 0x05. The N2 and N4 bytes are more flexible, but even then we are limited to a palette of 90 bytes that represent name characters. Because the final byte of party member 2's name must be a branch instruction, we really have just about 7 bytes to work with.
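The branch destinations quoted above can be double-checked with a few lines of arithmetic. This is a small sketch of how a 6502 relative branch computes its target (the operand is a signed 8-bit offset from the address of the following instruction); `branch_target` is a name made up for illustration.

```python
def branch_target(opcode_addr, operand):
    """Destination of a taken 6502 relative branch located at opcode_addr."""
    # The operand is a signed 8-bit offset.
    offset = operand - 0x100 if operand >= 0x80 else operand
    # The offset is relative to the instruction after the 2-byte branch.
    return (opcode_addr + 2 + offset) & 0xffff


print(hex(branch_target(0x0303, 0x06)))  # BPL at 0x0303 -> 0x30b
print(hex(branch_target(0x0315, 0x15)))  # final N2 byte + fixed 0x15 -> 0x32c
print(hex(branch_target(0x032e, 0xa0)))  # BMI at 0x032e (not taken) -> 0x2d0
```

The second line is the "jump ahead 21 bytes": the branch opcode sits at 0x0315, its operand 0x15 at 0x0316, and 0x0317 + 0x15 = 0x032c.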
The goal of the game end glitch is to jump to EnterEndingScene in memory bank 0x0d. For that, we want to jump to VictoryLoop in bank 0x0f. VictoryLoop loads bank 0x0d and jumps to EnterEndingScene.
C935    JSR EnterBattle_L   ; start the battle!
C938    BCC :+              ;  see if this battle was the end game battle

      @VictoryLoop:
C93A    JSR LoadEpilogueSceneGFX
C93D    LDA #BANK_ENDINGSCENE
C93F    JSR SwapPRG_L
C942    JSR EnterEndingScene
C945    JMP @VictoryLoop

      :
C948    JSR ReenterStandardMap  ; if this was just a normal battle, reenter the map
Because VictoryLoop is a loop, our jump target doesn't have to be exact. We can land at almost any address inside the loop and it will work. Even if we're misaligned with the intended instructions, it happens to still work out: the JMP at 0xc945 eventually runs and restarts the loop with proper alignment. And because the CPU's carry flag happens to be set at this point, we can even land on the BCC instruction before the loop. Therefore, we are looking to jump to any address between 0xc938 and 0xc945 inclusive.
Unfortunately, we cannot just write a JMP instruction to one of those addresses, because none of the bytes we would need (0xc9 for the high byte, 0x38–0x45 for the low byte) are available to us. The table below shows the naming bytes we have to work with, along with what instructions they encode:
0x8a あ | 0x8b い | 0x8c う | 0x8d え | 0x8e お | 0x48 が | 0x49 ぎ | 0x4a ぐ | 0x4b げ | 0x4c ご
TXA impl | ANE # | STY abs | STA abs | STX abs | PHA impl | EOR # | LSR A | ALR # | JMP abs
0x8f か | 0x90 き | 0x91 く | 0x92 け | 0x93 こ | 0x4d ざ | 0x4e じ | 0x4f ず | 0x50 ぜ | 0x51 ぞ
SAX abs | BCC rel | STA ind,Y | JAM | SHA ind,Y | EOR abs | LSR abs | SRE abs | BVC rel | EOR ind,Y
0x94 さ | 0x95 し | 0x96 す | 0x97 せ | 0x98 そ | 0x52 だ | 0x53 ぢ | 0x54 づ | 0x55 で | 0x56 ど
STY zpg,X | STA zpg,X | STX zpg,Y | SAX zpg,Y | TYA impl | JAM | SRE ind,Y | NOP zpg,X | EOR zpg,X | LSR zpg,X
0x99 た | 0x9a ち | 0x9b つ | 0x9c て | 0x9d と | 0x57 ば | 0x58 び | 0x59 ぶ | 0x5a べ | 0x5b ぼ
STA abs,Y | TXS impl | TAS abs,Y | SHY abs,X | STA abs,X | SRE zpg,X | CLI impl | EOR abs,Y | NOP impl | SRE abs,Y
0x9e な | 0x9f に | 0xa0 ぬ | 0xa1 ね | 0xa2 の | 0x70 ぱ | 0x71 ぴ | 0x72 ぷ | 0x73 ぺ | 0x74 ぽ
SHX abs,Y | SHA abs,Y | LDY # | LDA X,ind | LDX # | BVS rel | ADC ind,Y | JAM | RRA ind,Y | NOP zpg,X
0xa3 は | 0xa4 ひ | 0xa5 ふ | 0xa6 へ | 0xa7 ほ | 0x7d ゃ | 0x7e ゅ | 0x7f ょ | 0x7c っ | 0xb9 。
LAX X,ind | LDY zpg | LDA zpg | LDX zpg | LAX zpg | ADC abs,X | ROR abs,X | RRA abs,X | NOP abs,X | LDA abs,Y
0xa8 ま | 0xa9 み | 0xaa む | 0xab め | 0xac も | 0x80 0 | 0x81 1 | 0x82 2 | 0x83 3 | 0x84 4
TAY impl | LDA # | TAX impl | LXA # | LDY abs | NOP # | STA X,ind | NOP # | SAX X,ind | STY zpg
0xb0 ら | 0xb1 り | 0xb2 る | 0xb3 れ | 0xb4 ろ | 0x85 5 | 0x86 6 | 0x87 7 | 0x88 8 | 0x89 9
BCS rel | LDA ind,Y | JAM | LAX ind,Y | LDY zpg,X | STA zpg | STX zpg | SAX zpg | DEY impl | NOP #
0xad や | 0xae ゆ | 0xaf よ | 0xb5 わ | 0xb6 ん | 0xc2 - | 0xc4 ! | 0xc5 ? | 0xc3 ‥ | 0xff (space)
LDA abs | LDX abs | LAX abs | LDA zpg,X | LDX zpg,Y | NOP # | CPY zpg | CMP zpg | DCP X,ind | ISC abs,X
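To make the restriction concrete, here is a minimal sketch that checks which of the bytes we would like to have are actually in the name palette. The palette is written as inclusive byte ranges read off the table above; `PALETTE_RANGES`, `PALETTE`, and `typable` are names made up for illustration.

```python
# Byte values that can be entered as name characters, as inclusive ranges.
PALETTE_RANGES = [
    (0x48, 0x5b),  # が..ぼ
    (0x70, 0x74),  # ぱ..ぽ
    (0x7c, 0x7f),  # っ ゃ ゅ ょ
    (0x80, 0xb6),  # digits 0-9, then あ..ん
    (0xb9, 0xb9),  # 。
    (0xc2, 0xc5),  # - ‥ ! ?
    (0xff, 0xff),  # space
]
PALETTE = {b for lo, hi in PALETTE_RANGES for b in range(lo, hi + 1)}


def typable(values):
    """Return the subset of values that can be typed, in ascending order."""
    return sorted(v for v in values if v in PALETTE)


print(len(PALETTE))                          # 90 characters in total
print(typable(range(0x38, 0x46)))            # low bytes of 0xc938..0xc945: []
print(0xc9 in PALETTE)                       # high byte of the target: False
print(0x4c in PALETTE, 0x60 in PALETTE)      # JMP yes, RTS no: True False
```

So a direct `JMP 0xc9??` is out on both counts, and RTS (0x60) cannot be typed either, which is why the payload has to borrow both the 0xc9 and the RTS from elsewhere.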
We do also have limited control over the L2 and L4 bytes, which represent party member classes and may take on any value from 0x00 to 0x05. The fastest option is to take the defaults, which are L2 = Thief = 0x01 = ORA X,ind and L4 = Red Mage = 0x03 = SLO X,ind. The Red Mage SLO can cause us problems in some circumstances, because it may change the value of the A register. We can work around the problem by setting L4 = White Mage = 0x04 = NOP zpg, but that costs a few frames. The 0x00 BRK impl and 0x02 JAM instructions are not useful to us.
0x00 Fighter | 0x01 Thief | 0x02 Black Belt | 0x03 Red Mage | 0x04 White Mage | 0x05 Black Mage
BRK impl | ORA X,ind | JAM | SLO X,ind | NOP zpg | ORA zpg

The exploit payload of 4468M is clever and subtle. Because we cannot directly enter any of the jump destination address bytes we need, we must cobble together an address using bytes already in memory, on the stack and elsewhere. The general strategy is this:
  • Store a value in the A register that works as the low-order byte of a destination address in the range 0xc938 to 0xc945. Because we will eventually use an RTS instruction to do the jump, and RTS adds 1 to the address, A must actually be in the range 0x37 to 0x44.
  • Store a value in the X register that points to one byte before any of the multiple appearances of 0xc9 on the stack.
  • Transfer the value of the X register into the SP (stack pointer) register.
  • Push the value of the A register onto the stack using PHA. The stack pointer now points to an address 0xc9?? that works as a destination jump address.
  • Jump to the address on the stack with an RTS instruction.
It's a lot to accomplish in 7 bytes. Let's walk through it.
0310    01 00       ORA (0x00,X)    ; character class 0x01, (0x00,X) points to zero so there is no effect
0312    AF B2 8D    LAX 0x8db2      ; set A = X = 0x38, the value stored at 0x8db2
0315    90 15       BCC 0x032c      ; use the preexisting 0x15 operand, jump ahead to character 4 data
The 0x01 byte (representing the Thief class) encodes a two-byte ORA instruction, which has no effect given the current state of the registers. The exploit then spends 3 bytes on an instance of the undocumented/illegal instruction LAX, which assigns to the A register and the X register simultaneously. The final byte of party member 2's name must be used for a branch instruction (here BCC) to jump to party member 4's data—we have no choice. Assigning to A and X simultaneously is a good use of space, but it means we need a single value that works for both registers. A must be in the range 0x37 to 0x44, and X must point to one byte before a 0xc9 on the stack. Here's the relevant part of the stack:
        00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
0130                            92 c9 01 18 00 0b 05 92
0140    c9 01 08 00 0b 05
There are only two values that satisfy both constraints: 0x38 and 0x3f. We can take either of these values from anywhere in memory. The payload takes it from address 0x8db2, which contains 0x38. (0x38, as a value for A and X, has a non-obvious advantage over 0x3f. The SLO (0x00,X) instruction that comes from executing character class 0x03 (Red Mage) happens to leave the A register and processor flags unchanged when X = 0x38, but not when X = 0x3f. 0x3f can still work, but it requires changing party member 4's class to White Mage, which encodes a NOP instruction. This is the part that pirohiko marked "be careful" in Post #505719.)
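The claim that only 0x38 and 0x3f work can be checked against the stack dump above. The following is a minimal sketch that simulates TXS, PHA, and RTS for every candidate value with A = X. It assumes that whatever runs between the PHA and the RTS leaves the stack pointer and the two relevant stack bytes as they were, and `rts_target` and `usable_values` are names made up for illustration.

```python
# Stack bytes from the dump above, starting at address 0x0138.
STACK_BASE = 0x0138
STACK_BYTES = [0x92, 0xc9, 0x01, 0x18, 0x00, 0x0b, 0x05, 0x92,
               0xc9, 0x01, 0x08, 0x00, 0x0b, 0x05]
MEMORY = {STACK_BASE + i: b for i, b in enumerate(STACK_BYTES)}


def rts_target(value, memory):
    """Simulate TXS; PHA; RTS with A = X = value and return the new PC.

    Returns None if the high byte falls outside the known part of the stack.
    """
    mem = dict(memory)
    sp = value                      # TXS: SP = X
    mem[0x0100 + sp] = value        # PHA: store A at 0x0100 + SP ...
    sp = (sp - 1) & 0xff            # ... then decrement SP
    sp = (sp + 1) & 0xff            # RTS: pull the low byte (the pushed A)
    lo = mem[0x0100 + sp]
    sp = (sp + 1) & 0xff            # RTS: pull the high byte (already on the stack)
    hi = mem.get(0x0100 + sp)
    if hi is None:
        return None
    return (((hi << 8) | lo) + 1) & 0xffff  # RTS adds 1 to the pulled address


def usable_values():
    """Values in 0x37..0x44 that make RTS land in 0xc938..0xc945."""
    found = []
    for value in range(0x37, 0x45):
        target = rts_target(value, MEMORY)
        if target is not None and 0xc938 <= target <= 0xc945:
            found.append(value)
    return found


print([hex(v) for v in usable_values()])   # ['0x38', '0x3f']
print(hex(rts_target(0x38, MEMORY)))       # 0xc939
```

The 0xc939 result for 0x38 matches the misaligned landing address marked "Wrong address" in pirohiko's trace, from which the loop recovers via the JMP at 0xc945.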
0330    03 00       SLO (0x00,X)    ; character class 0x03, doesn't change A when X = 0x38
0332    9A          TXS             ; set SP = X, SP points to one byte before 0xc9
0333    4C 9F AA    JMP 0xaa9f      ; jump to code that will run PHA and RTS for us
The first byte of party member 4's name is spent on a TXS instruction to set the stack pointer. We have just 3 bytes of payload left, and we still need to do a PHA to push A on the stack and an RTS to do the jump. We cannot directly encode an RTS instruction (0x60 is not one of the bytes we have access to), but we can encode a JMP (0x4c). So what we can do is execute an absolute JMP to an RTS somewhere else in the code. But a JMP takes 3 bytes, which leaves no room for the PHA we also need. Therefore we need to jump to a place in memory that executes both a PHA and (eventually) an RTS. There's only one range of addresses that works, 0xaa93 to 0xaa9f. The payload uses 0xaa9f.
It's hard to imagine triggering the exploit any faster: start a new game, walk directly north to the stairs, walk up and down and open the menu a few times. Any improvement can only come from entering the party information faster. Is there a payload (a configuration of party member classes and names) that accomplishes the same exploit, but is faster to enter? That is what I tried and failed to find. My strategy was to keep the basic exploit technique, but generate a number of variations on the payload (that use different opcodes and addresses but have the same effect), then time each of those payloads using an automatic name entry script. These are some of the variations I tried:
  • LAX opcodes other than 0xaf, including 2-byte encodings.
  • Source addresses for LAX other than 0x8db2.
  • Destination addresses for JMP other than 0xaa9f.
  • BVC as an alternative to BCC.
  • Moving the PHA instruction into the payload, which gives more address freedom in the JMP to RTS.
  • Opcodes and placement of NOP instructions, in payloads that are short enough to leave room for a NOP.
4468M used a 3-byte form of the LAX instruction, opcode 0xaf, with an absolute addressing mode. The 2-byte operand is the address of a memory location that contains 0x38. Our repertoire of name characters gives us access to addressing modes of LAX whose encoding is only 2 bytes:
Opcode | Instruction | Bytes
0xa3 | LAX X,ind | 2 bytes
0xa7 | LAX zpg | 2 bytes
0xaf | LAX abs | 3 bytes
0xb3 | LAX ind,Y | 2 bytes
A search of the possible second operand bytes, using the values of the registers at the time of the exploit (X = 0x05, Y = 0x0e), finds just two possibilities for a 2-byte LAX instruction that places a 0x38 or 0x3f into both A and X. The first is LAX (0x59),Y; that is, the indirect, Y-indexed address mode with an operand of 0x59, whose encoding is 0xb3 0x59. The second works only when you hold certain controller inputs at the time of the exploit, in order to stage a 0x38 value in the joy variable at address 0x0020. With this preparation you can use LAX (0x50,X), whose encoding is 0xa3 0x50.
A 2-byte LAX gives us a lot of flexibility. Saving 1 byte means we can put the necessary PHA instruction directly in the payload, which means our JMP only has to hit an RTS, not PHA then RTS. Only needing to hit an RTS gives us a lot more possibilities for jump targets. Alternatively, we could continue jumping to PHA+RTS, and fill in the extra byte in the payload with a NOP instruction (or any 1-byte instruction that is effectively a nop). Or, as a special case in the final position, we can just repeat the previous name byte, because the JMP will happen before it gets executed.
If we must jump to an address that executes PHA and then RTS, the only possibilities are 0xaa93–0xaa9f. I found this using a script; see gameend-search-jmp.fnl in the source code below. Of these possibilities, four are equally fast to enter, following a 0x4c byte: 0xaa95, 0xaa96, 0xaa9a, and 0xaa9f. 0xaa9f is the cleanest, as it points directly to the PHA instruction.
Besides the variations listed above, I also tried things such as substituting 2-byte LDA followed by TAX for 3-byte LAX. See the file gameend-payloads.py in the source code for the full expression of variations:
Language: python

# 4468M payload, with all possibilities for branch instruction and PHA+RTS code address.
# Also consider 2-byte LDA, ADC, EOR, SRE, or RRA followed by TAX as an alternative for 3-byte LAX.
# Also consider 2-byte LDX followed by TXA as an alternative for 3-byte LAX.
(LAX3 | (LDA2 | ADC2 | EOR2) + TAX | LDX2 + TXA) + BRANCH + TXS + JMP_PHA_RTS,
# SRE and RRA can also serve to set A, though they may also change the value
# of processor flags, so try more options for the branch (BVS and BCS in
# addition to BVC and BCC).
(SRE2 | RRA2) + TAX + ALL_BRANCH + TXS + JMP_PHA_RTS,
# When we do 2-byte LDX first, we also have the option of doing TXS before TXA.
(LDX2 + TXS) + BRANCH + TXA + JMP_PHA_RTS,
# 2-byte LAX with jump to PHA+RTS, padding with 1-byte NOPs (or what are
# effectively NOPs) in every possible position. What works as a NOP differs
# in each position.
# Anything that assigns to A or X is a NOP if it happens before LAX.
# (As long as it doesn't interfere with the address mode of LAX.)
# PHA and TXS work too, as we're not using the stack yet. Also include
# TAY, even though it may interfere with the address mode of the LAX.
(NOP | TXA | TYA | TAX | TAY | LSRA | PHA | TXS) + LAX2 + BRANCH + TXS + JMP_PHA_RTS,
# After LAX, A and X are equal, so TXA and TAX are NOPs. We may now
# clobber Y with TAY or DEY. PHA and TXS are still available to us too.
LAX2 + (NOP | TXA | TAX | TAY | DEY | PHA | TXS) + BRANCH + TXS + JMP_PHA_RTS,
# After TXS, we may still interchange A and X, or clobber Y. We can also
# use a second TXS as a NOP. But we cannot use PHA anymore, because the
# stack is now set up.
LAX2 + TXS + BRANCH + (NOP | TXA | TAX | TAY | DEY | TXS) + JMP_PHA_RTS,
# In the final position, don't use a NOP, just repeat the previous byte.
LAX2 + TXS + BRANCH + JMP_PHA_RTS + REP,
# 2-byte LAX with PHA and jump to RTS.
LAX2 + TXS + BRANCH + PHA + JMP_RTS,
Sadly I could not find a use for the TAS instruction :(
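For readers who want to experiment with expressions like these without the repository, here is one way the `|` (alternatives) and `+` (concatenation) operators could be modeled. This is a hypothetical sketch, not the implementation in gameend-payloads.py: the `Alt` class is made up for illustration, and the operand sets are cut down to the few byte values quoted in this post.

```python
from itertools import product


class Alt:
    """A set of alternative byte sequences (hypothetical helper)."""

    def __init__(self, *seqs):
        self.seqs = frozenset(tuple(s) for s in seqs)

    def __or__(self, other):
        # Either this set of sequences or the other one.
        return Alt(*(self.seqs | other.seqs))

    def __add__(self, other):
        # Every sequence of this set followed by every sequence of the other.
        return Alt(*(a + b for a, b in product(self.seqs, other.seqs)))

    def __len__(self):
        return len(self.seqs)

    def __iter__(self):
        return iter(sorted(self.seqs))


# Cut-down building blocks, using only bytes quoted in the text.
LAX3 = Alt([0xaf, 0xb2, 0x8d])                # LAX 0x8db2
BRANCH = Alt([0x90]) | Alt([0x50])            # BCC | BVC
TXS = Alt([0x9a])
JMP_PHA_RTS = Alt(*([0x4c, lo, 0xaa] for lo in (0x95, 0x96, 0x9a, 0x9f)))

# Python gives + higher precedence than |, as the expressions above rely on.
candidates = LAX3 + BRANCH + TXS + JMP_PHA_RTS

for payload in candidates:
    print(" ".join(f"{b:02x}" for b in payload))
```

This yields 1 × 2 × 1 × 4 = 8 candidate byte strings, including the 4468M payload af b2 8d 90 9a 4c 9f aa. Generating a candidate says nothing about whether it works; each one still has to be timed and tested in the emulator.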
In total I generated and timed 8168 candidate payloads. The list, sorted from fastest to slowest, is ff1-gameend-payloads-time.log. There are a few payloads that are just as fast (52 frames) as the one used in 4468M. These are basically the same, [LAX 0x8db2; BCC] [TXS; JMP ADDR], just using equivalent JMP addresses:
[0xaf 0xb2 0x8d 0x90]	22	[0x9a 0x4c 0x95 0xaa]	30	52
[0xaf 0xb2 0x8d 0x90]	22	[0x9a 0x4c 0x96 0xaa]	30	52
[0xaf 0xb2 0x8d 0x90]	22	[0x9a 0x4c 0x9a 0xaa]	30	52
[0xaf 0xb2 0x8d 0x90]	22	[0x9a 0x4c 0x9f 0xaa]	30	52
The LAX 0x8db2 instruction, despite needing 3 bytes to encode, uses characters that are all close to each other on the name entry screen. After that, 1 frame slower, there is a similar batch that uses address 0xc3b5 in place of 0x8db2 in the LAX instruction, and uses BVC in place of BCC: [LAX 0xc3b5; BVC] [TXS; JMP ADDR].
[0xaf 0xb5 0xc3 0x50]	23	[0x9a 0x4c 0x95 0xaa]	30	53
[0xaf 0xb5 0xc3 0x50]	23	[0x9a 0x4c 0x96 0xaa]	30	53
[0xaf 0xb5 0xc3 0x50]	23	[0x9a 0x4c 0x9a 0xaa]	30	53
[0xaf 0xb5 0xc3 0x50]	23	[0x9a 0x4c 0x9f 0xaa]	30	53
The first payloads that are notably different are 3 frames slower overall. I'll highlight just a few of them. One uses a 2-byte LAX and an explicit PHA followed by a JMP to RTS: [LAX (0x59),Y; TXS; BCC] [PHA; JMP 0xc34b].
[0xb3 0x59 0x9a 0x90]	32	[0x48 0x4c 0x4b 0xc3]	23	55
Another uses the space gained by a 2-byte LAX to store a TXA (which is effectively a nop) at the beginning of party member 4's name: [LAX (0x59),Y; TXS; BCC] [TXA; JMP 0xaa9f]. The TXA opcode 0x8a is in the upper left of the name entry palette and requires no cursor movement to enter.
[0xb3 0x59 0x9a 0x90]	32	[0x8a 0x4c 0x9f 0xaa]	23	55
This one uses an alternative 2-byte LAX, no explicit PHA, and a JMP to PHA+RTS. The final name character, which is not executed, can be a repeat of the previous character, which requires no cursor movement: [LAX (0x50,X); TXS; BCC] [JMP 0xaa9f].
[0xa3 0x50 0x9a 0x90]	32	[0x4c 0x9f 0xaa 0xaa]	23	55
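The excerpts above are easy to process mechanically. Here is a minimal sketch of a parser for lines in this shape. The column meanings (party member 2's name bytes, frames to enter them, party member 4's name bytes, frames to enter them, total frames) are my reading of the excerpts rather than a documented format, and `parse_line` is a name made up for illustration.

```python
# Three lines copied from the excerpts above (columns are tab-separated).
LOG_LINES = [
    "[0xaf 0xb2 0x8d 0x90]\t22\t[0x9a 0x4c 0x9f 0xaa]\t30\t52",
    "[0xaf 0xb5 0xc3 0x50]\t23\t[0x9a 0x4c 0x9f 0xaa]\t30\t53",
    "[0xb3 0x59 0x9a 0x90]\t32\t[0x48 0x4c 0x4b 0xc3]\t23\t55",
]


def parse_bytes(field):
    """Turn '[0xaf 0xb2 0x8d 0x90]' into [0xaf, 0xb2, 0x8d, 0x90]."""
    return [int(token, 16) for token in field.strip("[]").split()]


def parse_line(line):
    """Split one log line into (name2, frames2, name4, frames4, total)."""
    name2, frames2, name4, frames4, total = line.split("\t")
    return parse_bytes(name2), int(frames2), parse_bytes(name4), int(frames4), int(total)


for line in LOG_LINES:
    name2, frames2, name4, frames4, total = parse_line(line)
    # Under the assumed column meanings, the last column is the sum of the two timings.
    print(total, frames2 + frames4 == total)
```

In all of the excerpted lines the last column is the sum of the two per-name timings, which is consistent with that reading.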

In summary, I am impressed by the high degree of optimization achieved by [4468] NES Final Fantasy "game end glitch" by AmaizumiUni, Spikestuff & DJ Incendration in 01:36.47. Despite my best efforts, I was not able to improve it. My source code, data files, and development notes are available from git clone https://www.bamsoftware.com/git/ff1.git. The commit as of this writing is d5cd9d40f891729a441eb950da98e371b43167d2. A brief howto is: run gameend-payloads.py to generate a list of candidate payloads in gameend-payloads.fnl. Run gameend-payloads-time.fnl (in BizHawk) to time how long the names take to enter. Run gameend-payloads-test.fnl to check whether they actually result in a game end glitch.