This is the big one.
By dismounting from the stairs in BLK-4-01 and walking through the glitched worlds, it is possible to wrong warp to the final stage, saving a huge amount of time.
While making this run, I tried to work out what is actually going on during the wrong warp.
While I didn't discover any of this myself, here is a write-up detailing its workings.
The wrong warp is effectively an errant write, writing an unexpected value to an unexpected location in memory.
Get ready for some dry, dry theory.
This is the simpler part.
The culprit code is located at $28132 (loaded in at $8132 during runtime) and is executed whenever a room is loaded or the camera crosses a 64-pixel boundary.
The code looks like this:
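LDY #$00        ; Y = 0: first byte of the struct
LDA ($00),Y     ; struct byte 0
STA $07C2,X
BEQ $8178       ; byte 0 was zero: branch away (target not shown here)
INY
LDA ($00),Y     ; struct byte 1
CLC
ADC $09         ; add the zero-page value at $09
STA $07DA,X
LDA $0A
ADC #$00        ; add the carry from the previous addition to $0A
AND #$01        ; keep only bit 0
STA $07E0,X
INY
LDA ($00),Y     ; struct byte 2
STA $07D4,X
INY
LDA ($00),Y     ; struct byte 3
STA $07E6,X
INY
LDA ($00),Y     ; struct byte 4
STA $07CE,X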
...
Effectively, a pointer to a 5-byte struct is dereferenced and the component bytes are written to $7C2,X, $7DA,X, $7D4,X, $7E6,X and $7CE,X, respectively (the second byte has the value at $09 added to it first).
Either $0 or $1 is also written to $7E0,X for good measure.
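To make the listing easier to follow, here is a rough Python model of it. The names `write_struct`, `var09` and `var0a` are made up for illustration, and I am assuming the BEQ target lies past the writes shown, so a zero first byte skips the rest. Addresses are folded into the NES's 2 KiB of internal RAM, which is mirrored every $800 bytes.

```python
RAM_SIZE = 0x800  # NES internal RAM: 2 KiB, mirrored every $800 bytes


def write_struct(ram, struct, x, var09=0, var0a=0):
    """Copy one 5-byte struct into the RAM tables at offset x (hypothetical model)."""

    def store(addr, value):
        ram[addr % RAM_SIZE] = value & 0xFF

    store(0x07C2 + x, struct[0])
    if struct[0] == 0:
        # BEQ taken: assumed to skip the remaining writes in the listing
        return
    total = struct[1] + var09                 # CLC / ADC $09
    store(0x07DA + x, total)
    carry = total >> 8                        # carry out of the 8-bit addition
    store(0x07E0 + x, (var0a + carry) & 0x01)  # LDA $0A / ADC #$00 / AND #$01
    store(0x07D4 + x, struct[2])
    store(0x07E6 + x, struct[3])
    store(0x07CE + x, struct[4])


ram = bytearray(RAM_SIZE)
write_struct(ram, [0x0E, 0x01, 0x02, 0x03, 0x04], x=3)  # a normal X between 0 and 5
print(hex(ram[0x07C5]))
```

With a normal X the first byte simply lands in the 6-byte table at $7C2; the interesting case is what happens when X is far larger than 5.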
Ordinarily, the data is somehow related to the enemies, and X is supposed to be between 0 and 5, writing to a number of 6-byte tables in RAM.
The value for X is loaded from a table located at $2840C ($840C at runtime), containing 32 entries going 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0 and so on, i.e. a mod 6 table.
This table is indexed by the current camera position divided by 64.
However, if the camera position is way out of bounds (2048 or above), we read past the end of the table and pick up a byte of regular program code instead.
If the camera position is between $3540 and $357F, we read from $840C,$D5, which happens to be $70.
Added to $7C2, this gives $7C2 + $70 = $832, which, because the NES mirrors its 2 KiB of RAM, is the same address as $32, the level index.
In other words, with a specific out-of-bounds camera position, we can corrupt the level index.
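The arithmetic can be checked with a few lines of Python. This is only a sketch: `table_x` and `OOB_BYTES` are hypothetical names, and the only out-of-bounds byte filled in is the one quoted above.

```python
MOD6_TABLE = [i % 6 for i in range(32)]  # the 32 in-bounds entries of the table
OOB_BYTES = {0xD5: 0x70}                 # out-of-bounds byte quoted in the text


def table_x(camera):
    """Value loaded into X for a given camera position (hypothetical model)."""
    column = camera >> 6                 # camera position divided by 64
    if column < len(MOD6_TABLE):
        return MOD6_TABLE[column]
    return OOB_BYTES[column]             # KeyError for bytes this sketch doesn't know


def mirrored(addr):
    """Fold an address into the 2 KiB of NES internal RAM."""
    return addr & 0x07FF


x = table_x(0x3540)
target = 0x07C2 + x
print(hex(x), hex(target), hex(mirrored(target)))  # 0x70 0x832 0x32
```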
This part is substantially more complicated, but it also comes down to a combination of camera position and the current room.
As mentioned, we write data from a 5-byte struct into RAM.
These structs are accessed indirectly through a pointer table located at $2A03F.
We must find a struct whose first byte is $0E, the final level index.
Fortunately, several such structs exist, at indices $0D, $11 and several others.
The next part is then how to access this struct.
It is accessed like this, immediately before we write the data (see above):
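LDA $76         ; camera position / 64
ASL A           ; times 2 (8-bit result)
TAY
...
LDA ($98),Y     ; struct index, read through the screen's pointer
ASL A           ; times 2: each pointer table entry is 2 bytes
BCS $811A       ; bit 7 of the index was set: branch away (target not shown here)
TAY
LDA $A03F,Y     ; low byte of the struct pointer
STA $00
LDA $A040,Y     ; high byte of the struct pointer
STA $01
...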
$76 is the camera position divided by 64 as before, which is then multiplied by 2.
The index for the struct is fetched indirectly from a pointer at ($98),Y.
The value of this pointer is determined by the current screen index.
Every screen in the game has a unique pointer.
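As a rough illustration of the lookup, here is the same sequence in Python. It ignores the elided instructions, `struct_pointer` is a made-up name, and the memory contents in the example are invented purely to exercise the code.

```python
def struct_pointer(mem, cam_column, ptr98):
    """Return the struct pointer the listing would store in $00/$01, or None."""
    y = (cam_column << 1) & 0xFF          # LDA $76 / ASL A / TAY (8-bit register)
    index = mem[(ptr98 + y) & 0xFFFF]     # LDA ($98),Y
    if index & 0x80:                      # ASL A shifts bit 7 into carry: BCS taken
        return None
    y = (index << 1) & 0xFF               # TAY
    low = mem[0xA03F + y]                 # LDA $A03F,Y
    high = mem[0xA040 + y]                # LDA $A040,Y
    return low | (high << 8)


# Invented example data: a screen whose pointer is $9400 and whose entry for
# camera column 2 is struct index $11.
mem = bytearray(0x10000)
mem[0x9400 + 4] = 0x11
mem[0xA03F + 0x22] = 0x34
mem[0xA040 + 0x22] = 0xB2
print(hex(struct_pointer(mem, 2, 0x9400)))  # 0xb234
```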
The pointers for the screens are arranged like this in ROM:
$293F1 Block 0 (Beginning)
$29401 Block 1 (Clock Tower)
$29425 Block 2 (Mad Forest)
$29439 Block 3 (Ship)
$29451 Block 4 (Death Tower)
$29463 Block 5 (Bridge)
$2946F Block 6 (Swamp)
$29478 Block 7 (Caves)
$2948F Block 8 (Sunken City)
<Large Gap...>
...
$29BF1 Block 9 (Crypt)
$29BFB Block A (Cliffs)
$29C17 Block B (Rafters)
$29C25 Block C (Entry Hall)
$29C2F Block D (Riddle)
$29C45 Block E (Final Approach)
As you can see, the pointers for the screens are mostly adjacent to each other in ROM, save for one large gap in the middle.
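To put a number on that gap, the distances between consecutive block addresses in the list above can be computed directly. This sketch only subtracts the addresses given here; it does not know how many bytes of the Block 8 span are real screen data and how many are gap.

```python
BLOCKS = [
    (0x293F1, "Block 0 (Beginning)"),
    (0x29401, "Block 1 (Clock Tower)"),
    (0x29425, "Block 2 (Mad Forest)"),
    (0x29439, "Block 3 (Ship)"),
    (0x29451, "Block 4 (Death Tower)"),
    (0x29463, "Block 5 (Bridge)"),
    (0x2946F, "Block 6 (Swamp)"),
    (0x29478, "Block 7 (Caves)"),
    (0x2948F, "Block 8 (Sunken City)"),
    (0x29BF1, "Block 9 (Crypt)"),
    (0x29BFB, "Block A (Cliffs)"),
    (0x29C17, "Block B (Rafters)"),
    (0x29C25, "Block C (Entry Hall)"),
    (0x29C2F, "Block D (Riddle)"),
    (0x29C45, "Block E (Final Approach)"),
]


def spans(blocks):
    """Bytes from the start of each block's pointers to the start of the next block's."""
    return [(name, nxt - addr) for (addr, name), (nxt, _) in zip(blocks, blocks[1:])]


for name, size in spans(BLOCKS):
    print(f"{name:26} ${size:03X}")
```

Every span is a few dozen bytes at most, except the one starting at Block 8, which is $762 bytes long.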
The combination of screen and camera position is used to index the struct pointer table at $2A03F.
However, we need to have a specific camera position (see above).
This means that we read from the screen pointer itself out of bounds, but it also limits the number of possible rooms we can use for this.
Unfortunately, no valid room in the game is practical to use with the camera position that we need.
However, the gap in between those screens is our saviour.
Most of the bytes in this gap are $00, which means that $98 may be loaded with zero, which allows us to read an index from zero page!
We read from $D5 << 1 = $1AA, so in practical terms, since the Y register only holds 8 bits, we read a value from $AA to use as an index for the table of structs.
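The wrap-around is easy to verify. This snippet assumes both bytes of the pointer at $98 are zero and ignores anything the elided instructions might do to Y.

```python
cam_column = 0x3540 >> 6        # camera position divided by 64
doubled = cam_column << 1       # what ASL A would produce with unlimited bits
y = doubled & 0xFF              # the 8-bit register drops the top bit
ptr98 = 0x0000                  # assumption: pointer read from the gap is all zeroes
effective = (ptr98 + y) & 0xFFFF

print(hex(cam_column), hex(doubled), hex(y), hex(effective))  # 0xd5 0x1aa 0xaa 0xaa
```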
The task then becomes finding a way to get the correct index ($0D, $11, etc.) into $AA to read it.
Unfortunately, again, under normal circumstances $AA doesn't contain a suitable value.
We therefore need to repeat this whole process to get $AA to contain $11, which finally enables the wrong warp to work.
This is in itself not trivial, but at this point I decided it was simpler to just copy the old route and be done with it.
Both the previous run and this run go to the swamp because, as the list above shows, the screens for the swamp are physically closer in ROM to the invalid gap: we skip several blocks' worth of screens that do not contain usable wrong warps.
At this point, we simply need to venture enough screens out of bounds to hit a screen that points to $00.
Another potential approach could be to go out of bounds in the clock tower and use yet more memory corruption to get a large enough screen index to reach the gap early.
I wrote some tools to find possible rooms and camera positions to find memory corruptions in this game.
Messy as the code is, it helped immensely in figuring out the possibilities.
It is also worth noting that, unfortunately, there is no credits warp using this method.
A credits warp would work by setting the gamestate variable $1E to $0C.
Unfortunately, $0C only ever occurs in the first byte of the 5-byte structs, and there is no suitable offset to add to $7C2 to get to $81E.
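For reference, the offset that would be needed can be worked out the same way as for the level index. This is a small sketch with a hypothetical helper, `required_x`; it only computes which X would hit the target, not whether any reachable table read produces it.

```python
RAM_MIRROR = 0x800  # NES internal RAM repeats every $800 bytes


def required_x(target, base=0x07C2):
    """All 8-bit X values for which base + X lands on target or one of its mirrors."""
    return [x for x in range(0x100)
            if (base + x) % RAM_MIRROR == target % RAM_MIRROR]


print([hex(x) for x in required_x(0x32)])  # level index: ['0x70']
print([hex(x) for x in required_x(0x1E)])  # gamestate:   ['0x5c']
```

So a credits warp would need X to be $5C, and the point above is that no usable table read yields that value.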
Knowing this would have saved me a lot of frustration and hours wasted many years ago.
In theory, the wrong warp is simple and repeatable: just go to the right screen with the right camera position, and make sure any variables in zero page are set correctly if required.
In practice, every time the screen changes, other variables are overwritten as well, causing all manner of strange effects and instability.
In addition, actually getting the correct camera positions is also easier said than done.
I still haven't fully deciphered how the camera position gets set between screen changes in every case.
The current wrong warp is an utter miracle as it stands, because the specific sequence of room changes it executes happens to be exactly right to prevent the game from freezing or otherwise glitching out with undesired effects.
I do not know how much research the original discoverer of the glitch did, but it is nothing short of stunning that it is as optimal as it is.
Not only does the warp work, but it is also repeatable enough that real time runners successfully do it.
It is still not trivial, however.
While the wrong warp itself is repeatable, the screens you come across in the glitched worlds are somewhat based on RNG, so it is very possible to get dismounted from the stairs and die or get stranded in the glitched world.
As for the J version of the game, the principles of the wrong warp apply just the same, but some of the exact values you get are different.
The J version is also substantially more unstable, causing freezes and general havoc much more easily.
Phew.