Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
I'll be using this opening post to keep track of current progress on console verifications and other testing.

Currently testing:
- HDMA edge cases in single speed mode
_____________________________________
Dev Build: https://ci.appveyor.com/project/zeromus/bizhawk-udexo/build/artifacts
Repo for input timestamp files: https://github.com/alyosha-tas/GBI_timestamps
Dump script for 2.6.1 and older: http://tasvideos.org/userfiles/info/68761073876269603
Dump script for 2.6.2 and up: http://tasvideos.org/userfiles/info/71650785644185599

NOTE: For double speed mode games, set 'TotalExecutedCycles Return Value' to GBI under GB->Settings in BizHawk.
_____________________________________
Several games use uninitialized RAM in a way that causes desyncs on real hardware. Below is a list of some examples I found:

Race Days (Select Dirty Racing): This game checks a memory address to see if a block needs to be initialized. It does this many times in different locations. The same movie will desync in different ways on each playthrough on console. Ex:
02EB:  EA 24 C2  LD   (#C224h),A         A:00 F:90 B:A8 C:FC D:02 E:00 H:A0 L:00 SP:DFFA Cy:61216123 LY:128 ZnhCie
Super R.C. Pro Am: This game checks 4 locations in memory before initializing them. These could be related to linking, but I'm not really sure. In my testing there seems to be a high probability of these cases passing on a real console:
FFCD > 0x02 (FFCD unwritten)

FFCC != 0xDE (checks 3 times, FFCC unwritten)

FFCC != 0xDE (checks 3 times, FFCC unwritten)

FFC9 != FFC8 (FFC9 unwritten, FFC8 written 1)
Pokemon Crystal: When loading up the credits sequence, it checks player state in the wrong bank. This doesn't seem to have any real impact, but it could cause a desync if you wanted to continue a movie past the credits:
4384:  FA 5D D9  LD   A,(#D95Dh)         A:00 F:80 B:C2 C:91 D:0A E:00 H:C2 L:91 SP:C0B9 Cy:323183331 LY:149 ZnhciE
4387:  FE 01     CP   #01h               A:00 F:80 B:C2 C:91 D:0A E:00 H:C2 L:91 SP:C0B9 Cy:323183347 LY:149 ZnhciE
Daiku no Gen-san - Ghost Building Company (Japan): Uses uninitialized RAM at the start of level 2, which causes a desync towards the end of the level. It looks like all it does is use these values for the starting xscroll, possibly for the clouds.

Battletoads: Reads from HRAM at the start of level one to set some values in WRAM. This doesn't affect the TAS but might affect casual play, not sure.
09F9:  F0 C2     LDH  A,(#FFC2h)         A:24 F:00 B:00 C:20 D:C9 E:18 H:9A L:C3 SP:CAFC Cy:105661311 LY:0 znhcie
Editor, Emulator Coder
Joined: 8/7/2008
Posts: 1156
My only request is that you make an effort to prepare it for a higher level of debugging quality, even though it can't be used yet, chiefly by giving it the ability to step and exit from emulation mid-frame. We're not likely to address the problem of bizhawk-cant-handle-debugging very soon, but we're more likely to do it sooner if we have a nice in-house c# core to practice with. However, this goal will come into conflict with performance. My philosophy is, if you want it to be _fast_, you shouldn't do it in c# in the first place. A fluffy c# core should be for features, instead.
MarbleousDave
He/Him
Player (12)
Joined: 9/12/2009
Posts: 1555
I would also ask for one request. The one I'm dying to see happen is emulating four Game Boys at once and linking them together for games like F-1 Race and Yoshi's Cookie.
Skilled player (1706)
Joined: 9/17/2009
Posts: 4952
Location: ̶C̶a̶n̶a̶d̶a̶ "Kanatah"
Well, apparently MUGG found some glitches in GB games that behave differently depending on the emulator:
1. Daffy Duck The Marvin Missions
2. The Smurfs
3. Wario Land II (there's a Game Boy (non-Color) version of this too)
So maybe in the long term, after the core has matured enough, these can be looked into?
Editor, Expert player (2313)
Joined: 5/15/2007
Posts: 3855
Location: Germany
Here is also an old post about emulation bugs in GB games, if you have use for it. http://tasvideos.org/forum/viewtopic.php?p=260154#260154
Noxxa
They/Them
Moderator, Expert player (4139)
Joined: 8/14/2009
Posts: 4083
Location: The Netherlands
While we're at it with the requests, I have one - bootleg mappers. In particular, no TAS-capable emulator can currently run Terrifying 911 - the only emulator that runs it currently is hhugboy.
http://www.youtube.com/Noxxa <dwangoAC> This is a TAS (...). Not suitable for all audiences. May cause undesirable side-effects. May contain emulator abuse. Emulator may be abusive. This product contains glitches known to the state of California to cause egg defects. <Masterjun> I'm just a guy arranging bits in a sequence which could potentially amuse other people looking at these bits <adelikat> In Oregon Trail, I sacrificed my own family to save time. In Star trek, I killed helpless comrades in escape pods to save time. Here, I kill my allies to save time. I think I need help.
Editor, Emulator Coder
Joined: 8/7/2008
Posts: 1156
mothrayas, the most expedient way for you to solve that is to report it to Gambatte upstream
Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
I'm making decent progress here. So far I'm doing everything in single clock steps, so breaking out into a debugger should be simple. Trace logging works for the new CPU core, and I'm able to compare logs to Gambatte. So far it makes it through the BIOS correctly (even though there is no rendering yet, so there is nothing to see). There are a couple of cycles of variation in the timing of the scanline counter, so it doesn't exactly match Gambatte, but all that stuff is still very preliminary (and the deviation doesn't show up until about 13000000 cycles in, so I guess it's pretty small). Savestates are already functioning as well, so all the basic foundations of the core are coming together.

Next steps are to get rendering going. I have sprite evaluation and the timer done; I just need to do all the actual rendering steps and start firing interrupts and such. Once that's in place I can start verifying the CPU and working through all the various test ROMs. I think rendering shouldn't be too bad. Certainly the biggest hurdle will be the dreaded APU, but that's still a ways off.

Also I looked at Terrifying 911. At first I thought that was going to be some kind of murder mystery that starts off with a phone call to 911, but after googling it ... it's something else entirely 0_0
Experienced player (632)
Joined: 11/23/2013
Posts: 2208
Location: Guatemala
Mothrayas wrote:
While we're at it with the requests, I have one - bootleg mappers. In particular, no TAS-capable emulator can currently run Terrifying 911 - the only emulator that runs it currently is hhugboy.
Rockman 8 by YongYong and Beast Fighter by Sachen also only work on hhugboy.
Here, my YouTube channel: http://www.youtube.com/user/dekutony
Editor, Emulator Coder, Site Developer
Joined: 5/11/2011
Posts: 1108
Location: Murka
Alyosha wrote:
Also this won't be a gameboy color core, which I would rather do separately (even if it means copying a majority of the code.)
A GameBoy color with 0xff4c not written to is 99.998% identical to a regular GameBoy, so there's not much of a point to that.
Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
So far so good. It's a good thing the Gambatte core already exists in BizHawk; it probably would have taken me 10x as long to debug some of these tests without it. Instruction timing tests pass as well, so the basics of the CPU are functioning correctly. Next steps will be memory access timing, timer tests, and interrupt timing. Once those are done, the core should be on a pretty firm footing to start on the more difficult stuff.
Post subject: Gambatte has a graphic glitch with Emo Dao
Judge, Skilled player (1289)
Joined: 9/12/2016
Posts: 1645
Location: Italy
Gambatte has a graphic glitch with Emo Dao, please see if there is something you can do about it. VBA does not feature these broken sprites.
my personal page - my YouTube channel - my GitHub - my Discord: thunderaxe31 <Masterjun> if you look at the "NES" in a weird angle, it actually clearly says "GBA"
Post subject: Re: Gambatte has a graphic glitch with Emo Dao
Warepire
He/Him
Editor
Joined: 3/2/2010
Posts: 2174
Location: A little to the left of nowhere (Sweden)
ThunderAxe31 wrote:
Gambatte has a graphic glitch with Emo Dao, please see if there is something you can do about it. VBA does not feature these broken sprites. --image--
Do you know if it's supposed to be glitched or not? There have been cases with the SNES library where a glitched behavior is the correct one.
Post subject: Re: Gambatte has a graphic glitch with Emo Dao
Judge, Skilled player (1289)
Joined: 9/12/2016
Posts: 1645
Location: Italy
Legit question. In fact, I was about to delete my post. This screenshot comes from a recording done on real hardware: https://www.youtube.com/watch?v=93ySxbVIY_Y
Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
@ThunderAxe31: I'm not following. Is the glitch supposed to be there or not?

Making good progress so far. Right now all of the CPU, DMA, and timer tests from Gekkio's mooneye gb page pass, so I'm pretty confident the core is internally consistent and accurate up to this point. Background and sprites are working. I just need to do windowing and graphics will be initially done (though not in a cycle accurate way yet, that will be the next step). I have a couple of bugs in some register bits and interrupt timing left, but those should be easy enough to work out, so games should be fully playable (except for sound) pretty soon.
Judge, Skilled player (1289)
Joined: 9/12/2016
Posts: 1645
Location: Italy
Alyosha wrote:
@ThunderAxe31: I'm not following. Is the glitch supposed to be there or not?
It is supposed to be there. If you look carefully at the upper left of my screenshot from the YouTube video, you can see some whitish lines. These should be the glitched sprites just before disappearing, because the player is walking slowly through the level. Also note that the glitch happens on BGB as well.
Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
Since I had some time today I decided to see if I could get the last graphics bit, windowing, working. As it turns out, I really should only try to do one thing at a time, because a large chunk of the time I spent on this was just trying to remember what the heck I was doing. In the end though it worked just fine. This scene from Mega Man 5 is actually pretty challenging to get right graphically, so it's a good sign that things are working well:

A few more small timing things to clean up and it will be time to start audio. I also did some performance analysis to see if I could learn anything from it, since this core runs much faster than NESHawk while not really having that much less to do. One thing I did learn is that statements like this:
state = divider_reg.Bit(9);
are really very expensive. Maybe knowing that can help me speed things up in other places.
creaothceann
He/Him
Editor
Joined: 4/7/2005
Posts: 1874
Location: Germany
Is that line using bit masking and shifting? You could also store each emulated bit as its own boolean.
Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
creaothceann wrote:
Is that line using bit masking and shifting? You could also store each emulated bit as its own boolean.
Yeah it does this:
public static bool Bit(this byte b, int index)
{
	return (b & (1 << index)) != 0;
}
It also seems like Bools are just slow altogether, not sure.
Joined: 5/24/2004
Posts: 262
Alyosha wrote:
creaothceann wrote:
Is that line using bit masking and shifting? You could also store each emulated bit as its own boolean.
Yeah it does this:
public static bool Bit(this byte b, int index)
{
	return (b & (1 << index)) != 0;
}
It also seems like Bools are just slow altogether, not sure.
I would posit that it's the sheer number of function calls to Bit() that eat away at performance. Can you run a disassembler to determine if the JIT compiler is inlining those calls? If it isn't, you could try the AggressiveInlining attribute that was mentioned in the NesHawk thread. I haven't done a huge amount of performance optimization in C#, but it's pretty fascinating.
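For anyone who wants to try it, here is a minimal sketch of what the attribute looks like on an extension method like Bit(). The int overload and the class name BitExtensions are assumptions for illustration (the snippet quoted above takes a byte). MethodImplOptions.AggressiveInlining is a real .NET option from System.Runtime.CompilerServices, but it is only a hint to the JIT, so checking the disassembly is still the way to know what actually happened.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper class, named here only for the example.
public static class BitExtensions
{
    // Ask the JIT to inline this tiny method at its call sites.
    // This is a hint, not a guarantee.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool Bit(this int value, int index)
    {
        return (value & (1 << index)) != 0;
    }
}

public static class Program
{
    public static void Main()
    {
        int dividerReg = 0x200;               // only bit 9 is set
        Console.WriteLine(dividerReg.Bit(9)); // tests bit 9
        Console.WriteLine(dividerReg.Bit(3)); // tests bit 3
    }
}
```

If the profile doesn't change with the attribute applied, that would suggest the JIT was already inlining the call and the cost is somewhere else.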
Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
Andypro wrote:
I would posit that it's the sheer number of function calls to Bit() that eat away at performance. Can you run a disassembler to determine if the JIT compiler is inlining those calls? If it isn't, you could try the AggressiveInlining attribute that was mentioned in the NesHawk thread. I haven't done a huge amount of performance optimization in C#, but it's pretty fascinating.
Tried aggressive inlining: no effect, probably the compiler already does it.
Tried removing the Bit call altogether and just using explicit statements like (value & 0x200) > 0: no effect.
Tried changing to integers and rewriting the logic: small but positive effect, not sure it's worth the reduced clarity.
But the most effective change was putting the condition most likely to be false first in the 'if' statement later on in that function. >___> I guess I need optimization 101.

EDIT: Here's the complete thing if anyone sees anything obvious. According to the performance profiler, this tiny block takes about 1/4 as long to run as the entire CPU function 0_0. So something about it must be super slow:
		public void tick_2()
		{
			divider_reg+=1;

			// pick a bit to test based on the current value of timer control
			switch (timer_control & 3)
			{
				case 0:
					state = divider_reg & 0x200;
					break;
				case 1:
					state = divider_reg & 0x8;
					break;
				case 2:
					state = divider_reg & 0x20;
					break;
				case 3:
					state = divider_reg & 0x80;
					break;
				default:
					break;
			}

			// And it with the state of the timer on/off bit
			state_c = timer_control & 0x4;

			// this procedure allows several glitchy timer ticks, since it only measures falling edge of the state
			// so things like turning the timer off and resetting the divider will tick the timer
			if ((state == 0 || state_c == 0) && (old_state > 0 && old_state_c > 0))
			{
				timer_old = timer;
				timer+=1;

				// if overflow, set the interrupt flag and reload the timer (4 clocks later)
				if (timer < timer_old)
				{
					pending_reload = 4;
					reload_block = false;
				}
			}

			old_state = state;
			old_state_c = state_c;

		}
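To make the falling-edge behavior easier to poke at outside the core, here is a cut-down, self-contained sketch of the same logic. The class name TimerSketch, the public fields, and the byte-sized timer are assumptions made for the example, and the reload handling (pending_reload, reload_block) is left out on purpose. It only shows when the timer increments.

```csharp
using System;

// Stand-alone sketch of the tick logic above; not the actual core code.
public class TimerSketch
{
    public int DividerReg;
    public int TimerControl;
    public byte Timer;          // assumed 8-bit, wraps on overflow
    private int oldState;
    private int oldStateC;

    public void Tick()
    {
        DividerReg += 1;

        // pick a divider bit based on the low two bits of timer control
        int state;
        switch (TimerControl & 3)
        {
            case 0: state = DividerReg & 0x200; break;
            case 1: state = DividerReg & 0x8; break;
            case 2: state = DividerReg & 0x20; break;
            default: state = DividerReg & 0x80; break;
        }

        // timer on/off bit
        int stateC = TimerControl & 0x4;

        // increment on a falling edge of (selected bit AND enable bit)
        if ((state == 0 || stateC == 0) && (oldState > 0 && oldStateC > 0))
        {
            Timer++;
        }

        oldState = state;
        oldStateC = stateC;
    }
}

public static class Program
{
    public static void Main()
    {
        // Normal case: enabled, bit 3 selected, 32 ticks.
        var normal = new TimerSketch { TimerControl = 0x5 };
        for (int i = 0; i < 32; i++) normal.Tick();
        Console.WriteLine("normal: " + normal.Timer);

        // Glitch case: disable the timer while the selected bit is high.
        var glitch = new TimerSketch { TimerControl = 0x5 };
        for (int i = 0; i < 8; i++) glitch.Tick();
        glitch.TimerControl = 0x1;
        glitch.Tick();
        Console.WriteLine("glitch: " + glitch.Timer);
    }
}
```

In the second case the enable bit drops while bit 3 of the divider is still set, so the combined signal falls and the timer ticks once even though the divider bit itself never fell. That is the kind of glitchy tick the comment in the code is talking about.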
Joined: 6/29/2016
Posts: 53
Just a few random ideas: The "switch (timer_control & 3)" part might be faster as a lookup table? e.g:
static int[] mask = new int[4] { 0x200, 0x8, 0x20, 0x80 };
state = divider_reg & mask[timer_control & 3];
Might just end up being pretty similar, though.
if (timer < timer_old)
Couldn't this just be if(timer == 0) or if(timer == Int32.MinValue) (or whatever the data type used here) - should technically be faster than comparing to another variable. I imagine that's probably not really causing much of a performance issue, either, though.

If you don't actually need the values of old_state & old_state_c elsewhere, maybe try something like this?
if(state == 0 || state_c == 0) {
   if(oldStatesNotZero) {
      ...
   }
   oldStatesNotZero = false;
} else if(state > 0 && state_c > 0)  {
   oldStatesNotZero = true;
}
Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
I had actually tested the lookup table earlier, but it is slower by a LOT. I guess I could try with unsafe too, but it's not worth it just for that. I also tried stringing together '?' operators, but that had only a marginal effect. I also tried just replacing the switch with ordinary logic elements, but that wasn't much faster either.

Overall it's currently ~30% faster than when I started, and I guess that's about as fast as it's going to get. It doesn't matter much for this case, but I thought I could learn something more from it since it's so simple and obviously slow. Oh well though, too much other stuff to do to get hung up on it.

EDIT: I tried your (Sour) other suggestions but they didn't result in improvements. Thanks for coming up with things to try though.
creaothceann
He/Him
Editor
Joined: 4/7/2005
Posts: 1874
Location: Germany
How about this?
Language: c

public void tick_2()
{
	divider_reg += 1;

	// pick a bit to test based on the current value of timer control
	// switch (timer_control & 3) {
	//   case 1: state = divider_reg & 0b0000001000; break;
	//   case 2: state = divider_reg & 0b0000100000; break;
	//   case 3: state = divider_reg & 0b0010000000; break;
	//   case 0: state = divider_reg & 0b1000000000; break;
	// }
	int i = timer_control & 3;
	i = (i - 1) & 3;
	i = 0b0000001000 << (i * 2);
	state = divider_reg & i;

	// AND it with the state of the timer on/off bit
	state_c = timer_control & 0x4;

	// this procedure allows several glitchy timer ticks, since it only measures falling edge of the state
	// so things like turning the timer off and resetting the divider will tick the timer
	// if ((state == 0 || state_c == 0) && (old_state > 0 && old_state_c > 0)) {
	if (((state & state_c) == 0) && ((old_state | old_state_c) != 0)) {
		timer_old = timer;
		timer += 1;

		// if overflow, set the interrupt flag and reload the timer (4 clocks later)
		if (timer < timer_old) {
			pending_reload = 4;
			reload_block = false;
		}
	}

	old_state = state;
	old_state_c = state_c;
}
Alyosha
He/Him
Editor, Expert player (3532)
Joined: 11/30/2014
Posts: 2728
Location: US
^ About the same as the current one. (But I believe your new condition statement is incorrect, as state and state_c never have the same bits set.)

So to summarize so far:
- putting the failing condition first is the biggest improvement
- bool to int is also a noticeable but smaller improvement
- lookup table has a negative effect
- everything else has more or less no effect
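A quick way to check both halves of that rewrite is a small stand-alone sketch like the one below. It compares the shift formula against the switch for all four timer control selections, then brute-forces the two versions of the 'if' condition over every combination of set/clear inputs. The class and method names are made up for the example. Only the expressions themselves come from the code in this thread.

```csharp
using System;

// Hypothetical test harness; only the expressions are taken from the thread.
public static class ConditionCheck
{
    // Mask selection as written in the switch statement.
    public static int MaskBySwitch(int timerControl)
    {
        switch (timerControl & 3)
        {
            case 0: return 0x200;
            case 1: return 0x8;
            case 2: return 0x20;
            default: return 0x80;
        }
    }

    // Mask selection using the shift formula from the rewrite.
    public static int MaskByShift(int timerControl)
    {
        int i = ((timerControl & 3) - 1) & 3;
        return 0x8 << (i * 2);
    }

    public static bool OriginalCondition(int state, int stateC, int oldState, int oldStateC)
    {
        return (state == 0 || stateC == 0) && (oldState > 0 && oldStateC > 0);
    }

    public static bool RewrittenCondition(int state, int stateC, int oldState, int oldStateC)
    {
        return ((state & stateC) == 0) && ((oldState | oldStateC) != 0);
    }

    // Count input combinations where the two conditions disagree.
    public static int CountMismatches(int mask)
    {
        int[] states = { 0, mask };   // selected divider bit clear / set
        int[] enables = { 0, 0x4 };   // timer enable bit clear / set
        int mismatches = 0;
        foreach (int s in states)
            foreach (int c in enables)
                foreach (int os in states)
                    foreach (int oc in enables)
                        if (OriginalCondition(s, c, os, oc) != RewrittenCondition(s, c, os, oc))
                            mismatches++;
        return mismatches;
    }

    public static void Main()
    {
        for (int tc = 0; tc < 4; tc++)
        {
            Console.WriteLine($"timer_control={tc}: switch=0x{MaskBySwitch(tc):X} shift=0x{MaskByShift(tc):X}");
        }
        Console.WriteLine($"condition mismatches for mask 0x8: {CountMismatches(0x8)} of 16");
    }
}
```

The masks agree for all four selections, so the shift formula is a valid replacement for the switch. The conditions do not agree: since the mask bit and 0x4 never overlap, (state & state_c) is always 0, and the OR on the old values fires when only one of them was set.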