Post subject: Deadlock in JPC-RR when restoring a savestate from Lua
Sand
He/Him
Player (143)
Joined: 6/26/2018
Posts: 176
I tried to using a Lua script to optimize a short segment of a game (find the best frame to jump within a window of 15 frames). The script I wrote reliably deadlocks JPC-RR: it sits at 0% CPU and the interface is unresponsive. A small patch to JPC-RR avoids the problem, but hinders interactive use. I'm using version 11.6-WIP, built from source based on commit bf3a341ad233439e21a32f33dffc1e730668121d. This sample script should be sufficient to reproduce the deadlock. It doesn't depend on any specific game. Just load any savestate and run it. You may have to press S to get it started. I'm not sure the wait_for_event function is the best way to do frame advance: that was some trial-and-error coding on my part.
function jump()
	-- Press a key, or do whatever is necessary to jump.
end

function get_metric()
	-- Read the character's y position.
	return jpcrr.read_word(0x1000)
end

-- Call jpcrr.wait_event until we get an event of the right type, taking care
-- of "lock" events meanwhile.
function wait_for_event(target_type)
	while true do
		local etype, emsg = jpcrr.wait_event()
		if etype == "lock" then
			jpcrr.release_vga()
		end
		if etype == target_type then
			return etype, emsg
		end
	end
end

-- Restore savestate, then run for n frames, initiating a jump at frame j.
function trial(savestate, j, n)
	-- Stop emulator, if running.
	if jpcrr.status().running then
		jpcrr.pc_stop()
		wait_for_event("stop")
	end

	-- Restore initial state.
	jpcrr.load_state_normal(savestate)
	wait_for_event("attach")

	jpcrr.pc_start()
	for i=0,n do
		if i == j then
			jump()
		end
		-- Advance frame.
		wait_for_event("lock")
	end
	print(j, get_metric())
end

local savestate = jpcrr.save_state("init.jrsr")
for j=0,14 do
	trial(savestate, j, 15)
end
jpcrr.pc_stop()
I did a little debugging of this. The deadlock is caused by several threads waiting on a synchronization monitor. I suspect there is a race condition because it is not completely deterministic: it usually happens on the first loop iteration, but may occasionally happen later. A thread dump (kill -QUIT $(pidof java)) shows many backtraces including these three that are waiting on a org.jpc.plugins.PCControl:
"VGA output cycle thread" #42 prio=6 os_prio=0 tid=0x00007f84a40b3800 nid=0x4b97 in Object.wait() [0x00007f84add9b000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x0000000085bcb4c0> (a org.jpc.output.OutputStatic)
        at java.lang.Object.wait(Object.java:502)
        at org.jpc.output.OutputStatic.waitReaders(OutputStatic.java:221)
        - locked <0x0000000085bcb4c0> (a org.jpc.output.OutputStatic)
        at org.jpc.output.OutputStatic.addFrame(OutputStatic.java:127)
        at org.jpc.output.Output.addFrame(Output.java:76)
        at org.jpc.output.OutputChannel.addFrame(OutputChannel.java:78)
        at org.jpc.output.OutputChannelVideo.addFrameVideo(OutputChannelVideo.java:47)
        at org.jpc.emulator.VGADigitalOut.holdOutput(VGADigitalOut.java:48)
        at org.jpc.plugins.PCControl.doCycleDedicatedThread(PCControl.java:1477)
        - locked <0x0000000085a9dd70> (a org.jpc.plugins.PCControl)
        at org.jpc.plugins.PCControl.access$1300(PCControl.java:80)
        at org.jpc.plugins.PCControl$9.run(PCControl.java:1486)
        at java.lang.Thread.run(Thread.java:748)

"Plugin thread for org.jpc.plugins.PCControl" #25 prio=5 os_prio=0 tid=0x00007f8510470800 nid=0x4b8b waiting for monitor entry [0x00007f84adc9a000]                                                                                                                                                                            
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.jpc.plugins.PCControl.main(PCControl.java:419)
        - waiting to lock <0x0000000085a9dd70> (a org.jpc.plugins.PCControl)
        at org.jpc.pluginsbase.Plugins$PluginThread.run(Plugins.java:485)

"AWT-EventQueue-1" #15 prio=6 os_prio=0 tid=0x00007f8510408800 nid=0x4b82 waiting for monitor entry [0x00007f84ad387000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.jpc.plugins.PCControl.start(PCControl.java:1261)
        - waiting to lock <0x0000000085a9dd70> (a org.jpc.plugins.PCControl)
        at org.jpc.plugins.PCControl$5.run(PCControl.java:484)
        at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:301)
        at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
        at java.awt.EventQueue.access$500(EventQueue.java:97)
        at java.awt.EventQueue$3.run(EventQueue.java:709)
        at java.awt.EventQueue$3.run(EventQueue.java:703)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:80)
        at java.awt.EventQueue.dispatchEvent(EventQueue.java:728)
        at org.GNOME.Accessibility.AtkWrapper$5.dispatchEvent(AtkWrapper.java:700)
        at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
        at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
I tried this simple patch to JPC-RR. It works, in that it allows the script to run without deadlocking, but it has the side effect of preventing the screen from being updated immediately after loading a savestate, which is awkward when working interactively.
--- a/org/jpc/plugins/PCControl.java
+++ b/org/jpc/plugins/PCControl.java
@@ -1483,7 +1483,7 @@ e.printStackTrace();
     {   
         final PC _xpc = _pc;
         cycleDone = false;
-        (new Thread(new Runnable() { public void run() { doCycleDedicatedThread(_xpc); }}, "VGA output cycle thread")).start();
+        // (new Thread(new Runnable() { public void run() { doCycleDedicatedThread(_xpc); }}, "VGA output cycle thread")).start();
         while(cycleDone)
             try {
                 synchronized(this) {
Site Admin, Skilled player (1257)
Joined: 4/17/2010
Posts: 11539
Location: Lake Char­gogg­a­gogg­man­chaugg­a­gogg­chau­bun­a­gung­a­maugg
Since Ilari doesn't visit the forum, you better ask him directly at freenode #lsnes (since I don't think there's a dedicated room for jpcrr).
Warning: When making decisions, I try to collect as much data as possible before actually deciding. I try to abstract away and see the principles behind real world events and people's opinions. I try to generalize them and turn into something clear and reusable. I hate depending on unpredictable and having to make lottery guesses. Any problem can be solved by systems thinking and acting.
Post subject: Re: Deadlock in JPC-RR when restoring a savestate from Lua
Sand
He/Him
Player (143)
Joined: 6/26/2018
Posts: 176
Sand wrote:
I'm using version 11.6-WIP, built from source based on commit bf3a341ad233439e21a32f33dffc1e730668121d.
A small update on this. I'm working on a new DOS TAS and I decided to try the more recent JPC-RR r11.8-rc2, which includes a commit with the promising log message "Fix utterly broken repaint code" that touches the same code as the patch in my previous post. Unfortunately, this version seems to have trouble with repeated reloading of savestates from Lua (see Post #500149). I tried reverting the commit "Only mark PC as truly stopped after output cycle completes" as well as the slightly more recent r11-maint branch with its commit "Try to fix race condition in restart from stop via Lua", but I couldn't make my script work. So I switched back to r11.6-WIP with the anti-deadlock patch. Luckily, movie files seem to be portable between the two versions. You can see me trying to make r11.8-rc2 work in this video between 0:48:00 and 2:17:00.