Makes me curious how many stages it can complete, and whether any boss stages can be done. lol
Same with that stage with the dinos that you're supposed to beat under a certain time to get a secret exit or something.
Any level that requires going to the left is impossible. If the route is complicated or needs some item or strategy, the odds of the bot succeeding are negligible, even after billions of years of playing.
I have updated the restart counter that saves data to the pool file.
http://pastebin.com/k0Ww4jJF
I made minor changes to the code so that the restart count is saved in the pool array rather than being saved to and read from a plain variable. This also fixes the restart counter losing the number of restarts when starting MarI/O back up after toggling it off and on. The restart count is also shown in the Fitness option menu.
You can thank me later, whenever you get the chance, or not at all. Just remember that this code was made by SethBling and modified by me. Please don't claim it as your own.
Complicated routes would be possible with more advanced fitness criteria. The way I see it, the fitness tries to measure progress towards the goal. For most levels, that simplifies to how far to the right MarI/O got.
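To make "how far to the right" concrete, here is a minimal Python sketch (not the script's actual Lua; the function name and the `time_penalty` value are assumptions for illustration). It scores a run by the rightmost x position reached, minus a small cost per frame so that standing still is not free.

```python
def progress_fitness(x_positions, time_penalty=0.5):
    """Score a run by the furthest-right x reached, minus a per-frame time cost.

    x_positions: Mario's x position sampled once per frame.
    time_penalty: assumed cost per frame (hypothetical value).
    """
    if not x_positions:
        return 0.0
    rightmost = max(x_positions)
    frames = len(x_positions)
    return rightmost - time_penalty * frames


# Two runs reach the same spot; the one that dawdles scores lower.
slow_run = [16, 40, 90, 150, 150, 150]
fast_run = [16, 40, 90, 150]
# A run that only walks left never beats its starting x.
left_run = [100, 80, 60]
```

With a measure like this, a level that requires going left earns nothing for the leftward part, which matches the point made earlier in the thread.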
Anyone know of a way to view the run with the max fitness? Maybe a way to automatically record it as a movie?
I'm sure that if you found what triggers the finish line song/animation etc., you could add that to the Lua so that when it's triggered, the script saves that completed state, starts recording from the beginning, and, after the movie finishes, stops the recording, the game, and the script.
I have been running this for ~3 days. The max score, 4502, was achieved after 2 days. Now Mario still scores around the 4000-4200s with occasional bad runs. He is at generation 41, species 30, genome 5. He usually goes to Yoshi's Island 3. He now also avoids the green chimney (which drops to another level). Interestingly, he has evolved into a kangaroo. Glitches in the script include the dialog boxes that pop up. Dismissing these requires pressing the 'z' key, but the script times out and considers the dialog a failure. Increasing the timeout does not solve this bug. I could upload the DPI.state.pool file (~700k) if anyone is interested, but this site does not allow uploads.
What would happen if you set the survivors of each genome to a higher standard? So instead of taking the best 50%, set it to take the best 75% or something. There would be fewer survivors but maybe also fewer fucked-up ones.
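For anyone who wants to play with that threshold, here is a rough Python sketch of culling by fraction (the function name, the `keep_fraction` parameter and the sample fitness values are made up for illustration; this is not the script's own code).

```python
import math


def cull(genome_fitnesses, keep_fraction=0.5):
    """Return the fittest keep_fraction of the list, best first.

    At least one survivor is always kept so a group never dies out here.
    """
    ranked = sorted(genome_fitnesses, reverse=True)
    keep = max(1, math.ceil(len(ranked) * keep_fraction))
    return ranked[:keep]


# Hypothetical fitness values for eight genomes.
fitnesses = [120, 4502, 980, 4100, 310, 2750, 64, 4200]
half = cull(fitnesses, 0.5)
quarter = cull(fitnesses, 0.25)
three_quarters = cull(fitnesses, 0.75)
```

Note that in this sketch a larger fraction means more survivors, so a stricter standard corresponds to a smaller fraction such as 0.25, while 0.75 lets more of the weak genomes through.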
I think the script should time out when a dialog box is opened; I don't consider a run fit when it wastes time like that.
But the correct solution, if you want to accept it, is to introduce another input to let the system know when a dialog box is open.
Add a RAM check for whether a dialog box is open; if so, spam the A button and skip the main part of the script.
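Roughly what that could look like, as a Python sketch rather than real BizHawk Lua. The `DIALOG_FLAG_ADDR` value is a placeholder and not a real address (look the actual one up in a RAM map), and `read_byte` stands in for whatever memory-read call the emulator provides.

```python
# Hypothetical placeholder, NOT a real SMW address; check a RAM map.
DIALOG_FLAG_ADDR = 0x0000


def choose_buttons(read_byte, network_buttons, frame,
                   dialog_addr=DIALOG_FLAG_ADDR):
    """Return the buttons to press on this frame.

    read_byte: callable taking an address and returning a value 0..255.
    network_buttons: dict of button name -> bool proposed by the neural net.
    frame: current frame number, used to alternate press and release.
    """
    if read_byte(dialog_addr) != 0:
        # Dialog open: skip the network and tap A on every other frame.
        return {"A": frame % 2 == 0}
    return dict(network_buttons)


# Usage with a fake RAM table instead of an emulator.
fake_ram = {DIALOG_FLAG_ADDR: 1}
pressed = choose_buttons(lambda a: fake_ram.get(a, 0), {"Right": True}, frame=0)
```

Alternating frames matters because holding A down continuously is usually read as a single press.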
While you're at it, you might also want to spam the B button for a bit when you didn't move for like 2 seconds.
And why don't you also press the B button when there is an enemy right in front of you? It will make it more efficient!
See, here are the addresses:
Address    Length    Type  Description
$7E:0094   2 bytes   3     Player X position (16-bit) within the level
$7E:00E4   12 bytes  2     Sprite X position, low byte.
$7E:14E0   12 bytes  2     Sprite X position, high byte.
$7H:L0L0   42 bytes  yes   Read this address to get the winning value!
But seriously, creaothceann, do you understand the concept of a self-learning AI?
It has to learn to either avoid the message box or get used to it (the message box is a sprite, so it can detect that).
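For reference, the real rows of the address table above are enough to locate the player and every sprite slot. A small Python sketch, assuming `read_byte` is a stand-in for the emulator's memory read and that addresses are offsets into bank $7E (the helper names are mine):

```python
PLAYER_X = 0x0094       # 2 bytes, low byte first
SPRITE_X_LOW = 0x00E4   # 12 bytes, one per sprite slot
SPRITE_X_HIGH = 0x14E0  # 12 bytes, one per sprite slot
SPRITE_SLOTS = 12


def read_u16_le(read_byte, addr):
    """Combine two consecutive bytes into a 16-bit little-endian value."""
    return read_byte(addr) | (read_byte(addr + 1) << 8)


def sprite_x_positions(read_byte):
    """Return the 16-bit X position of each of the 12 sprite slots."""
    return [
        read_byte(SPRITE_X_LOW + i) | (read_byte(SPRITE_X_HIGH + i) << 8)
        for i in range(SPRITE_SLOTS)
    ]


# Usage with a fake RAM table: player at 0x1234, sprite slot 0 at 0x0110.
fake_ram = {0x0094: 0x34, 0x0095: 0x12, 0x00E4: 0x10, 0x14E0: 0x01}
read = lambda a: fake_ram.get(a, 0)
player_x = read_u16_le(read, PLAYER_X)
sprites = sprite_x_positions(read)
```

The low and high bytes live in separate tables, so slot i is assembled from one byte of each rather than from two neighbouring bytes.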
Warning: Might glitch to credits
I will finish this ACE soon as possible
(or will I?)
Sure, if you want the AI to learn to play the whole game, or want to keep the script game-unspecific, you can't add that optimization. I'm more interested in an AI's raw gameplay though, and adding that optimization would keep more genome variants alive.
Except that in the SMW case, the AI might falsely interpret staying alive longer (keeping the text window open) as an optimal survival strategy, and pass that information to future generations.
Masterjun's statement and henke37's comment lead to better generations for SMW.
I am still the wizard that did it.
"On my business card, I am a corporate president. In my mind, I am a game developer. But in my heart, I am a gamer." -- Satoru Iwata
<scrimpy> at least I now know where every map, energy and save room in this game is
Thanks for the idea (and the code) for the RAM check. But since I am a noob, maybe you all can tell me what an 'SMW case' is. Are you referring to a 'Super Mario World case'? That kind of reference hints at a whole universe outside this problem and suggests an awareness of a solution that has nothing to do with SMW. Could you link me so I can understand what you are alluding to?
I agree that the game should eventually try to press the A button (z key) when a dialog appears, but it never does. It just times out. It should be noted that Mario hitting the bottom of the correct box to generate a dialog box is a rare event.
The same thing happens in other circumstances as well. When Mario ran up against a wall/step early in the simulation, he would just stop and it would time out. He would never try jumping. The only time he would get over an obstacle was if he was already in 'jumping mode'. The transition to jumping mode seems to be a random event and not triggered by waiting. Same thing with the vertical pipes (stacks?). If his jump makes him crash into a pipe and he slides to its bottom, he will then jump over it since he is in jumping mode. But if he runs into a pipe, he never gets the idea to jump over it. I guess the fitness is trying to minimize time, so any waiting is considered a negative and that genome is considered a failure. It's just like a corporate exec who seeks to maximize earnings per share in the short term. Long-term wins at the expense of short-term losses are never considered.
When I get a better understanding of this script, I need to try to have it increase the mutation rate when Mario's travel stops. This kind of thing also lends itself to parallel processing. One thread is 'driving Mario' and another thread is monitoring his velocity so that when velocity goes to zero, it changes the params. However, I don't know how to write asynchronous threads in Lua. I have a hard enough time keeping my head straight with node.js and its use of closures to achieve asynchronicity.
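A second thread may not be needed for this: since the script already runs once per frame, the stall check can be polled in the same loop. A minimal Python sketch of that idea (the class name, the 120-frame threshold standing in for roughly 2 seconds, and both rate values are assumptions, not values from the script):

```python
class StallMonitor:
    """Track how long x has been unchanged and suggest a mutation rate."""

    def __init__(self, stall_frames=120, base_rate=0.25, boosted_rate=0.75):
        self.stall_frames = stall_frames
        self.base_rate = base_rate
        self.boosted_rate = boosted_rate
        self.last_x = None
        self.still = 0  # consecutive frames without movement

    def update(self, x):
        """Call once per frame with Mario's x; returns the rate to use."""
        if x != self.last_x:
            self.still = 0
        else:
            self.still += 1
        self.last_x = x
        if self.still >= self.stall_frames:
            return self.boosted_rate
        return self.base_rate


# Usage: a short threshold so the effect is visible in a few frames.
monitor = StallMonitor(stall_frames=3)
rates = [monitor.update(x) for x in [10, 11, 11, 11, 11, 12]]
```

The returned rate would only take effect when the next genome is bred, so this changes what gets tried next rather than what Mario does in the current run.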
I got tricked into thinking about this. I'll share my thoughts so I can then delete them.
I haven't studied mari/o or any AI topics deeply, but it seems to me it's essentially a lab rat which can perceive its maze and perceive the smell of cheese at the end. This approach will be ultimately limited to what you'd expect from a well-trained rat.
Higher organisms and humans perceive other things. Most importantly I think, they perceive their levels of frustration and whimsy, and they are able to perceive other attempts at runs through levels--from others, or more useful for us, the entire history of their attempted runs (reflecting on past experiences). Most importantly, they perceive their satisfaction with what they're doing, not just whether they succeeded.
So what I would do next if I was working on this would be to put the operations of the mari/o script under the control of a higher order process. It can perceive the structure and outcome of previous runs, and it can control parameters for the mari/o process to perceive, essentially as guidance from the higher order: how creative it's feeling (fitness function governed by exploration of possibilities, not progress towards goal); how many trials to run, etc.
Whether this higher order process is a neural net or not is hard to say. Evolving it as well would turn into an exponentially harder problem since you would be trying to evolve a mind and not just a route. It might be better just to use a traditional AI approach instead of brute force trials. The output of this executive module is parameters for mari/o (or for the fitness evaluator) so that it can guide its evolution. That kind of brute force _practicing_ is something the sole human mind each of us has is good at.
This kind of division of responsibilities should be easier than trying to jam everything into what is effectively just one bigger, better-trained largely-instinctual rat-brain (the one mari/o neural net). And I think it models a human being better, who is kind of an executive controlling a mammal brain.
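One very small way to picture that executive layer, as a Python sketch under my own assumptions (the function names, the `patience` and `step` values, and the idea of a separate novelty score are all illustrative, not part of mari/o): it looks at the history of best scores and, the longer they have failed to improve, the more it weights exploration over progress.

```python
def exploration_weight(best_per_generation, patience=5, step=0.2):
    """Return a weight in [0, 1]: 0 is pure goal progress, 1 is pure exploration.

    The weight rises by `step` for every `patience` generations in a row
    that failed to beat the best score seen so far.
    """
    best_so_far = float("-inf")
    stagnant = 0
    for best in best_per_generation:
        if best > best_so_far:
            best_so_far = best
            stagnant = 0
        else:
            stagnant += 1
    return min(1.0, step * (stagnant // patience))


def guided_fitness(progress, novelty, weight):
    """Blend goal progress with a novelty score using the executive's weight."""
    return (1.0 - weight) * progress + weight * novelty


# Usage: ten generations stuck below an earlier best of 4502.
history = [4502] + [4100] * 10
weight = exploration_weight(history)
blended = guided_fitness(progress=1000, novelty=200, weight=weight)
```

Here the executive is a hand-written rule rather than an evolved net, which is the "traditional approach" option mentioned above.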
Hey guys, so I was reading about MarI/O, and got it to run... But for some reason it's not loading the pool states when I click "load".
I'm going to take a peek at the code in a bit, but I was wondering: has anyone been able to get the program to load saved pool files?
Edit: I discovered what it is: you have to either specify the path in the save/load box or keep your state in the root BizHawk folder.
I got tricked into thinking about this. I'll share my thoughts so I can then delete them.
...
And I think it models a human being better, who is kind of an executive controlling a mammal brain.
I love your opinion on this subject. A mammal's brain over this partial rat brain seems to be a much better (and probably much quicker) solution to an AI completing a level. I'm sure that with the right parameters, it would be much more versatile than the current script.
How easy is it to give MarI/O some additional sense information? Specifically, game rules such as entering blocks at a particular speed, or double-jumping?
First, we need to add better fitness params. Right now, it's just based on Mario's x value. I'm thinking of something like ((verticalDistance + horizontalDistance) / n) + ((score + coins) / n) - deaths for the fitness function, where n is the time it took to cover that distance, and of starting the script on a rotation of levels instead, where you die if you don't move for 2-3 seconds. Maybe with that, it will actually learn something other than moving right.
Next, we need to be able to tell the script what enemy types are on the screen, and what the different floor and block types are, so the script can at least differentiate between them.
After that, I'm sure it will be much easier for the script to learn much more complex move sets.
I'd also recommend creating training levels and adding them to the map.
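The fitness formula proposed above is easy to try out on paper. A Python sketch of it as written, assuming n is measured in seconds and is greater than zero (the function and parameter names are mine, and the sample numbers are made up):

```python
def composite_fitness(horizontal, vertical, score, coins, deaths, seconds):
    """((vertical + horizontal) / n) + ((score + coins) / n) - deaths.

    seconds is n, the time it took to cover the distance; it must be > 0.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    distance_rate = (vertical + horizontal) / seconds
    reward_rate = (score + coins) / seconds
    return distance_rate + reward_rate - deaths


# Hypothetical run: 1200 px across, 80 px up, 400 points, 20 coins,
# one death, 20 seconds.
example = composite_fitness(1200, 80, 400, 20, deaths=1, seconds=20)
```

One thing the numbers make visible: score and distance are on very different scales, so the terms would probably need weights before one of them stops dominating.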