Optimal Elma player
Moderator: Moporators
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Optimal Elma player
The other day I had an idea: what if we create the optimal Elma player? This is how it works. An Elma player can be modeled as a Reinforcement Learning agent with 5 available actions: throttle, break, left, right and turn. If the agent has collected all the apples he enters the terminal state and receives a reward of 1/(time taken), otherwise in all other states he receives a reward of -1. This way, the agent will aim to minimize the time. We can now do standard TD(lambda) to teach the agent to drive well and hopefully he will learn the optimal policy.
There are several extensions to this. We can change the reward function, such that it grows with each collected apple, ie the first apple gives +1, the second +10, the third +100 and so on - this will be some form of shaping. We can provide "mini-goals" for the agent - these could be checkpoints along a "good" path. We can initialize the agent's driving policy by a good drive through the level (provided by a user) and let him try to make it better. We can also do learning by example, just like its done here: https://research.microsoft.com/en-us/pr ... fault.aspx. The possibilities are endless and if we put our minds together we can achieve great things!
The only problem is the learning agent must live in the Elma environment. How can we access the code of the game?
Here are some more papers related to this:
Learning to fight: https://research.microsoft.com/apps/pub ... x?id=65639
Learning to bike ride: http://www.cs.mcgill.ca/~mgendr12/COMP5 ... arning.pdf
bipedal-walking: http://citeseerx.ist.psu.edu/viewdoc/su ... 1.107.3195
Robot Auto Racing: http://citeseerx.ist.psu.edu/viewdoc/su ... .1.86.9280
Apprenticeship learning: http://citeseerx.ist.psu.edu/viewdoc/su ... 0.1.1.2.92
There are several extensions to this. We can change the reward function, such that it grows with each collected apple, ie the first apple gives +1, the second +10, the third +100 and so on - this will be some form of shaping. We can provide "mini-goals" for the agent - these could be checkpoints along a "good" path. We can initialize the agent's driving policy by a good drive through the level (provided by a user) and let him try to make it better. We can also do learning by example, just like its done here: https://research.microsoft.com/en-us/pr ... fault.aspx. The possibilities are endless and if we put our minds together we can achieve great things!
The only problem is the learning agent must live in the Elma environment. How can we access the code of the game?
Here are some more papers related to this:
Learning to fight: https://research.microsoft.com/apps/pub ... x?id=65639
Learning to bike ride: http://www.cs.mcgill.ca/~mgendr12/COMP5 ... arning.pdf
bipedal-walking: http://citeseerx.ist.psu.edu/viewdoc/su ... 1.107.3195
Robot Auto Racing: http://citeseerx.ist.psu.edu/viewdoc/su ... .1.86.9280
Apprenticeship learning: http://citeseerx.ist.psu.edu/viewdoc/su ... 0.1.1.2.92
Re: Optimal Elma player
why so stupid replies to such an intelligent post?
I know it should be theoretically possible to make an optimal elma player but really really hard... Those scientific papers you showed... such hard formulas even for such an easy things as driving a bicycle. So for a very complex game like elma, those formulas would be incomprehensible.
I don't see why we should give points for apples. Apples are not 'extra points', they are things that have to be collected.
You can't really get elma code anywhere, only balazs has it. So I think that's the biggest problem, is it?
I know it should be theoretically possible to make an optimal elma player but really really hard... Those scientific papers you showed... such hard formulas even for such an easy things as driving a bicycle. So for a very complex game like elma, those formulas would be incomprehensible.
I don't see why we should give points for apples. Apples are not 'extra points', they are things that have to be collected.
You can't really get elma code anywhere, only balazs has it. So I think that's the biggest problem, is it?
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
Yes I was expecting more intelligent replies too...why so stupid replies to such an intelligent post?
Don't worry about the formulas, they are not as hard as they look. I did my honours project in Reinforcement Learning and my current PhD supervisor is an expert in Reinforcement Learning, so we should be able to deal with the formulas.I know it should be theoretically possible to make an optimal elma player but really really hard... Those scientific papers you showed... such hard formulas even for such an easy things as driving a bicycle. So for a very complex game like elma, those formulas would be incomprehensible.
Think of a really long level that takes about 3 minutes to complete. If the agent receives a reward only at the very end, then there is nothing to guide him in between. Instead, if you give the agent some small intermediate rewards (eg. points for collecting apples) then it will help in the task. This way, the agent will first learn how to get the first apple quickly, then he will learn how to get the second apple quickly and so on until he learns how to get them all quickly. Remember, one has to learn how to crawl before he can walk.I don't see why we should give points for apples. Apples are not 'extra points', they are things that have to be collected.
For me this is the most difficult part. I thought someone might have hacked into the code, or did something clever like that. In the worst case, we can email Balazs and hope that he is kind enough to give us the code. Does he usually reply to emails about Elma?You can't really get elma code anywhere, only balazs has it. So I think that's the biggest problem, is it?
- Grace
- 38mins club
- Posts: 4851
- Joined: 19 Nov 2005, 10:45
- Location: Deep in your Imagination, Twirling your Dreams and Weaving your thoughts.
Re: Optimal Elma player
he generally ignores you or gives you a cryptic auto email about the release date of elma 2.
oh and the only problem with setting "rewards" at each apple is that sometimes the fastest way of getting to apples 1 will mean that overall it takes longer to get to apple 2 than if you'd taken a different route. for instance it's quicker to run straight into the very top apple in what the heck, but if you do then it will take you longer to get the apples right underneath it.
oh and the only problem with setting "rewards" at each apple is that sometimes the fastest way of getting to apples 1 will mean that overall it takes longer to get to apple 2 than if you'd taken a different route. for instance it's quicker to run straight into the very top apple in what the heck, but if you do then it will take you longer to get the apples right underneath it.
Cyberscore!
___________________________________________________
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
___________________________________________________
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
Actually we don't need the source. It will be sufficient to simulate keystrokes inside Elma and have Elma provide us with our current state. The state comprises of: current time, number of apples eaten, position of the bike etc.he generally ignores you or gives you a cryptic auto email about the release date of elma 2.
Yep I understand what you are saying. This is why I suggested giving rewards in an incremental order, so the later apples have more influence than the earlier ones. After all, we only care about getting the last apple quickly, but we would also like to get to the earlier apples quickly if possible. The reward for collecting the i-th apple at time t could be something like e^(i+c/t), where c is some constant. So later apples give you much more reward; and apples collected quickly give you more reward.oh and the only problem with setting "rewards" at each apple is that sometimes the fastest way of getting to apples 1 will mean that overall it takes longer to get to apple 2 than if you'd taken a different route. for instance it's quicker to run straight into the very top apple in what the heck, but if you do then it will take you longer to get the apples right underneath it.
- Grace
- 38mins club
- Posts: 4851
- Joined: 19 Nov 2005, 10:45
- Location: Deep in your Imagination, Twirling your Dreams and Weaving your thoughts.
Re: Optimal Elma player
Bicycle has less buttons = less complex, also no sexy tricks, ever seen a bicycle do a brutal on uphill battle? how about ramp frenzy brutal or animal farm head compression? nat.
Cyberscore!
___________________________________________________
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
___________________________________________________
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
Re: Optimal Elma player
dimkadimon, thanks for starting a most interesting topic.
elma may well be a computer game, with simulated physics, but the possibilities is really what makes the game complex. it reminds me of when doom was first released: you could then not only have enemies coming in front or back, but in depth of space too! elma for me is like that. it can be said to be more complex than most other 3d games.
about the concerns in emailing balazs, to me he has in the past been friendly, and cooperative to a certain level (i.e. past the level of ignoring my messages). if any part of your project needs example of well-driven, almost-perfect replays (such as John's 29), just reply here or send me a pm and i'll take a look into it. good luck!
elma may well be a computer game, with simulated physics, but the possibilities is really what makes the game complex. it reminds me of when doom was first released: you could then not only have enemies coming in front or back, but in depth of space too! elma for me is like that. it can be said to be more complex than most other 3d games.
about the concerns in emailing balazs, to me he has in the past been friendly, and cooperative to a certain level (i.e. past the level of ignoring my messages). if any part of your project needs example of well-driven, almost-perfect replays (such as John's 29), just reply here or send me a pm and i'll take a look into it. good luck!
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
I don't agree. A real bicycle has an infinite number of continuous actions: you can lean x degrees left, right, forward, back etc. In all cases, x is a real value. While an elma bicycle has only 5 discreet actions and those actions have only 2 states: on and off. Either you have full throttle or none at all, you can't do half throttle for example.Bicycle has less buttons = less complex, also no sexy tricks, ever seen a bicycle do a brutal on uphill battle? how about ramp frenzy brutal or animal farm head compression? nat.
- Grace
- 38mins club
- Posts: 4851
- Joined: 19 Nov 2005, 10:45
- Location: Deep in your Imagination, Twirling your Dreams and Weaving your thoughts.
Re: Optimal Elma player
sure, the bicycle has many different actions, however as far as i know there are only 2 buttons on an average bicycle, 6 on a mountain bike. Elma has 8.
Cyberscore!
___________________________________________________
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
___________________________________________________
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
Re: Optimal Elma player
You should ask the code from milagros. He has made quite many additions to the game and probably already knows the source code by heart. But I think he has to code it in assembly because balazs hasn't given the original source code.
And btw, there has been a similar kind of topic earlier. It was an april fool though:
http://mopolauta.moposite.com/viewtopic.php?t=3132
And btw, there has been a similar kind of topic earlier. It was an april fool though:
http://mopolauta.moposite.com/viewtopic.php?t=3132
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Re: Optimal Elma player
so driving a 2d motorbike is more complex than riding a real bike in 3d with 2 brakes, gears and variable applied force to the pedals?SmaXa wrote:sure, the bicycle has many different actions, however as far as i know there are only 2 buttons on an average bicycle, 6 on a mountain bike. Elma has 8.
Re: Optimal Elma player
John, ez add more different grips, grounds, wind and if the kuski gets tired?
Re: Optimal Elma player
optimal player would obviously be modified Zweq.exe, with inbuilt winballe.exe, driveWR.exe and pwnJohn.exe. Hard to make even for teh_mila
Team TR
Multi WR in Labyrinth with GRob
Best Internal Total Times, Pipe stats & Pipe archive
World kuski map, World Cup stats
Re: Optimal Elma player
AFAIK the collision scripts use type double. So bike is going to act differently each time you execute the same moves sequence. In that case any kind of macro/neural network or howsoever it's called is not going to work. I don't know if you are already aware of this issue.
Lousku wrote:could you mayke shorter sig please mega annoying and also against rules )
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
So milagros reimplemented elma in assembly?!? That sounds hard. How can I contact him?You should ask the code from milagros. He has made quite many additions to the game and probably already knows the source code by heart. But I think he has to code it in assembly because balazs hasn't given the original source code.
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
What collision scripts are you referring to? How can doubles affect the move sequence? - I think you wanted to say that the script is non-deterministic resulting in different move sequence each time you execute it?alias wrote:AFAIK the collision scripts use type double. So bike is going to act differently each time you execute the same moves sequence. In that case any kind of macro/neural network or howsoever it's called is not going to work. I don't know if you are already aware of this issue.
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
Where have I double posted and what is wrong with my quoting?Jappe wrote:apparently dimkadimon isnt an optimal forum user since he doesnt know how to quote properly and makes double posts
Re: Optimal Elma player
The finite accuracy of the variables used by the physic engine and your frame rate cause the fact that your ride is non-deterministic even if in so-called 'auto-levels' where you don't have to press any buttons: the bike still acts differently in each ride.dimkadimon wrote:What collision scripts are you referring to? How can doubles affect the move sequence? - I think you wanted to say that the script is non-deterministic resulting in different move sequence each time you execute it?alias wrote:AFAIK the collision scripts use type double. So bike is going to act differently each time you execute the same moves sequence. In that case any kind of macro/neural network or howsoever it's called is not going to work. I don't know if you are already aware of this issue.
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Re: Optimal Elma player
Just ignore that idiot.dimkadimon wrote:Where have I double posted and what is wrong with my quoting?Jappe wrote:apparently dimkadimon isnt an optimal forum user since he doesnt know how to quote properly and makes double posts
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Re: Optimal Elma player
No he didn't reimplement elma But somehow converted the elma.exe into assembly where he can then add his own stuff.dimkadimon wrote:So milagros reimplemented elma in assembly?!? That sounds hard. How can I contact him?You should ask the code from milagros. He has made quite many additions to the game and probably already knows the source code by heart. But I think he has to code it in assembly because balazs hasn't given the original source code.
You can contact him in IRCnet channel #across nick teh_mila.
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
Actually if the variations are small (from run to run) then this might not be such a bad thing. In some sense it will encourage the agent's exploration. This is one of the reasons why TD-Gammon worked so well: http://gammonline.com/articles/td-gammon.htmlThe finite accuracy of the variables used by the physic engine and your frame rate cause the fact that your ride is non-deterministic even if in so-called 'auto-levels' where you don't have to press any buttons: the bike still acts differently in each ride.
Does he have an email? - because that would be easier for me.You can contact him in IRCnet channel #across nick teh_mila.
- The_BoneLESS
- 38mins club
- Posts: 4604
- Joined: 7 Sep 2003, 00:30
- Team: HHIT
- Location: Dangerously close to the St-Lawrence River
- Contact:
Re: Optimal Elma player
Another thing to take in consideration is that the graphic card configuration has an impact on certain physical properties of the game (such as grip, bounces, etc.). I don't know if your agent can take that in consideration, haven't read the article yet.
Concerning that "collision script", or the "physical engine", or whatever you want to call it, it doesn't seem to react the same everytime with the same combination of button presses so, your agent couldn't rely on a simple button combination, it would have to adapt to the progress of its run. But i guess that's the whole point of the agent for him to be optimal. I'll definitly read this article later today, seems interesting.
For your information, milagros disassembled the elma.exe file if i'm not mistaken.
Concerning that "collision script", or the "physical engine", or whatever you want to call it, it doesn't seem to react the same everytime with the same combination of button presses so, your agent couldn't rely on a simple button combination, it would have to adapt to the progress of its run. But i guess that's the whole point of the agent for him to be optimal. I'll definitly read this article later today, seems interesting.
For your information, milagros disassembled the elma.exe file if i'm not mistaken.
Last edited by The_BoneLESS on 11 Feb 2009, 17:54, edited 1 time in total.
Website || TT:38:05:33 || WC5:15th || HHIT for life || 9th world wide ... BAP is next
Re: Optimal Elma player
Damn I wish I knew more about this kind of stuff as it interests me greatly. I think there is no hurt in trying to make the optimal play dimkadimon even if it doesn't suceed it is still a step in the right direction and a learning experience. If there is anything I can do to help you let me know.
Oh ya and don't listen to all the negative comments from noobs like Jappe. they are just pesamistic people.
Oh ya and don't listen to all the negative comments from noobs like Jappe. they are just pesamistic people.
Religious Man
|| TT: 37:47:70 || EX New Wave WR + Animal Farm WR || 24 Canadian Records || TEC Bronze Medal || HHIT in the 37min club || http://elastomaster.tripod.com/ ||
|| TT: 37:47:70 || EX New Wave WR + Animal Farm WR || 24 Canadian Records || TEC Bronze Medal || HHIT in the 37min club || http://elastomaster.tripod.com/ ||
Re: Optimal Elma player
here are some scene fantasies:
(1) april fools 2005:
(2) interesting stuff starts from linked post: http://mopolauta.moposite.com/viewtopic ... 92#p164592
among other things, has been mentioned the need of an expert on the subject.
(1) april fools 2005:
Code: Select all
A russian Elma player has programmed a software which calculates the theoretical minimum times of the levels. It uses some kind of Euler algorithms which are very efficient in this kind of processing. There is also set a limit value (0.84) for the bug bounces.
among other things, has been mentioned the need of an expert on the subject.
Last edited by hex on 12 Feb 2009, 02:09, edited 1 time in total.
Re: Optimal Elma player
Only fps matters. If you are thinking of vsync and stuff, vsync just limits the fps to your screen refresh rate.The_BoneLESS wrote:Another thing to take in consideration is that the graphic card configuration has an impact on certain physical properties of the game (such as grip, bounces, etc.). I don't know if your agent can take that in consideration, haven't read the article yet.
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
Has anyone played X-Moto? http://xmoto.tuxfamily.org/
It seems similar to Elasto, but slightly easier. All the code is available, so perhaps it will be easier to hack into that...
It seems similar to Elasto, but slightly easier. All the code is available, so perhaps it will be easier to hack into that...
- The_BoneLESS
- 38mins club
- Posts: 4604
- Joined: 7 Sep 2003, 00:30
- Team: HHIT
- Location: Dangerously close to the St-Lawrence River
- Contact:
Re: Optimal Elma player
thing is, you won't find any xmoto fan here because, the game has horrible physics and is unplayable. But yeah, it is indeed an elma look-alike (clone).dimkadimon wrote:Has anyone played X-Moto? http://xmoto.tuxfamily.org/
It seems similar to Elasto, but slightly easier. All the code is available, so perhaps it will be easier to hack into that...
Website || TT:38:05:33 || WC5:15th || HHIT for life || 9th world wide ... BAP is next
Re: Optimal Elma player
THIS X MOTO GAIM IS SO REDECULOUSLY GOOD ITS HELAROUSE. Yeah xmoto is a swearword around here.
nice topic anyway, although those links you provided seem a bit too far fetched
nice topic anyway, although those links you provided seem a bit too far fetched
- Grace
- 38mins club
- Posts: 4851
- Joined: 19 Nov 2005, 10:45
- Location: Deep in your Imagination, Twirling your Dreams and Weaving your thoughts.
Re: Optimal Elma player
X-Moto is pretty much an attempted copy of Elma.
Cyberscore!
___________________________________________________
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
___________________________________________________
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
Re: Optimal Elma player
It is a copy of elma, and bad one... Really sux physics etc.
Re: Optimal Elma player
i believe milagros could be of help to you. if he did implement real-time multiplayer, he should be capable of implementing the stuff you need. but first, show him your research funds...
but, wouldn't that be countered by an impeccable, constant fps (say 60)? or would just the physics engine and the frame rate somehow need be synched one another? because if the bike gets one degree more inclined say at time zero, at time even four the difference could be a most significant one. i'm just saying that might not be really good for the agent. thankszebra wrote:The finite accuracy of the variables used by the physic engine and your frame rate cause the fact that your ride is non-deterministic even if in so-called 'auto-levels' where you don't have to press any buttons: the bike still acts differently in each ride.
Re: Optimal Elma player
Reinforcement learning or any other gradient approach (neural nets, ..) is completely unsuitable for this problem. It's usually used if you don't know mathematical model exactly (robots in real environment) or if there is some uncertainity (opponents, ..).
Finding the optimal path for given level is very non-convex problems with loads of local minimas (called optimal runs with certain 'style'). You will never get to the optimum using gradient method, at most you can hope for nab style good run. Other problem is that there aren't really any local goals so your player knows what he is supposed to do at the beginning when no weights are trained yet. On the other hand there isn't any problem like this in checkers (get as many points in the next move) or car driving (get infront of the opponent).
This problem is classic optimization problem - given framerate (or set of timeevents), press keys so you touch the flower in minimal time under contraint that you have to take all apples. For this problem you know the mathematical model (set of equations in elma) and there is no uncertainity at all (no need to explain those autolevs now..). Final time (or failed to finish) can be evaluated in P for given set of keys. You don't have to run it through elma, you jsut have to copy the equations and it's going to be faster that way. At the end keys can be sent to elma to drive your final 'solution'.
Some years ago I tried to write some nice code to do that. At that time I was quite nab in asm, so quite many things were not really optimal. Eventhough the algorithm was able to drive better than WR times those times and with some more trying its easy to get some 33tt atleast.
I tried these apporaches:
You set manually local goal (get as fast as you can here, have as high speed as possible or some combination of speed/time) and then you try brute force where keys can change only at low framerate (lets say its running 1000fps and you can check key only 20fps). With that it's possible to count 'optimally' a few secs (with some other branch&bound hacks). From this 'optimal' solution you can improve by doing simple gradient decent by changing times of keypresses with 1ms accuracy. To get better solution you can always remember best X runs for each local goal and then try next local goal from each one of them (similar to some genetic algorithms).
This can be improved a lot using more clever apporach. I can drive some example run and then computer will try to find the optimal way under contraint that some distance meassure between my saveload run and the computer try will be close enough. With this several seconds can be calculated 'optimally' efficiently. Some clever branch&bound algorithm may be useful there.
Using this approach its possible to do perfect hoyla runs (like warmup, ..), impsy tricks where i just show how do i plan to do it and computer optimizes (hard bounce can be done at high fps on some 0.3s brute force), however, it will not find styles for you, you have to tell it what to try.
I haven't really finished coding it and I can't be bothered to do so atm. Some years ago I wanted to do some simple framework so you can code the algorithm in c++ yourself and it would work, however, I found out that most of the ppl here are programming nabs and noone would be interested. The only thing would be that it would help ppl cheat.
Edit:
btw. doing decent real time battle bot is completely different thing. A good one would prolly fail in internals.(some 45tt or smth) I didn't try that one but would be prolly even cooler competition if there were many skilled programmers around here.
Finding the optimal path for given level is very non-convex problems with loads of local minimas (called optimal runs with certain 'style'). You will never get to the optimum using gradient method, at most you can hope for nab style good run. Other problem is that there aren't really any local goals so your player knows what he is supposed to do at the beginning when no weights are trained yet. On the other hand there isn't any problem like this in checkers (get as many points in the next move) or car driving (get infront of the opponent).
This problem is classic optimization problem - given framerate (or set of timeevents), press keys so you touch the flower in minimal time under contraint that you have to take all apples. For this problem you know the mathematical model (set of equations in elma) and there is no uncertainity at all (no need to explain those autolevs now..). Final time (or failed to finish) can be evaluated in P for given set of keys. You don't have to run it through elma, you jsut have to copy the equations and it's going to be faster that way. At the end keys can be sent to elma to drive your final 'solution'.
Some years ago I tried to write some nice code to do that. At that time I was quite nab in asm, so quite many things were not really optimal. Eventhough the algorithm was able to drive better than WR times those times and with some more trying its easy to get some 33tt atleast.
I tried these apporaches:
You set manually local goal (get as fast as you can here, have as high speed as possible or some combination of speed/time) and then you try brute force where keys can change only at low framerate (lets say its running 1000fps and you can check key only 20fps). With that it's possible to count 'optimally' a few secs (with some other branch&bound hacks). From this 'optimal' solution you can improve by doing simple gradient decent by changing times of keypresses with 1ms accuracy. To get better solution you can always remember best X runs for each local goal and then try next local goal from each one of them (similar to some genetic algorithms).
This can be improved a lot using more clever apporach. I can drive some example run and then computer will try to find the optimal way under contraint that some distance meassure between my saveload run and the computer try will be close enough. With this several seconds can be calculated 'optimally' efficiently. Some clever branch&bound algorithm may be useful there.
Using this approach its possible to do perfect hoyla runs (like warmup, ..), impsy tricks where i just show how do i plan to do it and computer optimizes (hard bounce can be done at high fps on some 0.3s brute force), however, it will not find styles for you, you have to tell it what to try.
I haven't really finished coding it and I can't be bothered to do so atm. Some years ago I wanted to do some simple framework so you can code the algorithm in c++ yourself and it would work, however, I found out that most of the ppl here are programming nabs and noone would be interested. The only thing would be that it would help ppl cheat.
Edit:
btw. doing decent real time battle bot is completely different thing. A good one would prolly fail in internals.(some 45tt or smth) I didn't try that one but would be prolly even cooler competition if there were many skilled programmers around here.
[carebox]
Re: Optimal Elma player
That's some really interesting stuff mila! While we had different points of view in a little conversation last year, i couldn't agree more with your post. I've been thinking about working on this for quite a while, but I didn't know you already investigated those equations and implemented some optimization algorithms. I'm studying computer sience and mathematics, and consider myself to be a quite skilled programmer. I would be VERY interested in this C++ framework, or even these raw equations, that tell me the time for given keypresses at given points of time for a given level. Keep it up!
-
- Kuski
- Posts: 30
- Joined: 8 May 2005, 00:12
- Location: Adelaide, Australia
- Contact:
Re: Optimal Elma player
Thank you for that insightful post Milagros! I now see that the problem is easier than I thought originally. You are right - there is no uncertainty and we know the exact model. If for example, we didn't know what level the agent is in then it would be more suitable for RL. This raises the question: is it possible to create a strong RL agent that is reasonably good at any level you give it, without seeding it with any starting styles? I suspect that its only possible if the levels are small (best ride < 20 seconds). However as the level becomes more complex, the goals become further away and without any intermediate rewards the problem probably becomes exponentially harder.
Nevertheless, the problem is still interesting even when you know the exact physics and exact environment. It just becomes an optimization problem, but its still a difficult and interesting optimization problem. I am sure this community would appreciate seeing how close (or maybe even better!) we can get to the current WR times. Perhaps we can learn something from the computer player. Who knows...
I have a background in maths and CS, so I am willing to spend some time on this problem. I am also urging others to join in, the more the merrier Milagros it would be awesome if you could provide us with those equations for the physical model or any code that you have - just to get us started!
Nevertheless, the problem is still interesting even when you know the exact physics and exact environment. It just becomes an optimization problem, but its still a difficult and interesting optimization problem. I am sure this community would appreciate seeing how close (or maybe even better!) we can get to the current WR times. Perhaps we can learn something from the computer player. Who knows...
I have a background in maths and CS, so I am willing to spend some time on this problem. I am also urging others to join in, the more the merrier Milagros it would be awesome if you could provide us with those equations for the physical model or any code that you have - just to get us started!
Re: Optimal Elma player
unfortunately dimkadimon, like milagros stated, most of the people here are programming nabs. the general population only has the ability to encourage you, i think its a great idea though. perhaps flat track would be the ideal level to begin with, seeing as there are no apples, and no unapparent styles that we havent thought of yet. it might be the easiest level to reach the optimal time?
God Bless America
- The_BoneLESS
- 38mins club
- Posts: 4604
- Joined: 7 Sep 2003, 00:30
- Team: HHIT
- Location: Dangerously close to the St-Lawrence River
- Contact:
Re: Optimal Elma player
I would surely like to help you but, i mostly (and pretty much only) have theoretical knowledge about optimization algorithms at this point.
Well, we could at least start with some algorithm for simplier problems like Warm Up or any level without apples (though i'm sure mila has already done that on his free time).
Well, we could at least start with some algorithm for simplier problems like Warm Up or any level without apples (though i'm sure mila has already done that on his free time).
Website || TT:38:05:33 || WC5:15th || HHIT for life || 9th world wide ... BAP is next
Re: Optimal Elma player
whatever happens you have got the interest of the whole scene.
Re: Optimal Elma player
lika mila said, this will only help ppl cheat. only have i no interest at all in this, I would accually dislike it. if someone could cheat runs better than current wrs and spread would spoil the whole scene and ppl would quit elma.
Fuck that, go play elma for real damnit. or go hack some other game.
Fuck that, go play elma for real damnit. or go hack some other game.
Elasto Mania - ez better
Re: Optimal Elma player
8km = this driving all day.
- Kopaka
- 39mins club
- Posts: 6611
- Joined: 23 May 2002, 13:59
- Team: LAME
- Location: In a northern danish city beating YOUR record.
- Contact:
Re: Optimal Elma player
I agree with ramone, actually surprised no one said that before, if you make some program that can make perfect runs it would ruin the whole point of the game.
Talk for yourself please.hexeditor wrote:whatever happens you have got the interest of the whole scene.
Re: Optimal Elma player
i thought most hardcore players with time would be more interested in the possibilities of the game than in hoyling the same levels over and over. that being said, i don't mind a computer playing for me and freeing me the task of losing my life at it.
Re: Optimal Elma player
but unfortunately some probably would like to have better times done by... computer On the other hand i think that the main reason is just to see what may be done in elma. what are the fastest possible runs in internals ect.
Team TR
Multi WR in Labyrinth with GRob
Best Internal Total Times, Pipe stats & Pipe archive
World kuski map, World Cup stats
Re: Optimal Elma player
having coded a Yam IA last week end, i think programming a game ruins the fun of playing it self, it's verified
On the other hand, mila's post is very interesting to read
On the other hand, mila's post is very interesting to read
Re: Optimal Elma player
I think it would be a very interesting endeavour, notwithstanding the possibility of cheat. If you guys are really interested in putting time in this, you should co-op with eachother, but only in a closed circle ofcourse. I'm not really impressed by the conservative opinions that find this a threat towards real people's wr's. It's not gonna be a threat.
Re: Optimal Elma player
here is last computer that tried to take elma wr's by itself
Re: Optimal Elma player
my keyboard was more worn out before it died. i guess that comp just didnt hoyl enough to get wrs
Re: Optimal Elma player
i wrote to zweq for his opinion, as he's one of the best players around. here it is, folks:
Zweq wrote:I'm very interested in seeking for limits in elma as long as it is not exploited nor will damage legit tables (styles arent published)
Last edited by hex on 23 Feb 2009, 05:10, edited 1 time in total.
Re: Optimal Elma player
why would zweqs opinion matter more than any elses? I am not at all interested in seeing a list of theoretical times. How could anyone guarantee drives/styles could be kept secret.
This is not intereresting! It sucks.
This is not intereresting! It sucks.
Elasto Mania - ez better