Optimal Elma player

General discussion about the games and the scene.

Moderator: Moporators

dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Optimal Elma player

Post by dimkadimon »

The other day I had an idea: what if we create the optimal Elma player? This is how it works. An Elma player can be modeled as a Reinforcement Learning agent with 5 available actions: throttle, break, left, right and turn. If the agent has collected all the apples he enters the terminal state and receives a reward of 1/(time taken), otherwise in all other states he receives a reward of -1. This way, the agent will aim to minimize the time. We can now do standard TD(lambda) to teach the agent to drive well and hopefully he will learn the optimal policy.

There are several extensions to this. We can change the reward function, such that it grows with each collected apple, ie the first apple gives +1, the second +10, the third +100 and so on - this will be some form of shaping. We can provide "mini-goals" for the agent - these could be checkpoints along a "good" path. We can initialize the agent's driving policy by a good drive through the level (provided by a user) and let him try to make it better. We can also do learning by example, just like its done here: https://research.microsoft.com/en-us/pr ... fault.aspx. The possibilities are endless and if we put our minds together we can achieve great things!

The only problem is the learning agent must live in the Elma environment. How can we access the code of the game?

Here are some more papers related to this:

Learning to fight: https://research.microsoft.com/apps/pub ... x?id=65639
Learning to bike ride: http://www.cs.mcgill.ca/~mgendr12/COMP5 ... arning.pdf
bipedal-walking: http://citeseerx.ist.psu.edu/viewdoc/su ... 1.107.3195
Robot Auto Racing: http://citeseerx.ist.psu.edu/viewdoc/su ... .1.86.9280
Apprenticeship learning: http://citeseerx.ist.psu.edu/viewdoc/su ... 0.1.1.2.92
hex
Kuski
Posts: 125
Joined: 21 Jan 2009, 03:47

Re: Optimal Elma player

Post by hex »

Last edited by hex on 24 Feb 2009, 21:18, edited 2 times in total.
User avatar
Sla
36mins club
Posts: 1425
Joined: 5 Apr 2006, 03:34
Team: SpEF
Location: Argentina
Contact:

Re: Optimal Elma player

Post by Sla »

nooo fuck you. jappe to the max!
User avatar
zebra
Kuski
Posts: 1010
Joined: 23 Sep 2003, 15:35
Team: TAP
Location: Finland
Contact:

Re: Optimal Elma player

Post by zebra »

why so stupid replies to such an intelligent post?

I know it should be theoretically possible to make an optimal elma player but really really hard... Those scientific papers you showed... such hard formulas even for such an easy things as driving a bicycle. So for a very complex game like elma, those formulas would be incomprehensible.

I don't see why we should give points for apples. Apples are not 'extra points', they are things that have to be collected.

You can't really get elma code anywhere, only balazs has it. So I think that's the biggest problem, is it?
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

why so stupid replies to such an intelligent post?
Yes I was expecting more intelligent replies too...
I know it should be theoretically possible to make an optimal elma player but really really hard... Those scientific papers you showed... such hard formulas even for such an easy things as driving a bicycle. So for a very complex game like elma, those formulas would be incomprehensible.
Don't worry about the formulas, they are not as hard as they look. I did my honours project in Reinforcement Learning and my current PhD supervisor is an expert in Reinforcement Learning, so we should be able to deal with the formulas.
I don't see why we should give points for apples. Apples are not 'extra points', they are things that have to be collected.
Think of a really long level that takes about 3 minutes to complete. If the agent receives a reward only at the very end, then there is nothing to guide him in between. Instead, if you give the agent some small intermediate rewards (eg. points for collecting apples) then it will help in the task. This way, the agent will first learn how to get the first apple quickly, then he will learn how to get the second apple quickly and so on until he learns how to get them all quickly. Remember, one has to learn how to crawl before he can walk.
You can't really get elma code anywhere, only balazs has it. So I think that's the biggest problem, is it?
For me this is the most difficult part. I thought someone might have hacked into the code, or did something clever like that. In the worst case, we can email Balazs and hope that he is kind enough to give us the code. Does he usually reply to emails about Elma?
User avatar
Grace
38mins club
Posts: 4847
Joined: 19 Nov 2005, 10:45
Location: Deep in your Imagination, Twirling your Dreams and Weaving your thoughts.

Re: Optimal Elma player

Post by Grace »

he generally ignores you or gives you a cryptic auto email about the release date of elma 2.

oh and the only problem with setting "rewards" at each apple is that sometimes the fastest way of getting to apples 1 will mean that overall it takes longer to get to apple 2 than if you'd taken a different route. for instance it's quicker to run straight into the very top apple in what the heck, but if you do then it will take you longer to get the apples right underneath it.
Image Cyberscore! Image
___________________________________________________
Image
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

he generally ignores you or gives you a cryptic auto email about the release date of elma 2.
Actually we don't need the source. It will be sufficient to simulate keystrokes inside Elma and have Elma provide us with our current state. The state comprises of: current time, number of apples eaten, position of the bike etc.
oh and the only problem with setting "rewards" at each apple is that sometimes the fastest way of getting to apples 1 will mean that overall it takes longer to get to apple 2 than if you'd taken a different route. for instance it's quicker to run straight into the very top apple in what the heck, but if you do then it will take you longer to get the apples right underneath it.
Yep I understand what you are saying. This is why I suggested giving rewards in an incremental order, so the later apples have more influence than the earlier ones. After all, we only care about getting the last apple quickly, but we would also like to get to the earlier apples quickly if possible. The reward for collecting the i-th apple at time t could be something like e^(i+c/t), where c is some constant. So later apples give you much more reward; and apples collected quickly give you more reward.
User avatar
Grace
38mins club
Posts: 4847
Joined: 19 Nov 2005, 10:45
Location: Deep in your Imagination, Twirling your Dreams and Weaving your thoughts.

Re: Optimal Elma player

Post by Grace »

Bicycle has less buttons = less complex, also no sexy tricks, ever seen a bicycle do a brutal on uphill battle? how about ramp frenzy brutal or animal farm head compression? nat.
Image Cyberscore! Image
___________________________________________________
Image
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
hex
Kuski
Posts: 125
Joined: 21 Jan 2009, 03:47

Re: Optimal Elma player

Post by hex »

dimkadimon, thanks for starting a most interesting topic.

elma may well be a computer game, with simulated physics, but the possibilities is really what makes the game complex. it reminds me of when doom was first released: you could then not only have enemies coming in front or back, but in depth of space too! elma for me is like that. it can be said to be more complex than most other 3d games.

about the concerns in emailing balazs, to me he has in the past been friendly, and cooperative to a certain level (i.e. past the level of ignoring my messages). if any part of your project needs example of well-driven, almost-perfect replays (such as John's 29), just reply here or send me a pm and i'll take a look into it. good luck!
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

Bicycle has less buttons = less complex, also no sexy tricks, ever seen a bicycle do a brutal on uphill battle? how about ramp frenzy brutal or animal farm head compression? nat.
I don't agree. A real bicycle has an infinite number of continuous actions: you can lean x degrees left, right, forward, back etc. In all cases, x is a real value. While an elma bicycle has only 5 discreet actions and those actions have only 2 states: on and off. Either you have full throttle or none at all, you can't do half throttle for example.
User avatar
Grace
38mins club
Posts: 4847
Joined: 19 Nov 2005, 10:45
Location: Deep in your Imagination, Twirling your Dreams and Weaving your thoughts.

Re: Optimal Elma player

Post by Grace »

sure, the bicycle has many different actions, however as far as i know there are only 2 buttons on an average bicycle, 6 on a mountain bike. Elma has 8.
Image Cyberscore! Image
___________________________________________________
Image
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
User avatar
zebra
Kuski
Posts: 1010
Joined: 23 Sep 2003, 15:35
Team: TAP
Location: Finland
Contact:

Re: Optimal Elma player

Post by zebra »

You should ask the code from milagros. He has made quite many additions to the game and probably already knows the source code by heart. But I think he has to code it in assembly because balazs hasn't given the original source code.

And btw, there has been a similar kind of topic earlier. It was an april fool though:

http://mopolauta.moposite.com/viewtopic.php?t=3132
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
User avatar
John
first 35tt
Posts: 4738
Joined: 28 Sep 2002, 19:42
Team: WNO
Location: Luleå, Sweden

Re: Optimal Elma player

Post by John »

SmaXa wrote:sure, the bicycle has many different actions, however as far as i know there are only 2 buttons on an average bicycle, 6 on a mountain bike. Elma has 8.
so driving a 2d motorbike is more complex than riding a real bike in 3d with 2 brakes, gears and variable applied force to the pedals?
Image
User avatar
FinMan
36mins club
Posts: 2038
Joined: 13 Feb 2007, 11:14
Team: dat
Location: Jyväskylä, Finland

Re: Optimal Elma player

Post by FinMan »

John, ez add more different grips, grounds, wind and if the kuski gets tired? :P
Image
User avatar
pawq
38mins club
Posts: 6547
Joined: 24 Aug 2008, 19:56
Team: TR
Location: Southampton, UK

Re: Optimal Elma player

Post by pawq »

optimal player would obviously be modified Zweq.exe, with inbuilt winballe.exe, driveWR.exe and pwnJohn.exe. Hard to make even for teh_mila :cry:
User avatar
Chris
Kuski
Posts: 1111
Joined: 5 Dec 2008, 16:19
Team: Ferrari
Location: flat track

Re: Optimal Elma player

Post by Chris »

AFAIK the collision scripts use type double. So bike is going to act differently each time you execute the same moves sequence. In that case any kind of macro/neural network or howsoever it's called is not going to work. I don't know if you are already aware of this issue.
Lousku wrote:could you mayke shorter sig please :( mega annoying and also against rules :()
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

You should ask the code from milagros. He has made quite many additions to the game and probably already knows the source code by heart. But I think he has to code it in assembly because balazs hasn't given the original source code.
So milagros reimplemented elma in assembly?!? That sounds hard. How can I contact him?
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

alias wrote:AFAIK the collision scripts use type double. So bike is going to act differently each time you execute the same moves sequence. In that case any kind of macro/neural network or howsoever it's called is not going to work. I don't know if you are already aware of this issue.
What collision scripts are you referring to? How can doubles affect the move sequence? - I think you wanted to say that the script is non-deterministic resulting in different move sequence each time you execute it?
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

Jappe wrote:apparently dimkadimon isnt an optimal forum user since he doesnt know how to quote properly and makes double posts
Where have I double posted and what is wrong with my quoting?
User avatar
zebra
Kuski
Posts: 1010
Joined: 23 Sep 2003, 15:35
Team: TAP
Location: Finland
Contact:

Re: Optimal Elma player

Post by zebra »

dimkadimon wrote:
alias wrote:AFAIK the collision scripts use type double. So bike is going to act differently each time you execute the same moves sequence. In that case any kind of macro/neural network or howsoever it's called is not going to work. I don't know if you are already aware of this issue.
What collision scripts are you referring to? How can doubles affect the move sequence? - I think you wanted to say that the script is non-deterministic resulting in different move sequence each time you execute it?
The finite accuracy of the variables used by the physic engine and your frame rate cause the fact that your ride is non-deterministic even if in so-called 'auto-levels' where you don't have to press any buttons: the bike still acts differently in each ride.
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
User avatar
zebra
Kuski
Posts: 1010
Joined: 23 Sep 2003, 15:35
Team: TAP
Location: Finland
Contact:

Re: Optimal Elma player

Post by zebra »

dimkadimon wrote:
Jappe wrote:apparently dimkadimon isnt an optimal forum user since he doesnt know how to quote properly and makes double posts
Where have I double posted and what is wrong with my quoting?
Just ignore that idiot.
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
User avatar
zebra
Kuski
Posts: 1010
Joined: 23 Sep 2003, 15:35
Team: TAP
Location: Finland
Contact:

Re: Optimal Elma player

Post by zebra »

dimkadimon wrote:
You should ask the code from milagros. He has made quite many additions to the game and probably already knows the source code by heart. But I think he has to code it in assembly because balazs hasn't given the original source code.
So milagros reimplemented elma in assembly?!? That sounds hard. How can I contact him?
No he didn't reimplement elma :) But somehow converted the elma.exe into assembly where he can then add his own stuff.

You can contact him in IRCnet channel #across nick teh_mila.
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

The finite accuracy of the variables used by the physic engine and your frame rate cause the fact that your ride is non-deterministic even if in so-called 'auto-levels' where you don't have to press any buttons: the bike still acts differently in each ride.
Actually if the variations are small (from run to run) then this might not be such a bad thing. In some sense it will encourage the agent's exploration. This is one of the reasons why TD-Gammon worked so well: http://gammonline.com/articles/td-gammon.html
You can contact him in IRCnet channel #across nick teh_mila.
Does he have an email? - because that would be easier for me.
User avatar
The_BoneLESS
38mins club
Posts: 4604
Joined: 7 Sep 2003, 00:30
Team: HHIT
Location: Dangerously close to the St-Lawrence River
Contact:

Re: Optimal Elma player

Post by The_BoneLESS »

Another thing to take in consideration is that the graphic card configuration has an impact on certain physical properties of the game (such as grip, bounces, etc.). I don't know if your agent can take that in consideration, haven't read the article yet.

Concerning that "collision script", or the "physical engine", or whatever you want to call it, it doesn't seem to react the same everytime with the same combination of button presses so, your agent couldn't rely on a simple button combination, it would have to adapt to the progress of its run. But i guess that's the whole point of the agent for him to be optimal. I'll definitly read this article later today, seems interesting.

For your information, milagros disassembled the elma.exe file if i'm not mistaken.
Last edited by The_BoneLESS on 11 Feb 2009, 17:54, edited 1 time in total.
Website || TT:38:05:33 || WC5:15th || HHIT for life || 9th world wide ... BAP is next
User avatar
Memphis
37mins club
Posts: 1907
Joined: 28 Feb 2003, 13:35
Location: B.C. Canada
Contact:

Re: Optimal Elma player

Post by Memphis »

Damn I wish I knew more about this kind of stuff as it interests me greatly. I think there is no hurt in trying to make the optimal play dimkadimon even if it doesn't suceed it is still a step in the right direction and a learning experience. If there is anything I can do to help you let me know.

Oh ya and don't listen to all the negative comments from noobs like Jappe. they are just pesamistic people.
Religious Man
|| TT: 37:47:70 || EX New Wave WR + Animal Farm WR || 24 Canadian Records || TEC Bronze Medal || HHIT in the 37min club || http://elastomaster.tripod.com/
||
Image
hex
Kuski
Posts: 125
Joined: 21 Jan 2009, 03:47

Re: Optimal Elma player

Post by hex »

here are some scene fantasies:

(1) april fools 2005:

Code: Select all

A russian Elma player has programmed a software which calculates the theoretical minimum times of the levels. It uses some kind of Euler algorithms which are very efficient in this kind of processing. There is also set a limit value (0.84) for the bug bounces.
(2) interesting stuff starts from linked post: http://mopolauta.moposite.com/viewtopic ... 92#p164592
among other things, has been mentioned the need of an expert on the subject.
Last edited by hex on 12 Feb 2009, 02:09, edited 1 time in total.
User avatar
zebra
Kuski
Posts: 1010
Joined: 23 Sep 2003, 15:35
Team: TAP
Location: Finland
Contact:

Re: Optimal Elma player

Post by zebra »

The_BoneLESS wrote:Another thing to take in consideration is that the graphic card configuration has an impact on certain physical properties of the game (such as grip, bounces, etc.). I don't know if your agent can take that in consideration, haven't read the article yet.
Only fps matters. If you are thinking of vsync and stuff, vsync just limits the fps to your screen refresh rate.
A winner of 4 GAA's (mc2 included), winner of mkup206, and a proud member of team TAP.
Play uni levels: http://koti.mbnet.fi/zebra/uni.html
Homepage: http://koti.mbnet.fi/zebra/elma.html
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

Has anyone played X-Moto? http://xmoto.tuxfamily.org/

It seems similar to Elasto, but slightly easier. All the code is available, so perhaps it will be easier to hack into that...
User avatar
The_BoneLESS
38mins club
Posts: 4604
Joined: 7 Sep 2003, 00:30
Team: HHIT
Location: Dangerously close to the St-Lawrence River
Contact:

Re: Optimal Elma player

Post by The_BoneLESS »

dimkadimon wrote:Has anyone played X-Moto? http://xmoto.tuxfamily.org/
It seems similar to Elasto, but slightly easier. All the code is available, so perhaps it will be easier to hack into that...
thing is, you won't find any xmoto fan here because, the game has horrible physics and is unplayable. But yeah, it is indeed an elma look-alike (clone).
Website || TT:38:05:33 || WC5:15th || HHIT for life || 9th world wide ... BAP is next
User avatar
Zweq
34mins club
Posts: 4055
Joined: 28 Nov 2002, 15:54
Location: suo mesta

Re: Optimal Elma player

Post by Zweq »

THIS X MOTO GAIM IS SO REDECULOUSLY GOOD ITS HELAROUSE. Yeah xmoto is a swearword around here.

nice topic anyway, although those links you provided seem a bit too far fetched
Image
User avatar
Grace
38mins club
Posts: 4847
Joined: 19 Nov 2005, 10:45
Location: Deep in your Imagination, Twirling your Dreams and Weaving your thoughts.

Re: Optimal Elma player

Post by Grace »

X-Moto is pretty much an attempted copy of Elma.
Image Cyberscore! Image
___________________________________________________
Image
Targets: 6 Legendary, 23 WC, 20 Pro, 5 Good | AvgTT: 39:59:96
User avatar
FinMan
36mins club
Posts: 2038
Joined: 13 Feb 2007, 11:14
Team: dat
Location: Jyväskylä, Finland

Re: Optimal Elma player

Post by FinMan »

It is a copy of elma, and bad one... Really sux physics etc.
Image
hex
Kuski
Posts: 125
Joined: 21 Jan 2009, 03:47

Re: Optimal Elma player

Post by hex »

i believe milagros could be of help to you. if he did implement real-time multiplayer, he should be capable of implementing the stuff you need. but first, show him your research funds...
zebra wrote:The finite accuracy of the variables used by the physic engine and your frame rate cause the fact that your ride is non-deterministic even if in so-called 'auto-levels' where you don't have to press any buttons: the bike still acts differently in each ride.
but, wouldn't that be countered by an impeccable, constant fps (say 60)? or would just the physics engine and the frame rate somehow need be synched one another? because if the bike gets one degree more inclined say at time zero, at time even four the difference could be a most significant one. i'm just saying that might not be really good for the agent. thanks
User avatar
milagros
Cheatless
Posts: 4560
Joined: 19 May 2002, 17:05

Re: Optimal Elma player

Post by milagros »

Reinforcement learning or any other gradient approach (neural nets, ..) is completely unsuitable for this problem. It's usually used if you don't know mathematical model exactly (robots in real environment) or if there is some uncertainity (opponents, ..).

Finding the optimal path for given level is very non-convex problems with loads of local minimas (called optimal runs with certain 'style'). You will never get to the optimum using gradient method, at most you can hope for nab style good run. Other problem is that there aren't really any local goals so your player knows what he is supposed to do at the beginning when no weights are trained yet. On the other hand there isn't any problem like this in checkers (get as many points in the next move) or car driving (get infront of the opponent).

This problem is classic optimization problem - given framerate (or set of timeevents), press keys so you touch the flower in minimal time under contraint that you have to take all apples. For this problem you know the mathematical model (set of equations in elma) and there is no uncertainity at all (no need to explain those autolevs now..). Final time (or failed to finish) can be evaluated in P for given set of keys. You don't have to run it through elma, you jsut have to copy the equations and it's going to be faster that way. At the end keys can be sent to elma to drive your final 'solution'.

Some years ago I tried to write some nice code to do that. At that time I was quite nab in asm, so quite many things were not really optimal. Eventhough the algorithm was able to drive better than WR times those times and with some more trying its easy to get some 33tt atleast.

I tried these apporaches:
You set manually local goal (get as fast as you can here, have as high speed as possible or some combination of speed/time) and then you try brute force where keys can change only at low framerate (lets say its running 1000fps and you can check key only 20fps). With that it's possible to count 'optimally' a few secs (with some other branch&bound hacks). From this 'optimal' solution you can improve by doing simple gradient decent by changing times of keypresses with 1ms accuracy. To get better solution you can always remember best X runs for each local goal and then try next local goal from each one of them (similar to some genetic algorithms).
This can be improved a lot using more clever apporach. I can drive some example run and then computer will try to find the optimal way under contraint that some distance meassure between my saveload run and the computer try will be close enough. With this several seconds can be calculated 'optimally' efficiently. Some clever branch&bound algorithm may be useful there.

Using this approach its possible to do perfect hoyla runs (like warmup, ..), impsy tricks where i just show how do i plan to do it and computer optimizes (hard bounce can be done at high fps on some 0.3s brute force), however, it will not find styles for you, you have to tell it what to try.

I haven't really finished coding it and I can't be bothered to do so atm. Some years ago I wanted to do some simple framework so you can code the algorithm in c++ yourself and it would work, however, I found out that most of the ppl here are programming nabs and noone would be interested. The only thing would be that it would help ppl cheat.

Edit:
btw. doing decent real time battle bot is completely different thing. A good one would prolly fail in internals.(some 45tt or smth) I didn't try that one but would be prolly even cooler competition if there were many skilled programmers around here.
[carebox]
welle
Kuski
Posts: 251
Joined: 15 Aug 2005, 20:52
Location: Germany

Re: Optimal Elma player

Post by welle »

That's some really interesting stuff mila! While we had different points of view in a little conversation last year, i couldn't agree more with your post. I've been thinking about working on this for quite a while, but I didn't know you already investigated those equations and implemented some optimization algorithms. I'm studying computer sience and mathematics, and consider myself to be a quite skilled programmer. I would be VERY interested in this C++ framework, or even these raw equations, that tell me the time for given keypresses at given points of time for a given level. Keep it up!
dimkadimon
Kuski
Posts: 30
Joined: 8 May 2005, 00:12
Location: Adelaide, Australia
Contact:

Re: Optimal Elma player

Post by dimkadimon »

Thank you for that insightful post Milagros! I now see that the problem is easier than I thought originally. You are right - there is no uncertainty and we know the exact model. If for example, we didn't know what level the agent is in then it would be more suitable for RL. This raises the question: is it possible to create a strong RL agent that is reasonably good at any level you give it, without seeding it with any starting styles? I suspect that its only possible if the levels are small (best ride < 20 seconds). However as the level becomes more complex, the goals become further away and without any intermediate rewards the problem probably becomes exponentially harder.

Nevertheless, the problem is still interesting even when you know the exact physics and exact environment. It just becomes an optimization problem, but its still a difficult and interesting optimization problem. I am sure this community would appreciate seeing how close (or maybe even better!) we can get to the current WR times. Perhaps we can learn something from the computer player. Who knows...

I have a background in maths and CS, so I am willing to spend some time on this problem. I am also urging others to join in, the more the merrier :) Milagros it would be awesome if you could provide us with those equations for the physical model or any code that you have - just to get us started!
User avatar
gimp
Kuski
Posts: 1141
Joined: 28 May 2007, 08:47

Re: Optimal Elma player

Post by gimp »

unfortunately dimkadimon, like milagros stated, most of the people here are programming nabs. the general population only has the ability to encourage you, i think its a great idea though. perhaps flat track would be the ideal level to begin with, seeing as there are no apples, and no unapparent styles that we havent thought of yet. it might be the easiest level to reach the optimal time?
God Bless America
User avatar
The_BoneLESS
38mins club
Posts: 4604
Joined: 7 Sep 2003, 00:30
Team: HHIT
Location: Dangerously close to the St-Lawrence River
Contact:

Re: Optimal Elma player

Post by The_BoneLESS »

I would surely like to help you but, i mostly (and pretty much only) have theoretical knowledge about optimization algorithms at this point.

Well, we could at least start with some algorithm for simplier problems like Warm Up or any level without apples (though i'm sure mila has already done that on his free time).
Website || TT:38:05:33 || WC5:15th || HHIT for life || 9th world wide ... BAP is next
hex
Kuski
Posts: 125
Joined: 21 Jan 2009, 03:47

Re: Optimal Elma player

Post by hex »

whatever happens you have got the interest of the whole scene.
User avatar
Ramone
50mins club
Posts: 1971
Joined: 20 May 2002, 15:42

Re: Optimal Elma player

Post by Ramone »

lika mila said, this will only help ppl cheat. only have i no interest at all in this, I would accually dislike it. if someone could cheat runs better than current wrs and spread would spoil the whole scene and ppl would quit elma.

Fuck that, go play elma for real damnit. or go hack some other game.
Elasto Mania - ez better
User avatar
FinMan
36mins club
Posts: 2038
Joined: 13 Feb 2007, 11:14
Team: dat
Location: Jyväskylä, Finland

Re: Optimal Elma player

Post by FinMan »

8km = this driving all day. :D
Image
User avatar
Kopaka
39mins club
Posts: 6611
Joined: 23 May 2002, 13:59
Team: LAME
Location: In a northern danish city beating YOUR record.
Contact:

Re: Optimal Elma player

Post by Kopaka »

I agree with ramone, actually surprised no one said that before, if you make some program that can make perfect runs it would ruin the whole point of the game.
hexeditor wrote:whatever happens you have got the interest of the whole scene.
Talk for yourself please.
hex
Kuski
Posts: 125
Joined: 21 Jan 2009, 03:47

Re: Optimal Elma player

Post by hex »

i thought most hardcore players with time would be more interested in the possibilities of the game than in hoyling the same levels over and over. that being said, i don't mind a computer playing for me and freeing me the task of losing my life at it.
User avatar
pawq
38mins club
Posts: 6547
Joined: 24 Aug 2008, 19:56
Team: TR
Location: Southampton, UK

Re: Optimal Elma player

Post by pawq »

but unfortunately some probably would like to have better times done by... computer :( On the other hand i think that the main reason is just to see what may be done in elma. what are the fastest possible runs in internals ect.
User avatar
totem
39mins club
Posts: 328
Joined: 26 Dec 2007, 14:04

Re: Optimal Elma player

Post by totem »

having coded a Yam IA last week end, i think programming a game ruins the fun of playing it self, it's verified

On the other hand, mila's post is very interesting to read
teajay
Donator duck
Posts: 10043
Joined: 3 Apr 2003, 17:53

Re: Optimal Elma player

Post by teajay »

I think it would be a very interesting endeavour, notwithstanding the possibility of cheat. If you guys are really interested in putting time in this, you should co-op with eachother, but only in a closed circle ofcourse. I'm not really impressed by the conservative opinions that find this a threat towards real people's wr's. It's not gonna be a threat.
User avatar
Mats
39mins club
Posts: 1183
Joined: 5 Apr 2007, 12:30
Location: Norway, Sandnes

Re: Optimal Elma player

Post by Mats »

here is last computer that tried to take elma wr's by itself
Image
TT:39.59.86|| AvgTT:41.49.24 || Multi TT:27:43:82 || Team [TR]
User avatar
Zweq
34mins club
Posts: 4055
Joined: 28 Nov 2002, 15:54
Location: suo mesta

Re: Optimal Elma player

Post by Zweq »

my keyboard was more worn out before it died. i guess that comp just didnt hoyl enough to get wrs
Image
hex
Kuski
Posts: 125
Joined: 21 Jan 2009, 03:47

Re: Optimal Elma player

Post by hex »

i wrote to zweq for his opinion, as he's one of the best players around. here it is, folks:
Zweq wrote:I'm very interested in seeking for limits in elma as long as it is not exploited nor will damage legit tables (styles arent published)
Last edited by hex on 23 Feb 2009, 05:10, edited 1 time in total.
User avatar
Ramone
50mins club
Posts: 1971
Joined: 20 May 2002, 15:42

Re: Optimal Elma player

Post by Ramone »

why would zweqs opinion matter more than any elses? I am not at all interested in seeing a list of theoretical times. How could anyone guarantee drives/styles could be kept secret.

This is not intereresting! It sucks.
Elasto Mania - ez better
Post Reply