Tuning evaluation function #89

Open
Andyloris opened this issue Dec 14, 2021 · 4 comments

Comments

@Andyloris
Contributor

Andyloris commented Dec 14, 2021

Hi, I recently coded a small chess engine which uses minimax and an evaluation function,
and then I realised that I needed to tune the values of the evaluation function to gain some Elo, and that I could do the same with the bot.
Have you ever tried this?

@pmariglia
Owner

Great question.

I've explored evaluation function tuning a bit on a private branch, and these are my thoughts:

  1. Tuning via self-play is extremely slow. A deterministic game like chess can need up to 100,000 self-play games of tuning before an improvement is noticeable. When you account for the randomness of a game like Pokemon, it's almost certainly going to take many more than that. Playing this many games, even super fast games with the bot against itself, is just not going to happen in any reasonable amount of time.
  2. (This one isn't backed by any data; it's more of an intuition I have, and I very well may be wrong.) A static evaluation function just isn't enough to properly play competitive Pokemon. The value of something like a Pokemon's HP is dynamic and can change depending on the team matchup. In my opinion, an evaluation function would need much more logic than the one in this project in order to properly capture what is important in a Pokemon battle.
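To put a rough number on the first point, here is a back-of-the-envelope sketch of how many games it takes before a given Elo gain stands out from noise. It assumes the standard logistic Elo model, independent games, and a per-game standard deviation of 0.5 (the worst case for win/loss results). The function names are made up for illustration and are not part of this project.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score of the stronger side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def games_needed(elo_diff: float, z: float = 1.96, per_game_std: float = 0.5) -> int:
    """Games needed before the score edge equals z standard errors.

    Assumption: games are independent and each result has a standard
    deviation of at most 0.5, so the standard error is 0.5 / sqrt(n).
    """
    edge = expected_score(elo_diff) - 0.5
    return math.ceil((z * per_game_std / edge) ** 2)


for diff in (5, 10, 20, 50):
    print(f"{diff:>3} Elo: about {games_needed(diff)} games")
```

The required number of games grows roughly with the inverse square of the Elo difference, which is why small evaluation tweaks are so expensive to verify by self-play alone.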

That being said, I haven't spent too much time looking into this and I would certainly like to be wrong. If you have any ideas on how this can be approached, I'd love to hear them.

What I've experimented with:

Texel's method might be feasible for this project.
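For context, a minimal sketch of the idea behind Texel's method: take positions labelled with the final game result, map the static evaluation to a win probability with a sigmoid, and adjust the weights one at a time to reduce the mean squared error. Everything here is an assumption made for illustration: the linear evaluation, the feature vectors, the step size, and the toy data are hypothetical and not taken from this project.

```python
import math


def sigmoid(score, k=1.0):
    """Map an evaluation score to a predicted win probability."""
    return 1.0 / (1.0 + math.exp(-k * score))


def evaluate(weights, features):
    """Hypothetical linear evaluation: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def mean_squared_error(weights, positions, k=1.0):
    """Average squared gap between game results and predicted win probability."""
    total = 0.0
    for features, result in positions:
        total += (result - sigmoid(evaluate(weights, features), k)) ** 2
    return total / len(positions)


def texel_tune(weights, positions, step=0.1, max_passes=100):
    """Coordinate-wise local search: nudge one weight, keep it if the error drops."""
    best = list(weights)
    best_err = mean_squared_error(best, positions)
    for _ in range(max_passes):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                err = mean_squared_error(trial, positions)
                if err < best_err:
                    best, best_err, improved = trial, err, True
                    break
        if not improved:
            break
    return best, best_err


# Toy data: (feature vector, result) with result 1.0 = win, 0.0 = loss.
positions = [
    ([0.5, -0.2], 1.0),
    ([-0.4, 0.1], 0.0),
    ([0.3, 0.3], 1.0),
    ([-0.1, -0.5], 0.0),
]
start = [0.0, 0.0]
start_err = mean_squared_error(start, positions)
tuned, tuned_err = texel_tune(start, positions)
print(start_err, tuned_err, tuned)
```

The tuning loop itself needs no new games once the labelled positions exist. The expensive part is generating and storing enough of them.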

@Andyloris
Contributor Author

Andyloris commented Dec 15, 2021

If you're using Texel's tuning, you would need to somehow store all the details of a "position" in a text file: the Pokemon on both sides, their attacks, PP, items, EVs, IVs and stats, plus the information about the terrain. That could be done, but it would take a lot of space if you wanted to generate a LOT of positions, and it would take a lot of time, so I don't think it's practical.
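To get a feel for the storage cost, here is one hypothetical way to write a position as a single JSON line and measure its size. The record layout and field names are invented for this sketch and do not match the bot's internal state. It also assumes that storing final stats is enough, rather than EVs and IVs separately.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass
class PokemonRecord:
    species: str
    hp_fraction: float        # current HP divided by max HP
    item: Optional[str]
    moves: Dict[str, int]     # move name -> remaining PP
    stats: Dict[str, int]     # final stats (assumption: EVs/IVs not stored separately)


@dataclass
class PositionRecord:
    side_one: List[PokemonRecord]
    side_two: List[PokemonRecord]
    terrain: Optional[str]
    result: float             # 1.0 if side one won the game, 0.0 otherwise


def to_json_line(position: PositionRecord) -> str:
    """Serialize one position as a compact single-line JSON string."""
    return json.dumps(asdict(position), separators=(",", ":"))


mon = PokemonRecord(
    species="garchomp",
    hp_fraction=0.75,
    item="choicescarf",
    moves={"earthquake": 16, "outrage": 16, "stoneedge": 8, "fireblast": 8},
    stats={"hp": 357, "atk": 359, "def": 226, "spa": 176, "spd": 206, "spe": 333},
)
# Reuse the same record six times per side just to estimate the size.
position = PositionRecord(side_one=[mon] * 6, side_two=[mon] * 6, terrain=None, result=1.0)
line = to_json_line(position)
size_bytes = len(line.encode("utf-8"))
print(f"{size_bytes} bytes per position, "
      f"about {size_bytes * 1_000_000 / 1e9:.1f} GB per million positions")
```

A binary or compressed format would shrink this considerably, so the time needed to play the games is likely the tighter constraint.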

I've tried Stockfish's tuning method, but so far it's a complete failure.

I think using something like CLOP could work because it doesn't need to generate a lot of positions and actually works in other games where there is randomness (backgammon). Did you try CLOP?

@pmariglia
Owner

> If you're using Texel's tuning, you would need to somehow store all the details of a "position" in a text file: the Pokemon on both sides, their attacks, PP, items, EVs, IVs and stats, plus the information about the terrain. That could be done, but it would take a lot of space if you wanted to generate a LOT of positions, and it would take a lot of time, so I don't think it's practical.

Yup, that is something I was going to try. Running all of those battles would probably take several weeks just based on some rough math about how long battles can take. Then minimizing the error would probably take even longer.

> I've tried Stockfish's tuning method, but so far it's a complete failure.

My experience was identical :) Too slow, and the variables just took a random walk.
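A sketch of why this can happen, assuming the method in question is SPSA (simultaneous perturbation stochastic approximation), which Stockfish's testing framework supports for parameter tuning. SPSA estimates a gradient from just two noisy measurements per step. When the noise in those measurements is much larger than the real difference between the two parameter sets, each update is mostly noise. The loss function, gain constants, and noise level below are made up for illustration.

```python
import random


def spsa_minimize(loss, theta, iterations=200, a=0.1, c=0.1, seed=0):
    """Minimal SPSA loop with the commonly used gain decay exponents."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(1, iterations + 1):
        a_k = a / k ** 0.602          # step size decay
        c_k = c / k ** 0.101          # perturbation size decay
        # Perturb every parameter at once by +c_k or -c_k.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Two loss evaluations give a gradient estimate for all parameters.
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta


TARGET = [1.0, -2.0]  # hypothetical "best" parameter values


def clean_loss(theta):
    return sum((t - g) ** 2 for t, g in zip(theta, TARGET))


noise_rng = random.Random(1)


def noisy_loss(theta):
    # Stand-in for a match result: the true signal buried in large noise.
    return clean_loss(theta) + noise_rng.gauss(0.0, 5.0)


def distance(theta):
    return clean_loss(theta) ** 0.5


clean_result = spsa_minimize(clean_loss, [0.0, 0.0])
noisy_result = spsa_minimize(noisy_loss, [0.0, 0.0])
print("noise-free distance to target:", distance(clean_result))
print("noisy distance to target:     ", distance(noisy_result))
```

In a real tuning run the "loss" is a match outcome, so lowering the noise means playing more games per measurement, which brings back the cost problem of self-play.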

> I think using something like CLOP could work because it doesn't need to generate a lot of positions and actually works in other games where there is randomness (backgammon). Did you try CLOP?

Haven't tried CLOP, mostly because I'm not too familiar with the algorithm itself. I'd be open to exploring the idea with you (just send me a message on Discord; my handle is in the README), but my gut tells me that no parameter optimization technique is going to give good results for Pokemon, especially ones that were developed for chess. Again, very happy to be wrong.

@Andyloris
Contributor Author

Andyloris commented Dec 15, 2021

I'm not too familiar with CLOP, but it seems really easy to use. I think I'm gonna try it with chess before using it with Pokemon.
I am using this program to tune my engine.

Edit: I'm way too impatient to start with chess, so I'm gonna start with Pokemon.
