REINFORCEMENT LEARNING FOR SNAKE


We've all played it. But have you ever made a computer play it?

Feel like snaking?

python3 snake.py

Feel like learning the snaking?

python3 qlearn.py Q.npy

Feel like watching the learned snaking?

python3 autoplay.py Q.npy

Snake it don't break it.

Model

There are approximately as many states in a five-by-five game of Snake as there are ants on Earth: about 100 000 000 000 000 000. [1] Problematic. How to deal? Well, we reduce the state space to the eight cells surrounding the ant. I mean snake. Then we use plain Q-learning. Probably will try some other function estimator in the future. That one will be on-policy though (i.e. it uses the same policy to estimate values as it does to choose actions), as opposed to Q-learning, which is off-policy: it simply takes the value of the next state's best action as the value of the next state, whichever action actually gets chosen.
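To make the idea concrete, here is a rough sketch of what the reduced state and the two kinds of update could look like. It is not the code in `qlearn.py`. The cell codes, the four-action layout, the choice to centre the eight cells on the snake's head, treating off-board cells as blocked, and the names `encode_state`, `q_learning_update` and `sarsa_update` are all assumptions made up for illustration. SARSA is used as a standard example of an on-policy update and is not necessarily what will be tried later.

```python
import numpy as np

# Assumed cell codes (hypothetical, not taken from the repository).
EMPTY, BLOCKED, FOOD = 0, 1, 2
N_ACTIONS = 4  # assumed: up, down, left, right

# Offsets of the eight cells around the head, row by row.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def encode_state(grid, head):
    """Map the eight cells around `head` to one integer in [0, 3**8)."""
    rows, cols = grid.shape
    index = 0
    for dr, dc in NEIGHBOURS:
        r, c = head[0] + dr, head[1] + dc
        inside = 0 <= r < rows and 0 <= c < cols
        cell = int(grid[r, c]) if inside else BLOCKED  # off-board counts as blocked
        index = index * 3 + cell  # read the eight cells as a base-3 number
    return index


def q_learning_update(Q, s, a, reward, s_next, done, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    target = reward if done else reward + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])


def sarsa_update(Q, s, a, reward, s_next, a_next, done, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually chosen next."""
    target = reward if done else reward + gamma * Q[s_next, a_next]
    Q[s, a] += alpha * (target - Q[s, a])


# Tiny usage example on an empty 5x5 board with food right of the head.
Q = np.zeros((3 ** 8, N_ACTIONS))
grid = np.zeros((5, 5), dtype=int)
grid[2, 3] = FOOD
state = encode_state(grid, head=(2, 2))
q_learning_update(Q, state, a=3, reward=1.0, s_next=state, done=True)
```

With three assumed cell codes and eight cells, the table has at most 3**8 = 6561 rows, which is a lot friendlier than the ant count. The only difference between the two updates is the bootstrap term: `Q[s_next].max()` versus `Q[s_next, a_next]`.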

[1] https://www.quora.com/How-many-ants-are-there-in-the-world