Geometry Dash is a popular rhythm game where you play as a cube character who slides along the level, avoiding obstacles strategically placed to the rhythm of the song in the level.
For my machine learning final project in university, I chose to develop a model that is capable of traversing the environments presented in the game using only the pixel data that is repeatedly captured from the screen.
We wanted the project to be a simple supervised learning model with convolutional and fully connected layers, so data collection and processing were necessary (they would not have been had we chosen to use reinforcement learning instead).
First, I wrote code to record the screen data along with the key being pressed at any given moment (or the absence of a key press). After each capture, the image is post-processed: it is converted to grayscale and then resized to 100×100. Finally, the processed image is appended to an array of captured images, and the key press information is appended to a separate array.
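The post-processing step can be sketched roughly as follows. This is a minimal illustration and not the project's actual code. It assumes frames arrive as RGB NumPy arrays and that the label is a single binary "jump key held" flag. It also uses luminance weights and a nearest-neighbour resize only to stay dependency-light. The names `preprocess`, `record`, `frames` and `keys` are hypothetical.

```python
import numpy as np

FRAME_SIZE = 100  # target width and height in pixels, as stated in the post

# ITU-R BT.601 luminance weights (an assumption: the post does not say
# which grayscale conversion was used).
LUMA = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def preprocess(frame_rgb, size=FRAME_SIZE):
    """Convert an H x W x 3 RGB frame to a size x size grayscale image."""
    gray = frame_rgb[..., :3].astype(np.float32) @ LUMA
    # Nearest-neighbour resize: sample evenly spaced rows and columns.
    rows = np.arange(size) * gray.shape[0] // size
    cols = np.arange(size) * gray.shape[1] // size
    small = gray[rows][:, cols]
    return np.clip(np.rint(small), 0, 255).astype(np.uint8)


frames = []  # preprocessed screen captures
keys = []    # 1 if the key was held during that frame, else 0


def record(frame_rgb, key_pressed):
    """Store one processed frame and its key label in parallel arrays."""
    frames.append(preprocess(frame_rgb))
    keys.append(1 if key_pressed else 0)
```

Keeping the images and labels in two parallel arrays means that index `i` in `frames` always lines up with index `i` in `keys`, which is all a supervised model needs.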
My next step was to write the code for the model, which was simple enough using the PyTorch machine learning library. I wrote the model as described above and had it cycle through data from different levels during training, so that it would not overfit to any one level. To further reduce that risk, I provided the model with an equal number of recordings for each level.
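One way to picture the level cycling and balancing described above is the sketch below. It is only an illustration under assumptions: recordings are grouped in a dictionary keyed by level name, and the counts are equalised by subsampling every level down to the smallest one. The function name `balanced_round_robin` and the level names are hypothetical.

```python
import random


def balanced_round_robin(recordings_by_level, seed=0):
    """Yield (level, recording) pairs, visiting the levels in turn."""
    rng = random.Random(seed)
    # Use the same number of recordings from every level (assumption:
    # subsample down to the smallest level).
    per_level = min(len(recs) for recs in recordings_by_level.values())
    sampled = {
        level: rng.sample(recs, per_level)
        for level, recs in recordings_by_level.items()
    }
    # Alternate between levels so no single level dominates a stretch
    # of training.
    for i in range(per_level):
        for level in sampled:
            yield level, sampled[level][i]


# Usage with placeholder data:
data = {"level_a": ["a1", "a2", "a3"], "level_b": ["b1", "b2"]}
for level, recording in balanced_round_robin(data):
    print(level, recording)
```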
I trained the model such that given some input image, it must output the action associated with it. After training the model over several epochs on a variety of data obtained from different levels in the game, it seemed to be able to generalize the input information well enough that it could play a level on its own.
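For readers who want something concrete, here is a minimal PyTorch sketch of that kind of image-to-action classifier and a single training step. It is not the model from the project. The layer sizes, the two-action output (press or no press), the Adam optimiser and the random stand-in data are all assumptions made for illustration.

```python
import torch
from torch import nn


class FrameClassifier(nn.Module):
    """Small CNN mapping a 1 x 100 x 100 grayscale frame to action logits."""

    def __init__(self, num_actions=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),   # 100 -> 48
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),  # 48 -> 22
            nn.Conv2d(32, 32, kernel_size=3, stride=2), nn.ReLU(),  # 22 -> 10
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 10 * 10, 128), nn.ReLU(),
            nn.Linear(128, num_actions),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


def train_step(model, optimizer, frames, actions):
    """Run one supervised update and return the cross-entropy loss."""
    model.train()
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(model(frames), actions)
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)
model = FrameClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random stand-in batch: 8 frames scaled to [0, 1] with binary key labels.
frames = torch.rand(8, 1, 100, 100)
actions = torch.randint(0, 2, (8,))
loss = train_step(model, optimizer, frames, actions)

# At play time, the predicted action is the highest-scoring logit.
model.eval()
with torch.no_grad():
    predicted = model(frames[:1]).argmax(dim=1)
```

Treating the problem as classification over actions keeps training to ordinary cross-entropy on (frame, key) pairs, which matches the supervised setup described above.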
Here is a video of the model playing a custom-made level, hosted by my friend and project partner Joshua Bales, who also played the game and used my recording code to gather the data for the model to train on.