For more generated levels, please scroll to the bottom!
This repo aims to replicate *Super Mario as a String: Platformer Level Generation Via LSTMs*. There are, however, some differences between this implementation and what the paper describes. In this implementation:
- Snaking and level depth are not used.
- Although the paper claims that the number of characters between the left and right sides of a pipe is decreased by snaking down and up, that same number is increased by snaking up and down. The two effects are two sides of the same coin.
- The paper claims that one depth character is added to the fifth column, two depth characters to the tenth column, and so on. This means that an average Mario level (~200 columns) would have 40+ depth characters appended to its later columns. To me, this seems very inefficient, since `seq_length` is only 200 (see the first sketch after this list).
- Validation is done after each epoch instead of after every 200 training examples (I guessed this number; the paper was not clear on it). See the second sketch after this list.
- The LSTM used here is similar to Andrej Karpathy's min-char-rnn.py, which supports seeds of arbitrary length during generation; it does not simply use the last n characters to predict the next. It isn't clear what the paper used (see the third sketch after this list).
- Please read through Andrej's code line by line if you are confused by what I wrote above; it gave me a crystal-clear understanding of the difference between an LSTM and a standard feed-forward neural network (which simply uses the last n characters to predict the next).
- For some empirical evidence that an LSTM retains information beyond the `seq_length` used for training, check out the first paragraph of section 4.2 of *Visualizing and Understanding Recurrent Networks* by Andrej.
- Perfect pipe generation is not achieved here.
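
The first sketch makes the depth-character overhead concrete. It shows the paper's scheme as I read it (this is NOT what this repo does); the `~` depth character and the column strings are made up for illustration, and only the `c // 5` growth rule comes from the paper:

```python
# Hypothetical sketch of the paper's depth encoding (not used in this repo).
# Assumption: column c receives c // 5 depth characters, matching "one depth
# character at the fifth column, two at the tenth, and so on".
def add_depth_chars(columns, depth_char="~"):
    """Append c // 5 copies of depth_char to the c-th column string."""
    return [col + depth_char * (c // 5) for c, col in enumerate(columns, start=1)]

level = ["X---------------" for _ in range(200)]  # toy 200-column level
deep = add_depth_chars(level)
print(len(deep[-1]) - len(level[-1]))             # -> 40 extra characters
```

With `seq_length` at 200, those 40 trailing characters eat a large share of the training window, which is why I left depth out.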
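The second sketch shows per-epoch validation. The real training code lives in 02_train_model.ipynb; the Keras API, layer sizes, and vocab size here are my assumptions for illustration only. Keras runs the validation pass at the end of every epoch by default:

```python
import numpy as np
from tensorflow.keras import layers, models

vocab_size, seq_length = 40, 200                  # placeholder sizes
model = models.Sequential([
    layers.Input(shape=(seq_length, vocab_size)),
    layers.LSTM(128),
    layers.Dense(vocab_size, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam")

X = np.zeros((64, seq_length, vocab_size))        # toy stand-in data
y = np.zeros((64, vocab_size))
# validation_split holds out 10%; the validation pass runs once per epoch.
model.fit(X, y, epochs=3, validation_split=0.1)
```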
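The third sketch shows why a recurrent generator accepts seeds of any length. For brevity it uses a vanilla RNN cell, as in min-char-rnn.py, rather than an LSTM, and the weights are untrained toys; the point is that only the hidden state h is carried forward, so the seed is consumed one character at a time with no fixed window:

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 10, 16                                   # toy vocab and hidden sizes
params = {                                      # random, untrained weights
    "Wxh": rng.standard_normal((H, V)) * 0.01,
    "Whh": rng.standard_normal((H, H)) * 0.01,
    "Why": rng.standard_normal((V, H)) * 0.01,
    "bh": np.zeros((H, 1)),
    "by": np.zeros((V, 1)),
}

def step(x, h, Wxh, Whh, Why, bh, by):
    """One recurrent step: consume one one-hot character, update state h."""
    h = np.tanh(Wxh @ x + Whh @ h + bh)
    y = Why @ h + by
    p = np.exp(y - y.max())                     # softmax over the vocabulary
    return p / p.sum(), h

def one_hot(i):
    x = np.zeros((V, 1))
    x[i] = 1.0
    return x

def generate(seed_ids, n_new):
    h = np.zeros((H, 1))
    # Warm-up: run the WHOLE seed through the net. Only h is kept, so the
    # seed can be any length -- there is no fixed n-character window.
    for i in seed_ids:
        p, h = step(one_hot(i), h, **params)
    out = []
    for _ in range(n_new):
        i = rng.choice(V, p=p.ravel())          # sample the next character
        out.append(int(i))
        p, h = step(one_hot(i), h, **params)
    return out

print(generate([3, 1, 4, 1, 5, 9, 2, 6], n_new=20))  # 8-char seed works fine
```

A fixed-window feed-forward model, by contrast, would have to truncate or pad every seed to exactly n characters.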
Please shoot me a pull request or message if you have any suggestions.
To understand how everything works, please go through the following notebooks in sequence:
1. `01_preprocess_data.ipynb`
2. `02_train_model.ipynb`
3. `03_generate_txt_from_model.ipynb`
4. `04_convert_txt_to_png.ipynb`