Create a regular expression language and run an LSTM on it.
This project tests the capabilities of LSTMs.
- Generate train and dev sets.
gen_examples.py is an example of how to create a language.
The script creates samples of the language defined by the regular expression:
[0-9]+a+[0-9]+b+[0-9]+c+[0-9]+d+[0-9]+
Parameters:
samples_file – the file the samples will be written to.
num_samples – number of samples to generate.
seq_max – maximum length of each contiguous sequence; for example, a+ becomes a run of a random number of a characters in the range [1, seq_max].
Example: python gen_examples.py data/train 1000 50
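The generation step described above can be sketched roughly as follows. This is not the code of gen_examples.py; it is a minimal illustration that assumes one sample per line in the output file. The names LANGUAGE, generate_sample and write_samples are hypothetical.

```python
import random
import re

# The language from this README, as a regular expression.
LANGUAGE = re.compile(r"[0-9]+a+[0-9]+b+[0-9]+c+[0-9]+d+[0-9]+")

DIGITS = "0123456789"


def generate_sample(seq_max, rng=random):
    """Build one string of the language; every run has length in [1, seq_max]."""
    parts = []
    for letter in "abcd":
        # A run of random digits, then a run of the current letter.
        parts.append("".join(rng.choice(DIGITS) for _ in range(rng.randint(1, seq_max))))
        parts.append(letter * rng.randint(1, seq_max))
    # The language ends with one more run of digits.
    parts.append("".join(rng.choice(DIGITS) for _ in range(rng.randint(1, seq_max))))
    return "".join(parts)


def write_samples(samples_file, num_samples, seq_max):
    """Write num_samples strings to samples_file (assumed format: one per line)."""
    with open(samples_file, "w") as f:
        for _ in range(num_samples):
            f.write(generate_sample(seq_max) + "\n")
```

Checking each generated string with LANGUAGE.fullmatch is a cheap way to confirm that a generator for a new language really produces members of that language.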
- Train the LSTM model on the language.
Run basicLSTM.py on the data you created and check the results.
Parameters:
train_samples – train samples file.
dev_samples – dev samples file.
Example: python basicLSTM.py data/train data/dev
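Before an LSTM can consume the samples, each string usually has to be turned into a sequence of integer indices. The sketch below shows one common way to do that. It is an assumption about the preprocessing, not a description of what basicLSTM.py actually does, and the names ALPHABET, PAD, encode and pad_batch are hypothetical.

```python
# Assumed alphabet: the ten digits plus the letters a-d used by the language.
ALPHABET = "0123456789abcd"
PAD = 0  # index 0 is reserved for padding
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode(sample):
    """Map each character of a sample to its integer index."""
    return [CHAR_TO_INDEX[ch] for ch in sample]


def pad_batch(samples):
    """Encode a list of samples and right-pad them to a common length."""
    encoded = [encode(s) for s in samples]
    longest = max(len(seq) for seq in encoded)
    return [seq + [PAD] * (longest - len(seq)) for seq in encoded]
```

In PyTorch, the padded lists can be passed to torch.tensor and then to a torch.nn.Embedding layer created with padding_idx=0, so that padding positions map to a zero vector.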
- PyTorch – the deep learning platform used
Bar Katz – bar-katz on github – [email protected]
- Fork it (https://github.com/bar-katz/Basic_RNN/fork)
- Create your feature branch (git checkout -b feature)
- Commit your changes (git commit -am 'add feature')
- Push to the branch (git push origin feature)
- Create a new Pull Request
Share your results!
Found a language an LSTM cannot learn? Try to understand why, and add your language description to the languages directory.