Compositional Generalization in Seq2Seq Models

Revisiting the Compositional Generalization Abilities of Neural Sequence Models (ACL 2022)

Recently, there has been increased interest in evaluating whether neural models can generalize compositionally. Previous work showed that seq2seq models such as LSTMs lack the inductive biases required for compositional generalization. We show that by modifying the training data distribution, neural sequence models such as LSTMs and Transformers achieve near-perfect accuracy on compositional generalization benchmarks such as SCAN and Colors.
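As a rough illustration of this data-modification idea (not the exact construction used in the paper), the sketch below adds extra standalone primitive examples to a SCAN-style training file. The file names, the tab-separated command/action format, and the invented primitive tokens are assumptions made purely for the example.

import random

# Hypothetical file names and format (one "command<TAB>action sequence" pair per line);
# the repository's actual data files may be organized differently.
TRAIN_FILE = "train.tsv"
AUGMENTED_FILE = "train_100_prims.tsv"
NUM_EXTRA_PRIMS = 100

def make_extra_primitives(n):
    # Invent n new primitives, each seen only in isolation,
    # e.g. command "prim_7" mapping to action "I_PRIM_7".
    return [("prim_%d" % i, "I_PRIM_%d" % i) for i in range(n)]

def augment(train_file, out_file, n_extra):
    with open(train_file) as f:
        pairs = [line.rstrip("\n").split("\t") for line in f if line.strip()]
    pairs += make_extra_primitives(n_extra)
    random.shuffle(pairs)  # mix the new primitives into the original examples
    with open(out_file, "w") as f:
        for src, trg in pairs:
            f.write(src + "\t" + trg + "\n")

if __name__ == "__main__":
    augment(TRAIN_FILE, AUGMENTED_FILE, NUM_EXTRA_PRIMS)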

...

Dependencies

  • Compatible with Python 3.6
  • Dependencies can be installed using Compositional-Generalization-Seq2Seq/code/requirements.txt

Setup

Install VirtualEnv using the following (optional):

$ [sudo] pip install virtualenv

Create and activate your virtual environment (optional):

$ virtualenv -p python3 venv
$ source venv/bin/activate

Install all the required packages:

at Compositional-Generalization-Seq2Seq/code:

$ pip install -r requirements.txt

To create the relevant directories, run the following command in the corresponding directory of that model:

e.g., at Compositional-Generalization-Seq2Seq/code/transformer:

$ sh setup.sh

Then transfer all the data folders to the data subdirectory of that model, which in this case is Compositional-Generalization-Seq2Seq/code/transformer/data/
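For example, if the released data folders sit in a hypothetical ~/scan_data directory, a small script such as the following (the source path is an assumption) copies them into place when run from Compositional-Generalization-Seq2Seq/code/transformer:

import os
import shutil

SRC = os.path.expanduser("~/scan_data")  # hypothetical location of the downloaded data folders
DST = "data"                             # created by setup.sh; run this from code/transformer

for name in os.listdir(SRC):
    src_path = os.path.join(SRC, name)
    if os.path.isdir(src_path):
        shutil.copytree(src_path, os.path.join(DST, name))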

Models

This repository includes implementations of two models:

  • Transformer in Compositional-Generalization-Seq2Seq/code/transformer
    • Sequence-to-Sequence Transformer Model
  • LSTM in Compositional-Generalization-Seq2Seq/code/lstm
    • Sequence-to-Sequence LSTM Model

Datasets

We work with the following datasets:

  • SCAN
  • Colors

Usage

The full set of command-line arguments can be found in the respective args.py file. Here, we illustrate how to run a Transformer on the SCAN add_jump dataset whose train set was modified to include 100 extra primitives. Follow the same procedure to run any experiment with any model.

Training Transformer model on SCAN add_jump_100_prims_controlled train set

at Compositional-Generalization-Seq2Seq/code/transformer:

$ python -m src.main -mode train -project_name test_runs -model_selector_set val -pretrained_model_name none -finetune_data_voc none -dev_set -no-test_set -no-gen_set -dataset add_jump_100_prims_controlled -dev_always -no-test_always -no-gen_always -epochs 150 -save_model -no-show_train_acc -embedding random -no-freeze_emb -no-freeze_emb2 -no-freeze_transformer_encoder -no-freeze_transformer_decoder -no-freeze_fc -d_model 64 -d_ff 512 -decoder_layers 3 -encoder_layers 3 -heads 2 -batch_size 64 -lr 0.0005 -emb_lr 0.0005 -dropout 0.1 -run_name RUN-train_try -gpu 1
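The -d_model, -d_ff, -heads, and layer flags above define a small encoder-decoder Transformer. As a rough point of reference only (the repository's own model code also handles embeddings, positional encodings, and the output projection), the same configuration expressed with torch.nn.Transformer would be:

import torch.nn as nn

# Hyperparameters mirroring the command-line flags above; this is only a
# reference configuration, not the repository's actual model class.
model = nn.Transformer(
    d_model=64,            # -d_model 64
    nhead=2,               # -heads 2
    num_encoder_layers=3,  # -encoder_layers 3
    num_decoder_layers=3,  # -decoder_layers 3
    dim_feedforward=512,   # -d_ff 512
    dropout=0.1,           # -dropout 0.1
)
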
Testing the trained Transformer model on SCAN add_jump_100_prims_controlled test set

at Compositional-Generalization-Seq2Seq/code/transformer:

$ python -m src.main -mode test -project_name test_runs -pretrained_model_name RUN-train_try -finetune_data_voc none -no-dev_set -no-test_set -gen_set -dataset add_jump_100_prims_controlled_10_prims_test -batch_size 1024 -run_name RUN-test_try -gpu 1

Citation

If you use our data or code, please cite our work:

@misc{https://doi.org/10.48550/arxiv.2203.07402,
  doi = {10.48550/ARXIV.2203.07402},
  url = {https://arxiv.org/abs/2203.07402},
  author = {Patel, Arkil and Bhattamishra, Satwik and Blunsom, Phil and Goyal, Navin},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
  title = {Revisiting the Compositional Generalization Abilities of Neural Sequence Models},
  publisher = {arXiv},
  year = {2022}, 
  copyright = {arXiv.org perpetual, non-exclusive license}
}

For any clarification, comments, or suggestions, please contact Arkil or Satwik.
