This repository has been archived by the owner on May 6, 2024. It is now read-only.

Update the documentation for NLD data to point to new hosting location. #364

Merged 1 commit on Nov 17, 2023
120 changes: 85 additions & 35 deletions DATASET.md
@@ -9,65 +9,115 @@ This data is licensed under the NetHack General Public License - based on the GP

## Accessing the Dataset

The dataset is currently hosted on WeTransfer with open access for all. It will eventually move to its own dedicated hosting site, which is in the process of being set up. For the time being, `NLD-AA` is one file, while `NLD-NAO` is in 5 parts (4 ttyrec zips + the xlogfiles).
The dataset is currently hosted on FAIR's AWS with open access for all. Given its large size, it has been split into smaller chunks for ease of downloading.


### Download Links

#### TasterPacks

We provide a small "taster pack" dataset, containing a random subsample of the full datasets, to allow fast iteration for those looking to play around with `nld`.
- [`nld-aa-taster.zip`](https://we.tl/t-YjbQy4yPQJ) (1.6GB)
- [`nld-aa-taster.zip`](https://dl.fbaipublicfiles.com/nld/nld-aa-taster/nld-aa-taster.zip) (1.6GB)


#### Full Downloads
You can download these by visiting the links or using curl:

`NLD-AA` (1 file)
- [`nld-aa.zip`](https://we.tl/t-wwN4lD7Hqn) (90 GB)


`NLD_NAO` (5 files)
- [`nld-nao_part1.zip`](https://we.tl/t-XQe15aXAes) (54GB)
- [`nld-nao_part2.zip`](https://we.tl/t-YRHHAb9gTe) (63GB)
- [`nld-nao_part3.zip`](https://we.tl/t-XB0iundCAU) (54GB)
- [`nld-nao_part4.zip`](https://we.tl/t-pkWlT0yTFK) (42GB)
- [`nld-nao_xlogfiles.zip`](https://we.tl/t-vy7IAGohCu) (124MB)


#### Downloading from Command Line

WeTransfer obscures the final download link, appending authentication keys to the link. To obtain a final working url that can function with, for instance, `wget` or `curl`:
`NLD-AA` (16 files)
```
# Download NLD-AA
mkdir -p nld-aa
curl -o nld-aa/nld-aa-dir-aa.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-aa.zip
curl -o nld-aa/nld-aa-dir-ab.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ab.zip
curl -o nld-aa/nld-aa-dir-ac.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ac.zip
curl -o nld-aa/nld-aa-dir-ad.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ad.zip
curl -o nld-aa/nld-aa-dir-ae.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ae.zip
curl -o nld-aa/nld-aa-dir-af.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-af.zip
curl -o nld-aa/nld-aa-dir-ag.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ag.zip
curl -o nld-aa/nld-aa-dir-ah.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ah.zip
curl -o nld-aa/nld-aa-dir-ai.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ai.zip
curl -o nld-aa/nld-aa-dir-aj.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-aj.zip
curl -o nld-aa/nld-aa-dir-ak.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ak.zip
curl -o nld-aa/nld-aa-dir-al.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-al.zip
curl -o nld-aa/nld-aa-dir-am.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-am.zip
curl -o nld-aa/nld-aa-dir-an.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-an.zip
curl -o nld-aa/nld-aa-dir-ao.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ao.zip
curl -o nld-aa/nld-aa-dir-ap.zip https://dl.fbaipublicfiles.com/nld/nld-aa/nld-aa-dir-ap.zip
```

**Firefox**

1. Start a download as usual, then cancel it.
2. Open Downloads (⌘J)
3. Right-click on the Download and click "Copy Download Link"

**Chrome**
1. Start a download as usual, then cancel it.
2. Open Downloads (⇧⌘J)
3. Right-click on the Link and click "Copy Link Address"
`NLD-NAO` (41 files)
```
# Download NLD-NAO
mkdir -p nld-nao
curl -o nld-nao/nld-nao-dir-aa.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-aa.zip
curl -o nld-nao/nld-nao-dir-ab.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ab.zip
curl -o nld-nao/nld-nao-dir-ac.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ac.zip
curl -o nld-nao/nld-nao-dir-ad.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ad.zip
curl -o nld-nao/nld-nao-dir-ae.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ae.zip
curl -o nld-nao/nld-nao-dir-af.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-af.zip
curl -o nld-nao/nld-nao-dir-ag.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ag.zip
curl -o nld-nao/nld-nao-dir-ah.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ah.zip
curl -o nld-nao/nld-nao-dir-ai.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ai.zip
curl -o nld-nao/nld-nao-dir-aj.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-aj.zip
curl -o nld-nao/nld-nao-dir-ak.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ak.zip
curl -o nld-nao/nld-nao-dir-al.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-al.zip
curl -o nld-nao/nld-nao-dir-am.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-am.zip
curl -o nld-nao/nld-nao-dir-an.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-an.zip
curl -o nld-nao/nld-nao-dir-ao.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ao.zip
curl -o nld-nao/nld-nao-dir-ap.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ap.zip
curl -o nld-nao/nld-nao-dir-aq.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-aq.zip
curl -o nld-nao/nld-nao-dir-ar.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ar.zip
curl -o nld-nao/nld-nao-dir-as.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-as.zip
curl -o nld-nao/nld-nao-dir-at.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-at.zip
curl -o nld-nao/nld-nao-dir-au.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-au.zip
curl -o nld-nao/nld-nao-dir-av.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-av.zip
curl -o nld-nao/nld-nao-dir-aw.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-aw.zip
curl -o nld-nao/nld-nao-dir-ax.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ax.zip
curl -o nld-nao/nld-nao-dir-ay.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ay.zip
curl -o nld-nao/nld-nao-dir-az.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-az.zip
curl -o nld-nao/nld-nao-dir-ba.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-ba.zip
curl -o nld-nao/nld-nao-dir-bb.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bb.zip
curl -o nld-nao/nld-nao-dir-bc.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bc.zip
curl -o nld-nao/nld-nao-dir-bd.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bd.zip
curl -o nld-nao/nld-nao-dir-be.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-be.zip
curl -o nld-nao/nld-nao-dir-bf.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bf.zip
curl -o nld-nao/nld-nao-dir-bg.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bg.zip
curl -o nld-nao/nld-nao-dir-bh.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bh.zip
curl -o nld-nao/nld-nao-dir-bi.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bi.zip
curl -o nld-nao/nld-nao-dir-bj.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bj.zip
curl -o nld-nao/nld-nao-dir-bk.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bk.zip
curl -o nld-nao/nld-nao-dir-bl.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bl.zip
curl -o nld-nao/nld-nao-dir-bm.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bm.zip
curl -o nld-nao/nld-nao-dir-bn.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao-dir-bn.zip
curl -o nld-nao/nld-nao_xlogfiles.zip https://dl.fbaipublicfiles.com/nld/nld-nao/nld-nao_xlogfiles.zip
```
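
The chunk names in the commands above follow a two-letter suffix sequence (`aa`, `ab`, ..., `az`, `ba`, ...), so the URL lists can be generated rather than typed out. The following is a minimal sketch, not part of the repository: the helper name `chunk_urls` is hypothetical, and the chunk counts (16 for `NLD-AA`, 40 plus the xlogfiles archive for `NLD-NAO`) are read off the listings above.

```python
from itertools import islice, product
from string import ascii_lowercase

# Base URL taken from the curl commands above.
BASE_URL = "https://dl.fbaipublicfiles.com/nld"


def chunk_urls(name: str, n_chunks: int) -> list[str]:
    """Return the URLs of the first `n_chunks` chunk zips for `name`.

    Hypothetical helper: suffixes run aa, ab, ..., az, ba, ... as in the
    curl commands above.
    """
    suffixes = ("".join(pair) for pair in product(ascii_lowercase, repeat=2))
    return [
        f"{BASE_URL}/{name}/{name}-dir-{suffix}.zip"
        for suffix in islice(suffixes, n_chunks)
    ]


# Chunk counts are assumptions based on the file lists above.
nld_aa_urls = chunk_urls("nld-aa", 16)
nld_nao_urls = chunk_urls("nld-nao", 40) + [
    f"{BASE_URL}/nld-nao/nld-nao_xlogfiles.zip"
]

if __name__ == "__main__":
    for url in nld_aa_urls + nld_nao_urls:
        print(url)
```

Each printed URL can then be passed to `curl -o` as in the commands above.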

### Reconstructing the Dataset

Unzip the files in the standard way, with separate directories for `NLD-AA` and `NLD-NAO`.


```bash
$ unzip /path/to/nld-aa.zip
# for NLD-AA
# will give you an nle_data directory at /path/to/dir/nld-aa-dir/nld-aa/nle_data/
$ unzip /path/to/nld-aa/nld-aa-dir-aa.zip -d /path/to/dir
$ unzip /path/to/nld-aa/nld-aa-dir-ab.zip -d /path/to/dir
$ unzip /path/to/nld-aa/nld-aa-dir-ac.zip -d /path/to/dir
...


# for NLD-NAO - don't forget xlogfiles
# will give you a directory ending with nld-nao-unzipped/
$ unzip /path/to/nld-nao_xlogfiles.zip -d /path/to/nld-nao
$ unzip /path/to/nld-nao_part1.zip -d /path/to/nld-nao
$ unzip /path/to/nld-nao_part2.zip -d /path/to/nld-nao
$ unzip /path/to/nld-nao_part3.zip -d /path/to/nld-nao
$ unzip /path/to/nld-nao_part4.zip -d /path/to/nld-nao
$ unzip /path/to/nld-nao-dir-aa.zip -d /path/to/nld-nao
$ unzip /path/to/nld-nao-dir-ab.zip -d /path/to/nld-nao
$ unzip /path/to/nld-nao-dir-ac.zip -d /path/to/nld-nao
...
```


- NB: `NLD-AA` is already a single directory, so it will unzip to one directory,
whereas all the `NLD-NAO` files should be unzipped into one directory.
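
With dozens of chunks, running `unzip` by hand gets tedious. As a rough sketch (the helper `extract_all` is hypothetical and not part of `nle`), the same extraction can be done with Python's standard `zipfile` module, assuming every chunk is an independent zip archive as the `unzip` commands above imply:

```python
import zipfile
from pathlib import Path


def extract_all(zip_dir: str, dest_dir: str) -> list[str]:
    """Extract every .zip file found in `zip_dir` into `dest_dir`.

    Hypothetical helper: returns the archive file names in the order
    they were processed (sorted by name).
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    processed = []
    for archive in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        processed.append(archive.name)
    return processed


# Example usage (placeholder paths, as in the unzip commands above):
# extract_all("/path/to/nld-aa", "/path/to/dir")
# extract_all("/path/to/nld-nao", "/path/to/nld-nao")
```

Extracting all `NLD-NAO` chunks into the same destination mirrors the `-d /path/to/nld-nao` flag used above.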

## Using the Dataset ([Colab Demo](https://colab.research.google.com/drive/1GRP15SbOEDjbyhJGMDDb2rXAptRQztUD?usp=sharing))

The code needed to use the dataset will be distributed in `NLE v0.9.0`. For now it can be found on the `main` branch of [NLE](https://github.com/facebookresearch/nle). You can follow the instructions to install [there](https://github.com/facebookresearch/nle), or try the below.
@@ -89,8 +139,8 @@ import nle.dataset as nld
if not nld.db.exists():
    nld.db.create()
# NB: Different methods are used for data based on NLE and data from NAO.
nld.add_nledata_directory("/path/to/nld-aa", "nld-aa-v0")
nld.add_altorg_directory("/path/to/nld-nao", "nld-nao-v0")
nld.add_nledata_directory("/path/to/nld-aa/nle_data", "nld-aa-v0")
nld.add_altorg_directory("/path/to/nld-nao-unzipped", "nld-nao-v0")

dataset = nld.TtyrecDataset("nld-aa-v0", batch_size=128, ...)
for i, mb in enumerate(dataset):
19 changes: 7 additions & 12 deletions README.md
@@ -269,17 +269,12 @@ If you use NLE in any of your work, please cite:
If you use NLD or the datasets in any of your work, please cite:

```
@inproceedings{hambro2022dungeonsanddata,
author = {Eric Hambro and
Roberta Raileanu and
Danielle Rothermel and
Vegard Mella and
Tim Rockt{\"{a}}schel and
Heinrich K{\"{u}}ttler and
Naila Murray},
title = {{Dungeons and Data: A Large-Scale NetHack Dataset}},
booktitle = {Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year = {2022},
url = {https://openreview.net/forum?id=zHNNSzo10xN}
@article{hambro2022dungeons,
title={Dungeons and Data: A Large-Scale NetHack Dataset},
author={Hambro, Eric and Raileanu, Roberta and Rothermel, Danielle and Mella, Vegard and Rockt{\"a}schel, Tim and K{\"u}ttler, Heinrich and Murray, Naila},
journal={Advances in Neural Information Processing Systems},
volume={35},
pages={24864--24878},
year={2022}
}
```