- Whisper: Small fixes + CI fix (@iejmac)
- Whisper: Allow input_format=table (@iejmac)
- YtDownloader: don't print warnings (@iejmac)
- SLURM improvements, audio fixes (@m-bain)
- Support non-youtube video platforms (@vinyesm)
- ClippingSubsampler: fix dtype change of "clips" (@SCZwangxiao)
- Resolution subsampler passes through metadata (@MattUnderscoreZhang)
- Added width and height options for ResolutionSubsampler (@MattUnderscoreZhang)
- Fix for audio sampler never being called (@vinyesm)
- Add support for subtitles from multiple languages (@sramshetty)
- Clipping subsampler refactor (@MattUnderscoreZhang)
- Subset worker refactor (@MattUnderscoreZhang)
Official release of video2dataset v1. See the blog post: https://laion.ai/blog/video2dataset/
- Implemented many subsamplers
- Tested on 3 datasets
- Good DataLoader
- Distribution with Spark and multiprocessing
- Multistage setup works
- It works a bit