Releases: allenai/OLMo-core

v0.1.0

11 Jun 21:48

What's new

Added 🎉

  • Initial release.

Commits

910a0d7 (chore) prepare for release v0.1.0
8ef9b8b Add release workflow (#24)
30a13c1 Don't import DeviceMesh in distributed.utils (#23)
93ec91a Avoid using __torch_dispatch__ prior to v2.3.0 (#22)
22db20c Refactor ShardedFlatTensor class (#21)
b2c3c09 Fix a regression with checkpointing optimizer state (#20)
7a2299a Fix a CUDA synchronization issue with FSDP (#19)
fb20119 Validate loading model and optim state (#18)
c726658 downgrade priority
f8137cc Add checkpointing support for DTensors (#17)
eb56a9f Add weka as a new S3-like scheme (#16)
95bcc38 Add SHARD_GRAD_OP FSDP sharding strategy
b919c9b allow autowrap with function
ae42293 Generalize sharding spec for ShardedFlatTensor (#15)
70ee999 Increment torch version (#14)
1847b8e More improvements to FSDP, benchmark against DDP (#13)
a4e0ccf FSDP memory usage improvements (#12)
8c75d23 Add a post-backward stream for more throughput improvement (#11)
2cfd3a7 More FSDP optimizations (#10)
f235164 More FSDP optimizations (#9)
963f7dd FSDP fixes (#8)
15be9f2 Pass process group to mark_as_sharded (#7)
64b5bf7 Distributed updates (#6)
32afdaa fix normalizing dir
b97bb4a add py.typed, simplify pyproject.toml
0f410e9 FSDP optimizations, documentation additions (#5)
767a803 Document FSDP, add tests with act checkpointing
1ea4d0b improve docs
98ca9bf add github actions (#4)
759151d add docs
97e36b8 fix
de866d3 return files created by the current rank
ff37b6d don't try to get bytes for empty tensors
a958f42 improve error message
645dbf5 fixes
9aed170 fix
104659c fix
14bff8a don't override 'param_names'
0b35d65 clean up
be40ab2 Get checkpointing working for PyTorch FSDP (#3)
8e1fbd7 more checkpointing improvements, add ShardedFlatTensor class (#2)
ef996d9 Checkpointing improvements (#1)
4b3f7f7 Add FSDP.clip_grad_norm_()
5a17360 API updates
c77885b Add FSDP.apply()
23a22a1 Add FSDP.auto_wrap() class method
c49d5bb move FSDPState to its own module
6ba98cd skip weird combinations
4078dfd add to stream test
c90c2bc fix
c4717b8 fix
f51f2b7 Add test for cuda streams
a577957 fix test
9169fa6 another fix
20e2ec2 fix
7a27d12 fix other test
2520139 fix test
fc8fe2c skip weird combination
bbe0d3c more test fixes
39267e4 fix test
2b78dc9 ensure tensor on right device
d955b17 update tests
33e5bed Prefetching
c7eb5e1 update IO utils for HTTP
7dd6760 Implement streams
f4b6215 FSDP improvements
9bebbed add readme
5ced198 Add support for gradient accumulation
fd622ee Get tests working
06ce9e0 Get basic FSDP working
e3db911 add test with sharded model
0c60f86 Add functions for saving/loading model+optimizer state
118b6c1 Add Checkpointer.unshard() method
f2300c3 Support remote checkpoints
93754e1 Initial commit