Please install the following before proceeding:
- Clang
- Protocol Buffers
- gRPC
- Ninja
- Meson >= 0.55
- TensorFlow < 2
- CUDA, NCCL
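As a quick sanity check, the command-line tools in the list above can be probed from a shell. This is a minimal sketch, not part of eLF: the binary names (`clang++`, `protoc`, `grpc_cpp_plugin`, `ninja`, `meson`) are assumed to be the usual ones and may differ on your distribution, the helpers `version_ge` and `missing_tools` are hypothetical names introduced here, and `sort -V` is assumed to be available (GNU coreutils). TensorFlow, CUDA, and NCCL are not checked.

```shell
#!/bin/sh
# Sketch only: tool names below are assumptions, adjust them for your system.

# version_ge A B: succeeds when version A is at least version B.
# Relies on `sort -V` (version sort), which is assumed to be available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# missing_tools TOOL...: prints each tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools clang++ protoc grpc_cpp_plugin ninja meson)
if [ -n "$missing" ]; then
    # $missing is left unquoted so the names print on one line.
    echo "Missing tools:" $missing
fi

# The prerequisites list asks for Meson >= 0.55.
if command -v meson >/dev/null 2>&1; then
    if version_ge "$(meson --version)" 0.55; then
        echo "Meson version OK"
    else
        echo "Meson is older than 0.55"
    fi
fi
```

The script only reports what it finds; it does not install anything or stop on failure.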
- Configure the build:
  CXX=clang++ meson build
  This command configures the build in a directory named build. It is recommended that you use lld as the linker. To do so, use the following command instead:
  CXX=clang++ CXX_LD=lld meson build
- Build eLF:
  ninja -C build
- Run the tests:
  meson test -C build -v
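The configure, build, and test steps can be chained in a small script. The following is a sketch under stated assumptions rather than a script shipped with eLF: `DRY_RUN` and `USE_LLD` are hypothetical switches introduced for this example, and `run` and `build_elf` are hypothetical helper names. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch only: DRY_RUN and USE_LLD are hypothetical switches, not eLF options.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them
USE_LLD=${USE_LLD:-0}   # 1 = select lld as the linker through CXX_LD

# run CMD...: print the command, then execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

# build_elf: configure, build, and test, stopping at the first failing step.
build_elf() {
    if [ "$USE_LLD" = 1 ]; then
        run env CXX=clang++ CXX_LD=lld meson build || return 1
    else
        run env CXX=clang++ meson build || return 1
    fi
    run ninja -C build || return 1
    run meson test -C build -v
}

build_elf
```

For example, saving this as a file and running it with `DRY_RUN=0 USE_LLD=1` in the environment would execute the lld variant of the configure step, followed by the build and the tests.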
- Start elf.Controller
- Modify your TensorFlow code to use elf.Worker and elf.tensorflow.ElasticOptimizer
- Start worker code to scale out
- Call Controller.leave(worker_id) on the controller or Worker.leave() on the worker program to scale in