DFL

What is DFL?

DFL is a federated machine learning framework that uses blockchain as a proof of contribution to ML models, rather than as a distributed ledger that records the aggregated models from clients.

Features

No centralized node, no centralized ML models.
High-performance blockchain system with nearly zero overhead.

DFL1 vs DFL2

DFL2 supports Hunter, which automatically downloads and configures the dependencies, so you should be able to compile DFL2 on more platforms.
DFL2 integrates large-scale DFL network infrastructures.

Dependency

Install this package with your package manager (such as apt or yum).

libunwind (https://github.com/libunwind/libunwind)

Installing the following dependencies with your package manager is optional. If they are not found, Hunter will download and compile them. Hunter only downloads the necessary dependencies, such as OpenBLAS and gflags. LMDB and CUDA support are not included in Hunter.

We recommend installing the following dependency from source code for better performance.

openblas (https://github.com/xianyi/OpenBLAS)

Getting started

For deployment

1. Install CMake and GCC with C++17 support.
2. Compile the DFL executable (the source code is in DFL.cpp, and everything you need is in the CMake files), which will start a node in the DFL network. We also recommend building the following tools:
- Keys generator: generates private keys and public keys. These keys will be used in the configuration file.
3. Compile your own "reputation algorithm", which defines how ML models are updated and how the reputation of other nodes is updated. This implementation is critical for handling different dataset distributions and malicious node ratios. We provide four sample reputation algorithms here.
4. Run the DFL executable. It should provide a sample configuration file for you.
5. Modify the configuration file as you wish, for example the peers, node address, private key, public key, etc. Note that batch_size and test_batch_size must be identical to the values in the Caffe solver's configuration. Here is an explanation file for the configuration.
6. DFL receives ML datasets over the network. An executable called data_injector is provided for the MNIST dataset. Use it to inject the dataset into DFL. The current version of data_injector only supports I.I.D. dataset injection.
7. DFL will train the model once it has received enough data for training, and send the model as a transaction to other nodes. The node will generate a block once it has generated enough transactions, and perform FedAvg once it has received enough models from other nodes.

For simulation

1. Perform step 1 in deployment.
2. Compile DFL_Simulator_mt for multi-threading optimization, or DFL_Simulator_opti for lower memory consumption but without "reputation algorithm" support.

Some other tools:
- Dirichlet_distribution_generator_for_Non_IID: generates a Dirichlet distribution for Non-I.I.D. datasets. Run it without any arguments to print its usage.
- large_scale_simulation_generator: automatically generates a configuration file for a large number of nodes (such a configuration file can exceed 3000 lines, so you should use this tool if you want to simulate more than 20 nodes).
3. Run the simulator. It should generate a sample configuration file and start the simulation immediately. You can use Ctrl+C to exit.
4. Modify the configuration file with this explanation file.
5. The simulator will automatically create an output folder, named after the current time, in the executable path. The configuration file and reputation dll will also be copied to the output folder so that the output can be reproduced easily.
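To show what a Dirichlet distribution for Non-I.I.D. data looks like in practice, here is a minimal sketch that draws per-class label proportions for one node. It uses the standard technique of normalising independent Gamma draws. The function name sample_dirichlet and its parameters are assumptions made for illustration. They do not describe how Dirichlet_distribution_generator_for_Non_IID is implemented or what its output format is.

```cpp
#include <cstddef>
#include <random>
#include <stdexcept>
#include <vector>

// Draw one sample from a symmetric Dirichlet(alpha, ..., alpha) over
// `classes` labels by normalising independent Gamma(alpha, 1) draws.
// A smaller alpha gives a more skewed (more Non-I.I.D.) label distribution.
std::vector<double> sample_dirichlet(std::size_t classes, double alpha,
                                     std::mt19937& rng) {
    if (classes == 0 || alpha <= 0.0) {
        throw std::invalid_argument("classes must be > 0 and alpha must be > 0");
    }
    std::gamma_distribution<double> gamma(alpha, 1.0);
    std::vector<double> proportions(classes);
    double sum = 0.0;
    for (auto& p : proportions) {
        p = gamma(rng);
        sum += p;
    }
    if (sum <= 0.0) {
        // For a very small alpha all draws can underflow to zero. Fall back
        // to putting all mass on one random class, the limiting behaviour.
        std::uniform_int_distribution<std::size_t> pick(0, classes - 1);
        for (auto& p : proportions) p = 0.0;
        proportions[pick(rng)] = 1.0;
        return proportions;
    }
    for (auto& p : proportions) {
        p /= sum;  // normalise so the proportions sum to 1
    }
    return proportions;
}
```

Each simulated node could draw its own proportions vector this way, for example with 10 classes for MNIST, and then receive training samples of each label according to those proportions.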

Reputation algorithm SDK API:

Please refer to this link for sample reputation algorithms. The SDK API documentation is not written yet.

For more details

https://dl.acm.org/doi/10.1145/3600225
