NVFlare 2.2.1 Release
Feature Highlights
FL Simulator -- A lightweight simulator of a running NVFLARE FL deployment. It allows researchers to test and debug their applications without provisioning a real project. FL jobs run on a server and multiple clients in the same process, in much the same way as they would run in a real deployment. Researchers can quickly build new components and jobs that can then be used directly in a real production deployment.
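To illustrate the idea of running a server and several clients in one process, here is a minimal toy sketch. It does not use the NVFLARE API; `client_update`, `server_aggregate`, and `simulate` are hypothetical names, and the "model" is a single weight pulled toward each client's local data mean.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def client_update(global_weights: List[float], local_data: List[float], lr: float = 0.5) -> List[float]:
    """Hypothetical local training step: move each weight toward the local data mean."""
    local_mean = sum(local_data) / len(local_data)
    return [w + lr * (local_mean - w) for w in global_weights]


def server_aggregate(updates: List[List[float]]) -> List[float]:
    """Average the client weights element-wise (unweighted averaging)."""
    n = len(updates)
    return [sum(ws) / n for ws in zip(*updates)]


def simulate(client_data: List[List[float]], rounds: int = 3, threads: int = 2) -> List[float]:
    """Run the server loop and all clients in one process; clients run on a thread pool."""
    weights = [0.0]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for _ in range(rounds):
            # Each client starts from the current global weights.
            updates = list(pool.map(lambda data: client_update(weights, data), client_data))
            weights = server_aggregate(updates)
    return weights


if __name__ == "__main__":
    # Two simulated clients with local means 2.0 and 6.0.
    print(simulate([[1.0, 3.0], [5.0, 7.0]]))
```

The real simulator runs actual NVFLARE jobs rather than a toy loop like this; the sketch only shows why a single-process setup is convenient for debugging.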
FLARE Dashboard -- NVFLARE's web UI. In its initial incarnation, the FLARE Dashboard helps with project setup, user registration, startup kit distribution, and dynamic provisioning. Dashboard setup and APIs are described in the NVFLARE documentation.
Site-policy management -- Prior to NVFLARE 2.2, all policies (resource management, authorization and privacy protection, logging configurations) could only be defined by the Project Admin at provision time, and authorization policies were centrally enforced by the FL Server. NVFLARE 2.2 makes it possible for each site to define its own policies in the following areas:
- Resource Management: the configuration of system resources, which is solely the decision of local IT.
- Authorization Policy: local authorization policy that determines what a user can or cannot do on the local site. See also Federated Authorization.
- Privacy Policy: local policy that specifies what types of studies are allowed and how to add privacy protection to the learning results produced by the FL client on the local site.
- Logging Configuration: each site can now define its own logging configuration for system generated log messages.
Federated XGBoost -- We developed federated XGBoost so that data scientists can perform machine learning on tabular data with a popular tree-based method. In this release, we provide several approaches for horizontal federated XGBoost.
- Histogram-based Collaboration -- leverages the recently released (XGBoost 1.7.0) federated versions of the open-source XGBoost histogram-based distributed training algorithms, achieving the same results as centralized training (trees trained on global data information).
- Tree-based Collaboration -- individual trees are independently trained on each client's local data without aggregating the global sample gradient histogram information. Trained trees are collected and passed to the server / other clients for aggregation and further boosting rounds.
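The reason histogram-based collaboration can match centralized training is that per-bin gradient sums are additive across clients. The sketch below illustrates this with plain Python. It is a simplified illustration, not the XGBoost or NVFLARE implementation: the function names are hypothetical, all clients are assumed to share the same bin edges, and only gradients are accumulated.

```python
from typing import List, Sequence


def gradient_histogram(
    feature_values: Sequence[float],
    gradients: Sequence[float],
    bin_edges: Sequence[float],
) -> List[float]:
    """Sum gradients per feature bin; bin i covers [bin_edges[i], bin_edges[i + 1])."""
    hist = [0.0] * (len(bin_edges) - 1)
    for x, g in zip(feature_values, gradients):
        for i in range(len(hist)):
            if bin_edges[i] <= x < bin_edges[i + 1]:
                hist[i] += g
                break
    return hist


def aggregate(histograms: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise sum of the per-client histograms."""
    return [sum(column) for column in zip(*histograms)]


if __name__ == "__main__":
    edges = [0.0, 1.0, 2.0, 3.0]  # shared bin edges (assumption)
    client_a = ([0.5, 1.5], [1.0, -2.0])  # (feature values, gradients)
    client_b = ([1.25, 2.5], [0.5, 4.0])

    federated = aggregate(
        [gradient_histogram(x, g, edges) for x, g in (client_a, client_b)]
    )
    centralized = gradient_histogram(
        client_a[0] + client_b[0], client_a[1] + client_b[1], edges
    )
    print(federated == centralized)
```

In tree-based collaboration, by contrast, no such histogram is shared; each client fits trees on its own data and only the trees are exchanged.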
Federated Statistics -- built-in federated statistics operators that generate global statistics from local client-side statistics. The results, for all features of all datasets at all sites as well as the global aggregates, can be visualized with the visualization utility in the notebook.
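As a rough illustration of how global statistics can be derived from local ones, the sketch below combines per-site counts, sums, and sums of squares into a global count, mean, and population variance. It does not use the NVFLARE operators; `LocalStats`, `local_stats`, and `global_stats` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class LocalStats:
    """Summary a site shares instead of its raw data."""
    count: int
    total: float
    total_sq: float


def local_stats(values: Iterable[float]) -> LocalStats:
    """Compute the local summary for one feature at one site."""
    vals = list(values)
    return LocalStats(len(vals), sum(vals), sum(v * v for v in vals))


def global_stats(per_site: List[LocalStats]) -> Dict[str, float]:
    """Combine site summaries into the global count, mean, and population variance."""
    n = sum(s.count for s in per_site)
    total = sum(s.total for s in per_site)
    total_sq = sum(s.total_sq for s in per_site)
    mean = total / n
    variance = total_sq / n - mean * mean
    return {"count": n, "mean": mean, "var": variance}


if __name__ == "__main__":
    sites = [local_stats([1.0, 2.0, 3.0]), local_stats([4.0, 5.0])]
    print(global_stats(sites))
```

Only the summaries leave each site in this sketch, which is the property that makes such statistics suitable for a federated setting.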
MONAI Integration -- In the 2.2 release, we provide two integrations that leverage MONAI Bundle:
- MONAI ClientAlgo Integration -- enables running MONAI bundles directly in a federated setting using NVFLARE.
- MONAI ClientAlgoStats Integration -- through NVFLARE Federated Statistics, we can generate, compare, and visualize all clients' data statistics generated from MONAI summary statistics.
Tools and Production Support
- Improved POC command
- Dynamic Provision
- Docker Compose
- Preflight Check
Migration tips
To migrate from releases prior to 2.2.1, here are a few notes that might help.
What's Changed
- Pick up main changes by @YuanTingHsieh in #692
- Change np model persistor log level by @YuanTingHsieh in #693
- Remove env command in fl admin api by @YuanTingHsieh in #696
- Change print to logger.info in client executor by @YuanTingHsieh in #699
- Add list_files and download_job for single file by @nvkevlu in #698
- Add provisioned setting to integration tests and re-factor by @YuanTingHsieh in #697
- Add dns entries into ci scripts by @YuanTingHsieh in #700
- Add citation file by @YuanTingHsieh in #702
- Add doi to citation file by @YuanTingHsieh in #703
- Fix blossom CI/CD tests by @YuanTingHsieh in #704
- Remove self-hosted runners and use only blossom ci by @YuanTingHsieh in #707
- Fix setup.py by @YuanTingHsieh in #709
- Clean up workflow implementations, use self._engine by @YuanTingHsieh in #712
- Fix typos in tf related codes in examples and tests by @YuanTingHsieh in #713
- Docker compose by @IsaacYangSLA in #714
- Fix the shared volume settings among servers in compose.yaml by @IsaacYangSLA in #716
- Add cov.xml to unit test script by @YuanTingHsieh in #715
- Wait for cleanup process before start next test case by @YuanTingHsieh in #726
- Update example structure from app to job by @YuanTingHsieh in #724
- 625 Add monai bundle example by @yiheng-wang-nv in #721
- Fix overseer code style and issues by @YuanTingHsieh in #729
- Add research and integration folder by @YuanTingHsieh in #732
- Clean up ClientRunner and Communicator by @YuanTingHsieh in #720
- Add two command options to add user/client by @IsaacYangSLA in #723
- Helm chart by @IsaacYangSLA in #725
- Add nvflare poc command by @chesterxgchen in #739
- Clean up overseer by @YuanTingHsieh in #741
- Add org to cert by @IsaacYangSLA in #743
- Add two functions to retrieve roles and org from cert by @IsaacYangSLA in #744
- rename README.MD -> README.md by @holgerroth in #748
- Add Collective communication and hello-mpi example by @YuanTingHsieh in #719
- nvflare poc syntax update and add NVFLARE_POC_WORKSPACE env variable by @chesterxgchen in #751
- add fed_sm and auto_fed_rl research directories with index.html re… by @chesterxgchen in #754
- Remove doc build in github by @YuanTingHsieh in #758
- add client GPU assignment for POC command by @chesterxgchen in #759
- Add preflight check scripts by @YuanTingHsieh in #742
- Fix package checker utils test by @YuanTingHsieh in #762
- Move GPU utils out of poc_commands for later reuse by @YuanTingHsieh in #765
- Update integration test validator log level by @YuanTingHsieh in #768
- handle no gpu or no nvidia-smi installed cases by @chesterxgchen in #773
- consolidate NVFLARE CLI by @chesterxgchen in #774
- Federated Statistics Operator with Data Frame Statistics example. by @chesterxgchen in #750
- Add report resources command and fix resource manage issues by @YuanTingHsieh in #766
- Fix License check issue by @chesterxgchen in #779
- Added options to disable old TLS versions by @nvidianz in #775
- XGBoost with cyclic and bagging training by @ZiyueXu77 in #740
- Only restore the FLcomponent from the snapshot. by @yhwen in #796
- Add dashboard cli by @IsaacYangSLA in #801
- Clean up fuel utils and client executor by @YuanTingHsieh in #761
- Include org info in sub_start.sh, compose.yaml, helm chart and other … by @IsaacYangSLA in #808
- Local folder from provisioning by @IsaacYangSLA in #800
- link to temporary Auto-FedRL repo by @holgerroth in #811
- add 3rd party licenses by @chesterxgchen in #812
- Update cert reading (get_identity_info) and component (resource manager) by @IsaacYangSLA in #807
- Simulator by @yhwen in #757
- add apt_opt module by @chesterxgchen in #809
- XGBoost Tree-based Example Refactor and Refine by @ZiyueXu77 in #799
- Merged FOBS from 2.1 to dev by @nvidianz in #818
- Added FOBS readme link to the doc by @nvidianz in #827
- Add hello-xgboost example by @YuanTingHsieh in #833
- Remove collective communication codes by @YuanTingHsieh in #835
- Update issue templates by @YuanTingHsieh in #834
- Fix client aux runner by @YuanTingHsieh in #836
- Added MsgPack to the setup.py by @nvidianz in #829
- SimulatorRunner config for dev branch. by @yhwen in #817
- Clean up xgboost example by @YuanTingHsieh in #842
- Schema of user/site/org/resource by @IsaacYangSLA in #708
- Fix get_connected_clients in fl_admin_api by @YuanTingHsieh in #843
- add monai integration by @holgerroth in #828
- Fix delete operation by @IsaacYangSLA in #859
- update readme simulator arguments by @holgerroth in #861
- add CodeQL action by @YanxuanLiu in #718
- added holgerroth to trigger list by @YanxuanLiu in #863
- Re-factor hello-xgboost codes by @YuanTingHsieh in #862
- Fixed several issues in simulator by @yhwen in #850
- Remove CORS because setting baseUrl to "" works by @IsaacYangSLA in #867
- cherry-pick 2.2 changes into dev by @yhwen in #868
- Increase download_count for user and client whenever blob is downloaded by @IsaacYangSLA in #869
- Fix wrong id during client blob download by @IsaacYangSLA in #871
- Fix editing site name by @IsaacYangSLA in #873
- Redirect / to /index.html by @IsaacYangSLA in #874
- Dev authz integrate by @yhwen in #856
- Allow project admin to launch dashboard with passphrase to encrypt ro… by @IsaacYangSLA in #875
- sync 2.2 branch changes back to dev for CLI commands [skip ci] by @chesterxgchen in #837
- Fix POC scripts and json files by @YuanTingHsieh in #877
- dev_branch: fix POC command bug [skip ci] by @chesterxgchen in #881
- Make task_request_interval and task_check_period configurable by @YuanTingHsieh in #878
- cherry-pick doc update (#787) by @nvkevlu in #882
- Create an empty local folder in admin during POC by @IsaacYangSLA in #886
- sync 2.2 Fed Stats changes back to Dev branch [ skip ci] by @chesterxgchen in #887
- Fix gunicorn worker issue on role by @IsaacYangSLA in #889
- Fed stats new logic [skip ci] by @chesterxgchen in #890
- Add GPU resource consumer by @YuanTingHsieh in #763
- Fix hello-xgboost in secure mode by @YuanTingHsieh in #891
- Minor improvements for dashboard by @IsaacYangSLA in #898
- Integrate dashboard cli into nvflare cli by @IsaacYangSLA in #899
- Fix app validator by @YuanTingHsieh in #892
- Fed stats privacy2 [skip ci] by @chesterxgchen in #897
- Fix ci/cd by @YuanTingHsieh in #900
- Update admin api by @YuanTingHsieh in #902
- Add user guide on dashboard api by @IsaacYangSLA in #903
- Update hello tf2 headers by @YuanTingHsieh in #901
- add docs for front end of dashboard by @nvkevlu in #908
- Make task check period configurable in workflows by @YuanTingHsieh in #888
- Add base resource manager by @YuanTingHsieh in #905
- monai_nvflare update dependency versions by @holgerroth in #907
- Federated Analytics Conversion to Fed Stats + Visualization Bug fix + Fed Stats Spec update [skip ci] by @chesterxgchen in #909
- update README.md [skip ci] by @chesterxgchen in #911
- Update model and cli for dashboard by @IsaacYangSLA in #916
- update README.md [slip ci] by @chesterxgchen in #917
- Add pip install optional extra [HE] by @IsaacYangSLA in #920
- wrap authz_preview into CLI [skip ci] by @chesterxgchen in #912
- Use GPU 0 GPUResourceManager as default [gpu] by @YuanTingHsieh in #910
- Fed Stats Bug Fix [skip ci] by @chesterxgchen in #922
- check None component for disabled component [skip ci] by @chesterxgchen in #925
- Update dashboard cli to remove dashboard_image option [skip ci] by @IsaacYangSLA in #921
- Add docker compose user guide [skip ci] by @IsaacYangSLA in #926
- Fixes several problems found during authz testing [skip ci] by @nvidianz in #914
- Fix mgpu executor by @yhwen in #924
- Add helm chart user guide [skip ci] by @IsaacYangSLA in #927
- Add secure logging by @YuanTingHsieh in #919
- accelerations for tree based federated xgboost by @eordentlich in #885
- v2.2 doc restructure, take 1 [skip ci] by @kkersten in #928
- Fix code format [skip ci] by @YuanTingHsieh in #930
- convert to simulator with latest performance updates [skip ci] by @ZiyueXu77 in #933
- Clean up utils and unify job constants usage by @YuanTingHsieh in #934
- add sections for the docs changes needed for 2.2 [skip ci] by @nvkevlu in #931
- remove hello-monai and update hello-monai-bundle readme [skip ci] by @holgerroth in #938
- Bump protobuf from 3.20.1 to 3.20.2 [skip ci] by @dependabot in #929
- Removed unused server_meta [skip ci] by @YuanTingHsieh in #935
- Update hello-* example readme to use simulator [skip ci] by @YuanTingHsieh in #939
- Switch to 2.2.1.devYYMMDD format for nightly build [skip ci] by @IsaacYangSLA in #940
- Update information in dynamic provisioning document [skip ci] by @IsaacYangSLA in #943
- Remove unused vars in datakind by @YuanTingHsieh in #937
- Unit Tests for examples and README update [skip ci] by @chesterxgchen in #932
- consolidate xgboost example directories and add top level readme [skip ci] by @eordentlich in #942
- README.rst for app_opt/xgboost module [skip ci] by @nvidianz in #923
- Add option to create non-ha project.yml template [skip ci] by @IsaacYangSLA in #947
- Added filter to audit.log to exclude internal commands by @nvidianz in #948
- Pre run [skip ci] by @chesterxgchen in #944
- Changed decomposer unit test to run through whole FOBS [skip ci] by @nvidianz in #949
- Remove the old provisioning UI and mentions of it from the docs [skip ci] by @nvkevlu in #951
- Add init.py to dashboard folder by @IsaacYangSLA in #952
- Fix broken mprocess executor [skip ci] by @yhwen in #941
- Clean up client engine specs by @YuanTingHsieh in #906
- Unify tree-based and histogram-based examples [skip ci] by @YuanTingHsieh in #945
- add missing init.py file [skip ci] by @chesterxgchen in #954
- add missing init.py file [skip ci] by @chesterxgchen in #955
- Quick fix on import module [skip ci] by @IsaacYangSLA in #956
- rename metric, metrics to static, statics for codes and doc [skip ci] by @chesterxgchen in #957
- Add version arg to nvflare cli [skip ci] by @YuanTingHsieh in #958
- add missing step in sequence diagram [skip ci] by @chesterxgchen in #960
- Fix simulator package issue [skip ci] by @YuanTingHsieh in #962
- update sequence diagram [skip ci] by @chesterxgchen in #961
- Fixed the class_util for class import. by @yhwen in #959
- Increase token timeout to 30 min by @IsaacYangSLA in #965
- Fix real world fl docs reference [skip ci] by @YuanTingHsieh in #968
- Change back server binding to "0.0.0.0" by @IsaacYangSLA in #969
- Fix xgboost cyclic and move codes into app_opt [skip ci] by @YuanTingHsieh in #964
- Add monai multi-gpu training example [skip ci] by @holgerroth in #963
- Remove unused parts in log.config [gpu] by @YuanTingHsieh in #966
- Add ListResourceConsumer to support ListResourceManager [skip ci] by @YuanTingHsieh in #970
- Fix format issues [skip ci] by @YuanTingHsieh in #967
- Simulator max clients by @yhwen in #971
- POC Update [skip ci] by @chesterxgchen in #974
- Few minor changes in Fed Stats [skip ci] by @chesterxgchen in #973
- Change dashboard folder option to relative to current working directory [skip ci] by @IsaacYangSLA in #978
- Added unit_test for Dashboard APIs by @nvidianz in #975
- Return 409 for conflict during creating / patching clients by @IsaacYangSLA in #984
- Fixed the simulator relative workspace issue. by @yhwen in #985
- fix a few bugs found by Yan by @chesterxgchen in #981
- make fixes in docs for links, formatting, cleaning up [skip ci] by @nvkevlu in #976
- Run the Simulator in a separate process. by @yhwen in #982
- Redirect all 10 partial to .html files by @IsaacYangSLA in #983
- Remove rounding when computing percent histogram [skip ci] by @holgerroth in #986
- improve Fed stats filter [skip ci] by @chesterxgchen in #988
- Fix server engine abort_app_on_server by @YuanTingHsieh in #989
- more additions and updates to the docs [skip ci] by @nvkevlu in #992
- Moved the custom folder into local folder. by @yhwen in #987
- Fixed simulator log overwrite. by @yhwen in #990
- Allow user to show different plot_types in visualization [skip ci] by @chesterxgchen in #993
- handle cases where server_runner is not ready with client's request -- Yan's code by @chesterxgchen in #997
- Fix CTR-C to kill all child processes. by @yhwen in #994
- Switch to XGBoost Communicator API by @rongou in #996
- Clean up authz by @YuanTingHsieh in #999
- Clean up lighter project.yml and workspace docstring [skip ci] by @YuanTingHsieh in #1000
- Look for not authorized instead of Authorization Error when parsing by @YuanTingHsieh in #998
- Removed the workspace object from snapshot. by @yhwen in #1003
- Fix xgboost histogram based issue [skip ci] by @YuanTingHsieh in #1004
- [CICD] add new members to trigger list of blossom-ci [skip ci] by @YanxuanLiu in #1005
- Add unit test for hello-pt custom code by @YuanTingHsieh in #1006
- add note in docs about running on CPU if invalid GPU IDs [skip ci] by @nvkevlu in #1007
- Use randomly generated string as SECRET_KEY if users do not set it by @IsaacYangSLA in #1014
- Fixed the ha_admin_cmds. by @yhwen in #1011
- Fix info collect commands by @YuanTingHsieh in #1001
- Make heartbeat_timeout optional by @YuanTingHsieh in #1009
- Fixed overseer can not be shutdown_system issue. by @yhwen in #1015
- Changed the default max_jobs to 4. by @yhwen in #1012
- Add format checking on input of user email when dashboard first starts by @IsaacYangSLA in #1017
- Fix hello pt test issue [skip ci] by @YuanTingHsieh in #1018
- Check GPU number and memory only when specified number is not 0 by @YuanTingHsieh in #1010
- use spleen bundle version with multi-gpu eval by @holgerroth in #1013
- update APIStatus to return for delete running job by @nvkevlu in #1016
- Replace 11 redirect responses with send_static_file by @IsaacYangSLA in #1022
- Consolidate the federated statistics documentation [skip ci] by @chesterxgchen in #1020
- Add fed authorization into integration tests [gpu] by @YuanTingHsieh in #977
- Add warning when docker is not available for dashboard [skip ci] by @IsaacYangSLA in #1026
- fix typing, fix issue of abort_task error by @nvkevlu in #1024
- update example README.md & notebook [skip ci] by @chesterxgchen in #1027
- Provide hints when running nvflare.dashboard.cli or nvflare dashboard by @IsaacYangSLA in #1025
- Monai fed stats example by @holgerroth in #972
- Enhance job run status by @yhwen in #1023
- set the peer_ctx to private. by @yhwen in #1021
- rename metrics to statistics [skip ci] by @chesterxgchen in #1030
- Convert prostate example from poc to simulator by @ZiyueXu77 in #1002
- Added federeated-policies example by @nvidianz in #1028
- fix bug when statistics are remove by statistics filter [skip ci] by @chesterxgchen in #1031
- Monai Fed Stats: Return and save pre-run results by @holgerroth in #1034
- Fix shutdown command when authz failed by @YuanTingHsieh in #1008
- Default should be False and token is None [gpu] by @YuanTingHsieh in #1035
- Update XGBoost API and examples by @YuanTingHsieh in #1019
- Docs updates [skip ci] by @nvkevlu in #1033
- prelight_check update [skip ci] by @chesterxgchen in #1036
- Fix shutdown command by @YuanTingHsieh in #1037
- fix error type for download_job authorization by @nvkevlu in #1041
- update preflight_check.rst [skip ci] by @chesterxgchen in #1039
- Check nvidia-smi exists before using it by @chesterxgchen in #1038
- update provision message [skip ci] by @chesterxgchen in #1042
- update example README.md to include xgboost [skip ci] by @chesterxgchen in #1043
- Display proper not client authorized message [gpu] by @yhwen in #1040
- fix status of response by @nvkevlu in #1045
- Examples doc rst [skip ci] by @chesterxgchen in #1044
- Update Filter Document (RST) based on Yan's changes [skip ci] by @chesterxgchen in #1048
- Added defaults for simulator. by @yhwen in #1046
- Remove DistributionBuilder as provisioners can zip folders with their… by @IsaacYangSLA in #1052
- Update/remove portions of provisioning_system.rst by @IsaacYangSLA in #1053
- log the send_task_result exception message. by @yhwen in #1051
- Fixed report_resources command by @nvidianz in #1056
- more docs updates [skip ci] by @nvkevlu in #1055
- Fix sqlalchemy version to avoid flask-sqlalchemy all issue. by @IsaacYangSLA in #1057
- Update Brats example for simulator and poc mode by @Can-Zhao in #1050
- prostate - modify nvflare version for env preparation by @ZiyueXu77 in #1059
- Fix bug due to the following if condition by @chesterxgchen in #1060
- Updated getting started guide to use FL Simulator [skip ci] by @kkersten in #1058
- update monai_nvflare version [skip ci] by @chesterxgchen in #1061
- Clean up and add more details to docs [skip ci] by @nvkevlu in #1065
- Update xgboost README [skip ci] by @YuanTingHsieh in #1062
- FIX BUG JIRA FLARE-774 by @chesterxgchen in #1066
- Remove snapshot_persistor [skip ci] by @IsaacYangSLA in #1067
- add new members to blossim-ci by @holgerroth in #1064
- Upgrade cifar10 example [skip ci] by @holgerroth in #1047
- Fixed the simulator default settings. by @yhwen in #1063
- fix single channel default feature name [skip ci] by @holgerroth in #1068
- Check config for monai fed stats [skip ci] by @holgerroth in #1069
- Update cifar10 resource requirements [skip ci] by @holgerroth in #1072
- Allow GPUResourceManager/Consumer to handle float GPU memory by @YuanTingHsieh in #1073
- Fixed the simulator gpu option default message warning. by @yhwen in #1074
- Fixed thee CLI simulator CTR-C issue. by @yhwen in #1075
- Fix missing num_* in the response of calling PATCH project by @IsaacYangSLA in #1076
- NVFLARE README Update etc. [skip ci] by @chesterxgchen in #1070
- README UPDATE [slip ci] by @chesterxgchen in #1077
- update dashboard UI link [skip ci] by @chesterxgchen in #1078
- update by @chesterxgchen in #1079
- Update XGBoost pip version [skip ci] by @YuanTingHsieh in #1082
- Fix typo [skip ci] by @YuanTingHsieh in #1081
- Update monai spleen example and monai_nvflare [skip ci] by @holgerroth in #1083
- Dependency and README UPDATE [slip ci] by @chesterxgchen in #1087
New Contributors
- @eordentlich made their first contribution in #885
- @dependabot made their first contribution in #929
- @rongou made their first contribution in #996
Full Changelog: 2.1.1...2.2.1