Releases: hitsz-ids/synthetic-data-generator
0.2.2
What's Changed
- Feature: Add progressbar for CTGAN when fitting and sampling by @cyantangerine in #228
- Enhance: Check the type of foreign key by @Z712023 in #229
- BugFix: Parallel Data Processing by @cyantangerine in #227
- Enhance: Improved CONTRIBUTING Docs with 4+1 view and Overview Diagram by @jalr4ever in #226
- BugFix: Regulate positive-negative values in the generated data by @jalr4ever in #232
- Enhance: Tenfold performance boost by reducing the memory usage of Gaussian Copula training by @jalr4ever in #233
New Contributors
- @cyantangerine made their first contribution in #228
Full Changelog: 0.2.1...0.2.2
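As a rough illustration of what "regulating positive-negative values" (#232) can mean, the sketch below drops generated values whose sign never occurs in the original column. The function name `filter_by_sign` and the drop-rather-than-clip strategy are assumptions made for this example; they are not taken from the sdgx code base.

```python
from typing import List


def filter_by_sign(original: List[float], generated: List[float]) -> List[float]:
    """Keep only generated values whose sign is consistent with the original column.

    Hypothetical helper for illustration; not part of sdgx's API.
    """
    if original and all(v >= 0 for v in original):
        # Original column is non-negative: discard negative samples.
        return [v for v in generated if v >= 0]
    if original and all(v <= 0 for v in original):
        # Original column is non-positive: discard positive samples.
        return [v for v in generated if v <= 0]
    # Mixed signs (or no reference data): nothing to enforce.
    return list(generated)


print(filter_by_sign([1.0, 2.0, 3.0], [0.5, -0.1, 2.0]))
```

A real implementation would more likely work on whole rows of a DataFrame, so that dropping one value does not misalign the other columns.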
0.2.1
What's Changed
- Add CHN address inspector by @MooooCat in #158
- Update inspector part in Doc(API Reference) by @MooooCat in #159
- Add dotenv in single-table gpt model by @MooooCat in #161
- Speed up regex inspector, Add chn/eng name inspectors by @MooooCat in #162
- Add single table metadata example by @MooooCat in #166
- bugfix: SingleTableGPTModel._sample_with_data "has no attribute 'result'" by @aaronrmm in #174
- Change Metadata.column_list from Set to List by @MooooCat in #176
- Remove unnecessary dependency torchvision by @Guo-Yunzhe in #177
- Update pyproject.toml (joblib version) by @MooooCat in #175
- Bugfix: fix Gaussian copula segmentation fault error by @MooooCat in #180
- Bugfix: fix division by zero error in numeric inspector, add comments by @MooooCat in #181
- Intro data processor in sdgx by @MooooCat in #171
- Intro data processor in Readme by @MooooCat in #182
- Fix View GFI Link in Readme by @MooooCat in #183
- Fix precision problem in metric's testcases by @MooooCat in #185
- Use GLM-4 by @TracyWang95 in #188
- Pin numpy<2 by @Wh1isper in #190
- Feature: Add Email Generator (a new type of sdgx.data_processor) by @MooooCat in #184
- Add ChnPiiGenerator and Enhance Models by @MooooCat in #191
- Update documentation and docstrings for DataProcessors by @MooooCat in #186
- Add live QR code by @MooooCat in #198
- Enhance Data Handling with Empty Column Inspector and Transformer by @MooooCat in #197
- Update NonValueTransformer's Default Setting and Handle Custom Fill Values by @MooooCat in #199
- Enhance Chinese Name Inspector by @MooooCat in #200
- Add Chinese Company Name Support and Inspector by @MooooCat in #201
- Update Live QR Code Image by @MooooCat in #203
- BugFix: base_url not included in requests to GPT in SingleTableGPTModel by @jalr4ever in #205
- Enhance: Fix Data Quality with Outlier Handling and Improved Missing Value Treatment by @MooooCat in #207
- Typo Fix: Unified Logger Usage by @MooooCat in #209
- Update Live QR Code Image 0730 by @MooooCat in #210
- Bugfix: Update Fit Methods in Data Processors by @MooooCat in #211
- Add ConstInspector and ConstValueTransformer for Handling Constant Columns by @MooooCat in #202
- Enhance: Add NonValueTransformer Reverse Conversion with NAN_VALUE Replacement by @MooooCat in #212
- Maintenance: Update CTGAN Example to Use Latest SDG by @MooooCat in #213
- Fix Minor Typo by @MooooCat in #216
- Enhance Numeric Data Inspection and Introduce Positive/Negative Filtering by @MooooCat in #217
- Fix Division by Zero Error in Numeric Column Inspection by @MooooCat in #220
New Contributors
- @aaronrmm made their first contribution in #174
- @Guo-Yunzhe made their first contribution in #177
- @TracyWang95 made their first contribution in #188
- @jalr4ever made their first contribution in #205
Full Changelog: 0.2.0...0.2.1
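The division-by-zero fixes in numeric column inspection (#181, #220) concern a common failure mode: computing a ratio over a column that turns out to be empty or entirely null. The sketch below shows one way to guard such a ratio. `numeric_ratio` and `_is_number` are hypothetical names for this example, and the actual inspector logic in sdgx may differ.

```python
from typing import Any, Sequence


def _is_number(value: Any) -> bool:
    """Return True if the value can be parsed as a float."""
    try:
        float(value)
    except (TypeError, ValueError):
        return False
    return True


def numeric_ratio(values: Sequence[Any]) -> float:
    """Fraction of non-null values that parse as numbers.

    Hypothetical helper for illustration; not part of sdgx's API.
    """
    non_null = [v for v in values if v is not None]
    if not non_null:
        # Guard: an empty or all-null column would otherwise divide by zero.
        return 0.0
    numeric_count = sum(1 for v in non_null if _is_number(v))
    return numeric_count / len(non_null)


print(numeric_ratio(["1", "2.5", "x", None]))
print(numeric_ratio([None, None]))
```

Returning 0.0 for an empty column is a design choice made here for simplicity; raising an error or returning None are equally defensible.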
0.2.0
What's Changed
LLM-Based SingleTable Model
A single-table data synthesis model based on an LLM is included; see the Colab example.
Commits:
- Introduce LLM-based single-table model. by @MooooCat in #129
- Bugfix: fix model type typo by @MooooCat in #144
- Bugfix: fix return datatype in _sample_with_metadata by @MooooCat in #145
- Bugfix: fix LLM result typo by @MooooCat in #146
Improvements on Inspectors
- Add Regex Inspector and Email Inspector example. by @MooooCat in #115
- Implement datetime_formats in DatetimeInspector by @Femi-lawal in #125
- Distinguish int/float in NumericInspector by @MooooCat in #133
Metadata
- Bugfix: fix KeyError when metadata raises a MetadataInvalidError. by @MooooCat in #134
- Add dict support on metadata, optimize datetime format judgment rules, add eq for combiner by @MooooCat in #135
Python 3.12 Support
Readme and Docs
- Update README.md by @iokk3732 in #123
- docs: add iokk3732 as a contributor for code by @allcontributors in #127
- docs: add Femi-lawal as a contributor for code by @allcontributors in #128
- Add language switch on Readme.md by @MooooCat in #130
- Minor modifications on readme by @MooooCat in #131
- Update SDG Readme by @MooooCat in #139
- Update doc readme by @MooooCat in #140
- Add Colab Examples, Update Readme by @MooooCat in #147
- Update readme.md by @MooooCat in #150
- Add ctgan description on Readme.md by @MooooCat in #151
Others
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #124
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #138
New Contributors
- @iokk3732 made their first contribution in #123
- @Femi-lawal made their first contribution in #125
Full Changelog: 0.1.5...0.2.0
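To give a feel for the idea behind datetime_formats in DatetimeInspector (#125), the sketch below tries a list of candidate formats against a column and returns the first one that parses every value. It uses only the standard library. The function `detect_datetime_format` is a hypothetical name for this example and does not reflect how DatetimeInspector is actually implemented.

```python
from datetime import datetime
from typing import Iterable, Optional, Sequence


def detect_datetime_format(
    values: Sequence[str], formats: Iterable[str]
) -> Optional[str]:
    """Return the first format that parses every value, or None if none does.

    Hypothetical helper for illustration; not part of sdgx's API.
    """
    if not values:
        return None
    for fmt in formats:
        try:
            for value in values:
                datetime.strptime(value, fmt)
        except ValueError:
            # At least one value does not match this format; try the next one.
            continue
        return fmt
    return None


candidates = ["%d/%m/%Y", "%Y-%m-%d"]
print(detect_datetime_format(["2024-01-31", "2023-12-01"], candidates))
```

The order of the candidate list matters when several formats could parse the same strings, so ambiguous formats such as day-first versus month-first should be ordered deliberately.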
0.1.5
What's Changed
- docs: add Z712023 as a contributor for code by @allcontributors in #112
- Bugfix metric mutual information by @Z712023 in #118
- [Bugfix] Temporarily modify single table demo data link by @MooooCat in #121
- Introduce inspect_level in inspector and metadata by @MooooCat in #113
- Add star history chart in README by @Wh1isper in #122
New Contributors
Full Changelog: 0.1.4...0.1.5
0.1.4
What's Changed
- [Bugfix] Add future annotations by @MooooCat in #106
- Add testing for JSD metrics by @sjh120 in #100
- Add base model for multi-table statistic model, change single-table base class location by @MooooCat in #102
- Add mutual information metric by @Z712023 in #101
Full Changelog: 0.1.3...0.1.4
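For readers unfamiliar with the quantity behind the mutual information metric (#101), the sketch below computes mutual information, in nats, between two discrete columns from their empirical joint and marginal counts. It illustrates the textbook definition only. The function name is an assumption made for this example, and how sdgx's metric applies mutual information to real versus synthetic data is not shown here.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Empirical mutual information I(X; Y) in nats for two discrete columns.

    Hypothetical helper for illustration; not part of sdgx's API.
    """
    if len(xs) != len(ys):
        raise ValueError("columns must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Identical binary columns carry log(2) nats of shared information.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
```

Independent columns give a value of zero, and continuous columns would need to be binned before this estimator applies.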
0.1.3
What's Changed
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #87
- [0.2.0] Metadata Implementation by @MooooCat in #81
- Patch on multi table combiner and test case by @Wh1isper in #89
- Fix typo _dumo_json by @Wh1isper in #90
- Intro dummy table for speedup models case by @Wh1isper in #92
- Intro torchrun in CLI by @Wh1isper in #88
- Implement MetadataCombiner, partial refactoring on Metadata by @Wh1isper in #96
- Add mock data and testing for multi tables' related imp by @Wh1isper in #97
- Intro SubsetRelationshipInspector by @Wh1isper in #99
- Add demo data for multi-table scenario by @MooooCat in #98
Full Changelog: 0.1.2...0.1.3
0.1.2
0.1.1
0.1.0
What's Changed
- Update SDG's New Data Processor by @MooooCat in #48
- Rewrite and notice copyright for CTGAN by @Wh1isper in #50
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #49
- docs: add Wh1isper as a contributor for code by @allcontributors in #53
- docs: add MooooCat as a contributor for code by @allcontributors in #54
- docs: add joeyscave as a contributor for code by @allcontributors in #55
- Breaking changes: Refactoring of new architecture by @Wh1isper in #56
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #63
- 0.1.0: Intro DataConnector and DataLoader by @Wh1isper in #64
- [0.1.0] Metadata and Inspector by @Wh1isper in #67
- [0.1.0] Breaking changes: Refactoring models Part 1 by @Wh1isper in #68
- [0.1.0] Update a part of docs by @Wh1isper in #70
- Update Base Class of Metric by @MooooCat in #60
- docs: add sjh120 as a contributor for code by @allcontributors in #73
- [0.1.0] Refactoring CTGAN for DataLoader by @Wh1isper in #72
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #76
- [0.1.0] Intro NDArrayLoader by @Wh1isper in #75
- Add subdir for NDArrayLoader to prevent collision of cache files by @Wh1isper in #78
- Init benchmark base code by @Wh1isper in #80
- Switch to cloudpickle and fix load bugs by @Wh1isper in #83
- Update docstring and user guides by @Wh1isper in #84
New Contributors
- @allcontributors made their first contribution in #53
- @sjh120 made their first contribution in #60
Full Changelog: 0.0.1...0.1.0