- [Short description of non-trivial change.]
- Conception
- Development
- Added bucketization() function.
- Added update_cwd_to_root() function.
- Added PySpark ingestion and testing framework.
- Added project setup function for notebook and console.
- Added feature store for EKS anomaly detection models.
- Added new features to version control feature store and nested JSONs.
- Renamed the repo to dish-devex-sdk from msspackages.
- Renamed the package to devex-sdk from msspackages.
- Added Spark session creation module.
- Added module that converts CSV, JSON, or Parquet to a dataframe.
- Updated the data ingestion framework to increase code reusability through "Connectors".
- Created class "Spark_Utils" to allow for creation of Spark objects with a single function call.
- Created parent class "Spark_Data_Connector" that can read any S3 location using a Spark backend.
- Modified child classes "EKS_Connector" and "Nested_Json_Connector" to inherit attributes and functions from the parent class.
- Modified parent class "Pandas_Data_Connector" to read S3 locations using a pandas backend.
- Added GzConnector class to ingest gzip-format logs into a dataframe.
- Added GitHub Releases automation through GitHub Actions.
- Added PyPI automatic versioning through GitHub Actions.
- Updated CONTRIBUTING documentation to ensure accurate addition of dependencies.
- Modified main README structure to include a directory of all packages within the DevEx SDK.
- PyPI release to resolve versioning conflict.
- Added ability to version .whl files manually when compiling local builds.
- Added error handling and made minor changes to GzConnector and its tests.
- Reworked GzConnector to reflect changes in logging infrastructure and format.
- Updated Spark config to enable read/write access for S3 buckets with dots in their names.