* Minimal implementation (tests green)
* Remove sum method and rely on np.sum
* Force DenseMatrix to always be 2-dimensional
* Add __repr__ and __str__ methods
* Fix as_mx
* Fix ufunc return value
* Wrap SparseMatrix, too
* Demo of how the ufunc interface can be implemented
* Do not subclass csc_matrix
* Demonstrate binary ufuncs for sparse
* Add tocsc method
* Fix type checks
* Minor improvements
* ufunc support for categoricals
* Remove __array_ufunc__ interface
* Remove numpy operator mixin
* Add hstack function
* Add method for unpacking underlying array
* Add __matmul__ methods to SparseMatrix
* Stricter and more consistent indexing
* Be consistent when instantiating from 1d arrays
* Add column name metadata to `tabmat` matrices (#278)
* Add column name getters
* Matrix names are also combined
* Add names to constructors
* Add indexing support for column names
* Remove unnecessary code
* Better default column names
* Reduce code duplication
* Saner defaults
* Add convenient getters and setters
* Fix indexing
* Smarter setter for categorical matrices
* Add tests
* Fix subsetting with np.newaxis
* Remove the walrus :(
* Fix test
* Fix indexing with np.ix_
* Propagate column names where it makes sense
* Fix merge mistake
* Add changelog entry
* Matrices from formulas (#267)
* Add an experimental tabmat materializer class
* Nicer way of handling interactions
* Have proper column names [skip ci]
* Make dummy ordering consistent with pandas [skip ci]
* Fix mistake in categorical interactions [skip ci]
* Add formulaic to environment files
  Have not added to the conda recipe yet. Should probably be optional.
* Add from_formula constructor
* Add some tests
* Add more tests
* Major refactoring
  - simplify categorical interactions
  - NaNs in categoricals should be handled correctly
  - parity with formulaic in categorical names
* Make name formatting customizable
  - interaction_separator
  - categorical_format
  - intercept_name
* Add formulaic to conda recipe
* Implement `C()` function to convert to categoricals
* Auto-convert strings to categories
* Fix C() not working from materializer interface
* Add the pandasmaterializer tests from formulaic
* Add formulaic to setup.py deps
* Implement suggestions from code review
* Clean up code
  - Add docstrings
  - Add type hints
  - Rename some classes
* Pin formulaic minimum version
* Add support for architectures not supported by xsimd (#262)
* Release 3.1.9 (#263)
* Pre-commit autoupdate (#264)
  Co-authored-by: quant-ranger[bot] <132915763+quant-ranger[bot]@users.noreply.github.com>
* Add params for density and cardinality thresholds
* Skip python 3.6 build
* Refactor to avoid circular imports
* Interaction of dropped and NA is dropped
* Add type hint for context
* Add unit tests for interactable vectors
* Add more checks
* Change argument name
* Make C() stateful (remember levels)
* Add test for categorizer state
* More correct handling of encoding categoricals
* Make adding an intercept implicitly parametrizable
  Default is False
* Add na_action parameter to constructor
* Add test for sparse numerical columns
* Add option to not add the constant column
* Pre-commit autoupdate (#274)
* Pre-commit autoupdate (#276)
  Co-authored-by: quant-ranger[bot] <132915763+quant-ranger[bot]@users.noreply.github.com>
* Bump pypa/gh-action-pypi-publish from 1.8.6 to 1.8.7 (#277)
  Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.8.6 to 1.8.7.
  - [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
  - [Commits](pypa/gh-action-pypi-publish@v1.8.6...v1.8.7)
  ---
  updated-dependencies:
  - dependency-name: pypa/gh-action-pypi-publish
    dependency-type: direct:production
    update-type: version-update:semver-patch
  ...
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump pypa/gh-action-pypi-publish from 1.8.7 to 1.8.8 (#279)
  Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.8.7 to 1.8.8.
  - [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases)
  - [Commits](pypa/gh-action-pypi-publish@v1.8.7...v1.8.8)
  ---
  updated-dependencies:
  - dependency-name: pypa/gh-action-pypi-publish
    dependency-type: direct:production
    update-type: version-update:semver-patch
  ...
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump pypa/cibuildwheel from 2.13.1 to 2.14.1 (#280)
  Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.13.1 to 2.14.1.
  - [Release notes](https://github.com/pypa/cibuildwheel/releases)
  - [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
  - [Commits](pypa/cibuildwheel@v2.13.1...v2.14.1)
  ---
  updated-dependencies:
  - dependency-name: pypa/cibuildwheel
    dependency-type: direct:production
    update-type: version-update:semver-minor
  ...
  Signed-off-by: dependabot[bot] <[email protected]>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Minimal implementation (tests green)
* Remove sum method and rely on np.sum
* Force DenseMatrix to always be 2-dimensional
* Add __repr__ and __str__ methods
* Fix as_mx
* Fix ufunc return value
* Wrap SparseMatrix, too
* Demo of how the ufunc interface can be implemented
* Do not subclass csc_matrix
* Improve the performance of `from_pandas` in the case of low-cardinality categoricals (#275)
* Improve the performance of `from_pandas`
* Update changelog according to review
* Add benchmark data to .gitignore (#282)
* Demonstrate binary ufuncs for sparse
* Add tocsc method
* Fix type checks
* Minor improvements
* ufunc support for categoricals
* Remove __array_ufunc__ interface
* Remove numpy operator mixin
* Add hstack function
* Add method for unpacking underlying array
* Add __matmul__ methods to SparseMatrix
* Stricter and more consistent indexing
* Be consistent when instantiating from 1d arrays
* Adjust tests to work with v4
* Fix type hints
* Add changelog entry
* term and column names for formula-based matrices
* Fix handling of formula-based names
* Add tests for formula-based names

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Martin Stancsics <[email protected]>
Co-authored-by: Uwe L. Korn <[email protected]>
Co-authored-by: quant-ranger[bot] <132915763+quant-ranger[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Apply Matthias' suggestions
  Co-authored-by: Matthias Schmidtblaicher <[email protected]>
* Allow missing values in `CategoricalMatrix` (#281)
* Add missing support to categoricals
* Rename functions
* Parametrize missing behavior in constructors
* Return a maskedarray from recover_orig
* Propagate missing_method when indexing
* Add tests
* Template all the things!
* Privatize has_missing attribute
* Add changelog entry
* Add option to treat missing values as a category
* Update changelog
* Raise if the missing category already exists
* Add tests for missing name and raise on existing
* Don't skip tests (they are fast)
* Apply suggestions from review
* Fix indexing
* Fix intercept name in formulas
* Add missing categorical functionality to formulas
* Much cooler handling of missing categoricals
* Add changelog entry
* Correctly create missing category from model_spec (#297)
* pyupgrade 3.9
* make ruff and mypy happy
* bump minimum formulaic version (stateful transforms)
* add test case with custom cat format
* pin formulaic minimum version to 0.6 (#340)
* cosmetics
* Raise for unseen categories when materializing from an existing `ModelSpec` (#341)
* Raise error on unseen levels when materializing
* Fix test for unseen categories
* Add test for raising on unseen categories
* Properly handle missings when checking for unseen
* Expand test for unseen missings
* Improve attribute name
* Add comment about dropping missings in tests for new levels
* consistent tense
* typo
* slightly improve wording
* Describe breaking change
* improve wording
* review comments
* add change from #356
* fix
* set default context to None
* add scope to other test, too
* tiny docstring cosmetics
* remove duplicate . [skip-ci]
* more docstring formatting

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Matthias Schmidtblaicher <[email protected]>
Co-authored-by: Uwe L. Korn <[email protected]>
Co-authored-by: quant-ranger[bot] <132915763+quant-ranger[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marc-Antoine Schmidt <[email protected]>
Co-authored-by: Matthias Schmidtblaicher <[email protected]>
Co-authored-by: Martin Stancsics <[email protected]>