Version 0.28.0
pandas 1.0 support
We added pandas 1.0 support (#1197, #1299), so Koalas can now work with pandas 1.0.
map_in_pandas
We implemented the DataFrame.map_in_pandas API (#1276), which lets you apply an arbitrary function that takes and returns a pandas DataFrame to a Koalas DataFrame. See the example below:
>>> import databricks.koalas as ks
>>> df = ks.DataFrame({'A': range(2000), 'B': range(2000)})
>>> def query_func(pdf):
...     num = 1995
...     return pdf.query('A > @num')
...
>>> df.map_in_pandas(query_func)
         A     B
1996  1996  1996
1997  1997  1997
1998  1998  1998
1999  1999  1999
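One point worth keeping in mind is that a function passed to map_in_pandas most likely sees only one piece of the data at a time, not the whole frame. The pandas-only sketch below imitates that behavior under stated assumptions: map_in_chunks and chunk_size are hypothetical names invented for illustration, and the fixed-size chunking only stands in for however Koalas actually splits the data. Row-wise logic such as query_func gives the same result regardless of chunking, while anything that depends on the whole frame (here, len) does not.

```python
import pandas as pd


def map_in_chunks(pdf, func, chunk_size):
    """Hypothetical helper (not a Koalas API): apply func to consecutive
    row chunks of a pandas DataFrame and concatenate the results."""
    chunks = [pdf.iloc[i:i + chunk_size] for i in range(0, len(pdf), chunk_size)]
    return pd.concat([func(chunk) for chunk in chunks])


def query_func(pdf):
    num = 1995
    return pdf.query('A > @num')


pdf = pd.DataFrame({'A': range(2000), 'B': range(2000)})

# A row-wise filter is unaffected by how the data is split.
result = map_in_chunks(pdf, query_func, chunk_size=500)

# A function that inspects the whole input sees only its own chunk.
sizes = map_in_chunks(
    pdf, lambda chunk: pd.DataFrame({'rows': [len(chunk)]}), chunk_size=500
)
```

Here result holds the same four rows as the Koalas example above, whereas sizes reports 500 rows four times instead of 2000 once.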
Standardize code style using Black
As a development-only change, we added Black integration (#1301). All code is now formatted automatically by running ./dev/reformat, and the style is checked as part of ./dev/lint-python.
Other new features and improvements
We added the following new feature:
DataFrame:
Other improvements
- Fix DataFrame.describe() to support multi-index columns. (#1279)
- Add util function validate_bool_kwarg (#1281)
- Rename data columns prior to filter to make sure the column names are as expected. (#1283)
- Add an FAQ about Structured Streaming. (#1298)
- Let extra options have higher priority to allow workarounds (#1296)
- Implement 'keep' parameter for drop_duplicates (#1303)
- Add a note when type hint is provided to DataFrame.apply (#1310)
- Add a util method to verify temporary column names. (#1262)
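For readers unfamiliar with the 'keep' parameter mentioned above, the following sketch shows its three values using plain pandas. It assumes the Koalas implementation accepts the same values and follows the same semantics as pandas, which these notes do not state explicitly; the sample data is made up for illustration.

```python
import pandas as pd

pdf = pd.DataFrame({'a': [1, 1, 2, 2, 3], 'b': ['x', 'x', 'y', 'y', 'z']})

# keep='first' (the pandas default): retain the first row of each duplicate group.
first = pdf.drop_duplicates(keep='first')

# keep='last': retain the last row of each duplicate group.
last = pdf.drop_duplicates(keep='last')

# keep=False: drop every row that has a duplicate.
unique_only = pdf.drop_duplicates(keep=False)
```

In pandas, these calls retain the rows with index 0, 2, 4; then 1, 3, 4; then only 4.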