Version 0.27.0
head ordering
Since Koalas doesn't guarantee row ordering, head could return rows from any distributed partition, so the result is not deterministic, which might confuse users.
We added a configuration option, compute.ordered_head (#1231). If it is set to True, Koalas performs natural ordering beforehand and the result will be the same as pandas'.
The default value is False because the ordering causes a performance overhead.
>>> import databricks.koalas as ks
>>> kdf = ks.DataFrame({'a': range(10)})
>>> pdf = kdf.to_pandas()
>>> pdf.head(3)
   a
0  0
1  1
2  2
>>> # By default, the rows returned by head may differ between calls.
>>> kdf.head(3)
   a
5  5
6  6
7  7
>>> kdf.head(3)
   a
0  0
1  1
2  2
>>> ks.options.compute.ordered_head = True
>>> kdf.head(3)
   a
0  0
1  1
2  2
>>> kdf.head(3)
   a
0  0
1  1
2  2
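To make the trade-off concrete, here is a toy model in plain Python of why an unordered head can vary between calls. It is only a sketch and not how Koalas implements head: the helpers make_partitions and head, the two-partition split, and the shuffled partition order are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Row = Tuple[int, int]  # (natural-order id, value of column 'a')


def make_partitions(values: Sequence[int], num_partitions: int) -> List[List[Row]]:
    """Split rows into contiguous chunks, tagging each row with its original position."""
    rows = list(enumerate(values))
    size = max(1, -(-len(rows) // num_partitions))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def head(partitions: List[List[Row]], n: int, ordered: bool, rng: random.Random) -> List[int]:
    """Return the first n values, with or without restoring the natural order."""
    if ordered:
        # Toy analogue of compute.ordered_head = True:
        # sort every row by its natural-order id first, which costs extra work.
        rows = sorted(row for part in partitions for row in part)
    else:
        # Toy analogue of the default: partitions arrive in no guaranteed order.
        shuffled = list(partitions)
        rng.shuffle(shuffled)
        rows = [row for part in shuffled for row in part]
    return [value for _, value in rows[:n]]


rng = random.Random()
parts = make_partitions(range(10), num_partitions=2)
print(head(parts, 3, ordered=False, rng=rng))  # [0, 1, 2] or [5, 6, 7]
print(head(parts, 3, ordered=True, rng=rng))   # always [0, 1, 2]
```

In this model the ordered path has to look at every row before it can return three of them, which mirrors why the option is off by default.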
GitHub Actions
We started trying to use GitHub Actions for CI. (#1254, #1265, #1264, #1267, #1269)
Other new features and improvements
We added the following new feature:
DataFrame:
- apply (#1259)
Other improvements
- Fix identical and equals for the comparison between the same object. (#1220)
- Select the series correctly in SeriesGroupBy APIs (#1224)
- Fix DataFrame/Series.clip function to preserve its index. (#1232)
- Throw a better exception in DataFrame.sort_values when multi-index column is used (#1238)
- Fix fillna not to change index values. (#1241)
- Fix DataFrame.__setitem__ with tuple-named Series. (#1245)
- Fix corr to support multi-index columns. (#1246)
- Fix the output of print() for Series to match pandas. (#1250)
- Fix fillna to support partial column index for multi-index columns. (#1244)
- Add as_index check logic to groupby parameter (#1253)
- Raise NotImplementedError for elements that are not actually implemented. (#1256)
- Fix where to support multi-index columns. (#1249)