Commit
Makes copy of input ddf to work around dropped column names (#3776)
When multiple graphs are created from the same dask_cudf dataframe, a metadata mismatch occurs if one or more partitions are empty. During the second graph creation, the metadata of the dataframe that was used (and modified) earlier is not preserved for partitions holding empty dataframes. The cause is that the first graph creation partly modifies the input dataframe, and the second graph creation reuses a _reference_ to that modified dataframe. This PR makes a copy of the input dataframe right after the repartition call to avoid that alteration.

Authors:
- jnke2016 ([email protected])

Approvers:
- Vibhu Jawa (https://github.com/VibhuJawa)
- Alex Barghi (https://github.com/alexbarghi-nv)
- Rick Ratzel (https://github.com/rlratzel)
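
The aliasing problem described in the commit message can be illustrated with a minimal, library-free sketch. Everything here is hypothetical: `ToyFrame` and `build_graph` are stand-ins invented for illustration and are not dask_cudf or cugraph APIs. The sketch only shows why working on a copy, rather than on a reference to the caller's dataframe, keeps the caller's column names intact for a second graph creation.

```python
import copy


class ToyFrame:
    """Hypothetical stand-in for a partitioned dataframe (not a real API)."""

    def __init__(self, columns, partitions):
        self.columns = list(columns)
        self.partitions = partitions  # one list of rows per partition


def build_graph(frame, make_copy):
    """Hypothetical graph builder that renames columns in place.

    With make_copy=False it mutates the caller's frame through the shared
    reference; with make_copy=True it mutates a private copy instead.
    """
    work = copy.deepcopy(frame) if make_copy else frame
    work.columns[:2] = ["src", "dst"]  # in-place rename on `work`
    return work.columns


# Without a copy: the caller's frame is altered by the first graph creation.
edges = ToyFrame(["source", "target"], [[(0, 1)], []])  # second partition empty
build_graph(edges, make_copy=False)
columns_after_alias = list(edges.columns)

# With a copy: the caller's frame keeps its original column names.
edges = ToyFrame(["source", "target"], [[(0, 1)], []])
build_graph(edges, make_copy=True)
columns_after_copy = list(edges.columns)
```

In the first case a second call that looks up `source` and `target` on `edges` would no longer find them, which mirrors the dropped column names the commit works around.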