For many algorithms, whether the input data is C or Fortran contiguous determines whether an expensive memory copy needs to be made. While this seems innocuous, it can have significant UX implications: most users don't understand it well, and when it causes a problem, the resulting errors don't make the cause obvious.
We should document this.
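To make the copy concrete, here is a minimal sketch using NumPy on host memory. It is only an illustration of how memory order decides whether a conversion copies; it is not the library's own conversion path, and the array names are made up for the example.

```python
import numpy as np

# NumPy allocates 2D arrays in C (row-major) order by default.
a_c = np.arange(12, dtype=np.float32).reshape(3, 4)

# Requesting Fortran (column-major) order from a C-ordered array forces a copy.
a_f = np.asfortranarray(a_c)

# Requesting Fortran order from an array that already has it is free.
a_f_again = np.asfortranarray(a_f)

# Transposing is a view: the transpose of a C-ordered array is F-ordered.
a_t = a_c.T

print(a_c.flags["C_CONTIGUOUS"], a_c.flags["F_CONTIGUOUS"])
print(np.shares_memory(a_c, a_f))        # a new buffer was allocated
print(np.shares_memory(a_f, a_f_again))  # no new buffer
print(a_t.flags["F_CONTIGUOUS"], np.shares_memory(a_c, a_t))
```

Checking `arr.flags` before passing data to an algorithm is one way a user could tell in advance whether a layout conversion is likely.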
beckernick changed the title from "[FEA] Document which algorithms expect Fortran vs. C contiguous data" to "[DOC] Document which algorithms expect Fortran vs. C contiguous data" on Jun 13, 2024
Opened a PR that should inform users when a possibly useless copy is performed. As stated here, data on host (NumPy arrays and pandas DataFrames) will be copied over to device anyway, cuDF DataFrames are deep-copied too, and cuDF Series are 1D and thus not affected by the issue. That leaves only CUDA array interface compliant arrays (and Numba arrays) as inputs that can be copied solely because of a change in data order/contiguity. This change should allow the user to be informed.
If the user is informed through logging, is it necessary to also document it? If so, should we add the expected data order/contiguity to the documentation of every data-carrying function parameter across the entire library? What should we do when function parameters are left undocumented (there are many occurrences)?
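For readers following along, the logging idea can be sketched roughly as below. This is a hypothetical illustration on NumPy arrays, not the code in the PR: the helper name `as_order`, the logger name, and the message text are all assumptions made for the example.

```python
import logging

import numpy as np

logger = logging.getLogger("layout_demo")  # hypothetical logger name


def as_order(arr, order):
    """Return (array, copied) with `arr` in the requested order ("C" or "F").

    Hypothetical helper: logs a warning only when a copy is made purely
    because the memory layout does not match the requested order.
    """
    if order not in ("C", "F"):
        raise ValueError("order must be 'C' or 'F'")
    flag = "F_CONTIGUOUS" if order == "F" else "C_CONTIGUOUS"
    if arr.flags[flag]:
        # Layout already matches, so no copy and nothing to report.
        return arr, False
    logger.warning(
        "Expected %s-contiguous input with shape %s; copying to convert layout.",
        order,
        arr.shape,
    )
    if order == "F":
        return np.asfortranarray(arr), True
    return np.ascontiguousarray(arr), True


x = np.ones((4, 3))             # C-ordered 2D array
x_f, copied = as_order(x, "F")  # logs a warning and copies
v = np.ones(5)                  # 1D arrays are both C and F contiguous
v_f, v_copied = as_order(v, "F")
```

The 1D case returns without copying, which matches the remark above that 1D inputs are not affected.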