Fix row numbers in the left column + add tests #26

severo · 2025-01-14T08:46:12Z

Add __index__: number as a mandatory field of the rows. It's a breaking change, but when data is sorted, we need to be sure (and the types should reflect it) that the row index is provided to:

show the row number in the left column
be able to show the cell content when double clicking

Two alternatives:

force __index__ in the rows only when orderBy is not undefined, because otherwise we already have the index. This is what the helper function sortableDataFrame implements, and we could simply add types with function overloads. It would be a breaking change anyway since we don't force anything at the moment, but possibly less impactful.
don't break things, and simply remove features (no row number displayed in the left column, and no double click/mouse down callbacks) if data is sorted and no index is provided
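
To make the first alternative more concrete, here is a minimal sketch of how function overloads could express "index required only when sorted". The names Row, IndexedRow, compare and sortRows are hypothetical and are not the hightable definitions; the real sortableDataFrame helper may well look different.

```typescript
// Hypothetical types, for illustration only (not the hightable definitions).
interface Row { [key: string]: unknown; __index__?: number }
interface IndexedRow extends Row { __index__: number }

// Simple comparator: numeric when both values are numbers, textual otherwise.
function compare(a: unknown, b: unknown): number {
  if (typeof a === 'number' && typeof b === 'number') return a - b
  return String(a).localeCompare(String(b))
}

// Overloads: without orderBy the rows pass through and may omit __index__;
// with orderBy every returned row is typed as carrying its original index.
function sortRows(rows: Row[]): Row[]
function sortRows(rows: Row[], orderBy: string): IndexedRow[]
function sortRows(rows: Row[], orderBy?: string): Row[] {
  if (orderBy === undefined) return rows
  return rows
    // Record the original position before sorting, unless one is already set.
    .map((row, i): IndexedRow => ({ ...row, __index__: row.__index__ ?? i }))
    .sort((a, b) => compare(a[orderBy], b[orderBy]))
}
```

With these overloads, a caller that passes orderBy gets IndexedRow[] back, so reading __index__ needs no undefined check, while the unsorted call keeps the looser Row[] type.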

Also in this PR:

I added some aria attributes, as testing-library primarily relies on roles and accessible information (as opposed to the specific organization of the DOM), and as adopting its way of thinking helps build a more accessible component
I added tests on column sort and mouse double-click, to be sure the correct row index is sent.
[Breaking (the CSS)] I changed the left cells (row numbers) from <td> to <th> as it's semantically better, I think. But we can revert if it's better to avoid breaking the styles.

platypii · 2025-01-14T19:18:03Z

src/HighTable.tsx

@@ -49,15 +49,15 @@ type State = {
  columnWidths: Array<number | undefined> // width of each column
  invalidate: boolean // true if the data must be fetched again
  hasCompleteRow: boolean // true if at least one row is fully resolved (all of its cells)
-  rows: AsyncRow[] // slice of the virtual table rows (sorted rows) to render as HTML
+  rows: (Row | undefined)[] // slice of the virtual table rows (sorted rows) to render as HTML - undefined means pending


Why did this change from AsyncRow to Row?

[if I understand well...]

In master, when we set rows with SET_ROWS, it's an array of Row objects, not of AsyncRow objects, see

hightable/src/HighTable.tsx

Line 186 in da40bff

const resolved: Row[] = []

hightable/src/HighTable.tsx

Line 200 in da40bff

resolved.push(resolvedRow)

I think something is slightly confusing in master: the local rows variable in useEffect is effectively AsyncRow[] while the state field rows is Row[]. That's why I renamed

hightable/src/HighTable.tsx

Line 183 in da40bff

const rows = asyncRows(unwrapped, end - start, data.header)

to

hightable/src/HighTable.tsx

Line 183 in 727168c

const rowsChunk = asyncRows(unwrapped, end - start, data.header)

Also: when we use the state rows in the JSX code, resolved Row objects are expected:

hightable/src/HighTable.tsx

Lines 390 to 401 in da40bff

return <tr key={tableIndex} title={rowError(row, dataIndex)} className={isSelected({ selection, index: tableIndex }) ? 'selected' : ''}>
  <td style={cornerStyle} onClick={event => onRowNumberClick({ useAnchor: event.shiftKey, tableIndex })}>
    <span>{
      /// TODO(SL): we might want to show two columns: one for the tableIndex (for selection) and one for the dataIndex (to refer to the original data ids)
      rowLabel(dataIndex)
    }</span>
    <input type='checkbox' checked={isSelected({ selection, index: tableIndex })} />
  </td>
  {data.header.map((col, colIndex) =>
    Cell(row[col], colIndex, dataIndex)
  )}
</tr>

src/HighTable.tsx

platypii

Still thinking about this.

First of all, the behavior now is basically: use the index if present, but otherwise just give a normal 1,2,3 row ordering, even if there is an orderBy. But if the __index__ does exist, then use it. Absolutely requiring the index feels like it might be impossible in some situations (can imagine a table provider that does SQL queries against a backend, or other variations). Could be hacked around by annotating with 1,2,3 indexes, but that felt like it was adding more work to the simple case, and it would be more flexible to let the table component handle indexed-or-not. The example in the README would need to change to add the __index__ column being returned if this change is merged.

This gave me an idea: if we do add the __index__ to AsyncRow (as done in this PR), we could also change DataFrame.rows() to always return type AsyncRow[] and not | Promise<Row[]>. This would simplify many code branches. It would make it slightly harder to write a DataFrame implementation, but we could fix that by providing helpers (for example, see the asyncRows helper function) that would also annotate with indexes if needed. So the simple case would be like:

const df = {
  header, numRows,
  rows(start, end, orderBy): AsyncRow[] {
    const rows = [{}, {}, {}, ...]
    return asyncRows(rows, start)
  },
}

But for advanced use cases like parquet, we can return our own AsyncRow[] with full control over when cells are loaded (and the __index__).
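
A slightly fuller sketch of that idea, under stated assumptions: the DataFrame interface below is simplified, AsyncRow is modeled as a record of plain promises, and wrapResolvedRows is a hypothetical helper (the existing asyncRows helper has a different signature in master).

```typescript
// Simplified, hypothetical types (not the hightable definitions).
type Row = Record<string, unknown>
type AsyncRow = Record<string, Promise<unknown>>

interface DataFrame {
  header: string[]
  numRows: number
  // Always returns AsyncRow[], never Promise<Row[]>.
  rows(start: number, end: number, orderBy?: string): AsyncRow[]
}

// Hypothetical helper for the simple case: wraps already-loaded rows and
// numbers them from `start`, unless a row already carries its own __index__.
function wrapResolvedRows(rows: Row[], start: number): AsyncRow[] {
  return rows.map((row, i) => {
    const wrapped: AsyncRow = {}
    for (const [key, value] of Object.entries(row)) {
      wrapped[key] = Promise.resolve(value)
    }
    if (!('__index__' in row)) {
      wrapped['__index__'] = Promise.resolve(start + i)
    }
    return wrapped
  })
}

const data: Row[] = [{ name: 'a' }, { name: 'b' }, { name: 'c' }]

// Simple-case DataFrame: everything is in memory, the helper adds the indexes.
const df: DataFrame = {
  header: ['name'],
  numRows: data.length,
  rows(start, end) {
    return wrapResolvedRows(data.slice(start, end), start)
  },
}
```

Here df.rows(1, 3) would return two rows whose __index__ promises resolve to 1 and 2, so the table component never has to guess the original position.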

Will continue thinking about best approach.

severo · 2025-01-15T08:43:02Z

Yes, it seems like a good simplification.

Thinking out loud: looking at the code in master (and in hyperparam-cli), we always resolve all the columns when we process a given row. I understand that AsyncRow gives the ability to update only some cells in a row, but do we want to provide this level of flexibility? If not, and iterating on your idea above, we could have

const df = {
  header, numRows,
  rows(start, end, orderBy): AsyncRow[] {
    const rows = [{}, {}, {}, ...]
    return asyncRows(rows, start)
  },
}

with

AsyncRow = WrappedPromise<Row>

severo · 2025-01-15T15:24:34Z

Absolutely requiring the index feels like it might be impossible in some situations (can imagine a table provider that does SQL queries against a backend, or other variations).

what should we do for the double click on a cell in that case? Disable it?

I think that for this PR, I'll apply the alternative #2:

don't break things, and simply remove features (no row number displayed in the left column, and no double click/mouse down callbacks) if data is sorted and no index is provided

so that we can still ship the tests, and a11y improvements, and then work on the index in another PR
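
As a rough illustration of alternative #2, the check could be centralized in one place. This is only a sketch: dataIndexOf and notifyCell are hypothetical names, and the callback signature is an assumption rather than the actual HighTable props.

```typescript
// Hypothetical minimal row shape for this sketch.
interface RowLike { __index__?: number }

// Returns the index of the row in the original data, or undefined when the
// data is sorted and the row does not provide __index__.
function dataIndexOf(row: RowLike, tableIndex: number, orderBy?: string): number | undefined {
  if (typeof row.__index__ === 'number') return row.__index__
  // Unsorted data: the position in the table is the position in the data.
  if (orderBy === undefined) return tableIndex
  // Sorted data without __index__: the original position is unknown.
  return undefined
}

// Calls the cell callback only when the data index is known.
// Returns true if the callback was invoked.
function notifyCell(
  dataIndex: number | undefined,
  colIndex: number,
  callback: (colIndex: number, dataIndex: number) => void
): boolean {
  if (dataIndex === undefined) return false
  callback(colIndex, dataIndex)
  return true
}
```

The left column would then render the row number only when dataIndexOf returns a number, and the double click and mouse down handlers would go through something like notifyCell.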

severo · 2025-01-15T17:05:30Z

src/dataframe.ts

+      if (!('__index__' in wrapped) && '__index__' in row && typeof row.__index__ === 'number') {
+        wrapped[i].__index__ = resolvablePromise<number>()
+        wrapped[i].__index__.resolve(row.__index__)
+      }


this part fixes a bug: __index__ was ignored if using asyncRows()
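
For readers unfamiliar with the helper, here is a self-contained sketch of the idea behind that fix. It is an assumption-laden reimplementation, not the code in src/dataframe.ts: resolvablePromise and wrapPendingRows are written from scratch here, and their signatures are only guesses at the shape of the real helpers.

```typescript
// A promise that can be resolved or rejected from the outside (sketch).
type ResolvablePromise<T> = Promise<T> & {
  resolve: (value: T) => void
  reject: (reason?: unknown) => void
}

function resolvablePromise<T>(): ResolvablePromise<T> {
  let resolve!: (value: T) => void
  let reject!: (reason?: unknown) => void
  const promise = new Promise<T>((res, rej) => { resolve = res; reject = rej })
  return Object.assign(promise, { resolve, reject })
}

type Row = Record<string, unknown>
type AsyncRow = Record<string, ResolvablePromise<unknown>>

// Wraps a pending Promise<Row[]> into rows of per-cell promises. When a
// resolved row carries a numeric __index__, it is forwarded to the wrapped
// row instead of being dropped.
function wrapPendingRows(rows: Promise<Row[]>, numRows: number, header: string[]): AsyncRow[] {
  const wrapped: AsyncRow[] = Array.from({ length: numRows }, () => {
    const row: AsyncRow = {}
    for (const key of header) row[key] = resolvablePromise<unknown>()
    return row
  })
  rows.then(resolved => {
    resolved.slice(0, numRows).forEach((row, i) => {
      const target = wrapped[i]
      if (target === undefined) return
      for (const key of header) target[key]?.resolve(row[key])
      const index = row['__index__']
      if (!('__index__' in target) && typeof index === 'number') {
        const pending = resolvablePromise<unknown>()
        pending.resolve(index)
        target['__index__'] = pending
      }
    })
  }).catch(reason => {
    // Propagate a failed fetch to every pending cell.
    for (const row of wrapped) for (const key of header) row[key]?.reject(reason)
  })
  return wrapped
}
```

Note that in this sketch the __index__ promise only appears on the wrapped row once the source promise has resolved, so a consumer has to wait for the cells before reading it.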

severo · 2025-01-15T17:06:26Z

Requesting a review again @platypii :) It breaks fewer things and is acceptable too, I think

platypii · 2025-01-15T17:23:31Z

I understand that AsyncRow gives the ability to update only some cells in a row, but do we want to provide this level of flexibility?

Ability to fill in individual cells is required behavior. The whole point is to allow cells to fill in when they are ready. For parquet, since it is column-oriented, small columns might fill in before large text columns and I don't want to have to wait for the entire row to resolve (not implemented in cli yet, but that is planned).

Also some columns might be derived, and take time to compute, like with the llm transform demo. This would look really bad if every row was blank until every column was ready:

column.loading.mp4

Which is why, if anything, I think we should make the DataFrame always return AsyncRow[].
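
To illustrate cell-level resolution, here is a minimal sketch in which a small cell fills in before a slow one. AsyncRow is modeled as a record of plain promises, and fillAsCellsResolve, demoRow, demoSnapshot, releaseText and demoDone are hypothetical names used only for this example.

```typescript
type AsyncRow = Record<string, Promise<unknown>>

// Copies each cell into `snapshot` as soon as that cell resolves, and calls
// `onUpdate` so that a UI could re-render the row with the cells known so far.
function fillAsCellsResolve(
  row: AsyncRow,
  snapshot: Record<string, unknown>,
  onUpdate: () => void
): Promise<void> {
  const cells = Object.entries(row).map(([key, cell]) =>
    cell.then(value => {
      snapshot[key] = value
      onUpdate()
    })
  )
  return Promise.all(cells).then(() => undefined)
}

// A small column that is ready immediately and a large text column that is
// released later by calling releaseText (stands in for a slow fetch).
let releaseText: (value: string) => void = () => undefined
const demoRow: AsyncRow = {
  id: Promise.resolve(1),
  text: new Promise<string>(resolve => { releaseText = resolve }),
}
const demoSnapshot: Record<string, unknown> = {}
let demoUpdates = 0
const demoDone = fillAsCellsResolve(demoRow, demoSnapshot, () => { demoUpdates += 1 })
```

The id cell lands in demoSnapshot right away, while text stays missing until releaseText is called. With row-level resolution, nothing would be shown for this row until both were ready.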

platypii

These changes look good now, while we debate other refactors :-)

severo · 2025-01-15T17:26:06Z

Excellent, thanks for the details. Indeed we want cell level resolution.

platypii · 2025-01-15T17:27:16Z

what should we do for the double click on a cell in that case? Disable it?

Yea this is a good point which I don't have a great answer to. I could be convinced on requiring the __index__. But for now just disabling the clicks when sorted+noindex makes sense to me.

add aria attributes, change left cells from td to th, add test on column sort

3da9a90

severo changed the title ~~add aria attributes, change left cells from td to th, add test on col…~~ add aria attributes, change left cells from td to th, add test on column sort Jan 14, 2025

severo marked this pull request as draft January 14, 2025 08:49

severo added 2 commits January 14, 2025 11:35

set readonly to the checkbox for now to remove warning in tests

4163b7b

mandatory __index__ in rows

727168c

platypii reviewed Jan 14, 2025

severo added 4 commits January 14, 2025 23:05

remove wrong comments + hide unknown row numbers when sorted

9cc1581

expand text

abb60f9

no need for a callback

7d138f6

improve comment

00c80d1

severo requested a review from platypii January 14, 2025 22:38

severo changed the title ~~add aria attributes, change left cells from td to th, add test on column sort~~ Fix row numbers in the left column + add tests Jan 14, 2025

severo marked this pull request as ready for review January 14, 2025 22:39

platypii reviewed Jan 15, 2025

severo added 8 commits January 15, 2025 17:07

remove breaking change on Row / AsyncRow definitions

3bf259d

if __index__ is not present in sorted rows, we: - don't show the number in the left column - cancel the callbacks on mouse down click and double click on a cell

centralize code that checks if __index__ is present and needed

8f45a17

update comment

ed23ba0

update docstring

fd8fd37

restore two changes

9769e68

Revert "restore two changes"

4f6c883

This reverts commit 9769e68.

no need to export

b6ffbdf

only add __index__ if required

20e30fd

severo commented Jan 15, 2025

severo requested a review from platypii January 15, 2025 17:05

platypii approved these changes Jan 15, 2025

severo merged commit 9d2f932 into master Jan 15, 2025
8 checks passed

severo deleted the add-tests branch January 15, 2025 17:26

severo mentioned this pull request Jan 15, 2025

upgrade hightable to 0.9.0 (fix row index) hyparam/hyperparam-cli#141

Merged
