Ability to select multiple rows #103

Open
platypii opened this issue Dec 3, 2024 · 12 comments

@platypii
Contributor

platypii commented Dec 3, 2024

Need to be able to select a range of rows, and apply an action to them (annotation, removal, etc).

I could imagine a couple ways this could be done. Please use your best judgement and let's try things out to see how they feel. Some ideas:

Excel-style

Click on row number to select row, and start selection. Shift-click on another to select the range.

Right click menu

Right-click on row number and have dropdown menu with option "select above" rows. (long click for mobile support?)

@severo
Contributor

severo commented Dec 11, 2024

The selection UI will be implemented in hightable, not in hyperparam-cli.

Which data structure should we use to represent the selected rows?

  1. an array (index) of all the rows, each with its state, selected or not selected (see "Select Multiple Rows by Holding Shift", TanStack/table#3068, for example)
    Note that it can be implemented efficiently as a BitSet (an array of 32-bit integers, so the array length is ceil(N/32)): https://github.com/uwdata/arquero/blob/main/src/table/BitSet.js
  2. a set (or array) of the selected row indexes (https://github.com/observablehq/inputs/blob/main/src/table.js) <- simple, but could be big if we select all the rows, or the first 1M
  3. an array of ranges ([row_min, row_max]) of selected rows <- one issue is that it can be corrupted, for example if the ranges overlap or if row_max < row_min
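
For illustration, option 1 could be sketched as a bit set along these lines. `RowBitSet` and its methods are hypothetical names for this example, not the arquero BitSet API and not something hightable provides.

```typescript
// Sketch of option 1: one bit per row, packed into 32-bit words.
// All names here are hypothetical; this is not the arquero BitSet.
class RowBitSet {
  private readonly words: Uint32Array
  readonly numRows: number

  constructor(numRows: number) {
    this.numRows = numRows
    // ceil(N / 32) words are enough to hold N bits
    this.words = new Uint32Array(Math.ceil(numRows / 32))
  }

  // true if the row is selected (out-of-range rows read as unselected)
  has(row: number): boolean {
    const word = this.words[row >>> 5] ?? 0
    return (word & (1 << (row & 31))) !== 0
  }

  // select or unselect a single row
  set(row: number, selected: boolean): void {
    if (row < 0 || row >= this.numRows) throw new RangeError(`row ${row} is out of bounds`)
    const index = row >>> 5
    const mask = 1 << (row & 31)
    const word = this.words[index] ?? 0
    this.words[index] = selected ? word | mask : word & ~mask
  }

  // list the selected row indexes in ascending order
  selectedRows(): number[] {
    const rows: number[] = []
    for (let row = 0; row < this.numRows; row++) {
      if (this.has(row)) rows.push(row)
    }
    return rows
  }
}
```

Lookups and updates are constant time, but the memory cost is proportional to the total number of rows rather than to the number of selected rows.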

@platypii
Contributor Author

Probably 2.

I like the idea of ranges, and we could maybe return a struct that includes ranges, but ultimately we probably need to return the list of rows. The main reason is that the table might be sorted, so the selection needs to be sorting-aware.
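
A minimal sketch of how ranges and a sorting-aware row list could fit together. It assumes inclusive `[min, max]` ranges over displayed positions and a hypothetical `order` array where `order[displayPosition]` is the row index in the original data; none of these names come from hightable.

```typescript
// Inclusive range of displayed positions (assumption: both ends included)
type RowRange = { min: number; max: number }

// Drop inverted ranges and merge overlapping or adjacent ones,
// which addresses the "corrupted ranges" concern of option 3.
function normalizeRanges(ranges: RowRange[]): RowRange[] {
  const sorted = ranges
    .filter(r => r.max >= r.min)
    .map(r => ({ min: r.min, max: r.max }))
    .sort((a, b) => a.min - b.min)
  const merged: RowRange[] = []
  for (const r of sorted) {
    const last = merged[merged.length - 1]
    if (last !== undefined && r.min <= last.max + 1) {
      last.max = Math.max(last.max, r.max)
    } else {
      merged.push(r)
    }
  }
  return merged
}

// Expand ranges into row indexes of the original data.
// If `order` is omitted, the table is assumed to be unsorted.
function rangesToRows(ranges: RowRange[], order?: number[]): number[] {
  const rows: number[] = []
  for (const { min, max } of normalizeRanges(ranges)) {
    for (let position = min; position <= max; position++) {
      const row = order === undefined ? position : order[position]
      if (row !== undefined) rows.push(row)
    }
  }
  return rows
}
```

For example, with `order = [4, 2, 0, 1, 3]`, selecting the first two displayed rows yields the original rows 4 and 2.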

@severo
Contributor

severo commented Dec 12, 2024

Note that our implementation of the sort operation already creates an array whose length is the number of rows:

So 1. might also be feasible if sort is feasible.

But for datasets like https://huggingface.co/datasets/HuggingFaceFW/fineweb, not sure it makes sense (22B rows)
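
To put rough numbers on this, the helpers below estimate the memory needed by each representation. The function names are hypothetical. Note that 22B rows exceeds 2^32 (about 4.29 billion), so a 32-bit integer cannot even address every row, and a full index array would need wider entries.

```typescript
// Bytes needed by an array holding one index per row
function indexArrayBytes(numRows: number, bytesPerIndex: number): number {
  return numRows * bytesPerIndex
}

// Bytes needed by a bit set: one bit per row, stored in 32-bit words
function bitSetBytes(numRows: number): number {
  return Math.ceil(numRows / 32) * 4
}

// True if every row index fits in an unsigned 32-bit integer
function fitsInUint32(numRows: number): boolean {
  return numRows <= 2 ** 32
}
```

With `numRows = 22e9`, `indexArrayBytes(22e9, 8)` is 176 GB and `bitSetBytes(22e9)` is 2.75 GB, so both per-row structures look impractical in a browser at that scale.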

severo self-assigned this Dec 18, 2024
@severo
Contributor

severo commented Jan 2, 2025

Would it make sense to express the selected rows as a query, as defined in hyparam/hyparquet#56?

Pros:

  • directly usable by hyparquet
  • we can express the rows as we prefer given the case:
    • small list: id: {$in: [3, 4, 5, 6, 7]}
    • big list: $and: [{id: {$gte: 10000}}, {id: {$lt: 50000}}]
  • serializable

Note that we don't have such a column right now (id). We might decide to use the same convention as MongoDB with the _id field (https://www.mongodb.com/docs/manual/core/document/#the-_id-field) and use _id to refer to the row index (0 for the first row, etc) in the original data (ie: unordered, unsampled, etc).

Alternative: _id refers to the row index at that step (ie: after a sort, or a sampling). Not sure for now which is better.
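
As a sketch of the idea, a selection could be converted to a query as follows. The query shape only mirrors the two examples above; the `_id` column, the `RowQuery` type, and `selectionToQuery` are assumptions for illustration and are not part of hyparquet.

```typescript
// Query shapes taken from the examples above, with `_id` as the assumed row index column
type RowQuery =
  | { _id: { $in: number[] } }
  | { $and: [{ _id: { $gte: number } }, { _id: { $lt: number } }] }

// Pick the most compact form: a half-open range if the rows are contiguous, else a list
function selectionToQuery(rows: number[]): RowQuery {
  const sorted = Array.from(new Set(rows)).sort((a, b) => a - b)
  const first = sorted[0]
  const last = sorted[sorted.length - 1]
  const contiguous =
    first !== undefined &&
    last !== undefined &&
    sorted.length > 1 &&
    last - first + 1 === sorted.length
  if (contiguous) {
    return { $and: [{ _id: { $gte: first } }, { _id: { $lt: last + 1 } }] }
  }
  return { _id: { $in: sorted } }
}
```

Because the result is a plain object, `JSON.stringify(selectionToQuery(rows))` gives a serializable form of the selection. A real implementation would probably need to handle several disjoint ranges as well.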

@severo
Contributor

severo commented Jan 3, 2025

I did some research on how other products (Google Sheets, LibreOffice, Airtable) handle ranges (click, shift+click, ctrl+click...). Each one has its own logic and behavior, and no clear standard emerges. Let's try to do something simple.

My idea is to handle the selection in the hightable component, while hyperparam handles the actions (filter, for example)

@severo
Contributor

severo commented Jan 6, 2025

I'm working on selecting the rows with a pointer. But should we allow using the keyboard too? That would be a big change, because the table cells cannot be focused at all for now.

@platypii
Contributor Author

platypii commented Jan 6, 2025

Let's start with just clicking; we can evaluate keyboard options later.

At one point I had a branch that allowed clicking on a row number to set a selectedRow, and add it to the url hash. This is useful for sharing a deep link to a specific row. It was also intended to fix the problem where a user would double-click to go into a fullscreen cell view, but then hitting the back button would lose the table positioning. This was before the SidePanel and so it's less of an issue now, but I do think it could be a useful feature in general. And it might be a useful mechanism for keyboard navigation.

+  const [selectedRow, setSelectedRow] = useState<number>()
   const scrollRef = useRef<HTMLDivElement>(null)
   const tableRef = useRef<HTMLTableElement>(null)
+  const selectedRowRef = useRef<HTMLTableRowElement>(null)
   const latestRequestRef = useRef(0)
   const pendingRequest = useRef<Promise<void>>()
   const pendingUpdate = useRef(false) // true if user scrolled while fetching
 
   // total scrollable height
   const scrollHeight = (data.numRows + 1) * rowHeight
@@ -125,6 +127,16 @@ export default function HyTable({ data, cacheKey, onDoubleClickCell, setError }:
     }
   }, [data, firstLoad, offsetTop, rows.length, scrollHeight, startIndex, setError])
 
+  // handle deep links from url hash
+  // TODO: update on hash change
+  useEffect(() => {
+    const row = parseInt(window.location.hash.slice(1))
+    if (!isNaN(row)) {
+      // scroll to row after rendering
+      setTimeout(() => linkToRow(row), 0)
+    }
+  }, [])
+
   /**
    * Validate row length
    */
@@ -134,6 +146,17 @@ export default function HyTable({ data, cacheKey, onDoubleClickCell, setError }:
     }
   }
 
+  function linkToRow(row: number) {
+    if (selectedRow === row - 1) {
+      setSelectedRow(undefined)
+      window.history.pushState(undefined, '', window.location.pathname)
+    } else {
+      setSelectedRow(row - 1)
+      window.history.pushState(undefined, '', '#' + row)
+      scrollRef.current?.scrollTo({ top: (row - 1) * rowHeight })
+    }
+  }
@@ -195,8 +219,13 @@ export default function HyTable({ data, cacheKey, onDoubleClickCell, setError }:
               </tr>
             ))}
             {rows.map((row, rowIndex) => (
-              <tr key={startIndex + rowIndex} title={rowError(row, rowIndex)}>
-                <td style={cornerStyle}>
+              <tr
+                className={startIndex + rowIndex === selectedRow ? styles.selectedRow : undefined}
+                id={startIndex + rowIndex === selectedRow ? 'selectedRow' : undefined}
+                key={startIndex + rowIndex}
+                ref={startIndex + rowIndex === selectedRow ? selectedRowRef : undefined}
+                title={rowError(row, rowIndex)}>
+                <td onClick={() => linkToRow(startIndex + rowIndex + 1)} style={cornerStyle}>
                   {(startIndex + rowIndex + 1).toLocaleString()}
                 </td>
+++ b/styles/Table.module.css
@@ -128,6 +127,11 @@
   z-index: 15;
   box-shadow: inset 0 0 4px rgba(0, 0, 0, 0.2);
 }
+/* row numbers */
+.table tbody tr td:first-child {
+  cursor: pointer;
+}
+
 /* mock row numbers */
 .mockRowLabel {
   content: "";
@@ -138,3 +142,10 @@
   background: #eaeaeb;
   z-index: -10;
 }
+
+.selectedRow {
+  background-color: #fbf7bf;
+}
+.selectedRow td:first-child {
+  background-color: #f1edbb;
+}
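
The hash handling in the diff above can be isolated into two small pure functions, sketched here with hypothetical names. As in the diff, the hash holds a 1-based row number while the selected row is stored 0-based. Unlike `parseInt`, this version rejects hashes with trailing characters such as `#12abc`, which is a design choice of this sketch rather than the branch's behavior.

```typescript
// Parse a url hash like "#12" into a zero-based row index, or undefined if invalid
function rowFromHash(hash: string): number | undefined {
  const match = /^#(\d+)$/.exec(hash)
  if (match === null) return undefined
  const rowNumber = Number(match[1])
  // row numbers shown in the table start at 1
  return rowNumber >= 1 ? rowNumber - 1 : undefined
}

// Build the url hash for a zero-based row index
function hashFromRow(rowIndex: number): string {
  return '#' + (rowIndex + 1)
}
```

Keeping these conversions separate from the component would also make it easier to react to later hash changes, which the TODO in the diff leaves open.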

@severo
Contributor

severo commented Jan 6, 2025

OK, cool. I reused the colors and cursor in hyparam/hightable#18. This PR only allows selecting rows and dispatching the selection to the subscribers. Updating the URL would be the task of the hightable client, i.e. in another PR, depending on the UX we want in hyperparam.
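
The split between "the table owns the selection" and "the client reacts to it" can be illustrated with a small subscription sketch. `SelectionStore` and `SelectionListener` are hypothetical names and do not reflect the actual API of hyparam/hightable#18.

```typescript
// Called with a sorted snapshot of the selected row indexes after each change
type SelectionListener = (selectedRows: number[]) => void

class SelectionStore {
  private readonly selected = new Set<number>()
  private readonly listeners = new Set<SelectionListener>()

  // Register a listener; the returned function unsubscribes it
  subscribe(listener: SelectionListener): () => void {
    this.listeners.add(listener)
    return () => {
      this.listeners.delete(listener)
    }
  }

  // Flip the state of one row, then notify every subscriber
  toggle(row: number): void {
    if (!this.selected.delete(row)) this.selected.add(row)
    this.notify()
  }

  private notify(): void {
    const snapshot = Array.from(this.selected).sort((a, b) => a - b)
    this.listeners.forEach(listener => listener(snapshot))
  }
}
```

A client such as hyperparam could subscribe to update the URL or apply a filter, without the table component knowing about either.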

@severo
Contributor

severo commented Jan 7, 2025

In hyparam/hightable#19, I added the ability to toggle the selection for all the rows.

@severo
Contributor

severo commented Jan 7, 2025

In hyparam/hightable#20, I added un/selecting a range, with shift+click.
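
One possible click and shift+click logic, sketched as pure functions. The rule used here is an assumption, not necessarily what hightable#20 does: a plain click toggles the row and becomes the anchor, and a shift+click gives every row between the anchor and the clicked row the anchor's current state.

```typescript
// Copy a selection so that the input is never mutated
function copySelection(selected: ReadonlySet<number>): Set<number> {
  const next = new Set<number>()
  selected.forEach(row => next.add(row))
  return next
}

// Plain click: flip the state of a single row
function toggleRow(selected: ReadonlySet<number>, row: number): Set<number> {
  const next = copySelection(selected)
  if (!next.delete(row)) next.add(row)
  return next
}

// Shift+click: rows from anchor to row (inclusive) take the anchor's state
function toggleRange(selected: ReadonlySet<number>, anchor: number, row: number): Set<number> {
  const next = copySelection(selected)
  const select = selected.has(anchor)
  const from = Math.min(anchor, row)
  const to = Math.max(anchor, row)
  for (let i = from; i <= to; i++) {
    if (select) next.add(i)
    else next.delete(i)
  }
  return next
}
```

Clicking row 2 and then shift+clicking row 5 selects rows 2 to 5. Clicking row 3 afterwards unselects it, and a following shift+click on row 4 unselects rows 3 and 4.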

@severo
Contributor

severo commented Jan 7, 2025

Next steps:

  • ensure it works as expected with ordered data - see "Control the selection and sort" (hightable#22)
  • consume the result in an app, eg hyperparam-cli: delete (hide -> filter out) the selected rows
  • control the selection from an app, eg hyperparam-cli: right click, modal: select rows below
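
The "hide -> filter out" step could be as small as the sketch below, assuming the selection is a set of row indexes in the original data. `visibleRowIndexes` is a hypothetical helper, not an existing hyperparam-cli function.

```typescript
// Return the original indexes of the rows that remain visible
// once the hidden (selected for removal) rows are filtered out
function visibleRowIndexes(numRows: number, hidden: ReadonlySet<number>): number[] {
  const visible: number[] = []
  for (let row = 0; row < numRows; row++) {
    if (!hidden.has(row)) visible.push(row)
  }
  return visible
}
```

Returning original indexes rather than a copy of the data keeps later selections expressed against the same row numbering.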

@severo
Contributor

severo commented Jan 8, 2025

The current implementation shows the following warning:

Warning: You provided a checked prop to a form field without an onChange handler. This will render a read-only field. If the field should be mutable use defaultChecked. Otherwise, set either onChange or readOnly.

  • I'll try to fix it.
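
The warning itself names the two remedies: provide `onChange`, or mark the field `readOnly`. A sketch of choosing between them when building the checkbox props is shown below. `CheckboxProps` is a minimal stand-in type defined for this example, not React's own typings, and `selectionCheckboxProps` is a hypothetical helper.

```typescript
// Minimal stand-in for the relevant <input type="checkbox"> props
type CheckboxProps =
  | { type: 'checkbox'; checked: boolean; onChange: () => void }
  | { type: 'checkbox'; checked: boolean; readOnly: true }

// A controlled checkbox gets either an onChange handler or readOnly,
// so `checked` is never passed alone
function selectionCheckboxProps(checked: boolean, onToggle?: () => void): CheckboxProps {
  return onToggle === undefined
    ? { type: 'checkbox', checked, readOnly: true }
    : { type: 'checkbox', checked, onChange: onToggle }
}
```

The resulting object could be spread onto the input element, so the read-only case applies when the click is handled elsewhere, for example on the surrounding cell.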
