Describe the bug

We are using Ibis and the Databricks Connect (Spark Connect) backend to run queries over large datasets.

However, if I mess up a query and try cancelling it using the "Stop (interrupt) execution" button, then marimo will register the interruption request
but then just continues waiting forever (even after the query has finished in Databricks).
If I push the "Stop (interrupt) execution" button twice, the exception is raised and I get the following trace:
Traceback (most recent call last):
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/core.py", line 648, in __iter__
for response in self._call:
^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/grpc/_channel.py", line 543, in __next__
return self._next()
^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/grpc/_channel.py", line 960, in _next
_common.wait(self._state.condition.wait, _response_ready)
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/grpc/_common.py", line 156, in wait
_wait_once(wait_fn, MAXIMUM_WAIT_TIMEOUT, spin_cb)
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/grpc/_common.py", line 116, in _wait_once
wait_fn(timeout=timeout)
File "/Users/kyrre.wahl.kongsgaard/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/threading.py", line 359, in wait
gotit = waiter.acquire(True, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/marimo/_runtime/handlers.py", line 31, in interrupt_handler
raise MarimoInterrupt
marimo._runtime.control_flow.MarimoInterrupt
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/marimo/_runtime/executor.py", line 142, in execute_cell
return eval(cell.last_expr, glbls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
Cell marimo:///Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/src/data/azure-applications/[service_principal_name].parquet.py#cell=cell-12, line 1, in <module>
azure_application_events.execute()
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/ibis/expr/types/core.py", line 396, in execute
return self._find_backend(use_default=True).execute(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/ibis/backends/pyspark/__init__.py", line 451, in execute
df = query.toPandas() # blocks until finished
^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/dataframe.py", line 2000, in toPandas
pdf, ei = self._session.client.to_pandas(query, self._plan.observations)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/core.py", line 1244, in to_pandas
table, schema, metrics, observed_metrics, _ = self._execute_and_fetch(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/core.py", line 1919, in _execute_and_fetch
for response in self._execute_and_fetch_as_iterator(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/core.py", line 1881, in _execute_and_fetch_as_iterator
for b in generator:
^^^^^^^^^
File "<frozen _collections_abc>", line 356, in __next__
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/reattach.py", line 139, in send
if not self._has_next():
^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/reattach.py", line 172, in _has_next
self._current = self._call_iter(
^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/reattach.py", line 277, in _call_iter
return iter_fun()
^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/reattach.py", line 173, in <lambda>
lambda: next(self._iterator) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/pyspark/sql/connect/client/core.py", line 654, in __iter__
trailers = self._call.trailing_metadata()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/grpc/_channel.py", line 819, in trailing_metadata
_common.wait(self._state.condition.wait, _done)
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/grpc/_common.py", line 156, in wait
_wait_once(wait_fn, MAXIMUM_WAIT_TIMEOUT, spin_cb)
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/grpc/_common.py", line 116, in _wait_once
wait_fn(timeout=timeout)
File "/Users/kyrre.wahl.kongsgaard/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/threading.py", line 359, in wait
gotit = waiter.acquire(True, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/kyrre.wahl.kongsgaard/projects/cdc-dashboards/.venv/lib/python3.12/site-packages/marimo/_runtime/handlers.py", line 31, in interrupt_handler
raise MarimoInterrupt
marimo._runtime.control_flow.MarimoInterrupt
The Spark job will, however, just continue on the Databricks side (and potentially happily scan through hundreds of terabytes without notifying the user 🙀).
Environment

Code to reproduce

import ibis
import marimo as mo
from ibis import _  # deferred expression helper used in .select() below
from databricks.sdk.core import Config
from databricks.connect import DatabricksSession

config = Config(profile="security")
spark = DatabricksSession.builder.sdkConfig(config).getOrCreate()
con = ibis.pyspark.connect(spark)

cloud_app_events = (
    con.table(
        name="cloud_app_events",
        database=("old_security_logs", "mde"),
    )
    .select(_.properties)
    .unpack("properties")
)

# and then try to cancel after running cloud_app_events.execute()
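
One possible stopgap: once the exception finally propagates (after the second button press), cancel the server-side execution explicitly before re-raising. This is only a sketch, reusing spark and cloud_app_events from the repro above; it assumes interruptAll() is available on the Spark Connect session (PySpark >= 3.5 / recent Databricks Connect):

try:
    df = cloud_app_events.execute()
except BaseException:
    # MarimoInterrupt (or Jupyter's KeyboardInterrupt) lands here once it
    # escapes the gRPC wait; cancel everything running on this Spark Connect
    # session so the job stops scanning on the Databricks side.
    spark.interruptAll()  # assumption: Spark Connect API, PySpark >= 3.5
    raise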
Hi @kyrre, we will investigate this and create a Databricks account. In the meantime, can you try this in Jupyter? It will help determine whether this is a bug in marimo or a bug in the library.
I tried this in Jupyter. If I push the "interrupt kernel" button once, the execution will continue and the cell itself will never finish. This is the same as for marimo.
However, if you push the "interrupt kernel" button twice, it will throw a KeyboardInterrupt exception, but now the Spark job will fail with the reason "SPARK_JOB_CANCELLED" (as expected).
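
Since the interrupt is catchable once it propagates (as KeyboardInterrupt in Jupyter, MarimoInterrupt in marimo), a more targeted variant is to tag the operation and cancel only that tag instead of interrupting the whole session. A sketch assuming PySpark >= 3.5's Spark Connect tag API (addTag/interruptTag/removeTag), again reusing spark and cloud_app_events from the repro; the tag name is arbitrary:

spark.addTag("cloud-app-events")  # applies to operations started on this thread
try:
    df = cloud_app_events.execute()
except BaseException:
    spark.interruptTag("cloud-app-events")  # cancels only the tagged operations
    raise
finally:
    spark.removeTag("cloud-app-events")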