Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: fixtures API and consistent / contextual cell.run #3569

Open
dmadisetti opened this issue Jan 24, 2025 · 2 comments
Open

feature: fixtures API and consistent / contextual cell.run #3569

dmadisetti opened this issue Jan 24, 2025 · 2 comments
Assignees

Comments

@dmadisetti
Copy link
Collaborator

dmadisetti commented Jan 24, 2025

I was thinking of writing some tests for persistent_cache and realized utilizing the native pytest ability might be a good way to go about it, such that the tests could leverage pytest's tmp_path.

This is what I came up with

tests/_save/test_persistent_cache.py
from __future__ import annotations

# Run as direct script for access to fixtures.
from _pytest.tmpdir import tmp_path as tmp_path_fixture

import marimo

app = marimo.App()

@app.cell
def fixtures():
  tmp_path_fixture = None

@app.cell
def setup():
  import marimo as mo
  import os
  pc = mo.persistent_cache


@app.cell
def test_general_loader(os, pc, tmp_path_fixture):
  print(tmp_path_fixture)
  assert not os.path.exists(tmp_path_fixture / "basic")
  with pc("basic", save_path=tmp_path_fixture) as cache:
      _b = 1
  assert _b == 1
  assert not cache._cache.hit
  assert os.path.exists(tmp_path_fixture / "basic" / f"C_{cache._cache.hash}.pickle")

@app.cell
def test_general_loader_hit(os, pc, tmp_path_fixture, cache):
  print(tmp_path_fixture)
  with pc("basic", save_path=tmp_path_fixture) as cache_2:
      _b = 1
  assert _b == 1
  assert cache_2._cache.hit
  assert cache._cache.hash == cache_2._cache.hash
  assert os.path.exists(tmp_path_fixture / "basic" / f"C_{cache._cache.hash}.pickle")


@app.cell
def test_json_loader(os, pc, tmp_path_fixture):
  print(tmp_path_fixture)
  assert not os.path.exists(tmp_path_fixture / "json")
  with pc("json", save_path=tmp_path_fixture, method="json") as json_cache:
      _b = 1
  assert _b == 1
  assert not json_cache._cache.hit
  assert os.path.exists(tmp_path_fixture / "json" / f"C_{json_cache._cache.hash}.json")

@app.cell
def test_json_loader_hit(os, pc, tmp_path_fixture, json_cache):
  print(tmp_path_fixture)
  with pc("json", save_path=tmp_path_fixture, method="json") as json_cache_2:
      _b = 1
  assert _b == 1
  assert json_cache_2._cache.hit
  assert json_cache._cache.hash == json_cache_2._cache.hash
  assert os.path.exists(tmp_path_fixture / "json" / f"C_{json_cache._cache.hash}.json")

The fixtures are successfully inserted, but this breaks for 2 reasons:

  1. There is no context

Which is fair and reasonable, it's just cell.run, and might be worth working around in cache (mainly used for UI and state, but using them should be the failure point not cache) (context is definitely needed since graph dependent)

  1. cell.run only injects definitions into the cell where run was called

This was a bit more surprising, but makes sense. So in the script above, test_json_loader_hit fails since the parent test_json_loader runs with tmp_path_fixture = None

It would be nice to have a run_with (maybe run_from?) where if the injected defs provided completely cover the input graph, then those parent nodes are skipped.

edit: Woops hit enter before I was done

@akshayka
Copy link
Contributor

cell.run only injects definitions into the cell where run was called
It would be nice to have a run_with (maybe run_from?) where if the injected defs provided completely cover the input graph, then those parent nodes are skipped.

Oh ... I might consider this a bug. Perhaps we should just update cell.run() to propagate the injected variables to ancestors?

@dmadisetti
Copy link
Collaborator Author

dmadisetti commented Jan 27, 2025

Got you. Wasn't sure if that was the expected behavior. What's the best way to handle the following case?

@app.cell
def dep_1(): # Do we skip this cell all together? Has additional defs and a side effect
    a = 1
    b = a + 1
    print(1)
    return a, b

@app.cell
def dep_2(): # Do we skip this cell all together? Has no other deps, and no side effects
    c = 3
    return c

@app.cell
def intermediate(b):
    d = b * 2
    print(2)
    return d

@app.cell
def to_run(a, c, d):
    e = a + c + d
    e
    return e


to_run.run(a=7, c=8) # print 1?

Think on it first, and then here's my thoughts hidden to prevent bias

I think this should create an error. Overloaded parameters should cover should cover all defs in their cells. Error in the above case can be as simple as:

Run Exception:  Consider defining `b` in a separate cell from `a`, or overloading a value for b

Then with the cell defs defined, that block is skipped. So the above example prints 2, but not 1

@dmadisetti dmadisetti self-assigned this Jan 28, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants