implement metric steps with tests #1788

sfc-gh-mchok · 2024-10-23T20:55:11Z

Pre-review checklist

I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
I've added or updated automated unit tests to verify correctness of my new code.
I've added or updated integration tests to verify correctness of my new code.
I've confirmed that my changes are working by executing CLI's commands manually on MacOS.
I've confirmed that my changes are working by executing CLI's commands manually on Windows.
I've confirmed that my changes are up-to-date with the target branch.
I've described my changes in the release notes.
I've described my changes in the section below.

Changes description

To support instrumenting commands, we're implementing step tracking for a command.
We support both a context manager and manual start_step/end_step calls, for flexible use cases.
See the implemented test plan for the general contract.

sfc-gh-mchok · 2024-10-23T20:56:18Z

tests/api/test_metrics_counters.py

tests were getting long, so I separated the two different use cases in their own test files

sfc-gh-fcampbell · 2024-10-23T21:02:53Z

src/snowflake/cli/api/metrics.py

+    def __init__(self, message: str):
+        super().__init__(message)


Suggested change

-    def __init__(self, message: str):
-        super().__init__(message)
+    pass

sfc-gh-fcampbell · 2024-10-23T21:03:39Z

src/snowflake/cli/api/metrics.py

+    def to_dict(self) -> Dict[str, Union[str, int, float, None]]:
+        return {
+            self.ID_KEY: self.step_id,
+            self.NAME_KEY: self.name,
+            self.PARENT_KEY: self.parent,
+            self.PARENT_ID_KEY: self.parent_id,
+            self.START_TIME_KEY: self.start_time,
+            self.EXECUTION_TIME_KEY: self.execution_time,
+            self.ERROR_KEY: self.error,
+        }


can this class be an @dataclass? then we wouldn't need this method or all the getters
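
A minimal sketch of what the dataclass version might look like, assuming the fields mirror the keys returned by `to_dict` above. `StepRecord` and its field names are hypothetical stand-ins, not the PR's code:

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class StepRecord:
    # Hypothetical stand-in for _CLIMetricsStep; field names are assumed.
    step_id: int
    name: str
    parent: Optional[str] = None
    parent_id: Optional[int] = None
    start_time: float = 0.0
    execution_time: Optional[float] = None
    error: Optional[str] = None


record = StepRecord(step_id=1, name="step1")
# asdict() stands in for a hand-written to_dict(); plain attributes replace getters.
as_dict = asdict(record)
```

Note that `asdict` uses the field names as keys, so any custom key constants would need a separate mapping.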

sfc-gh-fcampbell · 2024-10-23T21:05:15Z

src/snowflake/cli/api/metrics.py

+        new_step = _CLIMetricsStep(name, parent_step)
+        self._executing_steps.append(new_step)
+        new_step.start()
+        return new_step.step_id


what's the use-case for only returning the ID? can we return the whole step?

esp since, AFAICT, step IDs aren't globally unique. Ideally for a start/stop scenario, one should be able to obtain some kind of unique token / structure from the start method, and pass that unmodified to the stop method.
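
A rough sketch of the start/stop contract described here, where the object returned by the start method is the token passed back to the stop method. `Step` and `Metrics` are hypothetical classes written for illustration, not the PR's implementation:

```python
import time
from typing import List, Optional


class Step:
    # Hypothetical step object; not the PR's _CLIMetricsStep.
    def __init__(self, name: str):
        self.name = name
        self.start_time = time.monotonic()
        self.execution_time: Optional[float] = None


class Metrics:
    # Hypothetical tracker used only to illustrate the start/stop contract.
    def __init__(self) -> None:
        self.in_progress: List[Step] = []
        self.finished: List[Step] = []

    def start_step(self, name: str) -> Step:
        step = Step(name)
        self.in_progress.append(step)
        return step  # the caller keeps this as an opaque token

    def end_step(self, step: Step) -> None:
        # Identity comparison: two steps with the same name never collide.
        self.in_progress = [s for s in self.in_progress if s is not step]
        step.execution_time = time.monotonic() - step.start_time
        self.finished.append(step)


metrics = Metrics()
first = metrics.start_step("duplicate")
second = metrics.start_step("duplicate")
metrics.end_step(first)
```

Because the token is the step itself, ending `first` leaves `second` in progress even though both share a name.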

sfc-gh-fcampbell · 2024-10-23T21:05:33Z

src/snowflake/cli/api/metrics.py

+        """
+        step_id = self.start_step(name)
+        try:
+            yield


could we yield the whole step here?

sfc-gh-fcampbell · 2024-10-23T21:08:14Z

tests/api/test_metrics_steps.py

+def test_metrics_steps_time_is_valid():
+    # given
+    metrics = CLIMetrics()
+
+    # when
+    with metrics.track_step("step1"):
+        pass
+
+    # then
+    assert len(metrics.steps) == 1
+    step1 = metrics.steps[0]
+    assert step1[_CLIMetricsStep.NAME_KEY] == "step1"
+
+
+def test_metrics_steps_name_is_valid():


i think the names of these two tests are reversed

sfc-gh-fcampbell · 2024-10-23T21:09:03Z

tests/api/test_metrics_steps.py

+def test_metrics_steps_error_caught():
+    # given
+    metrics = CLIMetrics()
+
+    # when
+    with metrics.track_step("step1"):
+        try:
+            raise RuntimeError()
+        except RuntimeError:
+            pass
+
+    # then
+    assert len(metrics.steps) == 1
+    step1 = metrics.steps[0]
+    assert step1[_CLIMetricsStep.ERROR_KEY] is None


this test is a bit redundant, it's obvious that the exception is caught and not propagated

sfc-gh-bdufour

(partial review)

sfc-gh-bdufour · 2024-10-24T13:02:02Z

src/snowflake/cli/api/metrics.py

+        self._parent: Optional[str] = parent.name if parent is not None else None
+        self._parent_id: Optional[int] = parent.step_id if parent is not None else None


nit: not a huge fan of expanding things this way. Why can't we just keep a ref to the parent and read the data on the fly as needed?
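
A small sketch of the alternative, assuming the step only stores a reference to its parent and derives the name and id on demand. The `Step` class and property names here are hypothetical:

```python
from typing import Optional


class Step:
    # Hypothetical step; only the parent handling is illustrated.
    def __init__(self, step_id: int, name: str, parent: Optional["Step"] = None):
        self.step_id = step_id
        self.name = name
        self._parent = parent  # keep the reference, not copies of its fields

    @property
    def parent_name(self) -> Optional[str]:
        return self._parent.name if self._parent is not None else None

    @property
    def parent_id(self) -> Optional[int]:
        return self._parent.step_id if self._parent is not None else None


root = Step(1, "parent")
leaf = Step(2, "child", parent=root)
```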

sfc-gh-bdufour · 2024-10-24T13:27:05Z

src/snowflake/cli/api/metrics.py

+        if error:
+            self._error = type(error).__name__


why not preserve the entire object?

sfc-gh-bdufour · 2024-10-24T13:29:04Z

src/snowflake/cli/api/metrics.py

+        """
+        return self._error
+
+    def to_dict(self) -> Dict[str, Union[str, int, float, None]]:


this is for reporting? I'd add a comment to that effect, since it makes this method a contract that should be stable.

sfc-gh-bdufour · 2024-10-24T13:29:54Z

src/snowflake/cli/api/metrics.py

+        return self._executing_steps[-1] if len(self._executing_steps) > 0 else None
+
+    @contextmanager
+    def track_step(self, name: str):


let's workshop the name

AFAIK the common nomenclature is "trace", which are composed of "spans". datadog/prometheus/opentelemetry use this: https://opentelemetry.io/docs/concepts/signals/traces/

sfc-gh-bdufour · 2024-10-24T13:30:27Z

src/snowflake/cli/api/metrics.py

 class CLIMetrics:
    """
    Class to track various metrics across the execution of a command
    """

    def __init__(self):
        self._counters: Dict[str, int] = {}
+        # stack of current steps as command is executing
+        self._executing_steps: List[_CLIMetricsStep] = []


executing -> in progress maybe?

sfc-gh-bdufour · 2024-10-24T14:05:51Z

src/snowflake/cli/api/metrics.py

+            if step_id or step_name:
+                raise CLIMetricsInvalidUsageError(
+                    f"step with {'id' if step_id else 'name'} '{step_id or step_name}' could not be ended because it could not be found"
+                )


this hurts my brain. if step_id elif step_name else would be a lot clearer.
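
One way the branches could be spelled out, as a sketch. `describe_missing_step` is a hypothetical helper that only builds the message; it keeps the truthiness checks of the original condition:

```python
from typing import Optional


def describe_missing_step(
    step_id: Optional[int], step_name: Optional[str]
) -> Optional[str]:
    # Hypothetical helper; returns None when neither selector was given.
    if step_id:
        selector = f"id '{step_id}'"
    elif step_name:
        selector = f"name '{step_name}'"
    else:
        return None
    return f"step with {selector} could not be ended because it could not be found"
```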

sfc-gh-fcampbell · 2024-10-24T14:05:56Z

src/snowflake/cli/api/metrics.py

+    class for holding metrics step data and encapsulating related operations
+    """
+
+    _id_counter = count(start=1, step=1)


since steps end up in a single snowflake table, the ID should probably be a UUID (in case we flatten the array of steps to multiple rows in a query)
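
A minimal sketch of UUID-based ids using the standard library. `new_step_id` is a hypothetical helper, and the choice of the hex string form is an assumption:

```python
import uuid


def new_step_id() -> str:
    # Hypothetical helper: uuid4 is random, so a collision between ids from
    # different CLI runs is extremely unlikely, unlike a per-process
    # itertools.count that restarts at 1 on every invocation.
    return uuid.uuid4().hex


first_id = new_step_id()
second_id = new_step_id()
```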

sfc-gh-bdufour · 2024-10-24T14:08:17Z

src/snowflake/cli/api/metrics.py

+        Useful if you are using the manual start/end steps and an error
+        propagated up, requiring you to clear out all the executing steps


doesn't say what this actually does. Please document the contract of the method.

sfc-gh-bdufour · 2024-10-24T14:10:01Z

src/snowflake/cli/api/metrics.py

+        while self._current_step:
+            self.end_step(error=error)


flush doesn't usually mean this. Also, that doesn't make a lot of sense to me. It's unrolling the entire stack. It can only be safely called in a single place, right before reporting I guess? Even then it's a bit of a tricky contract.

sfc-gh-bdufour · 2024-10-24T14:10:17Z

src/snowflake/cli/api/metrics.py

+        self._executing_steps.remove(found_step)
+        self._finished_steps.append(found_step)
+
+    def flush_steps(self, error: Optional[BaseException] = None) -> None:


I can't see a use case for calling this without an error. Can you?

sfc-gh-bdufour · 2024-10-24T14:11:55Z

src/snowflake/cli/api/metrics.py

+    def steps(self) -> List[Dict[str, Union[str, int, float, None]]]:
+        """
+        returns the finished steps tracked throughout a command, sorted by start time
+        """


I have no idea what the return type represents here, other than it's a list of dictionaries. Would List[dict] suffice? Basically you're returning a list of JSON objects.

sfc-gh-bdufour · 2024-10-24T14:14:03Z

src/snowflake/cli/api/metrics.py

+        new_step.start()
+        return new_step.step_id
+
+    def end_step(


have we considered modelling this as a self-contained Step structure, such that start/stop would be instance methods of a step? It would fit nicely with the context manager abstraction as well. I think it would help simplify the implementation here quite a bit too.
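
A sketch of what a self-contained step might look like, assuming `start` and `stop` live on the step and the context manager is a thin wrapper over them. The class and its fields are hypothetical:

```python
import time
from typing import Optional


class Step:
    # Hypothetical self-contained step; names and fields are assumptions.
    def __init__(self, name: str):
        self.name = name
        self.start_time: Optional[float] = None
        self.execution_time: Optional[float] = None
        self.error: Optional[str] = None

    def start(self) -> "Step":
        self.start_time = time.monotonic()
        return self

    def stop(self, error: Optional[BaseException] = None) -> None:
        if self.start_time is None:
            raise RuntimeError(f"step '{self.name}' was never started")
        self.execution_time = time.monotonic() - self.start_time
        if error is not None:
            self.error = type(error).__name__

    # The context manager protocol simply delegates to start()/stop().
    def __enter__(self) -> "Step":
        return self.start()

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.stop(error=exc)
        return False  # never swallow the exception


with Step("auto") as auto_step:
    pass

manual_step = Step("manual").start()
manual_step.stop(error=ValueError("boom"))
```

Both usage styles go through the same two methods, so the manual and context-manager paths cannot drift apart.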

sfc-gh-bdufour · 2024-10-24T14:14:34Z

src/snowflake/cli/api/metrics.py

+        """
+        return [
+            step.to_dict()
+            for step in sorted(self._finished_steps, key=lambda step: step.start_time)


it's called steps but only returns the completed ones. A bit misleading?

sfc-gh-bdufour · 2024-10-24T14:16:05Z

tests/api/test_metrics_steps.py

+    # then
+    assert len(metrics.steps) == 1
+    step1 = metrics.steps[0]
+    assert step1[_CLIMetricsStep.NAME_KEY] == "step1"


please just merge a bunch of assertions into the same test. There is no need to write a single test case per field in the output.

sfc-gh-bdufour · 2024-10-24T14:17:01Z

tests/api/test_metrics_steps.py

+        try:
+            raise RuntimeError()
+        except RuntimeError:
+            pass


you're testing the python interpreter here. No value added.

sfc-gh-bdufour · 2024-10-24T14:17:48Z

tests/api/test_metrics_steps.py

+    except RuntimeError:
+        pass


never do this in tests. assert that the exception is indeed propagated instead. Pytest makes that trivial.
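
For illustration, a sketch of asserting propagation with `pytest.raises`. The `track_step` function below is a hypothetical stand-in that records the error type and re-raises; it is not the PR's `CLIMetrics.track_step`:

```python
from contextlib import contextmanager

import pytest


@contextmanager
def track_step(name, errors):
    # Hypothetical stand-in: record the error type, then re-raise it.
    try:
        yield
    except BaseException as err:
        errors.append((name, type(err).__name__))
        raise


def test_error_is_recorded_and_propagated():
    errors = []
    # pytest.raises fails the test if RuntimeError does not escape the block.
    with pytest.raises(RuntimeError):
        with track_step("step1", errors):
            raise RuntimeError()
    assert errors == [("step1", "RuntimeError")]
```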

sfc-gh-bdufour · 2024-10-24T14:19:37Z

tests/api/test_metrics_steps.py

+    assert child[_CLIMetricsStep.NAME_KEY] == "child"
+    assert child[_CLIMetricsStep.PARENT_KEY] == "parent"
+    assert parent[_CLIMetricsStep.NAME_KEY] == "parent"


weak checks. I'd make sure the returned info looks right more generally. Did both steps get terminated properly, etc

sfc-gh-bdufour · 2024-10-24T14:21:52Z

tests/api/test_metrics_steps.py

+    with metrics.track_step("duplicate"):
+        with metrics.track_step("duplicate"):
+            pass


this example worries me a little. We could blow up the metrics if we try to time a recursive method. Should this be allowed?

sfc-gh-bdufour · 2024-10-24T14:23:22Z

tests/api/test_metrics_steps.py

+    parent, child = metrics.steps
+
+    assert (
+        child[_CLIMetricsStep.START_TIME_KEY] > parent[_CLIMetricsStep.START_TIME_KEY]


likely flaky, would use >= at least since there is no guarantee that execution will keep being slow enough to trigger a different start time for both steps.
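
A tiny sketch of why the strict comparison can fail, assuming start times come from a clock such as `time.monotonic` (the clock used by the PR is not shown here):

```python
import time

parent_start = time.monotonic()
child_start = time.monotonic()

# Two back-to-back reads may return the same value on a coarse clock,
# so only a non-strict ordering is guaranteed.
assert child_start >= parent_start
```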

sfc-gh-bdufour · 2024-10-24T14:23:49Z

tests/api/test_metrics_steps.py

+    with metrics.track_step("parent"):
+        try:
+            with metrics.track_step("child"):
+                raise RuntimeError()
+        except RuntimeError:
+            pass


meh, you're testing the python interpreter again

sfc-gh-bdufour · 2024-10-24T14:24:54Z

tests/api/test_metrics_steps.py

+        > child[_CLIMetricsStep.EXECUTION_TIME_KEY]
+    )
+
+


missing a test for multiple children under the same parent, esp multiple with the same name but diff ids?

sfc-gh-bdufour · 2024-10-24T14:25:11Z

tests/api/test_metrics_steps.py

+    assert child[_CLIMetricsStep.ERROR_KEY] == "RuntimeError"
+
+
+def test_metrics_steps_manual_start_and_end_overlapping_proper_parent():


what's a "proper" parent?

sfc-gh-bdufour · 2024-10-24T14:26:17Z

tests/api/test_metrics_steps.py

+    step1_id = metrics.start_step("step1")
+    step2_id = metrics.start_step("step2")
+
+    metrics.end_step(step_id=step2_id)
+    metrics.end_step(step_id=step1_id)


I don't get the reference to a parent in the test case name, it looks like there's no nesting at play here.

sfc-gh-bdufour · 2024-10-24T14:27:04Z

tests/api/test_metrics_steps.py

+    assert len(metrics.steps) == 2
+    step1, step2 = metrics.steps
+
+    assert step2[_CLIMetricsStep.PARENT_KEY] == "step1"


yikes, that's surprising. Steps are implicitly nested? Let's discuss offline.

sfc-gh-bdufour · 2024-10-24T14:27:30Z

tests/api/test_metrics_steps.py

+        step2[_CLIMetricsStep.START_TIME_KEY]
+        + step2[_CLIMetricsStep.EXECUTION_TIME_KEY]
+    )
+    assert step1_end_time > step2_end_time


ditto, could be flaky

implement metric steps with tests

1a3aae7

sfc-gh-mchok requested a review from a team as a code owner October 23, 2024 20:55

Merge branch 'main' into mchok-cli-metrics-steps

fff03b1

sfc-gh-mchok commented Oct 23, 2024

sfc-gh-fcampbell reviewed Oct 23, 2024

sfc-gh-bdufour reviewed Oct 24, 2024

sfc-gh-fcampbell reviewed Oct 24, 2024

sfc-gh-bdufour reviewed Oct 24, 2024

Merge branch 'main' into mchok-cli-metrics-steps

b2927c3
