Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement metric steps with tests #1788

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

sfc-gh-mchok
Copy link
Collaborator

Pre-review checklist

  • I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
  • I've added or updated automated unit tests to verify correctness of my new code.
  • I've added or updated integration tests to verify correctness of my new code.
  • I've confirmed that my changes are working by executing CLI's commands manually on MacOS.
  • I've confirmed that my changes are working by executing CLI's commands manually on Windows.
  • I've confirmed that my changes are up-to-date with the target branch.
  • I've described my changes in the release notes.
  • I've described my changes in the section below.

Changes description

To support instrumenting commands, we're implementing tracking steps of a command
We support both a context manager and manually calling start/end_step functions for flexible use cases
See the implemented test plan for the general contract

@sfc-gh-mchok sfc-gh-mchok requested a review from a team as a code owner October 23, 2024 20:55
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests were getting long, so I separated the two different use cases in their own test files

Comment on lines +27 to +28
def __init__(self, message: str):
super().__init__(message)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def __init__(self, message: str):
super().__init__(message)
pass

Comment on lines +159 to +168
def to_dict(self) -> Dict[str, Union[str, int, float, None]]:
return {
self.ID_KEY: self.step_id,
self.NAME_KEY: self.name,
self.PARENT_KEY: self.parent,
self.PARENT_ID_KEY: self.parent_id,
self.START_TIME_KEY: self.start_time,
self.EXECUTION_TIME_KEY: self.execution_time,
self.ERROR_KEY: self.error,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this class be an @dataclass? then we wouldn't need this method or all the getters

new_step = _CLIMetricsStep(name, parent_step)
self._executing_steps.append(new_step)
new_step.start()
return new_step.step_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the use-case for only returning the ID? can we return the whole step?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

esp since, AFAICT, step IDs aren't globally unique. Ideally for a start/stop scenario, one should be able to obtain some kind of unique token / structure from the start method, and pass that unmodified to the stop method.

"""
step_id = self.start_step(name)
try:
yield
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we yield the whole step here?

Comment on lines +34 to +48
def test_metrics_steps_time_is_valid():
# given
metrics = CLIMetrics()

# when
with metrics.track_step("step1"):
pass

# then
assert len(metrics.steps) == 1
step1 = metrics.steps[0]
assert step1[_CLIMetricsStep.NAME_KEY] == "step1"


def test_metrics_steps_name_is_valid():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the names of these two tests are reversed

Comment on lines +64 to +78
def test_metrics_steps_error_caught():
# given
metrics = CLIMetrics()

# when
with metrics.track_step("step1"):
try:
raise RuntimeError()
except RuntimeError:
pass

# then
assert len(metrics.steps) == 1
step1 = metrics.steps[0]
assert step1[_CLIMetricsStep.ERROR_KEY] is None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test is a bit redundant, it's obvious that the exception is caught and not propagated

Copy link
Contributor

@sfc-gh-bdufour sfc-gh-bdufour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(partial review)

Comment on lines +94 to +95
self._parent: Optional[str] = parent.name if parent is not None else None
self._parent_id: Optional[int] = parent.step_id if parent is not None else None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: not a huge fan of expanding things this way. Why can't we just keep a ref to the parent and read the data on the fly as needed?

Comment on lines +117 to +118
if error:
self._error = type(error).__name__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not preserve the entire object?

"""
return self._error

def to_dict(self) -> Dict[str, Union[str, int, float, None]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for reporting? I'd add a comment to that effect, since it makes this method a contract that should be stable.

return self._executing_steps[-1] if len(self._executing_steps) > 0 else None

@contextmanager
def track_step(self, name: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's workshop the name

Copy link
Contributor

@sfc-gh-fcampbell sfc-gh-fcampbell Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK the common nomenclature is "trace", which are composed of "spans". datadog/prometheus/opentelemetry use this: https://opentelemetry.io/docs/concepts/signals/traces/

class CLIMetrics:
"""
Class to track various metrics across the execution of a command
"""

def __init__(self):
self._counters: Dict[str, int] = {}
# stack of current steps as command is executing
self._executing_steps: List[_CLIMetricsStep] = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

executing -> in progress maybe?

Comment on lines +281 to +284
if step_id or step_name:
raise CLIMetricsInvalidUsageError(
f"step with {'id' if step_id else 'name'} '{step_id or step_name}' could not be ended because it could not be found"
)
Copy link
Contributor

@sfc-gh-bdufour sfc-gh-bdufour Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this hurts my brain. if step_id elif step_name else would be a lot clearer.

class for holding metrics step data and encapsulating related operations
"""

_id_counter = count(start=1, step=1)
Copy link
Contributor

@sfc-gh-fcampbell sfc-gh-fcampbell Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since steps end up in a single snowflake table, the ID should probably be a UUID (in case we flatten the array of steps to multiple rows in a query)

Comment on lines +296 to +297
Useful if you are using the manual start/end steps and an error
propagated up, requiring you to clear out all the executing steps
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't say what this actually does. Please document the contract of the method.

Comment on lines +299 to +300
while self._current_step:
self.end_step(error=error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flush doesn't usually mean this. Also, that doesn't make a lot of sense to me. It's unrolling the entire stack. It can only be safely called in a single place, right before reporting I guess? Even then it's a bit of a tricky contract.

self._executing_steps.remove(found_step)
self._finished_steps.append(found_step)

def flush_steps(self, error: Optional[BaseException] = None) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see a use case for calling this without an error. Can you?

Comment on lines +308 to +311
def steps(self) -> List[Dict[str, Union[str, int, float, None]]]:
"""
returns the finished steps tracked throughout a command, sorted by start time
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what the return type represents here, other than it's a list of dictionaries. Would List[dict] suffice? Basically you're returning a list of JSON objects.

new_step.start()
return new_step.step_id

def end_step(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have we considered modelling this as a self-contained Step structure, such that start/stop would be instance methods of a step? It would fit nicely with the context manager abstraction as well. I think it would help simplify the implementation here quite a bit too.

"""
return [
step.to_dict()
for step in sorted(self._finished_steps, key=lambda step: step.start_time)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's called steps but only returns the completed ones. A bit misleading?

# then
assert len(metrics.steps) == 1
step1 = metrics.steps[0]
assert step1[_CLIMetricsStep.NAME_KEY] == "step1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please just merge a bunch of assertions into the same test. There is no need to write a single test case per field in the output.

Comment on lines +70 to +73
try:
raise RuntimeError()
except RuntimeError:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're testing the python interpreter here. No value added.

Comment on lines +89 to +90
except RuntimeError:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never do this in tests. assert that the exception is indeed propagated instead. Pytest makes that trivial.

Comment on lines +125 to +127
assert child[_CLIMetricsStep.NAME_KEY] == "child"
assert child[_CLIMetricsStep.PARENT_KEY] == "parent"
assert parent[_CLIMetricsStep.NAME_KEY] == "parent"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weak checks. I'd make sure the returned info looks right more generally. Did both steps get terminated properly, etc

Comment on lines +152 to +154
with metrics.track_step("duplicate"):
with metrics.track_step("duplicate"):
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this example worries me a little. We could blow up the metrics if we try to time a recursive method. Should this be allowed?

parent, child = metrics.steps

assert (
child[_CLIMetricsStep.START_TIME_KEY] > parent[_CLIMetricsStep.START_TIME_KEY]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likely flaky, would use >= at least since there is no guarantee that execution will keep being slow enough to trigger a different start time for both steps.

Comment on lines +174 to +179
with metrics.track_step("parent"):
try:
with metrics.track_step("child"):
raise RuntimeError()
except RuntimeError:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meh, you're testing the python interpreter again

> child[_CLIMetricsStep.EXECUTION_TIME_KEY]
)


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing a test for multiple children under the same parent, esp multiple with the same name but diff ids?

assert child[_CLIMetricsStep.ERROR_KEY] == "RuntimeError"


def test_metrics_steps_manual_start_and_end_overlapping_proper_parent():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's a "proper" parent?

Comment on lines +193 to +197
step1_id = metrics.start_step("step1")
step2_id = metrics.start_step("step2")

metrics.end_step(step_id=step2_id)
metrics.end_step(step_id=step1_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the reference to a parent in the test case name, it looks like there's no nesting at play here.

assert len(metrics.steps) == 2
step1, step2 = metrics.steps

assert step2[_CLIMetricsStep.PARENT_KEY] == "step1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yikes, that's surprising. Steps are implicitly nested? Let's discuss offline.

step2[_CLIMetricsStep.START_TIME_KEY]
+ step2[_CLIMetricsStep.EXECUTION_TIME_KEY]
)
assert step1_end_time > step2_end_time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, could be flaky

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants