-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement metric steps with tests #1788
base: main
Are you sure you want to change the base?
Conversation
tests/api/test_metrics_counters.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests were getting long, so I separated the two different use cases in their own test files
def __init__(self, message: str): | ||
super().__init__(message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def __init__(self, message: str): | |
super().__init__(message) | |
pass |
def to_dict(self) -> Dict[str, Union[str, int, float, None]]: | ||
return { | ||
self.ID_KEY: self.step_id, | ||
self.NAME_KEY: self.name, | ||
self.PARENT_KEY: self.parent, | ||
self.PARENT_ID_KEY: self.parent_id, | ||
self.START_TIME_KEY: self.start_time, | ||
self.EXECUTION_TIME_KEY: self.execution_time, | ||
self.ERROR_KEY: self.error, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this class be an @dataclass
? then we wouldn't need this method or all the getters
new_step = _CLIMetricsStep(name, parent_step) | ||
self._executing_steps.append(new_step) | ||
new_step.start() | ||
return new_step.step_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the use-case for only returning the ID? can we return the whole step?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
esp since, AFAICT, step IDs aren't globally unique. Ideally for a start/stop scenario, one should be able to obtain some kind of unique token / structure from the start method, and pass that unmodified to the stop method.
""" | ||
step_id = self.start_step(name) | ||
try: | ||
yield |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we yield the whole step here?
def test_metrics_steps_time_is_valid(): | ||
# given | ||
metrics = CLIMetrics() | ||
|
||
# when | ||
with metrics.track_step("step1"): | ||
pass | ||
|
||
# then | ||
assert len(metrics.steps) == 1 | ||
step1 = metrics.steps[0] | ||
assert step1[_CLIMetricsStep.NAME_KEY] == "step1" | ||
|
||
|
||
def test_metrics_steps_name_is_valid(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the names of these two tests are reversed
def test_metrics_steps_error_caught(): | ||
# given | ||
metrics = CLIMetrics() | ||
|
||
# when | ||
with metrics.track_step("step1"): | ||
try: | ||
raise RuntimeError() | ||
except RuntimeError: | ||
pass | ||
|
||
# then | ||
assert len(metrics.steps) == 1 | ||
step1 = metrics.steps[0] | ||
assert step1[_CLIMetricsStep.ERROR_KEY] is None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test is a bit redundant, it's obvious that the exception is caught and not propagated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(partial review)
self._parent: Optional[str] = parent.name if parent is not None else None | ||
self._parent_id: Optional[int] = parent.step_id if parent is not None else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: not a huge fan of expanding things this way. Why can't we just keep a ref to the parent and read the data on the fly as needed?
if error: | ||
self._error = type(error).__name__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not preserve the entire object?
""" | ||
return self._error | ||
|
||
def to_dict(self) -> Dict[str, Union[str, int, float, None]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is for reporting? I'd add a comment to that effect, since it makes this method a contract that should be stable.
return self._executing_steps[-1] if len(self._executing_steps) > 0 else None | ||
|
||
@contextmanager | ||
def track_step(self, name: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's workshop the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK the common nomenclature is "trace", which are composed of "spans". datadog/prometheus/opentelemetry use this: https://opentelemetry.io/docs/concepts/signals/traces/
class CLIMetrics: | ||
""" | ||
Class to track various metrics across the execution of a command | ||
""" | ||
|
||
def __init__(self): | ||
self._counters: Dict[str, int] = {} | ||
# stack of current steps as command is executing | ||
self._executing_steps: List[_CLIMetricsStep] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
executing -> in progress maybe?
if step_id or step_name: | ||
raise CLIMetricsInvalidUsageError( | ||
f"step with {'id' if step_id else 'name'} '{step_id or step_name}' could not be ended because it could not be found" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this hurts my brain. if step_id elif step_name else
would be a lot clearer.
class for holding metrics step data and encapsulating related operations | ||
""" | ||
|
||
_id_counter = count(start=1, step=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since steps end up in a single snowflake table, the ID should probably be a UUID (in case we flatten the array of steps to multiple rows in a query)
Useful if you are using the manual start/end steps and an error | ||
propagated up, requiring you to clear out all the executing steps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't say what this actually does. Please document the contract of the method.
while self._current_step: | ||
self.end_step(error=error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flush doesn't usually mean this. Also, that doesn't make a lot of sense to me. It's unrolling the entire stack. It can only be safely called in a single place, right before reporting I guess? Even then it's a bit of a tricky contract.
self._executing_steps.remove(found_step) | ||
self._finished_steps.append(found_step) | ||
|
||
def flush_steps(self, error: Optional[BaseException] = None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see a use case for calling this without an error. Can you?
def steps(self) -> List[Dict[str, Union[str, int, float, None]]]: | ||
""" | ||
returns the finished steps tracked throughout a command, sorted by start time | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what the return type represents here, other than it's a list of dictionaries. Would List[dict]
suffice? Basically you're returning a list of JSON objects.
new_step.start() | ||
return new_step.step_id | ||
|
||
def end_step( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have we considered modelling this as a self-contained Step structure, such that start/stop would be instance methods of a step? It would fit nicely with the context manager abstraction as well. I think it would help simplify the implementation here quite a bit too.
""" | ||
return [ | ||
step.to_dict() | ||
for step in sorted(self._finished_steps, key=lambda step: step.start_time) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's called steps
but only returns the completed ones. A bit misleading?
# then | ||
assert len(metrics.steps) == 1 | ||
step1 = metrics.steps[0] | ||
assert step1[_CLIMetricsStep.NAME_KEY] == "step1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please just merge a bunch of assertions into the same test. There is no need to write a single test case per field in the output.
try: | ||
raise RuntimeError() | ||
except RuntimeError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're testing the python interpreter here. No value added.
except RuntimeError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never do this in tests. assert that the exception is indeed propagated instead. Pytest makes that trivial.
assert child[_CLIMetricsStep.NAME_KEY] == "child" | ||
assert child[_CLIMetricsStep.PARENT_KEY] == "parent" | ||
assert parent[_CLIMetricsStep.NAME_KEY] == "parent" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weak checks. I'd make sure the returned info looks right more generally. Did both steps get terminated properly, etc
with metrics.track_step("duplicate"): | ||
with metrics.track_step("duplicate"): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this example worries me a little. We could blow up the metrics if we try to time a recursive method. Should this be allowed?
parent, child = metrics.steps | ||
|
||
assert ( | ||
child[_CLIMetricsStep.START_TIME_KEY] > parent[_CLIMetricsStep.START_TIME_KEY] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likely flaky, would use >=
at least since there is no guarantee that execution will keep being slow enough to trigger a different start time for both steps.
with metrics.track_step("parent"): | ||
try: | ||
with metrics.track_step("child"): | ||
raise RuntimeError() | ||
except RuntimeError: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meh, you're testing the python interpreter again
> child[_CLIMetricsStep.EXECUTION_TIME_KEY] | ||
) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing a test for multiple children under the same parent, esp multiple with the same name but diff ids?
assert child[_CLIMetricsStep.ERROR_KEY] == "RuntimeError" | ||
|
||
|
||
def test_metrics_steps_manual_start_and_end_overlapping_proper_parent(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's a "proper" parent?
step1_id = metrics.start_step("step1") | ||
step2_id = metrics.start_step("step2") | ||
|
||
metrics.end_step(step_id=step2_id) | ||
metrics.end_step(step_id=step1_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get the reference to a parent in the test case name, it looks like there's no nesting at play here.
assert len(metrics.steps) == 2 | ||
step1, step2 = metrics.steps | ||
|
||
assert step2[_CLIMetricsStep.PARENT_KEY] == "step1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yikes, that's surprising. Steps are implicitly nested? Let's discuss offline.
step2[_CLIMetricsStep.START_TIME_KEY] | ||
+ step2[_CLIMetricsStep.EXECUTION_TIME_KEY] | ||
) | ||
assert step1_end_time > step2_end_time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, could be flaky
Pre-review checklist
Changes description
To support instrumenting commands, we're implementing tracking steps of a command
We support both a context manager and manually calling start/end_step functions for flexible use cases
See the implemented test plan for the general contract