SNOW-1805840: Augment telemetry with method_call_count #2804
base: main
Can you add a test where 2 different methods are called? Also, how does this interact with query compiler methods that call other query compiler methods?
I've added a test that calls two different methods on the same QC now.
Trying out align, it returns telemetry with its own func name, for example:
[{'func_name': 'DataFrame.align', 'category': 'snowpark_pandas', 'error_msg': None, 'call_count': 1, 'api_calls': [{'name': 'DataFrame.align'}]}]
Did you have another specific method in mind?
As an example, `SnowflakeQueryCompiler.any` with `axis=0` calls `SnowflakeQueryCompiler._bool_reduce_helper`, which in turn calls `SnowflakeQueryCompiler.agg`. I was wondering how this would be reflected in telemetry.
In this case, `_query_compiler._method_call_counts` would include the func_name count `'DataFrame.BasePandasDataset.any' = 1`, as well as the other calls tracked with telemetry, such as `'DataFrame.__repr__'` and `'DataFrame.property.iloc_get'`.
I see. Would it be accurate to say that `_query_compiler._method_call_counts` only tracks methods called on this particular instance of query compiler (the example call chain I gave returns a new query compiler instance every time), and these counts are only used in telemetry for certain frontend methods like dataframe and repr that we specify explicitly?
Yes, exactly. The goal is to track call counts on the same query compiler instance; more info is in the design doc here: https://docs.google.com/document/d/1EfqQwejVbF5_36hnOP-ap0t3NaCWmDz62iAcR0PtX20/edit?tab=t.0#heading=h.4uu48icmuq7z
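For readers following the thread, here is a minimal sketch of what per-instance call counting could look like. It is not the code in this PR: `ToyQueryCompiler`, `count_calls`, and `to_list` are hypothetical names, and the sketch assumes that a newly returned compiler starts with an empty counter, which may differ from the actual implementation.

```python
from collections import Counter
from functools import wraps


def count_calls(method):
    """Hypothetical decorator: record each call in the instance's own counter."""
    @wraps(method)
    def wrapper(self, *args, **kwargs):
        self._method_call_counts[method.__name__] += 1
        return method(self, *args, **kwargs)
    return wrapper


class ToyQueryCompiler:
    """Illustrative stand-in only; not the Snowpark pandas query compiler."""

    def __init__(self, values):
        self._values = list(values)
        # Per-instance state: every compiler object owns a separate counter.
        self._method_call_counts = Counter()

    @count_calls
    def any(self):
        # Inside the body, `any` resolves to the builtin, not this method.
        # Returns a NEW compiler, which starts with an empty counter (assumption).
        return ToyQueryCompiler([any(self._values)])

    @count_calls
    def to_list(self):
        return list(self._values)


qc = ToyQueryCompiler([0, 1, 0])
result = qc.any()   # counted on qc, not on result
qc.to_list()
qc.to_list()

print(dict(qc._method_call_counts))      # {'any': 1, 'to_list': 2}
print(dict(result._method_call_counts))  # {}
```

Under these assumptions, a chain of calls that each return a new compiler never accumulates counts on one object, which matches the point above that counts are meaningful only for repeated calls on the same instance.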
Is this second call served from some cache or recomputed?
I believe the telemetry call data is recomputed every time; @sfc-gh-azhan can confirm.