-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add LogPatternTool #413
Add LogPatternTool #413
Conversation
de3a554
to
2a5cbc2
Compare
if (hits != null && hits.length > 0) { | ||
String patternField = parameters.containsKey(PATTERN_FIELD) | ||
? parameters.get(PATTERN_FIELD) | ||
: findLongestField(hits[0].getSourceAsMap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we stop if longestField is fewer than maybe 30 chars or continue to search for 10 more records in case the first one has outlier data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just imitates the logic of Discover log pattern.
It sounds good to avoid outliner, but it could also make the logic more complex and inconsistent with Discover.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above suggestion is not ideal, just want to trigger some discussion. It's ok to leave it as it is if we don't have better idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can implement the ut and it after resolving my comments
Signed-off-by: Heng Qian <[email protected]>
src/main/java/org/opensearch/agent/tools/AbstractRetrieverTool.java
Outdated
Show resolved
Hide resolved
super(client, xContentRegistry, null, null, docSize); | ||
this.topNPattern = topNPattern; | ||
this.sampleLogSize = sampleLogSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are docSize, topNPattern and sampleLogSize tool parameters or run time parameters?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
run time parameters I think. But I think users can set them as well when creating this tool, and it should support override them in runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not see the PPL, but it makes sense to use DSL for search. Would you like to check whether the code can be reused with search index tool? In search index tool, it has specialized operations on connector, model and model_group index.
LogPatternTool aims to generate log pattern for DSL. PPL already supports |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job @qianheng-aws ! You can go on with ut and it after resolving these two mistakes.
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
5ed75f0
to
6b71d21
Compare
It's better to add ITs to cover the code, is there any concern on adding ITs? |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #413 +/- ##
============================================
- Coverage 81.78% 77.75% -4.04%
- Complexity 193 260 +67
============================================
Files 11 15 +4
Lines 961 1380 +419
Branches 137 196 +59
============================================
+ Hits 786 1073 +287
- Misses 121 231 +110
- Partials 54 76 +22
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
I'm going to add one for this PR. |
The backport to
To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/skills/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/skills/backport-2.x
# Create a new branch
git switch --create backport/backport-413-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 7d726125bc9704d5a1fa01831f44c7c0d7b4aab5
# Push it to GitHub
git push --set-upstream origin backport/backport-413-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/skills/backport-2.x Then, create a pull request where the |
@qianheng-aws Please raise another PR with UT and IT. |
* ut fix: construct update in Monitor Signed-off-by: Heng Qian <[email protected]> * Add LogPatternTool Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> * Add function doc Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]> (cherry picked from commit 7d72612)
* Add LogPatternTool (#413) * ut fix: construct update in Monitor Signed-off-by: Heng Qian <[email protected]> * Add LogPatternTool Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> * Add function doc Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]> (cherry picked from commit 7d72612) * Make compatible with java11 Signed-off-by: Heng Qian <[email protected]> * spotlessApply Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]>
The backport to
To backport manually, run these commands in your terminal: # Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/skills/backport-2.17 2.17
# Navigate to the new working tree
pushd ../.worktrees/skills/backport-2.17
# Create a new branch
git switch --create backport/backport-413-to-2.17
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 7d726125bc9704d5a1fa01831f44c7c0d7b4aab5
# Push it to GitHub
git push --set-upstream origin backport/backport-413-to-2.17
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/skills/backport-2.17 Then, create a pull request where the |
* Add LogPatternTool (opensearch-project#413) * ut fix: construct update in Monitor Signed-off-by: Heng Qian <[email protected]> * Add LogPatternTool Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> * Add function doc Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> * Address comments Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]> (cherry picked from commit 7d72612) * Make compatible with java11 Signed-off-by: Heng Qian <[email protected]> * spotlessApply Signed-off-by: Heng Qian <[email protected]> --------- Signed-off-by: Heng Qian <[email protected]>
* Add LogPatternTool (#413) * ut fix: construct update in Monitor * Add LogPatternTool * Address comments * Add function doc * Address comments * Address comments --------- (cherry picked from commit 7d72612) * Make compatible with java11 * spotlessApply --------- Signed-off-by: Heng Qian <[email protected]>
…ct#418) (opensearch-project#422)" This reverts commit 7cd2c22.
…ct#418) (opensearch-project#422)" This reverts commit 7cd2c22. Signed-off-by: Heng Qian <[email protected]>
Description
Add LogPatternTool to support generate log patterns based on an input
DSL
.Background of log pattern
Log pattern is a feature in Observability. It aims to find the common feature among retrieved logs.
The implementation of the log pattern feature in Observability includes 3 steps:
source=opensearch_dashboards_sample_data_flights | where timestamp >= ’[2024-09-09 16](tel:2024090916):16:30.321000' and timestamp <= '2024-09-10 07:16:30.322000' | patterns Dest | stats count(), take(Dest, 1) by patterns_field
By default, the operator pattern in PPL will generate a new field named pattern_field which will ignore characters conatined by [a-zA-Z\d] in the original value(Code Ref; For alert analysis, it also seems less helpful to analysis on such pattern).
However, if customers use
DSL
to retrieve logs, it doesn't support such feature. Thus we need this tool to achieve this goal.Implementation of LogPatternTool
This tool supports generating log patterns on the input dsl and index. It's implemented by
several steps:
2.1 Find Pattern Field: If users provide parameter [[${PATTERN_FIELD}]], use it as the pattern
field; Otherwise, find the string field with the longest length on the first log.
2.2 Extract Pattern: If users provide parameter [[${PATTERN}]], compile it as a pattern;
Otherwise, use [[${DEFAULT_IGNORED_CHARS}]]. It will remove all chars matching the pattern.
Example
Related Issues
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.