Add CreateAnomalyDetectorTool #348

Merged
merged 5 commits into from
Jul 16, 2024
5 changes: 4 additions & 1 deletion src/main/java/org/opensearch/agent/ToolPlugin.java
@@ -11,6 +11,7 @@
import java.util.function.Supplier;

import org.opensearch.agent.common.SkillSettings;
import org.opensearch.agent.tools.CreateAnomalyDetectorTool;
import org.opensearch.agent.tools.NeuralSparseSearchTool;
import org.opensearch.agent.tools.PPLTool;
import org.opensearch.agent.tools.RAGTool;
@@ -73,6 +74,7 @@ public Collection<Object> createComponents(
SearchAnomalyDetectorsTool.Factory.getInstance().init(client, namedWriteableRegistry);
SearchAnomalyResultsTool.Factory.getInstance().init(client, namedWriteableRegistry);
SearchMonitorsTool.Factory.getInstance().init(client);
CreateAnomalyDetectorTool.Factory.getInstance().init(client);
return Collections.emptyList();
}

@@ -87,7 +89,8 @@ public List<Tool.Factory<? extends Tool>> getToolFactories() {
SearchAlertsTool.Factory.getInstance(),
SearchAnomalyDetectorsTool.Factory.getInstance(),
SearchAnomalyResultsTool.Factory.getInstance(),
SearchMonitorsTool.Factory.getInstance()
SearchMonitorsTool.Factory.getInstance(),
CreateAnomalyDetectorTool.Factory.getInstance()
);
}

Large diffs are not rendered by default.

25 changes: 2 additions & 23 deletions src/main/java/org/opensearch/agent/tools/PPLTool.java
@@ -33,6 +33,7 @@
import org.opensearch.action.search.SearchRequest;
import org.opensearch.agent.common.SkillSettings;
import org.opensearch.agent.tools.utils.ClusterSettingHelper;
import org.opensearch.agent.tools.utils.ToolHelper;
import org.opensearch.client.Client;
import org.opensearch.cluster.metadata.MappingMetadata;
import org.opensearch.core.action.ActionListener;
@@ -401,7 +402,7 @@ private String constructTableInfo(SearchHit[] searchHits, Map<String, MappingMet
);
}
Map<String, String> fieldsToType = new HashMap<>();
extractNamesTypes(mappingSource, fieldsToType, "");
ToolHelper.extractFieldNamesTypes(mappingSource, fieldsToType, "");
StringJoiner tableInfoJoiner = new StringJoiner("\n");
List<String> sortedKeys = new ArrayList<>(fieldsToType.keySet());
Collections.sort(sortedKeys);
@@ -439,28 +440,6 @@ private String constructPrompt(String tableInfo, String question, String indexNa
return substitutor.replace(contextPrompt);
}

    private void extractNamesTypes(Map<String, Object> mappingSource, Map<String, String> fieldsToType, String prefix) {
        if (!prefix.isEmpty()) {
            prefix += ".";
        }

        for (Map.Entry<String, Object> entry : mappingSource.entrySet()) {
            String n = entry.getKey();
            Object v = entry.getValue();

            if (v instanceof Map) {
                Map<String, Object> vMap = (Map<String, Object>) v;
                if (vMap.containsKey("type")) {
                    if (!((vMap.getOrDefault("type", "")).equals("alias"))) {
                        fieldsToType.put(prefix + n, (String) vMap.get("type"));
                    }
                } else if (vMap.containsKey("properties")) {
                    extractNamesTypes((Map<String, Object>) vMap.get("properties"), fieldsToType, prefix + n);
                }
            }
        }
    }


private static void extractSamples(Map<String, Object> sampleSource, Map<String, String> fieldsToSample, String prefix)
throws PrivilegedActionException {
if (!prefix.isEmpty()) {
42 changes: 42 additions & 0 deletions src/main/java/org/opensearch/agent/tools/utils/ToolHelper.java
@@ -0,0 +1,42 @@
/*
 * Copyright OpenSearch Contributors
 * SPDX-License-Identifier: Apache-2.0
 */

package org.opensearch.agent.tools.utils;

import java.util.Map;

public class ToolHelper {
    /**
     * Flatten all the fields in the mappings, insert the field->field type mapping to a map
     * @param mappingSource the mappings of an index
     * @param fieldsToType the result containing the field->field type mapping
     * @param prefix the parent field path
     */
    public static void extractFieldNamesTypes(Map<String, Object> mappingSource, Map<String, String> fieldsToType, String prefix) {
        if (prefix.length() > 0) {
            prefix += ".";
        }

        for (Map.Entry<String, Object> entry : mappingSource.entrySet()) {
            String n = entry.getKey();
            Object v = entry.getValue();

            if (v instanceof Map) {
                Map<String, Object> vMap = (Map<String, Object>) v;
                if (vMap.containsKey("type")) {
                    if (!((vMap.getOrDefault("type", "")).equals("alias"))) {
                        fieldsToType.put(prefix + n, (String) vMap.get("type"));
                    }
                }
                if (vMap.containsKey("properties")) {
                    extractFieldNamesTypes((Map<String, Object>) vMap.get("properties"), fieldsToType, prefix + n);
                }
                if (vMap.containsKey("fields")) {
Collaborator Author

The extractFieldNamesTypes method reuses most of the code from the original method in the PPLTool class, but I made some changes so that the following cases are handled:

"a": {
  "type": "object",
  "properties": {
    "b": {
      "type": "keyword"
    }
  }
}

and

"c": {
  "type": "text",
  "fields": {
    "d": {
      "type": "keyword"
    }
  }
}

@zane-neo and @xinyual, please check whether this change has any negative impact on the PPLTool. Thank you!

Collaborator

It seems that after this change the recursion goes one layer deeper when the type is text, which changes the result from text to keyword. @xinyual, please check on your end whether this has an impact on PPLTool.

                    extractFieldNamesTypes((Map<String, Object>) vMap.get("fields"), fieldsToType, prefix + n);
                }
            }
        }
    }
}
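
To make the two mapping cases from the review thread concrete, here is a minimal, self-contained sketch that copies the logic of extractFieldNamesTypes as it appears in this diff and runs it on both examples. The class name MappingFlattenSketch and the helper flattenExample are hypothetical and exist only for illustration; Map.of assumes Java 9 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class MappingFlattenSketch {

    // Same logic as ToolHelper.extractFieldNamesTypes in this diff, copied so the sketch is standalone.
    @SuppressWarnings("unchecked")
    static void extractFieldNamesTypes(Map<String, Object> mappingSource, Map<String, String> fieldsToType, String prefix) {
        if (prefix.length() > 0) {
            prefix += ".";
        }
        for (Map.Entry<String, Object> entry : mappingSource.entrySet()) {
            String n = entry.getKey();
            Object v = entry.getValue();
            if (v instanceof Map) {
                Map<String, Object> vMap = (Map<String, Object>) v;
                // Record the field's own type unless it is an alias.
                if (vMap.containsKey("type") && !"alias".equals(vMap.get("type"))) {
                    fieldsToType.put(prefix + n, (String) vMap.get("type"));
                }
                // Descend into object sub-fields ("properties") and multi-fields ("fields").
                if (vMap.containsKey("properties")) {
                    extractFieldNamesTypes((Map<String, Object>) vMap.get("properties"), fieldsToType, prefix + n);
                }
                if (vMap.containsKey("fields")) {
                    extractFieldNamesTypes((Map<String, Object>) vMap.get("fields"), fieldsToType, prefix + n);
                }
            }
        }
    }

    // Hypothetical helper: builds the "a" and "c" mappings from the review comment and flattens them.
    static Map<String, String> flattenExample() {
        Map<String, Object> mapping = new LinkedHashMap<>();
        mapping.put("a", Map.of("type", "object", "properties", Map.of("b", Map.of("type", "keyword"))));
        mapping.put("c", Map.of("type", "text", "fields", Map.of("d", Map.of("type", "keyword"))));

        Map<String, String> fieldsToType = new TreeMap<>(); // sorted keys for stable printing
        extractFieldNamesTypes(mapping, fieldsToType, "");
        return fieldsToType;
    }

    public static void main(String[] args) {
        System.out.println(flattenExample());
    }
}
```

In this sketch the parent entries keep their own type and the nested entries are added under dotted names: the result holds a=object, a.b=keyword, c=text and c.d=keyword.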
@@ -0,0 +1,4 @@
{
"CLAUDE": "Human:\" turn\": Here is an example of the create anomaly detector API: POST _plugins/_anomaly_detection/detectors, {\"time_field\":\"timestamp\",\"indices\":[\"server_log*\"],\"feature_attributes\":[{\"feature_name\":\"test\",\"feature_enabled\":true,\"aggregation_query\":{\"test\":{\"sum\":{\"field\":\"value\"}}}}],\"category_field\":[\"ip\"]}, and here are the mapping info containing all the fields in the index ${indexInfo.indexName}: ${indexInfo.indexMapping}, and the optional aggregation methods are count, avg, min, max and sum. Please give me some suggestion about creating an anomaly detector for the index ${indexInfo.indexName}, you need to give the key information: the top 3 suitable aggregation fields which are numeric types and the suitable aggregation method for each field, if there are no numeric type fields, both the aggregation field and method are empty string, and also give the category field if there exists a keyword type field like ip, address, host, city, country or region, if not exist, the category field is empty. Show me a format of keyed and pipe-delimited list wrapped in a curly bracket just like {category_field=the category field if exists|aggregation_field=comma-delimited list of all the aggregation field names|aggregation_method=comma-delimited list of all the aggregation methods}. \n\nAssistant:\" turn\"",
"OPENAI": "Here is an example of the create anomaly detector API: POST _plugins/_anomaly_detection/detectors, {\"time_field\":\"timestamp\",\"indices\":[\"server_log*\"],\"feature_attributes\":[{\"feature_name\":\"test\",\"feature_enabled\":true,\"aggregation_query\":{\"test\":{\"sum\":{\"field\":\"value\"}}}}],\"category_field\":[\"ip\"]}, and here are the mapping info containing all the fields in the index ${indexInfo.indexName}: ${indexInfo.indexMapping}, and the optional aggregation methods are count, avg, min, max and sum. Please give me some suggestion about creating an anomaly detector for the index ${indexInfo.indexName}, you need to give the key information: the top 3 suitable aggregation fields which are numeric types and the suitable aggregation method for each field, if there are no numeric type fields, both the aggregation field and method are empty string, and also give the category field if there exists a keyword type field like ip, address, host, city, country or region, if not exist, the category field is empty. Show me a format of keyed and pipe-delimited list wrapped in a curly bracket just like {category_field=the category field if exists|aggregation_field=comma-delimited list of all the aggregation field names|aggregation_method=comma-delimited list of all the aggregation methods}. "
}
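
Both prompts ask the model to reply with a keyed, pipe-delimited list wrapped in curly brackets. A reply in that shape could be parsed along the following lines. This is a sketch under assumptions, not the parsing code from CreateAnomalyDetectorTool (whose diff is not rendered on this page); the class name, method names and regular expression are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DetectorSuggestionParserSketch {

    // Matches {category_field=...|aggregation_field=...|aggregation_method=...}; any value may be empty.
    private static final Pattern SUGGESTION = Pattern
        .compile("\\{category_field=([^|}]*)\\|aggregation_field=([^|}]*)\\|aggregation_method=([^|}]*)\\}");

    // Splits a comma-delimited value into trimmed, non-empty items.
    static List<String> splitList(String value) {
        List<String> items = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                items.add(trimmed);
            }
        }
        return items;
    }

    // Finds the first suggestion in the model reply and returns its three keyed lists.
    static Map<String, List<String>> parse(String modelReply) {
        Matcher matcher = SUGGESTION.matcher(modelReply);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No suggestion in the expected format was found");
        }
        Map<String, List<String>> result = new LinkedHashMap<>();
        result.put("category_field", splitList(matcher.group(1)));
        result.put("aggregation_field", splitList(matcher.group(2)));
        result.put("aggregation_method", splitList(matcher.group(3)));
        return result;
    }

    public static void main(String[] args) {
        String reply = "Suggestion: {category_field=ip|aggregation_field=bytes,latency|aggregation_method=sum,avg}";
        System.out.println(parse(reply));
    }
}
```

A caller would probably also want to check that aggregation_field and aggregation_method have the same number of items, since the prompt pairs one method with each field.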