Allow llmQuestion to be optional when llmMessages is used. (Issue #3… #3072
Conversation
…nsearch-project#3067) Signed-off-by: Austin Lee <[email protected]>
@@ -185,7 +187,7 @@ public GenerativeQAParameters(
     public GenerativeQAParameters(StreamInput input) throws IOException {
         this.conversationId = input.readOptionalString();
         this.llmModel = input.readOptionalString();
-        this.llmQuestion = input.readString();
+        this.llmQuestion = input.readOptionalString();
Hi @austintlee, this does not quite make sense to me. Why do you make only llmQuestion optional while keeping llmMessages as a mandatory field, if you are trying to make the user choose one of them?
@b4sjoo I do make them both optional in the first constructor:
public GenerativeQAParameters(
String conversationId,
String llmModel,
String llmQuestion,
String systemPrompt,
String userInstructions,
Integer contextSize,
Integer interactionSize,
Integer timeout,
String llmResponseField,
List<MessageBlock> llmMessages
) {
this.conversationId = conversationId;
this.llmModel = llmModel;
Preconditions
.checkArgument(
!(Strings.isNullOrEmpty(llmQuestion) && (llmMessages == null || llmMessages.isEmpty())),
"At least one of " + LLM_QUESTION + " or " + LLM_MESSAGES_FIELD + " must be provided."
);
this.llmQuestion = llmQuestion;
this.systemPrompt = systemPrompt;
this.userInstructions = userInstructions;
this.contextSize = (contextSize == null) ? SIZE_NULL_VALUE : contextSize;
this.interactionSize = (interactionSize == null) ? SIZE_NULL_VALUE : interactionSize;
this.timeout = (timeout == null) ? SIZE_NULL_VALUE : timeout;
this.llmResponseField = llmResponseField;
if (llmMessages != null) {
this.llmMessages.addAll(llmMessages);
}
}
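The precondition can be exercised in isolation. Below is a minimal standalone sketch of just that guard in plain Java (no Guava or OpenSearch dependencies; the class and method names here are hypothetical, invented for illustration):

```java
import java.util.List;

public class ParamGuardDemo {
    // Standalone restatement of the constructor's precondition:
    // at least one of llmQuestion or llmMessages must be provided.
    static void requireQuestionOrMessages(String llmQuestion, List<String> llmMessages) {
        boolean noQuestion = llmQuestion == null || llmQuestion.isEmpty();
        boolean noMessages = llmMessages == null || llmMessages.isEmpty();
        if (noQuestion && noMessages) {
            throw new IllegalArgumentException(
                "At least one of llm_question or llm_messages must be provided.");
        }
    }

    public static void main(String[] args) {
        requireQuestionOrMessages("What is RAG?", null);         // ok: question only
        requireQuestionOrMessages(null, List.of("user: hello")); // ok: messages only
        boolean threw = false;
        try {
            requireQuestionOrMessages(null, null);               // neither -> rejected
        } catch (IllegalArgumentException e) {
            threw = true;
        }
        System.out.println("rejected=" + threw);
    }
}
```

Either field alone satisfies the check; only the "neither" case is rejected.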
But internally, llmMessages is never null; by default it is an empty list. So, when we write out to StreamOutput, we don't need a null check:
public void writeTo(StreamOutput out) throws IOException {
out.writeOptionalString(conversationId);
out.writeOptionalString(llmModel);
out.writeOptionalString(llmQuestion);
out.writeOptionalString(systemPrompt);
out.writeOptionalString(userInstructions);
out.writeInt(contextSize);
out.writeInt(interactionSize);
out.writeInt(timeout);
out.writeOptionalString(llmResponseField);
out.writeList(llmMessages);
}
Which is why I always expect it to be present (at least as an empty list) when I read it back:
public GenerativeQAParameters(StreamInput input) throws IOException {
this.conversationId = input.readOptionalString();
this.llmModel = input.readOptionalString();
this.llmQuestion = input.readOptionalString();
this.systemPrompt = input.readOptionalString();
this.userInstructions = input.readOptionalString();
this.contextSize = input.readInt();
this.interactionSize = input.readInt();
this.timeout = input.readInt();
this.llmResponseField = input.readOptionalString();
this.llmMessages.addAll(input.readList(MessageBlock::new));
}
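The optional-field handling above can be illustrated outside of OpenSearch. The sketch below uses plain java.io streams to mimic the presence-byte-then-value pattern that optional strings follow (this illustrates the pattern only; it is not the actual OpenSearch StreamOutput wire format):

```java
import java.io.*;

public class OptionalRoundtripDemo {
    // Mimic writeOptionalString: a presence byte, then the value if present.
    static void writeOptionalString(DataOutputStream out, String s) throws IOException {
        out.writeBoolean(s != null);
        if (s != null) out.writeUTF(s);
    }

    // Mimic readOptionalString: consume the presence byte, then maybe the value.
    static String readOptionalString(DataInputStream in) throws IOException {
        return in.readBoolean() ? in.readUTF() : null;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        writeOptionalString(out, null);      // e.g. llmQuestion absent
        writeOptionalString(out, "some-model"); // e.g. llmModel present

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        System.out.println(readOptionalString(in)); // null
        System.out.println(readOptionalString(in)); // some-model
    }
}
```

A round trip through the buffer recovers both the absent and the present value, which is why the reader and writer must agree on which fields carry the presence byte.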
Is this an incorrect assumption? Does the StreamInput constructor need to consider llmMessages not being present in input?
You can also take a look at the stream roundtrip test cases I have in GenerativeQAParamExtBuilderTests.
It's just that I saw a null check above while the field is made mandatory here, which confused me. Your answer makes sense to me: llmMessages should never be null because an empty list is created by default.
BTW, changing a readString() into readOptionalString() could potentially introduce a backwards-compatibility (bwc) issue when we have a mixed cluster.
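The mixed-cluster concern can be made concrete with plain java.io streams (again, an illustration of the presence-byte pattern, not OpenSearch's actual wire format): a node on the old version writes the field with no presence byte, so a node on the new version that expects one misinterprets the first byte of the payload, and every byte after that is shifted.

```java
import java.io.*;

public class MixedClusterDemo {
    public static void main(String[] args) throws IOException {
        // "Old" node: required field, written with no presence byte.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new DataOutputStream(buf).writeUTF("why is the sky blue");

        // "New" node: expects a presence byte first (optional-field reader).
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        boolean present = in.readBoolean(); // actually consumed a length byte!
        System.out.println("misread presence flag: " + present);
        // From here on the reader is off by one byte, so deserialization sees
        // corrupt data -- the "unexpected byte" class of failure seen when only
        // one side of the cluster has been upgraded.
    }
}
```

In this sketch the first byte the reader consumes is part of writeUTF's length prefix, not a real flag, so the printed value is meaningless and the stream is desynchronized.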
Going from required to optional should be OK, but not the other way around. How do we test it?
@pyek-bot is currently testing it; he should have a result by tomorrow. Basically, we create a lower-version cluster (e.g. 2.16) with a dedicated master node, then we upgrade the data node to the current version and test. After this we perform the test again, but upgrade the master this time. Does this make sense?
I have tested this scenario. It seems to work fine when all nodes are eventually upgraded to 2.17.
- When only the data node is upgraded, the NPE ([BUG] RAG processor throws null pointer exception, #2983) comes into play, since the master cannot serialize the data and send it to the data node.
- When only the master node is upgraded, the data node cannot deserialize the new format and throws an unexpected-byte error.

However, when both are upgraded, it works as expected with both llmQuestion and llmMessages.
@@ -359,7 +359,7 @@ public class RestMLRAGSearchProcessorIT extends MLCommonsRestTestCase {
     + " \"ext\": {\n"
     + " \"generative_qa_parameters\": {\n"
     + " \"llm_model\": \"%s\",\n"
-    + " \"llm_question\": \"%s\",\n"
+    // + " \"llm_question\": \"%s\",\n"
Why not remove this line?
Will do.
@@ -378,7 +378,7 @@ public class RestMLRAGSearchProcessorIT extends MLCommonsRestTestCase {
     + " \"ext\": {\n"
     + " \"generative_qa_parameters\": {\n"
     + " \"llm_model\": \"%s\",\n"
-    + " \"llm_question\": \"%s\",\n"
+    // + " \"llm_question\": \"%s\",\n"
Same here?
Will remove it.
Not related to my change.
Signed-off-by: Austin Lee <[email protected]>
Yeah, this is a known issue. I think it is flaky.
@@ -723,8 +720,12 @@ public void testBM25WithBedrock() throws Exception {
     public void testBM25WithBedrockConverse() throws Exception {
         // Skip test if key is null
         if (AWS_ACCESS_KEY_ID == null) {
+            System.out.println("Skipping testBM25WithBedrockConverse because AWS_ACCESS_KEY_ID is null");
Minor: can we use a logger?
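The suggestion above, sketched with java.util.logging so the example is self-contained (the project itself would route this through its existing logging setup, likely Log4j; the class name and the `awsAccessKeyId` stand-in are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class SkipLogDemo {
    private static final Logger log = Logger.getLogger(SkipLogDemo.class.getName());

    // Hypothetical stand-in for the IT's environment-derived credential.
    static String awsAccessKeyId = null;

    static boolean skipIfNoCredentials() {
        if (awsAccessKeyId == null) {
            // Goes through the logging framework instead of System.out.println.
            log.info("Skipping testBM25WithBedrockConverse because AWS_ACCESS_KEY_ID is null");
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Capture log records in-memory to show the message is emitted.
        List<String> captured = new ArrayList<>();
        Handler handler = new Handler() {
            @Override public void publish(LogRecord r) { captured.add(r.getMessage()); }
            @Override public void flush() {}
            @Override public void close() {}
        };
        log.setUseParentHandlers(false);
        log.addHandler(handler);
        System.out.println(skipIfNoCredentials() + " / captured=" + captured.size());
    }
}
```

Using a logger keeps the skip message in the same output stream (and with the same formatting and levels) as the rest of the test suite's diagnostics.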
LGTM. Left a minor comment on the test code, but I'm OK with fixing it later.
…#3072) * Allow llmQuestion to be optional when llmMessages is used. (Issue #3067) Signed-off-by: Austin Lee <[email protected]> * Remove unused lines. Signed-off-by: Austin Lee <[email protected]> --------- Signed-off-by: Austin Lee <[email protected]> (cherry picked from commit 48d275d)
…#3072) (#3082) * Allow llmQuestion to be optional when llmMessages is used. (Issue #3067) Signed-off-by: Austin Lee <[email protected]> * Remove unused lines. Signed-off-by: Austin Lee <[email protected]> --------- Signed-off-by: Austin Lee <[email protected]> (cherry picked from commit 48d275d) Co-authored-by: Austin Lee <[email protected]>
…067)
Description
Remove the check on llmQuestion being present in RAG request parameters.
Related Issues
#3067
Check List
Commits are signed per the DCO using --signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.