fix: The simple application of stream=false Q&A will also directly return segmented content when the similarity is not enough #2073

shaohuzhang1 · 2025-01-22T09:35:51Z

fix: The simple application of stream=false Q&A will also directly return segmented content when the similarity is not enough

f2c-ci-robot · 2025-01-22T09:35:55Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

f2c-ci-robot · 2025-01-22T09:35:59Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shaohuzhang1 · 2025-01-22T09:36:13Z

apps/application/chat_pipeline/step/chat_step/impl/base_chat_step.py

-                                      paragraph.hit_handling_method == 'directly_return']
+        directly_return_chunk_list = [AIMessageChunk(content=paragraph.content)
+                                      for paragraph in paragraph_list if (
+                                              paragraph.hit_handling_method == 'directly_return' and paragraph.similarity >= paragraph.directly_return_similarity)]
        if directly_return_chunk_list is not None and len(directly_return_chunk_list) > 0:
            return directly_return_chunk_list[0], False
        elif len(paragraph_list) == 0 and no_references_setting.get(
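
The effect of the added similarity check can be illustrated in isolation. The following is a minimal sketch, not the project's code: `Paragraph` is a hypothetical stand-in that carries only the attributes the diff reads, and `pick_directly_returned` is an illustrative helper name. It assumes `similarity` and `directly_return_similarity` are plain numbers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Paragraph:
    # Hypothetical stand-in for the real paragraph model; only the
    # attributes referenced in the diff are modeled here.
    content: str
    hit_handling_method: str
    similarity: float
    directly_return_similarity: float


def pick_directly_returned(paragraph_list: List[Paragraph]) -> Optional[str]:
    """Return the content of the first paragraph eligible for direct return, or None."""
    matches = [
        paragraph.content
        for paragraph in paragraph_list
        # Same condition as the patched list comprehension: the handling
        # method must match AND the similarity must reach the threshold.
        if paragraph.hit_handling_method == 'directly_return'
        and paragraph.similarity >= paragraph.directly_return_similarity
    ]
    return matches[0] if matches else None
```

Under these assumptions, a paragraph marked `directly_return` whose similarity is below its own threshold is skipped, which is the behavior the PR title describes.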


There are several issues and potential optimization suggestions in the provided code:

Function Name: The function name get_block_result already follows snake_case, the convention PEP 8 recommends for function names, so no change is required on style grounds.

Variable Names:

message_list, paragraph_list, AIMessage, problem_text, hit_handling_method, directly_return_chunk_list, similarity, etc., could benefit from more descriptive names to clarify their purpose.

Default Values:

Giving paragraph_list a default of None, normalized to an empty list inside the function, would make the function easier to call and avoid a shared mutable default.

Logical Conditions:

The logical conditions inside list comprehensions can be simplified for readability.

Comments:

Comments related to default reference settings (no_references_setting.get(...)) might need clarification if it's part of another function or variable not visible here.

Optimization Suggestions:

Only the first element of directly_return_chunk_list is used, so a generator expression consumed with next() could replace the list comprehension. This avoids building the full list and stops at the first qualifying paragraph, which reduces memory usage and work when there are many paragraphs.

Here’s an updated version of the function incorporating some of these improvements:

from typing import List

from langchain_core.messages import AIMessageChunk, BaseMessage


def get_block_content(message_list: List[BaseMessage], paragraph_list=None, problem_text=None):
    if paragraph_list is None:
        paragraph_list = []
    # Use a generator expression so no intermediate list of chunks is built
    directly_return_chunks_gen = (
        AIMessageChunk(content=paragraph.content)
        for paragraph in paragraph_list
        if (paragraph.hit_handling_method == 'directly_return'
            and paragraph.similarity >= paragraph.directly_return_similarity)
    )
    # Call next() exactly once: a second call on the same generator would
    # skip the first match or raise StopIteration
    first_chunk = next(directly_return_chunks_gen, None)
    if first_chunk is not None:
        return first_chunk, False
    # no_references_setting is not a parameter here; it is assumed to be
    # available in the enclosing scope, as in the original method
    if len(paragraph_list) == 0 and not no_references_setting['enable']:
        # Add logic for when there are no references and references setting is off
        pass  # Placeholder for further actions

Key Changes:

Renamed get_block_result to get_block_content.

Kept snake_case for all identifiers, following PEP 8.

Simplified comments about the no_references_setting.

Replaced the directly_return_chunk_list list comprehension with a generator expression to reduce memory usage.
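
The generator-based lookup suggested in this review depends on calling next() only once per generator. The sketch below is an illustration under stated assumptions, not part of the PR: `first_match` is a hypothetical helper name, and it assumes that `None` is never a valid item, since `None` is also used as the "no match" result.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def first_match(items: Iterable[T], predicate: Callable[[T], bool]) -> Optional[T]:
    """Return the first item satisfying predicate, or None if there is none."""
    # next() with a default pulls from the generator only until the first
    # match; items after that match are never evaluated.
    return next((item for item in items if predicate(item)), None)
```

Calling next() a second time on the same generator would return the second match rather than the first, or raise StopIteration when no default is given, so the result of the single call should be stored and reused.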

fix: The simple application of stream=false Q&A will also directly re…

c4304cf

…turn segmented content when the similarity is not enough

f2c-ci-robot bot added the do-not-merge/release-note-label-needed label Jan 22, 2025

shaohuzhang1 merged commit 34b626d into main Jan 22, 2025
4 checks passed

shaohuzhang1 deleted the pr@main@fix_chat branch January 22, 2025 09:36
