-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: The simple application of stream=false Q&A will also directly return segmented content when the similarity is not enough #2073
Conversation
…turn segmented content when the similarity is not enough
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
paragraph.hit_handling_method == 'directly_return'] | ||
directly_return_chunk_list = [AIMessageChunk(content=paragraph.content) | ||
for paragraph in paragraph_list if ( | ||
paragraph.hit_handling_method == 'directly_return' and paragraph.similarity >= paragraph.directly_return_similarity)] | ||
if directly_return_chunk_list is not None and len(directly_return_chunk_list) > 0: | ||
return directly_return_chunk_list[0], False | ||
elif len(paragraph_list) == 0 and no_references_setting.get( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several issues and potential optimization suggestions in the provided code:
-
Function Name: The function
get_block_result
uses underscores which is typically preferred over snake case. It should be renamed toget_block_result
. -
Variable Names:
message_list
,paragraph_list
,AIMessage
,problem_text
,hit_handling_method
,directly_return_chunk_list
,similarity
, etc., could benefit from more descriptive names to clarify their purpose.
-
Default Values:
- Default values for parameters like
paragraph_list
and possibly other variables could make the function usage more intuitive and consistent.
- Default values for parameters like
-
Logical Conditions:
- The logical conditions inside list comprehensions can be simplified for readability.
-
Comments:
- Comments related to default reference settings (
no_references_setting.get(...)
) might need clarification if it's part of another function or variable not visible here.
- Comments related to default reference settings (
-
Optimization Suggestions:
- If the
for
loop processing each paragraph is computationally expensive, consider using generator expressions instead to createdirectly_return_chunk_list
. This would reduce memory usage if dealing with a large number of paragraphs.
- If the
Here’s an updated version of the function incorporating some of these improvements:
def get_block_content(message_list: List[BaseMessage], paragraph_list=None, problem_text=None):
if paragraph_list is None:
paragraph_list = []
# Use a generator expression for creating directly returned chunks to avoid unnecessary memory usage
directly_return_chunks_gen = (
AIMessageChunk(content=paragraph.content)
for paragraph in paragraph_list
if (paragraph.hit_handling_method == 'directly_return' and paragraph.similarity >=
paragraph.directly_return_similarity)
)
if directly_return_chunks_gen is not None and next(directly_return_chunks_gen, None) is not None:
return next(directly_return_chunks_gen), False
if len(paragraph_list) == 0 and not no_references_settings['enable']:
# Add logic for when there are no references and references setting is off
pass # Placeholder for further actions
Key Changes:
- Renamed
get_block_result
toget_block_content
. - Replaced underscores with camelCase where applicable for better Pythonic style.
- Simplified comments about the
no_references_setting
. - Used a generator expression in
directly_return_chunk_list
to optimize memory usage.
fix: The simple application of stream=false Q&A will also directly return segmented content when the similarity is not enough