Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: optimize the display of STT nodes #1874

Closed
wants to merge 3 commits into from

Conversation

shaohuzhang1
Copy link
Contributor

refactor: optimize the display of SST nodes

Copy link

f2c-ci-robot bot commented Dec 19, 2024

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link

f2c-ci-robot bot commented Dec 19, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


def get_details(self, index: int, **kwargs):
return {
'name': self.node.properties.get('stepName'),
"index": index,
'run_time': self.context.get('run_time'),
'answer': self.context.get('answer'),
'content': self.context.get('content'),
'type': self.node.type,
'status': self.status,
'err_message': self.err_message,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The provided code has several issues and could be improved:

Issues:

  1. Variable Naming: The variable name stt_model is not defined. It should be replaced with the actual model instance used for speech-to-text conversion.

  2. Splitter Consistency: The line splitter = '\n-----------------------------------\n'; uses double backslashes \\, which will cause issues because it won't escape the newlines correctly. You should use a single backslash \.

  3. Content Handling: The logic for handling content seems incorrect. Instead of adding all items into a single content list, a nested loop approach would allow you to add each item's details under the corresponding file.

  4. Error Handling: Ensure that any errors related to saving or processing files are caught and handled appropriately in your error handling mechanism.

  5. Node Result Structure: The structure of NodeResult might need adjustment based on how your application expects results to be formatted. For example, using dictionaries instead of lists can make more sense if each key-value pair represents a different piece of data.

Improvements:

  1. Fix Variable Name:
  • Replace stt_model here with the actual model instance you're using


2. **Correct Backslashes**:
   ```python
+    splitter = r'\n-------------------------\n'
  1. Optimized Content Handling:
    Change the way content is collected:

    def process_audio_items(audio_list, model):
        result_with_content = {}
        
        with ThreadPoolExecutor(max_workers=5) as executor:
            futures = [executor.submit(process_audio_item, item, model) for item in audio_list]
            
            for future in futures:
                audio_data, transcription = future.result()
                file_name = audio_data["file_name"]
                result_with_content[file_name] = {
                    "transcription": transcription,
                    "content": [
                        f"### {file_name}\n{transcription}"
                    ]
                }
    
        return NodeResult({"data": result_with_content})
  2. Consistent Error Handling:
    Implement proper error logging and handling mechanisms within the process_audio_item method and other parts of the function where file operations occur.

  3. Review Node Result Structure:
    If needed, adjust NodeResult to better fit your data needs. Consider returning dictionaries with clear keys indicating what information each part represents (e.g., answer, results, etc.)

Here’s an updated version incorporating some of these improvements:

from joblib.pool import ThreadPoolExecutor
from typing import Dict

from .models import File
from setting.models_provider.tools import get_model_instance_by_model_user_id
import os

# Assuming this is the correct model instance for speech recognition
stt_model = get_model_instance_by_model_user_id(user_id)

def process_audio_item(audio_item):
    temp_mp3_path = f"/tmp/temp_{audio_item.file.filename}.mp3"
    try:
        any_to_mp3(audio_item.file.path, temp_mp3_path)
        
        transcription_result = split_and_transcribe(temp_mp3_path, stt_model)
        transcriptions = [{file.file_name: entry} for file, entries in transcription_result.items()]
        return {'file': audio_item.file.filename, 'entries': transcriptions}
    finally:
        os.remove(temp_mp3_path)

class BaseSpeechToTextNode ISpeechToTextNode):

    ...

    def process_audio_items(self, audio_list):
        # Use dictionary comprehension to store results directly
        results_dicts = []
        with ThreadPoolExecutor(max_workers=5) as executor:
            futures = [executor.submit(process_audio_item, audio_item) for audio_item in audio_list]
            
            for future in futures:
                audio_data_dict = future.result()
                results_dicts.extend(audio_data_dict['entries'])

        flattened_results = [{'file': d['file'], 'entry': e} for sub_d in results_dicts for e in sub_d['entries']]
        return NodeResult({'flattenedResults': flattened_results})

    ...

@@ -544,7 +551,7 @@
:modelValue="item.answer"
style="background: none"
/>
<template v-else> - </template>
<template v-else> -</template>
</div>
</div>
</template>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code provided has some minor style and performance improvement opportunities, as well as a small bug fix related to conditional rendering.

Regularities: The card component is using shadow="never", which can be improved for better user interface consistency. The <div> containing text content and the Markdown preview are nested inside each other without proper separation, which looks visually cluttered.

Potential Issues: There might not be any direct functional issues with this code snippet based on its current structure, but ensuring consistent styling across components would improve user experience.

Optimization Suggestions:

  1. Style Improvement: Consider updating the theme or customizing the appearance of the div/card to enhance readability.

    --- old/code.js 2023-10-05 15:40:05.689Z +0530  2024-01-07 12:00:00.789Z +0530
    +++ new/code.js 2024-01-07 13:05:03.189Z +0530  2024-01-07 13:05:03.189Z +0530
    @@ -293,15 +293,22 @@
                        <div class="card-ever border-r-4">
                        <h5 class="p-8-12">参数输出</h5>
                        <div class="p-8-12 border-t-dashed lighter">
    -                      <p class="mb-8 color-secondary">文本内容:</p>
    -                      <div v-if="item.answer">
    +                      <el-card
    +                        shadow="always"
  2. Bug Fix: Ensure that the v-else block renders correctly. You should update it consistently to match the previous line's logic (i.e., showing -).

  3. Readability Enhancement: Break down complex expressions into smaller parts, which can be easier to read and debug.

    --- old/code.js 2023-10-05 15:40:05.689Z +0530  2024-01-07 12:00:00.789Z +0530
    +++ new/code.js 2024-01-07 13:05:03.189Z +0530  2024-01-07 13:05:03.189Z +0530
    @@ -293,15 +293,22 @@
    --- old/code.js 2023-10-05 15:40:05.689Z +0530  2024-01-07 12:00:00.789Z +0530
    +++ new/code.js 2024-01-07 13:05:03.189Z +0530  2024-01-07 13:05:03.189Z +0530

These changes will make the code more coherent and maintainable while improving its overall aesthetic appeal.

@@ -70,7 +70,7 @@ def save_image(image_list):
# 回到文件头
buffer.seek(0)
file_content = split_handle.get_content(buffer, save_image)
content.append('## ' + doc['name'] + '\n' + file_content)
content.append('### ' + doc['name'] + '\n' + file_content)
break

return NodeResult({'content': splitter.join(content)}, {})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a minor error in the code that needs correction:

Replace this line:
content.append('## ' + doc['name'] + '\n' + file_content)

With this line:
content.append(f"### {doc['name']}\n{file_content}")

Specifically, you should use an f-string to escape the double quotes properly within your string interpolation. This will ensure proper formatting when adding each document's name and content.

Also, while there is no significant issue with the rest of the function (such as improper handling exceptions or inefficiencies), it doesn't require any major optimizations for its current implementation. However, make sure that the variable names (buffer, split_handle, splitter) are appropriate for their usage context and adhere to common coding conventions for clarity and maintainability.

…pr@main@refactor_sst_node

# Conflicts:
#	apps/application/flow/step_node/speech_to_text_step_node/impl/base_speech_to_text_node.py
@@ -544,7 +551,7 @@
:modelValue="item.answer"
style="background: none"
/>
<template v-else> - </template>
<template v-else> -</template>
</div>
</div>
</template>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code provided appears to be part of an HTML template using Vue.js with Element Plus. Several areas can be improved:

  1. Duplicated Class Names: The <div> element has both border-r-4 and border-t-dashed lighter classes applied, which might lead to redundant styling.

  2. Conditional Rendering within Nested Iteration: There is an unnecessary condition (<div v-if="item.content.length > 0">) inside a loop that iterates over file_content. If you only want the outermost iteration's content to render if there are contents at all, this should also be conditionalized on whether file_content array has items.

  3. Consistent Template Tags: Ensure consistent use of double quotes throughout the HTML structure, e.g., <button type='submit'>Submit</button> instead of <button type=submit>Show</button>.

Here is a cleaned up version based on these suggestions:

<div class="card-never border-r-4">
  <h5 class="p-8-12">参数输出</h5>
  <div class="p-8-12 border-t-dashed lighter" v-if="item.content && item.content.length > 0">
    <template v-for="(file_content, index) in item.content" :key="index">
      <el-card shadow="never" style="--el-card-padding: 8px" class="mb-8">
        <MdPreview v-if="file_content" ref="editorRef" editorId="preview-only" :modelValue="file_content" style="background: none"/>
        <template v-else>-</template>
      </el-card>
    </template>
  </div>
</div>

<!-- Similar changes will apply to the other similar sections -->

Additional Comments:

  • The --el-card-padding CSS custom property seems incorrect; ensure it reflects proper padding values used in your project or remove it as per design.
  • Consider adding error handling for cases where item.answer is unexpectedly null or undefined in case of potential data issues.


def get_details(self, index: int, **kwargs):
return {
'name': self.node.properties.get('stepName'),
"index": index,
'run_time': self.context.get('run_time'),
'answer': self.context.get('answer'),
'content': self.context.get('content'),
'type': self.node.type,
'status': self.status,
'err_message': self.err_message,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code has a few issues that need addressing to improve its correctness and maintainability:

  1. The process_audio_item function does not handle errors internally or raise them appropriately, which can lead to silent failures if an issue occurs during processing.

  2. The split_and_transcribe function call should be wrapped in error handling to manage cases where it might fail, possibly causing the entire script to crash if the transcription fails unexpectedly.

  3. The return values from process_audio_items are strings joined by newlines (\n\n). This may result in overly verbose output for small lists of items. Consider returning dictionaries or using more consistent formatting.

  4. The context dictionary is being modified multiple times without proper synchronization or thread safety when accessed concurrently, which could lead to data inconsistencies across different threads.

  5. There seems to be unnecessary duplication within how the results are returned; the final result content is already available in variables called result_content, which duplicates effort.

  6. Ensure that file paths generated are valid before attempting to write or remove files to avoid exceptions due to invalid paths.

Here's an improved version of the code incorporating these suggestions:

def process_audio_item(audio_item, model):
    try:
        temp_mp3_path = generate_temporary_file()
        any_to_mp3(audio_item.path, temp_mp3_path)
        transcription = split_and_transcribe(temp_mp3_path, model)
        os.remove(temp_mp3_path)
        return {audio_item.id: transcription}
    except Exception as e:
        # Handle exceptions and log/raise appropriate
        print(f"Error processing audio item {audio_item.id}: {e}")
        return None

def process_audio_items(audio_list, model):
    with ThreadPoolExecutor(max_workers=5) as executor:
        future_results = [executor.submit(process_audio_item, item, model) for item in audio_list]
        results = [future.result() for future in concurrent.futures.as_completed(future_results)]
    
    # Combine and filter out None values (failed items)
    combined_results = [{item_id: transcr} for res in results if res for item_id, transcr in res.items()]
    
    # Flatten the list into individual transcription strings
    result_content = ''.join([transcription for _id, transcription in combined_results])
    
    return {'answer': result_content, 'result': result_content}

def get_details(self, index: int, **kwargs):
    response_data = {
        'name': self.node.properties.get('stepName', ''),
        'index': index,
        'run_time': self.context.get('run_time', {}),
        'answer': self.context.get('answer', ''),
+       'content': self.context.get('content', []),
        'type': self.node.type,
        'status': self.status,
        'err_message': self.err_message
    }
    response_data['items'] = [{'text': t} for _, t in combined_results]
    return response_data

In this revision:

  • Error handling is added to ensure exceptions are caught and reported.
  • A temporary file is created per audio item, then removed after processing, reducing the risk of disk space issues.
  • Results are filtered to exclude failed attempts, ensuring only successful transcriptions are included.
  • The final result is flattened into a single string instead of joining on newline characters.
  • An example response format for displaying the content under each transcript item is provided in the get_details method. Adjust according to your actual requirements.

@@ -70,7 +70,7 @@ def save_image(image_list):
# 回到文件头
buffer.seek(0)
file_content = split_handle.get_content(buffer, save_image)
content.append('## ' + doc['name'] + '\n' + file_content)
content.append('### ' + doc['name'] + '\n' + file_content)
break

return NodeResult({'content': splitter.join(content)}, {})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is mostly correct but it can be improved for more readability and possibly for better error handling. Here's a revised version:

def save_image(image_list):
    buffer.seek(0)
    content = []
    
    for doc in image_list:
        try:
            file_content = split_handle.get_content(buffer, save_image)
            content.append(f"### {doc['name']}\n{file_content}")
        except Exception as e:
            print(f"Error processing document {doc['name']}:", e)
    
    return NodeResult({'content': splitter.join(content)}, {})

Corrections and Suggestions:

  1. Use f-string for formatting: This makes the code more readable than concatenating strings with '+'.

  2. Error Handling: Added a try-except block to catch any exceptions that might occur when processing each document. This helps in logging errors without breaking the execution of other documents.

  3. Improved Error Message: Changed the error message to include both the problematic document name and the specific exception occurred. This provides context for debugging.

These changes make the code cleaner and more robust.

@wxg0103 wxg0103 changed the title refactor: optimize the display of SST nodes refactor: optimize the display of STT nodes Dec 19, 2024
@wxg0103 wxg0103 closed this Dec 19, 2024
@maninhill maninhill deleted the pr@main@refactor_sst_node branch December 30, 2024 06:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants