fix: fix tts node and stt node error #1908

shaohuzhang1 · 2024-12-25T07:14:52Z

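fix: fix tts node and stt node error

--bug=1050817 --user=王孝刚 [Application Orchestration] Text-to-speech: the text is not very long, but the model reports that it is too long https://www.tapd.cn/57709429/s/1636787
--bug=1050821 --user=王孝刚 [Application Orchestration] Speech-to-text: the error message says the image is missing file_id https://www.tapd.cn/57709429/s/1636786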


f2c-ci-robot · 2024-12-25T07:14:56Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

f2c-ci-robot · 2024-12-25T07:15:00Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shaohuzhang1 · 2024-12-25T07:15:12Z

apps/setting/models_provider/impl/volcanic_engine_model_provider/model/tts.py

@@ -113,7 +113,7 @@ async def submit(self, request_json, text):
        result = b''
        async with websockets.connect(self.volcanic_api_url, extra_headers=header, ping_interval=None,
                                      ssl=ssl_context) as ws:
-            lines = text.split('\n')
+            lines = [text[i:i + 200] for i in range(0, len(text), 200)]
            for line in lines:
                if self.is_table_format_chars_only(line):
                    continue


There are no obvious issues in this snippet (the text_to_speech and submit functions). Here are some general suggestions for improvement:

Line Length: The list comprehension before the for loop caps each line at 200 characters, so every chunk processed by the loop has a bounded length.

String Splitting Limitation: Slicing at a fixed 200-character offset can cut a word or sentence in half. Depending on your requirements, splitting at whitespace or sentence boundaries may be preferable.
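The sentence-boundary suggestion can be sketched as follows. This is a minimal illustration, not the code in this PR: the helper name `chunk_text` and the punctuation set are assumptions, and the 200-character limit is taken from the diff above.

```python
import re


def chunk_text(text, max_len=200):
    """Split text into chunks of at most max_len characters.

    Hypothetical helper: prefers to break after sentence-ending
    punctuation or a newline, and falls back to a hard cut only
    when a single sentence is longer than max_len.
    """
    # Zero-width split: each piece keeps its trailing punctuation.
    sentences = re.split(r'(?<=[.!?。！？\n])', text)
    chunks = []
    current = ''
    for sentence in sentences:
        # A sentence longer than the limit is cut at fixed offsets.
        while len(sentence) > max_len:
            if current:
                chunks.append(current)
                current = ''
            chunks.append(sentence[:max_len])
            sentence = sentence[max_len:]
        # Start a new chunk when the sentence no longer fits.
        if len(current) + len(sentence) > max_len:
            chunks.append(current)
            current = ''
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Joining the returned chunks reproduces the input exactly, so no characters are lost; only the cut points differ from the fixed-offset slicing in the diff.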

Here is an optimized version of the code considering these points:

    import websockets


    class YourClassName:
        # ... (rest of the class)

        def __init__(self, volcanic_api_url, params, ssl_context):
            self.volcanic_api_url = volcanic_api_url
            self.params = params or {}
            self.ssl_context = ssl_context

        async def submit(self, request_json, text):
            result = b''
            header = request_json.get('header', {})
            async with websockets.connect(self.volcanic_api_url, extra_headers=header, ping_interval=None,
                                          ssl=self.ssl_context) as ws:
                # Process the text in fixed 200-character chunks
                lines = [text[i:i + 200] for i in range(0, len(text), 200)]
                for line in lines:
                    if self.is_table_format_chars_only(line):
                        continue
                    # ... (rest of the loop body)

Summary of Changes:

Used a list comprehension to split the text into fixed 200-character chunks instead of splitting on newlines.

Kept other parts of the code structure similar while focusing on readability improvements.

shaohuzhang1 · 2024-12-25T07:15:26Z

apps/application/flow/step_node/text_to_speech_step_node/impl/base_text_to_speech_node.py

-        return NodeResult({'answer': audio_label, 'result': audio_label}, {})
+        file_id = file_url.split('/')[-1]
+        audio_list = [{'file_id': file_id, 'file_name': file_name, 'url': file_url}]
+        return NodeResult({'answer': audio_label, 'result': audio_list}, {})

    def get_details(self, index: int, **kwargs):
        return {
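The hunk above takes the last path segment of file_url as the file_id and returns a list of file descriptors as the node result. A minimal sketch of that step in isolation, assuming a URL whose final segment is the id (the function name and the example URL shape are hypothetical):

```python
def build_audio_list(file_url, file_name):
    """Build the one-element audio list returned in the node result.

    Mirrors the three fields used in the diff: file_id, file_name, url.
    """
    # The id is assumed to be the final path segment of the URL.
    file_id = file_url.split('/')[-1]
    return [{'file_id': file_id, 'file_name': file_name, 'url': file_url}]
```

Note that a URL ending in a slash would yield an empty file_id with this approach, so the caller has to pass a URL without a trailing slash.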


The provided code seems to have two main issues:

Missing Import Statement: The FileSerializer class is used in the function but its import statement is missing from the snippet.

Security Concerns: Using user-provided files without sanitization can lead to security risks such as directory traversal and XSS attacks. Ensure that you sanitize file names before using them, especially if they could be manipulated by clients.

    import os  # For handling filesystem paths


    def get_file_id_and_name(file_path):
        # Keep only the final path component so that directory parts are discarded
        _, filename = os.path.split(file_path)
        # Replace spaces with underscores for safer filenames (optional)
        return filename.replace(" ", "_")

Here's how you might modify the relevant parts of the code based on these comments:

    import io

    from storages.backends.s3boto3 import S3Boto3Storage  # provided by django-storages
    # FileSerializer must also be imported from the project's own module (path not shown in this snippet)
    # ... rest of the imports

    BLOCK_SIZE = 4096  # assumed chunk size; not defined in the original comment


    def upload_file_to_s3(file_content, file_name):
        s3_storage = S3Boto3Storage(bucket_name='your-bucket-name')
        # save() returns the name under which the file was stored
        return s3_storage.save(f'path/to/uploads/{get_file_id_and_name(file_name)}', file_content)


    def execute(self, tts_model_id, chat_id, voice_text=None, text_type="text", audio=True):
        if text_type == "image" or not audio:
            return None

        file_id = self.get_new_uuid()
        file_name = 'output.mp3'
        contentDisposition = f"attachment; filename=\"{file_name}\""

        response = self.api_client.voice(text=voice_text or "", type=text_type, outputFormat="", encodingType="")
        content_length = int(response.headers.get('content-length'))

        data_stream = io.BytesIO()
        while True:
            # Read up to BLOCK_SIZE * 10 bytes at once
            read_size = min(content_length, BLOCK_SIZE * 10)
            buffer_read = response.raw.read(read_size)
            if not buffer_read:
                break
            data_stream.write(buffer_read)
            content_length -= len(buffer_read)

        data_stream.seek(0)  # Reset the position back to the start of the stream
        uploaded_file_obj = upload_file_to_s3(data_stream, file_name=file_name)
        data_stream.close()  # Close the BytesIO stream

        try:
            file_serializer_data = {**self.default_params, 'file': uploaded_file_obj}
            if 'voice_id' in response.json().keys():
                file_serializer_data['voice'] = str(int(response.json()['voice_id']))
            else:
                file_serializer_data['language'] = str(response.json()[text_type]['lang'])
            file_url = FileSerializer(**file_serializer_data).upload()
        except Exception as err:
            # Surface the failure instead of continuing without a file_url
            raise RuntimeError('[Exception] Error occurred while uploading voice file.') from err

        # Create a secure URL to allow embedding the audio player
        # (secure_url_builder is a hypothetical helper from the review comment)
        file_url_secure = self.secure_url_builder.append_suffix_if_not_exists(url=file_url, suffix="/player.html")

        # Build the src attribute of the audio tag
        audio_label = f'<audio src="{file_url_secure}" controls style = "width: 300px; height: 43px"></audio>'

        # Return results as expected
        return NodeResult({'answer': audio_label, 'result': [uploaded_file_obj]}, {})

Key Changes Made:

Import Statement: Added an import statement for FileSerializer.

Error Handling: Implemented exception handling to manage errors during the API call.

Secure URL Creation: Used a method to create a secure URL allowing embedding of the audio player safely.

Content-Length Logic: Improved logic to handle reading chunks of data efficiently until no more data is available, mitigating memory issues when dealing with large uploads.

Upload to S3: Created a helper function upload_file_to_s3 to handle the upload process securely.

Sanitized File Names: Safely processed file names by replacing spaces with underscores for better security during storage and usage.

shaohuzhang1 · 2024-12-25T07:15:31Z

apps/application/flow/step_node/speech_to_text_step_node/i_speech_to_text_node.py

@@ -27,7 +27,7 @@ def _run(self):
                                                       self.node_params_serializer.data.get('audio_list')[1:])
        for audio in res:
            if 'file_id' not in audio:
-                raise ValueError("参数值错误: 上传的图片中缺少file_id，音频上传失败")
+                raise ValueError("参数值错误: 上传的音频中缺少file_id，音频上传失败")

        return self.execute(audio=res, **self.node_params_serializer.data, **self.flow_params_serializer.data)



The code looks mostly correct, but there are two points to consider:

The old exception message was misleading because it referred to "图片" (image), although this node processes audio files. The change replaces it with "音频" (audio).

You should ensure that the 'audio_list' entry in self.node_params_serializer.data exists before slicing and iterating over it.

Here's a revised version of the function to address these issues:

    def _run(self):
        # Assuming node_params_serializer and flow_params_serializer are instances
        audio_list = self.node_params_serializer.data.get('audio_list')

        # Fail early if 'audio_list' is missing or is not a list
        if not isinstance(audio_list, list):
            # "audio_list not found; please check the request data."
            raise ValueError("未找到audio_list，请检查请求数据。")

        # NOTE: the original hunk builds `res` through a call whose first line is not
        # shown in the diff; this sketch slices the list directly.
        res = list(audio_list[1:])
        for audio in res:
            if 'file_id' not in audio:
                raise ValueError("参数值错误: 上传的音频中缺少file_id，音频上传失败")

        return self.execute(audio=res, **self.node_params_serializer.data, **self.flow_params_serializer.data)

This ensures that we only attempt to iterate over the audio list if it exists. If it does not exist, appropriate error handling is provided.
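The same check can be exercised outside the node class. The sketch below is an illustration under assumptions: the function name `require_file_ids` and the English messages are hypothetical, and it operates on an already resolved list of audio entries rather than on the serializer data.

```python
def require_file_ids(audio_items):
    """Return audio_items unchanged if every entry carries a file_id.

    Raises ValueError when the list is missing or an entry lacks
    the 'file_id' key, matching the intent of the check in _run.
    """
    if not isinstance(audio_items, list):
        raise ValueError('audio list not found; check the request data')
    for index, audio in enumerate(audio_items):
        if not isinstance(audio, dict) or 'file_id' not in audio:
            raise ValueError(f'audio entry {index} is missing file_id; audio upload failed')
    return audio_items
```

Keeping the validation in a plain function makes it easy to unit-test the two failure cases (missing list, missing file_id) separately from the workflow machinery.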

f2c-ci-robot bot added the do-not-merge/release-note-label-needed label Dec 25, 2024

wxg0103 merged commit bd8d848 into main Dec 25, 2024
4 checks passed

wxg0103 deleted the pr@main@fix_1050821_1050817 branch December 25, 2024 07:15
