fix: fix tts node and stt node error #1908
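--bug=1050817 --user=王孝刚 [Application Orchestration] Text-to-speech: the text is not very long, but the model reports that it is too long https://www.tapd.cn/57709429/s/1636787
--bug=1050821 --user=王孝刚 [Application Orchestration] Speech-to-text: the error message says the image is missing file_id https://www.tapd.cn/57709429/s/1636786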
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -113,7 +113,7 @@ async def submit(self, request_json, text):
         result = b''
         async with websockets.connect(self.volcanic_api_url, extra_headers=header, ping_interval=None,
                                       ssl=ssl_context) as ws:
-            lines = text.split('\n')
+            lines = [text[i:i + 200] for i in range(0, len(text), 200)]
             for line in lines:
                 if self.is_table_format_chars_only(line):
                     continue
There are no obvious irregularities or issues in this code snippet, based on the information provided for the functions text_to_speech and submit. However, here are some general suggestions for improvement:
- Chunk Size: The list comprehension slices the text into fixed chunks of at most 200 characters before the loop, so the length of each chunk no longer depends on where newlines fall in the input.
- String Splitting Limitation: Unlike the previous split on \n, fixed-width slicing can cut a word or sentence in half. If that matters for your use case, consider adding logic to split at whitespace or sentence boundaries instead, depending on your requirements.
Here is an optimized version of the code considering these points:
import websockets


class YourClassName:
    # ... (rest of the class)

    def __init__(self, volcanic_api_url, params, ssl_context):
        self.volcanic_api_url = volcanic_api_url
        self.params = params or {}
        self.ssl_context = ssl_context

    async def submit(self, request_json, text):
        result = b''
        # Assumes the handshake headers are carried in request_json['header']
        header = request_json.get('header', {})
        # extra_headers is the keyword used in the diff under review
        async with websockets.connect(self.volcanic_api_url, extra_headers=header, ping_interval=None,
                                      ssl=self.ssl_context) as ws:
            # Slice the text into chunks of at most 200 characters
            lines = [text[i:i + 200] for i in range(0, len(text), 200)]
            for line in lines:
                if self.is_table_format_chars_only(line):
                    continue
Summary of Changes:
- Stored the URL and the SSL context on the instance in __init__.
- Used a list comprehension to split the text into fixed 200-character chunks.
- Kept the rest of the code structure the same, focusing on readability improvements.
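The suggestion above about splitting at sentence boundaries could be sketched as follows. This is only an illustration, not the code merged in this PR: the function name split_text, the max_len default of 200 (taken from the diff), and the set of sentence-ending characters are all assumptions.

```python
import re
from typing import List


def split_text(text: str, max_len: int = 200) -> List[str]:
    """Split text into chunks of at most max_len characters, preferring sentence ends."""
    # Split right after sentence-ending punctuation (Latin and CJK) or a newline,
    # keeping the punctuation attached to the sentence it ends.
    sentences = re.split(r"(?<=[.!?。!?\n])", text)
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-sliced as a fallback.
        pieces = [sentence[i:i + max_len] for i in range(0, len(sentence), max_len)]
        for piece in pieces:
            # Start a new chunk when adding this piece would exceed the limit.
            if len(current) + len(piece) > max_len:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Joining the returned chunks reproduces the input exactly, so no text is lost; only the cut points change compared with fixed-width slicing.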
-        return NodeResult({'answer': audio_label, 'result': audio_label}, {})
+        file_id = file_url.split('/')[-1]
+        audio_list = [{'file_id': file_id, 'file_name': file_name, 'url': file_url}]
+        return NodeResult({'answer': audio_label, 'result': audio_list}, {})
 
     def get_details(self, index: int, **kwargs):
         return {
The provided code seems to have two main issues:
- Missing Import Statement: The FileSerializer class is used in the function, but its import statement is missing from the snippet.
- Security Concerns: Using user-provided files without sanitization can lead to security risks such as directory traversal and XSS attacks. Sanitize file names before using them, especially if clients can manipulate them.
import os  # For handling filesystem paths


def get_file_id_and_name(file_path):
    # Keep only the final path component so directory parts cannot leak through
    filename = os.path.basename(file_path)
    # Replace spaces with underscores for safer filenames (optional)
    return filename.replace(" ", "_")
Here's how you might modify the relevant parts of the code based on these comments:
import io

from storages.backends.s3boto3 import S3Boto3Storage  # provided by the django-storages package
# FileSerializer must also be imported here; its module path is not shown in the snippet.
# ... rest of the imports

BLOCK_SIZE = 8192  # assumed chunk size; the snippet does not define it


def upload_file_to_s3(file_content, file_name):
    s3_storage = S3Boto3Storage(bucket_name='your-bucket-name')
    # save() returns the name the file was actually stored under
    return s3_storage.save(f'path/to/uploads/{get_file_id_and_name(file_name)}', file_content)


def execute(self, tts_model_id, chat_id, voice_text=None, text_type="text", audio=True):
    if text_type == "image" or not audio:
        return None
    # self.get_new_uuid, self.api_client and self.secure_url_builder are assumed helpers
    file_id = self.get_new_uuid()
    file_name = 'output.mp3'
    contentDisposition = f"attachment; filename=\"{file_name}\""
    response = self.api_client.voice(text=voice_text or "", type=text_type, outputFormat="", encodingType="")
    content_length = int(response.headers.get('content-length'))
    data_stream = io.BytesIO()
    while True:
        read_size = min(content_length, BLOCK_SIZE * 10)  # Read up to BLOCK_SIZE * 10 bytes at once
        buffer_read = response.raw.read(read_size)
        if not buffer_read:
            break
        data_stream.write(buffer_read)
        content_length -= len(buffer_read)
    data_stream.seek(0)  # Reset the position back to the start of the stream
    uploaded_file_obj = upload_file_to_s3(data_stream, file_name=file_name)
    data_stream.close()  # Close the BytesIO stream
    try:
        file_serializer_data = {**self.default_params, 'file': uploaded_file_obj}
        if 'voice_id' in response.json().keys():
            file_serializer_data['voice'] = str(int(response.json()['voice_id']))
        else:
            file_serializer_data['language'] = str(response.json()[text_type]['lang'])
        file_url = FileSerializer(**file_serializer_data).upload()
    except Exception as err:
        # Re-raise so that file_url is never used while undefined
        raise RuntimeError('[Exception] Error occurred while uploading voice file.') from err
    # Create a secure URL to allow embedding the audio player
    file_url_secure = self.secure_url_builder.append_suffix_if_not_exists(url=file_url, suffix="/player.html")
    # Build the src attribute of an audio tag
    audio_label = f'<audio src="{file_url_secure}" controls style = "width: 300px; height: 43px"></audio>'
    # Return results as expected
    return NodeResult({'answer': audio_label, 'result': [uploaded_file_obj]}, {})
Key Changes Made:
- Import Statement: Added an import statement for FileSerializer.
- Error Handling: Implemented exception handling to manage errors during the API call.
- Secure URL Creation: Used a method to create a secure URL allowing embedding of the audio player safely.
- Content-Length Logic: Improved logic to handle reading chunks of data efficiently until no more data is available, mitigating memory issues when dealing with large uploads.
- Upload to S3: Created a helper function upload_file_to_s3 to handle the upload process securely.
- Sanitized File Names: Safely processed file names by replacing spaces with underscores for better security during storage and usage.
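Replacing spaces alone does not address the directory traversal concern raised in this comment. A stricter approach could look like the sketch below, which keeps only the last path component and a whitelist of characters. The function name sanitize_file_name, the whitelist, and the fallback name upload.bin are assumptions made for illustration and are not part of the PR.

```python
import re
from pathlib import PurePosixPath

# Characters allowed in the stored file name; everything else becomes "_".
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9._-]")


def sanitize_file_name(raw_name: str, default: str = "upload.bin") -> str:
    """Return a file name with no directory parts and only whitelisted characters."""
    # Normalise Windows separators so the last component is found on any OS.
    name = PurePosixPath(raw_name.replace("\\", "/")).name
    # Replace every character outside the whitelist with an underscore.
    name = _UNSAFE_CHARS.sub("_", name)
    # Drop leading dots so neither ".." nor hidden-file names survive.
    name = name.lstrip(".")
    # Fall back to a fixed name when nothing usable is left.
    return name or default
```

A whitelist is used rather than a blacklist because it stays safe when an unexpected character shows up; the trade-off is that non-ASCII file names are flattened to underscores.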
@@ -27,7 +27,7 @@ def _run(self):
                                            self.node_params_serializer.data.get('audio_list')[1:])
         for audio in res:
             if 'file_id' not in audio:
-                raise ValueError("参数值错误: 上传的图片中缺少file_id,音频上传失败")
+                raise ValueError("参数值错误: 上传的音频中缺少file_id,音频上传失败")
 
         return self.execute(audio=res, **self.node_params_serializer.data, **self.flow_params_serializer.data)
The code looks mostly correct, but there are two points to consider:
- The original exception message is slightly misleading because it refers to "图片" (image), which suggests a problem with an image file rather than an audio file. The new wording uses "音频" (audio).
- You should ensure that the audio_list entry in self.node_params_serializer.data exists before iterating over it.
Here's a revised version of the function to address these issues:
def _run(self):
    # Assuming node_params_serializer and flow_params_serializer are serializer instances
    audio_list = self.node_params_serializer.data.get('audio_list')
    # Fail early if 'audio_list' is missing or is not a list
    if not isinstance(audio_list, list):
        # Message: "audio_list not found, please check the request data."
        raise ValueError("未找到audio_list,请检查请求数据。")
    # The first element is skipped, as in the [1:] slice of the code under review
    res = list(audio_list[1:])
    for audio in res:
        if 'file_id' not in audio:
            # Message: "Invalid parameter value: the uploaded audio is missing file_id, audio upload failed"
            raise ValueError("参数值错误: 上传的音频中缺少file_id,音频上传失败")
    return self.execute(audio=res, **self.node_params_serializer.data, **self.flow_params_serializer.data)
This ensures that we only attempt to iterate over the audio list if it exists. If it does not exist, appropriate error handling is provided.
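The same check can also be written as a standalone helper that is easy to unit-test outside the node class. The sketch below is a hypothetical refactoring: the name validate_audio_list and the English error messages are assumptions, and the project's own messages are in Chinese.

```python
from typing import Any, Dict, List


def validate_audio_list(audio_list: Any) -> List[Dict[str, Any]]:
    """Return audio_list unchanged if every item carries a file_id, else raise ValueError."""
    # Reject a missing value (None) or any non-list type up front.
    if not isinstance(audio_list, list):
        raise ValueError("audio_list is missing or is not a list")
    for index, audio in enumerate(audio_list):
        # Each item must be a mapping that contains the 'file_id' key.
        if not isinstance(audio, dict) or "file_id" not in audio:
            raise ValueError(f"audio item {index} is missing file_id")
    return audio_list
```

Including the item index in the message makes it easier to tell which upload in a multi-file request failed.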