fix: fix tts node and stt node error #1908

Merged
merged 1 commit into from
Dec 25, 2024


shaohuzhang1
Contributor

fix: fix tts node and stt node error

--bug=1050817 --user=王孝刚 [Application Orchestration] Text-to-speech: the text is not very long, but the model reports that it is too long https://www.tapd.cn/57709429/s/1636787
--bug=1050821 --user=王孝刚 [Application Orchestration] Speech-to-text: the error message says the image is missing file_id https://www.tapd.cn/57709429/s/1636786

f2c-ci-robot bot commented Dec 25, 2024

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


f2c-ci-robot bot commented Dec 25, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -113,7 +113,7 @@ async def submit(self, request_json, text):
         result = b''
         async with websockets.connect(self.volcanic_api_url, extra_headers=header, ping_interval=None,
                                       ssl=ssl_context) as ws:
-            lines = text.split('\n')
+            lines = [text[i:i + 200] for i in range(0, len(text), 200)]
             for line in lines:
                 if self.is_table_format_chars_only(line):
                     continue
Contributor Author

There are no obvious issues in this snippet (the text_to_speech and submit functions). However, here are some general suggestions for improvement:

  1. Chunk Size: The list comprehension now splits the text into fixed-size chunks of at most 200 characters instead of splitting on \n, so a long paragraph without line breaks is no longer sent as a single piece.

  2. String Splitting Limitation: Fixed-width slicing can cut a word or sentence in the middle. Splitting at whitespace or sentence boundaries might be beneficial, depending on your requirements.

Here is an optimized version of the code considering these points:

import websockets

class YourClassName:
    # ... (rest of the class)

    def __init__(self, volcanic_api_url, params, ssl_context):
        self.volcanic_api_url = volcanic_api_url
        self.params = params or {}
        self.ssl_context = ssl_context

    async def submit(self, request_json, text):
        result = b''
        header = request_json.get('header', {})

        # extra_headers matches the keyword used in the original code
        async with websockets.connect(self.volcanic_api_url, extra_headers=header, ping_interval=None,
                                      ssl=self.ssl_context) as ws:
            # Split the text into chunks of at most 200 characters
            lines = [text[i:i + 200] for i in range(0, len(text), 200)]

            for line in lines:
                # Skip chunks that contain only table-formatting characters
                if self.is_table_format_chars_only(line):
                    continue
                # ... (send the chunk and collect the audio, as in the original)

Summary of Changes:

  • Used a list comprehension to split the text into chunks of at most 200 characters.
  • Kept other parts of the code structure similar while focusing on readability improvements.
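
The suggestion about splitting at sentence boundaries could be sketched roughly as follows. This is only an illustration, not code from the PR: the function name split_text and the punctuation set are assumptions, and the 200-character limit is taken from the diff above. Sentences are packed into chunks up to the limit, and any single sentence longer than the limit falls back to fixed-width slicing.

```python
import re


def split_text(text: str, max_len: int = 200) -> list[str]:
    """Split text into chunks of at most max_len characters, preferring sentence ends."""
    # Split right after sentence-ending punctuation (Western and CJK) or a newline.
    sentences = [s for s in re.split(r'(?<=[.!?。!?\n])', text) if s]
    chunks: list[str] = []
    current = ''
    for sentence in sentences:
        # A single sentence may still exceed the limit, so hard-slice it if needed.
        pieces = [sentence[i:i + max_len] for i in range(0, len(sentence), max_len)]
        for piece in pieces:
            # Start a new chunk when adding this piece would exceed the limit.
            if len(current) + len(piece) > max_len:
                chunks.append(current)
                current = ''
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Joining the returned chunks reproduces the input exactly, so no text is dropped; whether sentence-aligned chunks actually sound better depends on the TTS model.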

-        return NodeResult({'answer': audio_label, 'result': audio_label}, {})
+        file_id = file_url.split('/')[-1]
+        audio_list = [{'file_id': file_id, 'file_name': file_name, 'url': file_url}]
+        return NodeResult({'answer': audio_label, 'result': audio_list}, {})
 
     def get_details(self, index: int, **kwargs):
         return {
Contributor Author

The provided code seems to have two main issues:

  1. Missing Import Statement: The FileSerializer class is used in the function but its import statement is missing from the snippet.

  2. Security Concerns: Using user-provided files without sanitization can lead to security risks such as directory traversal and XSS attacks. Ensure that you sanitize file names before using them, especially if they could be manipulated by clients.

    import os  # For handling filesystem operations

    def get_file_id_and_name(file_path):
        # Keep only the final path component to guard against directory traversal
        filename = os.path.basename(file_path)
        # Replace spaces with underscores for safer filenames (optional)
        return filename.replace(" ", "_")

Here's how you might modify the relevant parts of the code based on these comments:

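import io  # Needed for the BytesIO buffer below

import FileSerializer  # Placeholder: import this from the module that actually defines FileSerializer
from storages.backends.s3boto3 import S3Boto3Storage  # Provided by django-storages
# ... rest of the imports

BLOCK_SIZE = 8192  # Assumed chunk size in bytes; not defined in the original snippet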

def upload_file_to_s3(file_content, file_name):
    s3_storage = S3Boto3Storage(bucket_name='your-bucket-name')
    # save() returns the name the file was stored under, so pass it back to the caller
    return s3_storage.save(f'path/to/uploads/{get_file_id_and_name(file_name)}', file_content)

def execute(self, tts_model_id, chat_id, voice_text=None, text_type="text", audio=True):
    if text_type == "image" or not audio:
        return None
    
    file_id = self.get_new_uuid()
    file_name = 'output.mp3'
    contentDisposition = f"attachment; filename=\"{file_name}\""
    
    response = self.api_client.voice(text=voice_text or "", type=text_type, outputFormat="", encodingType="")
    content_length = int(response.headers.get('content-length'))

    data_stream = io.BytesIO()

    while True:
        read_size = min(content_length, BLOCK_SIZE * 10)  # Read up to BLOCK_SIZE * 10 bytes at once
       
        buffer_read = response.raw.read(read_size)
        
        if not buffer_read:
            break
        
        data_stream.write(buffer_read)
        content_length -= len(buffer_read)
    
    data_stream.seek(0)  # Reset the position back to the start of the stream
    uploaded_file_obj = upload_file_to_s3(data_stream, file_name=file_name)

    data_stream.close()  # Close the BytesIO stream
 
    try:
        file_serializer_data = {**self.default_params, 'file': uploaded_file_obj}
        if 'voice_id' in response.json().keys():
            file_serializer_data['voice'] = str(int(response.json()['voice_id']))
        else:
             file_serializer_data['language'] = str(response.json()[text_type]['lang'])
            
        file_url = FileSerializer(**file_serializer_data).upload()
    except Exception as err:
        # Re-raise with context so the failure is not silently swallowed
        raise RuntimeError('Error occurred while uploading voice file.') from err

    # Create a secure URL to allow embedding the audio player
    file_url_secure = self.secure_url_builder.append_suffix_if_not_exists(url=file_url, suffix="/player.html")
   
    # Build the src attribute of an audio tag
    audio_label = f'<audio src="{file_url_secure}" controls style = "width: 300px; height: 43px"></audio>'
 
    # Return results as expected
    return NodeResult({'answer': audio_label, 'result': [uploaded_file_obj]}, {})

Key Changes Made:

  1. Import Statement: Added an import statement for FileSerializer.
  2. Error Handling: Implemented exception handling to manage errors during the file upload.
  3. Secure URL Creation: Used a method to create a secure URL allowing embedding of the audio player safely.
  4. Content-Length Logic: Improved logic to handle reading chunks of data efficiently until no more data is available, mitigating memory issues when dealing with large uploads.
  5. Upload to S3: Created a helper function upload_file_to_s3 to handle the upload process securely.
  6. Sanitized File Names: Safely processed file names by replacing spaces with underscores for better security during storage and usage.
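
The file-name sanitization mentioned above could go somewhat further than replacing spaces. The following is a minimal sketch under stated assumptions: sanitize_file_name is a hypothetical helper, not part of this PR or of any library, and the character allowlist is one possible choice.

```python
import os
import re


def sanitize_file_name(raw_name: str, default: str = 'file') -> str:
    """Return a conservative, path-free version of a client-supplied file name."""
    # Drop any directory components (handles both / and \ separators).
    name = os.path.basename(raw_name.replace('\\', '/'))
    # Replace everything outside a small allowlist with underscores.
    name = re.sub(r'[^A-Za-z0-9._-]', '_', name)
    # Strip leading dots so the result is never '.', '..' or a hidden file.
    name = name.lstrip('.')
    return name or default
```

The allowlist is deliberately strict and would also replace non-ASCII characters, so it would need widening if such names should be preserved.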

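@@ -27,7 +27,7 @@ def _run(self):
                                                    self.node_params_serializer.data.get('audio_list')[1:])
         for audio in res:
             if 'file_id' not in audio:
-                raise ValueError("参数值错误: 上传的图片中缺少file_id,音频上传失败")  # "Invalid parameter value: the uploaded image is missing file_id; audio upload failed"
+                raise ValueError("参数值错误: 上传的音频中缺少file_id,音频上传失败")  # "Invalid parameter value: the uploaded audio is missing file_id; audio upload failed"
 
         return self.execute(audio=res, **self.node_params_serializer.data, **self.flow_params_serializer.data)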

Contributor Author

The code looks mostly correct, but there are two points to consider:

  1. The old exception message was misleading because it referred to an image ("图片") even though this node handles audio files; the new message correctly says audio ("音频").

  2. You should ensure that audio_list exists in self.node_params_serializer.data before slicing and iterating over it.

Here's a revised version of the function to address these issues:

def _run(self):
    # Assuming node_params_serializer and flow_params_serializer are serializer instances
    audio_list = self.node_params_serializer.data.get('audio_list')

    # Fail early if 'audio_list' is missing or is not a list
    if not isinstance(audio_list, list):
        # "audio_list not found; please check the request data."
        raise ValueError("未找到audio_list,请检查请求数据。")

    res = list(audio_list[1:])
    for audio in res:
        if 'file_id' not in audio:
            # "Invalid parameter value: the uploaded audio is missing file_id; audio upload failed"
            raise ValueError("参数值错误: 上传的音频中缺少file_id,音频上传失败")

    return self.execute(audio=res, **self.node_params_serializer.data, **self.flow_params_serializer.data)

This ensures that we only attempt to iterate over the audio list if it exists. If it does not exist, appropriate error handling is provided.
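
The validation described above can also be tried in isolation. The sketch below is a standalone illustration, not code from the PR: validate_audio_list is a hypothetical helper and the English messages are assumptions. It rejects a missing or non-list value and any entry without a file_id.

```python
def validate_audio_list(audio_list):
    """Return the list unchanged if every entry has a file_id, else raise ValueError."""
    # A missing key typically shows up here as None rather than a list.
    if not isinstance(audio_list, list):
        raise ValueError('audio_list is missing or is not a list')
    for index, audio in enumerate(audio_list):
        # Each entry is expected to be a dict such as {'file_id': ..., 'file_name': ..., 'url': ...}.
        if not isinstance(audio, dict) or 'file_id' not in audio:
            raise ValueError(f'audio entry {index} is missing file_id')
    return audio_list
```

Keeping the check in a small pure function like this would make it testable without constructing a workflow node.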

wxg0103 merged commit bd8d848 into main on Dec 25, 2024
4 checks passed
wxg0103 deleted the pr@main@fix_1050821_1050817 branch on December 25, 2024 07:15