Help wanted: using ffmpeg with pipes to process bytes data #23

Open
chenqianhe opened this issue Nov 12, 2022 · 6 comments

Comments

@chenqianhe
Collaborator

Whisper uses ffmpeg to read the video file and extract the audio. The code is as follows:

try:
    # This launches a subprocess to decode audio while down-mixing and resampling as necessary.
    # Requires the ffmpeg CLI and `ffmpeg-python` package to be installed.
    out, _ = (
        ffmpeg.input(file, threads=0)
        .output("-", format="s16le", acodec="pcm_s16le", ac=1, ar=sr)
        .run(cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True)
    )
except ffmpeg.Error as e:
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

return np.frombuffer(out, np.int16).flatten().astype(np.float32) / 32768.0
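
To make the final conversion step concrete, here is a small sketch of how raw `s16le` bytes become floats in the range [-1, 1). The helper name `pcm_s16le_to_float32` and the four sample values are made up for illustration; the dtype `"<i2"` is used here instead of `np.int16` only to pin little-endian byte order explicitly, which is an assumption about portability rather than something Whisper does.

```python
import struct

import numpy as np


def pcm_s16le_to_float32(pcm: bytes) -> np.ndarray:
    """Convert raw little-endian 16-bit PCM bytes to float32 samples in [-1, 1)."""
    # "<i2" is a little-endian signed 16-bit integer, matching ffmpeg's s16le format.
    ints = np.frombuffer(pcm, dtype="<i2")
    # Dividing by 32768 maps the int16 range [-32768, 32767] onto [-1, 1).
    return ints.astype(np.float32) / 32768.0


# Four hand-written samples standing in for ffmpeg's stdout.
pcm = struct.pack("<4h", 0, 16384, -32768, 32767)
samples = pcm_s16le_to_float32(pcm)
```

Here `samples[1]` is exactly 0.5 and `samples[2]` is exactly -1.0, while the largest positive sample stays just below 1.0.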

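I am currently changing the input to bytes and using pipes for both input and output, but I have run into a problem: I cannot work out how to get the audio.
Video upload currently needs to support four formats: ['.mp4', '.mov', '.mkv', '.flv']. Before the ffmpeg step I can already get the video format, width, height, and similar information.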

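import json
import shlex
from subprocess import PIPE, Popen

# Probe the in-memory video over stdin and read the video stream's dimensions.
probe_cmd = 'ffprobe -v error -i pipe: -select_streams v -print_format json -show_streams'
out, _ = Popen(shlex.split(probe_cmd), stdin=PIPE, stdout=PIPE, bufsize=-1).communicate(input=bytes_data)
video_info = json.loads(out)
width = video_info['streams'][0]['width']
height = video_info['streams'][0]['height']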

out, _ = (
    ffmpeg.input('pipe:', threads=0, format='rawvideo', s='{}x{}'.format(width, height))
    .output('pipe:', format="s16le", acodec="pcm_s16le", ac=1, ar=self.args.sampling_rate)
    .run(input=bytes_data, capture_stdout=True, capture_stderr=True)
)

Doing it this way currently produces the following output:

[mov,mp4,m4a,3gp,3g2,mj2 @ 0x14f704be0] stream 0, offset 0x30: partial file
[rawvideo @ 0x153f04080] Packet corrupt (stream = 0, dts = 0).
Input #0, rawvideo, from 'pipe:':
Duration: N/A, start: 0.000000, bitrate: 1119955 kb/s
Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p, 2564x1456, 1119955 kb/s, 25 tbr, 25 tbn
Output #0, s16le, to 'pipe:':
Output file #0 does not contain any stream

I don't know how to process the video bytes so that I get the same result as load_audio in Whisper.
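
Since the log reports that the output contains no stream, one way to check what the container actually holds is to group the ffprobe JSON by `codec_type`. The sketch below is only illustrative: `split_streams` is a hypothetical helper, the sample JSON is hand-written rather than real ffprobe output, and listing audio streams would require running ffprobe without `-select_streams v`.

```python
import json
from typing import Any, Dict, List


def split_streams(probe_json: str) -> Dict[str, List[Dict[str, Any]]]:
    """Group the entries of ffprobe's "streams" array by their codec_type."""
    streams = json.loads(probe_json).get("streams", [])
    by_type: Dict[str, List[Dict[str, Any]]] = {}
    for stream in streams:
        by_type.setdefault(stream.get("codec_type", "unknown"), []).append(stream)
    return by_type


# Hand-written stand-in for `ffprobe -print_format json -show_streams` output.
sample = json.dumps({
    "streams": [
        {"index": 0, "codec_type": "video", "width": 2564, "height": 1456},
        {"index": 1, "codec_type": "audio", "sample_rate": "44100", "channels": 2},
    ]
})

groups = split_streams(sample)
has_audio = "audio" in groups
```

If `has_audio` is false for an upload, there is nothing for ffmpeg to decode into PCM, whatever input options are used.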

@chenqianhe
Collaborator Author

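From the material I have found so far, it looks as if specifying rawvideo for the input means the audio is simply ignored. I don't fully understand this.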

@mli
Owner

mli commented Nov 12, 2022

ffmpeg should be able to read bytes directly from stdin, shouldn't it? For example: https://kkroening.github.io/ffmpeg-python/#ffmpeg.input
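
A minimal sketch of that idea, assuming the ffmpeg CLI is on PATH: leave the input format unspecified so that ffmpeg probes the container itself, rather than forcing `rawvideo`, which makes ffmpeg treat the whole byte stream as raw frames with no audio stream to map. The helper names `build_ffmpeg_cmd` and `decode_audio_bytes` are hypothetical, and the sketch calls the CLI through `subprocess` instead of ffmpeg-python.

```python
import subprocess
from typing import List


def build_ffmpeg_cmd(sample_rate: int = 16000) -> List[str]:
    """Build a command that reads a container from stdin and writes mono s16le PCM to stdout."""
    return [
        "ffmpeg",
        "-hide_banner", "-loglevel", "error",
        "-threads", "0",
        "-i", "pipe:0",          # no -f on the input: let ffmpeg detect mp4/mov/mkv/flv
        "-vn",                   # drop video, keep only audio
        "-f", "s16le",
        "-acodec", "pcm_s16le",
        "-ac", "1",              # down-mix to mono
        "-ar", str(sample_rate), # resample
        "pipe:1",
    ]


def decode_audio_bytes(video_bytes: bytes, sample_rate: int = 16000) -> bytes:
    """Feed the video bytes to ffmpeg over stdin and return the raw PCM from stdout."""
    proc = subprocess.run(
        build_ffmpeg_cmd(sample_rate),
        input=video_bytes,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode != 0:
        raise RuntimeError(f"Failed to load audio: {proc.stderr.decode(errors='replace')}")
    return proc.stdout
```

The returned bytes can then go through the same `np.frombuffer(...)` conversion that Whisper uses. One caveat: a pipe is not seekable, so MP4/MOV files whose index (the moov atom) sits at the end of the file may fail to decode this way, in which case writing the bytes to a temporary file first is the usual fallback.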

@mli
Owner

mli commented Nov 12, 2022

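On the other hand, since the video needs to be played back, it is still better to store it locally. That way the workspace can be saved. At most, set an expiry time and delete videos that are too old.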
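
A rough sketch of what such a local store with expiry could look like. Everything here is an assumption for illustration: the function names `save_video` and `purge_expired`, the use of file modification time as the age, and the random file naming are not taken from this project.

```python
import time
import uuid
from pathlib import Path
from typing import List, Optional


def save_video(data: bytes, cache_dir: Path, suffix: str = ".mp4") -> Path:
    """Write uploaded bytes to a uniquely named file and return its path."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{uuid.uuid4().hex}{suffix}"
    path.write_bytes(data)
    return path


def purge_expired(cache_dir: Path, max_age_seconds: float,
                  now: Optional[float] = None) -> List[Path]:
    """Delete files whose modification time is older than max_age_seconds."""
    now = time.time() if now is None else now
    removed = []
    for path in cache_dir.iterdir():
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

With the file on disk, ffmpeg can be given the path directly, as in the original Whisper code, and the same file can serve playback; `purge_expired` could be called periodically or at upload time.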

@chenqianhe
Collaborator Author

chenqianhe commented Nov 12, 2022 via email

@chenqianhe
Collaborator Author

chenqianhe commented Nov 12, 2022 via email
