opusenc: add 4x opusenc options #64

Open

@chemag chemag commented Aug 19, 2021

No description provided.

@chemag
Author

chemag commented Aug 19, 2021

I also have a patch to force the mode (silk, hybrid, celt, auto), but the OPUS_SET_FORCE_MODE API is private. I can just copy the values, but I assume there's a reason for having it private, and anyway the right solution would be to make it public.

@chemag chemag changed the title add 2x opusenc options add 4x opusenc options Aug 19, 2021
@chemag chemag changed the title add 4x opusenc options opusenc: add 4x opusenc options Aug 19, 2021
@chemag
Author

chemag commented Aug 19, 2021

> I also have a patch to force the mode (silk, hybrid, celt, auto), but the OPUS_SET_FORCE_MODE API is private. I can just copy the values, but I assume there's a reason for having it private, and anyway the right solution would be to make it public.

I managed to do this the dirty way, by hard-coding the FORCE_MODE values, but it required changes in both opus-tools and libopusenc. A clean approach would also require moving the FORCE_MODE values in the opus repo from src/opus_private.h to include/opus.h.

@mark4o
Collaborator

mark4o commented Aug 21, 2021

Thanks. I'm not sure that these options make sense for opusenc. For example FEC is for handling packet loss, which cannot happen in an Ogg container since it is stream based; packets are not sent individually. Similarly DTX allows for not transmitting some packets, which is also not possible in Ogg. Options that don't make sense for opusenc just confuse users and make the program more difficult to use. Can you explain the use case for adding these to opusenc?

As for choosing the encoding mode, the --music and --speech options are much better ways to do that, because they will consider the other settings and which settings are supported by each mode. For example only CELT mode supports frame sizes smaller than 10 ms, so when smaller frames are required it will automatically use CELT mode. The internal force mode setting is used internally by code that has already considered the other settings and chosen a mode that is valid for those settings.
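The frame-size constraint described above can be sketched as a small lookup. The per-mode durations follow RFC 6716; the name `modes_for_frame_size` is hypothetical, and this is not how libopus picks a mode internally (the encoder also weighs bitrate, bandwidth, and signal type). It only shows which modes remain valid for a given frame duration.

```python
# Frame durations (in ms) that each Opus mode can encode, per RFC 6716.
SUPPORTED_FRAME_MS = {
    "silk": (10, 20, 40, 60),
    "hybrid": (10, 20),
    "celt": (2.5, 5, 10, 20),
}


def modes_for_frame_size(frame_ms):
    """Return the modes able to encode a single frame of frame_ms milliseconds."""
    return [mode for mode, sizes in SUPPORTED_FRAME_MS.items() if frame_ms in sizes]


if __name__ == "__main__":
    for ms in (2.5, 5, 10, 20, 40, 60):
        print(ms, modes_for_frame_size(ms))
```

With frames shorter than 10 ms only CELT is left, which is why a forced SILK or hybrid mode would conflict with such a setting.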

@chemag
Author

chemag commented Aug 23, 2021

Hi, mark4o, thanks for the review.

> Thanks. I'm not sure that these options make sense for opusenc. For example FEC is for handling packet loss, which cannot happen in an Ogg container since it is stream based; packets are not sent individually. Similarly DTX allows for not transmitting some packets, which is also not possible in Ogg. Options that don't make sense for opusenc just confuse users and make the program more difficult to use. Can you explain the use case for adding these to opusenc?

I'm running some experiments to measure the effect of Opus settings on bitrate. I saw the disparity between the settings available in opusenc and in opus_demo, and I tried to make them the same. As you mention, some of the settings (FEC, DTX) make no sense once you encapsulate the Opus output in Ogg. The encoding mode patch is probably too hairy right now. Does it make sense to add the other 2x settings to opusenc (application and bandwidth)?

Also, I added signal control to opus_demo (see xiph/opus#233).

@vadimkantorov

Sorry, it's a novice question: will forcing the bandwidth in all packets lead to decoding at that sample rate?
The use case is storing a speech recognition dataset as Opus files (to save space). It is then useful to be able to decode at the raw sample rate.

@mark4o
Copy link
Collaborator

mark4o commented Jul 27, 2022

@vadimkantorov The encoder bandwidth does not affect the decoding sample rate. Using the opusdec program you can decode at any sample rate using the option --rate n.

@vadimkantorov

I guess the reasonable feature request is then an option that takes the decoding sample rate from the informational OpusHead packet in one go (and, if it doesn't exist yet, an API function that reads only this packet very quickly; I'll check first :)). This would be useful because the file has no meaningful frequency content beyond the raw input sample rate (combined with limiting the bandwidth using the option from this PR).

I’ll create a new issue for this.

Thank you!

@mark4o
Collaborator

mark4o commented Jul 27, 2022

The original sample rate in the header is already the default decoding sample rate in opusdec.
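For anyone who wants to read that header field directly, a minimal sketch follows. It assumes the data starts with the first Ogg page of a non-multiplexed Ogg Opus stream and that the identification header is laid out as in RFC 7845 section 5.1. `read_opus_head` is a hypothetical helper written for illustration, not a function from libopus or opusfile.

```python
import struct


def read_opus_head(data):
    """Parse the OpusHead packet carried by the first Ogg page in data (bytes)."""
    if len(data) < 27 or data[:4] != b"OggS":
        raise ValueError("not an Ogg stream")
    n_segments = data[26]             # page_segments field of the Ogg page header
    payload = data[27 + n_segments:]  # skip the fixed header and the segment table
    if len(payload) < 19 or payload[:8] != b"OpusHead":
        raise ValueError("first packet is not OpusHead")
    # version, channel count, pre-skip, input sample rate, output gain, mapping family
    version, channels, pre_skip, input_rate, gain, mapping = struct.unpack_from(
        "<BBHIhB", payload, 8
    )
    return {
        "version": version,
        "channels": channels,
        "pre_skip": pre_skip,
        "input_sample_rate": input_rate,
        "output_gain": gain,
        "mapping_family": mapping,
    }
```

Only the start of the file has to be read, for example `read_opus_head(open(path, "rb").read(64))` with `path` pointing at an Opus file. The input sample rate is informational metadata; it does not change how the packets themselves are decoded.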

@vadimkantorov

vadimkantorov commented Feb 14, 2023

Some way of forcing SILK or CELT is useful when evaluating Opus, be it --speech or some other option. Is there currently any way to force SILK or CELT in the released opusenc? I've tried passing --set-ctl-int 4000=2048 (OPUS_SET_APPLICATION_REQUEST=OPUS_APPLICATION_VOIP, following the defines in https://github.com/xiph/opus/blob/master/include/opus_defines.h), but I'm not sure whether it had any effect. Should I also add --set-ctl-int 4024=3001 (OPUS_SET_SIGNAL_REQUEST=OPUS_SIGNAL_VOICE)? Or also --set-ctl-int 11002=1000 (OPUS_SET_FORCE_MODE_REQUEST=MODE_SILK_ONLY, from https://github.com/xiph/opus/blob/master/src/opus_private.h), if it is not rejected by the frontend?
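The numeric pairs above are easier to keep straight when built from named constants. This is a minimal sketch: the constant values are the ones quoted from opus_defines.h, while `ctl_args` and the file names are hypothetical. Whether opusenc accepts or honors each request is not confirmed here, and the application and signal settings are hints to the encoder's mode selection rather than a hard switch.

```python
# Request and value constants as defined in include/opus_defines.h.
OPUS_SET_APPLICATION_REQUEST = 4000
OPUS_SET_SIGNAL_REQUEST = 4024
OPUS_APPLICATION_VOIP = 2048
OPUS_SIGNAL_VOICE = 3001


def ctl_args(settings):
    """Turn (request, value) pairs into a flat list of --set-ctl-int arguments."""
    args = []
    for request, value in settings:
        args += ["--set-ctl-int", f"{request}={value}"]
    return args


if __name__ == "__main__":
    # in.wav and out.opus are placeholder file names.
    cmd = ["opusenc"] + ctl_args([
        (OPUS_SET_APPLICATION_REQUEST, OPUS_APPLICATION_VOIP),
        (OPUS_SET_SIGNAL_REQUEST, OPUS_SIGNAL_VOICE),
    ]) + ["in.wav", "out.opus"]
    print(" ".join(cmd))
```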

Along the same lines, a debug/verbose option that prints which codec is being chosen, and maybe some other details, would be useful too.
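Until such an option exists, the chosen mode can be recovered from the encoded packets themselves, since the top five bits of each packet's first byte (the TOC byte) encode mode, bandwidth, and frame size per RFC 6716 section 3.1. The sketch below assumes the raw Opus packets have already been extracted from the container; `describe_toc` is a hypothetical helper.

```python
def describe_toc(toc):
    """Decode mode, bandwidth, and frame duration (ms) from an Opus TOC byte."""
    config = toc >> 3  # top five bits select one of 32 configurations
    if config < 12:
        # Configurations 0..11: SILK-only, four frame sizes per bandwidth.
        mode = "silk"
        bandwidth = ("NB", "MB", "WB")[config // 4]
        frame_ms = (10, 20, 40, 60)[config % 4]
    elif config < 16:
        # Configurations 12..15: hybrid, two frame sizes per bandwidth.
        mode = "hybrid"
        bandwidth = ("SWB", "FB")[(config - 12) // 2]
        frame_ms = (10, 20)[config % 2]
    else:
        # Configurations 16..31: CELT-only, four frame sizes per bandwidth.
        mode = "celt"
        bandwidth = ("NB", "WB", "SWB", "FB")[(config - 16) // 4]
        frame_ms = (2.5, 5, 10, 20)[config % 4]
    return mode, bandwidth, frame_ms
```

Calling `describe_toc(packet[0])` on every packet and counting the results gives a per-file histogram of the modes the encoder actually used.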

@vadimkantorov

vadimkantorov commented Dec 6, 2023

@mark4o For DTX, is it possible to skip transmitting silence frames with some other container, like WebM or MKV? #49

Is there any other container that you can recommend for saving space by skipping long silences?

Do I understand correctly that the standard .ogg container and demuxers can't handle missing silence/DTX frames in non-real-time mode? From the RFC it seems that they might (https://datatracker.ietf.org/doc/html/rfc7845#section-4.1), though I'm probably misunderstanding it :(
