diff --git a/README.md b/README.md index 5ab49fc62..214bf5640 100644 --- a/README.md +++ b/README.md @@ -69,6 +69,7 @@ bzip2, [csv](doc/formats.md#csv), dns, dns_tcp, +[dolby_metadata](doc/formats.md#dolby_metadata), elf, ether8023_frame, exif, @@ -157,7 +158,7 @@ vp9_cfm, vp9_frame, vpx_ccr, [wasm](doc/formats.md#wasm), -wav, +[wav](doc/formats.md#wav), webp, [xml](doc/formats.md#xml), yaml, diff --git a/doc/dev.md b/doc/dev.md index 5159056a2..44a8dbade 100644 --- a/doc/dev.md +++ b/doc/dev.md @@ -38,16 +38,27 @@ Flags can be struct with bit-fields. - Use commit messages with a context prefix to make it easier to find and understand, ex:
`mp3: Validate sync correctly` - Tests: - - If possible use a pair of `testdata/file` and `testdata/file.fqtest` where `file.fqtest` is `$ fq dv file` or `$ fq 'dv,torepr' file` if there is `torepr` support. + - If possible, add one or more pairs of example input binary file plus an `.fqtest` file with one or more expected CLI commands and their outputs, with naming like: + - `./format//testdata/.`, e.g. [`./format/mp4/testdata/aac.mp4`](../format/mp4/testdata/aac.mp4) + - and `./format//testdata/.fqtest`, e.g. [`./format/mp4/testdata/aac.fqtest`](../format/mp4/testdata/aac.fqtest) + - The `*.fqtest` files use lines prefixed with `$` of commands to run / test (usually `fq` command style of course), and their expected output below them, e.g. in simple [`./pkg/interp/testdata/version.fqtest`](../pkg/interp/testdata/version.fqtest): + ``` + $ fq -v + testversion (testos testarch) + ``` + - A basic test of a (potentially new) format can be specified by creating a single line `./format//testdata/.fqtest` with `$ fq dv .` or `$ fq 'dv,torepr' .` if there is `torepr` support. (Note: for the test file, do not use full path like `./format//testdata/.`) + - You can of course look at your code's current output for a test like this by locally running the same command using `go` and with full path, e.g. `go run . dv ./format//testdata/.` - If `dv` produces a lot of output maybe use `dv({array_truncate: 50})` etc - - Run `go test ./format -run TestFormats/` to test expected outputs for all tests under format ``. + - Run `go test ./format -run TestFormats/ -update` to automatically update current (or create, on initial addition of new test files and lines) expected output with the currently locally executing content. 
+ - If you've added a new format, you'll probably also see that the generic `all` test fails, update that as well with: `go test ./format -run TestFormats/all -update` + - Double check that your git diff looks sensible for new or updated `*.fqtest` files (e.g. avoiding checking in test failure output like `exitcode: 2 error: wrong_path.mp4: no such file or directory`) - If you have format specific documentation: - Put it in `format/*/.md` and use `//go:embed .md`/`interp.RegisterFS(..)` to embed/register it. - Use simple markdown, just sections (depth starts at 3, `### Section`), paragraphs, lists and links. - - No heading section is needs with format name, will be added by `make doc` and fq cli help system. + - No heading section is needed with format name, will be added by `make doc` and fq cli help system. - Add a `testdata/_help.fqtest` with just `$ fq -h ` to test CLI help. - - If in doubt look at `mp4.md`/`mp4.go` etc. + - If in doubt look at [`mp4.md`](../format/mp4/mp4.md)/[`mp4.go`](../format/mp4/mp4.go) etc. - Run `make README.md doc/formats.md` to update md files. 
- Run linter `make lint` - Run fuzzer `make fuzz GROUP=`, see usage in Makefile diff --git a/doc/formats.md b/doc/formats.md index 80b7aea4b..beb3937d1 100644 --- a/doc/formats.md +++ b/doc/formats.md @@ -41,6 +41,7 @@ |[`csv`](#csv) |Comma separated values || |`dns` |DNS packet || |`dns_tcp` |DNS packet (TCP) || +|[`dolby_metadata`](#dolby_metadata) |Dolby Metadata (Atmos, AC3, Dolby Digital) || |`elf` |Executable and Linkable Format || |`ether8023_frame` |Ethernet 802.3 frame |`inet_packet`| |`exif` |Exchangeable Image File Format || @@ -129,7 +130,7 @@ |`vp9_frame` |VP9 frame || |`vpx_ccr` |VPX Codec Configuration Record || |[`wasm`](#wasm) |WebAssembly Binary Format || -|`wav` |WAV file |`id3v2` `id3v1` `id3v11`| +|[`wav`](#wav) |WAV file |`id3v2` `id3v1` `id3v11` `dolby_metadata`| |`webp` |WebP image |`exif` `vp8_frame` `icc_profile` `xml`| |[`xml`](#xml) |Extensible Markup Language || |`yaml` |YAML Ain't Markup Language || @@ -590,6 +591,33 @@ $ fq -d csv -o comma="\t" to_csv file.tsv $ fq -d csv '.[0] as $t | .[1:] | map(with_entries(.key = $t[.key]))' file.csv ``` +## dolby_metadata +Dolby Metadata (Atmos, AC3, Dolby Digital). + +Dolby Metadata from `` chunk of RIFF / WAV / Broadcast Wave Format (BWF), +including Dolby Atmos, AC3, Dolby Digital \[Plus\], and Dolby Audio Info (e.g. LUFS, True Peak). + +### Examples +Decode Dolby metadata from `` chunk: +``` +$ fq -d wav '.chunks[] | select(.id | IN("dbmd")) | tovalue' adm-bwf.wav +``` + +RIFF / WAV / Broadcast Wave Format (BWF) chunks: +- `` Track UIDs of Audio Definition Model +- `` BWF XML Metadata, e.g. 
for Audio Definition Model ambisonics and elements + +### Authors +- [@johnnymarnell](https://johnnymarnell.github.io), original author + +### References +- https://adm.ebu.io/background/what_is_the_adm.html +- https://tech.ebu.ch/publications/tech3285s7 +- https://tech.ebu.ch/publications/tech3285s5 +- https://tech.ebu.ch/files/live/sites/tech/files/shared/tech/tech3285s6.pdf +- https://github.com/DolbyLaboratories/dbmd-atmos-parser +- https://github.com/MediaArea/MediaInfoLib/blob/master/Source/MediaInfo/Audio/File_DolbyAudioMetadata.cpp + ## fit Garmin Flexible and Interoperable Data Transfer. @@ -1442,6 +1470,49 @@ $ fq '.sections | {import: map(select(.id == "import_section").content.im.x[].nm ### References - https://webassembly.github.io/spec/core/ +## wav +WAV file. + +WAVE audio file format. + +Also includes support for [Audio Definition Model](https://adm.ebu.io/background/what_is_the_adm.html) and 3D Audio. + +RIFF / WAV / Broadcast Wave Format (BWF) chunks: + +- `RIFF`: primary container chunk specifying the file type and containing sub-chunks (e.g., fmt, data) +- `fmt`: describes format / stream encoding in data chunk +- `data`: indicates size and contains encoded raw sound data +- `bext`: broadcast extension chunk, containing broadcast-specific metadata such as description, originator, creation date, time reference, and more +- `LIST`: organizes additional metadata in sub-chunks, often used to include information like artist, genre, or title in INFO or other standardized formats +- `smpl`: sample metadata chunk, containing looping and sampling information, such as start and end points for loops, sample rate, and MIDI pitch +- `fact`: contains metadata on the original uncompressed data, such as the number of samples, typically used in non-PCM (compressed) formats to aid in playback and synchronization +- `chna`: track UIDs of Audio Definition Model +- `axml`: XML metadata, e.g. 
for Audio Definition Model ambisonics and elements as in [EBUCore spec](https://tech.ebu.ch/docs/tech/tech3293.pdf) +- `dbmd`: Dolby specific metadata like loudness and binaural settings, see also [`dolby_metadata` format](#dolby_metadata) + + +### Examples +Decode ADM configuration from `` and `` chunks: +```bash +$ fq -d wav '.chunks[] | select(.id | IN("chna", "axml")) | tovalue' amd-bwf.wav + +# Extract ADM chunk objects definitions xml content +$ fq -r -d wav '.chunks[] | select(.id | IN("axml")) | .xml | tovalue' amd-bwf.wav | tee axml-content.xml +``` + +### Authors +- [@wader](https://github.com/wader), original author +- [@johnnymarnell](https://johnnymarnell.github.io), ADM support + +### References +- http://soundfile.sapp.org/doc/WaveFormat/ +- https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/wavdec.c +- https://tech.ebu.ch/docs/tech/tech3285.pdf +- http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html +- https://adm.ebu.io/background/what_is_the_adm.html +- https://tech.ebu.ch/docs/tech/tech3285s7.pdf +- https://tech.ebu.ch/docs/tech/tech3285s5.pdf + ## xml Extensible Markup Language. 
diff --git a/format/all/all.fqtest b/format/all/all.fqtest index 7dea22d8b..393e32610 100644 --- a/format/all/all.fqtest +++ b/format/all/all.fqtest @@ -85,6 +85,7 @@ cbor Concise Binary Object Representation csv Comma separated values dns DNS packet dns_tcp DNS packet (TCP) +dolby_metadata Dolby Metadata (Atmos, AC3, Dolby Digital) elf Executable and Linkable Format ether8023_frame Ethernet 802.3 frame exif Exchangeable Image File Format diff --git a/format/format.go b/format/format.go index a87ff0e02..006e6878f 100644 --- a/format/format.go +++ b/format/format.go @@ -92,6 +92,7 @@ var ( CSV = &decode.Group{Name: "csv"} DNS = &decode.Group{Name: "dns"} DNS_TCP = &decode.Group{Name: "dns_tcp"} + Dolby_Metadata = &decode.Group{Name: "dolby_metadata"} ELF = &decode.Group{Name: "elf"} Ether_8023_Frame = &decode.Group{Name: "ether8023_frame"} Exif = &decode.Group{Name: "exif"} diff --git a/format/riff/common.go b/format/riff/common.go index f15a821db..05b66ae3e 100644 --- a/format/riff/common.go +++ b/format/riff/common.go @@ -36,9 +36,9 @@ func riffDecode(d *decode.D, path path, headFn func(d *decode.D, path path) (str } }) - wordAlgin := d.AlignBits(16) - if wordAlgin != 0 { - d.FieldRawLen("align", int64(wordAlgin)) + wordAlign := d.AlignBits(16) + if wordAlign != 0 { + d.FieldRawLen("align", int64(wordAlign)) } } @@ -58,6 +58,16 @@ var chunkIDDescriptions = scalar.StrMapDescription{ "dmlh": "Extended AVI header", + "data": "Raw sound encoded data", + "bext": "Broadcast extension, e.g. creator, date, etc.", + "smpl": "Sample metadata, e.g. loop points", + "fact": "Original info used for compression, e.g. sample length", + + // BWF ADM master and Dolby Metadata + "chna": "Track UIDs of Audio Definition Model", + "axml": "Audio Definition Model ambisonics and elements", + "dbmd": "Dolby Metadata, e.g. Atmos, AC3, Dolby Digital [Plus]", + "ISMP": "SMPTE timecode", "IDIT": "Time and date digitizing commenced", "IARL": "Archival Location. 
Indicates where the subject of the file is archived.", diff --git a/format/riff/dolby.go b/format/riff/dolby.go new file mode 100644 index 000000000..32f5a30da --- /dev/null +++ b/format/riff/dolby.go @@ -0,0 +1,294 @@ +package riff + +// Dolby Metadata, e.g. Atmos, AC3, Dolby Digital [Plus] +// https://tech.ebu.ch/files/live/sites/tech/files/shared/tech/tech3285s6.pdf +// https://github.com/DolbyLaboratories/dbmd-atmos-parser + +import ( + "fmt" + "strings" + + "github.com/wader/fq/pkg/decode" + "github.com/wader/fq/pkg/scalar" +) + +func old_dbmdDecode(d *decode.D) any { + version := d.U32() + major := (version >> 24) & 0xFF + minor := (version >> 16) & 0xFF + patch := (version >> 8) & 0xFF + build := version & 0xFF + d.FieldValueStr("version", fmt.Sprintf("%d.%d.%d.%d", major, minor, patch, build)) + + d.FieldArray("metadata_segments", func(d *decode.D) { + for { + d.FieldStruct("metadata_segment", func(d *decode.D) { + segmentID := d.FieldU8("metadata_segment_id") + + // TODO(jmarnell): I think I need a loop until, but not creating these empty segments + // spec says we're done with 0 ID, so I'd like to not make the empty segment(s) + if segmentID == 0 { + if d.BitsLeft() > 0 { + d.SeekRel(d.BitsLeft() * 8) + } + return + } + + segmentSize := d.FieldU16("metadata_segment_size") + // bitsLeft := d.BitsLeft() + + switch segmentID { + case 1: + tmp_parseDolbyE(d) + case 3: + tmp_parseDolbyDigital(d) + case 7: + tmp_parseDolbyDigitalPlus(d) + case 8: + tmp_parseAudioInfo(d) + case 9: + tmp_parseDolbyAtmos(d) + case 10: + tmp_parseDolbyAtmosSupplemental(d) + default: + d.FieldRawLen("unknown_segment_raw", int64(segmentSize*8)) + } + + // bytesRemaining := (bitsLeft-d.BitsLeft())/8 - int64(segmentSize) + // if bytesRemaining < 0 { + // d.Fatalf("Read too many bytes for segment %d, read %d over, expected %d", segmentID, -bytesRemaining, segmentSize) + // } else if bytesRemaining > 0 { + // d.FieldValueUint("SKIPPED_BYTES", uint64(bytesRemaining)) + // 
d.SeekRel((int64(segmentSize) - bytesRemaining) * 8) + // } + + d.FieldU8("metadata_segment_checksum") + }) + } + }) + + return nil +} + +var compressionDesc = scalar.UintMapDescription{ + 0: "none", + 1: "Film, Standard", + 2: "Film, Light", + 3: "Music, Standard", + 4: "Music, Light", + 5: "Speech", + // TODO(jmarnell): Can I handle rest is "Reserved"? +} + +// TODO(jmarnell): Better way to handle "Reserved"? +func mapWithReserved(m map[uint64]string, key uint64) string { + if val, ok := m[key]; ok { + return val + } + return "Reserved" +} + +var bitstreamMode = scalar.UintMapDescription{ + 0b000: "main audio service: complete main (CM)", + 0b001: "main audio service: music and effects (ME)", + 0b010: "associated service: visually impaired (VI)", + 0b011: "associated service: hearing impaired (HI)", + 0b100: "associated service: dialogue (D)", + 0b101: "associated service: commentary (C)", + 0b110: "associated service: emergency (E)", + 0b111: "associated service: voice over (VO)", + + 0b1000: "associated service: karaoke (K)", +} + +var binaural = scalar.UintMapDescription{ + 0: "bypass", + 1: "near", + 2: "far", + 3: "mid", + 4: "not indicated", +} + +var warpMode = scalar.UintMapDescription{ + 0: "normal", + 1: "warping", + 2: "downmix Dolby Pro Logic IIx", + 3: "downmix LoRo", + 4: "not indicated (Default warping will be applied.)", +} + +var tmp_trimConfigName = scalar.UintMapDescription{ + 0: "2.0", + 1: "5.1", + 2: "7.1", + 3: "2.1.2", + 4: "5.1.2", + 5: "7.1.2", + 6: "2.1.4", + 7: "5.1.4", + 8: "7.1.4", +} + +var trimType = scalar.UintMapDescription{ + 0: "manual", + 1: "automatic", +} + +func tmp_parseDolbyE(d *decode.D) { + d.FieldValueStr("metadata_segment_type", "dolby_e") + + d.FieldU8("program_config") + d.FieldU8("frame_rate_code") + d.FieldRawLen("e_SMPTE_time_code", 8*8) + d.FieldRawLen("e_reserved", 1*8) + d.FieldRawLen("e_reserved2", 25*8) + d.FieldRawLen("reserved_for_future_use", 80*8) +} + +func tmp_parseDolbyDigital(d *decode.D) { + 
d.FieldValueStr("metadata_segment_type", "dolby_digital") + + d.FieldU8("ac3_program_id") + d.FieldU8("program_info") + d.FieldU8("datarate_info") + d.FieldRawLen("reserved", 1*8) + d.FieldU8("surround_config") + d.FieldU8("dialnorm_info") + d.FieldU8("ac3_langcod") + d.FieldU8("audio_prod_info") + d.FieldU8("ext_bsi1_word1") + d.FieldU8("ext_bsi1_word2") + d.FieldU8("ext_bsi2_word1") + d.FieldRawLen("reserved2", 3*8) + d.FieldU8("ac3_compr1") + d.FieldU8("ac3_dynrng1") + d.FieldRawLen("reserved_for_future_use", 21*8) + d.FieldRawLen("program_description_text", 32*8) +} + +func tmp_parseDolbyDigitalPlus(d *decode.D) { + d.FieldValueStr("metadata_segment_type", "dolby_digital_plus") + + d.FieldU8("program_id") + programInfo := d.FieldU8("program_info") + lfeon := programInfo & 0b1_000_000 + // shift the 3-bit bsmod field down to 0..7 so the comparison and map lookup below can match + bsmod := (programInfo & 0b0_111_000) >> 3 + acmod := programInfo & 0b0_000_111 + d.FieldValueBool("lfe_on", lfeon != 0) + if bsmod == 0b111 && acmod != 0b001 { + bsmod = 0b1000 + } + d.FieldValueStr("bitstream_mode", bitstreamMode[bsmod]) + + d.FieldU16LE("ddplus_reserved_a") + + d.FieldU8("surround_config") + d.FieldU8("dialnorm_info") + d.FieldU8("langcod") + d.FieldU8("audio_prod_info") + d.FieldU8("ext_bsi1_word1") + d.FieldU8("ext_bsi1_word2") + d.FieldU8("ext_bsi2_word1") + + d.FieldU24LE("ddplus_reserved_b") + + d.FieldValueStr("compr1_type", mapWithReserved(compressionDesc, d.FieldU8("compr1"))) + d.FieldValueStr("dynrng1_type", mapWithReserved(compressionDesc, d.FieldU8("dynrng1"))) + + d.FieldU24LE("ddplus_reserved_c") + + d.FieldU8("ddplus_info1") + + d.FieldU40LE("ddplus_reserved_d") + + d.FieldU16LE("datarate") + d.FieldRawLen("reserved_for_future_use", 69*8) +} + +func tmp_parseAudioInfo(d *decode.D) { + d.FieldValueStr("metadata_segment_type", "audio_info") + + d.FieldU8("program_id") + d.FieldUTF8("audio_origin", 32) + d.FieldU32LE("largest_sample_value") + d.FieldU32LE("largest_sample_value_2") + d.FieldU32LE("largest_true_peak_value") + 
d.FieldU32LE("largest_true_peak_value_2") + d.FieldU32LE("dialogue_loudness") + d.FieldU32LE("dialogue_loudness_2") + d.FieldU32LE("speech_content") + d.FieldU32LE("speech_content_2") + d.FieldUTF8("last_processed_by", 32) + d.FieldUTF8("last_operation", 32) + d.FieldUTF8("segment_creation_date", 32) + d.FieldUTF8("segment_modified_date", 32) +} + +func tmp_parseDolbyAtmos(d *decode.D) { + d.FieldValueStr("metadata_segment_type", "dolby_atmos") + + // d.SeekRel(32 * 8) + str := d.FieldUTF8Null("atmos_dbmd_content_creation_preamble") + d.SeekRel(int64(32-len(str)-1) * 8) + + str = d.FieldUTF8Null("atmos_dbmd_content_creation_tool") + d.SeekRel(int64(64-len(str)-1) * 8) + + major := d.U8() + minor := d.U8() + micro := d.U8() + d.FieldValueStr("version", fmt.Sprintf("%d.%d.%d", major, minor, micro)) + d.SeekRel(53 * 8) + + warpBedReserved := d.U8() + d.FieldValueUint("warp_mode", warpBedReserved&0x7) + d.FieldValueStr("warp_mode_type", warpMode[warpBedReserved&0x7]) + + d.SeekRel(15 * 8) + d.SeekRel(80 * 8) +} + +func tmp_parseDolbyAtmosSupplemental(d *decode.D) { + d.FieldValueStr("metadata_segment_type", "dolby_atmos_supplemental") + + sync := d.FieldU32LE("dasms_sync") + d.FieldValueBool("dasms_sync_valid", sync == 0xf8726fbd) + + objectCount := int64(d.FieldU16LE("object_count")) + d.FieldU8LE("reserved") + + i := 0 + d.FieldStructNArray("trim_configs", "trim_config", 9, func(d *decode.D) { + autoTrimReserved := d.FieldU8LE("auto_trim_reserved") + autoTrim := autoTrimReserved & 0x01 + d.FieldValueBool("auto_trim", autoTrim == 1) + d.FieldValueStr("trim_type", trimType[autoTrim]) + d.FieldValueStr("trim_config_name", trimConfigName[uint64(i)]) + + //d.SeekRel(14 * 8) + // d.FieldUTF8("raw", 14) + str := d.UTF8(14) + bytes := []byte(str) + var nonZeroBytes []string + for _, b := range bytes { + if b != 0 { + nonZeroBytes = append(nonZeroBytes, fmt.Sprintf("%d", b)) + } + } + // TODO(jmarnell): I think the +3dB trim settings are here. 
+ // Would like this at least as an array of numbers, instead of this CSV string + d.FieldValueStr("trim_defs", strings.Join(nonZeroBytes, ", ")) + + i++ + }) + + d.FieldStructNArray("objects", "object", objectCount, func(d *decode.D) { + d.FieldU8LE("object_value") + }) + + d.FieldStructNArray("binaural_render_modes", "binaural_render_mode", objectCount, func(d *decode.D) { + mode := d.U8LE() & 0x7 + d.FieldValueUint("render_mode", mode) + d.FieldValueStr("render_mode_type", binaural[mode]) + }) +} diff --git a/format/riff/dolby_metadata.go b/format/riff/dolby_metadata.go new file mode 100644 index 000000000..56a521139 --- /dev/null +++ b/format/riff/dolby_metadata.go @@ -0,0 +1,317 @@ +package riff + +// Dolby Metadata, e.g. Atmos, AC3, Dolby Digital [Plus] +// https://tech.ebu.ch/files/live/sites/tech/files/shared/tech/tech3285s6.pdf +// https://github.com/DolbyLaboratories/dbmd-atmos-parser + +import ( + "embed" + + "github.com/wader/fq/format" + "github.com/wader/fq/pkg/decode" + "github.com/wader/fq/pkg/interp" + "github.com/wader/fq/pkg/scalar" +) + +//go:embed dolby_metadata.md +var dolbyMetadataFS embed.FS + +func init() { + interp.RegisterFormat( + format.Dolby_Metadata, + &decode.Format{ + Description: "Dolby Metadata (Atmos, AC3, Dolby Digital)", + DecodeFn: dbmdDecode, + }, + ) + interp.RegisterFS(dolbyMetadataFS) +} + +func dbmdDecode(d *decode.D) any { + d.Endian = decode.LittleEndian + + d.FieldStruct("version", func(d *decode.D) { + d.FieldU8("major") + d.FieldU8("minor") + d.FieldU8("patch") + d.FieldU8("build") + }) + + d.FieldArray("metadata_segments", func(d *decode.D) { + seenEnd := false + for !seenEnd { + d.FieldStruct("metadata_segment", func(d *decode.D) { + segmentID := d.FieldU8("id", metadataSegmentTypeMap) + + // TODO(jmarnell): This will always make an empty end segment, I think it would be better to omit it + if segmentID == metadataSegmentTypeEnd { + seenEnd = true + return + } + + segmentSize := d.FieldU16("size") + + switch 
segmentID { + case metadataSegmentTypeDolbyE: + parseDolbyE(d) + case metadataSegmentTypeDolbyDigital: + parseDolbyDigital(d) + case metadataSegmentTypeDolbyDigitalPlus: + parseDolbyDigitalPlus(d) + case metadataSegmentTypeAudioInfo: + parseAudioInfo(d) + case metadataSegmentTypeDolbyAtmos: + parseDolbyAtmos(d) + case metadataSegmentTypeDolbyAtmosSupplemental: + parseDolbyAtmosSupplemental(d) + default: + d.FieldRawLen("unknown", int64(segmentSize*8)) + } + + // TODO: use this to validate parsing + d.FieldU8("checksum", scalar.UintHex) + }) + } + }) + + return nil +} + +var compressionDescMap = scalar.UintMapSymStr{ + 0: "none", + 1: "film_standard", + 2: "film_light", + 3: "music_standard", + 4: "music_light", + 5: "speech", +} + +var downmix5to2DescMap = scalar.UintMap{ + 0: {Sym: "not_indicated", Description: "Not indicated (Lo/Ro)"}, + 1: {Sym: "loro", Description: "Lo/Ro"}, + 2: {Sym: "ltrt_dpl", Description: "Lt/Rt (Dolby Pro Logic)"}, + 3: {Sym: "ltrt_dpl2", Description: "Lt/Rt (Dolby Pro Logic II)"}, + 4: {Sym: "direct_stereo_render", Description: "Direct stereo render"}, +} + +var phaseShift5to2DescMap = scalar.UintMap{ + 0: {Sym: "no_shift", Description: "Without Phase 90"}, + 1: {Sym: "shift_90", Description: "With Phase 90"}, +} + +var bitstreamModeMap = scalar.UintMapDescription{ + 0b000: "main audio service: complete main (CM)", + 0b001: "main audio service: music and effects (ME)", + 0b010: "associated service: visually impaired (VI)", + 0b011: "associated service: hearing impaired (HI)", + 0b100: "associated service: dialogue (D)", + 0b101: "associated service: commentary (C)", + 0b110: "associated service: emergency (E)", + 0b111: "associated service: voice over (VO)", + + 0b1000: "associated service: karaoke (K)", +} + +var binauralRenderModeMap = scalar.UintMapSymStr{ + 0: "bypass", + 1: "near", + 2: "far", + 3: "mid", + 4: "not_indicated", +} + +var warpModeMap = scalar.UintMap{ + 0: {Sym: "normal", Description: "possibly: Direct render"}, + 1: 
{Sym: "warping", Description: "possibly: Direct render with room balance"}, + 2: {Sym: "downmix_dolby_pro_logic_iix", Description: "Dolby Pro Logic IIx"}, + 3: {Sym: "downmix_loro", Description: "possibly: Standard (Lo/Ro)"}, + 4: {Sym: "not_indicated", Description: "Default warping will be applied"}, +} + +var trimConfigName = scalar.UintMapDescription{ + 0: "2.0", + 1: "5.1", + 2: "7.1", + 3: "2.1.2", + 4: "5.1.2", + 5: "7.1.2", + 6: "2.1.4", + 7: "5.1.4", + 8: "7.1.4", +} + +const ( + metadataSegmentTypeEnd = 0 + metadataSegmentTypeDolbyE = 1 + metadataSegmentTypeDolbyReserved2 = 2 + metadataSegmentTypeDolbyDigital = 3 + metadataSegmentTypeDolbyReserved4 = 4 + metadataSegmentTypeDolbyReserved5 = 5 + metadataSegmentTypeDolbyReserved6 = 6 + metadataSegmentTypeDolbyDigitalPlus = 7 + metadataSegmentTypeAudioInfo = 8 + metadataSegmentTypeDolbyAtmos = 9 + metadataSegmentTypeDolbyAtmosSupplemental = 10 +) + +var metadataSegmentTypeMap = scalar.UintMapSymStr{ + metadataSegmentTypeEnd: "end", + metadataSegmentTypeDolbyE: "dolby_e_metadata", + metadataSegmentTypeDolbyReserved2: "reserved2", + metadataSegmentTypeDolbyDigital: "dolby_digital_metadata", + metadataSegmentTypeDolbyReserved4: "reserved4", + metadataSegmentTypeDolbyReserved5: "reserved5", + metadataSegmentTypeDolbyReserved6: "reserved6", + metadataSegmentTypeDolbyDigitalPlus: "dolby_digital_plus_metadata", + metadataSegmentTypeAudioInfo: "audio_info", + metadataSegmentTypeDolbyAtmos: "dolby_atmos", + metadataSegmentTypeDolbyAtmosSupplemental: "dolby_atmos_supplemental", +} + +func parseDolbyE(d *decode.D) { + d.FieldU8("program_config") + d.FieldU8("frame_rate_code") + d.FieldRawLen("e_smpte_time_code", 8*8) + d.FieldRawLen("e_reserved", 1*8) + d.FieldRawLen("e_reserved2", 25*8) + d.FieldRawLen("reserved_for_future_use", 80*8) +} + +func parseDolbyDigital(d *decode.D) { + d.FieldU8("ac3_program_id") + d.FieldU8("program_info") + d.FieldU8("datarate_info") + d.FieldRawLen("reserved", 1*8) + 
d.FieldU8("surround_config") + d.FieldU8("dialnorm_info") + d.FieldU8("ac3_langcod") + d.FieldU8("audio_prod_info") + d.FieldU8("ext_bsi1_word1") + d.FieldU8("ext_bsi1_word2") + d.FieldU8("ext_bsi2_word1") + d.FieldRawLen("reserved2", 3*8) + d.FieldU8("ac3_compr1") + d.FieldU8("ac3_dynrng1") + d.FieldRawLen("reserved_for_future_use", 21*8) + d.FieldRawLen("program_description_text", 32*8) +} + +func parseDolbyDigitalPlus(d *decode.D) { + d.FieldU8("program_id") + // TODO: make struct and read U1(?) U1 (lfeon) U3 (bsmod) U3(acmod) fields? + programInfo := d.FieldU8("program_info") + lfeon := programInfo & 0b1_000_000 + // shift the 3-bit bsmod field down to 0..7 so the comparison and map lookup below can match + bsmod := (programInfo & 0b0_111_000) >> 3 + acmod := programInfo & 0b0_000_111 + d.FieldValueBool("lfe_on", lfeon != 0) + if bsmod == 0b111 && acmod != 0b001 { + bsmod = 0b1000 + } + d.FieldValueStr("bitstream_mode", bitstreamModeMap[bsmod]) + + d.FieldU16("ddplus_reserved_a") + + d.FieldU8("surround_config") + d.FieldU8("dialnorm_info") + d.FieldU8("langcod") + d.FieldU8("audio_prod_info") + d.FieldU8("ext_bsi1_word1") + d.FieldU8("ext_bsi1_word2") + d.FieldU8("ext_bsi2_word1") + + d.FieldU24("ddplus_reserved_b") + + d.FieldU8("compr1", scalar.UintSym("reserved"), compressionDescMap) + d.FieldU8("dynrng1", scalar.UintSym("reserved"), compressionDescMap) + + d.FieldU24("ddplus_reserved_c") + + d.FieldU8("ddplus_info1") + + d.FieldU40("ddplus_reserved_d") + + d.FieldU16("datarate") + d.FieldRawLen("reserved_for_future_use", 69*8) +} + +func parseAudioInfo(d *decode.D) { + d.FieldU8("program_id") + d.FieldUTF8("audio_origin", 32) + d.FieldU32("largest_sample_value") + d.FieldU32("largest_sample_value_2") + d.FieldU32("largest_true_peak_value") + d.FieldU32("largest_true_peak_value_2") + d.FieldU32("dialogue_loudness") + d.FieldU32("dialogue_loudness_2") + d.FieldU32("speech_content") + d.FieldU32("speech_content_2") + d.FieldUTF8("last_processed_by", 32) + d.FieldUTF8("last_operation", 32) + d.FieldUTF8("segment_creation_date", 32) + 
d.FieldUTF8("segment_modified_date", 32) +} + +func parseDolbyAtmos(d *decode.D) { + d.FieldUTF8NullFixedLen("atmos_dbmd_content_creation_preamble", 32) + d.FieldUTF8NullFixedLen("atmos_dbmd_content_creation_tool", 64) + d.FieldStruct("version", func(d *decode.D) { + d.FieldU8("major") + d.FieldU8("minor") + d.FieldU8("patch") + }) + // TODO: All these unknowns? (mostly from MediaInfoLib, also Dolby repo) + + d.FieldRawLen("unknown0", 21*8) + + d.FieldRawLen("unknown1", 1) + d.FieldU3("downmix_5to2", scalar.UintSym("unknown"), downmix5to2DescMap) + d.FieldRawLen("unknown2", 2) + d.FieldU2("phaseshift_90deg_5to2", scalar.UintSym("unknown"), phaseShift5to2DescMap) + + d.FieldRawLen("unknown3", 12*8) + + d.FieldRawLen("bed_distribution", 2) + d.FieldRawLen("reserved0", 3) + d.FieldU3("warp_mode", warpModeMap) + + d.FieldRawLen("unknown4", 15*8) + d.FieldRawLen("unknown5", 80*8) +} + +func parseDolbyAtmosSupplemental(d *decode.D) { + d.FieldU32("dasms_sync", d.UintAssert(0xf8726fbd), scalar.UintHex) + + objectCount := int64(d.FieldU16("object_count")) + d.FieldU8("reserved") + + i := 0 + d.FieldStructNArray("trim_configs", "trim_config", 9, func(d *decode.D) { + d.FieldRawLen("reserved0", 7) + trimType := d.FieldU1("type", scalar.UintMapSymStr{ + 0: "manual", + 1: "automatic", + }) + d.FieldValueStr("config_name", trimConfigName[uint64(i)]) + + if trimType == 1 { + d.FieldUTF8("reserved1", 14) + } else { + // TODO: Reference MediaInfo's logic and Dolby pdf's + d.FieldUTF8("manual_trim_raw_config", 14) + } + i++ + }) + + d.FieldArray("objects", func(d *decode.D) { + for i := int64(0); i < objectCount; i++ { + d.FieldU8("object_value") + } + }) + + d.FieldArray("binaural_render_modes", func(d *decode.D) { + // TODO: 0x7 mask needed? 
+ for i := int64(0); i < objectCount; i++ { + d.FieldU8("render_mode", scalar.UintActualFn(func(a uint64) uint64 { return a & 0x7 }), binauralRenderModeMap) + } + }) +} diff --git a/format/riff/dolby_metadata.md b/format/riff/dolby_metadata.md new file mode 100644 index 000000000..dd74b09ca --- /dev/null +++ b/format/riff/dolby_metadata.md @@ -0,0 +1,23 @@ +Dolby Metadata from `` chunk of RIFF / WAV / Broadcast Wave Format (BWF), +including Dolby Atmos, AC3, Dolby Digital \[Plus\], and Dolby Audio Info (e.g. LUFS, True Peak). + +### Examples +Decode Dolby metadata from `` chunk: +``` +$ fq -d wav '.chunks[] | select(.id | IN("dbmd")) | tovalue' adm-bwf.wav +``` + +RIFF / WAV / Broadcast Wave Format (BWF) chunks: +- `` Track UIDs of Audio Definition Model +- `` BWF XML Metadata, e.g. for Audio Definition Model ambisonics and elements + +### Authors +- [@johnnymarnell](https://johnnymarnell.github.io), original author + +### References +- https://adm.ebu.io/background/what_is_the_adm.html +- https://tech.ebu.ch/publications/tech3285s7 +- https://tech.ebu.ch/publications/tech3285s5 +- https://tech.ebu.ch/files/live/sites/tech/files/shared/tech/tech3285s6.pdf +- https://github.com/DolbyLaboratories/dbmd-atmos-parser +- https://github.com/MediaArea/MediaInfoLib/blob/master/Source/MediaInfo/Audio/File_DolbyAudioMetadata.cpp diff --git a/format/riff/testdata/bext.wav.fqtest b/format/riff/testdata/bext.wav.fqtest index 874ffab58..ef32db4a2 100644 --- a/format/riff/testdata/bext.wav.fqtest +++ b/format/riff/testdata/bext.wav.fqtest @@ -5,7 +5,7 @@ $ fq dv bext.wav 0x000| 57 41 56 45 | WAVE | format: "WAVE" (valid) 0x8-0xc (4) | | | chunks[0:1]: 0xc-0x26e (610) | | | [0]{}: chunk 0xc-0x26e (610) -0x000| 62 65 78 74| bext| id: "bext" 0xc-0x10 (4) +0x000| 62 65 78 74| bext| id: "bext" (Broadcast extension, e.g. creator, date, etc.) 0xc-0x10 (4) 0x010|5a 02 00 00 |Z... 
| size: 602 0x10-0x14 (4) 0x010| 00 00 00 00 00 00 00 00 00 00 00 00| ............| description: "" 0x14-0x114 (256) 0x020|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|................| diff --git a/format/riff/testdata/dolby_metadata.fqtest b/format/riff/testdata/dolby_metadata.fqtest new file mode 100644 index 000000000..e3874965a --- /dev/null +++ b/format/riff/testdata/dolby_metadata.fqtest @@ -0,0 +1,131 @@ +$ fq dv foobar.wav + |00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f|0123456789abcdef|.{}: dolby_metadata.wav (wav) 0x0-0x2aec (10988) +0x0000|52 49 46 46 |RIFF | id: "RIFF" 0x0-0x4 (4) +0x0000| e4 2a 00 00 | .*.. | size: 10980 0x4-0x8 (4) +0x0000| 57 41 56 45 | WAVE | format: "WAVE" (valid) 0x8-0xc (4) + | | | chunks[0:6]: 0xc-0x2a40 (10804) + | | | [0]{}: chunk 0xc-0x54 (72) +0x0000| 4a 55 4e 4b| JUNK| id: "JUNK" (Alignment) 0xc-0x10 (4) +0x0010|40 00 00 00 |@... | size: 64 0x10-0x14 (4) +0x0010| 00 00 00 00 00 00 00 00 00 00 00 00| ............| data: raw bits 0x14-0x54 (64) +0x0020|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|................| +* |until 0x53.7 (64) | | + | | | [1]{}: chunk 0x54-0x6c (24) +0x0050| 66 6d 74 20 | fmt | id: "fmt" 0x54-0x58 (4) +0x0050| 10 00 00 00 | .... | size: 16 0x58-0x5c (4) +0x0050| 01 00 | .. | audio_format: "pcm_s16le" (1) 0x5c-0x5e (2) +0x0050| 03 00| ..| num_channels: 3 0x5e-0x60 (2) +0x0060|80 bb 00 00 |.... | sample_rate: 48000 0x60-0x64 (4) +0x0060| 80 97 06 00 | .... | byte_rate: 432000 0x64-0x68 (4) +0x0060| 09 00 | .. | block_align: 9 0x68-0x6a (2) +0x0060| 18 00 | .. | bits_per_sample: 24 0x6a-0x6c (2) + | | | [2]{}: chunk 0x6c-0x1154 (4328) +0x0060| 64 61 74 61| data| id: "data" (Raw sound encoded data) 0x6c-0x70 (4) +0x0070|e0 10 00 00 |.... 
| size: 4320 0x70-0x74 (4) +0x0070| 00 00 00 00 00 00 00 00 00 00 00 00| ............| samples: raw bits 0x74-0x1154 (4320) +0x0080|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|................| +* |until 0x1153.7 (4320) | | + | | | [3]{}: chunk 0x1154-0x2862 (5902) +0x1150| 61 78 6d 6c | axml | id: "axml" (Audio Definition Model ambisonics and elements) 0x1154-0x1158 (4) +0x1150| 06 17 00 00 | .... | size: 5894 0x1158-0x115c (4) +0x1150| 3c 3f 78 6d| \n\n\t\n\t\t\n\t\t\t\n\t\t\t\t\n\t\t\t\t\tACO_1001\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAO_1001\n\t\t\t\t\tAO_100b\n\t\t\t\t\t2\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAP_00011001\n\t\t\t\t\tATU_00000001\n\t\t\t\t\tATU_00000002\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAP_00031001\n\t\t\t\t\tATU_00000003\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAC_00011001\n\t\t\t\t\tAC_00011002\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAC_00031001\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\tRC_L\n\t\t\t\t\t\t1\n\t\t\t\t\t\t-1.0000000000\n\t\t\t\t\t\t1.0000000000\n\t\t\t\t\t\t0.0000000000\n\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\tRC_R\n\t\t\t\t\t\t1\n\t\t\t\t\t\t1.0000000000\n\t\t\t\t\t\t1.0000000000\n\t\t\t\t\t\t0.0000000000\n\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\n\t\t\t\t\t\t1\n\t\t\t\t\t\t0.0000000000\n\t\t\t\t\t\t1.0000000000\n\t\t\t\t\t\t1\n\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAC_00011001\n\t\t\t\t\tAP_00011001\n\t\t\t\t\tAT_00011001_01\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAC_00011002\n\t\t\t\t\tAP_00011001\n\t\t\t\t\tAT_00011002_01\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAC_00031001\n\t\t\t\t\tAP_00031001\n\t\t\t\t\tAT_00031001_01\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAS_00011001\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAS_00011002\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAS_00031001\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAT_00011001_01\n\t\t\t\t\tAP_00011001\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAT_00011002_01\n\t\t\t\t\tAP_00011001\n\t\t\t\t\n\t\t\t\t\n\t\t\t\t\tAT_00031001_01\n\t\t\t\t\tAP_00031001\n\t\t\t\t\n\t\t\t\n\t\t\n\t\n\n" 0x115c-0x2862 (5894) +0x1160|6c 20 76 65 72 73 69 6f 6e 3d 22 31 2e 
30 22 20|l version="1.0" | +* |until 0x2861.7 (5894) | | + | | | [4]{}: chunk 0x2862-0x28e6 (132) +0x2860| 63 68 6e 61 | chna | id: "chna" (Track UIDs of Audio Definition Model) 0x2862-0x2866 (4) +0x2860| 7c 00 00 00 | |... | size: 124 0x2866-0x286a (4) +0x2860| 03 00 | .. | num_tracks: 3 0x286a-0x286c (2) +0x2860| 03 00 | .. | num_uids: 3 0x286c-0x286e (2) + | | | audio_ids[0:3]: 0x286e-0x28e6 (120) + | | | [0]{}: audio_id 0x286e-0x2896 (40) +0x2860| 01 00| ..| track_index: 1 0x286e-0x2870 (2) +0x2870|41 54 55 5f 30 30 30 30 30 30 30 31 |ATU_00000001 | uid: "ATU_00000001" 0x2870-0x287c (12) +0x2870| 41 54 5f 30| AT_0| track_format_id_reference: "AT_00011001_01" 0x287c-0x288a (14) +0x2880|30 30 31 31 30 30 31 5f 30 31 |0011001_01 | +0x2880| 41 50 5f 30 30 30| AP_000| pack_format_id_reference: "AP_00011001" 0x288a-0x2895 (11) +0x2890|31 31 30 30 31 |11001 | +0x2890| 00 | . | padding: raw bits 0x2895-0x2896 (1) + | | | [1]{}: audio_id 0x2896-0x28be (40) +0x2890| 02 00 | .. | track_index: 2 0x2896-0x2898 (2) +0x2890| 41 54 55 5f 30 30 30 30| ATU_0000| uid: "ATU_00000002" 0x2898-0x28a4 (12) +0x28a0|30 30 30 32 |0002 | +0x28a0| 41 54 5f 30 30 30 31 31 30 30 32 5f| AT_00011002_| track_format_id_reference: "AT_00011002_01" 0x28a4-0x28b2 (14) +0x28b0|30 31 |01 | +0x28b0| 41 50 5f 30 30 30 31 31 30 30 31 | AP_00011001 | pack_format_id_reference: "AP_00011001" 0x28b2-0x28bd (11) +0x28b0| 00 | . | padding: raw bits 0x28bd-0x28be (1) + | | | [2]{}: audio_id 0x28be-0x28e6 (40) +0x28b0| 03 00| ..| track_index: 3 0x28be-0x28c0 (2) +0x28c0|41 54 55 5f 30 30 30 30 30 30 30 33 |ATU_00000003 | uid: "ATU_00000003" 0x28c0-0x28cc (12) +0x28c0| 41 54 5f 30| AT_0| track_format_id_reference: "AT_00031001_01" 0x28cc-0x28da (14) +0x28d0|30 30 33 31 30 30 31 5f 30 31 |0031001_01 | +0x28d0| 41 50 5f 30 30 30| AP_000| pack_format_id_reference: "AP_00031001" 0x28da-0x28e5 (11) +0x28e0|33 31 30 30 31 |31001 | +0x28e0| 00 | . 
| padding: raw bits 0x28e5-0x28e6 (1) + | | | [5]{}: chunk 0x28e6-0x2a40 (346) +0x28e0| 64 62 6d 64 | dbmd | id: "dbmd" (Dolby Metadata, e.g. Atmos, AC3, Dolby Digital [Plus]) 0x28e6-0x28ea (4) +0x28e0| fe 01 00 00 | .... | size: 510 0x28ea-0x28ee (4) + | | | version{}: 0x28ee-0x28f2 (4) +0x28e0| 06 | . | major: 6 0x28ee-0x28ef (1) +0x28e0| 00| .| minor: 0 0x28ef-0x28f0 (1) +0x28f0|00 |. | patch: 0 0x28f0-0x28f1 (1) +0x28f0| 01 | . | build: 1 0x28f1-0x28f2 (1) + | | | metadata_segments[0:3]: 0x28f2-0x2a40 (334) + | | | [0]{}: metadata_segment 0x28f2-0x2956 (100) +0x28f0| 07 | . | id: "dolby_digital_plus_metadata" (7) 0x28f2-0x28f3 (1) +0x28f0| 60 00 | `. | size: 96 0x28f3-0x28f5 (2) +0x28f0| 00 | . | program_id: 0 0x28f5-0x28f6 (1) +0x28f0| 47 | G | program_info: 71 0x28f6-0x28f7 (1) + | | | lfe_on: true + | | | bitstream_mode: "main audio service: complete main (CM)" +0x28f0| 00 00 | .. | ddplus_reserved_a: 0 0x28f7-0x28f9 (2) +0x28f0| 00 | . | surround_config: 0 0x28f9-0x28fa (1) +0x28f0| 60 | ` | dialnorm_info: 96 0x28fa-0x28fb (1) +0x28f0| 00 | . | langcod: 0 0x28fb-0x28fc (1) +0x28f0| 00 | . | audio_prod_info: 0 0x28fc-0x28fd (1) +0x28f0| 24 | $ | ext_bsi1_word1: 36 0x28fd-0x28fe (1) +0x28f0| 24 | $ | ext_bsi1_word2: 36 0x28fe-0x28ff (1) +0x28f0| 00| .| ext_bsi2_word1: 0 0x28ff-0x2900 (1) +0x2900|00 00 00 |... | ddplus_reserved_b: 0 0x2900-0x2903 (3) +0x2900| 02 | . | compr1: "film_light" (2) 0x2903-0x2904 (1) +0x2900| 02 | . | dynrng1: "film_light" (2) 0x2904-0x2905 (1) +0x2900| 00 00 00 | ... | ddplus_reserved_c: 0 0x2905-0x2908 (3) +0x2900| 00 | . | ddplus_info1: 0 0x2908-0x2909 (1) +0x2900| 00 00 00 00 00 | ..... | ddplus_reserved_d: 0 0x2909-0x290e (5) +0x2900| 00 00| ..| datarate: 0 0x290e-0x2910 (2) +0x2910|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|................| reserved_for_future_use: raw bits 0x2910-0x2955 (69) +* |until 0x2954.7 (69) | | +0x2950| ad | . 
| checksum: 0xad 0x2955-0x2956 (1) + | | | [1]{}: metadata_segment 0x2956-0x2a3f (233) +0x2950| 09 | . | id: "dolby_atmos" (9) 0x2956-0x2957 (1) +0x2950| f8 00 | .. | size: 248 0x2957-0x2959 (2) +0x2950| 43 72 65 61 74 65 64| Created| atmos_dbmd_content_creation_preamble: "Created using Dolby equipment" 0x2959-0x2979 (32) +0x2960|20 75 73 69 6e 67 20 44 6f 6c 62 79 20 65 71 75| using Dolby equ| +0x2970|69 70 6d 65 6e 74 00 00 00 |ipment... | +0x2970| 44 6f 6c 62 79 20 41| Dolby A| atmos_dbmd_content_creation_tool: "Dolby Atmos Composer Essential (fiedler audio)" 0x2979-0x29b9 (64) +0x2980|74 6d 6f 73 20 43 6f 6d 70 6f 73 65 72 20 45 73|tmos Composer Es| +* |until 0x29b8.7 (64) | | + | | | version{}: 0x29b9-0x29bc (3) +0x29b0| 01 | . | major: 1 0x29b9-0x29ba (1) +0x29b0| 00 | . | minor: 0 0x29ba-0x29bb (1) +0x29b0| 01 | . | patch: 1 0x29bb-0x29bc (1) +0x29b0| 00 00 00 00| ....| unknown0: raw bits 0x29bc-0x29d1 (21) +0x29c0|03 00 00 00 00 00 00 00 22 ff 00 00 00 00 00 03|........".......| +0x29d0|00 |. | +0x29d0| 00 | . | unknown1: raw bits 0x29d1-0x29d1.1 (0.1) +0x29d0| 00 | . | downmix_5to2: "not_indicated" (0) (Not indicated (Lo/Ro)) 0x29d1.1-0x29d1.4 (0.3) +0x29d0| 00 | . | unknown2: raw bits 0x29d1.4-0x29d1.6 (0.2) +0x29d0| 00 | . | phaseshift_90deg_5to2: "no_shift" (0) (Without Phase 90) 0x29d1.6-0x29d2 (0.2) +0x29d0| 00 00 00 00 00 00 00 00 00 00 00 00 | ............ | unknown3: raw bits 0x29d2-0x29de (12) +0x29d0| 00 | . | bed_distribution: raw bits 0x29de-0x29de.2 (0.2) +0x29d0| 00 | . | reserved0: raw bits 0x29de.2-0x29de.5 (0.3) +0x29d0| 00 | . | warp_mode: "normal" (0) (possibly: Direct render) 0x29de.5-0x29df (0.3) +0x29d0| f0| .| unknown4: raw bits 0x29df-0x29ee (15) +0x29e0|08 00 00 00 10 00 00 00 00 00 00 00 00 00 |.............. | +0x29e0| 00 00| ..| unknown5: raw bits 0x29ee-0x2a3e (80) +0x29f0|00 83 00 00 00 00 00 00 00 00 00 00 00 00 00 00|................| +* |until 0x2a3d.7 (80) | | +0x2a30| 00 | . 
| checksum: 0x0 0x2a3e-0x2a3f (1) + | | | [2]{}: metadata_segment 0x2a3f-0x2a40 (1) +0x2a30| 00| .| id: "end" (0) 0x2a3f-0x2a40 (1) +0x2a40|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|................| gap0: raw bits 0x2a40-0x2aec (172) +* |until 0x2aeb.7 (end) (172) | | diff --git a/format/riff/testdata/dolby_metadata.wav b/format/riff/testdata/dolby_metadata.wav new file mode 100644 index 000000000..9e489453a Binary files /dev/null and b/format/riff/testdata/dolby_metadata.wav differ diff --git a/format/riff/testdata/end-of-file.fqtest b/format/riff/testdata/end-of-file.fqtest index 434629fea..3e348e95a 100644 --- a/format/riff/testdata/end-of-file.fqtest +++ b/format/riff/testdata/end-of-file.fqtest @@ -25,7 +25,7 @@ $ fq -d wav dv end-of-file.wav 0x030| 4c 61 76 66 35 38 2e 34| Lavf58.4| value: "Lavf58.45.100" 0x38-0x46 (14) 0x040|35 2e 31 30 30 00 |5.100. | | | | [2]{}: chunk 0x46-0x732 (1772) -0x040| 64 61 74 61 | data | id: "data" 0x46-0x4a (4) +0x040| 64 61 74 61 | data | id: "data" (Raw sound encoded data) 0x46-0x4a (4) 0x040| ff ff ff ff | .... | size: 0xffffffff (Rest of file) 0x4a-0x4e (4) 0x040| 00 00| ..| samples: raw bits 0x4e-0x732 (1764) 0x050|00 00 b5 00 b5 00 69 01 69 01 1d 02 1d 02 ce 02|......i.i.......| diff --git a/format/riff/testdata/stereo.fqtest b/format/riff/testdata/stereo.fqtest index 2da687796..c29818786 100644 --- a/format/riff/testdata/stereo.fqtest +++ b/format/riff/testdata/stereo.fqtest @@ -25,7 +25,7 @@ $ fq -d wav dv stereo.wav 0x030| 4c 61 76 66 35 38 2e 32| Lavf58.2| value: "Lavf58.29.100" 0x38-0x46 (14) 0x040|39 2e 31 30 30 00 |9.100. | | | | [2]{}: chunk 0x46-0x732 (1772) -0x040| 64 61 74 61 | data | id: "data" 0x46-0x4a (4) +0x040| 64 61 74 61 | data | id: "data" (Raw sound encoded data) 0x46-0x4a (4) 0x040| e4 06 00 00 | .... 
| size: 1764 0x4a-0x4e (4) 0x040| 00 00| ..| samples: raw bits 0x4e-0x732 (1764) 0x050|00 00 b5 00 b5 00 69 01 69 01 1d 02 1d 02 ce 02|......i.i.......| diff --git a/format/riff/wav.go b/format/riff/wav.go index 6237089b3..32a4c52d3 100644 --- a/format/riff/wav.go +++ b/format/riff/wav.go @@ -8,14 +8,20 @@ package riff // TODO: default little endian import ( + "embed" + "github.com/wader/fq/format" "github.com/wader/fq/pkg/decode" "github.com/wader/fq/pkg/interp" "github.com/wader/fq/pkg/scalar" ) +//go:embed wav.md +var wavFS embed.FS + var wavHeaderGroup decode.Group var wavFooterGroup decode.Group +var wavDolbyMetadataGroup decode.Group func init() { interp.RegisterFormat( @@ -28,8 +34,10 @@ func init() { Dependencies: []decode.Dependency{ {Groups: []*decode.Group{format.ID3v2}, Out: &wavHeaderGroup}, {Groups: []*decode.Group{format.ID3v1, format.ID3v11}, Out: &wavFooterGroup}, + {Groups: []*decode.Group{format.Dolby_Metadata}, Out: &wavDolbyMetadataGroup}, }, }) + interp.RegisterFS(wavFS) } const ( @@ -158,6 +166,30 @@ func wavDecode(d *decode.D) any { d.FieldRawLen("coding_history", d.BitsLeft()) return false, nil + case "chna": + d.FieldU16("num_tracks") + d.FieldU16("num_uids") + d.FieldArray("audio_ids", func(d *decode.D) { + for !d.End() { + d.FieldStruct("audio_id", func(d *decode.D) { + d.FieldU16("track_index") + d.FieldUTF8("uid", 12) + d.FieldUTF8("track_format_id_reference", 14) + d.FieldUTF8("pack_format_id_reference", 11) + d.FieldRawLen("padding", 8) + }) + } + }) + return false, nil + case "axml": + d.FieldUTF8("xml", int(d.BitsLeft())/8) + return false, nil + case "dbmd": + // TEMP TEMP TEMP: delete old dolby.go and bring uncomment + // old_dbmdDecode(d) // + d.Format(&wavDolbyMetadataGroup, nil) + return false, nil + default: if riffIsStringChunkID(id) { d.FieldUTF8NullFixedLen("value", int(d.BitsLeft())/8) diff --git a/format/riff/wav.md b/format/riff/wav.md new file mode 100644 index 000000000..a448e3ea0 --- /dev/null +++ b/format/riff/wav.md 
@@ -0,0 +1,39 @@ +WAVE audio file format. + +Also includes support for [Audio Definition Model](https://adm.ebu.io/background/what_is_the_adm.html) and 3D Audio. + +RIFF / WAV / Broadcast Wave Format (BWF) chunks: + +- `RIFF`: primary container chunk specifying the file type and containing sub-chunks (e.g., fmt, data) +- `fmt`: describes format / stream encoding in data chunk +- `data`: indicates size and contains encoded raw sound data +- `bext`: broadcast extension chunk, containing broadcast-specific metadata such as description, originator, creation date, time reference, and more +- `LIST`: organizes additional metadata in sub-chunks, often used to include information like artist, genre, or title in INFO or other standardized formats +- `smpl`: sample metadata chunk, containing looping and sampling information, such as start and end points for loops, sample rate, and MIDI pitch +- `fact`: contains metadata on the original uncompressed data, such as the number of samples, typically used in non-PCM (compressed) formats to aid in playback and synchronization +- `chna`: track UIDs of Audio Definition Model +- `axml`: XML metadata, e.g. 
for Audio Definition Model ambisonics and elements as in [EBUCore spec](https://tech.ebu.ch/docs/tech/tech3293.pdf) +- `dbmd`: Dolby-specific metadata like loudness and binaural settings, see also [`dolby_metadata` format](#dolby_metadata) + + +### Examples +Decode ADM configuration from `chna` and `axml` chunks: +```bash +$ fq -d wav '.chunks[] | select(.id | IN("chna", "axml")) | tovalue' amd-bwf.wav + +# Extract the XML content of the ADM object definitions from the axml chunk +$ fq -r -d wav '.chunks[] | select(.id | IN("axml")) | .xml | tovalue' amd-bwf.wav | tee axml-content.xml +``` + +### Authors +- [@wader](https://github.com/wader), original author +- [@johnnymarnell](https://johnnymarnell.github.io), ADM and Dolby support + +### References +- http://soundfile.sapp.org/doc/WaveFormat/ +- https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/wavdec.c +- https://tech.ebu.ch/docs/tech/tech3285.pdf +- http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html +- https://adm.ebu.io/background/what_is_the_adm.html +- https://tech.ebu.ch/docs/tech/tech3285s7.pdf +- https://tech.ebu.ch/docs/tech/tech3285s5.pdf