Bender

A YAML-driven binary file viewer

This program reads in a binary file specification written in YAML and emits neat, formatted data to your console.

Why Use Bender

Parsing binary files containing arbitrary and dynamic data can be tedious. A typical use case is serialized binary data consumed by a space-constrained device. You might not need a parser; perhaps you have already worked out a struct packing scheme to read and write the data. However, when you want to peek at the data from outside your application, you must write a parser or use some type of hex dump application.

Bender allows you to define your binary layout in plain text to control how the contents are rendered for you. This tool really shines when you have non-trivial data structures with dynamic layouts.

1. Define your YAML spec
2. Feed the YAML spec and a binary to Bender
3. Save a few brain cycles analyzing the binary

YAML Definition

Name	Default Value	Description
format	bender.v1	Version of this specification
name	empty	Human friendly name of your spec
extensions	empty	List of extensions associated with the file you are describing
base_element	Default Values for each type in Element	The default element for your spec
structures	empty	Complex data types, e.g. structs
enumerations	empty	List of enumeration definitions
elements	empty	Named definition of data in your binary
layout	empty	Ordered list of elements as they are expected to be found in a binary.

Element Object

The fundamental component of Bender is the element. This is the name we use to describe one or more bytes in a binary file. It has a collection of fields that control how the bytes are read, stored, and displayed. Elements can be immediate values (read at the current read position) or deferred objects. A deferred object is an 8-byte marker describing the offset and size of the real data, each of which is 4 bytes wide.

You may take advantage of YAML merge keys to generate base objects from which you can build your elements.

Field	Description	Legal Values
name	Name to display for this element	strings
elide	Hide this element from display	YAML bool
units	How many bytes are in this element	A positive integer
signed	Represent bytes as a signed value	YAML bool
format	How the bytes should be interpreted	binary, octal, decimal, hex, ascii, unicode, hexstr, float
little_endian	What order the bytes are stored in the file	YAML bool
is_deferred	True if this object is a deferral	Optionally specify that this element is a pointer to more data
is_array_count	True if this value is a count of the next object	Optionally hint that the next object is repeated N times
is_array	True if this value is an implicit array	Optionally hint that the section may contain an array of the specified type
matrix	Optional matrix definition	Defines the shape of the data as a matrix type
enumeration	Optional enumeration definition	Defines this type as an enumeration
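
As a rough illustration of how the units, signed, and little_endian fields interact, here is a minimal Python sketch. It is not Bender's implementation; the helper name decode_element and its default arguments are assumptions made for this example.

```python
def decode_element(data: bytes, offset: int, units: int,
                   signed: bool = False, little_endian: bool = True) -> int:
    """Decode `units` bytes at `offset` as an integer (hypothetical helper)."""
    chunk = data[offset:offset + units]
    if len(chunk) != units:
        raise ValueError("not enough bytes for this element")
    # little_endian selects the byte order, signed selects two's complement
    byteorder = "little" if little_endian else "big"
    return int.from_bytes(chunk, byteorder=byteorder, signed=signed)


raw = bytes([0xFE, 0xFF, 0x01, 0x02])
print(decode_element(raw, 0, 2, signed=True))          # -2
print(decode_element(raw, 2, 2, little_endian=False))  # 258
```

The same two bytes yield different numbers depending on these fields, which is why each element carries them explicitly.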

Deferred Object

Sometimes your binary has dynamic data. We can still parse it by using a deferred definition. Using this approach, you are defining a contract that gives an offset (relative to the start of the file) and a size in bytes of future data. Once all the elements have been read, the deferred data types are processed according to your spec.

This pattern is based on the technique of inserting a marker in your binary that is effectively a custom pointer. Both the size and offset must be 4-byte values. There are no plans to support alternative integer sizes for these values.

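#include <stdint.h>

typedef struct {
	uint32_t offset;	/* position of the data, relative to the start of the file */
	uint32_t size;		/* length of the data in bytes */
} def_location_t;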
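
To make the marker concrete, the following Python sketch follows one 8-byte marker to the data it points at. It assumes the marker is stored little-endian, which the text above does not specify, and read_deferred is a hypothetical name rather than part of Bender.

```python
import struct


def read_deferred(data: bytes, marker_offset: int) -> bytes:
    """Follow an (offset, size) marker and return the bytes it points at."""
    # Two unsigned 32-bit values: offset, then size.
    # Little-endian ("<") is an assumption of this sketch.
    offset, size = struct.unpack_from("<II", data, marker_offset)
    if offset + size > len(data):
        raise ValueError("marker points past the end of the file")
    # The offset is relative to the start of the file
    return data[offset:offset + size]


# An 8-byte marker at position 0 pointing at 3 bytes stored at offset 8
blob = struct.pack("<II", 8, 3) + b"abc"
print(read_deferred(blob, 0))  # b'abc'
```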

Matrix Object

A matrix element is a definition for representing your data as a matrix. When a matrix is detected by the Bender parser, the format of the parent element controls the representation of the bytes. For deferred objects, the total size is captured from the 4-byte size definition which is used to calculate the total row count. For immediate objects, the element's units field is used to determine the total size.

Field	Description	Legal Values
columns	How many variables per row	A positive integer
units	How many bytes per variable	A positive integer
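
The row count calculation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Bender's code: the function name to_matrix is hypothetical, and every variable is decoded as an unsigned integer, whereas Bender uses the parent element's format.

```python
def to_matrix(data: bytes, columns: int, units: int,
              little_endian: bool = True) -> list:
    """Split `data` into rows of `columns` variables, each `units` bytes wide."""
    row_bytes = columns * units
    if row_bytes <= 0:
        raise ValueError("columns and units must be positive")
    if len(data) % row_bytes != 0:
        raise ValueError("data size is not a whole number of rows")
    byteorder = "little" if little_endian else "big"
    rows = []
    # Total row count is len(data) // row_bytes
    for row_start in range(0, len(data), row_bytes):
        row = []
        for col in range(columns):
            start = row_start + col * units
            row.append(int.from_bytes(data[start:start + units], byteorder))
        rows.append(row)
    return rows


print(to_matrix(bytes([1, 2, 3, 4, 5, 6]), columns=3, units=1))
# [[1, 2, 3], [4, 5, 6]]
```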

Structure Object

Sometimes your data is more than just a number or a matrix. Use structures to define sequences of bytes that create more complicated data types. Each element of a structure may itself be a structure or a matrix, and a structure can mix deferred and immediate objects.

There are no checks in place to detect circular references, so be careful. Also, the only limit on structure depth is your stack size.

Field	Description	Legal Values
name	Name referenced by an element	strings
elements	A list of elements contained in this structure	Any valid Element listed under structure_elements

Enumeration Object

For numeric types, you can define an enumeration to map the number to a string. For example,

typedef enum {
   my_enum_foo = 0,
   my_enum_bar,
   my_enum_baz
} my_enum_t;

could have a definition of

Value	Name
0	Foo
1	Bar
2	Baz

which will allow Bender to replace the integer with its string representation automatically.
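
In effect the enumeration is a lookup table. A minimal Python sketch of the substitution is shown below; the name render_enum is hypothetical, and falling back to the raw number for unmapped values is an assumption of this example, not documented Bender behavior.

```python
# Value-to-name table from the example above
MY_ENUM = {0: "Foo", 1: "Bar", 2: "Baz"}


def render_enum(value: int, names: dict) -> str:
    """Return the mapped name, or the number itself when no name is defined."""
    return names.get(value, str(value))


print(render_enum(1, MY_ENUM))  # Bar
print(render_enum(7, MY_ENUM))  # 7
```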

Arrays

An array can be either implicit or explicit. Explicit arrays use the is_array_count property to indicate that a value represents the count of elements to follow. Any numeric element can be marked as an array count to indicate that the next element should be repeated that many times. This supports a structure similar to the following:

typedef struct {
	uint32_t aCount;	/* count of elements in a */
	struct some_struct_t a[SOME_CONSTANT_A];
	uint32_t bCount;	/* count of elements in b */
	int b[SOME_CONSTANT_B];
} def_array_t;

This allows you to support immediate data with arbitrary length. The count and the element being repeated are separate objects so you can name and format them separately. Any type of element can be repeated.
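
Reading an explicit array might look roughly like the Python sketch below. It assumes a 4-byte little-endian count followed by unsigned integer items; read_counted_array is a hypothetical helper and does not reflect how Bender is implemented.

```python
import struct


def read_counted_array(data: bytes, offset: int, item_units: int):
    """Read a uint32 count, then that many items of `item_units` bytes each.

    Returns the decoded items and the offset just past the array.
    """
    (count,) = struct.unpack_from("<I", data, offset)  # little-endian assumed
    offset += 4
    items = []
    for _ in range(count):
        chunk = data[offset:offset + item_units]
        if len(chunk) != item_units:
            raise ValueError("array runs past the end of the data")
        items.append(int.from_bytes(chunk, "little"))
        offset += item_units
    return items, offset


# Two counted arrays back to back: three 1-byte items, then two 2-byte items
blob = struct.pack("<I", 3) + bytes([10, 20, 30]) + struct.pack("<IHH", 2, 5, 6)
a, next_offset = read_counted_array(blob, 0, item_units=1)
b, end = read_counted_array(blob, next_offset, item_units=2)
print(a, b)  # [10, 20, 30] [5, 6]
```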

An implicit array uses the is_array property to hint that the current container may contain more than one of the specified element. In this case, all elements must have the same size. Each block equal to the expected element's size is parsed as an element until all blocks are exhausted.
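
The block-by-block parsing of an implicit array can be sketched in Python as below. The helper name split_implicit_array is hypothetical, and rejecting a container whose size is not a multiple of the element size is a choice made for this example; how Bender treats leftover bytes is not stated here.

```python
def split_implicit_array(block: bytes, element_size: int) -> list:
    """Cut `block` into consecutive chunks of `element_size` bytes."""
    if element_size <= 0:
        raise ValueError("element_size must be positive")
    if len(block) % element_size != 0:
        raise ValueError("container size is not a multiple of the element size")
    return [block[i:i + element_size]
            for i in range(0, len(block), element_size)]


print(split_implicit_array(b"\x01\x02\x03\x04\x05\x06", 2))
# [b'\x01\x02', b'\x03\x04', b'\x05\x06']
```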

Format

An element may be formatted as any of the following:

binary Uses 0b prefix, e.g. 0b00000010 == 2
octal Uses O prefix, e.g. O11 == 9
decimal No prefix
hex Uses 0x prefix
ascii Interprets the bytes as a string with 7 bits per character. Length in bytes equals the element's units.
unicode Interprets the bytes as Unicode characters. Length in bytes may not equal the character count.
bigint Interprets arbitrary count of bytes as a hex formatted string without a prefix.
float Units must be 4 bytes for single precision or 8 bytes for double precision.

For string types, the units field of the element is the count of letters in the string. For ASCII, that happens to be equal to the count of bytes. For Unicode, since a character may be composed of multiple bytes, the length in bytes may not equal the character count.
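
The numeric prefixes and the character versus byte distinction can be illustrated with a short Python sketch. The function format_value, its bits padding parameter, and the uppercase hex digits are assumptions of this example; UTF-8 is likewise assumed for the string, since the encoding is not specified above.

```python
def format_value(value: int, fmt: str, bits: int = 8) -> str:
    """Render a non-negative integer using the prefixes listed above."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    if fmt == "binary":
        # Zero-padded to `bits` digits, e.g. 0b00000010
        return "0b" + format(value, "0{}b".format(bits))
    if fmt == "octal":
        return "O" + format(value, "o")
    if fmt == "decimal":
        return str(value)
    if fmt == "hex":
        return "0x" + format(value, "X")
    raise ValueError("unsupported format: " + fmt)


print(format_value(2, "binary"))  # 0b00000010
print(format_value(9, "octal"))   # O11
print(format_value(255, "hex"))   # 0xFF

# Character count and byte count differ once a character needs several bytes
text = "héllo"
print(len(text), len(text.encode("utf-8")))  # 5 6
```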
