VLSV File Format

Introduction

The VLSV file format was created for writing domain-decomposed meshes, and variables defined on their zones, efficiently in parallel using the Message Passing Interface (MPI). Its features include:

Data written in parallel using collective MPI calls
Parallel visualization with VisIt
Scalability tested up to tens of millions of zones per mesh
Multiple meshes per file
Stretched Cartesian, Cylindrical, and Spherical coordinate systems
Arbitrary coordinate scaling for visualization purposes (for example, from meters to Earth radii)
User-defined axis labels and units

Data and File Format

VLSV assumes that data is written as arrays, where each array element is a vector of size vectorsize. The number of elements in an array is denoted arraysize. All vectors in an array must have the same size, i.e., vectorsize cannot change between array elements. Furthermore, every vector element must have the same primitive datatype (int32_t, float, etc.).

Each VLSV file starts with two 64-bit (uint64_t) integers. The first integer stores the integer endianness. The second integer is the byte offset of the footer (see below). These two header integers are followed by the array data.
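The header layout described above can be read with a few lines of C++. This is only a minimal sketch: the names VlsvHeader and readVlsvHeader are hypothetical, the encoding of the endianness value is not described here so it is returned uninterpreted, and the code assumes the file was written with the same byte order as the machine reading it.

```cpp
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper type: the two 64-bit integers at the start of a VLSV file.
struct VlsvHeader {
    std::uint64_t endianness;    // first integer: endianness value (not interpreted here)
    std::uint64_t footerOffset;  // second integer: byte offset of the XML footer
};

// Reads the two leading uint64_t values into 'header'.
// Returns false if the file cannot be opened or is shorter than 16 bytes.
// Assumption: writer and reader share the same byte order, so no swapping is done.
bool readVlsvHeader(const std::string& path, VlsvHeader& header) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;

    std::uint64_t values[2] = {0, 0};
    in.read(reinterpret_cast<char*>(values), sizeof(values));
    if (in.gcount() != static_cast<std::streamsize>(sizeof(values))) return false;

    header.endianness   = values[0];
    header.footerOffset = values[1];
    return true;
}
```

A reader would then seek to footerOffset and parse everything from there to the end of the file as XML.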

An XML footer is written at the end of the file, enclosed in a pair of opening and closing tags. The footer contains zero or more tags that give, for each array, the number of elements, vector size, datatype, and any additional attributes passed by the user. XML tags and their attributes also tell how the arrays map to simulation meshes, or to a variable defined on a mesh.

Each XML tag associated with an array has the attributes arraysize, vectorsize, datasize, and datatype. Attributes arraysize and vectorsize give the number of elements in the array and the size of each data vector, respectively. Attributes datasize and datatype indicate the primitive datatype stored in the vector. The value of a tag is the byte offset of the array data, counting from the start of the file.
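The attributes are enough to compute how many bytes an array occupies and where a given value sits in the file. The sketch below uses a hypothetical ArrayInfo struct and assumes that vectors are stored contiguously, one element after another; that layout is an assumption and is not stated above.

```cpp
#include <cstdint>

// Hypothetical container for the size attributes of one array tag and the tag's value.
struct ArrayInfo {
    std::uint64_t arraysize;   // number of elements (vectors) in the array
    std::uint64_t vectorsize;  // number of components per element
    std::uint64_t datasize;    // size of one component in bytes
    std::uint64_t offset;      // tag value: byte offset of the array data in the file
};

// Total number of bytes occupied by the array data.
std::uint64_t arrayBytes(const ArrayInfo& a) {
    return a.arraysize * a.vectorsize * a.datasize;
}

// Byte offset of one component of one element.
// Assumption: elements are stored back to back, each as a contiguous vector.
std::uint64_t componentOffset(const ArrayInfo& a,
                              std::uint64_t element,
                              std::uint64_t component) {
    return a.offset + (element * a.vectorsize + component) * a.datasize;
}
```

For instance, an array with arraysize 100, vectorsize 3, and datasize 4 occupies 100 * 3 * 4 = 1200 bytes.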

For example, consider the following tag:
<A arraysize="100" vectorsize="3" datasize="4" datatype="int">17253123</A>

Array A above has 100 elements, and each element is a vector of size 3. The datatype of each vector element is a 32-bit (4-byte) signed integer. Array data starts at a file offset of 17253123 bytes.

Multi-Domain Meshes

A multi-domain mesh is your vanilla uniform Cartesian mesh consisting of (Nx,Ny,Nz) zones and (Nx+1,Ny+1,Nz+1) nodes in (x,y,z) directions. The twist is that zones are decomposed into independent domains, as illustrated by the coloring in the figure below. Zones can be written out in any order, and some of them may not even exist (you can have holes in the mesh).

Typically a parallel simulation is run with multiple MPI processes, each responsible for performing computations on variables defined in the zones of its own domain. This requires that MPI processes keep local copies of some zones defined in neighboring domains, as illustrated by the uncolored zones in the figure below. Here such zones are called ghost zones, or simply ghosts. Parallel visualization requires that each domain can be processed independently, thus each MPI process must write out its own domain plus ghost zones.

FIGURE: Example of a Cartesian mesh decomposed into four domains (red, blue, green, orange) that are each owned by a different MPI process. Pictures on the right show ghost zones (white).

Rationale: Strictly speaking, the mesh can be drawn without information on ghost zones. However, many plot types in VisIt are unable to cross domain boundaries correctly without variable data in ghosts.

A mesh is defined by its bounding box, i.e. the "global" simulation mesh. A domain is simply an unordered collection of zone (local+ghost) global ID numbers. Zones have unique global ID numbers that are calculated according to the C convention,
global ID = k*Ny*Nx + j*Nx + i,

where (i,j,k) are the zone indices, as illustrated in the figure below. Replace Nx by (Nx+1) and Ny by (Ny+1) in the formula above to get node global indices. Node and zone (i,j,k) indices are the mesh's logical coordinates. Mapping into physical coordinates depends on the coordinate system used.
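The numbering formula and its node variant can be written as small helper functions. This is a sketch with hypothetical names (MeshSize, zoneGlobalId, nodeGlobalId, zoneIndices); the inverse mapping is not given above and is simply derived from the formula.

```cpp
#include <cstdint>

// Hypothetical description of the mesh bounding box: number of zones per direction.
struct MeshSize {
    std::uint64_t Nx, Ny, Nz;
};

// Logical (i,j,k) indices of a zone.
struct ZoneIndex {
    std::uint64_t i, j, k;
};

// Zone global ID: k*Ny*Nx + j*Nx + i.
std::uint64_t zoneGlobalId(const MeshSize& m,
                           std::uint64_t i, std::uint64_t j, std::uint64_t k) {
    return k * m.Ny * m.Nx + j * m.Nx + i;
}

// Node global ID: same formula with Nx -> Nx+1 and Ny -> Ny+1.
std::uint64_t nodeGlobalId(const MeshSize& m,
                           std::uint64_t i, std::uint64_t j, std::uint64_t k) {
    return k * (m.Ny + 1) * (m.Nx + 1) + j * (m.Nx + 1) + i;
}

// Inverse of zoneGlobalId: recover (i,j,k) from a zone global ID.
ZoneIndex zoneIndices(const MeshSize& m, std::uint64_t globalId) {
    ZoneIndex z;
    z.i = globalId % m.Nx;
    z.j = (globalId / m.Nx) % m.Ny;
    z.k = globalId / (m.Nx * m.Ny);
    return z;
}
```

In a mesh of 4 x 3 x 2 zones, for example, zone (1,2,1) has global ID 1*3*4 + 2*4 + 1 = 21, and node (1,2,1) has global ID 1*4*5 + 2*5 + 1 = 31.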

When a mesh is visualized, node coordinates are obtained by a table lookup, i.e. you are required to provide three arrays of sizes Nx+1, Ny+1, and Nz+1, containing the nodes' x, y, and z coordinate values. For example, assume that these arrays are called xcrds, ycrds, and zcrds. Assuming a Cartesian coordinate system, the (x,y,z) coordinates of a node having indices (i,j,k) are xcrds[i], ycrds[j], and zcrds[k]. This is how the VLSV format implements stretched meshes: zone sizes do not need to be uniform. Strictly speaking, the values in arrays xcrds, ycrds, and zcrds depend on the coordinate system, as discussed under Coordinate Systems below.
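Because zone i is bounded by nodes i and i+1, the size and midpoint of a zone follow directly from the node coordinate arrays. The helper names zoneWidth and zoneCenter below are hypothetical, and the sketch is meant only to illustrate how non-uniform zone sizes arise from the lookup tables.

```cpp
#include <cstddef>
#include <vector>

// Width of zone i along one direction: the distance between its two bounding nodes.
// 'crds' holds the node coordinates for that direction (one more entry than zones).
float zoneWidth(const std::vector<float>& crds, std::size_t i) {
    return crds[i + 1] - crds[i];
}

// Midpoint of zone i along one direction, in the same coordinate as 'crds'.
float zoneCenter(const std::vector<float>& crds, std::size_t i) {
    return 0.5f * (crds[i] + crds[i + 1]);
}
```

With xcrds = {0, 1, 3, 7}, the three zones in the x direction have widths 1, 2, and 4, which is a stretched mesh.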

FIGURE: Two-dimensional example of a mesh bounding box. Zones are labeled with their (i_cell, j_cell) indices, and nodes with (i_node, j_node) indices. Note that the number of nodes per coordinate direction is number of zones plus one.

Your simulation may of course use a different numbering scheme for nodes and zones. When writing VLSV files, however, zones must be numbered as presented above.

Coordinate Systems

All meshes in VLSV format are logically Cartesian, i.e. the mesh is defined by its Cartesian bounding box, and node coordinates can be calculated from (i,j,k) indices. With a Cartesian coordinate system the logical coordinates coincide with physical coordinates,

float x = xcrds[i];
float y = ycrds[j];
float z = zcrds[k];

However, in the case of cylindrical and spherical coordinate systems the logical coordinates map to (r,phi,z) and (r,theta,phi), respectively. The VLSV visualization plugin maps these into Cartesian coordinates for VisIt,

// Cylindrical: logical coordinates are (r, phi, z)
float r_cyl   = xcrds[i];
float phi_cyl = ycrds[j];
float x_cyl = r_cyl * cos(phi_cyl);
float y_cyl = r_cyl * sin(phi_cyl);
float z_cyl = zcrds[k];   // z is passed through unchanged

// Spherical: logical coordinates are (r, theta, phi)
float r_sph     = xcrds[i];
float theta_sph = ycrds[j];
float phi_sph   = zcrds[k];
float x_sph = r_sph * sin(theta_sph) * cos(phi_sph);
float y_sph = r_sph * sin(theta_sph) * sin(phi_sph);
float z_sph = r_sph * cos(theta_sph);

HOWTO: Write a Multi-Domain Mesh

Here is a checklist (for each domain):

Separate zones into locals and ghosts. Local zones are always written first; only mesh data needs ghost zones.
Figure out an order in which to write out zones.
At this point you have local IDs for all zones (=the order they are written to file).
Synchronize local IDs with neighbor processes.
Each process must know the domain (=MPI rank) to which each ghost zone belongs, and the local ID of the ghost zone in that domain.
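The first three steps of the checklist can be sketched for a single domain as follows. The DomainZones struct and the orderZones function are hypothetical and are not part of any VLSV API; the synchronization of local IDs with neighbor processes requires MPI communication and is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical per-domain bookkeeping for the zones one MPI process writes.
struct DomainZones {
    std::vector<std::uint64_t> globalIds;                      // write order: locals first, then ghosts
    std::size_t numLocal;                                      // number of local (non-ghost) zones
    std::unordered_map<std::uint64_t, std::size_t> localIdOf;  // global ID -> local ID
};

// Builds the write order for one domain. The local ID of a zone is simply
// its position in the write order, so locals get IDs 0..numLocal-1 and
// ghosts get the IDs that follow.
DomainZones orderZones(const std::vector<std::uint64_t>& locals,
                       const std::vector<std::uint64_t>& ghosts) {
    DomainZones d;
    d.numLocal  = locals.size();
    d.globalIds = locals;
    d.globalIds.insert(d.globalIds.end(), ghosts.begin(), ghosts.end());
    for (std::size_t localId = 0; localId < d.globalIds.size(); ++localId) {
        d.localIdOf[d.globalIds[localId]] = localId;
    }
    return d;
}
```

Each process could then look up, through localIdOf, the local IDs that its neighbors need to know for the zones they hold as ghosts.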