Hunter Belanger edited this page Apr 9, 2020 · 7 revisions

This page gives an overview of how to use the exdir-cpp library once it has been built. It assumes that you have either installed the libraries and headers into the default locations, or placed them somewhere else that you know how to point your compiler and linker at.

Includes

To use exdir-cpp in your code, include the exdir/exdir.hpp header file in your source file. This should be the only header you need: it includes the other exdir-cpp header files, giving you access to all of the library's features. All of the library's features exist inside the exdir namespace.

Creating and Opening an Exdir Directory

An Exdir directory is represented as an exdir::File object. You can create a new Exdir directory with

exdir::File exdir_file = exdir::create_file("name/of/file.exdir");

The argument can be either an std::string or an std::filesystem::path object giving the location where you would like the directory to be. This line of code will create the required folder, and the exdir.yaml file within it. If a directory with that name already exists, a runtime error will be thrown.

If you would like to open an Exdir directory which already exists on your system, you can use

exdir::File exdir_file("name/of/existing/file.exdir");

where the argument can again be an std::string or std::filesystem::path object.
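Because creating a directory throws when it already exists, while opening expects it to be present, it can be useful to look at the path before choosing between the two. The following is a minimal sketch that uses only std::filesystem. The helper name looks_like_exdir is hypothetical and is not part of exdir-cpp; it only checks for the folder and the exdir.yaml file described above.

```cpp
#include <filesystem>

// Hypothetical helper (not part of exdir-cpp): returns true if `dir` is an
// existing folder that contains an exdir.yaml file.
inline bool looks_like_exdir(const std::filesystem::path& dir) {
  namespace fs = std::filesystem;
  return fs::is_directory(dir) && fs::is_regular_file(dir / "exdir.yaml");
}
```

A true result only says that the expected folder layout is present, not that the contents of exdir.yaml are valid.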

Groups

To access a Group which already exists, both File and Group objects have a method called get_group, which takes a string with the name of the group. Continuing the previous example, if there is a group called "group_name" in the exdir_file, then we can access it with

exdir::Group group = exdir_file.get_group("group_name");

A list of all groups contained in a File or parent Group can be obtained with the member_groups() method, which returns an std::vector<std::string> object naming all the groups which are present within that object. A new group can be added with

exdir::Group group2 = exdir_file.create_group("group_name_2");

Datasets

A Dataset is used to contain numerical data, and can be read from a group or file with their get_dataset("name") method, which returns a Dataset object. In doing this, the stored numerical data will be read into the object, ready to be retrieved. Due to the strict static type system of C++, the data must be cast to the appropriate type. To help with this, an enum called DType exists in the exdir namespace. The possible values for a DType and the C++ types they represent are:

  • CHAR = char
  • UCHAR = unsigned char
  • INT16 = int16_t
  • INT32 = int32_t
  • INT64 = int64_t (int)
  • UINT16 = uint16_t
  • UINT32 = uint32_t
  • UINT64 = uint64_t
  • FLOAT32 = float
  • DOUBLE64 = double

So, when a dataset is loaded, the data type can be checked with

exdir::Dataset data = exdir_file.get_dataset("data_set_1");
exdir::DType data_dtype = data.dtype();

To actually get the data out in a usable manner, it must then be retrieved from the object, at which point it is also cast to the desired type. Should data_dtype == exdir::DType::DOUBLE64 be true, then it is certainly safe to cast the data to double. Data is always retrieved into an exdir::Array object.

exdir::Array<double> data_array = data.retrieve_data<double>();
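The check of the stored type against the requested C++ type can be made less error-prone with a compile-time mapping from type to DType. The sketch below is only an illustration: the enum in namespace sketch is a stand-in that reuses the value names listed above, and dtype_of is a hypothetical helper, not something exdir-cpp is known to provide.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {

// Stand-in for exdir::DType, reusing the value names from the list above.
enum class DType { CHAR, UCHAR, INT16, INT32, INT64,
                   UINT16, UINT32, UINT64, FLOAT32, DOUBLE64 };

// Hypothetical helper: maps a C++ type to the matching DType at compile time.
// Unsupported types are rejected with a compile error.
template <typename T>
constexpr DType dtype_of() {
  if constexpr (std::is_same_v<T, char>) { return DType::CHAR; }
  else if constexpr (std::is_same_v<T, unsigned char>) { return DType::UCHAR; }
  else if constexpr (std::is_same_v<T, std::int16_t>) { return DType::INT16; }
  else if constexpr (std::is_same_v<T, std::int32_t>) { return DType::INT32; }
  else if constexpr (std::is_same_v<T, std::int64_t>) { return DType::INT64; }
  else if constexpr (std::is_same_v<T, std::uint16_t>) { return DType::UINT16; }
  else if constexpr (std::is_same_v<T, std::uint32_t>) { return DType::UINT32; }
  else if constexpr (std::is_same_v<T, std::uint64_t>) { return DType::UINT64; }
  else if constexpr (std::is_same_v<T, float>) { return DType::FLOAT32; }
  else if constexpr (std::is_same_v<T, double>) { return DType::DOUBLE64; }
  else { static_assert(sizeof(T) == 0, "no DType for this C++ type"); }
}

}  // namespace sketch
```

If an equivalent helper returned exdir::DType rather than the stand-in enum, the comparison from the text could be written against dtype_of<double>() before asking for the data as double.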

From the Array, the data can be accessed in place, or be transferred to the user's preferred object. Once the data has been retrieved and copied into the Array, the data in the Dataset object is erased to conserve memory. If you wish to load another Array with the same data, you must first re-load the data into the Dataset, using

data.load_data();

Alternatively, if data has been loaded (which can be verified with the method data.data_loaded(), which returns true if data is retrievable), you can clear the data from the Dataset using

data.clear_data();
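The load, retrieve, and clear behaviour described above can be pictured with a small toy model. ToyDataset below is a stand-in written for this page only; it is not exdir::Dataset and says nothing about how the library is implemented. It just mirrors the described states: data is loaded on construction, retrieval empties the object, and load_data() fills it again.

```cpp
#include <utility>
#include <vector>

// Toy stand-in for illustration only; this is not exdir::Dataset.
class ToyDataset {
 public:
  // Mirrors the text: obtaining a dataset also loads its data.
  explicit ToyDataset(std::vector<double> stored) : stored_(std::move(stored)) {
    load_data();
  }

  // Copies the "stored" values into the in-memory buffer.
  void load_data() {
    buffer_ = stored_;
    loaded_ = true;
  }

  // True while the buffer holds retrievable data.
  bool data_loaded() const { return loaded_; }

  // Hands the buffer to the caller and leaves the dataset empty.
  std::vector<double> retrieve_data() {
    loaded_ = false;
    return std::exchange(buffer_, {});
  }

  // Drops the buffer without handing it out.
  void clear_data() {
    buffer_.clear();
    buffer_.shrink_to_fit();
    loaded_ = false;
  }

 private:
  std::vector<double> stored_;  // pretend this lives in the data file
  std::vector<double> buffer_;  // in-memory copy
  bool loaded_ = false;
};
```

In this model a second retrieval without an intervening load_data() would return an empty vector, which is why the reload step matters.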

Creation of a new Dataset also requires an exdir::Array object. Assuming the Array already exists, then one uses

group2.create_dataset("dataset_name", data_array_2);

where data_array_2 is of type exdir::Array. Note that this method does not return a Dataset object for the newly created Dataset. If you would like to access the dataset further (to add attributes, for example), you must first get the object with exdir::Dataset data_2 = group2.get_dataset("dataset_name");.

Arrays

Arrays are multi-dimensional objects, used to send and receive data from exdir::Dataset objects. They are similar to the NumPy arrays which exist in Python. The data is stored in a 1D vector, and the linear index is calculated from the provided indices depending on whether the stored data is C-contiguous or Fortran-contiguous. An array can be obtained from a Dataset, but can also be created from an std::vector.

std::vector<int> array_data { 1,  2,  3,  4,
                              5,  6,  7,  8,
                              9, 10, 11, 12,
                             13, 14, 15, 16};

std::vector<size_t> array_shape {4,4};
bool array_c_contiguous = true;
exdir::DType array_dtype = exdir::DType::INT64;
exdir::Array<int> test_array(array_data, array_shape, array_dtype, array_c_contiguous);

This example creates a 4x4 matrix of integers. The shape is defined by a vector of type size_t, with one element per dimension. Once created, elements can be accessed using the () operator:

int array_element = test_array(2,2); // array_element = 11 based on the previous array
std::vector<size_t> indices {2,2};
int array_element_2 = test_array(indices); // array_element_2 = 11 as well
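The index arithmetic behind this kind of lookup can be sketched as follows. This is an illustration of the usual row-major (C) and column-major (Fortran) formulas, not the library's own code; the function name linear_index and the exception types are assumptions made for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not the library's code): converts one index per
// dimension into a position in the flat 1D storage.
inline std::size_t linear_index(const std::vector<std::size_t>& shape,
                                const std::vector<std::size_t>& indices,
                                bool c_contiguous) {
  if (indices.size() != shape.size()) {
    throw std::invalid_argument("need exactly one index per dimension");
  }

  const std::size_t n = shape.size();
  std::size_t linear = 0;
  std::size_t stride = 1;
  for (std::size_t step = 0; step < n; ++step) {
    // C order: the last index varies fastest. Fortran order: the first does.
    const std::size_t d = c_contiguous ? n - 1 - step : step;
    if (indices[d] >= shape[d]) {
      throw std::out_of_range("index exceeds the extent of its dimension");
    }
    linear += indices[d] * stride;
    stride *= shape[d];
  }
  return linear;
}
```

For the 4x4 example, linear_index({4, 4}, {2, 2}, true) evaluates to 10, which is the zero-based position of the value 11 in array_data.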

The index values can be passed directly to the operator, or they can be put into a vector and passed as a single object. There must be one index per dimension of the Array; if this is not the case, an error will be thrown. The data may also be accessed by its linear index using the [] operator. Like a NumPy array, an exdir::Array may also be reshaped by passing a new vector of size_t values, which must describe a shape with the same total number of elements:

test_array.reshape({4,2,2});
/*
test_array now resembles an array of the form
[[[ 1,  2],
  [ 3,  4]],

 [[ 5,  6],
  [ 7,  8]],

 [[ 9, 10],
  [11, 12]],

 [[13, 14],
  [15, 16]]]
*/

The linear index of each data element has remained the same, but an element is no longer indexed with the same values. Indexing this Array with the () operator now also requires 3 arguments instead of 2.
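To see how the indices of one element change under a reshape, the flat position can be converted back into per-dimension indices. The sketch below assumes C-contiguous storage; unravel_index is a hypothetical helper named after the NumPy function of the same purpose, not part of exdir-cpp.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not the library's code): recovers the per-dimension
// indices of a flat position for a C-contiguous array of the given shape.
inline std::vector<std::size_t> unravel_index(
    std::size_t linear, const std::vector<std::size_t>& shape) {
  std::size_t total = 1;
  for (std::size_t extent : shape) {
    total *= extent;
  }
  if (linear >= total) {
    throw std::out_of_range("linear index exceeds the number of elements");
  }

  std::vector<std::size_t> indices(shape.size());
  // Peel off the fastest-varying (last) dimension first.
  for (std::size_t d = shape.size(); d-- > 0;) {
    indices[d] = linear % shape[d];
    linear /= shape[d];
  }
  return indices;
}
```

The value 11 sits at flat position 10. With shape {4, 4} that position unravels to (2, 2), and with shape {4, 2, 2} it unravels to (2, 1, 0), which matches the third block, second row, first column of the reshaped layout shown above.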

Raws

Raws can be accessed or created with commands similar to those of groups.

exdir::Raw raw = data_set.get_raw("raw_data");
exdir::Raw raw2 = data_set.create_raw("more_raw_data");

Other objects cannot be added to a Raw. Raws do, however, have a special method which returns a vector of strings naming the files inside the directory.

std::vector<std::string> raw_files = raw.member_file();

Attributes

All objects (File, Group, Dataset, Raw) are able to contain attributes, which hold properties describing them. These are accessed through the attrs member of the object. This public member is actually a raw yaml-cpp node. It works somewhat like a dictionary, taking an std::string as a key. You can set an attribute for an object with something along the lines of

group2.attrs["density"] = 2.3;

To learn more about how to use the attributes and yaml-cpp nodes, take a look at the yaml-cpp tutorial.

If you make changes to attributes, they will not be saved unless you call the write() method of the object.

group2.write();

Attributes are NOT saved on destruction of an object!
