cvxgrp/cvxbson

IPC

IPC stands for interprocess communication, a mechanism that allows processes to share data. A traditional way to do so is to exchange JSON files. JSON files are flexible and can be used to share data between different programming languages. However, they are not very efficient.

Here we use their binary counterpart, BSON files. BSON files are much more efficient but lack some of the flexibility of JSON files. We rely on the bson package to read and write BSON files. Our goal is to parse dictionaries of numpy arrays and of pandas and polars dataframes as fast as possible.
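
To give a rough feel for the size difference between a text and a binary encoding, here is a minimal sketch that uses only the Python standard library. It does not use cvxbson or the bson package: `struct` merely stands in for a binary format, and all variable names are illustrative assumptions.

```python
import json
import random
import struct

random.seed(0)

# 1000 random doubles standing in for a flattened matrix
values = [random.random() for _ in range(1000)]

# Text encoding: every float is written out as a decimal string
as_json = json.dumps(values).encode("utf-8")

# Binary encoding: every float takes exactly 8 bytes (little-endian double)
as_binary = struct.pack(f"<{len(values)}d", *values)

# Both representations round-trip to the original values
from_json = json.loads(as_json)
from_binary = list(struct.unpack(f"<{len(values)}d", as_binary))
assert from_json == values
assert from_binary == values

print(f"JSON: {len(as_json)} bytes, binary: {len(as_binary)} bytes")
```

The binary payload has a fixed cost per number, while the JSON payload grows with the number of digits printed for each value.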

There might be faster ways to achieve this goal and we are open to suggestions and pull requests.

We recommend using JSON files to transfer configurations and small amounts of data. BSON files can then be used to transfer large matrices. The two formats can coexist, and we encourage using both.

Demo

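import numpy as np

from src.cvx.bson import read_bson, write_bson

# A dictionary mapping names to numpy arrays (a matrix and a vector)
data = {"A": np.random.rand(50, 50), "B": np.random.rand(50)}

# Write the dictionary to a BSON file and read it back
write_bson("test.bson", data)
recovered = read_bson("test.bson")

# The recovered arrays match the originals
assert np.allclose(data["A"], recovered["A"])
assert np.allclose(data["B"], recovered["B"])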

We have also implemented the same functionality for JSON files but would advise against using it. It is much slower and less efficient.

You may want to avoid creating files explicitly. It is possible to work directly with BSON strings held in memory, and we provide methods for that, too.
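
The idea of skipping the file and passing bytes around in memory can be sketched as follows. This is not the cvxbson API, whose method names are not listed here, and it does not produce BSON: it uses numpy's own `np.save` and `np.load` on an in-memory buffer, and the helper names `array_to_bytes` and `bytes_to_array` are hypothetical.

```python
import io

import numpy as np


def array_to_bytes(matrix: np.ndarray) -> bytes:
    """Serialize an array into a bytes object without touching the disk."""
    buffer = io.BytesIO()
    np.save(buffer, matrix, allow_pickle=False)
    return buffer.getvalue()


def bytes_to_array(payload: bytes) -> np.ndarray:
    """Rebuild the array from the bytes produced by array_to_bytes."""
    return np.load(io.BytesIO(payload), allow_pickle=False)


matrix = np.arange(12, dtype=np.float64).reshape(3, 4)

# The payload could be sent over a socket or pipe instead of written to a file
payload = array_to_bytes(matrix)
restored = bytes_to_array(payload)

assert np.array_equal(matrix, restored)
```

The same pattern applies to any serializer that returns bytes: the sender produces the payload, the receiver decodes it, and no temporary file is needed.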

uv

You need to install task first. Running

task cvxbson:install

will install uv and create the virtual environment defined in pyproject.toml and locked in uv.lock.

marimo

We install marimo on the fly within the aforementioned virtual environment. Executing

task cvxbson:marimo

will install and start marimo.