Animus

One framework to rule them all.

Animus is a "write it yourself"-based machine learning framework.
Please see examples/ for more information.
Framework architecture is mainly inspired by Catalyst.

FAQ

What is Animus?

Animus is a general-purpose, for-loop-based experiment wrapper. It structures an ML experiment with straightforward logic:

def run(experiment):
    for epoch in experiment.epochs:
        for dataset in epoch.datasets:
            for batch in dataset.batches:
                handle_batch(batch)

Each for loop is wrapped with on_{for}_start, run_{for}, and on_{for}_end for customisation purposes. Moreover, each for loop has its own metrics storage: {for}_metrics (batch_metrics, dataset_metrics, epoch_metrics, experiment_metrics).
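To make the hook naming concrete, here is a minimal sketch of how such a nested loop could be laid out. It is an illustration only, not the Animus source: the class names LoopExperiment and SumExperiment, the constructor arguments, and the way metrics are rolled up from one level to the next are all assumptions made for this example.

```python
class LoopExperiment:
    """Illustrative nested-loop experiment. NOT the Animus implementation."""

    def __init__(self, num_epochs, datasets):
        self.num_epochs = num_epochs
        # Hypothetical layout: {"train": [batch, batch, ...], "valid": [...]}
        self.datasets = datasets
        # One metrics storage per loop level.
        self.batch_metrics = {}
        self.dataset_metrics = {}
        self.epoch_metrics = {}
        self.experiment_metrics = {}

    # --- experiment level ---
    def on_experiment_start(self):
        self.experiment_metrics = {}

    def run_experiment(self):
        for epoch_step in range(1, self.num_epochs + 1):
            self.epoch_step = epoch_step
            self.on_epoch_start()
            self.run_epoch()
            self.on_epoch_end()

    def on_experiment_end(self):
        pass

    # --- epoch level ---
    def on_epoch_start(self):
        self.epoch_metrics = {}

    def run_epoch(self):
        for dataset_key, dataset in self.datasets.items():
            self.dataset_key, self.dataset = dataset_key, dataset
            self.on_dataset_start()
            self.run_dataset()
            self.on_dataset_end()

    def on_epoch_end(self):
        # Assumed roll-up: store epoch metrics under the epoch index.
        self.experiment_metrics[self.epoch_step] = self.epoch_metrics

    # --- dataset level ---
    def on_dataset_start(self):
        self.dataset_metrics = {}

    def run_dataset(self):
        for batch in self.dataset:
            self.batch = batch
            self.on_batch_start()
            self.run_batch()
            self.on_batch_end()

    def on_dataset_end(self):
        # Assumed roll-up: store dataset metrics under the dataset name.
        self.epoch_metrics[self.dataset_key] = self.dataset_metrics

    # --- batch level ---
    def on_batch_start(self):
        self.batch_metrics = {}

    def run_batch(self):
        raise NotImplementedError("write it yourself")

    def on_batch_end(self):
        pass

    def run(self):
        self.on_experiment_start()
        self.run_experiment()
        self.on_experiment_end()


class SumExperiment(LoopExperiment):
    """Toy subclass: the 'metric' is just the sum of the numbers in each batch."""

    def run_batch(self):
        self.batch_metrics["sum"] = sum(self.batch)

    def on_batch_end(self):
        # Accumulate the batch metric into the dataset-level storage.
        total = self.dataset_metrics.get("sum", 0)
        self.dataset_metrics["sum"] = total + self.batch_metrics["sum"]


experiment = SumExperiment(
    num_epochs=2,
    datasets={"train": [[1, 2], [3]], "valid": [[4]]},
)
experiment.run()
print(experiment.experiment_metrics)
```

In this sketch, customisation means overriding only the hooks you care about (here run_batch and on_batch_end) while the loop skeleton stays untouched.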

What are Animus' competitors?

Any high-level ML/DL library, such as Catalyst, Ignite, FastAI, Keras, etc.

Why do we need Animus if we have high-level alternatives?

Although I find high-level DL frameworks an essential step for the community and the spread of Deep Learning (I have written one myself), they have a few weaknesses.

First of all, they are usually tightly bound to a single "low-level" DL framework (JAX, PyTorch, TensorFlow). While "low-level" frameworks grow closer to each other every year, high-level frameworks introduce different syntactic sugar, which makes a fair comparison, or complementary use, of "low-level" frameworks impossible.

Secondly, high-level frameworks introduce high-level abstractions, which:

  • are built with some assumptions in mind, which could be wrong in your case,
  • can cause additional bugs - even "low-level" frameworks have quite a lot of them,
  • are really hard to debug/extend because of "user-friendly" interfaces and extra integrations.

While these issues may seem unimportant in common cases, like supervised learning with (features, targets), they become more and more important during research and heavy pipeline customization (e.g. privacy-aware multi-node distributed training with custom backpropagation).

Thirdly, many high-level frameworks try to divide the ML pipeline into data, hardware, model, etc. layers, making it easier for practitioners to start ML experiments and giving teams a tool to split pipeline responsibilities between members. However, while this speeds up the creation of ML pipelines, it disregards that ML experiment results are heavily conditioned on the model hyperparameters, the data preprocessing/transformations/sampling, and the hardware setup.
I find this to be the main reason why ML experiments fail: you have to focus on the whole data transformation pipeline simultaneously, from raw data through the training process to distributed inference, which is quite hard. That is why Animus has the Experiment abstraction (the Catalyst analog is IRunner), which connects all parts of the experiment: hardware backend, data transformations, model training, and validation/inference logic.

What is Animus' purpose?

Highlight common "breakpoints" in ML experiments and provide a unified interface for them.

What is Animus' main application?

Research experiments, where you have to define everything on your own to get the results right.

Does Animus have any requirements?

No. That's the point: only pure Python libraries. PyTorch and Keras can be used for extensions.

Do you have plans for documentation?

No. The Animus core is about 300 lines of code, so it's much easier to read than 3000 lines of documentation.

Demo