
Migrate to repype #21

Merged
merged 46 commits
Sep 17, 2024
Commits
3fc5463
Add repype dependency
kostrykin Sep 10, 2024
a2c1ed2
Remove legacy tests
kostrykin Sep 10, 2024
0ee7bf3
Install git in CI
kostrykin Sep 10, 2024
56122f0
Merge branch 'develop' into dev/repype
kostrykin Sep 10, 2024
6139482
Bump devcontainer to Python 3.11
kostrykin Sep 12, 2024
6c8a73e
Fix `postCreateCommand`
kostrykin Sep 12, 2024
d4d30bd
Migrate pipeline.py to repype
kostrykin Sep 12, 2024
b2a9913
Add .isort.cfg
kostrykin Sep 12, 2024
85027ec
Migrate stages to repype
kostrykin Sep 12, 2024
5e6997f
Reformat code
kostrykin Sep 12, 2024
e50879e
Add `Pipeline.process_image`
kostrykin Sep 12, 2024
1d0bbf0
Fix imports
kostrykin Sep 12, 2024
23b175d
Fix `Pipeline.configure`
kostrykin Sep 12, 2024
2ce9b59
Fix bugs
kostrykin Sep 12, 2024
d4def88
Fix
kostrykin Sep 12, 2024
26b8dbe
Update .gitignore
kostrykin Sep 13, 2024
27959a5
Add debug output
kostrykin Sep 13, 2024
cdc02a4
Fix loading image
kostrykin Sep 13, 2024
1bd754a
Fix to make tests pass
kostrykin Sep 13, 2024
caca002
Bump repype version to e9dc8d87788c984222ebd14e30364fa3120bff1c
kostrykin Sep 13, 2024
59b1097
Drop `superdsm.config`
kostrykin Sep 13, 2024
19ea1ee
Drop obsolete documentation sources
kostrykin Sep 13, 2024
03822b7
Migrate task.json to task.yml
kostrykin Sep 13, 2024
e2da02c
Bumb repype to 985d9e88a895c90f516e35cd5753370d3f9c9456
kostrykin Sep 13, 2024
a857564
Bumb repype to 97bcdc2baafd145fcaac0382c81b598a89c80ba7
kostrykin Sep 13, 2024
c480f2d
Start adding `superdsm.task.Task`
kostrykin Sep 13, 2024
e5f4546
Bumb repype to 6b0bbcef0eec1962486a1caf7d03180d950e3433
kostrykin Sep 13, 2024
f1f9882
Implement `superdsm.task.Task`
kostrykin Sep 13, 2024
1d4efee
Drop superdsm.batch and superdsm.export
kostrykin Sep 13, 2024
9f58480
Bump repype to 81e7b14fdd2acaa51145e624f7f701b069849111
kostrykin Sep 13, 2024
b1103a5
Add __main__.py
kostrykin Sep 13, 2024
424dd88
Bump repype to 7769b46149e9ec3c49ffae4878f507be0ac09d0b
kostrykin Sep 13, 2024
9474029
Bump repype to ae360bad8afdd2f937a85702bd9dd112dd5887e7
kostrykin Sep 13, 2024
e28bda5
Fix batch execution
kostrykin Sep 13, 2024
3881fef
Reformat status updates
kostrykin Sep 13, 2024
a5c020c
Bump repype to d20611bc89602a1cb6a14f8bedc7d191c004408b
kostrykin Sep 14, 2024
4942dfc
Fix tasks and bump repype to 93fb7524a8d4290fbbab48009eef4dcc9a96e13c
kostrykin Sep 15, 2024
b705f7c
Bump repype to 4fc9cdef0e2ac767ebbc911be9d18d31a0ea7cb7
kostrykin Sep 15, 2024
83e22f5
Bump repype to 6c1c95f6255715dfbf6f5a36f3b9a87a05ec540c
kostrykin Sep 15, 2024
24cb7ec
Add superdsm.textual.py
kostrykin Sep 15, 2024
0c3fae1
Pin repype to v1.0.0
kostrykin Sep 16, 2024
a77efdb
Update usage.rst
kostrykin Sep 16, 2024
308fd79
Add `install_requires`
kostrykin Sep 16, 2024
ed466bf
Add flake8
kostrykin Sep 16, 2024
241cace
Fix sphinx build
kostrykin Sep 16, 2024
073d6f5
Update the documentation to repype
kostrykin Sep 16, 2024
9 changes: 6 additions & 3 deletions .devcontainer/devcontainer.json
Original file line number Diff line number Diff line change
@@ -3,7 +3,10 @@
{
"name": "Python 3",
// Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
"image": "mcr.microsoft.com/devcontainers/python:0-3.8",
"image": "mcr.microsoft.com/devcontainers/python:0-3.11",
"containerEnv": {
"DOCKER_DEFAULT_PLATFORM": "linux/amd64"
},

// Features to add to the dev container. More info: https://containers.dev/features.
// "features": {},
@@ -12,13 +15,13 @@
// "forwardPorts": [],

// Use 'postCreateCommand' to run commands after the container is created.
"postCreateCommand": "pip3 install --user -r requirements.txt",
"postCreateCommand": "pip3 install -r requirements.txt",

// Configure tool-specific properties.
"customizations": {
"vscode": {
"settings": {
"python.analysis.extraPaths": ["/home/vscode/.local/lib/python3.8/site-packages"]
"python.analysis.extraPaths": ["/home/vscode/.local/lib/python3.11/site-packages"]
}
}
}
10 changes: 10 additions & 0 deletions .flake8
@@ -0,0 +1,10 @@
[flake8]

max-line-length = 120

extend-ignore = E221,E211,E222,E202,F541,E201,E203,E125

per-file-ignores =
repype/typing.py:F405

exclude = tests/*.py tests/textual/*.py docs/source/conf.py
6 changes: 3 additions & 3 deletions .github/workflows/regressiontests.yml
@@ -58,10 +58,10 @@ jobs:

- name: Run SuperDSM
run: |
python -m "superdsm.batch" examples --task-dir "${{ matrix.taskdir }}"
python -m "superdsm.batch" examples --task-dir "${{ matrix.taskdir }}" --run
python -m "superdsm" examples --task-dir "examples/${{ matrix.taskdir }}"
python -m "superdsm" examples --task-dir "examples/${{ matrix.taskdir }}" --run
env:
SUPERDSM_INTERMEDIATE_OUTPUT: false
REPYPE_CLI_INTERMEDIATE: false
SUPERDSM_NUM_CPUS: 20

- name: Validate results
3 changes: 3 additions & 0 deletions .github/workflows/testsuite.yml
@@ -31,6 +31,9 @@ jobs:
- name: Clear cache # otherwise corrupted packages can be reported sometimes
run: rm -rf /github/home/conda_pkgs_dir

- name: Install dependencies
run: apt-get update && apt-get install -y git

- name: Setup Miniconda
uses: conda-incubator/setup-miniconda@v2
with:
5 changes: 5 additions & 0 deletions .gitignore
@@ -2,6 +2,9 @@
/build
*.swp
.DS_Store
._.DS_Store
/actual_csv
/examples.regression
/examples/**/*.csv
/examples/**/*.gz
/examples/**/.*.json
@@ -15,3 +18,5 @@
/tests/data
/tests/actual
/tests/logs
/.venv
/docs/build
13 changes: 13 additions & 0 deletions .isort.cfg
@@ -0,0 +1,13 @@
[settings]

multi_line_output = 3

force_grid_wrap = 2

include_trailing_comma = true

skip_glob =
django/csgo_app/*.py
django/*/migrations/*.py
django/*/apps.py
django/awpy_fork/**/*.py
9 changes: 9 additions & 0 deletions .vscode/settings.json
@@ -0,0 +1,9 @@
{
"[git-commit]": {
"editor.rulers": [50]
},

"[python]": {
"editor.rulers": [119]
}
}
7 changes: 7 additions & 0 deletions docs/source/conf.py
@@ -6,6 +6,11 @@
copyright = '2017-2024 Leonid Kostrykin, Biomedical Computer Vision Group, Heidelberg University'
author = 'Leonid Kostrykin'

# -- Add directory which contains the project to sys.path
import os, sys
sys.path.insert(0, os.path.abspath('../..'))
os.environ['PYTHONPATH'] = os.path.abspath('../..') + ':' + os.environ.get('PYTHONPATH', '')

# -- General configuration

extensions = [
@@ -14,12 +19,14 @@
'sphinx.ext.autodoc',
'sphinx.ext.autosummary',
'sphinx.ext.intersphinx',
'sphinx.ext.napoleon',
'sphinx_autorun',
]

intersphinx_mapping = {
'python': ('https://docs.python.org/3/', None),
'sphinx': ('https://www.sphinx-doc.org/en/master/', None),
'repype': ('https://repype.readthedocs.io/en/stable/', None),
}
intersphinx_disabled_domains = ['std']

87 changes: 35 additions & 52 deletions docs/source/pipeline.rst
@@ -1,9 +1,9 @@
.. _pipeline:

Default pipeline
================
The SuperDSM pipeline
=====================

Refer to the :py:mod:`.pipeline` module for a general overview of the pipeline concept (involving different stages, inputs, and outputs).
Refer to the :py:mod:`repype.pipeline` module for a general overview of the pipeline concept (involving different stages, inputs, and outputs).

.. _pipeline_theory:

@@ -98,8 +98,9 @@ holds and the sequential computation is not required. Regions of possibly cluste
Pipeline stages
---------------

The function :py:meth:`pipeline.create_default_pipeline() <superdsm.pipeline.create_default_pipeline>` employs the following stages:
The SuperDSM :py:class:`~superdsm.pipeline.Pipeline` generally employs the following stages:

#. :py:class:`~.pipeline.LoadInput` – Loads the input image into the pipeline.
#. :py:class:`~.preprocess.Preprocessing` — Implements the computation of the intensity offsets.
#. :py:class:`~.dsmcfg.DSM_Config` — Provides the hyperparameters from the ``dsm`` namespace as an output.
#. :py:class:`~.c2freganal.C2F_RegionAnalysis` — Implements the coarse-to-fine region analysis scheme.
@@ -111,13 +112,13 @@ The function :py:meth:`pipeline.create_default_pipeline() <superdsm.pipeline.cre
Inputs and outputs
------------------

Pipeline stages require different inputs and produce different outputs. These are like intermediate results, which are shared or passed between the stages. The pipeline maintains their state, which is kept inside the *pipeline data object*. Below is an overview over all inputs and outputs available within the default pipeline:
Pipeline stages require different inputs and produce different outputs. These are intermediate results which are shared or passed between the stages. The pipeline maintains their state, which is kept inside the *pipeline data object*. Below is an overview of all inputs and outputs available within the SuperDSM pipeline:

``g_raw``
The raw image intensities :math:`g_{x^{1}}, \dots, g_{x^{\#\Omega}}`, normalized so that the intensities range from 0 to 1. Up to the normalization, this corresponds to the original input image, unless histological image data is being processed (i.e. the hyperparameter ``histological`` is set to ``True``). Provided by the pipeline via the :py:meth:`~.pipeline.Pipeline.init` method, refer to its documentation for details.
The raw image intensities :math:`g_{x^{1}}, \dots, g_{x^{\#\Omega}}`, normalized so that the intensities range from 0 to 1. Up to the normalization, this corresponds to the original input image, unless histological image data is being processed (i.e. the hyperparameter ``histological`` is set to ``True``). Provided by the :py:class:`~.pipeline.LoadInput` stage.

``g_rgb``
This is the original image, if histological image data is being processed (i.e. the hyperparameter ``histological`` is set to ``True``). Otherwise, ``g_rgb`` is not available as an input. Provided by the pipeline via the :py:meth:`~.pipeline.Pipeline.init` method, refer to its documentation for details.
This is the original image, if histological image data is being processed (i.e. the hyperparameter ``histological`` is set to ``True``). Otherwise, ``g_rgb`` is not available as an input. Provided by the :py:class:`~.pipeline.LoadInput` stage.

``y``
The offset image intensities :math:`Y_\omega|_{\omega = \Omega}`, represented as an object of type ``numpy.ndarray`` of the same shape as the ``g_raw`` image. Provided by the :py:class:`~.preprocess.Preprocessing` stage.
@@ -165,72 +166,54 @@ Batch system
Task specification
^^^^^^^^^^^^^^^^^^

To perform batch processing of a dataset, you first need to create a *task*. To do that, create an empty directory, and put a ``task.json`` file in it. This file will contain the specification of the segmentation task. Below is an example specification:
To perform batch processing of a dataset, you first need to create a *repype task* (`repype documentation <https://repype.readthedocs.io/en/latest/examples/segmentation.html#Task-specifications>`_). To do that, create an empty directory, and put a ``task.yml`` file in it. This file will contain the specification of the segmentation task. Below is an example specification:

.. code-block:: json
.. code-block:: yaml

{
"runnable": true,
"num_cpus": 16,
"environ": {
"MKL_NUM_THREADS": 2,
"OPENBLAS_NUM_THREADS": 2
},
runnable: true
environ:
MKL_NUM_THREADS: 2
OPENBLAS_NUM_THREADS: 2

scopes:
inputs: "/data/dataset/img-%d.tiff"
masks: "seg/img-%d.png"
adjacencies: "adj/img-%d.png"
config: "cfg/img-%d.yml"
overlays: "overlays/img-%d.png"

"img_pathpattern": "/data/dataset/img-%d.tiff",
"seg_pathpattern": "seg/dna-%d.png",
"adj_pathpattern": "adj/dna-%d.png",
"log_pathpattern": "log/dna-%d",
"cfg_pathpattern": "cfg/dna-%d.json",
"overlay_pathpattern": "overlays/dna-%d.png",
"file_ids": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],

"config": {
}
}
input_ids: 1-10

The meaning of the different fields is as follows:

``runnable``
Marks this task as runnable (or not runnable). If set to ``false``, the specification will be treated as a template for derived tasks. Derived tasks are placed in sub-folders and inherit the specification of the parent task. This is useful, for example, if you want to try out different hyperparameters. The batch system automatically picks up intermediate results of parent tasks to speed up the completion of derived tasks.

``num_cpus``
The number of processes which is to be used simultaneously (in parallel).

``environ``
Defines environment variables which are to be set. In the example above, MKL and OpenBLAS numpy backends are both instructed to use two threads for parallel computations.

``img_pathpattern``
``inputs``
Defines the path to the input images of the dataset, using placeholders like ``%d`` for decimals and ``%s`` for strings (decimals can also be padded with zeros to a fixed length, e.g., using ``%02d`` for a length of 2).

``seg_pathpattern``
``masks``
Relative path of the files where the segmentation masks are to be written to, using placeholders as described above.

``adj_pathpattern``
``adjacencies``
Relative path of the files where the images of the atomic image regions and adjacency graphs are to be written to, using placeholders as described above (see :ref:`pipeline_theory_c2freganal`).

``log_pathpattern``
Relative path of files, where the logs are to be written to, using placeholders as described above (mainly for debugging purposes).

``cfg_pathpattern``
``config``
Relative path of the files where the hyperparameters are to be written to, using placeholders as described above (mainly for reviewing the automatically generated hyperparameters).

``file_ids``
List of file IDs, which are used to resolve the pattern-based fields described above. In the considered example, the list of input images will resolve to ``/data/dataset/img-1.tiff``, …, ``/data/dataset/img-10.tiff``. File IDs are allowed to be strings, and they are also allowed to contain ``/`` to encode paths which involve sub-directories.

``last_stage``
If specified, then the pipeline processing will end at the specified stage.

``dilate``
Performs morphological dilation for all final segmentation masks, using the given amount of pixels. For negative values, morphological erosion is performed.
``overlays``
Relative path of the files where the segmentation overlays are to be written to, using placeholders as described above.

``merge_overlap_threshold``
If specified, then any pair of two objects (final segmentation masks) with an overlap larger than this threshold will be merged into a single object.
``input_ids``
List of tokens used to resolve the pattern-based fields described above. In the example above, the list of input images resolves to ``/data/dataset/img-1.tiff``, …, ``/data/dataset/img-10.tiff``. Input IDs are allowed to be integers or strings, and they are also allowed to contain ``/`` to encode paths which involve sub-directories.

``config``
Defines the hyperparameters to be used. The available hyperparameters are described in the documentation of the respective stages of the default pipeline (see :ref:`pipeline_stages`). Note that namespaces must be specified as nested JSON objects.
Defines the hyperparameters to be used. The available hyperparameters are described in the documentation of the respective stages of the SuperDSM pipeline (see :ref:`pipeline_stages`). Many examples are available in the ``examples`` directory.

Instead of specifying the hyperparameters in the task specification directly, it is also possible to include them from a separate JSON file using the ``base_config_path`` field. The path must be either absolute or relative to the ``task.json`` file. It is also possible to use ``{DIRNAME}`` as a substitute for the name of the directory, which the ``task.json`` file resides in. The placeholder ``{ROOTDIR}`` in the path specification resolves to the *root directory* passed to the batch system (see below).
Instead of specifying the hyperparameters in the task specification directly, it is also possible to include them from a separate YAML file using the ``base_config_path`` field. The path must be either absolute or relative to the ``task.yml`` file. It is also possible to use ``{DIRNAME}`` as a substitute for the name of the directory which the ``task.yml`` file resides in. The placeholder ``{ROOTDIR}`` in the path specification resolves to the *root directory* passed to the batch system (see below).
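For illustration, the resolution of the pattern-based fields against the input IDs can be sketched in plain Python (a hypothetical sketch using standard ``%``-style formatting; repype's actual resolution logic may differ):

```python
# Hypothetical sketch: resolve a pattern-based field such as ``inputs``
# against the input IDs (corresponds to ``input_ids: 1-10``).
pattern = "/data/dataset/img-%d.tiff"
input_ids = range(1, 11)

# Each input ID is substituted into the ``%d`` placeholder:
paths = [pattern % input_id for input_id in input_ids]
print(paths[0])   # /data/dataset/img-1.tiff
print(paths[-1])  # /data/dataset/img-10.tiff

# Zero-padded variant, as described for ``%02d``:
print("/data/dataset/img-%02d.tiff" % 3)  # /data/dataset/img-03.tiff
```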

Examples can be found in the ``examples`` sub-directory of the `SuperDSM repository <https://github.com/BMCV/SuperDSM>`_.

@@ -243,12 +226,12 @@ To perform batch processing of all tasks specified in the current working direct

.. code-block:: console

python -m 'superdsm.batch' .
python -m superdsm .

This will run the batch system in *dry mode*, so nothing will actually be processed. Instead, each task which is going to be processed will be printed, along with some additional information. To actually start the processing, re-run the command and include the ``--run`` argument.

In this example, the current working directory will correspond to the *root directory* when it comes to resolving the ``{ROOTDIR}`` placeholder in the path specification.

Note that the batch system will automatically skip tasks which already have been completed in a previous run, unless the ``--force`` argument is used. On the other hand, tasks will not be marked as completed if the ``--oneshot`` argument is used. To run only a single task from the root directory, use the ``--task`` argument, or ``--task-dir`` if you want to automatically include the dervied tasks. Note that, in both cases, the tasks must be specified relatively to the root directory.
Note that the batch system will automatically skip tasks which already have been completed in a previous run. To run only a single task from the root directory, use the ``--task`` argument, or ``--task-dir`` if you want to automatically include the derived tasks. Note that, in both cases, the tasks must be specified relative to the root directory.
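For example, a template task with a derived task in a sub-folder might be laid out as follows (a hypothetical sketch; the namespace and hyperparameter names are placeholders, not actual SuperDSM hyperparameters):

```yaml
# my-task/task.yml — parent task, serves only as a template:
runnable: false
config:
  some-namespace:
    some-hyperparameter: 1.0

# my-task/variant1/task.yml — derived task, inherits the parent
# specification and overrides individual fields:
runnable: true
config:
  some-namespace:
    some-hyperparameter: 0.5
```

Running with ``--task-dir my-task`` would then include the derived task in ``my-task/variant1`` as well.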

Refer to ``python -m 'superdsm.batch' --help`` for further information.
Refer to ``python -m superdsm --help`` for further information.
6 changes: 3 additions & 3 deletions docs/source/superdsm.atoms.rst
@@ -2,6 +2,6 @@ superdsm.atoms
==============

.. automodule:: superdsm.atoms
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
7 changes: 0 additions & 7 deletions docs/source/superdsm.automation.rst

This file was deleted.

7 changes: 0 additions & 7 deletions docs/source/superdsm.batch.rst

This file was deleted.

6 changes: 3 additions & 3 deletions docs/source/superdsm.c2freganal.rst
@@ -2,6 +2,6 @@ superdsm.c2freganal
===================

.. automodule:: superdsm.c2freganal
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
7 changes: 0 additions & 7 deletions docs/source/superdsm.config.rst

This file was deleted.

6 changes: 3 additions & 3 deletions docs/source/superdsm.dsm.rst
@@ -2,6 +2,6 @@ superdsm.dsm
============

.. automodule:: superdsm.dsm
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/superdsm.dsmcfg.rst
@@ -2,6 +2,6 @@ superdsm.dsmcfg
===============

.. automodule:: superdsm.dsmcfg
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
7 changes: 0 additions & 7 deletions docs/source/superdsm.export.rst

This file was deleted.

6 changes: 3 additions & 3 deletions docs/source/superdsm.globalenergymin.rst
@@ -2,6 +2,6 @@ superdsm.globalenergymin
========================

.. automodule:: superdsm.globalenergymin
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/superdsm.image.rst
@@ -2,6 +2,6 @@ superdsm.image
==============

.. automodule:: superdsm.image
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/superdsm.io.rst
@@ -2,6 +2,6 @@ superdsm.io
===========

.. automodule:: superdsm.io
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance:
6 changes: 3 additions & 3 deletions docs/source/superdsm.maxsetpack.rst
@@ -2,6 +2,6 @@ superdsm.maxsetpack
===================

.. automodule:: superdsm.maxsetpack
:members:
:undoc-members:
:show-inheritance:
:members:
:undoc-members:
:show-inheritance: