python streaming pipeline #5136

Closed
ArthurVincentCS opened this issue Jan 17, 2025 · 8 comments
Labels
type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances

Comments

@ArthurVincentCS

Description

I'm trying to use the ITK API for Python to set up a data pipeline with chunk-wise reading, inspired by examples I found on the subject, specifically:

Steps to Reproduce

  1. Install the latest itk conda package from conda-forge.
  2. Run the following code snippet, where the input image is a 3D (512 x 512 x 1 band) TIFF file.
import itk


def myFunc(py_image_filter):
    input_image = py_image_filter.GetInput()
    print("input requested region: ",input_image.GetRequestedRegion())
    input_array = itk.GetArrayViewFromImage(input_image)
    output_image = py_image_filter.GetOutput()
    print("output requested region: ",output_image.GetRequestedRegion())
    output_image.Allocate()
    output_array = itk.GetArrayViewFromImage(output_image)
    output_array[:] = input_array

    # inplace operation
    output_array += 1

file_path = "constant.tif"

xDiv = 6
yDiv = 4
zDiv = 1

PixelType = itk.F
Dimension = 3
ImageType = itk.Image[PixelType, Dimension]

ReaderType = itk.ImageFileReader[ImageType]
reader = ReaderType.New()
reader.SetFileName(file_path)
reader.UpdateOutputInformation()

fullRegion = reader.GetOutput().GetLargestPossibleRegion()
fullSize = fullRegion.GetSize()
start = itk.Index[Dimension]()
end = itk.Index[Dimension]()
size = itk.Size[Dimension]()

filter = itk.PyImageFilter[ImageType, ImageType].New()
filter.SetPyGenerateData(myFunc)
filter.SetInput(reader.GetOutput())
for z in range(zDiv):
    start[2] = int(fullSize[2] * float(z) / zDiv)
    end[2] = int(fullSize[2] * (z + 1.0) / zDiv)
    size[2] = end[2] - start[2]

    for y in range(yDiv):
        start[1] = int(fullSize[1] * float(y) / yDiv)
        end[1] = int(fullSize[1] * (y + 1.0) / yDiv)
        size[1] = end[1] - start[1]

        for x in range(xDiv):
            start[0] = int(fullSize[0] * float(x) / xDiv)
            end[0] = int(fullSize[0] * (x + 1.0) / xDiv)
            size[0] = end[0] - start[0]

            region = itk.ImageRegion[Dimension]()
            region.SetIndex(start)
            region.SetSize(size)

            def generate_input_requested_region(filter_instance):
                filter_instance.GetInput().SetRequestedRegion(filter_instance.GetOutput().GetRequestedRegion())

            # Same behavior with the next line uncommented
            # filter.SetPyGenerateInputRequestedRegion(generate_input_requested_region)

            filter.GetOutput().SetRequestedRegion(region)
            filter.Update()
            results = filter.GetOutput()

Actual behavior

Currently, it seems that the region is not being propagated to the filter preceding the one created with PyImageFilter.
The execution trace is as follows:

TIFFReadDirectory: Warning, Unknown field with tag 33550 (0x830e) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 33922 (0x8482) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34735 (0x87af) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34736 (0x87b0) encountered.
TIFFReadDirectory: Warning, Unknown field with tag 34737 (0x87b1) encountered.
input requested region:  itkImageRegion3([0, 0, 0], [512, 512, 1])
output requested region:  itkImageRegion3([0, 0, 0], [85, 128, 1])
Traceback (most recent call last):
  File "itk_test.py", line 12, in myFunc
    output_array = itk.GetArrayViewFromImage(output_image)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/itk/support/extras.py", line 356, in GetArrayViewFromImage
    return _GetArrayFromImage(
           ^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/itk/support/extras.py", line 317, in _GetArrayFromImage
    return templatedFunction(img, keep_axes, update)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "lib/python3.11/site-packages/itk/itkPyBufferPython.py", line 3821, in GetArrayViewFromImage
    ndarr_view  = np.asarray(memview).view(dtype = numpy_dtype).reshape(shape).view(np.ndarray)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: cannot reshape array of size 0 into shape (1,512,512)
Traceback (most recent call last):
  File "itk_test.py", line 71, in <module>
    filter.Update()
  File "lib/python3.11/site-packages/itk/itkPyImageFilterPython.py", line 409, in Update
    super().Update()
RuntimeError: /work/ITK-source/ITK/Wrapping/Generators/Python/PyUtils/itkPyImageFilter.hxx:276:
ITK ERROR: PyImageFilter(0x5628561b6560): There was an error executing the CommandCallable.
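
The tiling arithmetic in the script can be checked independently of ITK. The sketch below reimplements only the index and size computation in plain Python; the helper name `tile_regions` is hypothetical and not part of ITK. It shows that the first requested tile starts at [0, 0, 0] with size [85, 128, 1], which matches the output requested region in the trace, while the input requested region printed above is still the full 512 x 512 x 1 image.

```python
def tile_regions(full_size, divisions):
    """Yield (start, size) tuples that tile an image, x varying fastest.

    Mirrors the loops in the reproduction script: each axis is cut at
    int(length * i / parts). `tile_regions` is a hypothetical helper used
    for illustration only; it is not an ITK function.
    """
    def cuts(length, parts):
        return [int(length * i / parts) for i in range(parts + 1)]

    x_cuts = cuts(full_size[0], divisions[0])
    y_cuts = cuts(full_size[1], divisions[1])
    z_cuts = cuts(full_size[2], divisions[2])

    for z in range(divisions[2]):
        for y in range(divisions[1]):
            for x in range(divisions[0]):
                start = (x_cuts[x], y_cuts[y], z_cuts[z])
                size = (
                    x_cuts[x + 1] - x_cuts[x],
                    y_cuts[y + 1] - y_cuts[y],
                    z_cuts[z + 1] - z_cuts[z],
                )
                yield start, size


# Same image size and divisions as in the script (xDiv=6, yDiv=4, zDiv=1).
regions = list(tile_regions((512, 512, 1), (6, 4, 1)))
print(regions[0])    # ((0, 0, 0), (85, 128, 1))
print(len(regions))  # 24
```

The tiles cover the image exactly once, so the region arithmetic itself is not the source of the error.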

Reproducibility

It happens every time.

Version

ITK 5.3.0 (itk-5.3.0-py311h781c19f_0.conda)

If I am doing something wrong or if there is more documentation about the Python API and what needs to be done to propagate regions, please feel free to let me know.

Thank you for any assistance you might provide.

@ArthurVincentCS ArthurVincentCS added the type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances label Jan 17, 2025

Thank you for contributing an issue! 🙏

Welcome to the ITK community! 🤗👋☀️

We are glad you are here and appreciate your contribution. Please keep in mind our community participation guidelines. 📜
Also, please check existing open issues and consider discussion on the ITK Discourse. 📖

This is an automatic message. Allow for time for the ITK community to be able to read the issue and comment on it.

@dzenanz
Member

dzenanz commented Jan 17, 2025

Have you seen this example, and does it help you?

@ArthurVincentCS
Author

ArthurVincentCS commented Jan 17, 2025

Hi,

Thank you for your quick reply! Actually, this example seems quite similar to the first one I mentioned (https://examples.itk.org/src/io/imagebase/processimagechunks/documentation).

My goal is to read images chunk by chunk using an itk.PyImageFilter connected to an itk.ImageFileReader. I don't see any PyImageFilter in this example.

@thewtex
Member

thewtex commented Jan 17, 2025

Hi @ArthurVincentCS ,

Converting the image to a NumPy array forces the entire image to be loaded into memory.

One approach you may find helpful is to first convert the image to an OME-Zarr with ngff-zarr. Then process it chunk by chunk as a Dask Array with Dask Array map_blocks or map_overlap. Internally, you can use PyImageFilter, etc. See:

https://blog.dask.org/2019/08/09/image-itk
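
A minimal sketch of the block-wise part of this suggestion, under stated assumptions: the OME-Zarr conversion step is omitted and an in-memory NumPy array stands in for the stored image, and the helper names `chunk_shape`, `add_one`, and `process_in_chunks` are hypothetical. Only `dask.array.from_array`, `map_blocks`, and `compute` are actual Dask calls. An ITK filter could be invoked inside the per-block function in place of the simple addition.

```python
import math


def chunk_shape(shape, divisions):
    """Chunk shape that splits each axis into at most divisions[i] pieces."""
    return tuple(math.ceil(n / d) for n, d in zip(shape, divisions))


def add_one(block):
    """Per-block operation; Dask passes one chunk at a time as a NumPy array."""
    return block + 1


def process_in_chunks(array, divisions):
    """Apply add_one block by block with Dask and return the full result."""
    import dask.array as da  # imported lazily; requires Dask to be installed

    lazy = da.from_array(array, chunks=chunk_shape(array.shape, divisions))
    return lazy.map_blocks(add_one, dtype=array.dtype).compute()


# NumPy axis order is (z, y, x), so a 512 x 512 x 1 image has shape (1, 512, 512).
print(chunk_shape((1, 512, 512), (1, 4, 6)))  # (1, 128, 86)
```

Calling `process_in_chunks(np.zeros((1, 512, 512), dtype=np.float32), (1, 4, 6))` would run `add_one` once per chunk. Memory use is only bounded when the Dask Array is backed by a chunked store rather than by an array that is already fully loaded.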

@blowekamp
Member

The itkTIFFImageIO class does not support streaming, so the reader will always produce the full image in the file. Image file formats such as NRRD and MetaIO support streaming.
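
One way to act on this, sketched under assumptions: rewrite the TIFF once in a streamable format and point the reader at the new file. The helper names `streamable_path` and `convert_to_streamable` are hypothetical; `itk.imread` and `itk.imwrite` are the standard ITK Python I/O functions. The one-off conversion still loads the whole TIFF into memory.

```python
from pathlib import Path


def streamable_path(src, suffix=".mha"):
    """Return the path of a MetaImage (.mha) or NRRD (.nrrd) copy next to src."""
    return Path(src).with_suffix(suffix)


def convert_to_streamable(src, suffix=".mha"):
    """Rewrite src in a streamable format and return the new path.

    This conversion reads the entire TIFF into memory once; only later
    reads of the converted file can be streamed.
    """
    import itk  # imported lazily so the path helper works without ITK

    dst = streamable_path(src, suffix)
    image = itk.imread(str(src))
    itk.imwrite(image, str(dst))
    return dst


print(streamable_path("constant.tif"))  # constant.mha
```

After conversion, the reader in the original script would be given the converted file name instead of "constant.tif".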

@ArthurVincentCS
Author

> Hi @ArthurVincentCS ,
>
> Converting the image to a NumPy array forces the entire image to be loaded into memory.
>
> One approach you may find helpful is to first convert the image to an OME-Zarr with ngff-zarr. Then process it chunk by chunk as a Dask Array with Dask Array map_blocks or map_overlap. Internally, you can use PyImageFilter, etc. See:
>
> https://blog.dask.org/2019/08/09/image-itk

Thank you for the links to the Dask documentation. Is there a plan to enable Python streaming in future versions of ITK, or do you consider that users should turn to Dask or other solutions in such cases?

> The itkTIFFImageIO class does not support streaming, so the reader will always produce the full image in the file. Image file formats such as NRRD and MetaIO support streaming.

OK, thank you.

@thewtex
Member

thewtex commented Jan 20, 2025

> Is there a plan to enable Python streaming in future versions of ITK, or do you consider that users should turn to Dask or other solutions in such cases?

Streaming can be performed via ITK's Python API as described in this notebook.

@hjmjohnson
Member

This appears to be an issue with understanding and finding the correct documentation. If there is a coding issue, please reopen.
