Loading HDF5 files in ctapipe container #20

Closed
vuillaut opened this issue Jul 5, 2018 · 3 comments

Comments

@vuillaut
Member

vuillaut commented Jul 5, 2018

Hi,
Is there a method to load HDF5 files in ctapipe containers?
(same way we open simtel files)
So that we can start processing in ctapipe from calibrated images if desired.

@bryankim96
Member

Sorry for the very late reply (work/maintenance on this repo was temporarily put on hold due to a lack of time/manpower) and thanks for the interest in this code!

Could you provide more detail about the functionality you'd like to have? As you probably know, our initial motivation for writing this tool was to dump calibrated ctapipe DL1 data into a standardized PyTables HDF5 format to be used for testing image analysis ML techniques on calibrated images. Basically, we wanted to start using TF and other Python packages for conv-net-based image analysis and needed a convenient way to store/load our data.

This idea was originally developed before ctapipe had any internal HDF5/PyTables dumper implementation. However, now with the arrival of HDF5TableWriter and support for PyTables directly in ctapipe, it seems that it might be time to move on, integrate with ctapipe directly, or re-base this project on the new HDF5TableWriter functionality. On the other hand, if there is functionality that ctapipe cannot or will not support that would be useful for people who are working on image analysis, maybe it would be worth it to keep a separate tool available. Additionally, some other noteworthy advantages of having a separate tool might include:

  • Freedom to design the data format in a way that is most convenient for the work being done by members of the ML group (without having to compromise/modify the general ctapipe container classes).
  • The possibility to generalize this code to also dump other non-ctapipe-supported data formats (VERITAS, HESS, etc.) in a standardized way. This would align well with the concept of https://github.com/ctlearn-project/ctlearn, which is to create a general toolkit for IACT image analysis, independent of data source.

Do you have any general thoughts on what functionality would be helpful for image analysis and how we should go about organizing it? I will tag @nietootein and @aribrill so they can join in the discussion. I also think we should discuss this in detail with @kosack at some point to see how we can best cooperate with the ctapipe team.

@kosack

kosack commented Sep 27, 2018

> Is there a method to load HDF5 files in ctapipe containers?
> (same way we open simtel files)
> So that we can start processing in ctapipe from calibrated images if desired.

Yes, just like there is a ctapipe.io.HDF5TableWriter, there is a corresponding ctapipe.io.HDF5TableReader class that reads from an HDF5 table and fills a Container. It exists but hasn't been used much so far, since in most cases we've been using HDF5 for high-level data (DL1/DL2/DL3), where the analysis is often better done with pandas or PyTables (to process all events at once, rather than looping).

That should be able to read HDF5 outputs from the image_extractor in principle as well, though I haven't tried it (perhaps some modification will be needed).
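
For illustration, the difference between looping over rows and processing all events at once can be sketched with plain PyTables. This is only a minimal sketch under stated assumptions, not the image_extractor or ctapipe file layout: the file name, the `events` table, and its `event_id`/`energy` columns are made up for the example, and ctapipe's own reader and writer are deliberately not used here.

```python
import os
import tempfile

import tables  # PyTables


# Hypothetical row layout, invented for this example only.
class EventRow(tables.IsDescription):
    event_id = tables.Int64Col()
    energy = tables.Float64Col()


path = os.path.join(tempfile.mkdtemp(), "events.h5")

# Write a small table with five rows.
with tables.open_file(path, mode="w") as h5:
    table = h5.create_table("/", "events", EventRow)
    row = table.row
    for i in range(5):
        row["event_id"] = i
        row["energy"] = 0.5 * (i + 1)
        row.append()
    table.flush()

with tables.open_file(path, mode="r") as h5:
    table = h5.root.events

    # Option 1: loop over the rows one event at a time.
    looped_sum = sum(r["energy"] for r in table.iterrows())

    # Option 2: read the whole table at once into a NumPy
    # structured array and operate on full columns.
    data = table.read()
    vector_sum = float(data["energy"].sum())
```

Both options give the same total; the second avoids a Python-level loop, which is the usual reason to prefer whole-table access for high-level data.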

@bryankim96
Member

I think that for the most part this discussion can be continued in #31 if necessary.
