Add herschel_get_spec functionality to spectroscopy notebook #284
This new module downloads and extracts "reduced" Herschel spectroscopy for a set of sources. It is completely functional, but not perfect. Description of imperfections (with short rants included) below:
Some decisions had to be made, as this is a complicated dataset with complicated rules about when a final spectrum is generated and which of the many reported fluxes is best to use for any given target.
Only observations that produce an HPSSPEC directory are considered. There seems to be no way to determine this from the archive search, so the code downloads the tar files for all observations and then checks whether this directory exists.
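For reviewers, a minimal sketch of that kind of check, not the code in this PR. The function name `tar_has_hpsspec` is hypothetical, and I am assuming that "HPSSPEC" appears somewhere in the member paths of the tar file, so a plain substring match is enough. It reads the tar index without extracting anything.

```python
import tarfile


def tar_has_hpsspec(tar_path, marker="HPSSPEC"):
    """Return True if any member path in the archive contains `marker`.

    Assumption: the directory of interest shows up literally as "HPSSPEC"
    in the member names; a substring match would also catch a file whose
    name contains the marker.
    """
    with tarfile.open(tar_path) as tar:
        # getnames() lists member paths without extracting the archive.
        return any(marker in name for name in tar.getnames())
```

Observations whose tar file fails this check can be skipped before any extraction work is done.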
Source resolution matters, i.e., recommendations for which reduced spectrum to use depend on whether the source is a point source or how extended it is. I have followed the advice in https://www.cosmos.esa.int/documents/12133/996891/Product+decision+trees to use the flux column which has the largest sum. It is beyond the scope of this notebook to calculate whether a source is extended in the Herschel band or not.
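The "largest sum" rule can be sketched as below. This is an illustration only: the function name and the idea of passing the candidate columns as a plain dict are my assumptions here, and skipping non-finite values (NaN, inf) is a choice I made for the sketch rather than something the decision-tree document specifies.

```python
import math


def pick_brightest_flux_column(columns):
    """Return the name of the flux column whose finite values sum highest.

    `columns` maps a (hypothetical) column name to a sequence of fluxes.
    Non-finite entries are skipped so one NaN does not poison a column.
    """
    def finite_sum(values):
        return sum(v for v in values if math.isfinite(v))

    return max(columns, key=lambda name: finite_sum(columns[name]))
```

For example, `pick_brightest_flux_column({"flux_a": [1.0, 2.0], "flux_b": [5.0, 1.0]})` selects `"flux_b"`.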
The Herschel directory tree in the downloaded tar files is extensive, and there is no way to customize the download to fetch only the one file that I need, so there is a lot of code for tracking filenames and paths and for searching paths for filenames.
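The path searching boils down to a recursive glob over the extracted tree. A minimal sketch, assuming the files of interest can be matched by a filename pattern; the helper name `find_files` and the default pattern `*.fits*` are illustrative assumptions, not the names used in the module.

```python
from pathlib import Path


def find_files(root, pattern="*.fits*"):
    """Recursively list files under `root` whose names match `pattern`.

    The result is sorted so repeated runs walk the files in the same order.
    """
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())
```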
The Herschel archive breaks the connection at some point (testing on Arp220, which has many observations (> 50), will hit this error). I cannot for the life of me figure out how to catch the specific exceptions, and have spent too long trying to do this. I know that blanket exceptions are not good practice, but they are the only way I know to get this to work. I have opened an issue on this both here (fix open 'try except' call in herschel_get_spec function #282) and at astroquery (to try to fix this at the source).
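One possible way to keep the blanket `except` contained while #282 is open is a small retry wrapper that prints the exception's module and class name, which may help identify what to catch specifically later. This is a sketch under assumptions: `call_with_retries` is a hypothetical helper, the download call is passed in as `func`, and nothing here reflects how astroquery itself behaves.

```python
import time


def call_with_retries(func, *args, retries=3, wait=0.0, **kwargs):
    """Call func(*args, **kwargs), retrying on any Exception.

    The broad except is deliberate: the specific exception raised when the
    archive drops the connection is not known yet, so its type is printed
    to help narrow the handler in the future.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception as err:  # deliberately broad, see docstring
            last_error = err
            kind = f"{type(err).__module__}.{type(err).__name__}"
            print(f"attempt {attempt}/{retries} failed with {kind}: {err}")
            if attempt < retries:
                time.sleep(wait)  # pause before the next attempt
    raise last_error
```

After the final failed attempt the last exception is re-raised, so a persistent failure is still visible to the caller instead of being silently swallowed.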
I have written into the function an option to delete tar files after the spectrum is extracted.
I renumbered the sections to account for the fact that Herschel data is accessed through the ESA archive and not IRSA, as was originally suggested in the outline. astroquery's ESA module is what made it possible to automate this module from Python, so it is not possible to switch to IRSA. IRSA only houses an API (which may even just access the ESA API???).
It works with the plotting function as it was before Andreas's newest PR (Spectroscopy Notebook Updates #281), but it probably needs to be checked for compatibility with the new function. I have opened an issue for this (Test plotting of Herschel data after PR #281 closes #283).