Merge branch 'release-0.12'
gpernot committed May 10, 2019
2 parents 2a7d473 + 1b74726 commit 0796fbd
Showing 25 changed files with 1,041 additions and 349 deletions.
127 changes: 127 additions & 0 deletions MANUAL.md
@@ -0,0 +1,127 @@
# DRP_1DPIPE - User's Manual

## Installation

See README.md

## Configuration

### Runtime environment

The pipeline is run with the `drp_1dpipe` command.

#### Synopsis
```
drp_1dpipe [-h] [--workdir WORKDIR] [--logdir LOGDIR] [--spectra-path DIR]
[--pre-commands COMMAND] [--loglevel LOGLEVEL]
[--scheduler SCHEDULER] [--bunch-size SIZE]
```
#### Optional arguments

Arguments can be given on the command line and in the configuration file `drp_1dpipe/io/conf/drp_1dpipe.conf`. Command line arguments take precedence.

##### `--workdir WORKDIR`

The root working directory where data is located.

##### `--logdir LOGDIR`

The logging directory.

##### `--loglevel LOGLEVEL`

The logging level. `CRITICAL`, `ERROR`, `WARNING`, `INFO` or `DEBUG`.

##### `--scheduler SCHEDULER`

The scheduler to use. Either `local` or `pbs`.

##### `--pre-commands COMMAND`

Commands to run before `process_spectra`. This gives each process the opportunity to initialize a virtualenv or mount a data directory, for example.
`COMMAND` is a list of commands given as a Python expression.
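
For illustration only, a hypothetical invocation passing two setup commands as a Python list expression (the list syntax is assumed from the description above; the Examples section below also shows a single command passed as a plain string):

```sh
# Hypothetical: two pre-commands passed as a Python list expression
drp_1dpipe --pre-commands "['source $HOME/venv/bin/activate', 'mount /data']"
```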

##### `--spectra-path DIR`

Base path where the spectra are found, relative to `workdir`.

##### `--bunch-size SIZE`

Maximum number of spectra per bunch.

##### `-h`, `--help`

Show the help message and exit.

#### Examples

##### Default run

Run using all defaults from the configuration file:

```sh
drp_1dpipe
```

##### PBS run

Run on a PBS queue, activating a virtualenv on each node before running:

```sh
drp_1dpipe --scheduler pbs \
--pre-commands "source $HOME/venv/bin/activate"
```
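
##### Custom run

A further illustrative invocation combining options documented above (paths and values here are placeholders):

```sh
drp_1dpipe --workdir /path/to/workdir --spectra-path spectra \
           --bunch-size 16 --loglevel DEBUG
```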

### Algorithmic parameters

Algorithms are described in the [DRP Algorithm documentation](https://sumire.pbworks.com/w/file/132378141/LAM-PFS-1D-DRP-Algo-Pipeline_v0.82.pdf "DRP-Algo").

Redshift determination algorithms can be tuned using an optional JSON file, which defaults to `drp_1dpipe/io/auxdir/parameters.json`. An example file can be found in `drp_1dpipe/io/auxdir/parameters.json.example`.


Parameters are as follows:

| **Parameter name** | **Type** | **Default** | **Description** |
| --- | --- | --- | --- |
| _**general**_ ||| _**parameters always applicable**_ |
| `lambdarange` | `["min", "max"]` | `["3000", "13000"]` | lambda range in Ångströms|
| `redshiftrange` | `["min", "max"]` | `[ "0.0", "6."]` | redshift range|
| `redshiftstep` | `float` | `0.0001` | redshift step for linear scale or lowest step for log scale|
| `redshiftsampling` |`log` / `lin` | `log` | linear or logarithmic scale|
| `calibrationDir` | | | |
|||||
| _**`continuumRemoval`**_ ||| _**Method parameters to remove continuum of data spectra**_ |
| ` .method` |&bull; `zero`<br>&bull; `IrregularSamplingMedian` | `IrregularSamplingMedian`| continuum estimation method. See also `linemodelsolve.linemodel.continuumcomponent` |
| ` .medianKernelWidth` |`float` |`400` | relevant only for the median method (in Ångströms)|
|||||
| _**`linemodelsolve.linemodel`**_ ||| _**parameters for linemodel**_ |
| ` .linetypefilter` |`no` / `E` / `A` |`no`| restrict the type of line to fit (`no`: fit all)|
| ` .lineforcefilter`|`no` / `W` / `S` |`no` | restrict the strength category of lines to fit (`no`: fit all)|
| ` .instrumentresolution` |`float` | `4300`| instrument resolution (R)|
| ` .velocityemission` |`float`| `200`| emission lines velocity (in $km \cdot s^{-1}$)|
| ` .velocityabsorption` |`float` |`300` | absorption lines velocity (in $km \cdot s^{-1}$)|
| ` .velocityfit` |`yes` / `no` |`yes` | decide whether the 2nd pass includes line width fitting|
| ` .emvelocityfitmin` |`float` |`10` | velocity grid for emission line width fitting: lower bound (in $km \cdot s^{-1}$)|
| ` .emvelocityfitmax` |`float` |`400`| velocity grid for emission line width fitting: upper bound (in $km \cdot s^{-1}$)|
| ` .emvelocityfitstep` |`float` |`20`| velocity grid for emission line width fitting: step (in $km \cdot s^{-1}$)|
| ` .absvelocityfitmin` |`float` |`150`| velocity grid for absorption line width fitting: lower bound (in $km \cdot s^{-1}$)|
| ` .absvelocityfitmax` |`float` |`500`| velocity grid for absorption line width fitting: upper bound (in $km \cdot s^{-1}$)|
| ` .absvelocityfitstep` |`float` |`50`| velocity grid for absorption line width fitting: step (in $km \cdot s^{-1}$)|
| ` .tplratio_ismfit` |`yes` / `no` |`no` | activate fit of ISM extinction _i.e._ Ebv parameter from Calzetti profiles. Parameter scan from 0 to 0.9, step = 0.1. (best value stored in FittedTplshapeIsmCoeff in `linemodelsolve.linemodel_extrema.csv`)|
| ` .continuumcomponent` |&bull; `fromspectrum`<br>&bull; `tplfit` | `tplfit` | select the method for processing the continuum:<br>&bull; `fromspectrum`: remove an estimated continuum (the continuum estimation is then tuned via the `continuumRemoval` parameters); the redshift is then estimated from the lines only<br>&bull; `tplfit`: fit a set of redshifted templates (aka `fullmodel`, _i.e._ continuum model + line model)|
| ` .continuumfit.ismfit` |`yes` / `no` |`yes` | activate fit of ISM extinction _i.e._ Ebv parameter from Calzetti profiles. Parameter scan from 0 to 0.9, step = 0.1. (best value stored in FittedTplDustCoeff in `linemodelsolve.linemodel_extrema.csv`) |
| ` .continuumfit.igmfit` |`yes` / `no` |`yes` | activate fit of IGM with Meiksin tables. Index scan from 0 to 0.9, step = 0.1 ??? (best profile index stored in FittedTplMeiksinIdx parameter in `linemodelsolve.linemodel_extrema.csv`)|
| ` .skipsecondpass` |`yes` / `no` |`no` | toggle whether the refined second pass around the candidates is skipped (`no` by default) |
| ` .extremacount` |`int` |`5` |Number of candidates to retain |
| ` .extremacutprobathreshold` |`float` |`30` |Select the number of candidates to refine at the 2nd pass.<br>&bull; `-1`: retain a fixed number (set from `extremacount` parameter)<br>&bull; any positive value: retain all candidates with $log(max(pdf))-log(pdf)$ values (not integrated) below this threshold|
| ` .firstpass.tplratio_ismfit` |`yes` / `no` | `no` | override the `tplratio_ismfit` parameter for the first pass|
|||||
| ` .stronglinesprior` |`float` |`-1` | strong-line prior<br>&bull; `-1`: no prior<br>&bull; otherwise, use this value (positive, below 1) as a low probability when no strong line is measured (measured amplitude > 0); the probability is set to 1 when a strong line is observed|
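
As an illustration, a minimal tuning file overriding a few of the entries above could look like the following sketch (values are the defaults from the table; the nested layout is assumed to follow `drp_1dpipe/io/auxdir/parameters.json.example`):

```json
{
  "lambdarange": ["3000", "13000"],
  "redshiftrange": ["0.0", "6."],
  "redshiftsampling": "log",
  "linemodelsolve": {
    "linemodel": {
      "continuumcomponent": "tplfit",
      "velocityemission": "200",
      "velocityabsorption": "300",
      "skipsecondpass": "no"
    }
  }
}
```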


## Outputs

### `pfsZCandidates`

See the [datamodel](https://github.com/Subaru-PFS/datamodel/blob/master/datamodel.txt "Subaru-PFS datamodel").
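
A quick, generic way to inspect such an output file (purely illustrative: the actual HDU layout is defined by the datamodel linked above, and the file name below is a placeholder):

```python
from astropy.io import fits

# Placeholder file name; real pfsZCandidates files follow the datamodel naming scheme
with fits.open("pfsZCandidates-example.fits") as hdul:
    hdul.info()                    # list the HDUs present in the file
    print(repr(hdul[0].header))    # show the primary header keywords
```
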
4 changes: 2 additions & 2 deletions README.md
@@ -16,9 +16,9 @@ Install required python packages

pip install -r pip-requirements.txt

### Using pip
### Install drp1d_pip

To install Subaru PFS 1D data reduction pipeline with `pip` simply run.
To install the Subaru PFS 1D data reduction pipeline, simply run:

pip install .

66 changes: 66 additions & 0 deletions drp_1dpipe/io/auxdir/linemeas-parameters.json
@@ -0,0 +1,66 @@
{
"lambdarange": [
"2000",
"20000"
],
"redshiftrange": [
"0.0",
"0.0"
],
"redshiftstep": "0.0001",
"redshiftsampling": "lin",
"smoothWidth": "0",
"method": "LineModel",
"templateCategoryList": [
"emission",
"galaxy",
"star",
"qso"
],
"templateCatalog": {
"continuumRemoval": {
"method": "zero",
"medianKernelWidth": "75",
"decompScales": "8",
"binPath": "absolute_path_to_df_binaries_here"
}
},
"continuumRemoval": {
"method": "IrregularSamplingMedian",
"medianKernelWidth": "50",
"binPath": "~\/amazed_cluster\/gitlab\/amazed\/extern\/df_centos\/",
"decompScales": "9"
},
"linemodelsolve": {
"linemodel": {
"linetypefilter": "no",
"lineforcefilter": "no",
"fittingmethod": "hybrid",
"firstpass": {
"largegridstep": "0.001",
"tplratio_ismfit": "no",
"multiplecontinuumfit_disable": "yes"
},
"continuumcomponent": "fromspectrum",
"rigidity": "rules",
"linewidthtype": "velocitydriven",
"instrumentresolution": "2350",
"velocityemission": "100",
"velocityabsorption": "300",
"velocityfit": "yes",
"continuumreestimation": "onlyextrema",
"rules": "no",
"extremacount": "1",
"stronglinesprior": "1e-1",
"pdfcombination": "marg",
"continuumismfit": "no",
"continuumigmfit": "no",
"emvelocityfitmin": "20",
"emvelocityfitmax": "2000",
"emvelocityfitstep": "1",
"absvelocityfitmin": "20",
"absvelocityfitmax": "2000",
"absvelocityfitstep": "5"
}
}
}
@@ -9,7 +9,6 @@
],
"redshiftstep": "0.0001",
"redshiftsampling": "log",
"smoothWidth": "0",
"method": "linemodel",
"templateCategoryList": [
"emission",
@@ -43,37 +42,66 @@
"velocityabsorption": "300",
"continuumreestimation": "no",
"rules": "all",
"extremacount": "5",
"extremacount": "10",
"extremacutprobathreshold": "30",
"velocityfit": "yes",
"rigidity": "tplshape",
"pdfcombination": "bestchi2",
"emvelocityfitmin": "20",
"emvelocityfitmax": "500",
"pdfcombination": "marg",
"emvelocityfitmin": "10",
"emvelocityfitmax": "400",
"absvelocityfitmin": "150",
"absvelocityfitmax": "500",
"saveintermediateresults": "no",
"tplratio_catalog": "linecatalogs_tplshapes\/linecatalogs_tplshape_ExtendedTemplatesJan2017v3_20170602_B14C_v3_emission",
"continuumismfit": "yes",
"continuumigmfit": "yes",
"secondpasslcfittingmethod": "no",
"tplratio_catalog": "linecatalogs_tplshapes\/linecatalogs_tplshape_ExtendedTemplatesJan2017v3_20170602_B14C_v5_emission",
"secondpasslcfittingmethod": "-1",
"offsets_catalog": "linecatalogs_offsets\/offsetsCatalogs_20170410_m150",
"emvelocityfitstep": "50",
"emvelocityfitstep": "20",
"absvelocityfitstep": "50",
"tplratio_ismfit": "no",
"firstpass": {
"largegridstep": "0.001",
"tplratio_ismfit": "no",
"tplratio_ismfit": "yes",
"multiplecontinuumfit_disable": "yes",
"fittingmethod": "individual"
},
"continuumfitcount": "1",
"continuumfitignorelinesupport": "no",
"secondpass": {
"continuumfit": "retryall"
},
"continuumfit": {
"ismfit": "no",
"igmfit": "no",
"count": "1",
"ignorelinesupport": "no",
"priors": {
"beta": "1",
"catalog_reldirpath": ""
}
},
"lyaforcedisablefit": "yes",
"secondpasslcfittingmethod": "-1",
"skipsecondpass": "no",
"stronglinesprior": "-1",
"euclidnhaemittersStrength": "-1",
"extremacutprobathreshold": "30"
"manvelocityfitdzmin": "-0.0005999999999999999",
"manvelocityfitdzmax": "0.0005999999999999999",
"manvelocityfitdzstep": "0.0001",
"lyaforcefit": "no",
"lyafit": {
"asymfitmin": "0",
"asymfitmax": "4",
"asymfitstep": "1",
"widthfitmin": "1",
"widthfitmax": "4",
"widthfitstep": "1",
"deltafitmin": "0",
"deltafitmax": "0",
"deltafitstep": "1"
},
"modelpriorzStrength": "-1"
}
},
"enablestellarsolve": "no",
"enableqsosolve": "no",
"calibrationDir": "\/home\/aschmitt\/amazed_cluster\/calibration\/",
"SaveIntermediateResults": "default",
"linemeascatalog": "",
10 changes: 10 additions & 0 deletions drp_1dpipe/io/conf/drp_1dpipe.conf
@@ -0,0 +1,10 @@
# Write your program options here. e.g.: option = string
workdir = .
logdir = logdir
loglevel = WARNING
spectra-dir = spectra
scheduler = local
bunch-size = 8
# pre_commands:
# Run before each process_spectra task. e.g., to initialize a virtualenv or mount a data directory
#pre-commands = source path/to/venv/bin/activate
3 changes: 2 additions & 1 deletion drp_1dpipe/io/conf/merge_results.conf
@@ -1,4 +1,5 @@
workdir = ~/workdir
# Write your program options here. e.g. : option = string
workdir = workdir
logdir = logdir
loglevel = WARNING
spectra_path = spectra
8 changes: 4 additions & 4 deletions drp_1dpipe/io/conf/pre_process.conf
@@ -1,7 +1,7 @@
# Write your program options here. e.g. : option = string
workdir = ~/workdir
workdir = workdir
logdir = logdir
loglevel = WARNING
bunch_size = 8
spectra_path = spectra
bunch_list = bunches.json
bunch-size = 8
spectra-path = spectra
bunch-list = bunches.json
20 changes: 11 additions & 9 deletions drp_1dpipe/io/conf/process_spectra.conf
@@ -1,11 +1,13 @@
workdir = ~/workdir
# Write your program options here. e.g. : option = string
workdir = workdir
logdir = logdir
loglevel = WARNING
calibration_dir = calibration
#parameters_file = parameters.json
template_dir = templates/BC03_sdss_tremonti21
linecatalog = linecatalogs/linecatalogamazedvacuum_C1_noHepsilon.txt
zclassifier_dir = reliability/zclassifier_C6A3iS1nD9cS1nS1_20180404
output_dir = output
spectra_path = spectra
process_method = amazed
calibration-dir = calibration
#parameters-file = parameters.json
template-dir = calibration/templates/BC03_sdss_tremonti21
linecatalog = calibration/linecatalogs/linecatalogamazedvacuum_D2_noHepsilon_noAbsHt3700A.txt
linemeas-linecatalog = calibration/linecatalogs/linecatalogamazedvacuum_D2_noHepsilon-linemeas.txt
zclassifier-dir = calibration/reliability/zclassifier_C6A3iS1nD9cS1nS1_20180404
output-dir = output
spectra-dir = spectra
process-method = amazed
8 changes: 0 additions & 8 deletions drp_1dpipe/io/conf/scheduler.conf

This file was deleted.

22 changes: 13 additions & 9 deletions drp_1dpipe/io/reader.py
@@ -1,20 +1,24 @@
import fitsio
import os.path
from pyamazed.redshift import *
from astropy.io import fits
from pyamazed.redshift import (CSpectrumSpectralAxis,
CSpectrumFluxAxis_withError,
CSpectrum)
import numpy as np


def read_spectrum(path):
"""
Read a pfsObject FITS file and build a CSpectrum out of it
:param path: FITS file name
:rtype: CSpectrum
"""
fits = fitsio.FITS(path)
obj_id = fits[0].read_header()['PFSVHASH']
data = fits['FLUXTBL']
spectralaxis = CSpectrumSpectralAxis(data['lambda'][:]*10)
signal = CSpectrumFluxAxis(data['flux'][:], np.sqrt(data['fluxVariance'][:]))
spectrum = CSpectrum(spectralaxis, signal)
spectrum.SetName(os.path.basename(path))

with fits.open(path) as f:
fluxtbl = f['FLUXTBL'].data
error = np.sqrt(f['COVAR'].data[:,0])
spectralaxis = CSpectrumSpectralAxis(fluxtbl['wavelength'] * 10)
signal = CSpectrumFluxAxis_withError(fluxtbl['flux'], error)
spectrum = CSpectrum(spectralaxis, signal)
spectrum.SetName(os.path.basename(path))
return spectrum
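
For readability, here is the new `read_spectrum` assembled from the added lines of the diff above (a sketch only; the `pyamazed` and `astropy` calls are taken verbatim from the diff, with descriptive comments added):

```python
import os.path

import numpy as np
from astropy.io import fits
from pyamazed.redshift import (CSpectrumSpectralAxis,
                               CSpectrumFluxAxis_withError,
                               CSpectrum)


def read_spectrum(path):
    """
    Read a pfsObject FITS file and build a CSpectrum out of it
    :param path: FITS file name
    :rtype: CSpectrum
    """
    with fits.open(path) as f:
        fluxtbl = f['FLUXTBL'].data
        # error taken as the square root of the first column of the covariance table
        error = np.sqrt(f['COVAR'].data[:, 0])
        # wavelengths scaled by 10 (presumably nm to Angstroms)
        spectralaxis = CSpectrumSpectralAxis(fluxtbl['wavelength'] * 10)
        signal = CSpectrumFluxAxis_withError(fluxtbl['flux'], error)
        spectrum = CSpectrum(spectralaxis, signal)
        spectrum.SetName(os.path.basename(path))
    return spectrum
```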
