This tutorial will go through using bilby_pipe, a tool for running analyses at scale. bilby_pipe is not as flexible as bilby itself, but it is intended to make common tasks straightforward.
All the files used are available in the repo here for future reference.
bilby_pipe is intended for use on either HTCondor or Slurm clusters. You can also use it on a local machine (running jobs serially). If you haven't already, follow the installation steps to make sure you have bilby_pipe installed. To check it is installed, run
$ bilby_pipe --version
bilby_pipe=1.0.0 (CLEAN) ....
bilby_pipe uses the gwpy, NDS2, and LDASTools-frameCPP libraries to access LIGO-Virgo-KAGRA data. To get full support, these need to be installed via conda:
$ conda install -c conda-forge python-ldas-tools-framecpp python-nds2-client
If you don't install these, remote data access (i.e. not on a LIGO Data Grid cluster) will be difficult.
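As a quick sanity check that gwpy can pull data at all (this fetches open data from GWOSC, so it does not exercise the NDS2 or frameCPP backends; the GPS times are illustrative, a few seconds around GW150914), here is a minimal Python sketch:
# Fetch a few seconds of open H1 data around GW150914 from GWOSC
from gwpy.timeseries import TimeSeries

data = TimeSeries.fetch_open_data("H1", 1126259460, 1126259464)
print(data)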
bilby_pipe operates through an ini file (also known as a configuration file). The ini file specifies what to run and how. To see all the available options for the ini file, you can run
$ bilby_pipe --help
You can also view this output in the docs.
🚀 Task: run the --help command and check it produces output.
An ini file consists of key=value pairs; a complete guide can be found in the docs. Here we will go through an example for a simulated signal.
# The accounting tag, only needed on LDG clusters.
# See https://ldas-gridmon.ligo.caltech.edu/accounting/condor_groups/determine_condor_account_group.html
# for help with determining what tag to use
accounting = FILL_THIS_IN
# A label to help us remember what the job was for
label = bbh_injection
# The directory to store results in
outdir = outdir_bbh_injection
# Which detectors to use, options: H1, L1, V1
detectors = [H1, L1]
# The duration of data to analyse in seconds
duration = 4
# The sampler
sampler = dynesty
# The options to pass to the sampler
sampler-kwargs = {'nlive': 1000}
# The prior file to use
prior-file = bbh_simple_example.prior
# We want to inject a signal (in this case, drawn randomly from the prior)
injection = True
# We want to use Gaussian noise (default is to simulate it from O4-design sensitivity curves)
gaussian-noise = True
# We'll do just one simulation
n-simulation = 1
# We'll run one "parallel" job. This runs n-parallel *identical* jobs and then combines the results together into a single combined run
n-parallel = 1
# Use parallelisation with 4 cores on one node (uncomment to use)
# request-cpus = 4
🚀 Task: save the contents of the ini file above in a file bbh_simple_example.ini.
As mentioned elsewhere, bilby supports parallelisation up to the number of cores available on a node. To access this, add the line
request-cpus = 4
You can check this is being applied by seeing this line in the logs:
Setting up multiproccesing pool with 4 processes.
Note that in the ini file we specified prior-file = bbh_simple_example.prior; this points to a file (you can specify an absolute or relative path as needed). Here are the contents of that file. As you can see, this is a simple prior in only the distance and inclination:
mass_1 = 50.0
mass_2 = 45.0
a_1 = 0.0
a_2 = 0.0
tilt_1 = 0.0
tilt_2 = 0.0
phi_12 = 0.0
phi_jl = 0.0
luminosity_distance = bilby.gw.prior.UniformComovingVolume(name='luminosity_distance', minimum=1e2, maximum=5e3, unit='Mpc')
dec = -0.2
ra = 1.4
theta_jn = Sine(name='theta_jn')
psi = 0.0
phase = 0.0
geocent_time = 0.0
🚀 Task: save the contents of the prior file above in a file bbh_simple_example.prior.
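If you want to sanity-check the prior file before setting up a job, here is a minimal Python sketch (BBHPriorDict is part of bilby; the file name matches the one saved above):
# Load the prior file with bilby and draw a single sample from it.
# This only checks that the file parses; bilby_pipe performs its own validation.
import bilby

priors = bilby.gw.prior.BBHPriorDict(filename="bbh_simple_example.prior")
print(priors.sample())  # one random draw (most parameters are delta functions)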
In the example ini file above, we see a few different styles of input; let's briefly review them:
- key = value (e.g. sampler = dynesty, duration = 4, etc.)
- key = list (e.g. detectors = [H1, L1])
- key = dict (e.g. sampler-kwargs = {'nlive': 1000}); note that the dictionary itself can handle many types of input
- key = boolean (e.g. gaussian-noise = True)
If you followed the steps above, you should have two files stored: bbh_simple_example.ini and bbh_simple_example.prior. If you do, you can run bilby_pipe with the command
$ bilby_pipe bbh_simple_example.ini
This will produce the following output:
09:37 bilby_pipe INFO : Running bilby_pipe version: 1.0.0: (CLEAN) e6223de 2020-07-27 17:01:09 -0500
09:37 bilby_pipe INFO : Running bilby: 1.0.0: release
09:37 bilby_pipe INFO : Setting segment duration 4.0s
09:37 bilby_pipe INFO : Setting prior-file to bbh_simple_example.prior
09:37 bilby_pipe INFO : Setting n_simulation=1
09:37 bilby_pipe INFO : Using injection file outdir_bbh_injection/data/bbh_injection_injection_file.dat
09:37 bilby_pipe INFO : Setting analysis request_memory=4.0GB
09:37 bilby_pipe INFO : Setting request_memory_generation=8GB
09:37 bilby_pipe INFO : Setting analysis request_cpus = 1
09:37 bilby_pipe INFO : Using geocent_time prior from prior_file
09:37 bilby_pipe INFO : Input prior = {
"mass_1": "DeltaFunction(peak=50.0, name=None, latex_label=None, unit=None)",
"mass_2": "DeltaFunction(peak=45.0, name=None, latex_label=None, unit=None)",
"a_1": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)",
"a_2": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)",
"tilt_1": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)",
"tilt_2": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)",
"phi_12": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)",
"phi_jl": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)",
"luminosity_distance": "UniformComovingVolume(minimum=100.0, maximum=5000.0, cosmology=FlatLambdaCDM(name=\"Planck15\", H0=67.7 km / (Mpc s), Om0=0.307, Tcmb0=2.725 K, Neff=3.05, m_nu=[0. 0. 0.06] eV, Ob0=0.0486), name='luminosity_distance', latex_label='$d_L$', unit=Unit(\"Mpc\"), boundary=None)",
"dec": "DeltaFunction(peak=-0.2, name=None, latex_label=None, unit=None)",
"ra": "DeltaFunction(peak=1.4, name=None, latex_label=None, unit=None)",
"theta_jn": "Sine(name='theta_jn', latex_label='$\\\\theta_{JN}$', unit=None, minimum=0, maximum=3.141592653589793, boundary=None)",
"psi": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)",
"phase": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)",
"geocent_time": "DeltaFunction(peak=0.0, name=None, latex_label=None, unit=None)"
}
09:37 bilby_pipe WARNING : File outdir_bbh_injection/bbh_injection_config_complete.ini already exists, not writing to file.
09:37 bilby_pipe INFO : Setting segment duration 4.0s
09:37 bilby_pipe INFO : Setting prior-file to /home/user1/GWParameterEstimationWorkshop2020/pipe_examples/bbh_simple_example.prior
09:37 bilby_pipe INFO : Setting n_simulation=1
09:37 bilby_pipe INFO : Using injection file outdir_bbh_injection/data/bbh_injection_injection_file.dat
09:37 bilby_pipe INFO : Setting analysis request_memory=4.0GB
09:37 bilby_pipe INFO : Setting request_memory_generation=8GB
09:37 bilby_pipe INFO : Setting analysis request_cpus = 1
09:37 bilby_pipe INFO : Setting segment trigger-times [0]
09:37 bilby_pipe INFO : DAG generation complete, to submit jobs run:
$ condor_submit_dag outdir_bbh_injection/submit/dag_bbh_injection.submit
09:37 bilby_pipe INFO : Using geocent_time prior from prior_file
09:37 bilby_pipe INFO : Overview page available at outdir_bbh_injection/results_page/overview.html
🚀 Task: run bilby_pipe bbh_simple_example.ini and confirm you get output like that above.
Note, this has not run or submitted the job, it has only created the files necessary to do so. You should now see a directory outdir_bbh_injection with several sub-directories:
$ ls outdir_bbh_injection/
data log_data_analysis log_data_generation log_results_page result results_page submit bbh_injection_config_complete.ini
Except for the submit directory, these will be empty right now and will be populated once the run starts. Let's look in the submit directory
$ ls outdir_bbh_injection/submit/
bash_bbh_injection.sh # <- This is a bash file containing all the commands to run different parts of the job locally
bbh_injection_data0_0_analysis_H1L1_dynesty.submit # <- This is a HTCondor submit file for the analysis step
bbh_injection_data0_0_generation.submit # <- This is a HTCondor submit file for the data generation step
dag_bbh_injection.submit # <- This is the HTCondor dag submit file
If you are running things on an HTCondor cluster, you can now submit the job with
$ condor_submit_dag outdir_bbh_injection/submit/dag_bbh_injection.submit
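Once submitted, you can monitor the jobs with the standard HTCondor tools (these are part of HTCondor, not bilby_pipe), e.g.
$ condor_q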
Alternatively, you can submit things when you run bilby_pipe by adding the --submit flag, i.e.
$ bilby_pipe bbh_simple_example.ini --submit
As you'll notice, the default in bilby_pipe is to use the HTCondor scheduler (this is the standard on all LDG clusters). If you are on a Slurm cluster, you need to add some lines to the ini file; a detailed discussion of these can be found in the documentation, but they will depend on your cluster setup.
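As a rough sketch (the exact option names and values can be checked with bilby_pipe --help and will depend on your cluster), the Slurm-related lines in the ini file look something like:
# Write Slurm rather than HTCondor submission scripts
scheduler = slurm
# Arguments passed through to the scheduler (illustrative values)
scheduler-args = partition=compute time=24:00:00
# Environment (e.g. a conda env) to activate in the job script
scheduler-env = bilby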
All bilby_pipe jobs can be run on your local machine. There are two ways to do this. First, you can run
$ bilby_pipe bbh_simple_example.ini --local
This will run each step of the job sequentially on your local machine.
The second way is to use the bash file generated in the submit directory; let's look inside it:
$ cat outdir_bbh_injection/submit/bash_bbh_injection.sh
#!/usr/bin/env bash
# bbh_injection_data0_0_generation
# PARENTS
# CHILDREN bbh_injection_data0_0_analysis_H1L1_dynesty
/home/user1/anaconda3/envs/bilby/bin/bilby_pipe_generation outdir_bbh_injection/bbh_injection_config_complete.ini --label bbh_injection_data0_0_generation --idx 0 --trigger-time 0 --injection-file outdir_bbh_injection/data/bbh_injection_injection_file.dat
# bbh_injection_data0_0_analysis_H1L1_dynesty
# PARENTS bbh_injection_data0_0_generation
# CHILDREN
/home/user1/anaconda3/envs/bilby/bin/bilby_pipe_analysis outdir_bbh_injection/bbh_injection_config_complete.ini --outdir outdir_bbh_injection --detectors H1 --detectors L1 --label bbh_injection_data0_0_analysis_H1L1_dynesty --data-dump-file outdir_bbh_injection/data/bbh_injection_data0_0_generation_data_dump.pickle --sampler dynesty
This contains the set of instructions to run each step of the job. You can run this directly, i.e. with
$ bash outdir_bbh_injection/submit/bash_bbh_injection.sh
Or you can copy the parts of the file you want to run to a separate script and run each one individually (this is often very useful if you just want to run a single step locally, e.g. the plotting).
🚀 Task: run the example using either the --local flag or the bash file.
Once you have run the example to completion, we should check the outputs. If you ran using either Slurm or HTCondor, the log* directories will be populated with files containing the run outputs. If instead you ran it locally, this output will have been printed to the terminal.
If you are using the dynesty sampler, you'll get output like this (either in the logs, or printed to the terminal)
136it [00:04, 4.57it/s, bound:0 nc: 1 ncall:6.6e+02 eff:20.7% logz-ratio=146.11+/-0.13 dlogz:321.891>0.1]
151it [00:04, 6.44it/s, bound:0 nc: 3 ncall:6.8e+02 eff:22.3% logz-ratio=151.24+/-0.13 dlogz:316.951>0.1]
169it [00:04, 9.05it/s, bound:0 nc: 1 ncall:7.0e+02 eff:24.1% logz-ratio=155.53+/-0.13 dlogz:312.568>0.1]
185it [00:04, 12.54it/s, bound:0 nc: 1 ncall:7.3e+02 eff:25.3% logz-ratio=160.67+/-0.14 dlogz:307.518>0.1]
203it [00:04, 17.38it/s, bound:0 nc: 1 ncall:7.5e+02 eff:27.0% logz-ratio=168.88+/-0.14 dlogz:299.232>0.1]
By column, this tells you:
- 203it: the number of iterations
- 00:04: the time taken so far (note that if jobs are restarted this counter is reset to zero)
- 17.38it/s: the number of iterations per second
- bound:0: the bound index (see the dynesty docs)
- nc: 1: the number of likelihood calls used in the last iteration
- ncall:7.5e+02: the total number of likelihood calls
- eff:27.0%: the sampling efficiency
- logz-ratio=168.88+/-0.14: the current logz-ratio estimate (if > 0, this is a Bayes factor in support of the signal hypothesis vs. Gaussian noise)
- dlogz:299.232>0.1: the estimated remaining evidence and the stopping threshold (in this instance, 0.1). Sampling will finish when dlogz is less than the threshold.
Roughly speaking, logz-ratio + dlogz is an estimate of the final evidence (for the last line above, 168.88 + 299.232, i.e. roughly 468), but note that dlogz can jump if new regions of the parameter space with high likelihoods are found.
In the result directory, we have all of the relevant output:
$ ls outdir_bbh_injection/result
bbh_injection_data0_0_analysis_H1L1_dynesty_result.json # <- THIS IS THE MAIN RESULT FILE
bbh_injection_data0_0_analysis_H1L1_dynesty_dynesty.pickle # <- A python pickle file of the dynesty sampler output
bbh_injection_data0_0_analysis_H1L1_dynesty_resume.pickle # <- A python pickle file used to store results during the run
bbh_injection_data0_0_analysis_H1L1_dynesty_samples.dat # <- A set of samples created during the run, not the final output
bbh_injection_data0_0_analysis_H1L1_dynesty_checkpoint_run.png
bbh_injection_data0_0_analysis_H1L1_dynesty_checkpoint_stats.png
bbh_injection_data0_0_analysis_H1L1_dynesty_checkpoint_trace.png
The last three png images contain diagnostic plots. These are updated during the run and help show the progress:
- the checkpoint_run plot shows the dynesty summary plot
- the checkpoint_stats plot shows the number of likelihood calls and the scale parameter for the nested sampling proposals (for this short run, it is a little uninteresting)
- the checkpoint_trace plot shows the dynesty trace plot; the right-hand side shows the 1D posteriors
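Once the run is finished, here is a minimal Python sketch for working with the main result file (the file name matches the listing above):
# Load the bilby result file and inspect the posterior
import bilby

result = bilby.result.read_in_result(
    "outdir_bbh_injection/result/"
    "bbh_injection_data0_0_analysis_H1L1_dynesty_result.json"
)
print(result.log_evidence)  # final log-evidence estimate
print(result.posterior[["luminosity_distance", "theta_jn"]].describe())
result.plot_corner()  # make a corner plot (saved to the result's outdir)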
As seen above, you can specify your own prior using a prior file. bilby_pipe also provides a set of default prior files based on the duration of data; the chirp-mass range of each default prior ensures that the signals fit within the data duration. To use these default priors, simply specify, for example,
prior-file=4s
If you want to access real data, you'll need to specify a trigger-time (a GPS time as used by LIGO/Virgo) and the channel-dict used to determine which data to use. Here is an example ini file for running on GW150914:
# The accounting tag, only needed on LDG clusters.
# See https://ldas-gridmon.ligo.caltech.edu/accounting/condor_groups/determine_condor_account_group.html
# for help with determining what tag to use
accounting = FILL_THIS_IN
# A label to help us remember what the job was for
label = GW150914
# The directory to store results in
outdir = outdir_bbh_gwosc_GW150914
# Which detectors to use, options: H1, L1, V1
detectors = [H1, L1]
# The duration of data to analyse in seconds
duration = 4
# The sampler
sampler = dynesty
# The options to pass to the sampler
sampler-kwargs = {'nlive': 1000}
# The prior file to use
prior-file = 4s
# For events, we can pull in the time using a string
trigger-time = GW150914
# You could alternatively do (uncomment to test)
# trigger-time = 1126259462.4
# Here we specify the GWOSC channel, if you want to use a different channel
# you can pass that per-detector
channel-dict = {H1:GWOSC, L1:GWOSC}
You can also inject a simulated signal into real data. Here is an example ini file which injects a signal into GWOSC data a few hundred seconds after GW150914:
# The accounting tag, only needed on LDG clusters.
# See https://ldas-gridmon.ligo.caltech.edu/accounting/condor_groups/determine_condor_account_group.html
# for help with determining what tag to use
accounting = FILL_THIS_IN
# A label to help us remember what the job was for
label = gwosc_injection
# The directory to store results in
outdir = outdir_bbh_gwosc_injection
# Which detectors to use, options: H1, L1, V1
detectors = [H1, L1]
# The duration of data to analyse in seconds
duration = 4
# The sampler
sampler = dynesty
# The options to pass to the sampler
sampler-kwargs = {'nlive': 1000}
# The prior file to use
prior-file = 4s
# Specify a random time (a few hundred seconds after GW150914)
trigger-time = 1126259600
# Here we specify the GWOSC channel, if you want to use a different channel
# you can pass that per-detector
channel-dict = {H1:GWOSC, L1:GWOSC}
injection=True
injection-dict={'chirp_mass': 17.051544979894693, 'mass_ratio': 0.3183945489993522, 'a_1': 0.29526500202350264, 'a_2': 0.23262056301313416, 'tilt_1': 1.0264673717225983, 'tilt_2': 2.1701305583885513, 'phi_12': 5.0962562029664955, 'phi_jl': 2.518241237045709, 'luminosity_distance': 497.2983560174788, 'dec': 0.2205292600865073, 'ra': 3.952677097361719, 'theta_jn': 1.8795187965094322, 'psi': 2.6973435044499543, 'phase': 3.686990398567503, 'geocent_time': 1126259600}
Note that the geocent_time of the injection has been chosen to fall within the prior window (+/- 0.2 around the trigger-time).
Running this, you can check the estimated SNR of the injected signal in the logs, e.g.
12:53 bilby INFO : Injected signal in L1:
12:53 bilby INFO : optimal SNR = 13.28
12:53 bilby INFO : matched filter SNR = -3.65+38.02j
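These messages are produced by the data-generation step, so (assuming the default directory layout shown earlier) you can find them with something like:
$ grep "SNR" outdir_bbh_gwosc_injection/log_data_generation/*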
You can also provide your own PSD files per detector via the psd-dict option:
psd-dict PSD_DICT Dictionary of PSD files to use (default: None)
e.g.
psd-dict = {H1:psds_files/h1.psd}
If no PSD is given, one will be estimated from the data. Note this differs from standard LVK analyses, which use a BayesWave-generated PSD. Here are the options to control how that PSD is estimated:
--psd-fractional-overlap PSD_FRACTIONAL_OVERLAP
Fractional overlap of segments used in estimating the
PSD (default: 0.5)
--post-trigger-duration POST_TRIGGER_DURATION
Time (in s) after the trigger_time to the end of the
segment (default: 2.0)
--sampling-frequency SAMPLING_FREQUENCY
--psd-length PSD_LENGTH
Sets the psd duration (up to the psd-duration-
maximum). PSD duration calculated by psd-length x
duration [s]. Default is 32. (default: 32)
--psd-maximum-duration PSD_MAXIMUM_DURATION
The maximum allowed PSD duration in seconds, default
is 1024s. (default: 1024)
--psd-method PSD_METHOD
PSD method see gwpy.timeseries.TimeSeries.psd for
options (default: median)
--psd-start-time PSD_START_TIME
Start time of data (relative to the segment start)
used to generate the PSD. Defaults to psd-duration
before the segment start time (default: None)
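As an illustration, the corresponding ini lines might look like the following (the values are just the defaults from the help text above, and the PSD file paths are hypothetical):
# Use pre-computed PSD files (paths are hypothetical)
psd-dict = {H1:psd_files/h1.psd, L1:psd_files/l1.psd}
# Or, when estimating the PSD from the data, control how it is done
psd-length = 32
psd-maximum-duration = 1024
psd-method = median
psd-fractional-overlap = 0.5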