
Commit

documentation updates
mschwamb committed Jan 12, 2025
1 parent 50e87c2 commit 2162419
Showing 5 changed files with 75 additions and 41 deletions.
2 changes: 2 additions & 0 deletions docs/configfiles.rst
@@ -43,6 +43,8 @@ approximation of the Rubin detector.
   :language: text
   :linenos:

.. _known_config:

Rubin Known Object Prediction
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This configuration file is appropriate for running ``Sorcha`` using the full camera footprint but with randomization,
3 changes: 3 additions & 0 deletions docs/ephemerisgen.rst
@@ -122,6 +122,9 @@ If you want to use the same input orbits across multiple ``Sorcha`` runs, you ca
.. tip::
   Compared to the other outputs from ``Sorcha``, the ephemeris output files are typically very large. The output will be slow to read into ``Sorcha``, but for some use cases reading in the ephemeris as a file can be faster than ephemeris generation on the fly. We recommend only outputting the contents of the ephemeris stage if you need it to speed up future simulations. If possible, use the HDF5 file format to help with disk I/O speeds.

.. tip::
   If instead you want to know which objects from the input small body population land in the survey observations, with an estimate of their apparent magnitudes and without applying any other cuts or filters to the detections (not even the discovery efficiency and linking effects), you can use or adapt the :ref:`known_config` example :ref:`configuration file <configs>`.
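
If you do write the ephemeris to HDF5, here is a minimal sketch for inspecting it afterwards with pandas (assuming the file was written as a single pandas table; the file name is invented)::

    import pandas as pd

    # The key argument can be omitted when the store holds a single table.
    eph = pd.read_hdf("my_ephemeris.h5")
    print(eph.columns.tolist())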

Validation
--------------------------

10 changes: 6 additions & 4 deletions docs/example_files/multi_sorcha.py
@@ -4,9 +4,9 @@
import pandas as pd
import sqlite3

def run_sorcha(i, args, path_inputs, pointings, instance, stats, config):
    print(f"sorcha run -c {config} --pd {pointings} -o {args.path}{instance}/ -t {instance}_{i} --ob {args.path}{instance}/orbits_{i}.csv -p {args.path}{instance}/physical_{i}.csv --st {stats}_{i}", flush=True)
    os.system(f"sorcha run -c {config} --pd {pointings} -o {args.path}{instance}/ -t {instance}_{i} --ob {args.path}{instance}/orbits_{i}.csv -p {args.path}{instance}/physical_{i}.csv --st {stats}_{i}")

if __name__ == '__main__':
    import argparse
@@ -22,6 +22,7 @@ def run_sorcha(i, args, path_inputs, pointings, instance, config):
    parser.add_argument('--cleanup', action='store_true')
    parser.add_argument('--copy_inputs', action='store_true')
    parser.add_argument('--pointings', type=str)
    parser.add_argument('--stats', type=str)
    parser.add_argument('--config', type=str)
    args = parser.parse_args()
    chunk = args.chunksize
@@ -30,6 +31,7 @@ def run_sorcha(i, args, path_inputs, pointings, instance, config):
    pointings = args.pointings
    path = args.path
    config = args.config
    stats = args.stats

    orbits = tb.Table.read(args.input_orbits)
    orbits = orbits[instance*chunk:(instance+1)*chunk]
@@ -50,7 +52,7 @@ def run_sorcha(i, args, path_inputs, pointings, instance, config):
        sub_phys.write(f"{args.path}{instance}/physical_{i}.csv", overwrite=True)

    with Pool(processes=args.cores) as pool:
        pool.starmap(run_sorcha, [(i, args, path_inputs, pointings, instance, stats, config) for i in range(args.cores)])

    data = []
    for i in range(args.cores):
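
To make the chunking arithmetic in this script concrete, here is a minimal sketch (illustrative only: the ten-object table and chunk size are invented, and ObjID mirrors the identifier column used in ``Sorcha``'s input files) of how each --instance value selects its slice of the input orbits::

    import astropy.table as tb

    # Toy stand-in for the real --input_orbits file.
    orbits = tb.Table({"ObjID": [f"obj_{j}" for j in range(10)]})

    chunk = 4                    # --chunksize
    for instance in range(3):    # one value per SLURM array task, e.g. --array=0-2
        sub = orbits[instance * chunk:(instance + 1) * chunk]
        print(instance, list(sub["ObjID"]))
    # instance 0 -> obj_0..obj_3, instance 1 -> obj_4..obj_7, instance 2 -> obj_8, obj_9
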
45 changes: 26 additions & 19 deletions docs/hpc.rst
@@ -1,32 +1,34 @@
.. _hpc:

Parallelization
===============================================

Embarrassingly Parallel Problem
------------------------------------

``Sorcha``'s design lends itself perfectly to parallelization: when it simulates a large number of solar system objects, each one is considered in turn, independently of all the others. If you have access to a large number of computing cores, you can run ``Sorcha`` much more quickly by dividing up the labor: giving a small part of your model population to each core.

This involves two subtasks: breaking up your model population into an appropriate number of input files with unique names, and organizing a large number of cores to simultaneously run ``Sorcha`` on their own individually named input files. Both of these tasks are easy in theory, but tricky enough in practice that we provide some guidance below.
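
A minimal sketch of the first subtask, assuming the model population lives in a single CSV file with one row per object (the file names here are invented, and pandas/numpy are assumed to be available)::

    import numpy as np
    import pandas as pd

    # Split my_orbits.csv into uniquely named chunk files: orbits_000.csv, ...
    orbits = pd.read_csv("my_orbits.csv")
    n_chunks = 8
    for k, part in enumerate(np.array_split(orbits, n_chunks)):
        part.to_csv(f"orbits_{k:03d}.csv", index=False)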


SLURM
---------

Slurm Workload Manager is a resource management utility commonly used by computing clusters. We provide starter code for running large parallel ``Sorcha`` batches using SLURM, though the general guidance we provide is applicable to any system. Documentation for SLURM is available `here <https://slurm.schedmd.com/>`_. Please note that your HPC (High Performance Computing) facility's SLURM setup may differ from those on which ``Sorcha`` was tested, and it is always a good idea to read any facility-specific documentation or speak to the HPC maintainers before you begin to run jobs.

Quickstart
--------------

We provide as a starting point our example scripts for running ``Sorcha`` on HPC facilities using SLURM. Some modifications will be required to make them work for your facility.

Below is a very simple SLURM script example designed to run the demo files three times on three cores in parallel. Here, one core has been assigned to each ``Sorcha`` run, with each core assigned 1 GB of memory.

.. literalinclude:: ./example_files/multi_sorcha.sh
   :language: text

Please note that the time taken to run and the memory required will vary enormously based on the size of your input files, your input population, and the chunk size assigned in the ``Sorcha`` configuration file: we therefore recommend test runs before you commit to very large runs. The chunk size is an especially important parameter: too small and ``Sorcha`` will take a very long time to run; too large and the memory footprint may become prohibitive. We have found that chunk sizes of 1,000 to 10,000 work best.

Below is a more complex example of a SLURM script. Here, multi_sorcha.sh calls multi_sorcha.py, which splits up an input file into a number of 'chunks' and runs ``Sorcha`` in parallel on a user-specified number of cores.

multi_sorcha.sh:

@@ -38,49 +40,54 @@

multi_sorcha.py:
.. literalinclude:: ./example_files/multi_sorcha.py
   :language: python

.. note::
   We provide these here for you to copy, paste, and edit as needed. You may have to make some slight modifications to both the SLURM script and multi_sorcha.py, depending on whether you are running ``Sorcha`` without the stats file.

multi_sorcha.sh requests many parallel SLURM jobs of multi_sorcha.py, feeding each a different --instance parameter. After changing 'my_orbits.csv', 'my_colors.csv', and 'my_pointings.db' to match the above, it could be run as ``sbatch --array=0-9 multi_sorcha.sh 25 4`` to generate ten jobs, each with 4 cores running 25 orbits each.


You can run multi_sorcha.py on the command line as well::

    python multi_sorcha.py --config sorcha_config_demo.ini --input_orbits mba_sample_1000_orbit.csv --input_physical mba_sample_1000_physical.csv --pointings baseline_v2.0_1yr.db --path ./ --chunksize 1000 --norbits 250 --cores 4 --instance 0 --stats mbastats --cleanup --copy_inputs

This will generate a single output file. It should work fine on a laptop, and be somewhat (but not 4x) faster than the single-core equivalent, due to overheads (compare ``time sorcha run -c sorcha_config_demo.ini -pd baseline_v2.0_1yr.db -o ./ -t 0_0 --st mbastats_0 -ob mba_sample_1000_orbit.csv -p mba_sample_1000_physical.csv``).

.. note::
   This ratio improves as input file sizes grow. Make sure to experiment with different numbers of cores to find what's fastest given your setup and file sizes.


Sorcha’s Helpful Utilities
---------------------------------

``Sorcha`` comes with a tool designed to combine the results of multiple runs and the input files used to create them into tables on a SQL database. This can make exploring your results easier. To see the usage of this tool, on the command line, run::

    sorcha outputs create-sqlite --help

``Sorcha`` also has a tool designed to search for and check the logs of a large number of runs. This tool can make sure all of the runs completed successfully, and output to either the terminal or a .csv file the names of the runs which have not completed and the relevant error message, if applicable. To see the usage of this tool, on the command line run::

    sorcha outputs check-logs --help


Best Practices/Tips and Tricks
-------------------------------------

1. We strongly recommend that HPC users download the auxiliary files needed to run ASSIST+REBOUND into a known, named directory, and use the -ar command line flag in their **sorcha run** call to point ``Sorcha`` to those files. You can download the auxiliary files using::

    sorcha bootstrap --cache <directory>

And then run ``Sorcha`` via::

    sorcha run … -ar /path/to/folder/

This is because ``Sorcha`` will otherwise attempt to download the files into the local cache, which may be on the HPC nodes rather than in your user directory, potentially triggering multiple slow downloads.

2. We recommend that each ``Sorcha`` run be given its own individual output directory. If multiple parallel ``Sorcha`` runs are attempting to save to the same file in the same directory, this will cause confusing and unexpected results.

3. ``Sorcha`` output files can be **very large**, and user directories on HPC facilities are usually space-limited. Please ensure that your ``Sorcha`` runs are directing the output to be saved in a location with sufficient space, like your HPC cluster's scratch drive.

4. Think about having useful, helpful file names for your outputs. It is often tempting to call them something like “sorcha_output_<number>” or “sorcha_output_<taskid>”, but hard-won experience has led us to instead recommend more explanatory names for when you come back to your output later.




.. tip::
   You can use the **sorcha init** command to copy ``Sorcha``'s :ref:`example configuration files <example_configs>` into a directory of your choice.
56 changes: 38 additions & 18 deletions docs/outputs.rst
@@ -3,6 +3,17 @@
Outputs
==================

``Sorcha`` outputs:

* :ref:`Detections File <detections>` (a list of all the detections of the input population made by the simulated survey)
* (Optional) :ref:`Statistics (Tally) File <stats>`, which provides a summary overview of the objects from the input population that were "found" in the simulated survey
* (Optional) :ref:`Ephemeris Output <ephem_output>`, which provides the output from :ref:`Ephemeris Generation<ephemeris_gen>`

.. image:: images/survey_simulator_flow_chart.png
   :width: 800
   :alt: An overview of the inputs and outputs of the Sorcha code.
   :align: center


.. attention::
   Use the **-o** flag on the command line to specify where ``Sorcha`` should be saving any output and log files (the file path).

@@ -26,6 +37,8 @@ The :ref:`configuration file<configs>` keyword output_format in the OUTPUT section
.. attention::
   Use the **-t** flag on the command line to specify the filename stem for all the ``Sorcha`` output files and logs.

.. _detections:

Detections File
----------------------

Expand All @@ -35,7 +48,7 @@ with a row for each predicted detection and a column for each parameter calcula

Additionally, the output columns of the detections file can be set to either "basic" or "all" settings (described below) using the output_columns :ref:`configuration file<configs>` keyword.
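
Both of these keywords live in the OUTPUT section of the :ref:`configuration file<configs>`; as a hedged sketch (the values shown are just one valid combination)::

    [OUTPUT]
    output_format = csv
    output_columns = basic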

.. _basic:

Basic Output
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -114,7 +127,7 @@ Example Detections File in Basic Format
S1000000a,61789.27659,164.99043640246796,-19.09523631317997,164.29665099999988,-19.110176000000447,2.8895553381860802e-06,z,19.376978135088684,19.359651855968583,0.008079363622311368,0.00805998568672928,23.293210067462763,23.293123719813384
.. _full:

Full Output
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -254,23 +267,9 @@ Detections File: Full Output Column Names, Formats, and Descriptions

Optional Outputs
----------------------


.. _stats:

Statistics (Tally) File
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``Sorcha`` can also output a statistics or "tally" file (if specified using the **--st** flag) which contains an overview of the ``Sorcha`` output for each object and filter. Minimally, this

.. note::
   Unless the user has specified **drop_unlinked = False** in the :ref:`configuration file<configs>`, the object_linked column will read TRUE for all objects. To see which objects were not linked by ``Sorcha``, this variable must be set to False.
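
As an illustrative sketch of working with the tally file (the file name is invented, and ObjID/optFilter are assumed column names based on the per-object, per-filter description above)::

    import pandas as pd

    # Written when e.g. --st mbastats is passed on the command line.
    stats = pd.read_csv("mbastats.csv")

    # Count how many distinct objects were "found" in each filter.
    print(stats.groupby("optFilter")["ObjID"].nunique())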

.. _ephem_output:

Ephemeris Output
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Optionally (with the **--ew** flag set at the command line), an ephemeris file of all detections near the field can be written to a separate file, which can then be provided back to ``Sorcha`` as an optional external ephemeris file with the **-er** flag.
More information can be found on this functionality, including the output columns, in the :ref:`Ephemeris Generation<ephemeris_gen>` section of the documentation.

The format of the output ephemeris file is controlled by the **eph_format** keyword in the INPUT section of the :ref:`configuration file<configs>`::

    [INPUT]
    ephemerides_type = external
    eph_format = csv

.. attention::
   Users should note that output produced by reading in a previously-generated ephemeris file will be in a different order than the output produced when running the ephemeris generator within ``Sorcha``. This is simply a side-effect of how ``Sorcha`` reads in ephemeris files and does not affect the actual content of the output.

.. tip::
   If instead you want to know which objects from the input small body population land in the survey observations, with an estimate of their apparent magnitudes and without applying any other cuts or filters to the detections (not even the discovery efficiency and linking effects), you can use or adapt the :ref:`known_config` example :ref:`configuration file <configs>`.
