diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 45c3811e..f83b0396 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -109,7 +109,14 @@ jobs:
uses: ammaraskar/sphinx-action@master
with:
docs-folder: "docs/"
- pre-build-command: "mkdir -p docs/build/html/coverage/htmlcov && chmod -R 777 docs/build/html && chown 1001 docs/build/html/coverage"
+ pre-build-command: |
+        mkdir -p docs/build/html/coverage/htmlcov && chmod -R 777 docs/build/html && chmod -R 777 docs/build/html/coverage/htmlcov && chown 1001 docs/build/html/coverage
+ - name: "Chmod directory structure"
+ run: |
+ sudo chmod -R 777 docs/build/html
+ # sudo chmod -R 777 docs/build/html/coverage
+ # sudo chmod -R 777 docs/build/html/coverage/htmlcov
+ ls -la docs/build/html/coverage/htmlcov
- name: "Download coverage report"
uses: actions/download-artifact@v3
with:
@@ -120,6 +127,9 @@ jobs:
with:
name: "covbadge"
path: "docs/build/html"
+ - name: "List html dir"
+ run: |
+ ls -la docs/build/html/coverage/htmlcov
- name: Upload artifacts
uses: actions/upload-artifact@v3
with:
diff --git a/README.md b/README.md
index 2325d669..7e2faa3a 100644
--- a/README.md
+++ b/README.md
@@ -6,52 +6,145 @@ CEDA Near-Line Data Store
[![PEP8](https://img.shields.io/badge/code%20style-pep8-orange.svg)](https://www.python.org/dev/peps/pep-0008/)
[![Coverage](https://cedadev.github.io/nlds/coverage.svg)](https://cedadev.github.io/nlds/coverage/htmlcov/)
-This is the HTTP API server code for the CEDA Near-Line Data Store (NLDS).
-It requires the use of the NLDS client, either the command line or library:
-[NLDS client on GitHub](https://github.com/cedadev/nlds-client).
+This is the server code for the CEDA Near-Line Data Store (NLDS), consisting of
+an HTTP API and a cluster of rabbit consumer microservices. The
+[NLDS client](https://github.com/cedadev/nlds-client) is required to communicate
+with the API, either via the command line interface or python client library.
-NLDS server is built upon [FastAPI](https://fastapi.tiangolo.com).
+The NLDS is a unified storage solution, allowing easy use of disk, s3 object
+storage, and tape from a single interface. It utilises object storage as a cache
+for the tape backend, allowing for low-latency backup.
-NLDS requires Python 3. It has been tested with Python 3.9, 3.10 and Python 3.11.
+The NLDS server is built upon [FastAPI](https://fastapi.tiangolo.com) for the
+API, [RabbitMQ](https://www.rabbitmq.com/) for the message broker,
+[minio](https://min.io/) for the s3 client,
+[SQLAlchemy](https://www.sqlalchemy.org/) for the database client and
+[xrootd](https://xrootd.slac.stanford.edu/) for the tape interactions.
+
+Documentation can be found [here](https://cedadev.github.io/nlds/index.html).
Installation
------------
+If installing locally we strongly recommend the use of a virtual environment to
+manage the dependencies.
+
1. Create a Python virtual environment:
- `python3 -m venv ~/nlds-venv`
+
+ ```
+ python3 -m venv nlds-venv
+ ```
2. Activate the nlds-venv:
- `source ~/nlds-venv/bin/activate`
-
-3. Install the nlds package with editing capability:
- `pip install -e ~/Coding/nlds`
-
-Running - Dec 2021
-------------------
-
-1. NLDS currently uses `uvicorn` to run. The command line to invoke it is:
-```uvicorn nlds.main:nlds --reload```
-
- This will create the NLDS REST-API server at the IP-address: `http://127.0.0.1:8000/`
-2. To run the processors, you have two options:
- 1. In unique terminals start each processor individually, after
- activating the virtual env, for example:
- ```source ~/nlds-venv/bin/activate; python nlds_processors/index.py```
- This will send the output to the terminal.
-
- 2. Use the script `test_run_processor.sh`. This will run all five processors
- in the background, sending the output to five logs in the `~/nlds_log/`
- directory.
-
-Viewing the API docs
---------------------
-
-FastAPI displays automatically generated documentation for the REST-API. To browse this go to: http://127.0.0.1:8000/docs#/
+ ```
+ source nlds-venv/bin/activate
+ ```
+
+3. You can either install the nlds package with editing capability from a
+ locally cloned copy of this repo (note the inclusion of the editable flag
+ `-e`), e.g.
+
+ ```
+ pip install -e ~/Coding/nlds
+ ```
+
+ or install this repo directly from github:
+
+ ```
+ pip install git+https://github.com/cedadev/nlds.git
+ ```
+
+4. (Optional) There are several more requirements/dependencies defined:
+ * `requirements-dev.txt` - contains development-specific (i.e. not
+ production appropriate) dependencies. Currently this consists of a psycopg2
+  binary python package for interacting with PostgreSQL from a local NLDS
+ instance.
+ * `requirements-deployment.txt` - contains deployment-specific
+ dependencies, excluding `XRootD`. Currently this consists of the psycopg2
+ package but built from source instead of a precompiled binary.
+ * `requirements-tape.txt` - contains tape-specific dependencies, notably
+ `XRootD`.
+ * `tests/requirements.txt` - contains the dependencies for the test suite.
+ * `docs/requirements.txt` - contains the dependencies required for
+ building the documentation with sphinx.
Server Config
-------------
-To interface with the JASMIN accounts portal, for the OAuth2 authentication, a `.server_config` file has to be created. This contains infrastructure information and so is not included in the GitHub repository.
+To interface with the JASMIN accounts portal, for the OAuth2 authentication, a
+`.server_config` file has to be created. This contains infrastructure
+information and so is not included in the GitHub repository. See the
+[relevant documentation](https://cedadev.github.io/nlds/server-config/server-config.html)
+and [examples](https://cedadev.github.io/nlds/server-config/examples.html) for
+more information.
+
+A Jinja-2 template for the `.server_config` file can also be found in the
+`templates/` directory.
+
+Running the Server
+------------------
-A Jinja-2 template for the `.server_config` file can be found in the `templates/` directory.
+1. The NLDS API requires something to serve the API, usually uvicorn in a local
+ development environment:
+
+ ```
+ uvicorn nlds.main:nlds --reload
+ ```
+
+ This will create a local NLDS API server at `http://127.0.0.1:8000/`.
+   FastAPI displays automatically generated documentation for the REST-API; to
+   browse this, go to http://127.0.0.1:8000/docs/
+
+2. To run the microservices, you have two options:
+ * In individual terminals, after activating the virtual env, (e.g.
+ `source ~/nlds-venv/bin/activate`), start each of the microservice
+ consumers:
+ ```
+ nlds_q
+ index_q
+ catalog_q
+ transfer_put_q
+ transfer_get_q
+ logging_q
+ archive_put_q
+ archive_get_q
+ ```
+ This will send the output of each consumer to its own terminal (as well
+ as whatever is configured in the logger).
+
+ * Alternatively, you can use the scripts in the `test_run/` directory,
+ notably `start_test_run.py` to start and `stop_test_run.py` to stop.
+ This will start a [screen](https://www.gnu.org/software/screen/manual/screen.html)
+     session containing all 8 processors (plus the API server), sending each
+     output to a log in the `./nlds_log/` directory.
+
+Tests
+-----
+
+The NLDS uses pytest for its unit test suite. Once the dependencies in
+`tests/requirements.txt` have been installed, you can run the tests with
+```
+pytest
+```
+in the root directory. Pytest is also used for integration testing in the
+separate [nlds-test repo](https://github.com/cedadev/nlds-test).
+
+The `pytest` test-coverage report can (hopefully) be found [here](https://cedadev.github.io/nlds/coverage/htmlcov/).
+
+
+License
+-------
+
+The NLDS is available under a BSD 2-Clause License; see the [license](./LICENSE.txt)
+for more info.
+
+
+
+Acknowledgements
+================
+
+NLDS was developed at the Centre for Environmental Data Analysis and supported
+through the ESiWACE2 project. The project ESiWACE2 has received funding from the
+European Union's Horizon 2020 research and innovation programme under grant
+agreement No 823988.
diff --git a/docs/source/ceda.png b/docs/source/_images/ceda.png
similarity index 100%
rename from docs/source/ceda.png
rename to docs/source/_images/ceda.png
diff --git a/docs/source/_images/esiwace2.png b/docs/source/_images/esiwace2.png
new file mode 100644
index 00000000..ba3c7db7
Binary files /dev/null and b/docs/source/_images/esiwace2.png differ
diff --git a/docs/source/_images/icon-black.png b/docs/source/_images/icon-black.png
new file mode 100644
index 00000000..1b4b68e8
Binary files /dev/null and b/docs/source/_images/icon-black.png differ
diff --git a/docs/source/_images/icon.png b/docs/source/_images/icon.png
new file mode 100644
index 00000000..4197acc0
Binary files /dev/null and b/docs/source/_images/icon.png differ
diff --git a/docs/source/_images/logo.png b/docs/source/_images/logo.png
new file mode 100644
index 00000000..4c77be37
Binary files /dev/null and b/docs/source/_images/logo.png differ
diff --git a/docs/source/_images/nlds-logo.png b/docs/source/_images/nlds-logo.png
new file mode 100644
index 00000000..5d3fcbc0
Binary files /dev/null and b/docs/source/_images/nlds-logo.png differ
diff --git a/docs/source/_images/nlds.pdf b/docs/source/_images/nlds.pdf
new file mode 100644
index 00000000..4d58c387
Binary files /dev/null and b/docs/source/_images/nlds.pdf differ
diff --git a/docs/source/_images/nlds.png b/docs/source/_images/nlds.png
new file mode 100644
index 00000000..606727c4
Binary files /dev/null and b/docs/source/_images/nlds.png differ
diff --git a/docs/source/status_images/all_off.png b/docs/source/_images/status/all_off.png
similarity index 100%
rename from docs/source/status_images/all_off.png
rename to docs/source/_images/status/all_off.png
diff --git a/docs/source/status_images/failed.png b/docs/source/_images/status/failed.png
similarity index 100%
rename from docs/source/status_images/failed.png
rename to docs/source/_images/status/failed.png
diff --git a/docs/source/status_images/part_failed.png b/docs/source/_images/status/part_failed.png
similarity index 100%
rename from docs/source/status_images/part_failed.png
rename to docs/source/_images/status/part_failed.png
diff --git a/docs/source/status_images/short_table.png b/docs/source/_images/status/short_table.png
similarity index 100%
rename from docs/source/status_images/short_table.png
rename to docs/source/_images/status/short_table.png
diff --git a/docs/source/status_images/success.png b/docs/source/_images/status/success.png
similarity index 100%
rename from docs/source/status_images/success.png
rename to docs/source/_images/status/success.png
diff --git a/docs/source/_templates/layout.html b/docs/source/_templates/layout.html
new file mode 100644
index 00000000..255c0ebc
--- /dev/null
+++ b/docs/source/_templates/layout.html
@@ -0,0 +1,35 @@
+{% extends "!layout.html" %}
+ {% block footer %} {{ super() }}
+
+
+{% endblock %}
+
+
\ No newline at end of file
diff --git a/docs/source/conf.py b/docs/source/conf.py
index 8150715e..87601145 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -19,7 +19,7 @@
# -- Project information -----------------------------------------------------
-project = 'Near-line Data Store'
+project = 'Near-line Data Store Server'
copyright = '2022-2024, Neil Massey & Jack Leland'
author = 'Neil Massey & Jack Leland'
@@ -39,6 +39,7 @@
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
+html_favicon = '_images/icon-black.png'
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
@@ -58,8 +59,8 @@
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
-html_logo = "ceda.png"
+html_logo = "_images/nlds.png"
html_theme_options = {
- 'logo_only': True,
- 'display_version': False,
+ 'logo_only': False,
+ 'display_version': True,
}
\ No newline at end of file
diff --git a/docs/source/coverage/coverage-report.rst b/docs/source/coverage/coverage-report.rst
new file mode 100644
index 00000000..108cc09c
--- /dev/null
+++ b/docs/source/coverage/coverage-report.rst
@@ -0,0 +1,10 @@
+Test coverage report
+====================
+
+Link to the most `recent coverage report `_.
+
+.. toctree::
+ :maxdepth: 1
+ :caption: Coverage report
+
+ Index
\ No newline at end of file
diff --git a/docs/source/coverage/htmlcov/index.rst b/docs/source/coverage/htmlcov/index.rst
new file mode 100644
index 00000000..abe3ed1d
--- /dev/null
+++ b/docs/source/coverage/htmlcov/index.rst
@@ -0,0 +1,2 @@
+htmlcov
+=======
\ No newline at end of file
diff --git a/docs/source/deployment.rst b/docs/source/deployment.rst
index baac7c0a..f3d3fa1f 100644
--- a/docs/source/deployment.rst
+++ b/docs/source/deployment.rst
@@ -47,19 +47,20 @@ three specific roles:
The string after each of these corresponds to the image's location on CEDA's
Harbor registry (and therefore what tag/registry address to use to ``docker
pull`` each of them). As may be obvious, the FastAPI server runs on the
-``Generic Server`` image and contains an installation of asgi, building upon the
-``asgi`` [base-image], to actually run the server. The rest run on the ``Generic
-Consumer`` image, which has an installation of the NLDS repo, along
-with its dependencies, to allow it to run a given consumer. The only dependency
-which isn't included is xrootd as it is a very large and long installation
-process and unnecessary to the running of the non-tape consumers. Therefore the
-``Tape Consumer`` image was created, which appropriately builds upon the
-``Geneic Consumer`` image with an additional installation of ``xrootd`` with
-which to run tape commands. The two tape consumers, ``Archive-Put`` and
-``Archive-Get``, run on containers using this image.
+``Generic Server`` image and contains an installation of ``asgi``, building upon
+the ``asgi`` `base-image `_,
+to actually run the server. The rest run on the ``Generic Consumer`` image,
+which has an installation of the NLDS repo, along with its dependencies, to
+allow it to run a given consumer. The only dependency which isn't included is
+``xrootd`` as it is a very large and long installation process and unnecessary
+to the running of the non-tape consumers. Therefore the ``Tape Consumer`` image
+was created, which appropriately builds upon the ``Generic Consumer`` image with
+an additional installation of ``xrootd`` with which to run tape commands. The
+two tape consumers, ``Archive-Put`` and ``Archive-Get``, run on containers using
+this image.
The two consumer containers run as the user NLDS, which is an official JASMIN
-user at uid=7054096 and is baked into the container (i.e. unconfigurable).
+user at ``uid=7054096`` and is baked into the container (i.e. unconfigurable).
Relatedly, every container runs with config associating the NLDS user with
supplemental groups, the list of which constitutes every group-workspace on
JASMIN. The list was generated with the command::
@@ -67,7 +68,7 @@ JASMIN. The list was generated with the command::
ldapsearch -LLL -x -H ldap://homer.esc.rl.ac.uk -b "ou=ceda,ou=Groups,o=hpc,dc=rl,dc=ac,dc=uk"
This will need to be periodically rerun and the output reformatted to update the
-list of ``supplementalGroups`` in [this config file].
+list of ``supplementalGroups`` in `this config file `_.
Each of the containers will also have specific config and specific deployment
setup to help the container perform its particular task.
@@ -80,8 +81,9 @@ tasks, which some, or all, of the containers make use of to function.
The most commonly used is the ``nslcd`` pod which provides the containers with
up-to-date uid and gid information from the LDAP servers. This directly uses the
-[``nslcd``] image developed for the notebook server, and runs as a side-car in
-every deployed pod to periodically poll the LDAP servers to provide names and
+`nslcd `_
+image developed for the notebook server, and runs as a side-car in every
+deployed pod to periodically poll the LDAP servers to provide names and
permissions information to the main container in the pod (the consumer) so that
file permissions can be handled properly. In other words, it ensures the
``passwd`` file on the consumer container is up to date, and therefore that the
@@ -116,6 +118,30 @@ up to date. This will be discussed in more detail in section
There are some slightly more complex deployment configurations involved in the
rest of the setup, which are described below.
+.. _api_server:
+
+API Server
+----------
+
+The NLDS API server, as mentioned above, was written using FastAPI. In a local
+development environment this is served using ``uvicorn``, but for the production
+deployment the `base-image `_ is used,
+which instead runs the server with ``gunicorn``. They are
+functionally identical so this is not a problem per se, just something to be
+aware of. The NLDS API helm deployment is an extension of the standard `FastAPI helm chart `_.
+
+On production, this API server sits facing the public internet behind an NGINX
+reverse-proxy, handled by the standard `nginx helm chart `_
+in the ``cedaci/helm-charts`` repo. It is served to the domain
+`https://nlds.jasmin.ac.uk `_, with the standard NLDS
+API endpoints extending from that (such as ``/docs``, ``/system/status``). The
+NLDS API also has an additional endpoint (``/probe/healthz``) for the Kubernetes
+liveness probe to periodically ping to ensure the API is alive, and that the
+appropriate party is notified if it goes down. Please note, this is not a
+deployment-specific endpoint and will also exist on any local development
+instances.
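+
+For illustration, a liveness endpoint of this shape is a trivial FastAPI route
+(a sketch only - the real handler may do more than this)::
+
+    from fastapi import FastAPI
+
+    app = FastAPI()
+
+    @app.get("/probe/healthz")
+    async def healthz():
+        # Kubernetes only needs a 2xx response to consider the API alive
+        return {"status": "ok"}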
+
+
.. _tape_keys:
Tape Keys
@@ -135,7 +161,7 @@ environment variables must be created::
The problem arises with the use of Kubernetes, wherein the keytab content string
must be kept secret. This is handled in the CEDA gitlab deployment process
-through the use of git-crypt (see `here `_
+through the use of git-crypt (see `here `__
for more details) to encrypt and Kubernetes secrets to decrypt at deployment
time. Unfortunately permissions can't be set, nor changed, on files made by
Kubernetes secrets, so to get the keytab in the right place with the right
@@ -155,7 +181,7 @@ schema before the database has been migrated, and this is implemented through
two mechanisms in the deployment:
1. An init-container on the catalog, which has the config for both the catalog
- and montioring dbs, which has alembic installed and calls::
+   and monitoring DBs, which has alembic installed and calls::
alembic upgrade head
@@ -178,10 +204,10 @@ the cluster controller) and no means of attaching a more persistent volume to
store logs in long-term.
The, relatively new, solution that exists on the CEDA cluster is the use of
-`fluentd`, and more precisely `fluentbit `_,
+``fluentd``, and more precisely `fluentbit `_,
to aggregate logs from the NLDS logging microservice and send them to a single
-external location running `fluentd` – currently the stats-collection virtual
-machine run on JASMIN. Each log sent to the `fluentd`` service is tagged with a
+external location running ``fluentd`` – currently the stats-collection virtual
+machine run on JASMIN. Each log sent to the ``fluentd`` service is tagged with a
string representing the particular microservice log file it was collected from,
e.g. the logs from the indexer microservice on the staging deployment are tagged
as::
@@ -189,18 +215,18 @@ as::
nlds_staging_index_q_log
This is practically achieved through the use of a sidecar – a further container
-running in teh same pod as the logging container – running the fluentbit image
-as defined by the `fluentbit helm chart `_.
-The full `fluentbit`` config, including the full list of tags, can be found `in
-the logging config yamls `_.
+running in the same pod as the logging container – running the ``fluentbit``
+image as defined by the `fluentbit helm chart `_.
+The full ``fluentbit`` config, including the full list of tags, can be found
+`in the logging config yamls `_.
When received by the fluentd server, each tagged log is collated into a larger
log file for help with debugging at some later date. The log files on the
logging microservice's container are rotated according to size, and so should
not exceed the pod's allocated memory limit.
.. note::
- The `fluentbit` service is still in its infancy and subject to change at
- short notice as the system & helm chart get more widely adopted. For example
+ The ``fluentbit`` service is still in its infancy and subject to change at
+ short notice as the system & helm chart get more widely adopted. For example,
the length of time log files are kept on the stats machine has not been
finalised yet.
@@ -232,7 +258,7 @@ where it is set to 8 and the Transfer-Get where is set to 2.
the size of a ``Rabbit`` queue for a given microservice, and while this is
`in theory` `possible `_,
this was not possible with the current installation of Kubernetes without
- additional plugins, namely `Prometheus`.
+ additional plugins, namely ``Prometheus``.
The other aspect of scaling is the resource requested by each of the pods, which
have current `default values `_
@@ -246,7 +272,7 @@ these were arrived at by using the command::
Ctrl + `
within the kubectl shell on the appropriate rancher cluster (accessible via the
-shell button in the top right, or shortcut |sc|). ``{NLDS_NAMESPACE}``will need
+shell button in the top right, or shortcut |sc|). ``{NLDS_NAMESPACE}`` will need
to be replaced with the appropriate namespace for the cluster you are on, i.e.::
kubectl top pod -n nlds # on wigbiorg
@@ -256,6 +282,47 @@ to be replaced with the appropriate namespace for the cluster you are on, i.e.::
and, as before, these will likely need to be adjusted as understanding of the
actual resource use of each of the microservices evolves.
+.. _chowning:
+
+Changing ownership of files
+---------------------------
+
+A unique problem arose in beta testing where the NLDS was not able to change
+ownership of the files downloaded during a ``get`` to the user that requested them
+from within a container that was not allowed to run as root. As such, a solution
+was required which allowed a very specific set of privileges to be escalated
+without leaving any security vulnerabilities open.
+
+The solution found was to include an additional binary in the
+``Generic Consumer`` image - ``chown_nlds`` - which has the ``setuid``
+permissions bit set and is therefore able to change file ownership. To minimise
+exposed attack surface, the binary was compiled from a `rust script `_
+which allows only the ``chown``-ing of files owned by the NLDS user (on JASMIN
+``uid=7054096``). Additionally, the target must be a file or directory and the
+``uid`` being changed to must be greater than 1024 to avoid clashes with system
+``uid``s. This binary will only execute on any containers where the appropriate
+security context is set, notably::
+
+    securityContext:
+      allowPrivilegeEscalation: true
+      capabilities:
+        add:
+          - CHOWN
+
+which in the NLDS deployment helm chart is only set for the ``Transfer-Get``
+containers/pods.
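+
+The rules described above amount to the following checks, shown here as an
+illustrative Python sketch (the real binary is the compiled rust script, not
+this code)::
+
+    import os
+
+    NLDS_UID = 7054096
+
+    def chown_allowed(path: str, new_uid: int) -> bool:
+        # the target must exist and currently be owned by the NLDS user ...
+        if os.stat(path).st_uid != NLDS_UID:
+            return False
+        # ... and the uid being changed to must avoid system uids
+        return new_uid > 1024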
+
+
+.. _archive_put:
+
+Archive Put Cronjob
+-------------------
+
+The process by which the archive process is started has been automated for this
+deployment, running as a `Kubernetes cronjob `_
+every 12 hours at midnight and midday. The Helm config controlling this can be
+seen `here `_.
+This cronjob will simply call the ``send_archive_next()`` entry point, which
+sends a message directly to the RabbitMQ exchange for routing to the Catalog.
.. _staging:
@@ -291,4 +358,5 @@ everything on this page, this was true at the time of writing (2024-03-06).
- Uses ``nlds-cache-02-o`` tenancy, ``nlds-cache-01-o`` also available
* - API Server
- `https://nlds-master.130.246.130.221.nip.io/ `_ (firewalled)
- - `https://nlds.jasmin.ac.uk/ `_ (public, ssl secured)
\ No newline at end of file
+ - `https://nlds.jasmin.ac.uk/ `_ (public, ssl secured)
+
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 2d02458b..9ca9b1d0 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -3,18 +3,22 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
-Near-line Data Store documentation
-==================================
+Near-line Data Store - Server Documentation
+===========================================
-This is the documentation for the Near-line Data Store (NLDS), a tool developed
-at JASMIN to provide a single interface for disk, object storage and tape.
+This is the documentation for the Near-line Data Store (NLDS) server, a tool
+developed at JASMIN to provide a single interface for disk, object storage and
+tape. We also have separate documentation for the `NLDS client `_.
+
+These docs are split into a user guide, if you plan on simply running the NLDS
+server; a development guide, covering the specifics you will need to know if
+you plan on contributing to the NLDS; and an API Reference.
.. toctree::
:maxdepth: 1
- :caption: Contents
+ :caption: User Guide
Getting started
- Specification document
Using the system status page
The server config file
Server config examples
@@ -25,8 +29,10 @@ at JASMIN to provide a single interface for disk, object storage and tape.
:maxdepth: 1
:caption: Development
+ Specification document
Setting up a CTA tape emulator
Database Migrations with Alembic
+ Test coverage report
.. toctree::
@@ -43,3 +49,20 @@ Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
+
+
+Acknowledgements
+================
+
+NLDS was developed at the Centre for Environmental Data Analysis and supported
+through the ESiWACE2 project. The project ESiWACE2 has received funding from the
+European Union's Horizon 2020 research and innovation programme under grant
+agreement No 823988.
+
+.. image:: _images/esiwace2.png
+ :width: 300
+ :alt: ESiWACE2 Project Logo
+
+.. image:: _images/ceda.png
+ :width: 300
+ :alt: CEDA Logo
\ No newline at end of file
diff --git a/docs/source/server-config/examples.rst b/docs/source/server-config/examples.rst
index 62fc8130..9af9b62e 100644
--- a/docs/source/server-config/examples.rst
+++ b/docs/source/server-config/examples.rst
@@ -19,17 +19,20 @@ machine - likely a laptop or single vm. This file would be saved at
"oauth_token_introspect_url" : "[REDACTED]"
}
},
+ "general": {
+ "retry_delays": [
+ 1, 5, 10, 20, 30, 60, 120, 240, 480
+ ]
+ },
"index_q":{
"logging":{
"enable": true
},
"filelist_threshold": 10000,
"check_permissions_fl": true,
- "use_pwd_gid_fl": true,
+ "max_filesize": 5000000,
"retry_delays": [
- 0,
- 1,
- 2
+ 0, 0, 0
]
},
"nlds_q":{
@@ -43,11 +46,8 @@ machine - likely a laptop or single vm. This file would be saved at
},
"tenancy": "example-tenancy.s3.uk",
"require_secure_fl": false,
- "use_pwd_gid_fl": true,
"retry_delays": [
- 0,
- 1,
- 2
+ 0, 1, 2
]
},
"transfer_get_q":{
@@ -56,12 +56,16 @@ machine - likely a laptop or single vm. This file would be saved at
},
"tenancy": "example-tenancy.s3.uk",
"require_secure_fl": false,
- "use_pwd_gid_fl": true
+ "retry_delays": [
+ 10,
+ 20,
+ 30
+ ]
},
"monitor_q":{
"db_engine": "sqlite",
"db_options": {
- "db_name" : "//Users/jack.leland/nlds/nlds_monitor.db",
+ "db_name" : "//home/nlds/nlds_monitor.db",
"db_user" : "",
"db_passwd" : "",
"echo": false
@@ -86,32 +90,69 @@ machine - likely a laptop or single vm. This file would be saved at
"logs/transfer_put_q.txt",
"logs/transfer_get_q.txt",
"logs/logging_q.txt",
- "logs/api_server.txt"
- ]
+ "logs/api_server.txt",
+ "logs/archive_put_q.txt",
+ "logs/archive_get_q.txt"
+ ],
+ "max_bytes": 33554432,
+ "backup_count": 0
}
},
"catalog_q":{
"db_engine": "sqlite",
"db_options": {
- "db_name" : "//Users/jack.leland/nlds/nlds_catalog.db",
+ "db_name" : "//home/nlds/nlds_catalog.db",
"db_user" : "",
"db_passwd" : "",
"echo": false
},
"retry_delays": [
0,
- 1,
- 2
+ 0,
+ 0
],
"logging":{
"enable": true
+ },
+ "default_tape_url": "root://example-tape.endpoint.uk//eos/ctaeos/cta/nlds",
+      "default_tenancy": "example-tenancy.s3.uk"
+ },
+ "archive_get_q": {
+ "tape_url": "root://example-tape.endpoint.uk//eos/ctaeos/cta/nlds",
+ "tape_pool": "",
+ "chunk_size": 262144,
+ "tenancy": "example-tenancy.s3.uk",
+ "print_tracebacks_fl": false,
+ "check_permissions_fl": false,
+ "require_secure_fl": false,
+ "max_retries": 5,
+ "retry_delays": [0.0, 0.0, 0.0],
+ "logging": {
+ "enable": true
+ }
+ },
+ "archive_put_q": {
+ "query_checksum_fl": true,
+ "tape_url": "root://example-tape.endpoint.uk//eos/ctaeos/cta/nlds",
+ "tape_pool": "",
+ "chunk_size": 262144,
+ "tenancy": "example-tenancy.s3.uk",
+ "print_tracebacks_fl": false,
+ "check_permissions_fl": false,
+ "require_secure_fl": false,
+ "max_retries": 1,
+ "retry_delays": [0.0, 0.0, 0.0],
+ "logging": {
+ "enable": true
}
},
"rabbitMQ": {
- "user": "full_access",
- "password": "passwordletmein123",
- "server": "130.246.3.98",
- "vhost": "delayed-test",
+ "user": "[REDACTED]",
+ "password": "[REDACTED]",
+ "heartbeat": 5,
+ "server": "[REDACTED]",
+ "vhost": "delayed-nlds",
+ "admin_port": 15672,
"exchange": {
"name": "test_exchange",
"type": "topic",
@@ -129,6 +170,10 @@ machine - likely a laptop or single vm. This file would be saved at
"exchange": "test_exchange",
"routing_key": "nlds-api.*.complete"
},
+ {
+ "exchange": "test_exchange",
+ "routing_key": "nlds-api.*.reroute"
+ },
{
"exchange": "test_exchange",
"routing_key": "nlds-api.*.failed"
@@ -175,6 +220,18 @@ machine - likely a laptop or single vm. This file would be saved at
{
"exchange": "test_exchange",
"routing_key": "*.catalog-del.start"
+ },
+ {
+ "exchange": "test_exchange",
+ "routing_key": "*.catalog-archive-next.start"
+ },
+ {
+ "exchange": "test_exchange",
+ "routing_key": "*.catalog-archive-del.start"
+ },
+ {
+ "exchange": "test_exchange",
+ "routing_key": "*.catalog-archive-update.start"
}
]
},
@@ -204,21 +261,154 @@ machine - likely a laptop or single vm. This file would be saved at
"routing_key": "*.log.*"
}
]
+ },
+ {
+ "name": "archive_get_q",
+ "bindings": [
+ {
+ "exchange": "test_exchange",
+ "routing_key": "*.archive-get.start"
+ }
+ ]
+ },
+ {
+ "name": "archive_put_q",
+ "bindings": [
+ {
+ "exchange": "test_exchange",
+ "routing_key": "*.archive-put.start"
+ }
+ ]
}
]
},
"rpc_publisher": {
"queue_exclusivity_fl": true
+ },
+ "cronjob_publisher": {
+ "access_key": "[REDACTED]",
+ "secret_key": "[REDACTED]",
+ "tenancy": "example-tenancy.s3.uk"
}
}
-Note that this is purely an example and doesn't necessarily use all features
-within the NLDS. For example, several individual consumers have ``retry_delays``
-set but not generic ``retry_delays`` is set in the ``general`` section. Note
-also that the jasmin authenication configuration is redacted for security
+
+Note that this is purely illustrative and doesn't necessarily use all features
+within the NLDS - it is provided as a reference for making a new working server
+config. Note also that certain sensitive information is redacted for security
purposes.
Distributed NLDS
----------------
-COMING SOON
\ No newline at end of file
+When making the config for a distributed NLDS, the above would need to be split
+into the appropriate sections for each of the distributed parts being run
+separately, namely by the consumer-specific and publisher-specific sections.
+Each consumer needs the core, required ``authentication`` and ``rabbitMQ``
+sections, optionally the ``logging`` or ``general`` config, and then whatever
+consumer-specific values are necessary to override the defaults.
+
+The following is a breakdown of how it might be achieved:
+
+API-Server
+^^^^^^^^^^
+
+This would only contain the required sections as well as, optionally, any config
+for the ``rpc_publisher``::
+
+ {
+ "authentication": {
+ "authenticator_backend": "jasmin_authenticator",
+ "jasmin_authenticator": {
+ "user_profile_url" : "[REDACTED]",
+ "user_services_url" : "[REDACTED]",
+ "oauth_token_introspect_url" : "[REDACTED]"
+ }
+ },
+ "rabbitMQ": {
+ "user": "[REDACTED]",
+ "password": "[REDACTED]",
+ "heartbeat": 5,
+ "server": "[REDACTED]",
+ "vhost": "nlds_staging",
+ "admin_port": 15672,
+ "exchange": {
+ "name": "nlds",
+ "type": "topic",
+ "delayed": true
+ },
+ "queues": []
+ },
+ "rpc_publisher": {
+ "time_limit": 60
+ }
+ }
+
+NLDS Worker
+^^^^^^^^^^^
+
+This, again, contains the required sections, as well as consumer-specific config
+for the NLDS-Worker. In this case the additional info would be enabling the
+logging at ``debug`` level and defining the bindings (routing keys) for the
+consumer's queue.
+
+.. code-block:: json
+
+ {
+ "authentication": {
+ "authenticator_backend": "jasmin_authenticator",
+ "jasmin_authenticator": {
+ "user_profile_url" : "[REDACTED]",
+ "user_services_url" : "[REDACTED]",
+ "oauth_token_introspect_url" : "[REDACTED]"
+ }
+ },
+ "rabbitMQ": {
+ "user": "[REDACTED]",
+ "password": "[REDACTED]",
+ "heartbeat": 5,
+ "server": "[REDACTED]",
+ "vhost": "nlds_staging",
+ "admin_port": 15672,
+ "exchange": {
+ "name": "nlds",
+ "type": "topic",
+ "delayed": true
+ },
+ "queues": [
+ {
+ "name": "nlds_q",
+ "bindings": [
+ {
+ "exchange": "nlds",
+ "routing_key": "nlds-api.route.*"
+ },
+ {
+ "exchange": "nlds",
+ "routing_key": "nlds-api.*.complete"
+ },
+ {
+ "exchange": "nlds",
+ "routing_key": "nlds-api.*.failed"
+ }
+ ]
+ }
+ ]
+ },
+ "logging": {
+ "log_level": "debug"
+ },
+ "nlds_q": {
+ "logging": {
+ "enable": true
+ }
+ }
+ }
+
+Every other consumer would be populated similarly.
+
+.. note::
+ In the production deployment of NLDS, this is practically achieved through
+ ``helm`` and the combination of different yaml config files. Please see the
+ :doc:`../deployment` documentation for more details on the practicalities of
+ deploying the NLDS.
diff --git a/docs/source/server-config/server-config.rst b/docs/source/server-config/server-config.rst
index f04ff033..b29a6453 100644
--- a/docs/source/server-config/server-config.rst
+++ b/docs/source/server-config/server-config.rst
@@ -9,8 +9,10 @@ server_config in the templates section of the main nlds package
demystify the configuration needed for (a) a local development copy of the nlds,
and (b) a production system spread across several pods/virtual machines.
-*Please note that the NLDS is in active development and all of this is subject
-to change with no notice.*
+.. note::
+ Please note that the NLDS is still being developed and so the following is
+ subject to change in future versions.
+
Required sections
-----------------
@@ -53,6 +55,7 @@ brokering system. The following is an outline of what is required::
"rabbitMQ": {
"user": "{{ rabbit_user }}",
"password": "{{ rabbit_password }}",
+ "heartbeat": "{{ rabbit_heartbeat }}",
"server": "{{ rabbit_server }}",
"vhost": "{{ rabbit_vhost }}",
"exchange": {
@@ -75,33 +78,36 @@ brokering system. The following is an outline of what is required::
Here the ``user`` and ``password`` fields refer to the username and password for
the rabbit server you wish to connect to, which is in turn specified with
-``server``. ``vhost`` is similarly the virtual host on the rabbit server that
-you wish to connect to.
+``server``. ``vhost`` is similarly the `virtualhost` on the rabbit server that
+you wish to connect to. ``heartbeat`` is a recent addition which determines the
+`heartbeats` on the BlockingConnection that the NLDS makes with the rabbit
+server. This essentially puts a hard limit on how long a connection can remain
+unresponsive before it is killed by the server; see `the rabbit docs `_
+for more details.
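+
+As an illustrative sketch (assumed wiring, not the NLDS source), a
+``heartbeat`` value of this kind maps onto ``pika``'s connection parameters::
+
+    import pika
+
+    # heartbeat is the interval, in seconds, of the AMQP heartbeat frames
+    params = pika.ConnectionParameters(
+        host="rabbit_server",
+        virtual_host="rabbit_vhost",
+        heartbeat=5,
+        credentials=pika.PlainCredentials("rabbit_user", "rabbit_password"),
+    )
+    connection = pika.BlockingConnection(params)
+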
The next two dictionaries are context specific. All publishing elements of the
NLDS, i.e. parts that will send messages, will require an exchange to publish
messages to. ``exchange`` determines that exchange, with three required
subfields: ``name``, ``type``, and ``delayed``. The former two are self
-descriptive, they should just be the name of the exchange on the virtualhost and
-it's corresponding type e.g. one of fanout, direct or topic. ``delay`` is a
+descriptive, they should just be the name of the exchange on the `virtualhost`
+and its corresponding type, e.g. one of fanout, direct or topic. ``delayed`` is a
boolean (``true`` or ``false`` in json-speak) dictating whether to use the
delay functionality utilised within the NLDS. Note that this requires the rabbit
server have the DelayedRabbitExchange plugin installed.
-Exchanges can be declared and created if not present on the virtual host the
-first time the NLDS is run, virtualhosts cannot and so will have to be created
+Exchanges can be declared and created if not present on the `virtualhost` the
+first time the NLDS is run; `virtualhosts` cannot, and so will have to be created
beforehand manually on the server or through the admin interface. If an exchange
is requested but incorrect information given about either its `type` or
`delayed` status, then the NLDS will throw an error.
``queues`` is a list of queue dictionaries and must be implemented on consumers,
i.e. message processors, to tell ``pika`` where to take messages from. Each
-queue dictionary consists of a ``name`` and a list of `bindings`, with each
+queue dictionary consists of a ``name`` and a list of ``bindings``, with each
``binding`` being a dictionary containing the name of the ``exchange`` the queue
takes messages from, and the routing key that a message must have to be accepted
onto the queue. For more information on exchanges, routing keys, and other
-RabbitMQ features, please see [Rabbit's excellent documentation]
-(https://www.rabbitmq.com/tutorials/tutorial-five-python.html).
+RabbitMQ features, please see `Rabbit's excellent documentation `_.
Generic optional sections
@@ -117,13 +123,14 @@ Logging
The logging configuration options look like the following::
"logging": {
- "enable": boolean
+ "enable": boolean,
"log_level": str - ("none" | "debug" | "info" | "warning" | "error" | "critical"),
"log_format": str - see python logging docs for details,
"add_stdout_fl": boolean,
"stdout_log_level": str - ("none" | "debug" | "info" | "warning" | "error" | "critical"),
"log_files": List[str],
- "rollover": str - see python logging docs for details
+ "max_bytes": int,
+ "backup_count": int
}
These all set default options for the native python logging system, with
@@ -137,17 +144,17 @@ to be different from the default log level.
``log_files`` is a list of strings describing the path or paths to log files
being written to. If no log files paths are given then no file logging will be
-done. If active, the file logging will be done with a TimedRotatingFileHandler,
-i.e. the files will be rotated on a rolling basis, with the rollover time
-denoted by the ``rollover`` option, which is a time string similar to that found
-in crontab. Please see the [python logging docs]
-(https://docs.python.org/3/library/logging.handlers.html#logging.handlers.TimedRotatingFileHandler)
-for more info on this.
+done. If active, the file logging will be done with a RotatingFileHandler, i.e.
+the files will be rotated when they reach a certain size. The threshold size is
+determined by ``max_bytes``, and the maximum number of files which are kept
+after rotation is controlled by ``backup_count``, both integers. For more
+information on this please refer to the `python logging docs `_.
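+
+For illustration, ``max_bytes`` and ``backup_count`` correspond to the
+``maxBytes`` and ``backupCount`` arguments of the standard-library handler
+(values below taken from the example config, wiring assumed)::
+
+    import logging
+    from logging.handlers import RotatingFileHandler
+
+    # rotate each log at 32 MiB, keeping no rotated backups
+    handler = RotatingFileHandler(
+        "logs/nlds_q.txt", maxBytes=33554432, backupCount=0
+    )
+    logging.getLogger("nlds").addHandler(handler)
+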
As stated, these all set the default log options for all publishers and
consumers within the NLDS - these can be overridden on a consumer-specific basis
by inserting a ``logging`` sub-dictionary into a consumer-specific optional
-section.
+section. Each sub-dictionary has identical configuration options to those listed
+above.
General
^^^^^^^
@@ -181,8 +188,8 @@ The server config section is ``nlds_q``, and the following options are available
"nlds_q": {
"logging": [standard_logging_dictionary],
- "retry_delays": List[int]
- "print_tracebacks_fl": boolean,
+ "retry_delays": List[int],
+ "print_tracebacks_fl": boolean
}
Not much specifically happens in the NLDS worker that requires configuration, so
@@ -207,7 +214,7 @@ Server config section is ``index_q``, and the following options are available::
"max_retries": int,
"check_permissions_fl": boolean,
"check_filesize_fl": boolean,
- "use_pwd_gid_fl": boolean
+ "max_filesize": int
}
where ``logging``, ``retry_delays``, and ``print_tracebacks_fl`` are, as above,
@@ -237,14 +244,11 @@ list.
``check_permissions_fl`` and ``check_filesize_fl`` are commonly used boolean
flags to control whether the indexer checks the permissions and filesize of
-files respectively during the indexing step.
-
-``use_pwd_gid_fl`` is a final boolean flag which controls how permissions
-checking goes about getting the gid to check group permissions against. If True,
-it will _just_ use the gid found in the ``pwd`` table on whichever machine the
-indexer is running on. If false, then this gid is used `as well as` all of those
-found using the ``os.groups`` command - which will read all groups found on the
-machine the indexer is running on.
+files respectively during the indexing step. If the filesize is being checked,
+``max_filesize`` determines the maximum filesize, in bytes, of an individual
+file which can be added to any given holding. This defaults to ``500GB``, but is
+typically determined by the size of the cache in front of the tape, which for
+the STFC CTA instance is ``500GB`` (hence the default value).
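+
+Illustratively (logic assumed; the ``500GB`` default taken as decimal bytes),
+the filesize check amounts to something like::
+
+    import os
+
+    MAX_FILESIZE = 500 * 10**9  # 500 GB default, in bytes
+
+    def file_too_large(path: str) -> bool:
+        # files over the threshold cannot be added to a given holding
+        return os.path.getsize(path) > MAX_FILESIZE
+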
Cataloguer
@@ -256,6 +260,7 @@ The server config entry for the catalog consumer is as follows::
"logging": {standard_logging_dictionary},
"retry_delays": List[int],
"print_tracebacks_fl": boolean,
+ "max_retries": int,
"db_engine": str,
"db_options": {
"db_name" : str,
@@ -263,12 +268,13 @@ The server config entry for the catalog consumer is as follows::
"db_passwd" : str,
"echo": boolean
},
- "max_retries": int
+      "default_tenancy": str,
+      "default_tape_url": str
}
where ``logging``, ``retry_delays``, and ``print_tracebacks_fl`` are, as above,
standard configurables within the NLDS consumer ecosystem. ``max_retries`` is
-similarly available in the cataloguer, with the same meaning as above.
+similarly available in the cataloguer, with the same meaning as defined above.
Here we also have two keys which control database behaviour via SQLAlchemy:
``db_engine`` and ``db_options``. ``db_engine`` is a string which specifies
@@ -281,6 +287,12 @@ your chosen flavour of database), along with the database username and password
``db_password``. Finally in this sub-dictionary ``echo``, an optional
boolean flag which controls the auto-logging of the SQLAlchemy engine.
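+
+As a rough sketch (assumed logic, not the NLDS source), ``db_engine`` and
+``db_options`` combine into an SQLAlchemy connection string::
+
+    from sqlalchemy import create_engine
+
+    db_engine = "sqlite"
+    db_options = {"db_name": "//home/nlds/nlds_catalog.db", "echo": False}
+
+    # for sqlite this yields "sqlite:////home/nlds/nlds_catalog.db";
+    # other engines would also splice in db_user and db_passwd
+    engine = create_engine(f"{db_engine}://{db_options['db_name']}",
+                           echo=db_options["echo"])
+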
+Finally ``default_tenancy`` and ``default_tape_url`` are the default values to
+place into the Catalog for a new Location's ``tenancy`` and ``tape_url`` values
+if not explicitly defined before reaching the catalog. This will happen if the
+user, for example, does not define a tenancy in their client-config.
+
+.. _transfer_put_get:
Transfer-put and Transfer-get
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -294,21 +306,35 @@ The server entry for the transfer-put consumer is as follows::
"print_tracebacks_fl": boolean,
"filelist_max_length": int,
"check_permissions_fl": boolean,
- "use_pwd_gid_fl": boolean,
- "tenancy": "cedadev-o.s3.jc.rl.ac.uk",
+ "tenancy": str,
"require_secure_fl": false
}
where we have ``logging``, ``retry_delays`` and ``print_tracebacks_fl`` as their
standard definitions defined above, and ``max_retries``, ``filelist_max_length``
-, ``check_permissions_fl``, and ``use_pwd_gid_fl`` defined the same as for the
-Indexer consumer.
+, and ``check_permissions_fl`` defined the same as for the Indexer consumer.
New definitions for the transfer processor are the ``tenancy`` and
``require_secure_fl``, which control ``minio`` behaviour. ``tenancy`` is a
string which denotes the address of the object store tenancy to upload/download
-files to/from, and ``require_secure_fl`` which specifies whether or not you
-require signed ssl certificates at the tenancy location.
+files to/from (e.g. ``_), and ``require_secure_fl``
+which specifies whether or not you require signed ssl certificates at the
+tenancy location.
+
+The transfer-get consumer is identical except for the addition of config
+controlling the change-ownership functionality on downloaded files – see
+:ref:`chowning` for details on why this is necessary. The additional config is
+as follows::
+
+ "transfer_get_q": {
+ ...
+ "chown_fl": boolean,
+ "chown_cmd": str
+ }
+
+where ``chown_fl`` is a boolean flag to specify whether to attempt to ``chown``
+files back to the requesting user, and ``chown_cmd`` is the name of the
+executable to use to ``chown`` said file.
Monitor
@@ -338,7 +364,7 @@ messages which failed due to an unexpected exception.
Logger
^^^^^^
-And finally, the server config entry for the Logger consumer is as follows::
+The server config entry for the Logger consumer is as follows::
"logging_q": {
"logging": {standard_logging_dictionary},
@@ -349,4 +375,105 @@ where the options have been previously defined. Note that there is no special
configurable behaviour on the Logger consumer as it is simply a relay for
redirecting logging messages into log files. It should also be noted that the
``log_files`` option should be set in the logging sub-dictionary for this to
-work properly, which may be a mandatory setting in future versions.
\ No newline at end of file
+work properly, which may be a mandatory setting in future versions.
+
+.. _archive_put_get:
+
+Archive-Put and Archive-Get
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Finally, the server config entry for the archive-put consumer is as follows::
+
+ "archive_put_q": {
+        "logging": {standard_logging_dictionary},
+ "max_retries": int,
+ "retry_delays": List[int],
+ "print_tracebacks_fl": boolean,
+ "tenancy": str,
+ "check_permissions_fl": boolean,
+ "require_secure_fl": boolean,
+ "tape_url": str,
+ "tape_pool": str,
+ "query_checksum_fl": boolean,
+ "chunk_size": int
+ }
+
+which is a combination of standard configuration, object-store configuration and
+as-yet-unseen tape configuration. Firstly, we have the standard options
+``logging``, ``max_retries``, ``retry_delays``, and ``print_tracebacks_fl``,
+which we have defined above. Then we have the object-store configuration options
+which we saw previously in the :ref:`transfer_put_get` consumer config, and have
+the same definitions.
+
+The latter four options control tape configuration, with ``tape_url`` and
+``tape_pool`` defining the ``xrootd`` url and tape pool onto which to attempt to
+put files - note that these two values are combined together into a single
+``tape_path`` in the archiver. The next option, ``query_checksum_fl``, is a
+boolean flag to control whether the ADLER32 checksum calculated during streaming
+is used to check file integrity at the end of a write. Finally ``chunk_size`` is
+the size, in bytes, to chunk the stream into when writing into or reading from
+the CTA cache. This defaults to 5 MiB as this is the lower limit for
+``part_size`` when uploading back to object-store during an archive-get, but has
+not been properly benchmarked or optimised yet.
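+
+A minimal sketch of the chunked-streaming-with-checksum idea (standard library
+only; names and wiring assumed, not the NLDS implementation)::
+
+    import zlib
+
+    def stream_with_adler32(fileobj, chunk_size=5 * 1024 * 1024):
+        checksum = 1  # zlib.adler32's initial value
+        while True:
+            chunk = fileobj.read(chunk_size)
+            if not chunk:
+                break
+            checksum = zlib.adler32(chunk, checksum)
+            yield chunk
+        # the final checksum can then be compared against the value
+        # reported by the tape system at the end of the write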
+
+Note that the above options have been listed for the archive-put consumer but are
+shared by the archive-get consumer. The archive-get does have one additional
+config option::
+
+ "archive_get_q": {
+ ...
+ "prepare_requeue": int
+ }
+
+where ``prepare_requeue`` is the prepare-requeue delay, i.e. the delay, in
+milliseconds, before an archive recall message is requeued after a negative
+read-preparedness query. This defaults to 30 seconds.
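+
+For illustration, 30 seconds corresponds to a config value of ``30000``; with
+the delayed exchange plugin described in the required sections, such a delay is
+conventionally attached per-message via an ``x-delay`` header (a sketch
+assuming ``pika``, not the NLDS code)::
+
+    import pika
+
+    # requeue the recall message with a 30000 ms delay
+    properties = pika.BasicProperties(headers={"x-delay": 30000})
+    # channel.basic_publish(exchange, routing_key, body, properties=properties)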
+
+
+Publisher-specific optional sections
+------------------------------------
+
+There are two, non-consumer, elements to the NLDS which can optionally be
+configured, listed below.
+
+RPC Publisher
+^^^^^^^^^^^^^
+
+The Remote Procedure Call (RPC) Publisher, the specific rabbit publisher which
+sits inside the API server and makes RPCs to the databases for quick metadata
+access from the client, has its own small config section::
+
+ "rpc_publisher": {
+ "time_limit": int,
+ "queue_exclusivity_fl": boolean
+ }
+
+where ``time_limit`` is the number of seconds the publisher waits before
+declaring the RPC timed out and the receiving consumer non-responsive, and
+``queue_exclusivity_fl`` controls whether the queue declared by the publisher is
+exclusive to the publisher. These values default to ``30`` seconds and ``True``
+respectively.
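+
+By way of illustration only (placeholder names, not the NLDS code), a
+``time_limit`` like this is typically enforced as a polling deadline::
+
+    import time
+
+    def wait_for_rpc_reply(poll_response, time_limit=30):
+        # poll_response() is a hypothetical callable returning the reply,
+        # or None if nothing has arrived yet
+        deadline = time.monotonic() + time_limit
+        while time.monotonic() < deadline:
+            reply = poll_response()
+            if reply is not None:
+                return reply
+        raise TimeoutError("RPC timed out; consumer presumed non-responsive")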
+
+
+Cronjob Publisher
+^^^^^^^^^^^^^^^^^
+
+The Archive-Put process, as described in :ref:`archive_put`, is periodically
+initiated by a cronjob which sends a message to the catalog to get the next,
+unarchived holding. This requires a small amount of configuration in order to
+(a) get access to the object store, (b) change the default ``tenancy`` or
+``tape_url``, if necessary. As such the allowed config options look like::
+
+ "cronjob_publisher": {
+ "access_key": str,
+ "secret_key": str,
+ "tenancy": str,
+ "tape_url": str
+ }
+
+where ``tape_url`` is identical to that specified in :ref:`archive_put_get`, and
+``access_key``, ``secret_key`` and ``tenancy`` are specified as in the
+`client config `_,
+referring to the object-store tenancy located at ``tenancy`` and the
+``access_key`` and ``secret_key`` required for accessing it. In practice only
+the ``access_key`` and ``secret_key`` are specified during deployment.
\ No newline at end of file
diff --git a/docs/source/system-status.rst b/docs/source/system-status.rst
index bc944dd5..147dca5b 100644
--- a/docs/source/system-status.rst
+++ b/docs/source/system-status.rst
@@ -61,7 +61,7 @@ e.g.:
this link does actually work even though it looks very confusing; it will set the time limit to 2 and open a table with catalog, monitor, index and logger rows
and would look like this:
-.. image:: status_images/short_table.png
+.. image:: _images/status/short_table.png
:width: 400
:alt: short table
@@ -153,26 +153,26 @@ representations of what the whole table will look like.
When no consumers are running, the info bar is blue, and the status text is red.
-.. image:: status_images/all_off.png
+.. image:: _images/status/all_off.png
:width: 600
:alt: All consumers off
|
When all consumers inside a microservice are offline the info bar is red as well as the status column text for the offline microservice. The working microservices status text is green.
-.. image:: status_images/failed.png
+.. image:: _images/status/failed.png
:width: 600
:alt: A consumer failed
|
When some consumers inside a microservice are offline the info bar is red
the partially failed microservice's status text is orange.
-.. image:: status_images/part_failed.png
+.. image:: _images/status/part_failed.png
:width: 600
:alt: some consumers failed
|
When all consumers are online the info bar is green, there is nothing in the failed consumer column and all status text is green.
-.. image:: status_images/success.png
+.. image:: _images/status/success.png
:width: 600
:alt: All consumers on
diff --git a/nlds/templates/server_config.j2 b/nlds/templates/server_config.j2
index 90420e4b..98896f04 100644
--- a/nlds/templates/server_config.j2
+++ b/nlds/templates/server_config.j2
@@ -10,6 +10,7 @@
"rabbitMQ": {
"user": "{{ rabbit_user }}",
"password": "{{ rabbit_password }}",
+ "heartbeat": "{{ rabbit_heartbeat }}",
"server": "{{ rabbit_server }}",
"admin_port": "{{ rabbit_port }}",
"vhost": "{{ rabbit_vhost }}",
@@ -29,5 +30,126 @@
]
}
]
+ },
+ "logging": {
+ "enable": "{{ logging_enable }}",
+ "log_level": "{{ logging_log_level }}",
+ "log_format": "{{ logging_log_format }}",
+ "add_stdout_fl": "{{ logging_add_stdout_fl }}",
+ "stdout_log_level": "{{ logging_stdout_log_level }}",
+ "log_files": "{{ logging_log_files }}",
+ "max_bytes": "{{ logging_max_bytes }}",
+ "backup_count": "{{ logging_backup_count }}"
+ },
+ "general": {
+ "retry_delays": "{{ general_retry }}"
+ },
+ "nlds_q": {
+ "logging": "{{ nlds_q_logging_dict }}",
+ "max_retries": "{{ nlds_q_max_retries }}",
+ "retry_delays": "{{ nlds_q_retry_delays }}",
+ "print_tracebacks_fl": "{{ nlds_q_print_tracebacks_fl }}"
+ },
+ "index_q": {
+ "logging": "{{ index_q_logging_dict }}",
+ "max_retries": "{{ index_q_max_retries }}",
+ "retry_delays": "{{ index_q_retry_delays }}",
+ "print_tracebacks_fl": "{{ index_q_print_tracebacks_fl }}",
+ "filelist_max_length": "{{ index_q_filelist_max_length }}",
+ "message_threshold": "{{ index_q_message_threshold }}",
+ "check_permissions_fl": "{{ index_q_check_permissions_fl }}",
+ "check_filesize_fl": "{{ index_q_check_filesize_fl }}",
+ "max_filesize": "{{ index_q_max_filesize }}"
+ },
+ "catalog_q": {
+ "logging": "{{ catalog_q_logging_dict }}",
+ "max_retries": "{{ catalog_q_max_retries }}",
+ "retry_delays": "{{ catalog_q_retry_delays }}",
+ "print_tracebacks_fl": "{{ catalog_q_print_tracebacks_fl }}",
+ "db_engine": "{{ catalog_q_db_engine }}",
+ "db_options": {
+ "db_name" : "{{ catalog_q_db_options_db_name }}",
+ "db_user" : "{{ catalog_q_db_options_db_user }}",
+ "db_passwd" : "{{ catalog_q_db_options_db_passwd }}",
+ "echo": "{{ catalog_q_db_options_echo }}"
+ },
+ "default_tenancy": "{{ catalog_q_default_tenancy }}",
+ "default_tape_url": "{{ catalog_q_default_tape_url }}"
+ },
+ "monitor_q": {
+ "logging": "{{ monitor_q_logging_dict }}",
+ "max_retries": "{{ monitor_q_max_retries }}",
+ "retry_delays": "{{ monitor_q_retry_delays }}",
+ "print_tracebacks_fl": "{{ monitor_q_print_tracebacks_fl }}",
+ "db_engine": "{{ monitor_q_db_engine }}",
+ "db_options": {
+ "db_name" : "{{ monitor_q_db_options_db_name }}",
+ "db_user" : "{{ monitor_q_db_options_db_user }}",
+ "db_passwd" : "{{ monitor_q_db_options_db_passwd }}",
+ "echo": "{{ monitor_q_db_options_echo }}"
+ }
+ },
+ "transfer_put_q": {
+ "logging": "{{ transfer_put_q_logging_dict }}",
+ "max_retries": "{{ transfer_put_q_max_retries }}",
+ "retry_delays": "{{ transfer_put_q_retry_delays }}",
+ "print_tracebacks_fl": "{{ transfer_put_q_print_tracebacks_fl }}",
+ "filelist_max_length": "{{ transfer_put_q_filelist_max_length }}",
+ "check_permissions_fl": "{{ transfer_put_q_check_permissions_fl }}",
+ "tenancy": "{{ transfer_put_q_tenancy }}",
+ "require_secure_fl": "{{ transfer_put_q_require_secure_fl }}"
+ },
+ "transfer_get_q": {
+        "logging": "{{ transfer_get_q_logging_dict }}",
+        "max_retries": "{{ transfer_get_q_max_retries }}",
+        "retry_delays": "{{ transfer_get_q_retry_delays }}",
+        "print_tracebacks_fl": "{{ transfer_get_q_print_tracebacks_fl }}",
+        "filelist_max_length": "{{ transfer_get_q_filelist_max_length }}",
+        "check_permissions_fl": "{{ transfer_get_q_check_permissions_fl }}",
+        "tenancy": "{{ transfer_get_q_tenancy }}",
+        "require_secure_fl": "{{ transfer_get_q_require_secure_fl }}",
+        "chown_fl": "{{ transfer_get_q_chown_fl }}",
+        "chown_cmd": "{{ transfer_get_q_chown_cmd }}"
+ },
+ "logging_q": {
+ "logging": "{{ logging_q_logging_dict }}",
+ "print_tracebacks_fl": "{{ logging_q_print_tracebacks_fl }}"
+ },
+ "archive_put_q": {
+ "logging": "{{ archive_put_q_logging_dict }}",
+ "max_retries": "{{ archive_put_q_max_retries }}",
+ "retry_delays": "{{ archive_put_q_retry_delays }}",
+ "print_tracebacks_fl": "{{ archive_put_q_print_tracebacks_fl }}",
+ "tenancy": "{{ archive_put_q_tenancy }}",
+ "check_permissions_fl": "{{ archive_put_q_check_permissions_fl }}",
+ "require_secure_fl": "{{ archive_put_q_require_secure_fl }}",
+ "tape_url": "{{ archive_put_q_tape_url }}",
+ "tape_pool": "{{ archive_put_q_tape_pool }}",
+ "query_checksum_fl": "{{ archive_put_q_query_checksum_fl }}",
+ "chunk_size": "{{ archive_put_q_chunk_size }}"
+ },
+ "archive_get_q": {
+ "logging": "{{ archive_get_q_logging_dict }}",
+ "max_retries": "{{ archive_get_q_max_retries }}",
+ "retry_delays": "{{ archive_get_q_retry_delays }}",
+ "print_tracebacks_fl": "{{ archive_get_q_print_tracebacks_fl }}",
+ "tenancy": "{{ archive_get_q_tenancy }}",
+ "check_permissions_fl": "{{ archive_get_q_check_permissions_fl }}",
+ "require_secure_fl": "{{ archive_get_q_require_secure_fl }}",
+ "tape_url": "{{ archive_get_q_tape_url }}",
+ "tape_pool": "{{ archive_get_q_tape_pool }}",
+ "query_checksum_fl": "{{ archive_get_q_query_checksum_fl }}",
+ "chunk_size": "{{ archive_get_q_chunk_size }}",
+ "prepare_requeue": "{{ archive_get_q_prepare_requeue }}"
+ },
+ "rpc_publisher": {
+ "time_limit": "{{ rpc_publisher_time_limit }}",
+ "queue_exclusivity_fl": "{{ rpc_publisher_queue_exclusivity_fl }}"
+ },
+ "cronjob_publisher": {
+ "access_key": "{{ cronjob_publisher_access_key }}",
+ "secret_key": "{{ cronjob_publisher_secret_key }}",
+ "tenancy": "{{ cronjob_publisher_tenancy }}",
+ "tape_url": "{{ cronjob_publisher_tape_url }}"
}
}
diff --git a/tests/conftest.py b/tests/conftest.py
index 4ded5987..08fe250d 100644
--- a/tests/conftest.py
+++ b/tests/conftest.py
@@ -11,13 +11,14 @@
TEMPLATE_CONFIG_PATH = os.path.join(os.path.dirname(__file__),
- 'server-config.json')
+ 'req-config-template.json')
@pytest.fixture
def template_config():
config_path = TEMPLATE_CONFIG_PATH
- fh = open(config_path)
- return json.load(fh)
+ with open(config_path) as fh:
+ config_dict = json.load(fh)
+ return config_dict
@pytest.fixture
def test_uuid():
diff --git a/tests/nlds/rabbit/test_publisher.py b/tests/nlds/rabbit/test_publisher.py
index 46ded5bb..0aa31ff8 100644
--- a/tests/nlds/rabbit/test_publisher.py
+++ b/tests/nlds/rabbit/test_publisher.py
@@ -7,9 +7,12 @@
from nlds.rabbit import publisher as publ
from nlds.rabbit.publisher import RabbitMQPublisher
-from nlds.server_config import LOGGING_CONFIG_ENABLE, LOGGING_CONFIG_FORMAT, \
- LOGGING_CONFIG_LEVEL, LOGGING_CONFIG_SECTION, \
- LOGGING_CONFIG_STDOUT, LOGGING_CONFIG_STDOUT_LEVEL
+from nlds.server_config import (
+ LOGGING_CONFIG_FORMAT,
+ LOGGING_CONFIG_LEVEL,
+ LOGGING_CONFIG_STDOUT,
+ LOGGING_CONFIG_STDOUT_LEVEL,
+)
def mock_load_config(template_config):
return template_config
@@ -62,8 +65,11 @@ def test_publish_message(default_publisher):
# Attempting to establish a connection with the template config should also
# fail with a socket error
- with pytest.raises(gaierror):
- default_publisher.get_connection()
+ # NOTE: (2024-03-12) Commented this out as the new daemon thread logic and
+ # perma-retries keep the connection from dying. No longer a useful test.
+ # TODO: rewrite but force close it?
+ # with pytest.raises(gaierror):
+ # default_publisher.get_connection()
# TODO: Make mock connection object and send messages through it?
diff --git a/tests/req-config-template.json b/tests/req-config-template.json
new file mode 100644
index 00000000..d6219ee8
--- /dev/null
+++ b/tests/req-config-template.json
@@ -0,0 +1,34 @@
+{
+ "authentication" : {
+ "authenticator_backend" : "jasmin_authenticator",
+ "jasmin_authenticator" : {
+ "user_profile_url" : "{{ user_profile_url }}",
+ "user_services_url" : "{{ user_services_url }}",
+ "oauth_token_introspect_url" : "{{ token_introspect_url }}"
+ }
+ },
+ "rabbitMQ": {
+ "user": "{{ rabbit_user }}",
+ "password": "{{ rabbit_password }}",
+ "heartbeat": "{{ rabbit_heartbeat }}",
+ "server": "{{ rabbit_server }}",
+ "admin_port": "{{ rabbit_port }}",
+ "vhost": "{{ rabbit_vhost }}",
+ "exchange": {
+ "name": "{{ rabbit_exchange_name }}",
+ "type": "{{ rabbit_exchange_type }}",
+ "delayed": "{{ rabbit_exchange_delayed }}"
+ },
+ "queues": [
+ {
+ "name": "{{ rabbit_queue_name }}",
+ "bindings": [
+ {
+ "exchange": "{{ rabbit_exchange_name }}",
+ "routing_key": "{{ rabbit_queue_routing_key }}"
+ }
+ ]
+ }
+ ]
+ }
+}
\ No newline at end of file
diff --git a/tests/server-config.json b/tests/server-config.json
deleted file mode 120000
index f04e096c..00000000
--- a/tests/server-config.json
+++ /dev/null
@@ -1 +0,0 @@
-../nlds/templates/server_config.j2
\ No newline at end of file