OAProc: handle binary data when response: document (#1285)
tomkralidis committed Jul 24, 2024
1 parent 0281732 commit 1aa3c46
Showing 3 changed files with 59 additions and 8 deletions.
59 changes: 53 additions & 6 deletions docs/source/data-publishing/ogcapi-processes.rst
@@ -14,6 +14,8 @@ The pygeoapi offers two processes: a default ``hello-world`` process which allow
Configuration
-------------

The below configuration is an example of a process defined within the pygeoapi internal plugin registry:

.. code-block:: yaml
processes:
@@ -23,6 +25,38 @@ Configuration
processor:
name: HelloWorld
The below configuration is an example of a process defined as part of a custom Python process:

.. code-block:: yaml

   processes:
       # enabled by default
       hello-world:
           processor:
               # refer to a process in the standard PYTHONPATH
               # e.g. my_package/my_module/my_file.py (class MyProcess)
               # the MyProcess class must subclass from pygeoapi.process.base.BaseProcessor
               name: my_package.my_module.my_file.MyProcess

See :ref:`example-custom-pygeoapi-processing-plugin` for processing plugin examples

Processing and response handling
--------------------------------

pygeoapi processing plugins are required to return a tuple of media type and native outputs. Multipart
responses are not supported at this time, and it is up to the process plugin implementor to return a single
payload defining multiple artifacts (or references to them).
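
The return contract described above can be illustrated with a minimal sketch. ``EchoProcessor`` and its inputs are hypothetical names chosen for illustration; a real plugin would subclass ``pygeoapi.process.base.BaseProcessor``, which is left out here so that the example runs without pygeoapi installed.

```python
from typing import Any, Tuple


class EchoProcessor:
    """Stand-in for a processing plugin (hypothetical, for illustration).

    A real plugin would subclass pygeoapi.process.base.BaseProcessor;
    the base class is omitted so this sketch has no dependencies.
    """

    def execute(self, data: dict) -> Tuple[str, Any]:
        # Build the native output of the process
        name = data.get('name', 'world')
        outputs = {'id': 'echo', 'value': f'Hello {name}!'}

        # Return one (media type, payload) tuple; several artifacts
        # would have to be packed into this single payload
        return 'application/json', outputs


mimetype, outputs = EchoProcessor().execute({'name': 'pygeoapi'})
print(mimetype, outputs)
```

Returning the media type alongside the payload lets the server set the response content type without inspecting the payload itself.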

By default (or via the OGC API - Processes ``response: raw`` execution parameter), pygeoapi provides
processing responses in their native encoding and media type, as defined by a given
plugin (which needs to set the response content type and payload accordingly).

pygeoapi also supports a JSON-based response type (via the OGC API - Processes ``response: document``
execution parameter). When this mode is requested, the response will always be a JSON encoding embedding
the resulting payload (which may be base64 encoded for binary data, for example).
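
As a rough sketch of the ``response: document`` idea, the helper below embeds a payload in a JSON document and base64 encodes it when it is binary. The function name ``to_document`` and the ``value`` key are assumptions made for illustration; they are not taken from pygeoapi, whose own serialization may differ.

```python
import base64
import json
from typing import Any


def to_document(payload: Any) -> str:
    """Embed a process payload in a JSON document (hypothetical helper)."""
    if isinstance(payload, bytes):
        # Raw bytes are not JSON serializable, so carry them as base64 text
        payload = base64.b64encode(payload).decode('ascii')

    # The 'value' key is an assumption of this sketch
    return json.dumps({'value': payload})


# Binary output is embedded as base64 text
binary_doc = to_document(b'\x89PNG\r\n')

# JSON-friendly output is embedded unchanged
json_doc = to_document({'greeting': 'Hello'})
```

A client receiving such a document would need to know, for example from the process description, which outputs are base64 encoded before decoding them.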


Asynchronous support
--------------------

@@ -33,15 +67,27 @@ an asynchronous design pattern. This means that when a job is submitted in asyn
mode, the server responds immediately with a reference to the job, which allows the client
to periodically poll the server for the processing status of a given job.
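
The polling pattern can be sketched as follows. ``poll_job`` and its parameters are hypothetical; the status lookup is passed in as a callable so that the example runs offline, where a real client would fetch the job status over HTTP. The terminal status names follow the OGC API - Processes status codes.

```python
import time
from typing import Callable

# Job statuses after which polling can stop (OGC API - Processes)
TERMINAL_STATUSES = {'successful', 'failed', 'dismissed'}


def poll_job(fetch_status: Callable[[], str],
             interval: float = 1.0,
             max_attempts: int = 10) -> str:
    """Poll until the job reaches a terminal status or attempts run out."""
    status = 'accepted'
    for _ in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            break
        # Wait before asking the server again
        time.sleep(interval)
    return status


# Simulated server answers standing in for real HTTP requests
answers = iter(['accepted', 'running', 'successful'])
final_status = poll_job(lambda: next(answers), interval=0)
```

Capping the number of attempts keeps a client from polling forever if a job never reaches a terminal status.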

pygeoapi provides asynchronous support by providing a 'manager' concept which, well,
In keeping with the OGC API - Processes specification, asynchronous process execution
can be requested by including the ``Prefer: respond-async`` HTTP header in the request.
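
A request carrying this header might be built as in the sketch below. The host, port and input values are assumptions for illustration; the request is only constructed, not sent, so that the example runs without a server.

```python
import json
import urllib.request

# Hypothetical server address and process inputs
url = 'http://localhost:5000/processes/hello-world/execution'
body = json.dumps({'inputs': {'name': 'pygeoapi'}}).encode('utf-8')

request = urllib.request.Request(
    url,
    data=body,
    method='POST',
    headers={
        'Content-Type': 'application/json',
        # Ask the server to execute the process asynchronously
        'Prefer': 'respond-async',
    },
)

# urllib.request.urlopen(request) would send it; omitted here so the
# sketch runs offline
```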

Job management is required for asynchronous functionality.

Job management
--------------

pygeoapi provides job management by providing a 'manager' concept which, well,
manages job execution. The manager concept is implemented as part of the pygeoapi
:ref:`plugins` architecture. pygeoapi provides a default manager implementation
based on `TinyDB`_ for simplicity. Custom manager plugins can be developed for more
advanced job management capabilities (e.g. Kubernetes, databases, etc.).

In keeping with the OGC API - Processes specification, asynchronous process execution
can be requested by including the ``Prefer: respond-async`` HTTP header in the request
Job managers
------------

TinyDB
^^^^^^

TinyDB is the default job manager for pygeoapi when enabled.

.. code-block:: yaml
@@ -52,7 +98,8 @@ can be requested by including the ``Prefer: respond-async`` HTTP header in the r
output_dir: /tmp/
MongoDB
-------
^^^^^^^

As an alternative to the default, a manager employing `MongoDB`_ can be used.
The connection to a `MongoDB`_ instance must be provided in the configuration.
`MongoDB`_ uses ``localhost`` and port ``27017`` by default. Jobs are stored in a collection named
@@ -66,9 +113,9 @@ The connection to a `MongoDB`_ instance must be provided in the configuration.
connection: mongodb://host:port
output_dir: /tmp/
PostgreSQL
----------
^^^^^^^^^^

As another alternative to the default, a manager employing `PostgreSQL`_ can be used.
The connection to a `PostgreSQL`_ database must be provided in the configuration.
`PostgreSQL`_ uses ``localhost`` and port ``5432`` by default. Jobs are stored in a table named ``jobs``.
2 changes: 2 additions & 0 deletions docs/source/plugins.rst
@@ -273,6 +273,8 @@ implementation.

Each base class documents the functions, arguments and return types required for implementation.

.. _example-custom-pygeoapi-processing-plugin:

Example: custom pygeoapi processing plugin
------------------------------------------

6 changes: 4 additions & 2 deletions pygeoapi/api/processes.py
@@ -379,6 +379,8 @@ def execute_process(api: API, request: APIRequest,
requested_outputs = data.get('outputs')
LOGGER.debug(f'outputs: {requested_outputs}')

response_requested = data.get('response', 'raw')

subscriber = None
subscriber_dict = data.get('subscriber')
if subscriber_dict:
@@ -420,7 +422,7 @@ def execute_process(api: API, request: APIRequest,
if status == JobStatus.failed:
response = outputs

if data.get('response', 'raw') == 'raw':
if response_requested == 'raw':
headers['Content-Type'] = mime_type
response = outputs
elif status not in (JobStatus.failed, JobStatus.accepted):
@@ -433,7 +435,7 @@ def execute_process(api: API, request: APIRequest,
else:
http_status = HTTPStatus.OK

if mime_type == 'application/json':
if mime_type == 'application/json' or response_requested == 'document':
response2 = to_json(response, api.pretty_print)
else:
response2 = response
