OAProc: handle binary data when response: document (#1285) #1756

Merged
merged 2 commits into from
Jul 24, 2024
61 changes: 53 additions & 8 deletions docs/source/data-publishing/ogcapi-processes.rst
@@ -14,15 +14,47 @@ The pygeoapi offers two processes: a default ``hello-world`` process which allow
 Configuration
 -------------

+The below configuration is an example of a process defined within the pygeoapi internal plugin registry:
+
 .. code-block:: yaml

    processes:

        # enabled by default
        hello-world:
            processor:
                name: HelloWorld

+The below configuration is an example of a process defined as part of a custom Python process:
+
+.. code-block:: yaml
+
+   processes:
+       # enabled by default
+       hello-world:
+           processor:
+               # refer to a process in the standard PYTHONPATH
+               # e.g. my_package/my_module/my_file.py (class MyProcess)
+               # the MyProcess class must subclass from pygeoapi.process.base.BaseProcessor
+               name: my_package.my_module.my_file.MyProcess
+
+See :ref:`example-custom-pygeoapi-processing-plugin` for processing plugin examples.
+
+Processing and response handling
+--------------------------------
+
+pygeoapi processing plugins must return a tuple of media type and native outputs. Multipart
+responses are not supported at this time, and it is up to the process plugin implementor to return a single
+payload defining multiple artifacts (or references to them).
+
+By default (or via the OGC API - Processes ``response: raw`` execution parameter), pygeoapi provides
+processing responses in their native encoding and media type, as defined by a given
+plugin (which needs to set the response content type and payload accordingly).
+
+pygeoapi also supports a JSON-based response type (via the OGC API - Processes ``response: document``
+execution parameter). When this mode is requested, the response will always be a JSON encoding, embedding
+the resulting payload (part of which may be Base64 encoded for binary data, for example).
+
+
 Asynchronous support
 --------------------
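
As a rough illustration of the plugin contract described in the documentation change above (a tuple of media type and native outputs), the sketch below uses two hypothetical classes, ``EchoProcess`` and ``ThumbnailProcess``. Both names and the simplified ``execute(data)`` signature are assumptions made for illustration; an actual plugin would subclass ``pygeoapi.process.base.BaseProcessor``.

```python
class EchoProcess:
    """Hypothetical stand-in for a processing plugin with JSON output."""

    def execute(self, data):
        # Return the media type together with a native Python structure
        outputs = {'id': 'echo', 'value': f"Hello {data.get('name', 'World')}!"}
        return 'application/json', outputs


class ThumbnailProcess:
    """Hypothetical stand-in for a processing plugin with binary output."""

    def execute(self, data):
        # The PNG signature bytes stand in for real image data
        png_bytes = b'\x89PNG\r\n\x1a\n'
        return 'image/png', png_bytes


media_type, outputs = EchoProcess().execute({'name': 'pygeoapi'})
```

Per the documentation text, a ``response: raw`` request would receive the second process's bytes as ``image/png``, while ``response: document`` would embed the payload in JSON, with binary parts possibly Base64 encoded.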

@@ -33,15 +65,27 @@ an asynchronous design pattern. This means that when a job is submitted in asyn
 mode, the server responds immediately with a reference to the job, which allows the client
 to periodically poll the server for the processing status of a given job.

-pygeoapi provides asynchronous support by providing a 'manager' concept which, well,
+In keeping with the OGC API - Processes specification, asynchronous process execution
+can be requested by including the ``Prefer: respond-async`` HTTP header in the request.
+
+Job management is required for asynchronous functionality.
+
+Job management
+--------------
+
+pygeoapi provides job management by providing a 'manager' concept which, well,
 manages job execution. The manager concept is implemented as part of the pygeoapi
 :ref:`plugins` architecture. pygeoapi provides a default manager implementation
 based on `TinyDB`_ for simplicity. Custom manager plugins can be developed for more
 advanced job management capabilities (e.g. Kubernetes, databases, etc.).

-In keeping with the OGC API - Processes specification, asynchronous process execution
-can be requested by including the ``Prefer: respond-async`` HTTP header in the request
+Job managers
+------------
+
+TinyDB
+^^^^^^
+
+TinyDB is the default job manager for pygeoapi when enabled.

 .. code-block:: yaml

@@ -52,7 +96,8 @@ can be requested by including the ``Prefer: respond-async`` HTTP header in the r
            output_dir: /tmp/

 MongoDB
--------
+^^^^^^^
+
 As an alternative to the default, a manager employing `MongoDB`_ can be used.
 The connection to a `MongoDB`_ instance must be provided in the configuration.
 `MongoDB`_ uses ``localhost`` and port ``27017`` by default. Jobs are stored in a collection named
@@ -66,9 +111,9 @@ The connection to a `MongoDB`_ instance must be provided in the configuration.
            connection: mongodb://host:port
            output_dir: /tmp/

-
 PostgreSQL
-----------
+^^^^^^^^^^
+
 As another alternative to the default, a manager employing `PostgreSQL`_ can be used.
 The connection to a `PostgreSQL`_ database must be provided in the configuration.
 `PostgreSQL`_ uses ``localhost`` and port ``5432`` by default. Jobs are stored in a table named ``jobs``.
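
The asynchronous execution request described in this documentation change (the ``Prefer: respond-async`` header) can be sketched from the client side as follows. This is a minimal example using only the standard library; the server URL, the ``hello-world`` process id, the input values and the helper name ``build_async_request`` are assumptions for illustration, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumed local deployment and process id; adjust to the actual server
EXECUTION_URL = 'http://localhost:5000/processes/hello-world/execution'


def build_async_request(url, inputs):
    """Build a POST execution request that asks for asynchronous handling."""
    body = json.dumps({'inputs': inputs}).encode('utf-8')
    return urllib.request.Request(
        url,
        data=body,
        method='POST',
        headers={
            'Content-Type': 'application/json',
            # Header named in the documentation for asynchronous execution
            'Prefer': 'respond-async',
        },
    )


request = build_async_request(EXECUTION_URL, {'name': 'World'})
```

Passing ``request`` to ``urllib.request.urlopen`` would submit the job; as the documentation notes, a server with job management enabled responds with a reference to the job, which the client can then poll.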
2 changes: 2 additions & 0 deletions docs/source/plugins.rst
@@ -273,6 +273,8 @@ implementation.

 Each base class documents the functions, arguments and return types required for implementation.

+.. _example-custom-pygeoapi-processing-plugin:
+
 Example: custom pygeoapi processing plugin
 ------------------------------------------

6 changes: 4 additions & 2 deletions pygeoapi/api/processes.py
@@ -379,6 +379,8 @@ def execute_process(api: API, request: APIRequest,
     requested_outputs = data.get('outputs')
     LOGGER.debug(f'outputs: {requested_outputs}')

+    response_requested = data.get('response', 'raw')
+
     subscriber = None
     subscriber_dict = data.get('subscriber')
     if subscriber_dict:
@@ -420,7 +422,7 @@ def execute_process(api: API, request: APIRequest,
     if status == JobStatus.failed:
         response = outputs

-    if data.get('response', 'raw') == 'raw':
+    if response_requested == 'raw':
         headers['Content-Type'] = mime_type
         response = outputs
     elif status not in (JobStatus.failed, JobStatus.accepted):
@@ -433,7 +435,7 @@ def execute_process(api: API, request: APIRequest,
     else:
         http_status = HTTPStatus.OK

-    if mime_type == 'application/json':
+    if mime_type == 'application/json' or response_requested == 'document':
         response2 = to_json(response, api.pretty_print)
     else:
         response2 = response
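
Taken together, the hunks above route the ``response`` execution parameter through a single variable and serialize to JSON whenever ``document`` is requested. The following is a simplified, standalone sketch of that decision, not the pygeoapi implementation: ``build_body`` and ``_encode_bytes`` are hypothetical names, job status handling is omitted, and ``json.dumps`` with a Base64 fallback stands in for pygeoapi's ``to_json`` helper (its treatment of bytes is assumed here).

```python
import base64
import json


def _encode_bytes(obj):
    """Assumed fallback: represent bytes as Base64 text inside JSON."""
    if isinstance(obj, bytes):
        return base64.b64encode(obj).decode('ascii')
    raise TypeError(f'Cannot serialize {type(obj).__name__}')


def build_body(mime_type, outputs, response_requested='raw'):
    """Choose the response body for raw or document mode (statuses omitted)."""
    if response_requested == 'raw':
        # Raw mode: pass the plugin's native outputs through
        response = outputs
    else:
        # Document mode: wrap the outputs in a JSON-friendly structure
        response = {'outputs': [outputs]}

    # Mirrors the changed condition: JSON output, or document mode requested
    if mime_type == 'application/json' or response_requested == 'document':
        return json.dumps(response, default=_encode_bytes)
    return response


raw_body = build_body('image/png', b'\x00\x01')
document_body = build_body('image/png', b'\x00\x01', 'document')
```

In this sketch, binary output requested in raw mode is returned untouched, while the same output in document mode becomes a JSON string with the bytes Base64 encoded, which is the case the original condition (checking only ``mime_type``) did not cover.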