-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Describe versioned TRS URIs #202
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,22 +12,84 @@ We are starting with a read-only API due to potentially different views and appr | |
|
||
Multiple formats for descriptors such as [CWL](https://github.com/common-workflow-language/common-workflow-language) and [WDL](https://github.com/broadinstitute/wdl) are permitted. | ||
|
||
#### TRS URIs | ||
|
||
For convenience, including potentially when passing content references to a WES server, we define a URI syntax for TRS-accessible content. Strings of the form `trs://<server>/<id>` mean _“you can fetch the content with TRS id `<id>` from the TRS server at `<server>` "_. | ||
|
||
For example, if a WES server was asked to process `trs://trs.example.org/123456323`, it would know that it could issue a GET request to `https://trs.example.org/api/ga4gh/trs/v2/tools/123456323` to learn how to fetch that object. | ||
#### TRS Tool and TRS Tool Version IDs | ||
|
||
Each implementation of TRS can choose its own identifier scheme, as long as it | ||
follows these guidelines: | ||
|
||
* TRS Tool and TRS Tool Version IDs are strings made up of uppercase and | ||
lowercase letters, decimal digits, hyphen, period, underscore and tilde | ||
(`[A-Za-z0-9.-_~]`). See [RFC 3986 § | ||
2.3](https://datatracker.ietf.org/doc/html/rfc3986#section-2.3). TRS Tool and | ||
TRS Tool Version IDs MAY further contain percent characters (`%`) whenever | ||
they are used in API calls, but only if they were introduced through | ||
percent-encoding of any non-valid characters (see next bullet point). | ||
* TRS Tool and TRS Tool Version IDs can contain other characters, but they MUST | ||
be percent-encoded (see [RFC 3986 § | ||
2.4](https://datatracker.ietf.org/doc/html/rfc3986#section-2.4)) into valid | ||
TRS Tool and TRS Tool Version IDs as per the previous rule whenever they are | ||
used in API calls. This is because non-encoded IDs may interfere with the | ||
interpretation of routes, e.g., for the `/tools/{id}/versions/{version_id}` | ||
endpoint. | ||
* Any given TRS Tool or TRS Tool Version ID MUST always identify the same | ||
resource (tool or tool version, respectively) on a given TRS implementation. | ||
This constraint aids with reproducibility. | ||
* TRS implementations MAY have more than one TRS Tool or TRS Tool Version ID | ||
mapping to the same resource (tool or tool version, respectively). | ||
|
||
This recommendation is intended to mirror the discussion that went into the [DRS URI scheme](https://ga4gh.github.io/data-repository-service-schemas/preview/develop/docs/#_drs_uris). | ||
#### TRS URIs | ||
|
||
For informational purposes, we recommend that TRS implementations add themselves to https://github.com/ga4gh/tool-registry-service-schemas/blob/develop/registry.json to provide for the possibility of creating a global indexing service and to allow others to more easily discover a TRS implementation. | ||
To conveniently pass content references to TRS resources, e.g., to advise a | ||
[WES](https://github.com/ga4gh/workflow-execution-service-schemas) or | ||
[TES](https://github.com/ga4gh/task-execution-schemas) service which tool | ||
(version) to use, we define a URI syntax for TRS-accessible content. Strings of | ||
the form `trs://<server>/<id>` (unversioned) and | ||
`trs://<server>/<id>/<version_id>` (versioned) mean _“you can fetch the content | ||
with TRS Tool ID `<id>` and, if provided, TRS Tool Version ID `<version_id>` | ||
from the TRS server at `<server>` "_. | ||
|
||
For example, if a TRS client was asked to process | ||
`trs://trs.example.org/tool_ABC/v1.2.3`, it would know that it could, e.g., | ||
issue `GET` requests to | ||
`https://trs.example.org/api/ga4gh/trs/v2/tools/tool_ABC` and | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @uniqueg This example is a bit awkward and likely points to a deficiency in the current approach for defining trs identifiers (and potentially drs identifiers as well). The TES specification defines the root path of the API as Putting aside specific trs versioning issues (from the conversation above), I am now left with another dilema: TRS identifiers as defined only provide the By convention, any tooling I create would resolve One approach could be just adding the prefix paths prior to the We essentially need a reliable way to extract the
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One could probably calculate the prefixes on the fly from https://github.com/ga4gh/tool-registry-service-schemas/blob/develop/registry.json but it probably does make more sense to change the URI format since we're not using it yet(?) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @denis-yuen I would advise against relying on prefixes in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think my personal preference for approaches would likely be: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. tl;dr I agree with @patmagee's suggestion provided that DRS folks agree that it will work for them as well. Good catch @patmagee - thanks a lot! I had actually realized this issue before when implementing a TRS client that understands TRS URIs and having to introduce an optional From what I can see, @patmagee's suggestion should do the trick. But thinking this through a little more, maybe there are a couple of things that we should discuss outside of this issue before merging:
Besides that, it is possibly a good idea to aim for a consensus of how TRS servers:
- url: /ga4gh/trs/v2 DRS servers:
- url: https://{serverURL}/ga4gh/drs/v1
variables:
serverURL:
default: drs.example.org
description: >
DRS server endpoints MUST be prefixed by the '/ga4gh/drs/v1' endpoint
path WES servers:
- url: https://{defaultHost}/{basePath}/{apiVersion}
variables:
defaultHost:
default: www.example.com
basePath:
default: ga4gh/wes
apiVersion:
default: v1 TES servers:
- url: /ga4gh/tes/v1 So, it looks like TRS and TES do not care where the API is deployed, as long as endpoint paths are immediately preceded by Now, there is currently no consensus across those definitions. But perhaps it seems to be going in the direction of the DRS definition, i.e., start with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Proposal at Cloud WS was to also break this into a separate ticket, but to prioritize it for discussion before 2.0.1 release and/or the GA4GH plenary |
||
`https://trs.example.org/api/ga4gh/trs/v2/tools/tool_ABC/versions/v1.2.3` to | ||
fetch information about tool `tool_ABC` and its version `v1.2.3` from a `v2` | ||
TRS API hosted at `https://trs.example.org/`, respectively. | ||
|
||
> Note that clients issuing requests to TRS services MUST NOT encode forward | ||
> the special characters slashes separating the `trs`, `<server>`, `<id>` and | ||
> `<version_id>` components of TRS URIs. However, [TRS Tool IDs and TRS Tool | ||
> Version IDs](#trs-tool-and-trs-tool-version-ids) containing | ||
> non-valid characters MUST be encoded _individually_ before constructing TRS | ||
> URIs. For example, for a TRS Tool ID `tool#1` and a TRS Version ID `(1)`, the | ||
> correct TRS URI for server `trs.example.org` would be | ||
> `trs://trs.example.org/tool%231/%281%29`, where `tool%231` and `%281%29` are | ||
> the percent-encoded TRS Tool and TRS Tool Version IDs, respectively. | ||
> | ||
> Also note that to ensure reproducibility, servers implementing multiple | ||
> versions of the TRS API specification MUST ensure that, within the limits of | ||
> schema differences across different API versions, corresponding endpoints | ||
> return consistent responses. | ||
|
||
This recommendation is intended to mirror the discussion that went into the | ||
[DRS URI | ||
scheme](https://ga4gh.github.io/data-repository-service-schemas/preview/develop/docs/#_drs_uris), | ||
with the necessary additions to account for versioned TRS URIs. | ||
|
||
### Misc | ||
|
||
The entire schema is shown below, but a more useful form is the Swagger editor to view our [schema in progress](https://editor.swagger.io/?url=https://raw.githubusercontent.com/ga4gh/tool-registry-service-schemas/develop/openapi/ga4gh-tool-discovery.yaml) | ||
|
||
Note that the swagger editor itself can kickstart a project by generating servers and clients in a variety of languages. | ||
|
||
#### Central GA4GH Service Registry | ||
|
||
For informational purposes, we recommend that TRS implementations add | ||
themselves to | ||
<https://github.com/ga4gh/tool-registry-service-schemas/blob/develop/registry.json> | ||
to provide for the possibility of creating a global indexing service and to | ||
allow others to more easily discover a TRS implementation. | ||
|
||
#### Outstanding Questions | ||
|
||
##### Authentication and Authorization | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, how this will work with versioning?
What happens when TRS v3 comes out and a new product only implements TRS v3 and doesn't implement v2?
That's assuming TRS v3 will use a new path segment
v3
, which I suppose it doesn't have to. But I assume it would.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Essentially very primitive content negotiation where the user agent would need to try the API versions it would be familiar with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think versioning APIs is an important topic and I think there's most definitely a need for discussion and harmonization across GA4GH APIs (e.g., Stian mentioned HATEOAS in #160), but as this PR does not introduces the issue nor is designed to address it, I don't think it should be blocked by it.
Besides, I don't think that TRS URIs, as identifiers of resources, should be concerned with API versioning at all: IMO, when bumping API versions, care must be taken (both on the spec- and implementation-side) not to introduce ambiguities in the way that the different responses in, e.g., TRS
v2
andv3
can be interpreted by a client. And if changes to the internal representation of resources are required that would break backward compatibility, then I think those resources should be cloned and given new identifiers, such that reproducibility is not negatively affected. As long as these considerations/constraints are honored, it shouldn't matter which API version is used to access a given resource.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In any case, I have clarified this/addressed these concerns now in lines 56/57 and 69-72
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question was not about API/schema versioning, which is a very complex subject and I'm no expert in. It was about the example that if you have a URI
trs://trs.example.org/tool_ABC/v1.2.3
, it says you can fetch the contenthttps://trs.example.org/api/ga4gh/trs/v2/tools/tool_ABC
(note thev2
in the path).If TRS v3 comes along, and a new TRS implementation only supports that v3, then the
v2
in the path means they would return a 404. Maybe that's OK, or maybe the requirement is that all TRS implementations need to implement v2 and above. Otherwise a client will need to iterate over paths with all potential versions.Maybe I'm overthinking this, and it is a real edge case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think this is an API/schema versioning topic because the API version is precisely what
v2
refers to. In any case, there anyway needs to be some sort of negotiation between a service and a client, because not every client will implement every version of the API specs, and so there will be cases where clients and servers won't match, regardless of whether the client will iterate or whether we specify the version in the TRS URI. One possibility to facilitate this negotation would be to define the supported API versions in the service info, but of course, the Service Info API is versioned, too...I agree this is an important issue and needs to be discussed on a larger scale within the Cloud WS in order to come up with a sensible strategy that covers DRS and TRS URIs as well as other artefacts/problems with API versioning. A related issue I faced is also that the path (e.g.,
ga4gh/trs/v2
) defined in the schema may not always be possible or desirable to implement. For example, as far as I know, Dockstore does not. The path info could perhaps also be defined in the service info for any given service (although the same problem applies: how do you get to the service info in the first place; perhaps via the service registry...).Still, I don't think this PR makes this problem worse than it is already, given the current definitions of DRS and TRS URIs, so I don't think it should block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. I was thinking more of different schemas in different versions of the API when I made my comment, but you're right.
I think the explicitness that you can take a URI and translate it to a specific path is a new wrinkle, at least for TRS. You're right about DRS -- I hadn't looked until now and they do the same thing, with
v1
in the path.I had already approved, so it is not blocked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @coverbeck
So @denis-yuen, others: shall we bring up this important discussion to the Cloud WS?
Also perhaps to see if people agree that it might be useful to specify TRS URIs as inputs to WES (to access workflows) and inside workflows (to access containers through WES/TES implementations)? Because this might require changes on WES and TES specs as well, without which TRS URIs might not be all that useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Proposal at Cloud WS was to break this discussion into a separate ticket #221