Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Implement support for application/octet-stream #83

Open
wants to merge 4 commits into
base: 4.x
Choose a base branch
from

Conversation

sebveit
Copy link

@sebveit sebveit commented Aug 15, 2024

Motivation:

OpenAPI 3.1 supports media type "application/octet-stream" in the request body.
See the migration guide: https://www.openapis.org/blog/2021/02/16/migrating-from-openapi-3-0-to-3-1-0

Conformance:

I've signed the Eclipse Contributor Agreement as explained in https://github.com/eclipse/vert.x/blob/master/CONTRIBUTING.md

Current state of PR:

Pascal and I had trouble figuring out what the OpenAPI spec actually defines in regards of octet-stream/binary related endpoints. This confusion seems to be fixed with OpenAPI spec of version 3.1.1. For now, this PR is on hold until Pascal tells otherwise.

@sebveit
Copy link
Author

sebveit commented Aug 15, 2024

Related to issue #82

@pk-work
Copy link
Contributor

pk-work commented Aug 15, 2024

Hi @sebveit,

thanks for your contribution, help is always welcome! Your PR is a good start, but to support a new content type it requires a little bit more effort than just extending the list with allowed values.

First we need to add a BodyTransformer [1] for application/octet-steam. Here are examples of already existing transformers [2][3]. Then we need to add the mapping in BaseValidator [4]. Then we can extend the list of supported values.

To keep quality high, it is also required to provide unit tests for the new functionality.

I hope this helps you to extend your PR. let me know if you need more insights.

[1] https://github.com/eclipse-vertx/vertx-openapi/blob/main/src/main/java/io/vertx/openapi/validation/transformer/BodyTransformer.java
[2] https://github.com/eclipse-vertx/vertx-openapi/blob/main/src/main/java/io/vertx/openapi/validation/transformer/ApplicationJsonTransformer.java
[3] https://github.com/eclipse-vertx/vertx-openapi/blob/main/src/main/java/io/vertx/openapi/validation/transformer/MultipartFormTransformer.java
[4]

@sebveit
Copy link
Author

sebveit commented Aug 15, 2024

Hi @pk-work, nice to "meet" you! :-)
You're right, I thought it would be a quick win before I stumbled on the validation package!
Thanks for pointing out where to go. I will try to come up with a solution.
Thx for guiding! If questions arise, I'll ask you!

@sebveit sebveit changed the title contract: Add application/octet-stream to supported media types WIP: contract: Add application/octet-stream to supported media types Aug 15, 2024
@sebveit
Copy link
Author

sebveit commented Aug 15, 2024

I've written the corresponding transformer and tests.
Maybe you take a look if it's ok?

One thing I don't know if it needs to change, is the way Media Type Object of OpenAPI are handled by this project.
In the migration guide of OpenAPI 3.1, there's an example of a media type without further content:
application/octet-stream: {}

Before 3.1, it seems that you had to write something like the following:

application/octet-stream:
  schema:
    type: string
    format: binary

Migration guide: https://www.openapis.org/blog/2021/02/16/migrating-from-openapi-3-0-to-3-1-0

Is the current implementation OK with such an empty Media Type Object?

@sebveit sebveit changed the title WIP: contract: Add application/octet-stream to supported media types WIP: Implement support for application/octet-stream Aug 16, 2024
@sebveit
Copy link
Author

sebveit commented Aug 19, 2024

I've fixed the following exception by providing a default schema for application/octet-stream, if the shortcut declaration is used - which is mentioned in the migration guide of OpenAPI 3.1.
Encountered exception:

io.vertx.openapi.contract.OpenAPIContractException: The passed OpenAPI contract contains a feature that is not supported: Media Type without a schema
2024-08-19 11:31:30     at io.vertx.openapi.contract.OpenAPIContractException.createUnsupportedFeature(OpenAPIContractException.java:43)
2024-08-19 11:31:30     at io.vertx.openapi.contract.impl.MediaTypeImpl.<init>(MediaTypeImpl.java:33)
2024-08-19 11:31:30     at io.vertx.openapi.contract.impl.RequestBodyImpl.lambda$new$1(RequestBodyImpl.java:58)

Related commit 232f717.

@sebveit
Copy link
Author

sebveit commented Aug 19, 2024

Now, I'm confronted with another exception:

java.lang.IllegalArgumentException: Instances of class io.vertx.core.buffer.impl.BufferImpl type are not supported
2024-08-19 11:58:11     at io.vertx.json.schema.impl.Utils$JSON.typeOf(Utils.java:214)
2024-08-19 11:58:11     at io.vertx.json.schema.impl.SchemaValidatorImpl.validate(SchemaValidatorImpl.java:101)
2024-08-19 11:58:11     at io.vertx.json.schema.impl.SchemaValidatorImpl.validate(SchemaValidatorImpl.java:54)
2024-08-19 11:58:11     at io.vertx.openapi.validation.impl.RequestValidatorImpl.validateBody(RequestValidatorImpl.java:156)
2024-08-19 11:58:11     at io.vertx.openapi.validation.impl.RequestValidatorImpl.lambda$validate$2(RequestValidatorImpl.java:106)

I'm getting deeper into the rabbit hole! What I previously considered a quick fix is now expanding to three projects. I'm beginning to understand why this hasn't been implemented yet.

Could I just catch that exception if it is a Buffer(Impl) type and pretend nothing happened and the validation is a success?

@pk-work
Copy link
Contributor

pk-work commented Aug 20, 2024

Hi, currently I'm on PTO. I will have a look at the end of the week or next week.

@sebveit
Copy link
Author

sebveit commented Aug 21, 2024

Thanks for the reply! Take you time! I've got a workaround for the actual project that depends on those changes by creating a local snapshot version with the modifications committed here.

The whole point of this project/lib seems to be validation of a given OpenAPI schema and the related request/response pairs. While that's true for the JSON or anything else expressed as strings, explicit binary values cannot really be validated by OpenAPI itself.

This PR needs a bit more polishing and additional tests before it can be merged, I guess. Nevertheless, I think that my commits are a good approach to the problem.

Comment on lines +43 to +46
schemaJson = new JsonObject()
.put("type", "string")
.put("format", "binary")
.put("contentMediaType", "application/octet-stream");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should become a constant maybe in MediaType. I would also remove contentMediaType as in the example only format and type is mentioned. This is then also compatible with 3.0

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll replace it as a constant in MediaType.

But stripping the contentMediaType isn't a good idea, because it replaces format in OpenAPI 3.1. I've intentionally placed both keys in that JSON object to be compatible with OpenAPI 3.x.

That way it's future proof, because later adaptions of this library for OpenAPI 3.1 or later might break.

Am I overthinking that part?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mh maybe you are right, but I think we can simply ignore contentMediaType , because the MediaType is already specified with if (identifier.equalsIgnoreCase(MediaType.APPLICATION_OCTET_STREAM)) {

JSON Schema also offers a contentMediaType keyword. However, when the media type is already specified by the Media Type Object's key, or by the contentType field of an Encoding Object, the contentMediaType keyword SHALL be ignored if present.

Would you agree?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should become a constant maybe in MediaType. I would also remove contentMediaType as in the example only format and type is mentioned. This is then also compatible with 3.0

If someone uses the alias application/octet-stream: {}, it's safe to assume that the contract file is of version 3.1, because that wasn't allowed before in 3.0 and lower. Therefore I'm removing format from the schemaJson.
The final schema object that gets injected into the media type object would look like the following:

new JsonObject()
          .put("type", "string")
          .put("contentMediaType", "application/octet-stream");

Mh maybe you are right, but I think we can simply ignore contentMediaType , because the MediaType is already specified with if (identifier.equalsIgnoreCase(MediaType.APPLICATION_OCTET_STREAM)) {

Yeah, that makes sense to ignore it here.

Summary: For contract validation we place a default schema object like seen above, if someone uses the alias definition introduced in OpenAPI 3.1. For request validation we simplify the validation if the media type object ID is application/octet-stream.

Sounds good?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of having contentMediaType in the default schema object? I still don't understand it.

Why can't we simply use

new JsonObject()
          .put("type", "string")
          .put("format", "binary")

?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default schema object must only be applied if someone has an openapi-contract.yml file with OpenAPI 3.1. Before that, no shortcuts like {} could be used in the schema file.

In OpenAPI 3.0 you must write:

# an arbitrary binary file:
content:
    application/octet-stream:
        schema:
            type: string
            format: binary

In OpenAPI 3.1 you (can) now write:

# a PNG image as a binary file:
content:
    image/png: {}
# an arbitrary binary file:
content:
    application/octet-stream: {}

And that gets expanded to:

# a PNG image as a binary file:
content:
    image/png:
        schema:
            type: string
            contentMediaType: image/png
            contentEncoding: base64
# an arbitrary binary file:
content:
    application/octet-stream:
        schema:
            type: string
            contentMediaType: application/octet-stream

It really is confusing! As of my understanding, in OpenAPI 3.1, format has been replaced by contentMediaType when dealing with Schema Objects in relation to binary file uploads. See https://spec.openapis.org/oas/latest.html#considerations-for-file-uploads

The task of this library must be the expansion before the validation of the contract. Right now I'm only providing that expansion for application/octet-stream, because everything else isn't part of this PR.

We could still use your approach of the old way of declaring it, but maybe that will create confusion in later stages of development of this library? Supporting multiple versions of OpenAPI is challenging.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Later in the code, if necessary, you can distinguish between 3.0 and 3.1 only be checking the keys of the schema object. If format is missing but contentMediaType is present, it must be 3.1. Otherwise it will be 3.0 schema. This might be come in handy...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't get the specs .... maybe we don't need a schema object at all. It seems like it is intentionally left out when using {} in the contract.

Then it is treated like

Content transferred in binary (octet-stream)

OK, so leave out the schema object? What do you think?

JsonSchema schema = mediaType.getSchema();
String schemaTypeValue = schema.get("type", null);
if (schemaTypeValue != null && schemaTypeValue.equals("string")) {
String schemaFormatValue = schema.get("format", null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for format it's the same as for type. If there is a chance that this required information is missing, it should fail when building the contract object, not when we do a validation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format can be null, since OpenAPI 3.1 replaced it with contentMediaType. I'm merely providing a fallback if someone still uses format in the contract file.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simplified validation procedure has to deal with both 3.0 and 3.1 of OpenAPI. Therefore the statements used seem to be redundant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, you are right, format can be null. But as mentioned in the comment below, I'm not convinced by the current implementation of the "schema validation bypass".

My concerns:

  • RequestValidatorImpl is already a large class
  • different logic for different versions -> makes it complex
  • you have to take care for corner cases

But I understand the benefit of having a schema validation bypass, especially for binary data.

Maybe we can call a method like "canSkipValidation" and extract this method and its logic into a separated class. You could pass the contract reference to this method to check if the contract is 3.0 or 3.1 which makes the logic in this method less complex.

What do you think?

Comment on lines +167 to +174
if (schemaFormatValue != null && schemaFormatValue.equals("binary")) {
return new RequestParameterImpl(transformedValue);
} else {
// Case only for OpenAPI 3.1
String schemaContentMediaType = schema.get("contentMediaType", null);
if (schemaContentMediaType != null
&& schemaContentMediaType.equals(MediaType.APPLICATION_OCTET_STREAM)) {
return new RequestParameterImpl(transformedValue);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree to this. As long as the type is string, you could have properties like maxLength, couldn't you?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't understand that one?

The null check for conentMediaType is because of the support for OpenAPI 3.1. Someone could use a contract file for version 3.0 where that field is not a requirement.

The verification of the value of contentMediaType is necessary because it could have another value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is, that you skip the validation. Couldn't it be possible to define a schema like this?

type: string
contentMediaType: application/octet-stream
minLength: 2

@pk-work
Copy link
Contributor

pk-work commented Aug 26, 2024

The whole point of this project/lib seems to be validation of a given OpenAPI schema and the related request/response pairs. While that's true for the JSON or anything else expressed as strings, explicit binary values cannot really be validated by OpenAPI itself.

That's true when it comes to the validation of a binary string, you are very limited :)

This PR needs a bit more polishing and additional tests before it can be merged, I guess. Nevertheless, I think that my commits are a good approach to the problem.

Yes your commits are a very good start. Do you want to do the polishing and add missing tests, or should I take over? If I continue it might take a little bit longer, but as I understood you have already a work around.

@sebveit
Copy link
Author

sebveit commented Aug 26, 2024

Yes your commits are a very good start. Do you want to do the polishing and add missing tests, or should I take over? If I continue it might take a little bit longer, but as I understood you have already a work around.

I'd like to finish this PR, since I'm using vertx alot and it's time to give back. The problem that I'm facing with OpenAPI and octet-stream is a good way to do so.

Thanks for the current review, I'm going to submit the changes/tests!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants