Validate input file metadata against a schema #8

aidanheerdegen · 2024-04-18T01:08:49Z

The Axiom tool would allow us to validate input data against a schema.

https://axiom.readthedocs.io/en/latest/schemas/schemas.html

Clearly this is a fraught proposal, as many input files would fail. That shouldn't be a reason not to do it, but it should inform the process by which it is done.

Opinions/thoughts welcome

ping @bschroeter @dougiesquire @kdruken

bschroeter · 2024-05-09T00:14:19Z

@aidanheerdegen, Axiom is capable of validating against an arbitrary schema. It is a lesser-known component of the DRS routine that is exposed through public methods.

I'm happy to help out with this, in the case that Axiom needs any adjustment for our needs - I am still the administrator of the project.

dougiesquire · 2024-05-09T06:32:46Z

I like the idea in principle, though we might end up with a very large number of schema as (depending on how prescriptive they are) they will be quite different for different inputs

aidanheerdegen · 2024-05-09T12:04:09Z

> I'm happy to help out with this, in the case that Axiom needs any adjustment for our needs - I am still the administrator of the project.

Thanks @bschroeter. I assume there aren't too many competing tools out there, otherwise you'd not have needed to write that functionality. Basically I don't want to use it just because someone in the org wrote it, but that is obviously a compelling reason.

> I like the idea in principle, though we might end up with a very large number of schema as (depending on how prescriptive they are) they will be quite different for different inputs

Maybe. As you say it might depend on how prescriptive the schema are.

I think we could get a fair bit of value out of just quantifying what level of compliance we have. So have some minimal schema standards and work up from there.
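As a rough illustration of "quantifying what level of compliance we have", here is a minimal sketch in plain Python. It does not use Axiom's API; the attribute names in `MINIMAL_SCHEMA`, the helper names, and the example files are all hypothetical, chosen only to show the shape of the idea.

```python
# Hypothetical minimal schema: attribute name -> expected Python type.
# These attribute names are assumptions, not an agreed standard.
MINIMAL_SCHEMA = {"title": str, "history": str, "Conventions": str}


def check_metadata(attrs, schema=MINIMAL_SCHEMA):
    """Return a list of problems found in one file's metadata (empty if compliant)."""
    problems = []
    for name, expected_type in schema.items():
        if name not in attrs:
            problems.append(f"missing attribute: {name}")
        elif not isinstance(attrs[name], expected_type):
            problems.append(f"{name}: expected {expected_type.__name__}")
    return problems


def compliance_rate(files, schema=MINIMAL_SCHEMA):
    """Fraction of files whose metadata passes every check in the schema."""
    if not files:
        return 0.0
    passed = sum(1 for attrs in files.values() if not check_metadata(attrs, schema))
    return passed / len(files)


# Made-up metadata for two input files, keyed by file name.
example_files = {
    "grid.nc": {"title": "CICE grid", "history": "created 2024", "Conventions": "CF-1.8"},
    "forcing.nc": {"title": "Forcing"},
}

if __name__ == "__main__":
    for name, attrs in example_files.items():
        print(name, check_metadata(attrs))
    print("compliance:", compliance_rate(example_files))
```

Starting from a small schema like this and tightening it over time would give a compliance number to track, whatever tool ends up doing the actual validation.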

bschroeter · 2024-05-13T00:51:44Z

> Thanks @bschroeter. I assume there aren't too many competing tools out there, otherwise you'd not have needed to write that functionality. Basically I don't want to use it just because someone in the org wrote it, but that is obviously a compelling reason.

From memory, there were a few, but in order to get them to the level of flexibility that we needed for CCAM it was less dev effort to write something from scratch. Document validation (which is basically what this is) is a reasonably old problem - you could carbon-date me once I start talking about xmlschema etc!

> Maybe. As you say it might depend on how prescriptive the schema are. I think we could get a fair bit of value out of just quantifying what level of compliance we have. So have some minimal schema standards and work up from there.

Spot on, Axiom can be as strict as you like. There is the option to specify very exacting standards or a minimum set and other things depending on the use case.

I'd start with looking at the commonalities of the things you want to validate - that will be the base ruleset, then we can work from there.
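One possible way to find those commonalities, sketched in plain Python rather than with Axiom: take the set of attribute names present in every file and treat that as a candidate base ruleset. The function name and the example metadata below are hypothetical.

```python
def common_attributes(files):
    """Return the attribute names that appear in every file's metadata."""
    attr_sets = [set(attrs) for attrs in files.values()]
    if not attr_sets:
        return set()
    return set.intersection(*attr_sets)


# Made-up metadata for three input files, keyed by file name.
example_files = {
    "grid.nc": {"title": "CICE grid", "history": "created 2024", "source": "model A"},
    "forcing.nc": {"title": "Forcing", "history": "regridded"},
    "bathymetry.nc": {"title": "Bathymetry", "history": "edited", "source": "survey"},
}

if __name__ == "__main__":
    # Attributes shared by all files are candidates for the base ruleset.
    print(sorted(common_attributes(example_files)))
```

Attributes that appear in most but not all files would be the natural next candidates once the base set is passing.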

aidanheerdegen · 2024-05-16T05:54:22Z

ARDC has a FAIR self assessment tool

https://ardc.edu.au/resource/fair-data-self-assessment-tool/

It has pretty broad guidelines, but we could use that as a starting point to design some criteria and see what level we're achieving and how we can climb the rungs to improve, e.g. local identifier -> URL -> DOI.
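To make the "rungs" idea concrete, here is a small Python sketch that classifies an identifier string as none, local identifier, URL, or DOI. The numeric levels, the function name, and the DOI pattern are assumptions for illustration only; they are not taken from the ARDC tool's scoring.

```python
import re

# Loose DOI pattern (assumption): optional doi.org URL or "doi:" prefix,
# then "10.<registrant>/<suffix>".
DOI_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE
)
URL_PATTERN = re.compile(r"^https?://", re.IGNORECASE)


def identifier_rung(identifier):
    """Return 0 (none), 1 (local identifier), 2 (URL) or 3 (DOI)."""
    if not identifier or not identifier.strip():
        return 0
    identifier = identifier.strip()
    if DOI_PATTERN.match(identifier):
        return 3
    if URL_PATTERN.match(identifier):
        return 2
    return 1


if __name__ == "__main__":
    for value in ["", "run-042", "https://example.org/data/run-042", "doi:10.1234/abcd"]:
        print(repr(value), identifier_rung(value))
```

Counting how many input files sit on each rung would show where we are now and what the next step up looks like.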

aidanheerdegen mentioned this issue Apr 18, 2024

New CICE grids ACCESS-NRI/access-om2-configs#92

Closed

aidanheerdegen transferred this issue from ACCESS-NRI/access-om2-configs May 3, 2024
