Validate input file metadata against a schema #8
Comments
@aidanheerdegen, Axiom is capable of validating against an arbitrary schema. It is a lesser-known component of the DRS routine that is exposed through public methods. I'm happy to help out with this in case Axiom needs any adjustment for our needs; I am still the administrator of the project.
I like the idea in principle, though we might end up with a very large number of schemas, since (depending on how prescriptive they are) they will be quite different for different inputs.
Thanks @bschroeter. I assume there aren't too many competing tools out there, otherwise you'd not have needed to write that functionality. Basically I don't want to use it just because someone in the org wrote it, but that is obviously a compelling reason.
Maybe. As you say, it might depend on how prescriptive the schemas are. I think we could get a fair bit of value out of just quantifying what level of compliance we have. So we could have some minimal schema standards and work up from there.
From memory, there were a few, but in order to get them to the level of flexibility that we needed for CCAM it was less dev effort to write something from scratch. Document validation (which is basically what this is) is a reasonably old problem - you could carbon-date me once I start talking about xmlschema etc!
Spot on, Axiom can be as strict as you like. There is the option to specify very exacting standards, a minimum set, or something in between, depending on the use case. I'd start by looking at the commonalities of the things you want to validate. That will be the base ruleset, and we can work from there.
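To make the "base ruleset, then quantify compliance" idea concrete, here is a minimal sketch in plain Python. It does not use Axiom and is not Axiom's API; the attribute names (`title`, `institution`, `license`) and the helpers `check_metadata` and `compliance` are hypothetical, chosen only for illustration.

```python
from typing import Any, Callable, Dict

# Hypothetical base ruleset (an assumption, not an agreed standard):
# attribute name -> predicate that the attribute's value must satisfy.
BASE_RULES: Dict[str, Callable[[Any], bool]] = {
    "title": lambda v: isinstance(v, str) and v.strip() != "",
    "institution": lambda v: isinstance(v, str) and v.strip() != "",
    "license": lambda v: isinstance(v, str) and v.strip() != "",
}


def check_metadata(attrs: Dict[str, Any],
                   rules: Dict[str, Callable[[Any], bool]] = BASE_RULES) -> Dict[str, bool]:
    """Return {rule name: passed} for every rule; a missing attribute fails."""
    return {
        name: name in attrs and bool(rule(attrs[name]))
        for name, rule in rules.items()
    }


def compliance(attrs: Dict[str, Any],
               rules: Dict[str, Callable[[Any], bool]] = BASE_RULES) -> float:
    """Fraction of rules satisfied, between 0.0 and 1.0."""
    results = check_metadata(attrs, rules)
    return sum(results.values()) / len(results) if results else 1.0


# Made-up global attributes for one input file.
example = {"title": "Sea surface temperature forcing", "institution": ""}
print(check_metadata(example))
print(f"compliance: {compliance(example):.0%}")
```

A stricter schema would just be a larger `rules` dictionary passed to the same functions, so the minimum set and the more exacting sets can share one reporting path.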
ARDC has a FAIR self-assessment tool: https://ardc.edu.au/resource/fair-data-self-assessment-tool/ It has pretty broad guidelines, but we could use that as a starting point to design some criteria, see what level we're achieving, and work out how we can climb the rungs to improve, e.g. local identifier -> URL -> DOI.
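As a rough illustration of the "rungs" idea, the sketch below classifies an identifier string as missing, local, URL, or DOI. The rung numbering, the `identifier_rung` name, and the DOI pattern are assumptions for illustration only; they are not taken from the ARDC tool.

```python
import re
from urllib.parse import urlparse

# Assumed, simplified DOI shape: optional "doi:" or doi.org prefix,
# then "10.<registrant>/<suffix>". Real DOI syntax is more permissive.
DOI_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?10\.\d{4,9}/\S+$",
    re.IGNORECASE,
)


def identifier_rung(identifier: str) -> int:
    """Return 0 (missing), 1 (local identifier), 2 (URL) or 3 (DOI)."""
    value = identifier.strip()
    if not value:
        return 0
    if DOI_PATTERN.match(value):
        return 3
    parsed = urlparse(value)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return 2
    return 1


# Placeholder identifiers, not real records.
for ident in ["", "run-042", "https://example.org/data/run-042", "doi:10.1234/example"]:
    print(repr(ident), "->", identifier_rung(ident))
```

Tallying the rung for every input file would give a simple picture of where we are now and how far each file is from the next level.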
The Axiom tool would allow us to validate input data against a schema:
https://axiom.readthedocs.io/en/latest/schemas/schemas.html
Clearly this is a fraught proposal, as many input files would fail. That shouldn't be a reason not to do it, but it should inform the process by which it is done.
Opinions/thoughts welcome
ping @bschroeter @dougiesquire @kdruken