-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: schemas #66
WIP: schemas #66
Conversation
Try this instead http://docs.python-cerberus.org/en/stable/index.html |
regolith/tests/test_validators.py
Outdated
from regolith.validators import validate_schema | ||
from regolith.schemas import schemas | ||
from regolith.exemplars import exemplars | ||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 Third party imports go before package imports
python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really a valid requirement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took inspiration for this from conda forge recipes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is crazy if it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually include python as a run requirement in conda recipes, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that is true. I wonder about whether this affects pip, though maybe we don't care 🤷♂️
regolith/validators.py
Outdated
The sequential keys for the data being accessed, used for | ||
printing/debuging errors | ||
""" | ||
if isinstance(record, dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check collections.abc.Mapping
, not dict. Same goes for below.
regolith/validators.py
Outdated
raise ValueError('{} is required in {}'.format(k, keys)) | ||
elif k in record and k in schema: | ||
validate_schema(record[k], schema[k], keys + (k, )) | ||
elif isinstance(record, collections.Iterable) and not isinstance(record, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this really include set and generators? Or is this supposed to be collections.abc.Sequence
?
regolith/validators.py
Outdated
for r in record: | ||
validate_schema(r, schema, keys) | ||
else: | ||
if not isinstance(record, schema['type']): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be an elif, rather than an if inside of an else.
regolith/tests/test_validators.py
Outdated
@@ -0,0 +1,11 @@ | |||
from regolith.validators import validate_schema | |||
from regolith.schemas import schemas | |||
from regolith.exemplars import exemplars |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the exemplars are only used in the tests, then they should be in the test suite, not the package.
regolith/schemas.py
Outdated
@@ -0,0 +1,400 @@ | |||
"""Database schemas""" | |||
schemas = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 global constants should be in UPPERCASE
regolith/validators.py
Outdated
The database record to be tracked | ||
schema : dict | ||
The schema to validate the record against | ||
keys : tuple |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be listed as , optional
regolith/validators.py
Outdated
|
||
for k in total_keys: | ||
if k not in schema: | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it an error if we get a key that we don't expect?
regolith/validators.py
Outdated
The schema to validate the record against | ||
keys : tuple | ||
The sequential keys for the data being accessed, used for | ||
printing/debuging errors |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly speaking, I guess I'd prefer if this returned (bool, str message) rather than raising an error. That way the call site can decide whether or not to raise the error. Also, it allows the message to be appended to so that all of the ways that the record doesn't adhere to the schema can be known with one call to this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is done with cerberus. (validator.errors)
This is a really great first cut @CJ-Wright! Other than the parts listed inline, I think the thing that makes sense (so that info isn't duplicated) is to have the docs in conf.py generate the schema section of the collections pages from the schema itself. I think that this is needed so there aren't two, possibly diverging sources of truth. |
Also, cerebus is interesting, but it seems to not be on conda-forge. |
Ahh, mistyped it! |
Sorry, I just pushed up a cerberus driven version |
regolith/validators.py
Outdated
import os | ||
import re | ||
|
||
from cerberus import Validator | ||
from getpass import getpass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 cerberus is a 3rd party library and so should go below getpass, which is from the standard library and should be grouped with those above.
regolith/validators.py
Outdated
if isinstance(schema, dict): | ||
schema.pop('description', None) | ||
for v in schema.values(): | ||
pop_description(v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't need to do this. Instead subclass validtor and make it skip custom keys in the schema we want to add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent some time trying to do that without much headway, but I will try again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There has to be a way. Popping stuff out of a global, mutable object is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case at least. The alternative is to have a second dict which has the same structure and only contains the descriptions, but that feels wrong too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw there is an open issue for description
pyeve/cerberus#254
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And now a SO question https://stackoverflow.com/questions/47851546/cerberus-additional-schema-keys
regolith/tests/test_validators.py
Outdated
from regolith.validators import validate_schema | ||
from regolith.schemas import SCHEMAS | ||
from regolith.exemplars import EXEMPLARS | ||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 import order
regolith/schemas.py
Outdated
class MyValidator(Validator): | ||
def _validate_description(self, description, field, value): | ||
if False: | ||
self._error(field, "Shouldn't be here") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this function just be a pass
?
regolith/schemas.py
Outdated
return v.validate(record), v.errors | ||
|
||
|
||
EXEMPLARS = {'abstracts': {'_id': 'Mouginot.Model', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 Constants should be at the top of the file.
regolith/fsclient.py
Outdated
@@ -48,10 +49,14 @@ def dump_yaml(filename, docs, inst=None): | |||
"""Dumps a dict of documents into a file.""" | |||
inst = YAML() if inst is None else inst | |||
inst.indent(mapping=2, sequence=4, offset=2) | |||
for doc in docs.values(): | |||
my_sorted_dict = ruamel.yaml.comments.CommentedMap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove the my_
from this variable.
docs/collections/index.rst
Outdated
auto/people | ||
auto/projects | ||
auto/proposals | ||
auto/students |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd really prefer these not go into an auto folder. It doesn't seems necessary.
docs/collections/index.rst
Outdated
grades | ||
grants | ||
auto/grades | ||
auto/grants | ||
jobs | ||
news |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these not generated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't made schemas for them yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha OK - should that be a requirement for this PR? Or do you want to delay it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping to delay it. I'm expecting a couple more takes at this as I start using more of the dbs so I might come around to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me!
regolith/schemas.py
Outdated
'email': '[email protected]', | ||
'university_id': 'HAP42'}} | ||
|
||
SCHEMAS = {'abstracts': {'_description': {'description': 'Abstracts for a conference or workshop. This is generally public information\n\n'}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
many lines too long here, also probably shouldn't have \n\n
at the end of them. That can be added in the doc generator
@@ -6,6 +6,8 @@ | |||
|
|||
from flask import Flask, abort, request, render_template, redirect, url_for | |||
|
|||
from regolith.schemas import validate | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PEP8 two newlines required after import lines
regolith/app.py
Outdated
n = os.path.join(td.name, 'regolith.txt') | ||
print(errors) | ||
print('Writing text file to {}. ' | ||
'Please try again.'.format(n)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these prints should go into the body of the error message below. At the very least, they should be printed to stderr....
regolith/app.py
Outdated
print(errors) | ||
print('Writing text file to {}. ' | ||
'Please try again.'.format(n)) | ||
with open(n) as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file isn't opened in write mode. Do you have any tests for the invalid case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have any tests for the app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made an issue #69
regolith/app.py
Outdated
print('Writing text file to {}. ' | ||
'Please try again.'.format(n)) | ||
with open(n) as f: | ||
f.write(form['body']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
regolith/tests/test_validators.py
Outdated
@@ -0,0 +1,8 @@ | |||
import pytest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please move the tests files to a top-level tests/ dir, rather than embedding them in the project dir? Thanks!
Thanks! Some more minor comments. |
@scopatz ready for another round! |
Looks good to go! Can you please provide a news entry? |
news/schema.rst
Outdated
@@ -0,0 +1,16 @@ | |||
**Added:** | |||
|
|||
* Schemas for the tables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not particularly informative. Can you please expand on it. Also, aren't we calling them collections, not tables?
required fields are filled and the values are the same type(s) listed in the | ||
schema. The schema also includes descriptions of the data to be included. | ||
The exemplars are examples which have all the specified fields and are | ||
used to check the validation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not valid rst, which requires the paragraph to be indented after the bullet point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll merge this and fix it though.
No description provided.