Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong priority for Optional param #23

Open
apieceofredcloth opened this issue Sep 13, 2013 · 6 comments
Open

Wrong priority for Optional param #23

apieceofredcloth opened this issue Sep 13, 2013 · 6 comments

Comments

@apieceofredcloth
Copy link
Contributor

>>> schema = Schema({Optional('id'): Use(int), str: str})
>>> schema.validate({'id': '23', 'other': 'asf'})

>>> {'id': '23', 'other': 'asf'}

I think value of id should be converted to integer 23 instead of string '23'("by Use(int)").But "str: str" has a higher priority than "Optional('id')", so it just ignores the "Use(int)" statement.
I check the code about priority

def priority(s):
    if type(s) in (list, tuple, set, frozenset):
        return 6
    if type(s) is dict:
        return 5
    if hasattr(s, 'validate'):
        return 4
    if type(s) is type:
        return 3
    if callable(s):
        return 2
    else:
        return 1

Optional has a 'validate' attribute which is inherited from Schema class, so it return 4 here, higher than 3 of the str case.

@keleshev
Copy link
Owner

So you propose switching around the priorities?

I wonder, what impact that would have on other cases...

@apieceofredcloth
Copy link
Contributor Author

I am afraid of that impact will bite too.Currently, I think "hasattr(s, 'validate')"(Optional, And, Or...) is more specifical than "is type"(str, object...)

@sirex
Copy link

sirex commented Jul 29, 2014

Same issue here. But in my case, even worse!

Here is what I have:

import schema

s = schema.Schema({
    'a': Use(int),
    'b': Use(int),
    schema.Optional('c'): Use(int),
    schema.Optional(object): object,
})

Here I expect that 'a', 'b' are always required, 'c' is optional, and all other parameters are optional. I mean, if 'c' is present, it should be always validated. But look what happens:

keys = sorted([(schema.priority(k), k) for k, v in s._schema.items()])
import pprint; pprint.pprint(keys)
[(1, 'a'),
 (1, 'b'),
 (4, Optional(<type 'object'>)),
 (4, Optional('c'))]

It means, that 'c' will never be validated. What is even worse, that this creates race condition, since all Optional keys will be validated in random order ('c' gets validated randomly), that's why it took me 4 hours, to debug this issue, when my tests started to break randomly.

@acaprari
Copy link

I also have the same issue. I'm trying to validate some query arguments from a web request:

s = Schema({
        Optional('start'): Use(int),
        basestring: object
    })

s.validate({'start': '1'})
{'start': '1'}   # matched basestring: object

'1' is not converted to int because basestring matches before Optional('start')

As already pointed out by @apieceofredcloth, I would suggest to increase the priority of the hasattr(s, 'validate') case.

Another solution may be to have the classes derived from Schema (such as Optional) expose the same priority of the underlying schema, so Optional('start') would have a priority of 1, while Optional(str) would have a priority of 3.

I can create a pull request if needed.

@marc2982
Copy link

This isn't fully fixed by #52. Here is a unit test that (sometimes) reproduces the problem i'm seeing on my machine. I'm not sure why it doesn't consistently hit?

diff --git a/test_schema.py b/test_schema.py
index 4ec9993..9feebf8 100644
--- a/test_schema.py
+++ b/test_schema.py
@@ -154,6 +154,9 @@ def test_dict_optional_keys():
     # Make sure Optionals are favored over types:
     assert Schema({basestring: 1,
                    Optional('b'): 2}).validate({'a': 1, 'b': 2}) == {'a': 1, 'b': 2}
+    assert Schema({str: {
+        Optional('a', default=False): bool, Optional(str): object}}).\
+            validate({'c': {'a': True, 'b': 2}}) == {'c': {'a': True, 'b': 2}}

Test output:

E       assert {'c': {'a': False, 'b': 2}} == {'c': {'a': True, 'b': 2}}
E         Differing items:
E         {'c': {'a': False, 'b': 2}} != {'c': {'a': True, 'b': 2}}
E         Use -v to get the full diff

The reason for this is the ordering. I printed out the ordered keys and their relative priority with this diff:

diff --git a/schema.py b/schema.py
index f36ab1d..fd9df58 100644
--- a/schema.py
+++ b/schema.py
@@ -116,6 +116,8 @@ class Schema(object):
             covered_optionals = set()
             # for each key and value find a schema entry matching them, if any
             sorted_skeys = list(sorted(s, key=priority))
+            print('sorted_skeys', sorted_skeys)
+            print([(y, priority(y)) for y in sorted_skeys])
             for key, value in data.items():
                 valid = False
                 skey = None

Output:

('sorted_skeys', [<type 'str'>])
[(<type 'str'>, 3)]
('sorted_skeys', [Optional(<type 'str'>), Optional('a')])
[(Optional(<type 'str'>), 2), (Optional('a'), 2)]

The correct ordering should be: [(Optional('a'), 2), (Optional(<type 'str'>), 2)]

I whipped together a quick patch, but it feels dirty and I don't think it fixes the general case...

diff --git a/schema.py b/schema.py
index f36ab1d..c9c666e 100644
--- a/schema.py
+++ b/schema.py
@@ -73,7 +73,8 @@ class Use(object):
             raise SchemaError('%s(%r) raised %r' % (f, data, x), self._error)


-COMPARABLE, CALLABLE, VALIDATOR, TYPE, DICT, ITERABLE = range(6)
+COMPARABLE, CALLABLE, VALIDATOR_VAL, VALIDATOR_TYPE, TYPE, DICT, ITERABLE = \
+    range(7)


 def priority(s):
@@ -85,7 +86,9 @@ def priority(s):
     if issubclass(type(s), type):
         return TYPE
     if hasattr(s, 'validate'):
-        return VALIDATOR
+        if hasattr(s, '_schema') and issubclass(type(s._schema), type):
+            return VALIDATOR_TYPE
+        return VALIDATOR_VAL
     if callable(s):
         return CALLABLE
     else:
@@ -163,7 +166,7 @@ class Schema(object):
                 return data
             else:
                 raise SchemaError('%r should be instance of %r' % (data, s), e)
-        if flavor == VALIDATOR:
+        if flavor in (VALIDATOR_TYPE, VALIDATOR_VAL):
             try:
                 return s.validate(data)
             except SchemaError as x:

@dylanninin
Copy link
Contributor

These days, I have encountered the same problem as below

def test_optional_key_convert_failed_randomly_while_with_another_optional_object():
    """
    In this test, created_at string "2015-10-10 00:00:00" is expected to be converted to a datetime instance.
        - it works when the schema is

            s = Schema({
                    'created_at': _datetime_validator,
                    Optional(basestring): object,
                })

        - but when wrapping the key 'created_at' with Optional, it fails randomly
    :return:
    """
    import datetime
    datetime_fmt = '%Y-%m-%d %H:%M:%S'
    _use_datetime = lambda i: datetime.datetime.strptime(i, datetime_fmt)
    _datetime_validator = Or(None, Use(_use_datetime))
    s = Schema({
        Optional('created_at'): _datetime_validator,
        Optional(basestring): object,
    })
    data = {
        'created_at': '2015-10-10 00:00:00'
    }
    validated_data = s.validate(data)
    # is expected to be converted to a datetime instance, but fails randomly(most of the time)
    # assert isinstance(validated_data['created_at'], datetime.datetime)

    assert isinstance(validated_data['created_at'], basestring)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants