ISO 8601 format check fails on valid dates. #2098
Comments
SAS-3692 |
Same goes for any |
Hi, thanks for reporting. Could you verify whether this is fixed by #2128? |
@Ingmarvdg would you be willing to try out the above fix for your case? |
@m1n0 @Ingmarvdg I can confirm that #2128 corrects the problem. However, I also noticed that ISO 8601 dates should accept 24-hr times, and currently they don't. I am opening another PR to correct that. |
See also #2133 |
Hi @pholser I am using the spark-df version of soda-core, and it seems the problem is not yet corrected:
Worse yet, the error has to do with the scan execution instead of resulting in a fail:
|
@Ingmarvdg -- ok, didn't know about the spark-df version of soda-core. Has it incorporated the above change? I'm satisfied that the iso 8601 date check is improved with the change in soda-core itself. |
The regex being fed to the query still appears to be the old incorrect one. I don't believe you've incorporated the soda-core update above into your setup. |
@Ingmarvdg what version are you using? I doubt the fix is in a released version yet. |
I pulled the most recent version of the main branch, then ran 'pip install .' from the spark-df folder. That should give me your changes, right? |
Perhaps I changed something that didn't affect your bug. |
Try also #2133 I added a format test for "1623-10-11T10:10:10.0000+01:00", and it seems to pass when the core tests are run for spark-df: https://github.com/sodadata/soda-core/actions/runs/9962160324/job/27525299481 |
When checking if a datetime string is ISO 8601 compliant, some valid datetimes fail.
Dates in the 10th month
The date 2020-10-11 fails while the dates 2020-09-11 and 2020-11-12 are fine. It seems this is caused by this section in the regular expression: `?((0[0-9]|1[12])`, which should be `?((0[0-9]|1[0-2])`.
Dates before 1900 or after 2099
The dates 2100-01-01 and 1899-01-01 fail due to this section in the regular expression: `*(19|20)[[:digit:]][[:digit:]]`.
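The two failures described above can be reproduced in isolation. What follows is a minimal Python sketch, not soda-core's actual regex: the month and year fragments are modeled on the ones quoted in this issue, with `[0-9]` standing in for `[[:digit:]]` (Python's `re` does not support POSIX classes). The helper `date_pattern`, the day fragment, and the relaxed `ANY_YEAR` fragment are assumptions made for illustration only.

```python
import re

# Fragments modeled on the ones quoted in the issue.
OLD_MONTH = r"(0[0-9]|1[12])"      # misses month 10
NEW_MONTH = r"(0[0-9]|1[0-2])"     # suggested correction from the issue
OLD_YEAR = r"(19|20)[0-9][0-9]"    # only years 1900-2099
ANY_YEAR = r"[0-9]{4}"             # hypothetical relaxation, not the project's fix


def date_pattern(year: str, month: str) -> re.Pattern:
    """Hypothetical helper: build a bare YYYY-MM-DD matcher (day 01-31)."""
    return re.compile(rf"^{year}-{month}-(0[1-9]|[12][0-9]|3[01])$")


old = date_pattern(OLD_YEAR, OLD_MONTH)
new = date_pattern(ANY_YEAR, NEW_MONTH)

samples = ["2020-09-11", "2020-10-11", "2020-11-12", "1899-01-01", "2100-01-01"]
for value in samples:
    # Print whether each pattern accepts the sample date.
    print(f"{value}  old={bool(old.match(value))}  new={bool(new.match(value))}")
```

With these fragments, the old pattern rejects 2020-10-11, 1899-01-01 and 2100-01-01, while the adjusted one accepts all five samples. Whether the year range should be fully opened up, as `ANY_YEAR` does here, is a design choice the issue leaves to the maintainers.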