Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graceful fail in case of vdb.czso.cz not available #41

Open
jlacko opened this issue Mar 12, 2021 · 4 comments
Open

Graceful fail in case of vdb.czso.cz not available #41

jlacko opened this issue Mar 12, 2021 · 4 comments

Comments

@jlacko
Copy link
Contributor

jlacko commented Mar 12, 2021

Recently the VDB server of ČSÚ was unavailable for extended period of time, which triggered timeout failures in CI workflows relying on {czso}.

This behaviour of {czso} may be not fully compliant with CRAN policy of a graceful fail of internet resource not being available.

I have a function written for this, and I would be happy to propose a pull request to {czso} if this would be agreeable to the maintainers (the code is ready, so little extra work would be involved).

For your information the code is available for your information at https://github.com/jlacko/RCzechia/blob/master/R/ok_to_proceed.R

@petrbouchal
Copy link
Owner

petrbouchal commented Mar 12, 2021

Thanks for catching this! Yes, the checking for online resources is incomplete - it needs to check for more undesirable conditions, as your code does, and return more informative feedback, again as your package does. And thanks for volunteering a fix!

I'd like to use a helper I already have elsewhere however. First, because I want to try it out and see if it's right, and it will also be easier to maintain.

Second, I'd like to take a slightly different approach from yours. My reading of the CRAN policy (so that makes sense in light of principles around functions providing predictable return values) is that if a remote resource is unavailable:

  • "failing gracefully" in user-facing code means returning an error with an informative error message (and also not timing out etc.). It seems {httr} docs and rOpensci guidance agree on this.
  • in R CMD check, this should not cause check warnings, i.e. tests and examples need to handle the error and demote to a message, or skip the test/example if the resource is unavailable.

Demoting errors to messages in user-facing code in fact makes it harder for users to figure out what went wrong - then, they end up having to do the checking/fallbacks as they receive unexpected return values if the web resource fails (NULL or FALSE where they expect e.g. a data frame).

More generally, I think if a workflow, even on a CI platform, contains a step depending on remote data, the user expects that workflow to fail if the remote data is unavailable. If they don't want to depend on the data being available, they can handle the error themselves. I think of CRAN as different in this sense because its purpose is to check code on changing underlying systems, not to run a data pipeline.

@jlacko
Copy link
Contributor Author

jlacko commented Mar 12, 2021

Oki, perfect. That's why I asked :)

I use {czso} in a vignette - https://cran.r-project.org/web/packages/RCzechia/vignettes/vignette.html#visualizing-czech-population - to save me the trouble of having to provide a sample data file to plot something (anything) related to Czech okresy.

When I had the vignette rebuilding crash on an unhandled server timeout this Wednesday I was reminded of the troubles I had with CRAN maintainers about the "graceful fail" policy, which seems to have two aims - informing users, and keeping CRAN logs clean of warnings - which are somewhat at odds. Inu...

Since you have the issue under control I am closing it; best of luck & nice weekend!

@jlacko jlacko closed this as completed Mar 12, 2021
@petrbouchal
Copy link
Owner

Thanks - I'll actually keep it open to make sure I get to it...

As for the vignette, I would suggest keeping a static copy of the data in the package? (Shouldn't have that much of a size footprint). Or perhaps precompiling the vignette but that's a bit involved...

@petrbouchal petrbouchal reopened this Mar 12, 2021
@jlacko
Copy link
Contributor Author

jlacko commented Mar 12, 2021

Whatever works for you!

As for the vignette: I had a static file there in the past, but getting the data in by the means of {czso} feels way more elegant. After all, this is 21st century and relying on CSV files of uncertain origin & quality is somewhat passé. So if you don't mind I would keep it the way it is - I had my CI crash on me exactly once in the year or so that I am using {czso}, and the VDB server recovered in a few hours. No big deal...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants