Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

overhaul with new approach #21

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

overhaul with new approach #21

wants to merge 8 commits into from

Conversation

pearsonca
Copy link
Collaborator

Chucks the old approach, in with the new.

@pearsonca pearsonca requested a review from seabbs August 28, 2024 01:22
@pearsonca
Copy link
Collaborator Author

I can deal with the linting issues, but not sure about the test-coverage - my end, the tests work (though they did some troubleshooting).

@seabbs
Copy link
Collaborator

seabbs commented Aug 28, 2024

et.hash(hash, get(".hash.salt", envir = parent.frame()))

  1. ├─base::set.seed(...)
    
  2. └─base::readBin(digest(obj, algo = "xxhash32", raw = TRUE), "integer")
    
  3.   └─base::file(con, "rb")
    

Nice! Will take a look tonight.

This is the error. Is the test CI missing a dependency?

@pearsonca
Copy link
Collaborator Author

This is the error. Is the test CI missing a dependency?

Not sure how that gets specified - the DESCRIPTION and NAMESPACE dependencies correctly connect to digest. Whereabouts do CI package dependencies live? Presumably somewhere in the YAML configs.

@seabbs
Copy link
Collaborator

seabbs commented Aug 28, 2024

If they are in the DESCRIPTION it should pick them up. Platform specific issue maybe?

Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionality wise this looks good. I still have big reservations about the design (i.e overwriting base functions)

formals(.FUN) <- c(formals(rFUN), alist(hash = ))
dispatch <- as.character(substitute(rFUN))
hashc <- str2lang("set.hash(hash, get('.hash.salt', envir = parent.frame()))")
origc <- str2lang(sprintf("%s%s%s(%s)",dispatch[2],dispatch[1],dispatch[3],toString(names(formals(rFUN)))))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bunch of linting issues here

#'
#' @export
set.salt <- function(salt) {
assign(".hash.salt", salt, envir = parent.frame())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

have you thought about doing this with options? I think it might be a lot more cran compliant

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little bit, but the idea is to have a scope-driven salt value - that probably means setting something in an environment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but options can do that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or an environment variable (options -> environment variable would be the classic approach I think)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this approach is an environment variable?

i don't think the options version let's you do this cleanly - unsetting the variable when leaving a scope seems messy?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is its a normal variable?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I mean (though I didn't mean via this nice little wrapper package): https://cran.r-project.org/web/packages/options/vignettes/envvars.html

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, i should say: it's a normal variable, in the r-environment-scoping sense. so when r does its normal search enclosing scopes to find a token, it gets found. it also gets cleaned up etc like normal, & can exist in closer scope, while outer scope has its own (masked until that scope is returned to).

.lintr Show resolved Hide resolved
export(rbinom)
export(rcauchy)
export(rchisq)
export(rexp)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an issue to approach not needing to do this? It seems like quite an unsafe design and one that I don't think will fly with CRAN

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm relatively confident this is actually what users want - minimal modification to their code to use this capability.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great but I don't think you will be able to get it on CRAN!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷 give the people what they want.

that said: i agree there are tradeoffs here. my goal: enable the user to switch matching on and off, without having to change code. this approach currently only makes the refactor easy-ish. so:

  • make the refactor easier: provide a bit of static analysis to find invocations of the overriden functions, that don't have a hash = ... argument
  • make the switching (back to base) possible with an option / package environment variable
  • be a bit noisier when loading the package

i think adding those also helps allay some of the overloading concerns. I think there's one other, harder problem to solve: making it possible to use hashprng within another package (that does some thing that requires matching) without having that leak out of that package.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as suggested elsewhere if going to do this I wouldn't do it via reexporting but instead have a overload function that users can use and then use this is an option etc to overload a list of common r functions (i.e these)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you'll still have to assign to the global environment which isn't great but at least its a bit more in line with normal practice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, i don't recall that suggestion. makes some sense - so basically, users would do:

library(hashprng)
hashoverload() # defaults to everything hashprng knows about
... some code ...
hashunload() # turns off overloads
# or hashoverload(c("unif", "binom")) # runif and rbinom only
# or hashoverload(c("custom")) # wraps some rcustom that it finds in the environment

that's a little more onerous, but definitely less likely to annoy the CRAN gods.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still doesn't deal well with people having re-written their code to have hash = ... arguments, that need to be ignored if they've switched off matching.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for the latter I would enforce they actually use a wrapper around their dists (that can have args the dists don't have) and then they can turn hashing on and off using an option.

It means they have to rewrite their code but they can easily the turn it on and off if they are so minded?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one aside: R CMD check passes just fine.

@seabbs
Copy link
Collaborator

seabbs commented Sep 18, 2024

s live? Presumably somewhere in the YAML configs.

Yes if different to the DESCRIPTION ones

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants