Functions to save and load RData from ScienceBase #185

Open
aappling-usgs opened this issue Feb 16, 2016 · 4 comments
Comments

@aappling-usgs
Contributor

In line with #164, and to make SB really R-friendly, we could write functions that load native R data from (and save it to?) ScienceBase.

I would use this in mda.streams right away if it were available.

@lawinslow
Contributor

Are you thinking of just binary RData objects, or a more generic data.frame to CSV/TSV?

@aappling-usgs
Contributor Author

I was thinking #164 was about 'more generic data.frame to CSV/TSV', and I think that's also a good idea.

But writing binary data, and keeping it binary, has value of its own. So I'm thinking of this issue as a separate idea.
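To illustrate the point about keeping data binary, here is a minimal base-R sketch (no ScienceBase calls involved; the data.frame and file names are made up for illustration). It compares a save()/load() round trip with a write.csv()/read.csv() round trip for a data.frame holding a factor with custom level order and a Date column.

```r
# A data.frame with column types that plain text does not record.
df <- data.frame(
  site  = factor(c("a", "b", "a"), levels = c("b", "a")),
  date  = as.Date(c("2016-02-14", "2016-02-15", "2016-02-16")),
  value = c(1.5, 2.5, 3.5)
)

rdata_file <- tempfile(fileext = ".RData")
csv_file   <- tempfile(fileext = ".csv")

# Binary round trip: save() writes the object, load() restores it by name.
save(df, file = rdata_file)
env <- new.env()
load(rdata_file, envir = env)
from_rdata <- env$df

# Text round trip: column classes and factor levels are re-guessed on read.
write.csv(df, csv_file, row.names = FALSE)
from_csv <- read.csv(csv_file)

# Compare what came back from each route.
identical(df, from_rdata)
class(from_csv$date)
levels(from_csv$site)
```

The CSV route can be repaired with colClasses and explicit factor() calls, but that knowledge has to live somewhere outside the file, which is the extra value of the binary route.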

What are your thoughts on the naming convention for these direct read-in functions (#185, #164, #165, etc.)?

@lawinslow
Contributor

Hmmm, if we were using R syntax as a parallel, sb_load and sb_save would be a good way to go. On the other hand, all of these functions refer to items, which would push us towards the current sbtools parlance of item_*something*. Just brainstorming here: item_save_rdata and item_load_rdata; item_load_df and item_save_df (or maybe csv).

Do you have any thoughts?
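A rough sketch of what the item_save_rdata / item_load_rdata pair could look like, purely for brainstorming. Everything here is hypothetical: neither function exists in sbtools, and the `upload` and `download` arguments (with the `fake_*` helpers) are local stand-ins for whatever would actually move a file to or from a ScienceBase item, so the example runs offline.

```r
# Hypothetical helpers; not part of sbtools.
# `names` is a character vector of object names, as in save(list = ...).
item_save_rdata <- function(sb_id, names, filename, envir = parent.frame(), upload) {
  force(envir)
  path <- file.path(tempdir(), filename)
  save(list = names, file = path, envir = envir)
  upload(sb_id, path)
  invisible(filename)
}

# Mirrors base::load(): objects are restored into `envir` and their names
# are returned invisibly.
item_load_rdata <- function(sb_id, filename, envir = parent.frame(), download) {
  force(envir)
  path <- download(sb_id, filename)
  load(path, envir = envir)
}

# Local stand-ins so the sketch runs offline: an "item" is just a folder.
fake_sb <- file.path(tempdir(), "fake_sb")
fake_upload <- function(sb_id, path) {
  dir.create(file.path(fake_sb, sb_id), recursive = TRUE, showWarnings = FALSE)
  file.copy(path, file.path(fake_sb, sb_id, basename(path)), overwrite = TRUE)
}
fake_download <- function(sb_id, filename) file.path(fake_sb, sb_id, filename)

# Usage: save one object to a made-up item id, then load it elsewhere.
metab <- data.frame(site = c("a", "b"), gpp = c(1.2, 3.4))
item_save_rdata("item123", "metab", "metab.RData", upload = fake_upload)

restored <- new.env()
loaded_names <- item_load_rdata("item123", "metab.RData",
                                envir = restored, download = fake_download)
```

In a real version the stand-ins would presumably be replaced by the package's existing file upload and download functions plus a `session` argument, but that wiring is left open here.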

@aappling-usgs
Contributor Author

I like that item_load_rdata could eventually be complemented by item_load_tsv, etc. Possible modifications:

A. It's possible in SB (and sometimes sensible) to store multiple data files within a single item. This could be accommodated in a couple of ways:

  1. prefix every related function with item_file, e.g., item_file_load_rdata(sb_id, ..., names, session=current_session()) to be parallel to item_file_download
  2. open a new line of function names starting with file or data or sbdata, e.g., data_load_rdata(filename, sb_id, ...).

B. Either way, could then go the route of

  1. one function per datatype (data_load_rdata, data_load_wfs, etc.) or
  2. a single function with a flag for the datatype, e.g., data_load(filename, sb_id, data_type='rdata', ...)

The choice between B1 and B2 could depend on whether each data type needs specialized arguments, and how many.

I'm leaning toward A2 and B1 at the moment.
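To make the B1 vs. B2 trade-off concrete, here is a small sketch under some assumptions: the function names follow the A2 naming above but are hypothetical, and the ScienceBase download step is left out, so each function takes a local file path instead of (filename, sb_id).

```r
# Hypothetical B1-style readers: one function per data type, each free to
# expose its own specialized arguments.
data_load_rdata <- function(path, envir = new.env()) {
  load(path, envir = envir)
  as.list(envir)            # return the loaded objects as a named list
}
data_load_tsv <- function(path, ...) {
  read.delim(path, ...)     # e.g. colClasses, na.strings
}

# Hypothetical B2-style front end: a data_type flag that dispatches to the
# B1 readers and forwards any type-specific arguments through `...`.
data_load <- function(path, data_type = c("rdata", "tsv"), ...) {
  data_type <- match.arg(data_type)
  switch(data_type,
    rdata = data_load_rdata(path, ...),
    tsv   = data_load_tsv(path, ...)
  )
}

# Usage with local temp files.
tsv_file <- tempfile(fileext = ".tsv")
write.table(data.frame(site = c("a", "b"), gpp = c(1.2, 3.4)),
            tsv_file, sep = "\t", row.names = FALSE, quote = FALSE)
via_b1 <- data_load_tsv(tsv_file, colClasses = c("character", "numeric"))
via_b2 <- data_load(tsv_file, data_type = "tsv",
                    colClasses = c("character", "numeric"))

rdata_file <- tempfile(fileext = ".RData")
x <- 1:3
save(x, file = rdata_file)
objs <- data_load(rdata_file, data_type = "rdata")
```

With B2, the specialized arguments end up hidden in `...`, so they are harder to document and check; with B1 each reader can name them explicitly. A thin B2 wrapper over B1 functions, as above, would allow both.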


2 participants