write quick start guide for shiny app #10

Closed
sheilasaia opened this issue Jun 13, 2019 · 17 comments

@sheilasaia
Collaborator

No description provided.

@sheilasaia sheilasaia changed the title write quick start guide write quick start guide for shiny app Jun 13, 2019
@sheilasaia sheilasaia added the documentation Improvements or additions to documentation label Jun 13, 2019
@clnsmth
Member

clnsmth commented Jun 16, 2019

The in-depth guide will be a vignette. I've initialized /vignettes/interactive_report.Rmd for this purpose.

@CoastalPlainSoils
Collaborator

Sure, I will do my best based on what we discussed and what we have put together so far.

@CoastalPlainSoils
Collaborator

CoastalPlainSoils commented Jul 11, 2019

As mentioned in the call today, I will look at Li's email (which I have received) and run the GUI to draft the quick start guide. As also discussed, I will work with Sheila, Li, and Jason to get feedback on the quick start guide. Below is the vignette text draft I sent to Colin earlier today.

I know that the text below still needs to clarify whether this will be a web-based application, an R package, or both.

Please feel free to provide any comments on this text.

Overview

Welcome to the Environmental Data Initiative’s (EDI) interactive web application for data description and exploration. This application was developed during the 2019 Hackathon event, which took place June 9 to 13 in Albuquerque, New Mexico. This web application was released on XXXX, 2019.

This package allows users to explore the suitability of a data file or package for further inquiry by using R, an open source statistical software program, and the R Shiny application. Additional R packages are used to make this application possible and are identified later in this overview. Users identify data files or packages of interest by passing a digital object identifier (DOI) or data file to a graphical user interface (GUI). The user can then browse summary reports and generate exploratory plots.

The goal of the 2019 Hackathon was to improve methods to visualize data. In today’s data-driven and data-generating world, massive amounts of data are generated and accessible, yet only a small fraction is ever interpreted. Here, users can review data from an existing site or supply a dataset and review its contents efficiently. By creating such a tool, EDI and the Hackathon participants hope to increase the number of existing datasets being reviewed and interpreted, and so contribute to the overall progression of science. This work adheres to the Findable, Accessible, Interoperable, Reusable (FAIR) principles.

The application has two main functions: a static report and an interactive application. With the static report, the application runs through a given dataset and creates a summary table of the provided data. It creates graphs of the identified variables and graphs of the NA distributions (if applicable). With the interactive portion of the application, the user chooses the variables to plot and the type of plot, and so builds a visual analysis of a given dataset or DOI.

This project assists researchers and other data users who wish to reuse existing data packages that are archived on DataONE member nodes. This R Shiny application (built on the framework provided by ggplotgui) 1) downloads data packages identified by DOIs registered on DataONE member nodes, 2) reads data into a lightweight viewer, 3) provides summary statistics and basic graphics describing the data package, and 4) generates a more robust report describing the data. It is not intended to replace a full analysis in R or a comparable statistical package. Instead, it is intended to let the user quickly assess whether the data is suitable for their needs.

Note: access to data packages on DataONE member nodes is provided through the package metajam.

This application would not have been possible without the Hackathon 2019 event and the hard work and dedication of the following participants: Colin A. Smith, Alesia Hallmark, Li Kui, Jason J. Mercer, An T. Nguyen, John H. Porter, Sheila Saia, Kathe Todd-Brown, Kristin Vanderbilt, and Jocelyn Wardrup.

In addition, we wish to credit the author of a pre-existing application, Gert Stulp, whose application was edited (with Dr. Stulp’s permission) to meet the purpose and satisfy the goals of this project. That pre-existing application is accessible via this link: https://site.shinyserver.dck.gmw.rug.nl/ggplotgui/

Application information: This application runs on R version XXXX, and utilizes the following packages: XXXXXXX, XXXXXXX, XXXXXXX, XXXXXXX, XXXXXXX, XXXXXXX, XXXXXXX.

Please credit this work as the following when appropriate:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

If you have any questions regarding the application, find any areas for improvement, or wish to provide feedback, please contact Colin A. Smith ([email protected]) or Kristin Vanderbilt ([email protected]).

The package attempts to address some common shortcomings of data formats to aid suitability evaluation. Data is highly diverse and we provide no guarantee this will actually work for a given data package, but it’s worth a shot!

Static Report

The static report is a brief snippet of a given dataset. It provides general information such as the min, max, mean, and 1st and 3rd quartiles, along with the number of observations and the number of NAs for each variable. It also provides graphs for the variables. The static report is limited to XXXXX variables.

Please note some static reports can take some time to be generated. The web interface will indicate the progression of the generation of the report.

The code for this interactive interface identifies several different formats of geographic coordinates and several different formats of dates/times. Please note that not all formats will be recognized.

Interactive Report

The interactive report allows the user to choose variables and plot styles. Through the XXXXXXXX tab, a user can select boxplot, density, dot + error, dotplot, histogram, scatter, or violin plots. X and y variables can be selected, legend added, font changed, and titles created.

Similar to the static report, the code for this interactive interface identifies several different formats of geographic coordinates and several different formats of dates/times. Please note that not all formats will be recognized.

@wetlandscapes
Collaborator

Great information! Thanks for organizing @CoastalPlainSoils.

I have a couple of organizational comments, but these are just opinions, so do with them what you will:

  1. Some of the attribution information (e.g., about EDI, making the app) should be placed elsewhere. There are two tabs in the GUI: "Help" and "About". The attribution information should probably go in the "About" tab, while a lot of the quick-start info should go in the "Help" tab. We should probably change the name of the "Help" tab to "Quick Start" (or similar), too.
  2. Rather than focus on the main functionalities of the app, I will suggest that we focus on the tabs of the app, and describe the functionalities therein. In this context we would have 7 sections in the Quick Start document. Below is an example of how those tabs could be organized, as well as a couple of ideas related to the kinds of information the sections could contain.

Raw Data

  • Something about downloading data and which data archive services are currently supported:
  • Maybe mention that we support anything that metajam supports.

Summary Report

  • What is contained on this page?
  • Static report functionality (taken from the original explanation):

The static report is a brief snippet of a given dataset. It provides general information such as the min, max, mean, and 1st and 3rd quartiles, along with the number of observations and the number of NAs for each variable. It also provides graphs for the variables. The static report is limited to XXXXX variables.

Please note some static reports can take some time to be generated. The web interface will indicate the progression of the generation of the report.

The code for this interactive interface identifies several different formats of geographic coordinates and several different formats of dates/times. Please note that not all formats will be recognized.

  • Interactive report:

The interactive report allows the user to choose variables and plot styles. Through the XXXXXXXX tab, a user can select boxplot, density, dot + error, dotplot, histogram, scatter, or violin plots. X and y variables can be selected, legend added, font changed, and titles created.

Similar to the static report, the code for this interactive interface identifies several different formats of geographic coordinates and several different formats of dates/times. Please note that not all formats will be recognized.

Plot

  • A blurb about plotting functionalities and what some of the buttons and drop-downs mean.

Interactive Plot

  • Something about interactive plotting.
  • Maybe a warning that not more than 100,000 points should be plotted, else the app gets really slow.

R-code

  • Quick blurb as to what information this tab provides.

Help (or Quick Start)

  • This guide.

About

  • Attribution information.
  • We probably don't need to duplicate the information already contained in the About tab.

@clnsmth
Member

clnsmth commented Jul 12, 2019

Looks great @CoastalPlainSoils, and I second @wetlandscapes' comments.

Consider creating this documentation in /datapie/vignettes as an .Rmd file, which outputs .html that can be referenced by the GUI.

@atn38
Member

atn38 commented Jul 12, 2019

Summary Report

  • What is contained on this page?
  • Static report functionality (taken from the original explanation):

The static report is a brief snippet of a given data table within the data package. It provides general table-level information: the number of observations and NAs per variable; the min, max, mean, and 1st and 3rd quartiles for numeric variables; and the number and distribution of levels for categorical variables. The static report provides appropriate plots to assess data availability and summary. The static report is limited to XXXXX variables.
Please note some static reports can take some time to be generated. The web interface will indicate the progression of the generation of the report.
The processing for this report identifies several different formats of geographic coordinates and several different formats of dates/times. Please note that not all formats will be recognized.
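
The table-level summary described above could be sketched in base R roughly as follows. This is only an illustration: `summarize_table` and the `example` data frame are hypothetical names, and this is not the code the report actually uses.

```r
# Hypothetical helper, not datapie's actual code: one summary row per column.
summarize_table <- function(df) {
  rows <- lapply(names(df), function(col) {
    x <- df[[col]]
    is_num <- is.numeric(x)
    # Quartiles are only computed for numeric columns.
    q <- if (is_num) {
      quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
    } else {
      c(NA_real_, NA_real_)
    }
    data.frame(
      variable = col,
      n_obs    = length(x),
      n_na     = sum(is.na(x)),
      min      = if (is_num) min(x, na.rm = TRUE) else NA_real_,
      q1       = q[1],
      mean     = if (is_num) mean(x, na.rm = TRUE) else NA_real_,
      q3       = q[2],
      max      = if (is_num) max(x, na.rm = TRUE) else NA_real_,
      # Number of distinct non-missing levels, for non-numeric columns only.
      n_levels = if (is_num) NA_integer_ else length(unique(x[!is.na(x)])),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Made-up example table with one numeric and one categorical variable.
example <- data.frame(
  temp_c = c(12.5, 14.0, NA, 15.5, 13.0),
  site   = c("A", "B", "B", NA, "A"),
  stringsAsFactors = FALSE
)

report <- summarize_table(example)
print(report)

# Distribution of levels for the categorical variable, including NAs.
print(table(example$site, useNA = "ifany"))
```

Keeping numeric and categorical columns in one table with NA placeholders is just one possible layout; the real report may present them separately.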

@CoastalPlainSoils I edited the section on static reports (edited in bold). Note that functionalities mentioned in italicized text aren't implemented yet 🤷‍♂.

@clnsmth
Member

clnsmth commented Jul 26, 2019

Hi @CoastalPlainSoils. Please add the quick start guide to the quick_start_guide.Rmd file in the package directory vignettes. This will enable .html content rendering that can be easily added to the UI and website.

@CoastalPlainSoils
Collaborator

Thank you for your comments. I am working on this now. I had some issues figuring out how to run the app to complete the documentation, but I have the app running now, success! I have finished the "About" tab information and I am working on the Quick Start guide. I liked the idea of organizing it by tabs, thank you Jason, and thank you for your edits, An. Colin, I will try to find that file and put everything in correctly. I might be asking you for help! I'll keep you posted.

@CoastalPlainSoils
Collaborator

Colin: the exact file you referenced is not in the folder. I looked at one of the vignette documents, and I think I might need some guidance to make sure I insert the information correctly.

@CoastalPlainSoils
Collaborator

In the meantime, here is what I have completed for the About tab. Please look it over and let me know of any edits, errors, etc.

About Tab

EDI Data Viewer “datapie” (because it is so pleasant!)

Information Page

Background:

Welcome to the Environmental Data Initiative’s (EDI) interactive web application for data description and exploration. This application was developed during the 2019 Hackathon event, which took place June 9 to 13 in Albuquerque, New Mexico. This web application was released on XXXX, 2019.

Purpose:

This package (“datapie”) and web interface application allow users to explore the suitability of a data file or package for further inquiry by using R, an open source statistical software program, and the R Shiny application. Additional R packages are used to make this application possible and are identified in this overview. Users identify data files or packages of interest by passing a digital object identifier (DOI) or data file to a graphical user interface (GUI). The user can then browse summary reports and generate exploratory plots.

The goal of the 2019 Hackathon was to improve methods to visualize data. In today’s data-driven and data-generating world, massive amounts of data are generated and accessible, yet only a small fraction is ever interpreted. Here, users can review data from an existing site or supply a dataset and review its contents efficiently. By creating such a tool, EDI and the Hackathon participants hope to increase the number of existing datasets being reviewed and interpreted, and so contribute to the overall progression of science. This work adheres to the Findable, Accessible, Interoperable, Reusable (FAIR) principles.

Use of this Application/Package:

The application has two main functions: a static report (summary report tab) and an interactive application (interactive plot tab). With the static report, the application runs through a given dataset and creates a summary table of the provided data. It creates graphs of the identified variables and graphs of the NA distributions (if applicable). With the interactive portion of the application, the user chooses the variables to plot and the type of plot, and so builds a visual analysis of a given dataset or DOI.

This project assists researchers and other data users who wish to reuse existing data. This R Shiny application (built on the framework provided by ggplotgui):
1) downloads data identified by DOIs registered on DataONE, EDI, and the Long Term Ecological Research Network (LTER), along with any other data compatible with the R package “metajam”;
2) reads data into a lightweight viewer;
3) provides summary statistics and basic graphics describing the data package; and
4) generates a more robust report describing the data.
It is not intended to replace a full analysis in R or a comparable statistical package. Instead, it is intended to let the user quickly assess whether the data is suitable for their needs.

Acknowledgements:

This application would not have been possible without the Hackathon 2019 event and the hard work and dedication of the following participants: Colin A. Smith, Alesia Hallmark, Li Kui, Jason J. Mercer, An T. Nguyen, John H. Porter, Sheila Saia, Kathe Todd-Brown, Kristin Vanderbilt, and Jocelyn Wardrup.

In addition, we wish to credit the author of a pre-existing application, Gert Stulp, whose application was edited (with Dr. Stulp’s permission) to meet the purpose and satisfy the goals of this project. That pre-existing application is accessible via this link: https://site.shinyserver.dck.gmw.rug.nl/ggplotgui/

Further thanks to Wilmer Joling for setting up the website, which is based on the magical but incomprehensible docker. Thanks to Hadley Wickham for making such good packages (and open access books describing them) that allow even low-skilled and low-talented programmers to contribute to R.

Application information: This application runs on R version 3.3.2 and uses the following packages: ggplot2, shiny, stringr, plotly, readr, readxl, haven, and RColorBrewer.

Citation:

Please credit this work as the following when appropriate:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Questions:

If you have any questions regarding the application, find any areas for improvement, or wish to provide feedback, please contact Colin A. Smith ([email protected]) or Kristin Vanderbilt ([email protected]).

Closing Remarks:

The package attempts to address some common shortcomings of data formats to aid suitability evaluation. Data is highly diverse and we provide no guarantee this will actually work for a given data package, but it’s worth a shot!

Thank you for utilizing this product and we hope it assists you in the progression of beneficial science and interpretation of data. Cheers, and enjoy the datapie!

@CoastalPlainSoils
Collaborator

Here is the Quick Start Guide I have created based upon the tabs and the input received. Please note I will be incorporating this into a .Rmd file, but I am waiting for further instruction before doing so. I tried to play around with the app, but I could not really get the plot or interactive plot to work with the example dataset or the dataset I provided. So when I am able to visualize that I can expand on those sections.

Notes to myself or things to clarify/edit are in all caps and missing things are XXXXXX's.... Again if you have comments, questions, expansions on these, feel free to let me know and I will incorporate them.

I hope both the Information/About page and this Quick Start Guide are what everyone was envisioning.

Quick Start Guide Tab

EDI Data Viewer “datapie” (because it is so pleasant!)

Quick Start Guide

Not sure how this thing works or need clarification about a particular tab/process?

You have come to the right place! Here, we go over what you need to know to process your data and visualize a given dataset. The information herein is organized according to the tabs on this data viewer: “Raw Data”, “Summary Report”, “Plot”, “Interactive Plot”, “R-code”.

If you do not find the answer you are looking for on this page, please visit the “About” tab and refer to the package/viewer contacts.

Raw Data:

What data can I view on here?
Any dataset that is supported through the R package “metajam”
Any dataset from the data archive services of:
DataONE
Environmental Data Initiative (EDI)
Long Term Ecological Research Network (LTER)

Or…. Upload your own dataset and see what happens! It just might work! We don’t know the format, etc. of your dataset so it is hard for us to say if it will work or not. Generally, the dataset should be formatted with the column names at the top followed by the data in rows for it to work. Accepted file formats are detailed below.

How it works:

On the left side of the application are three options:
“Load sample data”, “Fetch data from DOI”, or “Upload text file”.

Load sample data - allows a user to download the example dataset provided.

Fetch Data from DOI - a user enters a given DOI (Digital object identifier) and clicks “Fetch Data”.

Upload text file - click “Browse” to find and upload one of five file types: text (csv), Excel, SPSS, Stata, or SAS. Then select the delimiter (the character that separates values within the file) from four options: Semicolon, Tab, Comma, or Space. Select “Submit datafile” and the data file will appear on the right. If your uploaded data file does not appear the way you wish, you may need to make a copy of the file and edit the headers or delete rows at the top that contain study information. That information is important, but this application cannot know how many rows to skip in any given dataset. The column names at the top of the page are taken from the first row of the uploaded file.
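
For anyone who wants to check a text file in R before uploading it, the delimiter choice and the extra rows at the top of the file can be reproduced with base R. The sketch below is only an illustration under assumptions: `read_delimited`, the `delimiters` lookup, and the demo file are hypothetical and are not taken from the app's code.

```r
# Hypothetical lookup mirroring the four delimiter options in the GUI.
delimiters <- c(Semicolon = ";", Tab = "\t", Comma = ",", Space = " ")

# Hypothetical helper: read a delimited text file, optionally skipping
# leading rows of study information before the header row.
read_delimited <- function(path, delimiter = "Comma", skip = 0) {
  read.table(path, header = TRUE, sep = delimiters[[delimiter]],
             skip = skip, stringsAsFactors = FALSE)
}

# Demo file: two lines of study notes, then a semicolon-delimited table.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("Study: example site survey",
             "Collected 2019",
             "site;depth_cm;ph",
             "A;10;6.5",
             "B;20;7.1"), demo_file)

soil <- read_delimited(demo_file, delimiter = "Semicolon", skip = 2)
print(soil)
```

Reading the same file with the wrong delimiter, or without skipping the two note lines, is a quick way to see why an upload might not display as expected.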

One can increase or decrease the number of rows shown on a given page with the options of 10, 25, 50, and 100.

Summary Report:

Once the data is loaded, click “Generate report”. A summary report of the provided data will appear on the right side of the page.

The summary report displays a brief snippet of a given data table within the data package. It provides general table-level information: the number of observations and NAs per variable; the min, max, mean, and 1st and 3rd quartiles for numeric variables; and the number and distribution of levels for categorical variables. REVIEW TO MAKE SURE CORRECT… This report is limited to XXXXX variables.

Please note some reports can take some time to be generated. The web interface will indicate the progression of the generation of the report. - IS THIS CORRECT?

The processing for this report identifies several different formats of geographic coordinates and several different formats of dates/times. Please note that not all formats will be recognized.
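
One plausible way to recognize a date column is to try a short list of candidate formats and accept the first one that parses every non-missing value. The base R sketch below illustrates that idea only. The `candidate_formats` list and `detect_date_format` are hypothetical, and the formats the app actually recognizes may differ.

```r
# Hypothetical candidate formats; the app's real list may differ.
candidate_formats <- c("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")

# Hypothetical helper: return the first format that parses every
# non-missing value in a character vector, or NA if none does.
detect_date_format <- function(x, formats = candidate_formats) {
  values <- x[!is.na(x) & nzchar(x)]
  if (length(values) == 0) return(NA_character_)
  for (fmt in formats) {
    parsed <- as.Date(values, format = fmt)
    if (!anyNA(parsed)) return(fmt)
  }
  NA_character_
}

iso_dates <- c("2019-06-09", NA, "2019-06-13")
us_dates  <- c("06/09/2019", "06/13/2019")
not_dates <- c("soil", "water")

print(detect_date_format(iso_dates))
print(detect_date_format(us_dates))
print(detect_date_format(not_dates))
```

A first-match rule like this cannot resolve ambiguous values such as 01/02/2019, which is one reason not every format can be recognized reliably.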

If you are satisfied with the generated report, click “Download report (HTML)”. This HTML can be saved or printed to PDF. Images within the HTML can be saved individually by the user.

Plot:

This portion of the application allows the user to create and edit plots in an easy-to-use format, with the option to save the plot created.

In the central portion of the page, a static plot is generated. On the left side of the page, the user can select the type of plot (boxplot, histogram, scatter) and the x and y variables.

Based upon what plot is selected, the user can choose between a variety of options. EXPLAIN OPTIONS IN MORE DETAIL…..

On the right side of the page the user is given the option to change the aesthetics of the plot. Tabs shown are: “Text”, “Theme”, “Legend”, and “Size”. Here the labels on the graph’s axes can be changed, a title may be added, font sizes can be adjusted, text can be rotated, colors selected, gridlines removed, legend edited, and the size of chart adjusted.

The user is given the option to download a pdf or tiff file format of the figure.

Interactive Plot:

The interactive plot gives the user the same options as the plot, however is different in that….. XXXXXXXXXXXXXXX

Please note datasets with greater than 100,000 points will take longer to plot and should be avoided by users. We recommend dividing up your datasets if this is the case.
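
Dividing up a large table before uploading it can be done in a few lines of base R. This is a minimal sketch that assumes one plotted point per row. `split_rows` is a hypothetical helper and the `big` data frame is made-up example data.

```r
# Hypothetical helper: split a data frame into consecutive chunks of
# at most max_points rows each.
split_rows <- function(df, max_points = 100000) {
  chunk_id <- ceiling(seq_len(nrow(df)) / max_points)
  split(df, chunk_id)
}

# Made-up table with 250,000 rows.
big <- data.frame(x = seq_len(250000), y = sin(seq_len(250000) / 1000))

parts <- split_rows(big)
print(sapply(parts, nrow))
```

Each element of `parts` could then be written out with `write.csv()` and loaded separately. Consecutive chunks keep the row order, which suits time series; a random sample would be an alternative when an overall impression of the data is enough.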

R-code:

The R-code is enclosed for the purpose of XXXXXXXXXXXXXXXXXXXX

@clnsmth
Member

clnsmth commented Jul 31, 2019

@CoastalPlainSoils, quick_start_guide.Rmd is in the /vignettes directory of the development branch, where contributions to the project are being made (see #53 for guidance).

Send me an email if you have any questions.

@sheilasaia
Collaborator Author

sheilasaia commented Aug 2, 2019

@CoastalPlainSoils do you mind if I use the text you wrote above about the hackathon for the "About" tab? Specifically, the post you made on July 11? I just noticed that the "About" tab hasn't been updated and isn't assigned to anyone. I will add it manually to the app for now but we can talk about how to streamline this with a markdown file, later (maybe beta release).

@clnsmth
Member

clnsmth commented Aug 8, 2019

The quick start guide (above) has been added to the vignette quick_start_guide.Rmd. Now it needs to be integrated into the UI Help tab.

@clnsmth
Member

clnsmth commented Sep 27, 2019

Working on this now in the fix_10 branch.

@clnsmth
Member

clnsmth commented Sep 27, 2019

In addition to the primary scope of this issue, the About tab contents will be relocated to the repo-level README.md and the package-level DESCRIPTION file.

@clnsmth clnsmth closed this as completed Sep 30, 2019
@clnsmth clnsmth reopened this Sep 30, 2019
@clnsmth
Member

clnsmth commented Sep 30, 2019

Merged branch fix_10 into the development branch (see e20f5a9).

@clnsmth clnsmth closed this as completed Sep 30, 2019