# plume_bloom_drivers

Using classified raster images and meteorological drivers to better understand what is causing sediment plumes and blooms in Lake Superior. The input data for this repo comes from the rossyndicate/Superior-Plume-Bloom repo.

## Building the pipeline

This pipeline is set up to download, process, and run models for detecting blooms and plumes. It is structured as a `{targets}` pipeline so that the workflow is reproducible and easy to follow. The whole pipeline can be run with `tar_make()` (see the sketch below). The first time you run this, you may get errors about missing packages; install those and try again. You should read the following caveats about some of the data inputs/downloads within the pipeline before attempting a build.
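A minimal sketch of a first build, assuming you are at the top-level directory of the project; the exact set of missing packages will depend on your local library:

```r
# install.packages("targets")  # plus any other packages that tar_make() reports as missing
library(targets)

tar_make()  # builds (or rebuilds) every outdated target in the pipeline
```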

## Meteorological data from PRISM

The meteorological driver data from PRISM takes a long time to download and process. Because of this, there are two spots in the pipeline where pre-built data can be used to skip those steps:

1. If you have access to the zip file of the pre-downloaded, raw meteorological data on Box, comment out the `p1_prism_files` target in `1_download.R` and uncomment the target with the same name that is set up below it (see the sketch after this list). You will need to download the zip file from Box and unzip the files to the `1_download/prism_data/` directory before you can build the full pipeline.
2. If you have access to the CSV file of processed meteorological data on Box, comment out the `p2_prism_data_huc` target in `2_process.R` and uncomment the target with the same name that is set up below it. You will need to download the CSV file from Box and move it to the `2_process/in/` directory before you can build the full pipeline.
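A hypothetical sketch of the swap for option 1; the real target bodies in `1_download.R` will differ, and `download_prism_data()` is a stand-in name, not the repo's actual function:

```r
# In 1_download.R -- comment out the slow download target:
# tar_target(
#   p1_prism_files,
#   download_prism_data(...)  # stand-in for the repo's actual download step
# ),

# ...and uncomment the pre-built alternative defined below it, which simply
# points at the files you unzipped into 1_download/prism_data/:
tar_target(
  p1_prism_files,
  list.files("1_download/prism_data/", full.names = TRUE),
  format = "file"
)
```

The same comment/uncomment pattern applies to `p2_prism_data_huc` in `2_process.R` for option 2.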

## Classified raster data from Google Drive

At this time, the raster files of classified imagery are kept in a Google Drive folder that requires specific access. The data may be released publicly in the future, which would simplify this step. For now, follow the steps below to authenticate to Google Drive when running `tar_make()`.

1. Create a new text file called `.gd_config` and save it in the top-level directory of this project (see the snippet after this list).
2. Copy-paste this code into that file: `gd_email: '[email protected]'`
3. Change the `[email protected]` part of the file to the email you will use to access the data.
4. Then try running `tar_make()`.
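Equivalently, you can create the file from R in one line (swap in your own email):

```r
# Write the Google Drive config expected by the pipeline to the project root.
writeLines("gd_email: '[email protected]'", ".gd_config")
```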

## Lake Superior spatial data

For now, the Lake Superior shapefile `LakeSuperiorWatershed.shp` is only available to our internal team via Box. Download the spatial zip called `LakeSuperiorWatershed.zip` from Box (it includes all associated metadata files) and unzip it to the folder `1_download/in`. This ensures that the target in `1_download.R` called `p1_lake_superior_watershed_shp` will find the file it needs.
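A minimal sketch of the unzip step, assuming you downloaded the zip from Box to the project root:

```r
# Extract the shapefile and its metadata sidecars where the
# p1_lake_superior_watershed_shp target expects to find them.
unzip("LakeSuperiorWatershed.zip", exdir = "1_download/in")
```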

## Finding and viewing outputs

After you build the pipeline, you should be able to view the following (see the snippet after this list):

1. Histogram summarizing the pixel counts by year and mission: `tar_read(p4_basic_summary_histogram)`
2. PRISM drivers as timeseries, visualized by HUC: `tar_read(p4_prism_summary_timeseries)`
3. PRISM drivers as boxplots, visualized by HUC and decade: `tar_read(p4_prism_summary_boxes)`
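For example, to pull up all three figures from an R session after a successful `tar_make()`:

```r
library(targets)

tar_read(p4_basic_summary_histogram)   # pixel counts by year and mission
tar_read(p4_prism_summary_timeseries)  # PRISM drivers through time, by HUC
tar_read(p4_prism_summary_boxes)       # PRISM drivers by HUC and decade
```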

## Contributing to this pipeline

Everyone who is developing this package will build their own pipeline locally. We will not commit pipeline output, and we should `.gitignore` any files generated by a pipeline build. The very first time you build the pipeline, delete the `_targets/.gitignore` file; it overrides the top-level `.gitignore` and can be frustrating. Run the following to delete it: `file.remove('_targets/.gitignore')`.