Skip to content

Latest commit

 

History

History
75 lines (46 loc) · 5.23 KB

README.md

File metadata and controls

75 lines (46 loc) · 5.23 KB

lemis

CircleCI Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Authors: Noam Ross, Evan A. Eskew, Allison M. White, Carlos Zambrana-Torrelio

The lemis R package provides access to the United States Fish and Wildlife Service's (USFWS) Law Enforcement Management Information System (LEMIS) data on wildlife and wildlife product imports into the US. This data was obtained via more than 14 years of Freedom of Information Act (FOIA) requests by EcoHealth Alliance.

A manuscript describes the data used in this package in detail: United States wildlife and wildlife product imports from 2000-2014. In addition, Smith et al. (2017) provide a broad introduction to the LEMIS data and its relevance to infectious disease research specifically.

Both the raw LEMIS data and the final, cleaned dataset accessed with this package are available via a Zenodo data repository: United States LEMIS wildlife trade data curated by EcoHealth Alliance.

Installation

Install the lemis package with this command:

devtools::install_github("ecohealthalliance/lemis")

In addition, users must have a GitHub personal access token set up to ensure complete package functionality. Detailed instructions for generating a personal access token can be found here. Note, when setting up your token, selecting the repo scope should be sufficient for lemis package use.

Usage

The main function in lemis is lemis_data(). This returns the cleaned LEMIS data as a dplyr tibble.

lemis makes use of datastorr to manage data download. The first time you run lemis_data(), the package will download the most recent version of the database (~200 MB at present). Subsequent function calls will load the database from storage on your computer.

The lemis database is stored as an efficiently compressed .fst file, and loading it loads it a remote dplyr source. This means that it does not load fully into memory but can be filtered and manipulated on-disk. If you wish to manipulate it as a data frame, simply call dplyr::collect() to load it fully into memory, like so:

all_lemis <- lemis_data() %>% 
  collect()

Note that the full database will be ~1 GB in memory.

See the data paper for a more in-depth description and example use cases for the package data.

Working with data versions

While most users will only want to access the most recent lemis data version, the package provides access to multiple data releases made throughout the package development cycle. Users can view all available data releases using lemis_versions(local = FALSE), while lemis_versions() will show only data versions currently available locally on the user's machine. In order to download and subsequently manipulate older data versions, the desired data release needs to be specified:

lemis_data("0.2.0")

Conversely, the lemis_del() function can be used to delete a specific data version from the user's machine. For users who wish to fully reset their local lemis data files, the following code will delete all local data versions and download only the most recent data release:

lemis_del(version = NULL)
lemis_data()

To help confirm data versions, lemis_version_current() shows the local data version that is returned by default when the user runs lemis_data().

Metadata and background on data preparation

lemis_metadata() provides a brief description of each of the data fields in lemis_data(), while lemis_codes() returns a data frame with the codes (abbreviations) used by USFWS in the various columns. This is useful for lookup or joining with the main data for more descriptive outputs. The ?lemis_codes help file also has a searchable table of USFWS codes. See the developer README file for more on the data cleaning workflow used to process raw LEMIS data into the .fst database files that are accessed via the lemis package.

About

Please give us feedback or ask questions by filing issues.

lemis is developed at EcoHealth Alliance. Please note that this project is released with a Contributor Code of Conduct. By participating in this project, you agree to abide by its terms.

http://www.ecohealthalliance.org/