Schedule & Lesson Material for the Data Science for the Public Good Program at the Social and Decision Analytics Laboratory.

The Data Science for the Public Good program teaches student fellows how to sift through vast amounts of information related to public safety, employment, and the provision of services to discover how communities can become more efficient and sustainable. Through the lenses of statistics, social science, and data science research, DSPG students will learn to integrate all available data resources

Why things are set up the way they are:

Mon 5/22
IRB and Ethics Training: Sallie Keller, Gizem Korkmaz
System Setup:
  • Git Bash
  • Server Access
  • RStudio Connection
  • Database Connection
Aaron Schroeder / Daniel Chen
Unix Tools:
  • Navigating Directories & Working with Files (cd, ls, mkdir, touch, nano)
  • Shell Scripts (running shell scripts and understanding the working directory)
  • SSH (connecting to a remote server/computer with secure shell)
Daniel Chen
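The Unix tools topics above can be previewed with a minimal sketch. It is not the course material itself: the directory and file names (`project`, `data`, `notes.txt`, `where.sh`) are made up for illustration, and everything happens in a temporary directory so nothing else is touched.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work inside a throwaway directory (hypothetical layout, for illustration only).
workdir="$(mktemp -d)"
cd "$workdir"

# mkdir creates directories, cd moves between them, touch creates an empty file.
mkdir -p project/data
cd project
touch notes.txt
ls    # shows the data directory and notes.txt

# A tiny shell script that reports the directory it was started from.
cat > where.sh <<'EOF'
#!/usr/bin/env bash
echo "running in: $(pwd)"
EOF
chmod +x where.sh

# A script inherits the caller's working directory, not the script's own location.
cd data
script_output="$(../where.sh)"
echo "$script_output"
```

Relative paths inside a script resolve against the directory you ran it from, which is why the script reports `project/data` even though the file itself lives in `project`.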
Tue 5/23
Git
  • Setting up Git
  • Creating a repository
  • Tracking changes
  • Exploring history
  • Ignoring Things
  • Remotes in GitHub
  • Collaboration
  • Conflicts
Daniel Chen
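As a rough preview of the local Git topics listed above (setup, creating a repository, tracking changes, ignoring things, exploring history), here is a small sketch. The identity values, file names, and commit messages are placeholders, and the repository lives in a temporary directory; remotes, collaboration, and conflicts are left out because they need a second repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"

# Creating a repository.
git init -q

# Setting up Git: identity stored for this repository only (placeholder values).
git config user.name "Example Fellow"
git config user.email "fellow@example.com"

# Tracking changes: stage a file, then record it in a commit.
echo "first draft" > analysis.txt
git add analysis.txt
git commit -q -m "Add first draft of analysis"

# Ignoring things: anything under scratch/ stays out of version control.
echo "scratch/" > .gitignore
mkdir scratch
touch scratch/tmp.csv
git add .gitignore
git commit -q -m "Ignore scratch directory"

# Exploring history: one line per commit, newest first.
git log --oneline

commit_count="$(git rev-list --count HEAD)"
pending_changes="$(git status --porcelain)"
```

After the second commit, `git status` reports nothing to commit even though `scratch/tmp.csv` exists, because the `.gitignore` rule hides it.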
Git Pull Request Collaboration
Daniel Chen
Wed 5/24
Servers and code repositories
Daniel Chen
Project Setup & Templates
Daniel Chen
Data Ingestion & Storage
  • Files (csv, Excel, RData, zip)
  • RCurl (ftp, ftps, sftp)
  • JSON
  • APIs (Google Dev, Arlington, googlesheets)
  • Database (SQL, DBI/PostgreSQL)
Daniel Chen
Thu 5/25
Code repositories, project templates, and where to put and access your data
Daniel Chen
Data Objects in R: Data Frames
Daniel Chen
Functions and Looping in R: 'for' loops vs. the apply family
Daniel Chen
Fri 5/26
Daniel Chen
Training: SQL, SQL, SQL!!! - What is SQL and why?
Aaron Schroeder
Tue 5/30
Group by statements
Daniel Chen
Making Choices & Modeling in R
Daniel Chen
Loops
Daniel Chen
Reshaping data / tidy data
Daniel Chen
Regular Expressions
Daniel Chen
Wed 5/31
ACS
Vicki Lancaster
Thu 6/1
The Data Science Process & Data Discovery
Aaron Schroeder
Data Structure Profiling: Missing Variables, Combined Variables, Multiple Observation Directions, Combined Observational Unit Types, Divided Observation Unit Type
Aaron Schroeder
Data Quality Profiling: Completeness, Value Validity, Consistency, Uniqueness, Duplication
Aaron Schroeder
Plotting
Daniel Chen
Fri 6/2
data.table
Daniel Chen
Working directories
Daniel Chen
Running R scripts
Daniel Chen
Background and detach processes
Daniel Chen
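One common way to run a job in the background and detach it from the terminal is `nohup` together with `&`; the sketch below assumes that approach and is not necessarily the one taught in the session. The script `long_job.sh` is a hypothetical stand-in for a long-running task (on the server this might be an R script run with `Rscript`).

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
cd "$workdir"

# Hypothetical stand-in for a long-running job.
cat > long_job.sh <<'EOF'
#!/usr/bin/env bash
sleep 1
echo "job finished"
EOF
chmod +x long_job.sh

# nohup makes the job ignore the hangup signal sent when the session closes;
# the trailing & puts it in the background so the shell prompt returns at once.
# Output goes to job.log, and stdin is detached from the terminal.
nohup ./long_job.sh > job.log 2>&1 < /dev/null &
job_pid=$!
echo "started job with PID $job_pid"

# In an interactive session you could log out here; this sketch waits instead
# so that the log can be inspected.
wait "$job_pid"
job_result="$(cat job.log)"
echo "$job_result"
```

Redirecting output to a log file matters because a detached job has no terminal to print to once you disconnect.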
Mon 6/5
Training: Web Scraping
Training: Data Presentation & Reporting (Shiny, Markdown/LaTeX, knitr)
Daniel Chen
Wed 6/7
"Machine Learning" whirlwind
Daniel Chen
Fri 6/9
Data Visualization & Exploration in R
Josh Goldstein
Secure & Federated Record Linkage
Aaron Schroeder
Training: Working with Geographic Data in R
Aaron Schroeder
Spatial Data Objects in R: Spatial Data Frames [point, line, polygon], Rasters
Aaron Schroeder
Mapping Food Deserts: A Shiny Dashboard
Aaron Schroeder
Training: Machine Learning in R
Daniel Chen
Training: Bayesian Analysis in R
Dave Higdon
Wed 6/14
Training: Agent-Based Modeling with NetLogo
Bianica Pires
Agent-Based Modeling Approaches
Mark Orr
Wed 6/21
Social Network Analysis
Gizem Korkmaz
Thu 7/27
POSTER SESSION!!
Speakers: Nancy Potok and David Yokum