This repo has now moved to vote-by-mail/election-official-data. This archived version is maintained but will be deleted shortly.
This repo collects information by locale (county or town) from critical swing states for MailMyBallot.org. Code for each state is under the state's name.
To get started, run the make create-install command. The Makefile contains other useful commands.
The real work is done by PyInvoke, a simple task runner which was installed by the previous command.
Data is saved in the /public folder of the public-data branch by state (e.g. florida.json). Each file is a JSON array of all election-official contacts for each locale in that state. The format of the contacts depends on the state but supports (at a minimum) the following TypeScript interface:
interface Contact {
  // each contact should have a locale and either a county or city
  // it should also have either an email or fax and preferably the county official's name.
  locale: string            // locale name, unique within state
  county?: string           // county name
  city?: string             // city or township name
  official?: string         // name of the election official
  emails?: string[]         // list of emails
  faxes?: string[]          // list of fax numbers
  // optional fields
  phones?: string[]         // list of phone numbers
  url?: string              // url for locale's election information
  address?: string          // mailing address
  physicalAddress?: string  // physical address
  party?: string            // party affiliation of official
}
NB: fields with a question mark (e.g. county?) may be absent, i.e. no such key exists. This is how we indicate that the state did not provide a value.
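The minimum requirements stated in the interface comments can be checked mechanically. The following is a minimal sketch in Python, not the repo's own test code: the function name contact_problems and the example record are hypothetical, and the rules are taken only from the comments in the interface above.

```python
from typing import Any, Dict, List


def contact_problems(contact: Dict[str, Any]) -> List[str]:
    """Return a list of problems with one contact record (empty if none found)."""
    problems = []
    # locale is the only key without a question mark in the interface
    if not contact.get("locale"):
        problems.append("missing locale")
    # each contact should have either a county or a city
    if not (contact.get("county") or contact.get("city")):
        problems.append("missing county or city")
    # each contact should have at least one email or fax
    if not (contact.get("emails") or contact.get("faxes")):
        problems.append("missing emails or faxes")
    # list-valued fields must be lists of strings when present
    for key in ("emails", "faxes", "phones"):
        value = contact.get(key)
        if value is not None and not (
            isinstance(value, list) and all(isinstance(v, str) for v in value)
        ):
            problems.append(f"{key} must be a list of strings")
    return problems


# Made-up record for illustration only
example = {
    "locale": "Example County",
    "county": "Example County",
    "emails": ["clerk@example.org"],
}
print(contact_problems(example))
print(contact_problems({"city": "Example Town"}))
```

A missing key and an empty value are treated the same way here, which matches the note that absent values are represented by omitting the key.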
State data is no longer saved in the master branch of this repo.
Data releases are tagged with the date of collection using the format data/2020-06-22. Files that could not be collected or had no changes are carried over from previous commits.
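As an illustration of the tag format, the sketch below builds and parses a tag of the form data/YYYY-MM-DD. The helper names data_tag and tag_date are hypothetical and are not part of this repo.

```python
from datetime import date

TAG_PREFIX = "data/"


def data_tag(collected_on: date) -> str:
    """Build a data release tag such as data/2020-06-22 from a collection date."""
    return TAG_PREFIX + collected_on.isoformat()


def tag_date(tag: str) -> date:
    """Parse the collection date back out of a data release tag."""
    if not tag.startswith(TAG_PREFIX):
        raise ValueError(f"not a data release tag: {tag!r}")
    return date.fromisoformat(tag[len(TAG_PREFIX):])


print(data_tag(date(2020, 6, 22)))
print(tag_date("data/2020-06-22"))
```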
Each state's crawler is put under its own folder (e.g. states/new_york/main.py), with potentially other files in the folder.
- The goal is for each state's crawler to fetch and process all of its required inputs without human intervention, so that we can easily re-run scripts periodically to collect fresh data.
- We use cache_request from common.py to request webpages so that the results are saved to a local cache for faster development. The common module also contains several other functions which may be useful, including cache_selenium and cache_webkit.
- Each state's main.py should include a function named fetch_data(), which will be called by PyInvoke using (e.g.) inv collect new_york.
- The results are then sorted using normalize_state from common.py and saved in the above JSON format.
- Once you have the data, verify that it works by running tests: make test
- Also, rerun the Jupyter notebook analysis/Analysis.ipynb from scratch to update the analytics. You can see how many fields you were able to parse. To start the Jupyter notebook, run make jupyter. Run the notebook and make sure that you have all the values you need. Do not commit the notebook changes; just throw them away, because they block rebase merging.
- To release a new data version between scheduled run dates, first make sure the data file passes all tests using inv test. Then, commit the passing data file to the public-data branch and tag that commit with a date version of the format data/2020-06-22.
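The crawler workflow above (cached request, fetch_data, sorted output) can be sketched as follows. This is only an illustration of the idea and makes several assumptions: cached_get and sort_contacts are hypothetical stand-ins and do not reproduce the actual signatures of cache_request or normalize_state in common.py, the URL and page layout are made up, and fetch_data takes a fetch argument here only so the sketch runs without network access.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical cache location; a fresh temporary directory for this sketch
CACHE_DIR = Path(tempfile.mkdtemp())


def cached_get(url: str, fetch: Callable[[str], str]) -> str:
    """Return the page body for url, calling fetch only on a cache miss."""
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".txt"
    path = CACHE_DIR / name
    if path.exists():
        return path.read_text(encoding="utf-8")
    body = fetch(url)
    path.write_text(body, encoding="utf-8")
    return body


def sort_contacts(contacts: List[Dict]) -> List[Dict]:
    """Sort contacts by locale so repeated runs give stable, diffable output."""
    return sorted(contacts, key=lambda c: c["locale"])


def fetch_data(fetch: Callable[[str], str]) -> List[Dict]:
    """Hypothetical state crawler: the page has one 'locale,email' pair per line."""
    text = cached_get("https://example.org/clerks.csv", fetch)
    contacts = []
    for line in text.strip().splitlines():
        locale, email = [part.strip() for part in line.split(",")]
        contacts.append({"locale": locale, "county": locale, "emails": [email]})
    return contacts


calls = []


def fake_fetch(url: str) -> str:
    """Offline stand-in for a real HTTP request; records each call."""
    calls.append(url)
    return "Beta County,b@example.org\nAlpha County,a@example.org\n"


first = sort_contacts(fetch_data(fake_fetch))
second = sort_contacts(fetch_data(fake_fetch))  # second run is served from the cache
print(json.dumps(first, indent=2))
print(len(calls))
```

Keying the cache on a hash of the URL keeps file names safe regardless of what characters the URL contains, and sorting before saving keeps diffs between data releases small.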
Please submit code via pull requests, ideally from this repo if you have access or from your own fork if you do not.
- This repository has a continuous integration (CI) workflow that runs pylint and tests on pull requests. The CI checks must pass before code can be merged.
- We strive to only use rebase merges.
- Please don't save changes to the Jupyter notebook analysis/Analysis.ipynb (it will break your rebase merge).
To update a version, tag the commit with a bumped semver version and push the tag. Admittedly we are a little loose on the definition of a "minor" vs "patch" increment. For example, if the previous version was 1.4.0 and we chose to increment to 1.5.0, we would deploy using:
git tag v1.5.0
git push origin v1.5.0
To see a list of all tags, run
git fetch
git tag --list
DO NOT DELETE TAGS ONCE THEY ARE PUBLISHED! Just increment the minor version and republish if you made a mistake. We rely on stable tags for production.
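To make the version bump concrete, here is a minimal sketch that computes the next tag from the previous version. The function next_tag is hypothetical and not part of this repo; it only handles plain MAJOR.MINOR.PATCH versions, with or without a leading v.

```python
def next_tag(previous: str, part: str = "minor") -> str:
    """Return the next git tag (e.g. v1.5.0) after bumping one part of a version."""
    # Accept both 1.4.0 and v1.4.0
    version = previous[1:] if previous.startswith("v") else previous
    major, minor, patch = (int(n) for n in version.split("."))
    if part == "major":
        major, minor, patch = major + 1, 0, 0
    elif part == "minor":
        minor, patch = minor + 1, 0
    elif part == "patch":
        patch += 1
    else:
        raise ValueError(f"unknown part: {part!r}")
    return f"v{major}.{minor}.{patch}"


print(next_tag("1.4.0", "minor"))
print(next_tag("v1.4.0", "patch"))
```

Because published tags are never deleted, a mistaken release is corrected by bumping again from the mistaken tag rather than reusing its number.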
- Vote by Mail
- Municipal Clerks (with a link to download them as a CSV)
- They seem to accept email applications Election Official's Manual
- "Michigan residents who live in unincorporated places register to vote with their township clerk": MI SOS
- The City-type matters: there is a St. Joseph Township and City, both in Berrien County. They share a zipcode (Wikipedia Township, City)
- The following address is in Olive Township, not West Olive: "12220 Fillmore Street, Room 130 West Olive, MI 49460".
- There is both a Sheridan Charter Township and a Sheridan Township, which are quite far apart. The same is true for Caledonia, Shelby, Lincoln, Lake, and Benton.
- Contact info for county board of registrars offices website.
- Contact info available here
- To get started, look at the Makefile. You can install files, start up Jupyter, etc.
- To run tasks, we use PyInvoke. Look at the tasks.py file.
This repository is for MailMyBallot.org, a National Vote at Home Institute project.