Provides automated anonymization for CSV-based datasets that contain common identifiers.

illinois/csv-anonymization-tool

NetID Anonymization Tool

Provides automated anonymization for CSV-based datasets that contain common identifiers. The result is an anonymized CSV in which all common identifiers have been removed and replaced with an anonymized ID, auid.

This script uses a project-specific secret. Given the same identity and the same secret, the script produces the same auid. Any difference in the identity or the secret results in a different auid. If no secret is provided, one is generated for you and saved to a file.
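
The README does not say how auid is computed. As a rough illustration of the property described above (same identity and same secret give the same value, and any change gives a different one), here is a minimal sketch that assumes a keyed hash (HMAC-SHA256). The names `make_auid`, `generate_secret`, and `load_or_create_secret`, the normalization step, and the secret file path are all hypothetical and are not taken from anon-csv.py.

```python
import hashlib
import hmac
import secrets
from pathlib import Path


def make_auid(identity: str, secret: str) -> str:
    """Derive a stable pseudonymous ID from an identity and a secret.

    Hypothetical helper: the real script may use a different construction.
    """
    # Assumption: identities are compared case-insensitively and without
    # surrounding whitespace. The README does not document this.
    normalized = identity.strip().lower()
    digest = hmac.new(secret.encode("utf-8"), normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def generate_secret() -> str:
    """Create a random secret (32 random bytes, hex-encoded)."""
    return secrets.token_hex(32)


def load_or_create_secret(path: str) -> str:
    """Reuse the secret stored at `path`, or generate one and save it there.

    The file location and format are assumptions made for this sketch.
    """
    secret_file = Path(path)
    if secret_file.exists():
        return secret_file.read_text(encoding="utf-8").strip()
    secret = generate_secret()
    secret_file.write_text(secret, encoding="utf-8")
    return secret
```

With a construction like this, keeping the secret private matters: anyone who holds it can recompute the auid for a known identity.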

This script can be used to:

  • anonymize a single dataset
  • anonymize multiple separate datasets that, using the same secret, will map the same identity to the same auid
  • anonymize a single dataset for different groups of researchers, using different secrets, to generate different auids

Usage

python anon-csv.py -s {project-specific secret} input-file.csv

Identifiers Removed

The script removes the following common identifiers:

  • netid
  • Last Name
  • First Name
  • Username
  • Student ID
  • UIN
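
To make the column handling concrete, the sketch below drops the identifier columns listed above and adds an auid column in their place. It is an illustration under stated assumptions, not the script's actual code: the function name `anonymize_csv`, the case-insensitive column matching, the choice of `netid` as the column the auid is derived from, and the HMAC-SHA256 construction are all hypothetical.

```python
import csv
import hashlib
import hmac
import io

# Column names come from the list above. Matching them case-insensitively
# is an assumption made for this sketch.
IDENTIFIER_COLUMNS = {"netid", "last name", "first name", "username", "student id", "uin"}


def anonymize_csv(text: str, secret: str, key_column: str = "netid") -> str:
    """Return CSV text with identifier columns removed and an auid column added.

    `key_column` is the column the auid is derived from (assumed to be netid).
    """
    reader = csv.DictReader(io.StringIO(text))
    # Keep every column that is not one of the listed identifiers.
    kept = [c for c in reader.fieldnames if c.strip().lower() not in IDENTIFIER_COLUMNS]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["auid"] + kept, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        identity = row[key_column].strip().lower()
        # Keyed hash of the identity; an assumed stand-in for the real derivation.
        auid = hmac.new(secret.encode("utf-8"), identity.encode("utf-8"), hashlib.sha256).hexdigest()
        writer.writerow({"auid": auid, **{c: row[c] for c in kept}})
    return out.getvalue()


sample = "netid,First Name,Last Name,score\njdoe2,Jane,Doe,91\n"
result = anonymize_csv(sample, secret="example-secret")
print(result)
```

In this sketch the output keeps only the non-identifying `score` column alongside `auid`. Running it on a second file with the same secret would give `jdoe2` the same auid, which is what allows separately anonymized datasets to be linked.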
