Implementation of "A Convex Framework for Fair Regression"
This project applies a rich family of fairness metrics for regression models, each taking the form of a fairness regularizer, to the standard loss functions for linear and logistic regression. The family covers the spectrum from group fairness to individual fairness, along with intermediate fairness notions. Varying the weight on the fairness regularizer traces the efficient frontier of the accuracy-fairness trade-off, and the severity of this trade-off is quantified by a numerical measure called the Price of Fairness (PoF).
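To make the idea of a fairness regularizer concrete, here is a minimal sketch for the linear-regression case with a group-fairness penalty. It is not the code in `src/`: the function names, the synthetic data, and the cross-pair weight `exp(-(y_i - y_j)^2)` are assumptions chosen for illustration, and it solves the problem in closed form with NumPy rather than with cvxpy. The penalty is the square of the weighted average prediction gap over all cross-group pairs, so it can be written as `(w @ v) ** 2` for a fixed vector `v`.

```python
import numpy as np


def group_fairness_direction(X, y, group):
    """Return v such that the group-fairness penalty equals (w @ v) ** 2."""
    X1, y1 = X[group == 0], y[group == 0]
    X2, y2 = X[group == 1], y[group == 1]
    # Assumed cross-pair weights d(y_i, y_j) = exp(-(y_i - y_j)^2), shape (n1, n2).
    D = np.exp(-(y1[:, None] - y2[None, :]) ** 2)
    # Weighted average of (x_i - x_j) over every cross-group pair.
    return (D.sum(axis=1) @ X1 - D.sum(axis=0) @ X2) / D.size


def fit_fair_linear(X, y, group, lam):
    """Minimise mean squared error + lam * (w @ v) ** 2 in closed form."""
    n = len(y)
    v = group_fairness_direction(X, y, group)
    # Setting the gradient to zero gives a linear system in w.
    A = X.T @ X / n + lam * np.outer(v, v)
    b = X.T @ y / n
    w = np.linalg.solve(A, b)
    mse = float(np.mean((X @ w - y) ** 2))
    penalty = float((w @ v) ** 2)
    return w, mse, penalty


# Synthetic data (illustrative only): the target depends on group membership.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=200)
X = rng.normal(size=(200, 3)) + 0.8 * group[:, None]
y = X @ np.array([1.0, -0.5, 0.3]) + 0.7 * group + 0.1 * rng.normal(size=200)

# Sweep the regularizer weight to trace an accuracy-fairness frontier.
for lam in (0.0, 0.1, 1.0, 10.0, 100.0):
    _, mse, penalty = fit_fair_linear(X, y, group, lam)
    print(f"lambda={lam:7.1f}  mse={mse:.5f}  fairness_penalty={penalty:.6f}")
```

As the weight grows, the fairness penalty at the optimum cannot increase and the mean squared error cannot decrease; plotting one against the other gives the kind of frontier described above. The individual-fairness variant squares each pair's gap before averaging and is not a rank-one penalty, so it would need a general convex solver.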
python_version = "3.6"
[packages]
numpy==1.18.2
pandas==1.0.3
cvxpy==1.0.31
sklearn==0.22.2.post1
matplotlib==3.2.1
xlrd==1.2.0
Paper (left) vs. our implementation (right)
- The Law School dataset that we managed to access is a much more concise version of the one the authors used. The results we obtained on it therefore differ from the authors' and are not shown in the results above.
- Because the Sentencing dataset was unavailable, we could not run experiments on it.
- The paper does not use all the cross-pairs; instead, it chooses them by random sampling. In our experiments, some datasets turned out to be quite sensitive to which random pairs are chosen, which explains the slight differences between the paper's results and ours.
- We could not experiment with more values of lambda to get smoother curves because, for the large datasets, the experiments took too long to run on our local machines (~7-8 hours with 7 cores).
- For the Communities and Crime dataset, the paper says that two groups are formed based on the percentage of Black people, White people, Indians, Asians and Hispanics in a community. However, per capita incomes for these groups are considered when forming the groups.
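The sensitivity to randomly sampled cross-pairs noted in the list above can be illustrated with a small sketch. The helper names `sample_cross_pairs` and `sampled_group_penalty`, the pair weight `exp(-(y_i - y_j)^2)`, and the synthetic data are assumptions for illustration, not the repository's actual sampling code. Fixing the seed makes a run reproducible; changing it changes the estimated penalty.

```python
import numpy as np


def sample_cross_pairs(idx_a, idx_b, n_pairs, seed=None):
    """Sample (i, j) index pairs with i from group A and j from group B."""
    rng = np.random.default_rng(seed)
    i = rng.choice(idx_a, size=n_pairs, replace=True)
    j = rng.choice(idx_b, size=n_pairs, replace=True)
    return np.stack([i, j], axis=1)


def sampled_group_penalty(w, X, y, pairs):
    """Estimate the group-fairness penalty from the sampled pairs only."""
    i, j = pairs[:, 0], pairs[:, 1]
    weights = np.exp(-(y[i] - y[j]) ** 2)  # assumed pair weight
    gaps = X[i] @ w - X[j] @ w             # prediction gap per pair
    return float(np.mean(weights * gaps) ** 2)


# Synthetic data (illustrative only).
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=500)
X = rng.normal(size=(500, 3)) + 0.5 * group[:, None]
y = X @ np.array([1.0, -0.5, 0.3]) + 0.1 * rng.normal(size=500)
w = np.array([1.0, -0.5, 0.3])

idx_a = np.where(group == 0)[0]
idx_b = np.where(group == 1)[0]

# The same model scored on different random pair samples.
for seed in range(3):
    pairs = sample_cross_pairs(idx_a, idx_b, n_pairs=100, seed=seed)
    print(seed, sampled_group_penalty(w, X, y, pairs))
```

Sampling more pairs, or averaging over several seeds, would reduce this variance at the cost of longer runs.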
- Sharik A (19CS60D04)
- Manish Chandra (19CS60A01)
- Anju Punuru (19CS60R07)
- Kunal Devanand Zodape (19CS60R13)
- Anirban Saha (19CS60R50)
- Hasmita Kurre (19CS60R67)
- Clone the repo
- Install pipenv
pip install pipenv
- cd to the project directory
- Create the virtual environment
pipenv install --skip-lock
- Activate the virtual environment
pipenv shell
python3 -m src.preprocess_compas
Replace preprocess_compas with preprocess_adult, preprocess_default or preprocess_community for the 'Adult', 'Default' and 'Communities and Crime' datasets respectively.
python3 -m src.frontier --dataset=compas --proc=<number of cores to use>
Replace compas with adult, lawschool, default or community for the 'Adult', 'Law School', 'Default' and 'Communities and Crime' datasets respectively.
python3 -m src.pof --dataset=compas --proc=<number of cores to use>
Replace compas with adult, lawschool, default or community for the 'Adult', 'Law School', 'Default' and 'Communities and Crime' datasets respectively.
The final plots will be saved inside output/