Bias Propagation in Federated Learning

This repository contains the code for the paper Bias Propagation in Federated Learning, which was accepted at the International Conference on Learning Representations (ICLR) 2023.

If you have any questions, feel free to email Hongyan ([email protected]).

Introduction

In our paper, we show that participating in federated learning can be detrimental to group fairness. In fact, the bias of a few biased parties against under-represented groups (identified by sensitive attributes such as gender or race) propagates through the network to all parties. On naturally partitioned real-world datasets, we analyze and explain bias propagation in federated learning. Our analysis reveals that biased parties unintentionally yet stealthily encode their bias in a small number of model parameters, and throughout the training, they steadily increase the dependence of the global model on sensitive attributes. What is important to highlight is that the experienced bias in federated learning is higher than what parties would otherwise encounter in centralized training with a model trained on the union of all their data. This indicates that the bias is due to the algorithm. Our work calls for auditing group fairness in federated learning, and designing learning algorithms that are robust to bias propagation.

Dependencies

Our implementation of federated learning is based on the FedML library, and we use the machine learning tasks provided by folktables.

We tested our code on Python 3.8.13 and CUDA 11.4. The required packages are listed in the environment.yml file. Run the following command to create the conda environment:

conda env create -f environment.yml

Usage

1. Train the models for different settings

To run federated learning on the Income dataset, use the command:

python main.py --cf config/config_fedavg_income.yaml

Similarly, to run centralized training, use the following command:

python main.py --cf config/config_centralized_income.yaml

For standalone training, use the command:

python main.py --cf config/config_standalone_income.yaml

We report results averaged over five runs. To reproduce them, run the command five times, each with a different random seed. The seed is set by common_args.random_seed in the YAML file.
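The five-seed procedure above can be scripted. The following is a minimal sketch, not part of the repository: the helper names with_seed and run_all_seeds are hypothetical, the seed values 0 through 4 are an illustrative choice, and it assumes the config contains a single line of the form `random_seed: <value>` under common_args. It writes one seeded copy of the config per run and passes it to main.py with the --cf flag shown above.

```python
import re
import subprocess
import sys
from pathlib import Path


def with_seed(config_text: str, seed: int) -> str:
    """Return config_text with the first `random_seed:` value replaced by seed.

    Assumption: the YAML file has a line such as `  random_seed: 0`.
    """
    pattern = re.compile(r"^([ \t]*random_seed[ \t]*:[ \t]*)\S+", re.MULTILINE)
    new_text, n = pattern.subn(lambda m: m.group(1) + str(seed), config_text, count=1)
    if n == 0:
        raise ValueError("no random_seed entry found in config")
    return new_text


def run_all_seeds(config_path: str, seeds=range(5)) -> None:
    """Write one seeded copy of the config per seed and run main.py on each."""
    base = Path(config_path)
    text = base.read_text()
    for seed in seeds:
        # Hypothetical naming scheme for the per-seed copies.
        seeded = base.with_name(f"{base.stem}_seed{seed}{base.suffix}")
        seeded.write_text(with_seed(text, seed))
        subprocess.run([sys.executable, "main.py", "--cf", str(seeded)], check=True)
```

Calling run_all_seeds("config/config_fedavg_income.yaml") from the repository root would then launch the five runs one after another. If the training code derives output locations from the config file name, editing the seed in place between runs may be the safer option.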

To get results on other datasets (e.g., Health, Employment), run main.py with config/config_standalone_{dataset}.yaml, where {dataset} is health, employment, or income.
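To keep track of the setting and dataset combinations, a small helper can build the config paths and the matching commands. This is a sketch under one assumption: the file names follow the pattern config_{setting}_{dataset}.yaml. This README only shows that pattern for all three settings on income and for standalone on the other datasets. The helper therefore only emits commands for files that actually exist. The names config_path and training_commands are hypothetical.

```python
from pathlib import Path

SETTINGS = ("fedavg", "centralized", "standalone")
DATASETS = ("income", "health", "employment")


def config_path(setting: str, dataset: str, config_dir: str = "config") -> Path:
    """Build the config file path for a setting and dataset (assumed naming pattern)."""
    if setting not in SETTINGS:
        raise ValueError(f"unknown setting: {setting!r}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return Path(config_dir) / f"config_{setting}_{dataset}.yaml"


def training_commands(config_dir: str = "config") -> list:
    """Return a main.py command line for each config file that actually exists."""
    commands = []
    for dataset in DATASETS:
        for setting in SETTINGS:
            path = config_path(setting, dataset, config_dir)
            if path.is_file():  # skip combinations without a config file
                commands.append(["python", "main.py", "--cf", path.as_posix()])
    return commands
```

Printing " ".join(cmd) for each entry of training_commands() from the repository root lists the commands that can be run with the configs present in the checkout.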

2. Save all the information

Run the following command to collect the model performance data needed for plotting the figures:

python save_information.py --task income

By default, the script collects the prediction information from 5 runs with random seeds 0 to

3. Generate figures

To generate the figures in the paper, run the plotting.ipynb Jupyter notebook.
