This project combines Domain Adaptation (DA) with neural network Uncertainty Quantification (UQ) in the context of strong gravitational lens parameter prediction. We hope that this work takes a step towards more accurate applications of deep learning models to real observed datasets, especially when the latter have limited labels.
For UQ, we use a mean-variance estimation (MVE) network to predict the Einstein radius of each lens along with its aleatoric uncertainty.
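An MVE network outputs both a point estimate and a variance for each label, and is trained with the Gaussian negative log-likelihood. Below is a minimal sketch of that objective, assuming PyTorch; the function and tensor names are illustrative, not taken from this repo:

```python
import torch

def mve_loss(mean, log_var, y):
    """Gaussian negative log-likelihood for mean-variance estimation (MVE).

    Predicting the log-variance keeps the variance strictly positive and
    makes the optimization numerically stable.
    """
    return 0.5 * (log_var + (y - mean).pow(2) / log_var.exp()).mean()
```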
Applying deep learning in scientific contexts like astronomy presents multiple challenges. For example, models trained on simulated data tend to underperform when applied to real data, because simulations rarely capture the full complexity of real observations. Domain adaptation (DA) is a class of algorithms designed to address the biases that arise when a network is trained on one dataset and applied to test data whose generating parameters and/or features differ significantly. Typically, the source data are from one domain and the target data are from another, distinct domain --- e.g., the data-generating parameters have different prior distributions.
Usually, a supervised DA algorithm uses a large amount of labeled source data and a small amount of labeled target data to bridge the gap between domains. In this work, we use unsupervised DA (UDA), where the target data have no labels. UDA aligns the latent-space embedding of an unlabeled target dataset with that of a labeled source dataset so that predictions can be performed on both. We use the Maximum Mean Discrepancy (MMD) loss to train a network on labeled source lenses in combination with unlabeled target lenses, aligning the domain shift between the two. In this work, we use noise settings to induce the domain shift: the source data have no noise, while the target data have the noise profile of the Dark Energy Survey (DES).
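The MMD loss penalizes the distance between the source and target distributions in the network's latent space, so minimizing it alongside the regression loss pulls the two embeddings into alignment. Here is a minimal sketch of an MMD penalty with a multi-bandwidth RBF kernel, again assuming PyTorch; the names and bandwidth choices are illustrative rather than the exact implementation used here:

```python
import torch

def mmd_rbf(source, target, bandwidths=(1.0, 2.0, 4.0)):
    """Simple (biased) MMD^2 estimate between two batches of latent
    features, using a sum of RBF kernels over several bandwidths."""
    def kernel(a, b):
        d2 = torch.cdist(a, b).pow(2)  # pairwise squared distances
        return sum(torch.exp(-d2 / (2 * bw**2)) for bw in bandwidths)
    return (kernel(source, source).mean()
            + kernel(target, target).mean()
            - 2 * kernel(source, target).mean())

# During training, the total loss might combine the two objectives:
# total_loss = mve_loss(mean, log_var, y) + lambda_mmd * mmd_rbf(z_src, z_tgt)
```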
We generate strong lensing images for training and testing with `deeplenstronomy`. In the figure below, we show a single simulated strong lens in three bands.
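`deeplenstronomy` drives the simulation from a yaml configuration file; its documented entry point is `make_dataset`. A minimal sketch, using one of the config paths from this repo:

```python
import deeplenstronomy.deeplenstronomy as dl

# Simulate a dataset from a yaml configuration; images and metadata are
# exposed as attributes of the returned dataset object.
dataset = dl.make_dataset("src/sim/config/source_config.yaml")
```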
Clone the package into any directory:

```
git clone https://github.com/deepskies/DomainAdaptiveMVEforLensModeling
```
Create environments with conda for training and for simulation, respectively:

```
conda env create -f training_env.yml
conda env create -f deeplenstronomy_env.yml
```
The yaml file `training_env.yml` is required for training the `pytorch` neural network model, and `deeplenstronomy_env.yml` is required for simulating strong lensing datasets with `deeplenstronomy`.
This code works on Linux but has not been tested on Mac or Windows.
There is a sky-brightness-related bug in the PyPI 0.0.2.3 version of `deeplenstronomy`; updating to the latest version is required to reproduce the results.
- **Option A: Generate the Dataset**
  1. Navigate to `src/sim/notebooks/`.
  2. Generate a source/target data pair in the `src/data/` directory by running `gen_sim.py` on the yaml files (`src/sim/config/source_config.yaml` and `src/sim/config/target_config.yaml` for source and target, respectively):

     ```
     gen_sim.py src/sim/config/source_config.yaml src/sim/config/target_config.yaml
     ```
- **Option B: Download the Dataset**
  1. Zip files of the dataset are available through Zenodo.
  2. Add the downloaded source and target data to the `src/data/` directory: move or copy the directories `mb_paper_source_final` and `mb_paper_target_final` into `src/data/`.
- **MVE-Only**
  1. Navigate to `src/training/MVEonly/MVE_noDA_RunA.ipynb` (or the notebook for runs B, C, D, or E).
  2. Activate the conda environment related to training:

     ```
     source activate "..."
     ```
  3. Use the notebook `src/sim/notebooks/training.ipynb` to train the model.
  4. The trained model parameters will be stored in the `models/` directory.
- **MVE-UDA**
  1. Follow an identical procedure to the above, replacing `src/training/MVEonly/` with `src/training/MVEUDA/`.
- To generate the results in the paper, use the notebook `src/training/MVEUDA/ModelVizPaper.ipynb`.
  - Final figures from this notebook are stored in `src/training/MVEUDA/figures/`.
  - Saved PyTorch models of the runs are provided in `src/training/MVE*/paper_models/`; a sketch for reloading one is given below.
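For quick inspection of a saved run outside the notebooks, a checkpoint can be reloaded along these lines; the filename below is hypothetical, so check `paper_models/` for the actual names:

```python
import torch

# Hypothetical checkpoint name -- list src/training/MVEUDA/paper_models/
# to find the files actually saved by the run notebooks.
state_dict = torch.load("src/training/MVEUDA/paper_models/RunA.pth",
                        map_location="cpu")
print(list(state_dict)[:5])  # peek at the first few parameter names
```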
```
DomainAdaptiveMVEforLensModeling/
│
├── src/
│   ├── sim/
│   │   ├── configs/
│   │   │   └── deeplenstronomy config files used to generate the data
│   │   │
│   │   └── notebooks/
│   │       └── gen_sim.ipynb: used to generate the data in data/.
│   │
│   ├── data/
│   │   └── Data should be stored here after download or generation.
│   │
│   └── training/
│       ├── MVEonly/
│       │   ├── paper_models/
│       │   │   └── Final PyTorch models for the MVEonly runs + training information.
│       │   │
│       │   └── RunA.ipynb
│       │       └── Notebook(s) with different seeds required to run the MVEonly model.
│       │
│       └── MVEUDA/
│           ├── paper_models/
│           │   └── Final PyTorch models for the MVE-UDA runs + training information.
│           │
│           ├── figures/
│           │   └── All figures in the paper are drawn from here.
│           │
│           ├── RunA.ipynb
│           │   └── Notebook(s) with different seeds required to run the MVE-UDA model.
│           │
│           └── ModelVizPaper.ipynb
│               └── Notebook used to generate figures in figures/ from data in paper_models/.
│
└── envs/
    └── Conda environment specification files.
```
[ASCII formatting generated using ChatGPT]
This code was written by Shrihan Agarwal.
```
@article{agarwal2024,
    author  = {Shrihan Agarwal and Aleksandra Ciprijanovic and Brian Nord},
    title   = {Domain-adaptive neural network prediction with
               uncertainty quantification for strong gravitational lens
               analysis},
    journal = {Accepted to the Machine Learning and the Physical Sciences
               Workshop at NeurIPS 2024},
    year    = {2024}
}
```
This project is part of the DeepSkiesLab. We greatly appreciate advice and contributions from Jason Poh, Paxson Swierc, Megan Zhao, and Becky Nevin; this work would be impossible without building on their earlier discoveries. We used the Fermilab Elastic Analysis Facility (EAF) for computation and storage in this project. This project used data from both the Dark Energy Survey (DES) and the Dark Energy Camera Legacy Survey (DECaLS) DR10 to generate realistic data; we thank these collaborations for making their catalogs accessible.