Some preliminary work on emcee batch #311

Draft
wants to merge 45 commits into
base: main
8974cc4
fixes the argument call to the plot of emcee chains
jcblemai Sep 16, 2024
1453303
finish the plot
jcblemai Sep 16, 2024
e4bbd8c
fix emcee rmse error spotted by Tim in #300
jcblemai Sep 16, 2024
311a20b
Save seir, spar and snpi on save, + made more granular to access a f…
jcblemai Sep 17, 2024
80e63b8
fix
jcblemai Sep 17, 2024
bf7c37a
change the order so it fails fast if something already exist
jcblemai Sep 17, 2024
6725bf6
a very hacky and stupid batch hack to fight memory creep...
jcblemai Sep 18, 2024
70bbee8
trying to trigger a llik reevaluation
jcblemai Sep 19, 2024
465d0bb
solve #316: likelihood are discarded during resume
jcblemai Sep 19, 2024
2c54cfd
upgrade plot to show the transformed data
jcblemai Sep 20, 2024
f1ecbe6
plotting projections
jcblemai Sep 25, 2024
b5409a7
file structure
jcblemai Sep 25, 2024
ebbf2c2
add poisson likelihood
twallema Sep 25, 2024
589af08
add new distributions - retain old distributions
twallema Sep 25, 2024
b797cf7
miscopied the poisson from pySODM
twallema Sep 25, 2024
3ca0c7a
match data format gempyor --> not summed over dates --> happens after
twallema Sep 25, 2024
88686cd
fix plot-fit for one outcome
twallema Sep 26, 2024
1b1ddb2
tentative: propose cocktail of emcee "moves"
twallema Sep 26, 2024
7e2a061
compatibility with python on longleaf. To remove ?
jcblemai Sep 27, 2024
8849b24
Merge branch 'emcee_batch' of https://github.com/HopkinsIDD/flepiMoP …
jcblemai Sep 27, 2024
6caa83f
plot fix
jcblemai Sep 28, 2024
a843710
Merge branch 'emcee_batch' of https://github.com/HopkinsIDD/flepiMoP …
jcblemai Sep 28, 2024
ae86e52
nicer grid
jcblemai Sep 28, 2024
965ef19
simplify my life
jcblemai Sep 30, 2024
33bd35c
reload groundtruth to help with usage
jcblemai Oct 3, 2024
b590d92
default to making config plots
jcblemai Oct 3, 2024
5ee0f2b
new config
jcblemai Oct 3, 2024
ce59add
cocktail change
jcblemai Oct 3, 2024
fabc759
energy savings :D
jcblemai Oct 3, 2024
d80f213
add a precheck for config sensitivity after what happened on Flu R1 2024
jcblemai Oct 4, 2024
fb5618f
error
jcblemai Oct 4, 2024
823d81f
this check is vad...
jcblemai Oct 15, 2024
88e305b
postprocessing script
jcblemai Oct 15, 2024
ec12e4b
gen markdown
jcblemai Oct 15, 2024
37f7901
draft of modifiers support of out of reach subpops, still unsafe
jcblemai Oct 30, 2024
f186244
remove print
jcblemai Oct 31, 2024
a18d170
temp
jcblemai Oct 31, 2024
b99afec
merge
jcblemai Oct 31, 2024
3ddc975
very ugly code for cli that plots modifiers activation and time series
jcblemai Nov 7, 2024
5ea3d41
add outcomes parameters
jcblemai Nov 7, 2024
a3541d5
tried the outcomes NPI plot
jcblemai Nov 7, 2024
6da7439
tried the outcomes NPI plot
jcblemai Nov 7, 2024
29de35e
fix one subpop
jcblemai Nov 7, 2024
0439fcd
added postprocessing for several subpop
jcblemai Nov 8, 2024
ab86912
Pull HPC init/build from `main` into `emcee_batch`
TimothyWillard Nov 15, 2024
134 changes: 134 additions & 0 deletions batch/hpc_init.sh
@@ -0,0 +1,134 @@
# Generic setup
set -e

# Cluster specific setup
if [[ $1 == "longleaf" ]]; then
# Setup general purpose user variables needed for Longleaf
USERO=$( echo $USER | awk '{ print substr($0, 1, 1) }' )
USERN=$( echo $USER | awk '{ print substr($0, 2, 1) }' )
WORKDIR=$( realpath "/work/users/$USERO/$USERN/$USER/" )
USERDIR=$WORKDIR

# Load required modules
module purge
module load gcc/9.1.0
module load anaconda/2023.03
module load git
elif [[ $1 == "rockfish" ]]; then
# Setup general purpose user variables needed for RockFish
WORKDIR=$( realpath "/scratch4/struelo1/flepimop-code/$USER/" )
USERDIR=$WORKDIR
mkdir -vp $WORKDIR

# Load required modules
module purge
module load slurm
module load gcc/9.3.0
module load anaconda/2020.07
module load git/2.42.0
else
echo "The cluster name '$1' is not recognized, must be one of: 'longleaf', 'rockfish'."
set +e
exit 1
fi

# Ensure we have a $FLEPI_PATH
if [ -z "${FLEPI_PATH}" ]; then
echo -n "An explicit \$FLEPI_PATH was not provided, please set one (or press enter to use '$USERDIR/flepiMoP'): "
read FLEPI_PATH
if [ -z "${FLEPI_PATH}" ]; then
export FLEPI_PATH="$USERDIR/flepiMoP"
fi
export FLEPI_PATH=$( realpath "$FLEPI_PATH" )
echo "Using '$FLEPI_PATH' for \$FLEPI_PATH."
fi

# Conda init
if [ -z "${FLEPI_CONDA}" ]; then
echo -n "An explicit \$FLEPI_CONDA was not provided, please set one (or press enter to use 'flepimop-env'): "
read FLEPI_CONDA
if [ -z "${FLEPI_CONDA}" ]; then
export FLEPI_CONDA="flepimop-env"
fi
echo "Using '$FLEPI_CONDA' for \$FLEPI_CONDA."
fi
conda activate $FLEPI_CONDA

# Check the conda environment is valid
WHICH_PYTHON=$( which python )
WHICH_R=$( which R )
PYTHON_ARROW_VERSION=$( python -c "import pyarrow; print(pyarrow.__version__)" )
R_ARROW_VERSION=$( Rscript -e "cat(as.character(packageVersion('arrow')))" )
COMPATIBLE_ARROW_VERSION=$( echo "$R_ARROW_VERSION" | grep "$PYTHON_ARROW_VERSION" | wc -l )
if [[ "$COMPATIBLE_ARROW_VERSION" -ne 1 ]]; then
echo "The R version of arrow is '$R_ARROW_VERSION' and the python version is '$PYTHON_ARROW_VERSION'. These may not be compatible versions."
fi

# Make sure the credentials file is where we expect and has the right permissions
if [ ! -f "$USERDIR/slack_credentials.sh" ]; then
echo "You should place sensitive credentials in '$USERDIR/slack_credentials.sh'."
else
chmod 600 $USERDIR/slack_credentials.sh
source $USERDIR/slack_credentials.sh
fi

# Set correct env vars
export FLEPI_STOCHASTIC_RUN=false
export FLEPI_RESET_CHIMERICS=TRUE
export TODAY=`date --rfc-3339='date'`

echo -n "Please set a project path (relative to '$WORKDIR'): "
read PROJECT_PATH
export PROJECT_PATH="$WORKDIR/$PROJECT_PATH"
if [ ! -d $PROJECT_PATH ]; then
echo "> The project path provided, $PROJECT_PATH, is not a directory. Please ensure this is correct."
fi

echo -n "Please set a config path (relative to '$PROJECT_PATH'): "
read CONFIG_PATH
export CONFIG_PATH="$PROJECT_PATH/$CONFIG_PATH"
if [ ! -f $CONFIG_PATH ]; then
echo "> The config path provided, $CONFIG_PATH, is not a file. Please ensure this is correct."
fi

echo -n "Please set a validation date (today is $TODAY): "
read VALIDATION_DATE

echo -n "Please set a resume location: "
read RESUME_LOCATION

echo -n "Please set a flepi run index: "
read FLEPI_RUN_INDEX

# Done
cat << EOM
> The HPC init script has successfully finished.

If you are testing if this worked, say installing for the first time, you can use the inference example from the \`flepimop_sample\` repository:
\`\`\`bash
cd \$PROJECT_PATH
flepimop-inference-main -c \$CONFIG_PATH -j 1 -n 1 -k 1
\`\`\`
Just make sure to \`rm -r model_output\` after running.

Otherwise make sure this diagnostic info looks correct before continuing:
* Cluster: $1
* User directory: $USERDIR
* Work directory: $WORKDIR
* Flepi conda: $FLEPI_CONDA
* Flepi path: $FLEPI_PATH
* Project path: $PROJECT_PATH
* Python: $WHICH_PYTHON
* R: $WHICH_R
* Python arrow: $PYTHON_ARROW_VERSION
* R arrow: $R_ARROW_VERSION
* Stochastic run: $FLEPI_STOCHASTIC_RUN
* Reset chimerics: $FLEPI_RESET_CHIMERICS
* Today: $TODAY
* Config path: $CONFIG_PATH
* Validation date: $VALIDATION_DATE
* Resume location: $RESUME_LOCATION
* Flepi run index: $FLEPI_RUN_INDEX
EOM

set +e
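
Note on the arrow version check in `batch/hpc_init.sh`: the check pipes `$R_ARROW_VERSION` through `grep "$PYTHON_ARROW_VERSION"`, so the Python version is interpreted as a regular expression (each `.` matches any character) and may match anywhere inside the R version string. A stricter comparison could look roughly like the sketch below. `arrow_versions_match` is a hypothetical helper that is not part of this PR, and it assumes that an R version equal to the Python version, or extending it with extra dot-separated components such as `17.0.0.1`, should count as compatible.

```shell
# Hypothetical helper, not part of the PR.
# Succeeds when the R arrow version equals the Python arrow version, or
# extends it with further dot-separated components (assumption: a trailing
# R-only patch level such as 17.0.0.1 is still compatible with 17.0.0).
arrow_versions_match() {
    py_version=$1
    r_version=$2
    # An empty Python version can never be a meaningful match.
    [ -n "$py_version" ] || return 1
    case "$r_version" in
        # Quoting the variable makes the comparison literal, so '.' is
        # not treated as a wildcard; only the trailing '*' is a glob.
        "$py_version"|"$py_version".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Small demonstration with made-up R versions.
for r_version in 17.0.0 17.0.0.1 1.17.0.0; do
    if arrow_versions_match "17.0.0" "$r_version"; then
        echo "python 17.0.0 / R $r_version: match"
    else
        echo "python 17.0.0 / R $r_version: possible mismatch"
    fi
done
```

In the script this could replace the `grep | wc -l` pipeline, e.g. `arrow_versions_match "$PYTHON_ARROW_VERSION" "$R_ARROW_VERSION" || echo "..."`, keeping the warning non-fatal as it is now.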
106 changes: 106 additions & 0 deletions build/hpc_install_or_update.sh
@@ -0,0 +1,106 @@
#!/usr/bin/env bash

# Generic setup
set -e

# Cluster specific setup
if [[ $1 == "longleaf" ]]; then
# Setup general purpose user variables needed for Longleaf
USERO=$( echo $USER | awk '{ print substr($0, 1, 1) }' )
USERN=$( echo $USER | awk '{ print substr($0, 2, 1) }' )
WORKDIR=$( realpath "/work/users/$USERO/$USERN/$USER/" )
USERDIR=$WORKDIR

# Load required modules
module purge
module load gcc/9.1.0
module load anaconda/2023.03
module load git
elif [[ $1 == "rockfish" ]]; then
# Setup general purpose user variables needed for RockFish
WORKDIR=$( realpath "/scratch4/struelo1/flepimop-code/$USER/" )
USERDIR=$WORKDIR
mkdir -vp $WORKDIR

# Load required modules
module purge
module load gcc/9.3.0
module load anaconda/2020.07
module load git/2.42.0
else
echo "The cluster name '$1' is not recognized, must be one of: 'longleaf', 'rockfish'."
set +e
exit 1
fi

# Ensure we have a $FLEPI_PATH
if [ -z "${FLEPI_PATH}" ]; then
echo -n "An explicit \$FLEPI_PATH was not provided, please set one (or press enter to use '$USERDIR/flepiMoP'): "
read FLEPI_PATH
if [ -z "${FLEPI_PATH}" ]; then
export FLEPI_PATH="$USERDIR/flepiMoP"
fi
export FLEPI_PATH=$( realpath "$FLEPI_PATH" )
echo "Using '$FLEPI_PATH' for \$FLEPI_PATH."
fi

# Test that flepiMoP is located there
if [ ! -d "$FLEPI_PATH" ]; then
while true; do
read -p "Did not find flepiMoP at $FLEPI_PATH, do you want to clone the repo? (y/n) " resp
case "$resp" in
[yY])
echo "Cloning on your behalf."
git clone git@github.com:HopkinsIDD/flepiMoP.git $FLEPI_PATH
break
;;
[nN])
echo "Then you need to set a \$FLEPI_PATH before running, cannot proceed with install."
set +e
exit 1
;;
*)
echo "Invalid input. Please enter 'y' or 'n'. "
;;
esac
done
fi

# Setup the conda environment
if [ -z "${FLEPI_CONDA}" ]; then
echo -n "An explicit \$FLEPI_CONDA was not provided, please set one (or press enter to use 'flepimop-env'): "
read FLEPI_CONDA
if [ -z "${FLEPI_CONDA}" ]; then
export FLEPI_CONDA="flepimop-env"
fi
echo "Using '$FLEPI_CONDA' for \$FLEPI_CONDA."
fi
FLEPI_CONDA_ENV_MATCHES=$( conda info --envs | awk '{print $1}' | grep -x "$FLEPI_CONDA" | wc -l )
if [ "$FLEPI_CONDA_ENV_MATCHES" -eq 0 ]; then
conda env create --name $FLEPI_CONDA --file $FLEPI_PATH/environment.yml
fi

# Load the conda environment
conda activate $FLEPI_CONDA
[ -e "$CONDA_PREFIX/conda-meta/pinned" ] && rm $CONDA_PREFIX/conda-meta/pinned
cat << EOF > $CONDA_PREFIX/conda-meta/pinned
r-arrow==17.0.0
arrow==17.0.0
EOF

# Install the gempyor package from local
pip install --editable $FLEPI_PATH/flepimop/gempyor_pkg

# Install the local R packages
R -e "install.packages('covidcast', repos='https://cloud.r-project.org')"
RETURNTO=$( pwd )
cd $FLEPI_PATH/flepimop/R_packages/
for d in $( ls ); do
R CMD INSTALL $d
done
cd $RETURNTO
R -e "library(inference); inference::install_cli()"

# Done
echo "> Done installing/updating flepiMoP."
set +e
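
Note on the repeated prompts: both scripts use the same pattern for `$FLEPI_PATH` and `$FLEPI_CONDA`, namely print a prompt, `read` a value, and fall back to a default when the reply is empty. One way to factor that out is sketched below. `prompt_with_default` is a hypothetical helper that does not exist in this PR; it writes the prompt to stderr so that only the chosen value reaches stdout, which is a design assumption made here so the result can be captured with `$( ... )`.

```shell
# Hypothetical helper, not part of the PR.
# $1: prompt text, $2: default value used when the reply is empty.
prompt_with_default() {
    # Prompt on stderr so command substitution captures only the answer.
    printf '%s (or press enter to use %s): ' "$1" "$2" >&2
    # 'read' fails at end of input; treat that like an empty reply so the
    # helper stays safe under 'set -e'.
    IFS= read -r reply || reply=""
    printf '%s\n' "${reply:-$2}"
}

# Demonstration with a simulated empty reply instead of interactive input.
env_name=$(printf '\n' | prompt_with_default 'Please set $FLEPI_CONDA' 'flepimop-env')
echo "Using '$env_name' for \$FLEPI_CONDA."
```

In the scripts this would only be called when the variable is not already set, for example inside the existing `if [ -z "${FLEPI_CONDA}" ]` guard.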
39 changes: 39 additions & 0 deletions environment.yml
@@ -0,0 +1,39 @@
# Fri Oct 18 18:29:25 2024 UTC
channels:
- conda-forge
- defaults
- r
- dnachun
dependencies:
- python=3.11
- pip
- r-base>=4.3
- pyarrow=17.0.0
- r-arrow=17.0.0
- r-sf
- r-data.table
- r-doParallel
- r-dplyr
- r-foreach
- r-ggplot2
- r-ggraph
- r-httr
- r-jsonlite
- r-lubridate
- r-magrittr
- r-MMWRweek
- r-optparse
- r-purrr
- r-readr
- r-reticulate
- r-rlang
- r-stringr
- r-tibble
- r-tidygraph
- r-tidyr
- r-tidyselect
- r-tidyverse
- r-truncnorm
- r-vroom
- r-xts
- r-yaml
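
Note on the arrow pins: `environment.yml` pins `pyarrow` and `r-arrow` to the same version, and the init script later warns when the installed versions differ. A quick consistency check on the file itself could look like the sketch below. `pinned_version` is a hypothetical helper, not part of this PR, and it assumes dependency lines of the form `- name=version` with a single `=`; lines using `>=` or no pin are ignored. The demonstration runs against a temporary fragment rather than the real file.

```shell
# Hypothetical helper, not part of the PR.
# Prints the version pinned with a single '=' for package $1 in the conda
# environment file $2; prints nothing when the package is not pinned that way.
pinned_version() {
    awk -v pkg="$1" '
        # Dependency lines look like "- name=version".
        $1 == "-" {
            n = index($2, "=")
            # Compare the text before the first "=" with the package name,
            # so "r-base>=4.3" does not count as a pin of "r-base".
            if (n > 0 && substr($2, 1, n - 1) == pkg) {
                print substr($2, n + 1)
            }
        }
    ' "$2"
}

# Demonstration on a temporary fragment mirroring a few lines of the file.
env_file=$(mktemp)
cat > "$env_file" << 'EOF'
dependencies:
- python=3.11
- r-base>=4.3
- pyarrow=17.0.0
- r-arrow=17.0.0
EOF

py_pin=$(pinned_version pyarrow "$env_file")
r_pin=$(pinned_version r-arrow "$env_file")
if [ "$py_pin" = "$r_pin" ]; then
    echo "arrow pins agree: $py_pin"
else
    echo "arrow pins differ: pyarrow=$py_pin r-arrow=$r_pin"
fi
rm -f "$env_file"
```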
80 changes: 47 additions & 33 deletions examples/simple_usa_statelevel/inference_benchmark.ipynb

Large diffs are not rendered by default.