Commit
📝 update READMEs and add some hints to config files
Henry committed May 30, 2024 (1 parent: 89d554b; commit f78f47b)
Showing 4 changed files with 193 additions and 113 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# Alzheimer study configuration | ||
|
||
For [`workflow/Snakefile_v2.smk`](https://github.com/RasmussenLab/pimms/blob/HEAD/project/workflow/Snakefile_v2.smk): | ||
|
||
- [`config.yaml`](config.yaml) | ||
- see comments in config for explanations. | ||
|
||
For [`workflow/Snakefile_ald_comparison.smk`](https://github.com/RasmussenLab/pimms/blob/HEAD/project/workflow/Snakefile_ald_comparison.smk): | ||
|
||
- [`comparison.yaml`](comparison.yaml) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,75 +1,79 @@ | ||
# config for Snakefile_v2.smk | ||
config_split: runs/alzheimer_study/split.yaml # ! will be built | ||
config_train: runs/alzheimer_study/train_{model}.yaml # ! will be built | ||
folder_experiment: runs/alzheimer_study | ||
fn_rawfile_metadata: https://raw.githubusercontent.com/RasmussenLab/njab/HEAD/docs/tutorial/data/alzheimer/meta.csv | ||
cuda: False | ||
file_format: csv | ||
split_data: | ||
FN_INTENSITIES: https://raw.githubusercontent.com/RasmussenLab/njab/HEAD/docs/tutorial/data/alzheimer/proteome.csv | ||
sample_completeness: 0.5 | ||
feat_prevalence: 0.25 | ||
column_names: | ||
- protein groups | ||
index_col: 0 | ||
meta_cat_col: _collection site | ||
meta_date_col: null | ||
frac_mnar: 0.25 | ||
frac_non_train: 0.1 | ||
config_split: runs/alzheimer_study/split.yaml # ! will be built by the workflow | ||
config_train: runs/alzheimer_study/train_{model}.yaml # ! will be built by the workflow | ||
folder_experiment: runs/alzheimer_study # folder to save the results | ||
fn_rawfile_metadata: https://raw.githubusercontent.com/RasmussenLab/njab/HEAD/docs/tutorial/data/alzheimer/meta.csv # metadata file | ||
cuda: False # use GPU? | ||
file_format: csv # intermediate file formats | ||
split_data: # for 01_01_split_data.ipynb -> check parameters | ||
FN_INTENSITIES: https://raw.githubusercontent.com/RasmussenLab/njab/HEAD/docs/tutorial/data/alzheimer/proteome.csv | ||
sample_completeness: 0.5 | ||
feat_prevalence: 0.25 | ||
column_names: | ||
- protein groups | ||
index_col: 0 | ||
meta_cat_col: _collection site | ||
meta_date_col: null # null if no date column, translated to None in Python | ||
frac_mnar: 0.25 | ||
frac_non_train: 0.1 | ||
models: | ||
- Median: | ||
model: Median | ||
- CF: | ||
model: CF | ||
latent_dim: 50 | ||
batch_size: 1024 | ||
epochs_max: 100 | ||
sample_idx_position: 0 | ||
cuda: False | ||
save_pred_real_na: True | ||
- DAE: | ||
model: DAE | ||
latent_dim: 10 | ||
batch_size: 64 | ||
epochs_max: 300 | ||
hidden_layers: "64" | ||
sample_idx_position: 0 | ||
cuda: False | ||
save_pred_real_na: True | ||
- VAE: | ||
model: VAE | ||
latent_dim: 10 | ||
batch_size: 64 | ||
epochs_max: 300 | ||
hidden_layers: "64" | ||
sample_idx_position: 0 | ||
cuda: False | ||
save_pred_real_na: True | ||
- KNN: | ||
model: KNN | ||
neighbors: 3 | ||
file_format: csv | ||
- Median: # name used for model with this configuration | ||
model: Median # model used | ||
- CF: | ||
model: CF # notebook: 01_1_train_{model}.ipynb will be 01_1_train_CF.ipynb | ||
latent_dim: 50 | ||
batch_size: 1024 | ||
epochs_max: 100 | ||
sample_idx_position: 0 | ||
cuda: False | ||
save_pred_real_na: True | ||
- DAE: | ||
model: DAE | ||
latent_dim: 10 | ||
batch_size: 64 | ||
epochs_max: 300 | ||
hidden_layers: "64" | ||
sample_idx_position: 0 | ||
cuda: False | ||
save_pred_real_na: True | ||
- VAE: | ||
model: VAE | ||
latent_dim: 10 | ||
batch_size: 64 | ||
epochs_max: 300 | ||
hidden_layers: "64" | ||
sample_idx_position: 0 | ||
cuda: False | ||
save_pred_real_na: True | ||
- KNN: | ||
model: KNN | ||
neighbors: 3 | ||
file_format: csv | ||
- KNN5: | ||
model: KNN | ||
neighbors: 5 | ||
file_format: csv | ||
NAGuideR_methods: | ||
- BPCA | ||
- COLMEDIAN | ||
- IMPSEQ | ||
- IMPSEQROB | ||
- IRM | ||
- KNN_IMPUTE | ||
- LLS | ||
# - MICE-CART > 1h20min on GitHub small runner | ||
# - MICE-NORM ~ 1h on GitHub small runner | ||
- MINDET | ||
- MINIMUM | ||
- MINPROB | ||
- MLE | ||
- MSIMPUTE | ||
- MSIMPUTE_MNAR | ||
- PI | ||
- QRILC | ||
- RF | ||
- ROWMEDIAN | ||
# - SEQKNN # Error in x[od, ismiss, drop = FALSE]: subscript out of bounds | ||
- SVDMETHOD | ||
- TRKNN | ||
- ZERO | ||
- BPCA | ||
- COLMEDIAN | ||
- IMPSEQ | ||
- IMPSEQROB | ||
- IRM | ||
- KNN_IMPUTE | ||
- LLS | ||
# - MICE-CART > 1h20min on GitHub small runner | ||
# - MICE-NORM ~ 1h on GitHub small runner | ||
- MINDET | ||
- MINIMUM | ||
- MINPROB | ||
- MLE | ||
- MSIMPUTE | ||
- MSIMPUTE_MNAR | ||
- PI | ||
- QRILC | ||
- RF | ||
- ROWMEDIAN | ||
# - SEQKNN # Error in x[od, ismiss, drop = FALSE]: subscript out of bounds | ||
- SVDMETHOD | ||
- TRKNN | ||
- ZERO |
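
The comments added to the `models` list suggest that each entry's key names a configuration (for example `KNN5`), while its `model` value selects the training notebook (`01_1_train_{model}.ipynb`). The following is a minimal sketch of how those two placeholders might be expanded. It assumes that `config_train` is filled with the configuration name and the notebook template with the `model` value; the `expand` helper and the in-memory copy of the list are hypothetical and not part of the workflow.

```python
# Hypothetical in-memory copy of part of the `models` list from config.yaml.
# In the real workflow these values would come from parsing the YAML file.
models = [
    {"Median": {"model": "Median"}},
    {"CF": {"model": "CF", "latent_dim": 50}},
    {"DAE": {"model": "DAE", "latent_dim": 10}},
    {"VAE": {"model": "VAE", "latent_dim": 10}},
    {"KNN": {"model": "KNN", "neighbors": 3}},
    {"KNN5": {"model": "KNN", "neighbors": 5}},
]

# Templates copied from the config and its comments.
CONFIG_TRAIN = "runs/alzheimer_study/train_{model}.yaml"
NOTEBOOK = "01_1_train_{model}.ipynb"


def expand(models, config_template=CONFIG_TRAIN, notebook_template=NOTEBOOK):
    """Map each configuration name to an assumed config path and notebook name."""
    expanded = {}
    for entry in models:
        # Each list item is a one-key mapping: {name: parameters}.
        ((name, params),) = entry.items()
        expanded[name] = {
            # Assumption: the per-model config file is keyed by the configuration name.
            "config": config_template.format(model=name),
            # Assumption: the notebook is chosen by the `model` value.
            "notebook": notebook_template.format(model=params["model"]),
        }
    return expanded


plan = expand(models)
for name, paths in plan.items():
    print(name, paths["config"], paths["notebook"])
```

Under these assumptions, `KNN` and `KNN5` would share one notebook but receive separate training config files, which is what allows the same model to be run with different `neighbors` values.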