Showing 12 changed files with 103 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
/config.local | ||
/tmp | ||
/cache |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
[core] | ||
remote = biobricks.ai | ||
['remote "biobricks.ai"'] | ||
url = https://ins-dvc.s3.amazonaws.com/insdvc | ||
['remote "s3.biobricks.ai"'] | ||
url = s3://ins-dvc/insdvc |
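The two remotes above point at the same bucket prefix, once over HTTPS and once over S3. As a rough sketch of how a tracked hash could map to an object under that prefix, the helper below assumes the DVC 3.x `files/md5/<first two hex chars>/<rest>` layout. The function names are hypothetical and the layout is an assumption about the remote, not something this commit states.

```python
# Assumption: the remote uses the DVC 3.x object layout files/md5/<2 chars>/<rest>.
REMOTE_URL = "https://ins-dvc.s3.amazonaws.com/insdvc"  # from the config above


def dvc_object_path(md5: str) -> str:
    """Return the relative object path for a hash (hypothetical helper)."""
    return f"files/md5/{md5[:2]}/{md5[2:]}"


def remote_object_url(md5: str, remote_url: str = REMOTE_URL) -> str:
    """Join the remote URL and the relative object path."""
    return f"{remote_url.rstrip('/')}/{dvc_object_path(md5)}"


if __name__ == "__main__":
    print(remote_object_url("d751713988987e9331980363e24189ce.dir"))
```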
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Add patterns of files dvc should ignore, which could improve | ||
# the performance. Learn more at | ||
# https://dvc.org/doc/user-guide/dvcignore |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
/download | ||
/brick |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# SMRT Small Molecule Retention Time | ||
|
||
|
||
This dataset is available on figshare at | ||
|
||
https://figshare.com/articles/dataset/The_METLIN_small_molecule_dataset_for_machine_learning-based_retention_time_prediction/8038913 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# PURPOSE: CHECK IF THE SOURCE HAS CHANGED |
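The status script is only a placeholder in this commit. One way such a check could work, sketched under the assumption that a previously recorded MD5 is available to compare against, is to hash a local copy of the source file and report whether the digest differs. `file_md5` and `source_changed` are hypothetical names, not part of the commit.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def source_changed(path, recorded_md5):
    """Return True when the file's current MD5 differs from the recorded one."""
    return file_md5(path) != recorded_md5
```

A real check against figshare would also need to fetch the file or its published checksum first; that step is left out here.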
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# PURPOSE: DOWNLOAD THE SMRT DATA TO THE ./download DIRECTORY | ||
import os | ||
|
||
# downloads to the ./download directory | ||
os.makedirs('download', exist_ok=True) | ||
|
||
# read the data from the ./download directory |
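The download script above creates the directory but does not fetch anything yet. A minimal sketch of the missing step, using only the standard library, might look like the following. The `download_file` helper is hypothetical, and the direct file URL is deliberately left as a parameter because the commit only links the figshare landing page.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse


def download_file(url, dest_dir="download"):
    """Stream `url` into `dest_dir` and return the local path (hypothetical helper)."""
    os.makedirs(dest_dir, exist_ok=True)
    # Fall back to a generic name when the URL path has no file component.
    name = os.path.basename(urlparse(url).path) or "downloaded_file"
    dest = os.path.join(dest_dir, name)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Streaming with `shutil.copyfileobj` avoids loading the whole payload into memory, which matters if the dataset is large.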
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# PURPOSE: CHANGE THE DOWNLOADED DATA TO ONE OR MORE PARQUET FILES | ||
import os | ||
|
||
# exports to the ./brick directory | ||
os.makedirs('brick', exist_ok=True) | ||
|
||
# read the data from the ./download directory |
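The processing script is likewise a stub. A possible shape for the conversion step, assuming the downloaded data is a delimited text file (the actual file names and delimiter are not given in this commit), is sketched below. `parquet_path_for` and `convert_to_parquet` are hypothetical helpers; pandas is imported lazily so the path logic can be used without it.

```python
import os


def parquet_path_for(src_path, out_dir="brick"):
    """Map a source file path to its Parquet path in the output directory."""
    stem = os.path.splitext(os.path.basename(src_path))[0]
    return os.path.join(out_dir, stem + ".parquet")


def convert_to_parquet(src_path, out_dir="brick", sep=","):
    """Read a delimited file with pandas and write it as Parquet."""
    import pandas as pd  # listed in this commit's requirements

    os.makedirs(out_dir, exist_ok=True)
    dest = parquet_path_for(src_path, out_dir)
    # Assumption: the source is a delimited text file; adjust `sep` as needed.
    pd.read_csv(src_path, sep=sep).to_parquet(dest, index=False)
    return dest
```

For example, `convert_to_parquet("download/example.csv")` would write `brick/example.parquet`, where `example.csv` is a placeholder name.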
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
schema: '2.0' | ||
stages: | ||
status: | ||
cmd: python code/00_status.py | ||
deps: | ||
- path: code/00_status.py | ||
hash: md5 | ||
md5: 95a09d63c054eb185a1408771f4ee8a3 | ||
size: 43 | ||
download: | ||
cmd: python code/01_download.py | ||
deps: | ||
- path: code/01_download.py | ||
hash: md5 | ||
md5: f82fd5fc2597b90ed411991180e4ac30 | ||
size: 195 | ||
outs: | ||
- path: download/ | ||
hash: md5 | ||
md5: d751713988987e9331980363e24189ce.dir | ||
size: 0 | ||
nfiles: 0 | ||
process: | ||
cmd: python code/02_process.py | ||
deps: | ||
- path: code/02_process.py | ||
hash: md5 | ||
md5: c520d6a17cb1fb7e47155d606ce80701 | ||
size: 197 | ||
- path: download/ | ||
hash: md5 | ||
md5: d751713988987e9331980363e24189ce.dir | ||
size: 0 | ||
nfiles: 0 | ||
outs: | ||
- path: brick/ | ||
hash: md5 | ||
md5: d751713988987e9331980363e24189ce.dir | ||
size: 0 | ||
nfiles: 0 |
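Both `download/` and `brick/` are recorded with the same `.dir` hash, `size: 0`, and `nfiles: 0`, which suggests the pipeline ran before either script produced any files. The sketch below illustrates why two empty directories would share a hash, assuming a directory hash is the MD5 of a JSON listing of its entries. `dir_hash` is a hypothetical helper, and DVC's exact serialization for non-empty directories may differ from this simplification.

```python
import hashlib
import json


def dir_hash(entries):
    """MD5 of a JSON listing of directory entries, with a .dir suffix.

    Simplified assumption about how a directory hash is derived; only the
    empty-directory case is checked against the lock file above.
    """
    listing = sorted(entries, key=lambda entry: entry["relpath"])
    payload = json.dumps(listing, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest() + ".dir"


if __name__ == "__main__":
    # An empty directory serializes to "[]".
    print(dir_hash([]))
```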
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
stages: | ||
status: | ||
cmd: python code/00_status.py | ||
deps: | ||
- code/00_status.py | ||
download: | ||
cmd: python code/01_download.py | ||
deps: | ||
- code/01_download.py | ||
outs: | ||
- download/ | ||
process: | ||
cmd: python code/02_process.py | ||
deps: | ||
- download/ | ||
- code/02_process.py | ||
outs: | ||
- brick/ |
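The stage definitions above imply an execution order: `process` depends on `download/`, which is an output of `download`, while `status` has no upstream stage. A small sketch of how that order could be derived from the `deps` and `outs` lists follows. The `STAGES` dictionary is transcribed by hand from the file above, `stage_order` is a hypothetical helper, and this is not how DVC itself is implemented.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hand-transcribed from the pipeline definition above.
STAGES = {
    "status": {"deps": ["code/00_status.py"], "outs": []},
    "download": {"deps": ["code/01_download.py"], "outs": ["download/"]},
    "process": {"deps": ["download/", "code/02_process.py"], "outs": ["brick/"]},
}


def stage_order(stages):
    """Return stage names so that each stage follows the producers of its deps."""
    producer = {out: name for name, spec in stages.items() for out in spec["outs"]}
    graph = {
        name: {producer[dep] for dep in spec["deps"] if dep in producer}
        for name, spec in stages.items()
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(stage_order(STAGES))
```

The only ordering constraint in this pipeline is that `download` runs before `process`; the position of `status` relative to the others is not fixed.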
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,7 @@ | ||
- python-dotenv | ||
- pandas | ||
- biobricks | ||
- fastparquet | ||
- pyarrow | ||
+ python-dotenv==1.0.1 | ||
+ pandas==2.2.2 | ||
+ biobricks==0.3.7 | ||
+ fastparquet==2024.5.0 | ||
+ pyarrow==16.1.0 | ||
+ dvc==3.51.1 | ||
+ dvc-s3==3.2.0 |
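This hunk replaces the unpinned package names with exact `==` versions and adds `dvc` and `dvc-s3`. As an illustration of how a check for leftover unpinned entries might look, the sketch below scans requirement lines and returns those without an `==` pin. `unpinned` is a hypothetical helper and handles only the simple `name==version` form used in this file.

```python
def unpinned(requirement_lines):
    """Return requirement entries that lack an exact `==` version pin."""
    names = []
    for line in requirement_lines:
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if entry and "==" not in entry:
            names.append(entry)
    return names


if __name__ == "__main__":
    before = ["python-dotenv", "pandas", "biobricks", "fastparquet", "pyarrow"]
    after = ["python-dotenv==1.0.1", "pandas==2.2.2", "dvc==3.51.1"]
    print(unpinned(before))
    print(unpinned(after))
```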