Assuming the worst-case scenario (using all of the UniProt predictions, approx. 200M structures IIRC), graph/node labels would have to be mapped onto `Data`/`Protein` objects at the point where we `get` a structure, so that they're not all held in memory (200M node-label tensors seems prohibitive). I think the way to go is to connect this to an (optional) LMDB, which could also store additional pre-computed features. Then, when we get a structure, we pull in these additional data and store them in the returned `Data`/`Protein`.
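To make the idea concrete, here is a rough sketch of attaching labels lazily at fetch time. Everything in it is hypothetical: `Record` stands in for a `Data`/`Protein` object, `LazyLabelDataset` is not an existing class, and a plain `dict` of pickled values stands in for the LMDB (any store exposing `get(key)` with `bytes` keys would slot in the same way). The structure decoding step is omitted entirely.

```python
import pickle
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Sequence


@dataclass
class Record:
    """Hypothetical stand-in for a Data/Protein object."""
    name: str
    attrs: Dict[str, Any] = field(default_factory=dict)


class LazyLabelDataset:
    """Sketch: labels are looked up per item instead of being preloaded."""

    def __init__(self, names: Sequence[str], label_store: Optional[Any] = None):
        self.names = list(names)
        # Optional key-value store: bytes key -> pickled dict of labels/features.
        # Assumption: an LMDB-backed store would be swapped in here.
        self.label_store = label_store

    def __len__(self) -> int:
        return len(self.names)

    def __getitem__(self, idx: int) -> Record:
        name = self.names[idx]
        # Real code would decode the structure for `name` at this point.
        record = Record(name=name)
        if self.label_store is not None:
            raw = self.label_store.get(name.encode())
            if raw is not None:
                # Only this one entry is deserialised and held in memory.
                record.attrs.update(pickle.loads(raw))
        return record


# Toy store with a single labelled entry (identifiers are made up).
store = {b"P12345": pickle.dumps({"node_label": [0, 1, 1], "graph_label": 1})}
dataset = LazyLabelDataset(["P12345", "Q99999"], label_store=store)
```

Entries without a stored label simply come back with empty `attrs`, which keeps the store optional both per dataset and per structure.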
FWIW, I see this functionality as complementary to the other strand of dataset creation we've been doing in #272. Essentially, I think a model workflow looks like: make a dataset selection with a `Manager` -> instantiate a `FoldCompDataset` -> wrap it in a `LightningModule` (optional).
I also saw that, as of `0.0.3`, FoldComp supports multi-chain structures. I'm not sure whether this extends support to "real" PDB files (i.e. from the PDB), but if it does, it's something to strongly consider in #272 as an export option.
Originally posted by @a-r-j in #284