About • Prerequisites • Dependencies • Installation • Publications • Copyright and License
Jack Marquez, Silvina Caino-Lores, Michel Cuendet, Ekaterina Kots, Trilce Estrada, Ewa Deelman, Harel Weinstein, and Michela Taufer.
The knowledge of molecular structures' transformations that the project harnesses at runtime can be used to steer simulations toward more promising areas of the simulation space, identify the data that should be written to congested parallel file systems, and index generated data for retrieval and post-simulation analysis. Supported by this knowledge, molecular dynamics workflows such as replica exchange simulations, Markov state models, and the string method with swarms of trajectories can be executed from the outside (i.e., without reengineering the molecular dynamics code).
We validate the A4MD framework's capability for early termination and assess whether MD simulations trimmed early by A4MD cover the conformational space as effectively as the full simulation.
In order to use this Jupyter Notebook, your system should have the following installed:
- Anaconda
- Python 3.8.18
NOTE: The Jupyter Notebook needs the simulation data to work with. The entire dataset, with all the trajectories and the frames identified as the closest to the minimum energy, can be found here
The framework is also built on top of the following third-party libraries:
- PyEMMA
- MDTraj
- MDAnalysis
Here are the detailed installation instructions. Please make sure that all the prerequisites are satisfied before proceeding with the following steps. Make sure you are using SSH with GitHub and that you have the gcc compiler on your system.
- Clone the source code from this repository
git clone https://github.com/Analytics4MD/A4MD_conformational_space_validation.git
- Create your conda environment
cd A4MD_conformational_space_validation/
conda create --name validation --file environment.yml
conda activate validation
Running the previous commands should create a conda environment that includes all the dependencies needed to run the notebook.
After creating and activating the conda environment, try executing the first two cells in the Jupyter notebook A4MD_conformational_space_validation.ipynb. If no error arises, you are ready to use the Jupyter Notebook.
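Before opening the notebook, a quick check along the following lines can confirm that the active environment matches the prerequisites. This is a minimal sketch and not part of the repository: the helper `check_environment` is hypothetical, the minimum Python version of 3.8 is an assumption based on the prerequisite listed above, and `pyemma`, `mdtraj`, and `MDAnalysis` are the names under which these libraries are normally imported.

```python
import importlib.util
import sys

# Import names for the third-party libraries listed under Dependencies.
REQUIRED_MODULES = ("pyemma", "mdtraj", "MDAnalysis")


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 8)):
    """Return (python_ok, missing) for the currently active interpreter.

    python_ok is True when the interpreter is at least min_python;
    missing lists the modules that cannot be located for import.
    """
    python_ok = sys.version_info[:2] >= min_python
    missing = [name for name in modules
               if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_environment()
    print("Python version OK:", python_ok)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

If any module is reported as missing, the `validation` environment is probably not the active one, or its creation did not complete.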
Many settings in this Notebook can be customized, e.g. the paths to the trajectories (full, LEV-trimmed, and ESS-trimmed):
- stride: Stride for loading the trajectories.
- selection: Used for the RMSD comparison. This option can be "protein" or "all".
- input_dirs: Path for the full trajectories of the simulation.
- trimmed_trajectories_lev: Path for the trimmed trajectories of the simulation using LEV.
- trimmed_trajectories_ess: Path for the trimmed trajectories of the simulation using ESS.
- top_file: Topology file for the trajectories.
- nstates: Number of states for the MSM model and the PCCA+.
- anotations: Path for the annotation files. The annotation files contain the information on the LEV and ESS termination.
- SAVE_FRAMES: If True, the frames will be saved in the folder frames_closest.
- frames_closest_folder: Folder where the frames will be saved.
- dist_cmap: Color map for the energy plots.
- size: Size of the points in the plots.
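The options listed above could be gathered in a single configuration cell, as in the sketch below. Only the option names come from this README. Every value shown (paths, file names, stride, number of states, color map, point size) is a placeholder assumption rather than a default of the notebook, and `validate_config` and `apply_stride` are hypothetical helpers. The sketch also assumes that the stride has its usual meaning of keeping every stride-th frame.

```python
# All values are illustrative placeholders, not the notebook's defaults.
config = {
    "stride": 10,                          # keep every 10th frame when loading
    "selection": "protein",                # "protein" or "all"
    "input_dirs": "data/full/",            # placeholder path
    "trimmed_trajectories_lev": "data/trimmed_lev/",
    "trimmed_trajectories_ess": "data/trimmed_ess/",
    "top_file": "data/topology.pdb",       # placeholder file name
    "nstates": 4,
    "anotations": "data/annotations/",     # option name spelled as in the list
    "SAVE_FRAMES": False,
    "frames_closest_folder": "frames_closest",
    "dist_cmap": "viridis",
    "size": 5,
}


def validate_config(cfg):
    """Return a list of problems found in cfg (empty when none are found)."""
    problems = []
    if not isinstance(cfg["stride"], int) or cfg["stride"] < 1:
        problems.append("stride must be a positive integer")
    if cfg["selection"] not in ("protein", "all"):
        problems.append('selection must be "protein" or "all"')
    if not isinstance(cfg["nstates"], int) or cfg["nstates"] < 1:
        problems.append("nstates must be a positive integer")
    if not isinstance(cfg["SAVE_FRAMES"], bool):
        problems.append("SAVE_FRAMES must be True or False")
    return problems


def apply_stride(frames, stride):
    """Keep every stride-th frame, starting from the first one."""
    return frames[::stride]
```

Checking the settings once, before any trajectory is loaded, surfaces a mistyped option early instead of partway through a long analysis.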
Silvina Caino-Lores, Michel Cuendet, Jack Marquez, Ekaterina Kots, Trilce Estrada, Ewa Deelman, Harel Weinstein, and Michela Taufer. Runtime steering of molecular dynamics simulations through in situ analysis and annotation of collective variables. ACM Proceedings of the Platform for Advanced Scientific Computing Conference. ACM (2023). [link]
Harshita Sahni, Hector Carrillo-Cabada, Ekaterina Kots, Silvina Caino-Lores, Jack Marquez, Ewa Deelman, Michel Cuendet, Harel Weinstein, Michela Taufer, and Trilce Estrada. Online Boosted Gaussian Learners for In-Situ Detection and Characterization of Protein Folding States in Molecular Dynamics Simulations. 2023 IEEE 19th International Conference on e-Science (e-Science). IEEE (2023). [link]
Hector Carrillo-Cabada, Jeremy Benson, Asghar Razavi, Brianna Mulligan, Michel A. Cuendet, Harel Weinstein, Michela Taufer, and Trilce Estrada. A Graphic Encoding Method for Quantitative Classification of Protein Structure and Representation of Conformational Changes. IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE/ACM TCBB). (2020). [link]
Tu Mai Anh Do, Loic Pottier, Stephen Thomas, Rafael Ferreira da Silva, Michel A. Cuendet, Harel Weinstein, Trilce Estrada, Michela Taufer, and Ewa Deelman. A Novel Metric to Evaluate In Situ Workflows. In Proceedings of the International Conference on Computational Science (ICCS), pp. 1-14. (2020). [link]
Michela Taufer, Trilce Estrada, and Travis Johnston. A Survey of Algorithms for Transforming Molecular Dynamics Data into Metadata for In Situ Analytics based on Machine Learning Methods. Issue of Philosophical Transactions A, 378(2166):1-11. (2020). [link]
Copyright (c) 2022, Global Computing Lab
A4MD is distributed under the terms of the Apache License, Version 2.0 with LLVM Exceptions.
See LICENSE for more details.
This research was supported by the National Science Foundation (NSF) under grant numbers 1741057, 1841758, 2138811, 2223704 and 2331152; the Oak Ridge Leadership Computing Facility under allocation CSC427; the Extreme Science and Engineering Discovery Environment (XSEDE) under allocation TG-CIS200053; and IBM through a Shared University Research Award.
Dr. Michela Taufer
University of Tennessee