ML-in-the-loop molecular design with Globus Compute

This repository contains a tutorial showing how Globus Compute can be used to write a machine-learning-guided search for high-performing molecules.

The objective of this application is to identify which molecules have the largest ionization energies (IE, the amount of energy required to remove an electron).

IE can be computed using various simulation packages (here we use xTB); however, execution of these simulations is expensive, and thus, given a finite compute budget, we must carefully select which molecules to explore.

In this example, we use machine learning to predict molecules with high IE based on previous computations (a process often called active learning). We iteratively retrain the machine learning model to improve the accuracy of predictions.
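
The loop described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the notebook's implementation: `simulate_ie` is a hypothetical toy function standing in for an expensive xTB calculation, molecules are represented by a single number, and the "model" is a simple nearest-neighbor lookup rather than the model used in the tutorial.

```python
import random


def simulate_ie(x: float) -> float:
    """Hypothetical stand-in for an expensive simulation (not xTB)."""
    return -(x - 0.7) ** 2


def fit_model(train):
    """Return a nearest-neighbor predictor built from (x, value) pairs."""
    def predict(x):
        nearest = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return predict


def active_learning(candidates, n_initial=4, n_rounds=5, batch_size=2, seed=0):
    """Alternate between retraining the model and simulating its top picks."""
    rng = random.Random(seed)
    pool = list(candidates)
    rng.shuffle(pool)

    # Seed the training set with a few randomly chosen simulations.
    train = [(x, simulate_ie(x)) for x in pool[:n_initial]]
    pool = pool[n_initial:]

    for _ in range(n_rounds):
        if not pool:
            break
        # Retrain on everything simulated so far.
        predict = fit_model(train)
        # Rank the remaining candidates by predicted value, best first.
        pool.sort(key=predict, reverse=True)
        batch, pool = pool[:batch_size], pool[batch_size:]
        # Spend the compute budget only on the most promising candidates.
        train.extend((x, simulate_ie(x)) for x in batch)
    return train


candidates = [i / 49 for i in range(50)]
results = active_learning(candidates)
best_x, best_value = max(results, key=lambda pair: pair[1])
```

With the default arguments, the sketch runs 14 simulations in total (4 initial plus 5 rounds of 2) out of 50 candidates, which is the point of the approach: the model decides where the limited simulation budget goes.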

This tutorial is based on a tutorial for Parsl.

Installation

The demo builds on several packages to compute molecular properties and to build the machine learning loop. Install these dependencies with either Conda or Docker, as shown below.

conda env create --file environment.yml

docker build -t moldesign . 
docker run -it moldesign /bin/bash

Globus Compute Endpoint

The demo requires deployment of a Globus Compute endpoint. Importantly, the same dependencies must be available in the endpoint's environment. You can use the same Conda or Docker environment as above.

First, configure your Globus Compute endpoint (you must be in the Conda environment):

conda activate moldesign
globus-compute-endpoint configure

Second, start your endpoint and authenticate via Globus to securely pair your endpoint with your account. Optionally, you may update the endpoint configuration to use additional cores or make use of HPC resources. See the Globus Compute documentation for details on configuring endpoints.

globus-compute-endpoint start default

Make note of the endpoint's UUID to add to your notebook.
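
As a rough sketch of how the UUID is used from Python, the snippet below validates the ID and submits a function to the endpoint through the Globus Compute SDK's `Executor`. The helper names `check_endpoint_id` and `run_on_endpoint` are hypothetical and are not part of the notebook; the SDK import is deferred so the validation helper works even where `globus_compute_sdk` is not installed.

```python
import uuid


def check_endpoint_id(value: str) -> str:
    """Return the canonical form of a UUID string, or raise ValueError."""
    return str(uuid.UUID(value))


def run_on_endpoint(endpoint_id: str, func, *args):
    """Run func(*args) on a Globus Compute endpoint and return the result.

    Hypothetical helper: requires globus_compute_sdk, a running endpoint,
    and a completed Globus login.
    """
    # Imported lazily so this module loads without the SDK installed.
    from globus_compute_sdk import Executor

    with Executor(endpoint_id=check_endpoint_id(endpoint_id)) as gce:
        future = gce.submit(func, *args)
        return future.result()


# Example (not executed here); replace the placeholder with your own UUID:
# run_on_endpoint("<your-endpoint-uuid>", my_function, my_argument)
```

Because the function runs in the endpoint's environment, any packages it needs should be imported inside the function body and must be installed on the endpoint, which is why the same Conda or Docker environment is used on both sides.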
