Standard mlflow does not provide any authentication for its web interface. This project adds basic HTTP authentication with a single username and password, and packages it as an easy-to-install Docker image.
It is intended primarily for use on Heroku, with Google Cloud Storage as the artifact store and Heroku Postgres as the tracking store. It should be easy to make it work on other Docker providers and with other supported mlflow backends for artifacts and the database. Pull requests that fix compatibility issues are welcome.
In Use
This will provision a Heroku app and a Postgres add-on for persisting metrics and other tracking data. The artifact store needs to be configured separately; see below.
The following steps assume that you already have mlflow tracking integration set up.
Configure the client
export MLFLOW_TRACKING_URI=https://my-mlflow-instance.herokuapp.com
export MLFLOW_TRACKING_USERNAME=user
export MLFLOW_TRACKING_PASSWORD=user
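The username and password exported above are sent by the client as HTTP basic authentication credentials. The sketch below shows roughly what that looks like on the wire, which can help when debugging a 401 response. The helper names `basic_auth_header` and `build_request` are made up for illustration and are not part of mlflow or of this project.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP basic authentication header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a request carrying the credentials."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(username, password))
    return request
```

Passing the result of `build_request(...)` to `urllib.request.urlopen` should return the mlflow web interface when the credentials match the ones configured on the server.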
Create a new experiment
mlflow experiments create -n test6
export MLFLOW_EXPERIMENT_NAME=test6
Run your script
python3 example.py
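If you do not have a script at hand, a minimal sketch of what `example.py` could contain is shown below. It is hypothetical and not shipped with this project: the decaying "loss" values stand in for a real training loop, and the parameter and metric names are arbitrary. It relies on mlflow picking up the tracking URI, credentials and experiment name from the environment variables exported above.

```python
"""Hypothetical example.py: logs one parameter and a few metrics to mlflow."""
import math
import os


def fake_losses(steps: int, rate: float = 0.5) -> list:
    """Stand-in for a training loop: an exponentially decaying loss curve."""
    return [math.exp(-rate * step) for step in range(steps)]


def main() -> None:
    if "MLFLOW_TRACKING_URI" not in os.environ:
        # Without a tracking URI, mlflow would fall back to local storage
        # instead of the deployed server, so stop early.
        print("MLFLOW_TRACKING_URI is not set; nothing was logged.")
        return

    # Imported late so the check above works even without mlflow installed.
    import mlflow

    with mlflow.start_run():
        mlflow.log_param("rate", 0.5)
        for step, loss in enumerate(fake_losses(5)):
            mlflow.log_metric("loss", loss, step=step)


if __name__ == "__main__":
    main()
```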
Open the web browser at your newly deployed Heroku app. You should now have runs tracked with metrics being logged.
Using Google Cloud Storage
Create a new or find an existing Google Cloud Storage bucket.
Create a Service Account for API credentials. Download the credentials JSON file.
Add the Service Account to Permissions on the bucket. It needs to have the following roles:
Storage Legacy Bucket Writer
Storage Legacy Bucket Reader
Storage Legacy Object Reader
Add the config to the backend on Heroku
heroku config:set ARTIFACT_URL=gs://MY-BUCKET/SOME/FOLDER
heroku config:set GOOGLE_APPLICATION_CREDENTIALS_JSON="`cat serviceaccount-fa31bc1bbb1d.json`"
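Because the whole credentials file is passed through an environment variable, a truncated or wrong file is easy to miss. The following sketch checks a key file before it is uploaded. It assumes the usual layout of a Google service account key (a JSON object with `type`, `project_id`, `private_key` and `client_email`); the function name `check_service_account_key` is made up for this example.

```python
import json

# Fields that a downloaded service account key is expected to contain.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw: str) -> list:
    """Return a list of problems found in the key text; empty if it looks fine."""
    try:
        key = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

For example, `check_service_account_key(open("serviceaccount-fa31bc1bbb1d.json").read())` should return an empty list for a complete key file.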
Configure the mlflow client
export GOOGLE_APPLICATION_CREDENTIALS=credentials.json
Note that the artifact URL is stored per experiment, so after changing it you need to create a new experiment for artifacts to go to your GCS bucket.
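One way to confirm that a new experiment really points at the bucket is to create it from Python and inspect its artifact location. This is a sketch only: the helper names are invented here, and it assumes the client environment variables above are set so that mlflow talks to the deployed server.

```python
def is_gcs_location(artifact_location: str) -> bool:
    """True if the artifact location is a Google Cloud Storage URL."""
    return artifact_location.startswith("gs://")


def create_gcs_experiment(name: str) -> str:
    """Create an experiment and fail loudly if it does not store artifacts in GCS."""
    # Imported late so the helper above can be used without mlflow installed.
    import mlflow

    experiment_id = mlflow.create_experiment(name)
    experiment = mlflow.get_experiment(experiment_id)
    if not is_gcs_location(experiment.artifact_location):
        raise RuntimeError(
            f"experiment {name!r} stores artifacts at "
            f"{experiment.artifact_location!r}, not in a GCS bucket"
        )
    return experiment_id
```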
Clone this git repo
git clone https://github.com/geobao/mlflow-easyauth.git
cd mlflow-easyauth
Build the image
docker build -t my-mlflow-easyauth:latest .
Create a settings.env file with settings
cat <<EOT >> settings.env
MLFLOW_TRACKING_USERNAME=user
MLFLOW_TRACKING_PASSWORD=pass
GOOGLE_APPLICATION_CREDENTIALS_JSON=None
ARTIFACT_URL=mlruns
DATABASE_URL=mlruns
EOT
Run it
docker run -it -p 8001:6000 --env-file=settings.env my-mlflow-easyauth:latest
This command re-runs all the steps needed to build and run a new version
docker build -t mlflow-easytracking:latest . && docker run -it -p 8001:80 --env-file=`pwd`/dev.env mlflow-easytracking:latest bash /app/entry-point.sh
If you are using Heroku Free dynos, they go to sleep after a period of inactivity and wake up again on the next request. When the mlflow client connects, the app may therefore be sleeping, causing a communication timeout that fails the ML pipeline. If you use this in automated workflows, it may be smart to wake up the server a bit in advance by making an HTTP request to it, for example before installing the project's dependencies.
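A wake-up request of this kind could look like the sketch below. The function name `wake_up`, the retry count and the delays are illustrative choices, not part of this project. Any HTTP response, including a 401 from the authentication layer, is treated as proof that the app is awake.

```python
import time
import urllib.error
import urllib.request


def wake_up(url, attempts=5, delay=10.0, timeout=30.0, opener=urllib.request.urlopen):
    """Return True once the server at `url` answers an HTTP request, else False."""
    for attempt in range(attempts):
        try:
            opener(url, timeout=timeout).close()
            return True
        except urllib.error.HTTPError:
            # An HTTP error status (for example 401 without credentials)
            # still means the app is up and responding.
            return True
        except OSError:
            # Covers URLError and socket timeouts: no response yet.
            # Wait and retry, unless this was the last attempt.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

In a pipeline, calling `wake_up("https://my-mlflow-instance.herokuapp.com")` as an early step gives the app time to start before the first mlflow call. The `opener` parameter exists only so the function can be exercised without network access.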