from Algorithmia import ADK
# API calls will begin at the apply() method, with the request body passed as 'input'
# For more details, see algorithmia.com/developers/algorithm-development/languages
def apply(input):
    # If your apply function uses state that's loaded into memory via load, you can pass that loaded state to your apply
    # function by defining an additional "globals" parameter in your apply function; but it's optional!
    return "hello {}".format(str(input))
# This turns your library code into an algorithm that can run on the platform.
# If you intend to use loading operations, remember to pass a `load` function as a second variable.
algorithm = ADK(apply)
# The 'init()' function actually starts the algorithm, you can follow along in the source code
# to see how everything works.
algorithm.init("Algorithmia")
This document will describe the following:
- What is an Algorithm Development Kit
- Changes to Algorithm development
- Example workflows you can use to create your own Algorithms.
- The Model Manifest System
- DataRobot MLOps integration support
An Algorithm Development Kit (ADK) is a package that contains all of the components needed to convert a regular application into one that can be executed on Algorithmia.
To do that, an ADK must be able to communicate with langserver.
To keep things simple, an ADK exposes an apply function that acts as the explicit entrypoint into your algorithm, along with some optional functions.
Beyond those basics, the ADK can also execute your algorithm locally, without langserver, which makes debugging easier.
For an algorithm developer, the kit provides an easy way to get started on a new project, along with well-defined hooks for integrating with an existing one.
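To make the shape of that contract concrete, here is a minimal sketch of how a wrapper might wire an optional load function to an apply entrypoint. `MiniADK` is a hypothetical stand-in written purely for illustration and is not the ADK's actual implementation; only the names `apply`, `load`, and `init` come from this document.

```python
import inspect


class MiniADK:
    """Hypothetical, simplified stand-in for an ADK-style wrapper (illustration only)."""

    def __init__(self, apply_func, load_func=None):
        self.apply_func = apply_func
        self.load_func = load_func
        # Remember whether apply() wants the loaded state as a second argument.
        self.wants_state = len(inspect.signature(apply_func).parameters) > 1

    def init(self, payload):
        # Run the optional load step once, before handling the request.
        state = self.load_func({}) if self.load_func else None
        if self.wants_state:
            return self.apply_func(payload, state)
        return self.apply_func(payload)


def load(state):
    state["greeting"] = "hello"
    return state


def apply(input, state):
    return "{} {}".format(state["greeting"], input)


result = MiniADK(apply, load).init("Algorithmia")
print(result)
```

The point of the sketch is the split of responsibilities: expensive setup lives in `load`, and `apply` stays a plain function that is easy to call and test on its own.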
Algorithm development does change with this introduction:
- The primary development file has been renamed to `src/Algorithm.py` to aid in understanding what this file actually does and why it's important
- An additional import (`from Algorithmia import ADK`)
- An optional `load()` function that can be implemented
  - This enables a dedicated function for preparing your algorithm for runtime operations, such as model loading, configuration, etc.
- A call to the handler function with your `apply` and optional `load` functions as inputs
  - `algorithm = ADK(apply)` followed by `algorithm.init("Algorithmia")`
  - Converts the project into an executable, rather than a library
    - Which will interact with the `langserver` service on Algorithmia
    - But is debuggable via stdin/stdout when executed locally / outside of an Algorithm container
      - When a payload is provided to `init()`, that payload will be directly provided to your algorithm when executed locally, bypassing stdin parsing and simplifying debugging!
    - This includes being able to step through your algorithm code in your IDE of choice! Just execute your `src/Algorithm.py` script and try stepping through your code with your favorite IDE
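The local-debugging behaviour described in the list above can be sketched as follows. This is an illustration under stated assumptions, not the ADK's source: `run_locally` is a hypothetical helper, and treating stdin as one JSON document per line is an assumption made for the example.

```python
import io
import json
import sys


def run_locally(apply_func, payload=None, stdin=None):
    """Hypothetical helper: call apply_func on a payload, or on each JSON line from stdin."""
    if payload is not None:
        # A payload given up front skips stdin parsing entirely.
        return [apply_func(payload)]
    stream = stdin if stdin is not None else sys.stdin
    # Otherwise treat each non-empty line as one JSON-encoded request (assumed format).
    return [apply_func(json.loads(line)) for line in stream if line.strip()]


def apply(input):
    return "hello {}".format(str(input))


direct = run_locally(apply, payload="Algorithmia")
piped = run_locally(apply, stdin=io.StringIO('"Algorithmia"\n'))
print(direct, piped)
```

Because the payload path never touches stdin, you can set a breakpoint inside `apply` and run the script directly from an IDE.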
Check out these examples to help you get started:
from Algorithmia import ADK
# API calls will begin at the apply() method, with the request body passed as 'input'
# For more details, see algorithmia.com/developers/algorithm-development/languages
def apply(input):
    # If your apply function uses state that's loaded into memory via load, you can pass that loaded state to your apply
    # function by defining an additional "globals" parameter in your apply function; but it's optional!
    return "hello {}".format(str(input))
# This turns your library code into an algorithm that can run on the platform.
# If you intend to use loading operations, remember to pass a `load` function as a second variable.
algorithm = ADK(apply)
# The 'init()' function actually starts the algorithm, you can follow along in the source code
# to see how everything works.
algorithm.init("Algorithmia")
from Algorithmia import ADK
# API calls will begin at the apply() method, with the request body passed as 'input'
# For more details, see algorithmia.com/developers/algorithm-development/languages
def apply(input, modelData):
    # If your apply function uses state that's loaded into memory via load, you can pass that loaded state to your apply
    # function by defining an additional "globals" parameter in your apply function.
    return "hello {} {}".format(str(input), str(modelData.user_data['payload']))


def load(modelData):
    # Here you can optionally define a function that will be called when the algorithm is loaded.
    # The return object from this function can be passed directly as input to your apply function.
    # A great example would be any model files that need to be available to this algorithm
    # during runtime.
    # Any variables returned here, will be passed as the secondary argument to your 'algorithm' function
    modelData['payload'] = "Loading has been completed."
    return modelData
# This turns your library code into an algorithm that can run on the platform.
# If you intend to use loading operations, remember to pass a `load` function as a second variable.
algorithm = ADK(apply, load)
# The 'init()' function actually starts the algorithm, you can follow along in the source code
# to see how everything works.
algorithm.init("Algorithmia")
from Algorithmia import ADK
import Algorithmia
import torch
from PIL import Image
import json
from torchvision import models, transforms
client = Algorithmia.client()
def load_labels(label_path):
    with open(label_path) as f:
        labels = json.load(f)
    labels = [labels[str(k)][1] for k in range(len(labels))]
    return labels


def load_model(model_path):
    model = models.squeezenet1_1()
    weights = torch.load(model_path)
    model.load_state_dict(weights)
    return model.float().eval()


def get_image(image_url, smid_algo, client):
    input = {"image": image_url, "resize": {"width": 224, "height": 224}}
    result = client.algo(smid_algo).pipe(input).result["savePath"][0]
    local_path = client.file(result).getFile().name
    img_data = Image.open(local_path)
    return img_data


def infer_image(image_url, n, globals):
    model = globals["model"]
    labels = globals["labels"]
    image_data = get_image(image_url, globals["SMID_ALGO"], globals["CLIENT"])
    transformed = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])])
    img_tensor = transformed(image_data).unsqueeze(dim=0)
    infered = model.forward(img_tensor)
    preds, indicies = torch.sort(torch.softmax(infered.squeeze(), dim=0), descending=True)
    predicted_values = preds.detach().numpy()
    indicies = indicies.detach().numpy()
    result = []
    for i in range(n):
        label = labels[indicies[i]].lower().replace("_", " ")
        confidence = float(predicted_values[i])
        result.append({"label": label, "confidence": confidence})
    return result


def load(modelData):
    modelData["SMID_ALGO"] = "algo://util/SmartImageDownloader/0.2.x"
    modelData["model"] = load_model(modelData.get_model("squeezenet"))
    modelData["labels"] = load_labels(modelData.get_model("labels"))
    return modelData


def apply(input, modelData):
    if isinstance(input, dict):
        if "n" in input:
            n = input["n"]
        else:
            n = 3
        if "data" in input:
            if isinstance(input["data"], str):
                output = infer_image(input["data"], n, modelData)
            elif isinstance(input["data"], list):
                for row in input["data"]:
                    row["predictions"] = infer_image(row["image_url"], n, modelData)
                output = input["data"]
            else:
                raise Exception("\"data\" must be a image url or a list of image urls (with labels)")
            return output
        else:
            raise Exception("\"data\" must be defined")
    else:
        raise Exception("input must be a json object")


algorithm = ADK(apply_func=apply, load_func=load, client=client)
algorithm.init({"data": "https://i.imgur.com/bXdORXl.jpeg"})
Model Manifests are optional files that you can provide to your algorithm to easily define important model files, their locations, and their metadata. This file is called `model_manifest.json`.
{
  "required_files": [
    {
      "name": "squeezenet",
      "source_uri": "data://AlgorithmiaSE/image_cassification_demo/squeezenet1_1-f364aa15.pth",
      "fail_on_tamper": true,
      "metadata": {
        "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
      }
    },
    {
      "name": "labels",
      "source_uri": "data://AlgorithmiaSE/image_cassification_demo/imagenet_class_index.json",
      "fail_on_tamper": true,
      "metadata": {
        "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
      }
    }
  ],
  "optional_files": [
    {
      "name": "mobilenet",
      "source_uri": "data://AlgorithmiaSE/image_cassification_demo/mobilenet_v2-b0353104.pth",
      "fail_on_tamper": false,
      "metadata": {
        "dataset_md5_checksum": "46a44d32d2c5c07f7f66324bef4c7266"
      }
    }
  ]
}
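As a rough illustration of how code might consume a manifest like the one above, the sketch below parses a trimmed copy and looks up an entry by name. `find_model_entry` is a hypothetical helper written for this example; it is not the ADK's `get_model` and makes no claim about how the ADK resolves or downloads files.

```python
import json

# A trimmed copy of the manifest shown above; URIs are taken from that example.
MANIFEST_JSON = """
{
  "required_files": [
    {"name": "squeezenet",
     "source_uri": "data://AlgorithmiaSE/image_cassification_demo/squeezenet1_1-f364aa15.pth",
     "fail_on_tamper": true}
  ],
  "optional_files": [
    {"name": "mobilenet",
     "source_uri": "data://AlgorithmiaSE/image_cassification_demo/mobilenet_v2-b0353104.pth",
     "fail_on_tamper": false}
  ]
}
"""


def find_model_entry(manifest, name):
    """Hypothetical lookup: return (entry, is_required) for a named file, or raise KeyError."""
    for section, required in (("required_files", True), ("optional_files", False)):
        for entry in manifest.get(section, []):
            if entry["name"] == name:
                return entry, required
    raise KeyError("no manifest entry named {!r}".format(name))


manifest = json.loads(MANIFEST_JSON)
entry, required = find_model_entry(manifest, "mobilenet")
print(entry["source_uri"], required)
```

Keeping the name-to-URI mapping in one file means algorithm code can refer to "squeezenet" or "labels" without hard-coding storage paths.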
With the Model Manifest system, you're also able to "freeze" your model_manifest.json, creating a model_manifest.json.freeze. This file encodes the hash of the model file, preventing tampering once frozen and forever locking a version of your algorithm code with your model file.
{
  "required_files":[
    {
      "name":"squeezenet",
      "source_uri":"data://AlgorithmiaSE/image_cassification_demo/squeezenet1_1-f364aa15.pth",
      "fail_on_tamper":true,
      "metadata":{
        "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
      },
      "md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
    },
    {
      "name":"labels",
      "source_uri":"data://AlgorithmiaSE/image_cassification_demo/imagenet_class_index.json",
      "fail_on_tamper":true,
      "metadata":{
        "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
      },
      "md5_checksum":"c2c37ea517e94d9795004a39431a14cb"
    }
  ],
  "optional_files":[
    {
      "name":"mobilenet",
      "source_uri":"data://AlgorithmiaSE/image_cassification_demo/mobilenet_v2-b0353104.pth",
      "fail_on_tamper":false,
      "metadata":{
        "dataset_md5_checksum":"46a44d32d2c5c07f7f66324bef4c7266"
      }
    }
  ],
  "timestamp":"1633450866.985464",
  "lock_checksum":"24f5eca888d87661ca6fc08042e40cb7"
}
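The tamper check that a frozen manifest makes possible can be sketched with the standard library alone. The example below compares a file's MD5 digest against an entry's `md5_checksum` and raises only when `fail_on_tamper` is true. `check_entry` is a hypothetical helper; how the ADK itself performs this check, and how `lock_checksum` is derived, are not covered here.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=8192):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_entry(entry, local_path):
    """Hypothetical check: compare a downloaded file against a frozen manifest entry."""
    matches = md5_of_file(local_path) == entry["md5_checksum"]
    if not matches and entry.get("fail_on_tamper", False):
        raise ValueError("checksum mismatch for {}".format(entry["name"]))
    return matches


# Demonstrate with a throwaway file instead of a real model.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"model bytes")
entry = {"name": "demo", "fail_on_tamper": True,
         "md5_checksum": hashlib.md5(b"model bytes").hexdigest()}
ok = check_entry(entry, tmp.name)
os.remove(tmp.name)
print(ok)
```

Reading in chunks keeps memory use flat even for large model files.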
Because you can link to both hosted data collections and AWS/GCP/Azure-based block storage media, you're able to link your algorithm code with your model files, wherever they live today.
As part of the integration with DataRobot, we've built out integration support for the DataRobot MLOps Agent.
When you pass `mlops=True` to the ADK `init()` function, the ADK will configure and set up the MLOps Agent to support writing content directly back to DataRobot.
For this, you'll need to select an MLOps Enabled Environment, and you will need to set up a DataRobot External Deployment.
Once that is set up, you will need to define your `mlops.json` file, including your deployment and model ids.
{
"model_id": "YOUR_MODEL_ID",
"deployment_id": "YOUR_DEPLOYMENT_ID",
"datarobot_mlops_service_url": "https://app.datarobot.com"
}
Once you have also defined your `DATAROBOT_MLOPS_API_TOKEN` as a secret on your Algorithm, you're ready to start sending MLOps data back to DataRobot!
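Before running an MLOps-enabled algorithm, it can help to sanity-check the configuration described above. The sketch below is a hypothetical pre-flight helper, not part of the ADK or the DataRobot libraries. It assumes that all three keys shown in the sample `mlops.json` are required and that the secret is visible to the process as an environment variable named `DATAROBOT_MLOPS_API_TOKEN`.

```python
import json
import os

# Assumption: every key from the sample mlops.json is treated as required.
REQUIRED_KEYS = ("model_id", "deployment_id", "datarobot_mlops_service_url")


def load_mlops_config(text, env=None):
    """Hypothetical helper: parse mlops.json text and attach the API token from the environment."""
    env = os.environ if env is None else env
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("mlops.json is missing: {}".format(", ".join(missing)))
    token = env.get("DATAROBOT_MLOPS_API_TOKEN")
    if not token:
        raise ValueError("DATAROBOT_MLOPS_API_TOKEN is not set")
    return dict(config, api_token=token)


sample = json.dumps({
    "model_id": "YOUR_MODEL_ID",
    "deployment_id": "YOUR_DEPLOYMENT_ID",
    "datarobot_mlops_service_url": "https://app.datarobot.com",
})
config = load_mlops_config(sample, env={"DATAROBOT_MLOPS_API_TOKEN": "example-token"})
print(sorted(config))
```

Failing fast on a missing id or token gives a clearer error than a failed report call later in `apply`.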
from Algorithmia import ADK
from time import time

import pandas as pd
# Assumption: MLOps comes from the DataRobot MLOps client library; adjust this import to match your installed version.
from datarobot.mlops.mlops import MLOps


# API calls will begin at the apply() method, with the request body passed as 'input'
# For more details, see algorithmia.com/developers/algorithm-development/languages
def load(state):
    # Let's initialize the final components of the MLOps plugin and prepare it for sending info back to DataRobot.
    state['mlops'] = MLOps().init()
    return state


def apply(input, state):
    t1 = time()
    df = pd.DataFrame(columns=['id', 'values'])
    df.loc[0] = ["abcd", 0.25]
    # Update the cell with a single .loc call so the change is written to df itself rather than to a temporary row.
    df.loc[0, 'values'] += input
    association_ids = df.iloc[:, 0].tolist()
    reporting_predictions = df.loc[0, 'values']
    t2 = time()
    # As we're only making 1 prediction, our reporting tool should show only 1 prediction being made
    state['mlops'].report_deployment_stats(1, t2 - t1)
    # Report the predictions data: features, predictions, class_names
    state['mlops'].report_predictions_data(features_df=df,
                                           predictions=reporting_predictions,
                                           association_ids=association_ids)
    return reporting_predictions


algorithm = ADK(apply, load)
algorithm.init(0.25, mlops=True)
To compile the template readme, check out the embedme utility and run the following:
npm install -g npx
npx embedme --stdout README_template.md > README.md
Publishing should be automatic on new releases, but if you wish to publish manually, this is the process. First, make sure to update the version in setup.py. Then install these Python dependencies:
pip install wheel==0.33
pip install setuptools==41.6
pip install twine==1.15
Setup your ~/.pypirc file:
[distutils]
index-servers =
  pypi
  pypitest

[pypi]
repository: https://upload.pypi.org/legacy/
username: algorithmia
password: {{...}}
[pypitest]
repository: https://test.pypi.org/legacy/
username: algorithmia
password: {{...}}
The passwords (and the pypirc file itself) can be found in our devtools service. Make sure to update your setup.py with the new version before compiling. Also make sure that the package is built on Linux and not on any other platform. Compile via setup.py, then upload to pypitest:
python setup.py sdist bdist_wheel --universal
python -m twine upload -r pypitest dist/*
Verify that it works on pypitest, then:
python -m twine upload -r pypi dist/*
and you're done :)