Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Updating worker guide for new deployment patterns plus small warnings added to avoid common snags #300

Merged
merged 10 commits into from
Aug 1, 2023
218 changes: 105 additions & 113 deletions docs/ecs_guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@ ECS (Elastic Container Service) tasks are a good option for executing Prefect 2
To set up Prefect flow execution in your cloud infrastructure, you need to understand two key Prefect concepts:
taylor-curran marked this conversation as resolved.
Show resolved Hide resolved

- a [worker](https://docs.prefect.io/latest/concepts/work-pools/#worker-overview) to poll Prefect's API (specifically that worker's [work queue](https://docs.prefect.io/latest/concepts/work-pools/#work-queues))
- and the emphemeral [infrastructure](https://docs.prefect.io/2.10.11/concepts/infrastructure/) that this worker will spin up for each scheduled [flow run](https://docs.prefect.io/2.10.11/concepts/flows/#flow-runs).
- and the ephemeral [infrastructure](https://docs.prefect.io/2.10.11/concepts/infrastructure/) that this worker will spin up for each scheduled [flow run](https://docs.prefect.io/2.10.11/concepts/flows/#flow-runs).
taylor-curran marked this conversation as resolved.
Show resolved Hide resolved

#### Architecture Diagram
```mermaid
Expand Down Expand Up @@ -84,7 +84,7 @@ Before you begin, make sure you have:

- An AWS account with permissions to create ECS services and IAM roles.
- The AWS CLI installed on your local machine. You can [download it from the AWS website](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html).
- An [ECS Cluster](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/clusters.html) to host both the worker and the flow runs it submits.
- An [ECS Cluster](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/clusters.html) to host both the worker and the flow runs it submits. Follow [this guide](https://docs.aws.amazon.com/AmazonECS/latest/userguide/create_cluster.html) to create an ECS cluster or simply use the default cluster.
- A [VPC](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html) configured for your ECS tasks. A VPC is a good idea if using EC2 and required if using Fargate.

### Step 1: Set Up an ECS work pool
Expand All @@ -110,7 +110,7 @@ Configuring custom fields is easiest from the UI.
![Launch](img/LaunchType_UI.png)
<sub>Pictures of UI Cropped

Next, set up an Prefect ECS worker that will discover and pull work from this work pool.
Next, set up a Prefect ECS worker that will discover and pull work from this work pool.

### Step 2: Start a Prefect worker in your ECS cluster.

Expand Down Expand Up @@ -165,84 +165,105 @@ To create an [IAM role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_role

Now, you have a role named `ecsTaskExecutionRole` that you can assign to your ECS tasks. This role has the necessary permissions to pull container images and publish logs to CloudWatch.

4. **Create a task definition**
4. **Launch an ECS Service to host the worker**

Next, create an ECS task definition that specifies the Docker image for the Prefect worker, the resources it requires, and the command it should run. In this example, the command to start the worker is `prefect worker start --pool my-ecs-pool`.
Next, create an ECS task definition that specifies the Docker image for the Prefect worker, the resources it requires, and the command it should run. In this example, the command to start the worker is `prefect worker start --pool my-ecs-pool`.

Here are the steps:
5. **Create a JSON file with the following contents:**

1. Create a JSON file with the following contents:
taylor-curran marked this conversation as resolved.
Show resolved Hide resolved

```json
{
"family": "prefect-worker-task",
"networkMode": "awsvpc",
"taskRoleArn": "<your-ecs-task-role-arn>",
"executionRoleArn": "<your-ecs-task-role-arn>",
"containerDefinitions": [
```json
{
"name": "prefect-worker",
"image": "prefecthq/prefect:2-latest",
"cpu": 512,
"memory": 1024,
"essential": true,
"command": [
"pip",
"install",
"prefect-aws",
"&&",
"prefect",
"worker",
"start",
"--pool",
"my-ecs-pool"
],
"environment": [
{
"name": "PREFECT_API_URL",
"value": "https://api.prefect.cloud/api/accounts/<your-account-id>/workspaces/<your-workspace-id>"
},
"family": "prefect-worker-task",
"networkMode": "awsvpc",
"requiresCompatibilities": [
"FARGATE"
],
"cpu": "512",
"memory": "1024",
"executionRoleArn": "<your-ecs-task-role-arn>",
"taskRoleArn": "<your-ecs-task-role-arn>",
"containerDefinitions": [
{
"name": "PREFECT_API_KEY",
"value": "<your-api-key>"
"name": "prefect-worker",
"image": "prefecthq/prefect",
"cpu": 512,
"memory": 1024,
"essential": true,
"command": [
"pip",
"install",
"prefect-aws",
"&&",
"prefect",
"worker",
"start",
"--pool",
"my-ecs-pool",
"--type",
"ecs"
],
"environment": [
{
"name": "PREFECT_API_URL",
"value": "https://api.prefect.cloud/api/accounts/<your-account-id>/workspaces/<your-workspace-id>"
},
{
"name": "PREFECT_API_KEY",
"value": "<your-prefect-api-key>"
}
]
}
]
]
}
]
}
```
```

- Use `prefect config view` to view the `PREFECT_API_URL` for your current Prefect profile.
- Use `prefect config view` to view the `PREFECT_API_URL` for your current Prefect profile. Use this to replace both `<your-account-id>` and `<your-workspace-id>`.

- For the `PREFECT_API_KEY`, individuals on the organization tier can create a [service account](https://docs.prefect.io/latest/cloud/users/service-accounts/) for the worker. If on a personal tier, you can pass a user’s API key.
- For the `PREFECT_API_KEY`, individuals on the organization tier can create a [service account](https://docs.prefect.io/latest/cloud/users/service-accounts/) for the worker. If on a personal tier, you can pass a user’s API key.

- Replace `<your-ecs-task-role-arn>` with the ARN of the IAM role you created in Step 1, and `<your-ecr-image>` with the URI of the Docker image you pushed to Amazon ECR.
- Replace `<your-ecs-task-role-arn>` with the ARN of the IAM role you created in Step 1.

- Notice that the CPU and Memory allocations are relatively small. The worker's main responsibility is to submit work, _not_ to execute your Prefect flow code.
- Notice that the CPU and Memory allocations are relatively small. The worker's main responsibility is to submit work through API calls to AWS, _not_ to execute your Prefect flow code.

!!! tip
To avoid hardcoding your API key into the task definition JSON see [how to add environment variables to the container definition](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/secrets-envvar-secrets-manager.html#secrets-envvar-secrets-manager-update-container-definition).
!!! tip
To avoid hardcoding your API key into the task definition JSON see [how to add environment variables to the container definition](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/secrets-envvar-secrets-manager.html#secrets-envvar-secrets-manager-update-container-definition). The API key must be stored as plain text, not the key-value pair dictionary that it is formatted in by default.

6. **Register the task definition:**

Before creating a service, you first need to register a task definition. You can do that using the `register-task-definition` command in the AWS CLI. Here is an example:

```bash

### Step 3: Create an ECS service to host your worker
aws ecs register-task-definition --cli-input-json file://task-definition.json
```

Finally, create a service that will manage your Prefect worker:

Open a terminal window and run the following command to create an ECS Fargate service:

```bash
aws ecs create-service \
--service-name prefect-worker-service \
--cluster <your-ecs-cluster> \
--task-definition file://<path-to-task-definition-file>.json \
--launch-type FARGATE \
--desired-count 1 \
--network-configuration "awsvpcConfiguration={subnets=[<your-subnet-ids>],securityGroups=[<your-security-group-ids>]}"
```
Replace `task-definition.json` with the name of your JSON file.

Replace `<your-ecs-cluster>` with the name of your ECS cluster, `<path-to-task-definition-file>` with the path to the JSON file you created in Step 2, `<your-subnet-ids>` with a comma-separated list of your VPC subnet IDs, and `<your-security-group-ids>` with a comma-separated list of your VPC security group IDs.
7. **Create an ECS service to host your worker:**

!!! tip "Sanity check"
The work pool page in the Prefect UI allows you to check the health of your workers - make sure your new worker is live!
Finally, create a service that will manage your Prefect worker:

Open a terminal window and run the following command to create an ECS Fargate service:

```bash
aws ecs create-service \
--service-name prefect-worker-service \
--cluster <your-ecs-cluster> \
--task-definition <task-definition-arn> \
--launch-type FARGATE \
--desired-count 1 \
--network-configuration "awsvpcConfiguration={subnets=[<your-subnet-ids>],securityGroups=[<your-security-group-ids>]}"
```

- Replace `<your-ecs-cluster>` with the name of your ECS cluster.
- Replace `<path-to-task-definition-file>` with the path to the JSON file you created in Step 2, `<your-subnet-ids>` with a comma-separated list of your VPC subnet IDs. Ensure that these subnets are aligned with the vpc specified on the work pool in step 1.
- Replace `<your-security-group-ids>` with a comma-separated list of your VPC security group IDs.
- Replace `<task-definition-arn>` with the ARN of the task definition you just registered.

!!! tip "Sanity check"
The work pool page in the Prefect UI allows you to check the health of your workers - make sure your new worker is live!

### Step 4: Pick up a flow run with your new worker!

Expand All @@ -262,44 +283,13 @@ Replace `<your-ecs-cluster>` with the name of your ECS cluster, `<path-to-task-d
my_flow()
```

2. Configure a project [pull step](https://docs.prefect.io/latest/concepts/projects/#the-pull-section) to allow the worker to access your flow code at runtime.

!!! Tip "Create a [Prefect Project](https://docs.prefect.io/latest/tutorials/projects/#initializing-a-project)."
Run the following command at the base of whichever repo contains your flow. (See our [Projects Tutorial](https://docs.prefect.io/latest/tutorial/projects/#initializing-a-project) for other recipes with other pull step options.)
```bash
prefect project init --recipe git
```

In the `prefect.yaml` file, specify the code in the [pull step](https://docs.prefect.io/latest/concepts/projects/#the-pull-section) to allow the worker to access your flow code.
`prefect.yaml`
```yaml
# File for configuring project / deployment build, push and pull steps

# Generic metadata about this project
name: my-repo-name
prefect-version: 2.10.12

# build section allows you to manage and build docker images
build: null

# push section allows you to manage if and how this project is uploaded to remote locations
push: null

# pull section allows you to provide instructions for cloning this project in remote locations
pull:
- prefect.projects.steps.git_clone_project:
repository: https://github.com/git-user/my-repo-name.git
branch: main
access_token: null
```

3. Deploy the flow to the server, specifying the ECS work pool
2. [Deploy](https://docs.prefect.io/2.11.0/tutorial/deployments/#create-a-deployment) the flow to the server, specifying the ECS work pool when prompted.

```bash
prefect deploy my_flow.py:my_flow --name ecs-test-deployment --pool my-ecs-pool
prefect deploy my_flow.py:my_flow
```

4. Find the deployment in the UI and click the **Quick Run** button!
3. Find the deployment in the UI and click the **Quick Run** button!


## Optional Next Steps
Expand All @@ -309,25 +299,27 @@ Replace `<your-ecs-cluster>` with the name of your ECS cluster, `<path-to-task-d
- Do your flow runs require higher `CPU`?
- Would an EC2 `Launch Type` speed up your flow run execution?

These infrastructure configuration values can be set on your ECS work pool or they can be [overriden on the deployment level](https://docs.prefect.io/latest/concepts/infrastructure/#kubernetesjob-overrides-and-customizations) if desired.

These infrastructure configuration values can be set on your ECS work pool or they can be overridden on the deployment level through [job_variables](https://docs.prefect.io/2.11.0/concepts/infrastructure/#kubernetesjob-overrides-and-customizations) if desired.

2. Create a custom image for your flow executions, which you can build and push to the Amazon ECR registry.

For example, a minimal base Dockerfile for a flow run in ECS could look like:

```Dockerfile
FROM prefecthq/prefect:2-python3.10
RUN pip install s3fs prefect-aws
```
2. Consider adding a [build action](https://docs.prefect.io/2.11.0/concepts/deployments-ux/#the-build-action) to your Prefect Project [`prefect.yaml`](https://docs.prefect.io/2.11.0/concepts/deployments-ux/#the-prefect-yaml-file) if you want to automatically build a Docker image and push it to an image registry `prefect deploy` is run.

3. Consider adding a [build step](https://docs.prefect.io/latest/concepts/projects/#the-build-section) to your Prefect Project `prefect.yaml` if you want to automatically build a Docker image and push it to the repository referenced by the image name each time `prefect deploy` is run.
Here is an example build action for ECR:
```yaml
build:
- prefect_docker.projects.steps.build_docker_image:
requires: prefect-docker>=0.2.0
image_name: my-repo/my-image
tag: my-tag
dockerfile: auto
push: true
- prefect.deployments.steps.run_shell_script:
id: get-commit-hash
script: git rev-parse --short HEAD
stream_output: false
- prefect.deployments.steps.run_shell_script:
id: ecr-auth-step
script: aws ecr get-login-password --region <region> | docker login --username
AWS --password-stdin <>.dkr.ecr.<region>.amazonaws.com
stream_output: false
- prefect_docker.deployments.steps.build_docker_image:
requires: prefect-docker>=0.3.0
image_name: <>.dkr.ecr.us-east-2.amazonaws.com/<registry>
taylor-curran marked this conversation as resolved.
Show resolved Hide resolved
tag: '{{ get-commit-hash.stdout }}'
dockerfile: auto
push: true
```