Update to version v2.4.0
YikaiHu committed Apr 27, 2023
1 parent 32543bc · commit c8add01
Showing 123 changed files with 1,225 additions and 1,321 deletions.
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -4,7 +4,11 @@
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [2.3.1] - 2023-04
+## [2.4.0] - 2023-04-28
+### Added
+- Support for requester pays mode in S3 transfer tasks.
+
+## [2.3.1] - 2023-04-18
### Fixed
- Fix deployment failure due to S3 ACL changes.

6 changes: 4 additions & 2 deletions README.md
@@ -17,6 +17,8 @@
gcr.io, Red Hat Quay.io) to Amazon ECR.

![](docs/images/homepage.png)

+You will be responsible for your compliance with all applicable laws in respect of your data transfer tasks.
+
## Features

- [x] Authentication (Cognito User Pool, OIDC)
@@ -90,8 +92,8 @@
Create your first data transfer task. For the complete user guide, refer to

## FAQ

-**Q. Which are the supported Reigons of this solution?**</br>
-You can deploy this solution in these Reigons: N.Virginia (us-east-1), Ohio (us-east-2), N.California (us-west-1),
+**Q. Which are the supported Regions of this solution?**</br>
+You can deploy this solution in these Regions: N.Virginia (us-east-1), Ohio (us-east-2), N.California (us-west-1),
Oregon (us-west-2), Mumbai (ap-south-1), Seoul (ap-northeast-2), Singapore (ap-southeast-1), Sydney (ap-southeast-2),
Tokyo (ap-northeast-1), Canada (ca-central-1), Frankfurt (eu-central-1), Ireland (eu-west-1), London (eu-west-2),
Paris (eu-west-3), Stockholm (eu-north-1), São Paulo (sa-east-1), Beijing (cn-north-1), Ningxia (cn-northwest-1).
2 changes: 1 addition & 1 deletion docs/S3-SSE-KMS-Policy.md
@@ -37,7 +37,7 @@ _Note_: If it's for S3 buckets in China regions, please make sure you also chang
"kms:DescribeKey"
],
"Resource": [
"arn:aws:kms:us-west-2:123456789012:key/f5cd8cb7-476c-4322-ac9b-0c94a687700d <Please replace to your own KMS key arn>"
"arn:aws:kms:us-west-2:111122223333:key/f5cd8cb7-476c-4322-ac9b-0c94a687700d <Please replace to your own KMS key arn>"
]
}
]
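
For context, the sketch below shows what a policy like this enables: when a credential reads an SSE-KMS-encrypted object, Amazon S3 calls AWS KMS on its behalf, which is why the key ARN must be listed as a resource. This is a minimal, hypothetical illustration with placeholder bucket names, not part of the solution's code.

```python
import boto3

s3 = boto3.client("s3")

# Reading an SSE-KMS object: S3 transparently decrypts via KMS,
# so the source key's ARN must be covered by the policy.
obj = s3.get_object(Bucket="example-source-bucket", Key="data/example.bin")

# Writing with SSE-KMS: S3 generates a data key under the target KMS key,
# so that key's ARN must also be covered by the policy.
s3.put_object(
    Bucket="example-destination-bucket",
    Key="data/example.bin",
    Body=obj["Body"].read(),
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="arn:aws:kms:us-west-2:111122223333:key/f5cd8cb7-476c-4322-ac9b-0c94a687700d",
)
```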
2 changes: 1 addition & 1 deletion docs/S3-SSE-KMS-Policy_CN.md
@@ -37,7 +37,7 @@ _Note_: If it's for S3 buckets in China regions, please make sure you also change
"kms:DescribeKey"
],
"Resource": [
"arn:aws:kms:us-west-2:123456789012:key/f5cd8cb7-476c-4322-ac9b-0c94a687700d <Please replace to your own KMS key arn>"
"arn:aws:kms:us-west-2:111122223333:key/f5cd8cb7-476c-4322-ac9b-0c94a687700d <Please replace to your own KMS key arn>"
]
}
]
12 changes: 0 additions & 12 deletions docs/en-base/additional-resources.md

This file was deleted.

55 changes: 55 additions & 0 deletions docs/en-base/architecture-overview/architecture-details.md
@@ -0,0 +1,55 @@
This section describes the components and AWS services that make up this solution, and the architecture details of how these components work together.


## AWS services in this solution

The following AWS services are included in this solution:

| AWS service | Description |
| --- | --- |
| [Amazon CloudFront](https://aws.amazon.com/cloudfront/) | **Core**. To make the static web assets (frontend user interface) available. |
| [AWS AppSync](https://aws.amazon.com/appsync/) | **Core**. To provide the backend APIs. |
| [AWS Lambda](https://aws.amazon.com/lambda/) | **Core**. To call backend APIs. |
| [Amazon ECS](https://aws.amazon.com/ecs/) | **Core**. To run the container images used by the plugin template. |
| [Amazon DynamoDB](https://aws.amazon.com/dynamodb/) | **Core**. To store a record with transfer status for each object. |
| [Amazon EC2](https://aws.amazon.com/ec2/) | **Core**. To consume the messages in Amazon SQS and transfer the object from the source bucket to the destination bucket. |
| [AWS Secrets Manager](https://aws.amazon.com/secrets-manager/) | **Core**. To store the credentials for data transfer. |
| [AWS Step Functions](https://aws.amazon.com/step-functions/) | **Supporting**. To start or stop/delete the ECR or S3 plugin template. |
| [Amazon S3](https://aws.amazon.com/s3/) | **Supporting**. To store the static web assets (frontend user interface). |
| [Amazon Cognito](https://aws.amazon.com/cognito/) | **Supporting**. To authenticate users (in AWS Regions). |
| [Amazon ECR](https://aws.amazon.com/ecr/) | **Supporting**. To host the container images. |
| [Amazon SQS](https://aws.amazon.com/sqs/) | **Supporting**. To store the transfer tasks temporarily as a buffer. |
| [Amazon EventBridge](https://aws.amazon.com/eventbridge/) | **Supporting**. To invoke the transfer tasks regularly. |
| [Amazon SNS](https://aws.amazon.com/sns/) | **Supporting**. To provide topic and email subscription notifications for data transfer results. |
| [Amazon CloudWatch](https://aws.amazon.com/cloudwatch/) | **Supporting**. To monitor the data transfer progress. |

## How Data Transfer Hub works

This solution has three components: a web console, the Amazon S3 transfer engine, and the Amazon ECR transfer engine.

### Web console
This solution provides a simple web console which allows you to create and manage transfer tasks for Amazon S3 and Amazon ECR.

### Amazon S3 transfer engine
The Amazon S3 transfer engine runs the Amazon S3 plugin and transfers objects from their sources into Amazon S3 buckets. The S3 plugin supports the following features:

- Transfer Amazon S3 objects between AWS China Regions and AWS Regions
- Transfer objects from Alibaba Cloud OSS / Tencent COS / Qiniu Kodo to Amazon S3
- Transfer objects from S3 Compatible Storage service to Amazon S3
- Support near-real-time transfer via S3 events
- Support transfer with object metadata
- Support incremental data transfer
- Support transfer from Amazon S3 requester pays buckets (see the sketch after this list)
- Auto retry and error handling
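
The requester pays support can be illustrated with a short sketch: when the source bucket has Requester Pays enabled, every read must explicitly acknowledge the request charges. This is a minimal, hypothetical example (placeholder bucket and key names), not the plugin's actual JobWorker code.

```python
import boto3

s3 = boto3.client("s3")

def copy_requester_pays_object(src_bucket: str, dst_bucket: str, key: str) -> None:
    """Copy one object from a Requester Pays source bucket."""
    # RequestPayer="requester" acknowledges that the caller, not the
    # bucket owner, pays the request and data-transfer costs.
    obj = s3.get_object(Bucket=src_bucket, Key=key, RequestPayer="requester")
    s3.put_object(
        Bucket=dst_bucket,
        Key=key,
        Body=obj["Body"].read(),
        Metadata=obj.get("Metadata", {}),  # carry over user-defined metadata
    )

copy_requester_pays_object("example-source", "example-destination", "data/object.bin")
```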

### Amazon ECR transfer engine

The Amazon ECR transfer engine runs the Amazon ECR plugin and transfers container images from other container registries. The ECR plugin supports the following features:

- Transfer Amazon ECR images between AWS China Regions and AWS Regions
- Transfer from public container registries (such as Docker Hub, GCR.io, Quay.io) to Amazon ECR
- Transfer selected images to Amazon ECR
- Transfer all images and tags from Amazon ECR

The ECR plugin leverages [skopeo][skopeo] as the underlying engine. The AWS Lambda function lists images in their sources and uses Fargate to run the transfer jobs.

[skopeo]: https://github.com/containers/skopeo
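
To give a flavor of what each Fargate transfer job does, here is a minimal sketch that shells out to skopeo to copy a single image between registries. It is illustrative only; the plugin's actual invocation, credential handling, and the image names below are assumptions.

```python
import subprocess

def copy_image(src_image: str, dst_image: str, dst_creds: str) -> None:
    """Copy one container image with skopeo.

    `skopeo copy` pulls from the source registry and pushes to the
    destination registry without requiring a local Docker daemon.
    """
    subprocess.run(
        [
            "skopeo", "copy",
            "--dest-creds", dst_creds,  # "user:password" for the target registry
            f"docker://{src_image}",
            f"docker://{dst_image}",
        ],
        check=True,
    )

# Hypothetical example: mirror a public image into a private Amazon ECR
# repository, authenticating with the token from `aws ecr get-login-password`.
copy_image(
    "quay.io/example/app:1.0",
    "111122223333.dkr.ecr.us-west-2.amazonaws.com/app:1.0",
    "AWS:<token-from-aws-ecr-get-login-password>",
)
```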
@@ -1,16 +1,15 @@
-Deploying the Data Transfer Hub solution with the default parameters builds the following environment in the AWS Cloud.
+Deploying this solution with the default parameters builds the following environment in the AWS Cloud.

-![architecture](./images/arch-global.png)
-
-Figure 1: Data Transfer Hub architecture
+![architecture](../images/arch-global.png)
+*Data Transfer Hub architecture*

-The solution automatically deploys and configures a serverless architecture with the following services:
+This solution deploys the AWS CloudFormation template in your AWS Cloud account and completes the following settings.

1. The solution’s static web assets (frontend user interface) are stored in [Amazon S3][s3] and made available through [Amazon CloudFront][cloudfront].
2. The backend APIs are provided via [AWS AppSync][appsync] GraphQL.
3. Users are authenticated by either [Amazon Cognito][cognito] User Pool (in AWS Regions) or by an OpenID Connect provider (in AWS China Regions) such as [Authing](https://www.authing.cn/), [Auth0](https://auth0.com/), etc.
4. AWS AppSync runs [AWS Lambda][lambda] to call backend APIs.
-5. Lambda starts an [AWS Step Functions][stepfunction] workflow that uses [AWS CloudFormation][cloudformation] to start or stop/delete the ECR or S3 plugin template.
+5. Lambda starts an [AWS Step Functions][stepfunction] workflow that uses [AWS CloudFormation][cloudformation] to start or stop/delete the Amazon ECR or Amazon S3 plugin template.
6. The plugin templates are hosted in a centralized Amazon S3 bucket managed by AWS.
7. The solution also provisions an [Amazon ECS][ecs] cluster that runs the container images used by the plugin template, and the container images are hosted in [Amazon ECR][ecr].
8. The data transfer task information is stored in [Amazon DynamoDB][dynamodb].
@@ -20,37 +19,36 @@ After deploying the solution, you can use [AWS WAF][waf] to protect CloudFront o
!!! note "Important"
    If you deploy this solution in the AWS (Beijing) Region, operated by Beijing Sinnet Technology Co., Ltd. (Sinnet), or the AWS (Ningxia) Region, operated by Ningxia Western Cloud Data Technology Co., Ltd. (NWCD), you are required to provide a domain with ICP Recordal before you can access the web console.


The web console is a centralized place to create and manage all data transfer jobs. Each data type (for example, Amazon S3 or Amazon ECR) is a plugin for Data Transfer Hub, and is packaged as an AWS CloudFormation template hosted in an S3 bucket that AWS owns. When you create a transfer task, an AWS Lambda function launches the AWS CloudFormation template, and the state of each task is stored and displayed in the DynamoDB tables.

-As of March 2023, the solution supports two data transfer plugins: an Amazon S3 plugin and an Amazon ECR plugin.
+As of April 2023, the solution supports two data transfer plugins: an Amazon S3 plugin and an Amazon ECR plugin.

## Amazon S3 plugin

-![s3-architecture](./images/s3-arch-global.png)
-
-Figure 2: Data Transfer Hub Amazon S3 plugin architecture
+![s3-architecture](../images/s3-arch-global.png)
+*Data Transfer Hub Amazon S3 plugin architecture*

The Amazon S3 plugin runs the following workflows:

1. A time-based Amazon EventBridge rule triggers an AWS Lambda function on an hourly basis.
-2. AWS Lambda uses the launch template to launch a data comparison job (JobFinder) in an EC2.
+2. AWS Lambda uses the launch template to launch a data comparison job (JobFinder) in an [Amazon Elastic Compute Cloud (Amazon EC2)][ec2] instance.
3. The job lists all the objects in the source and destination buckets, compares them, and determines which objects should be transferred.
-3. EC2 sends a message for each object that will be transferred to Amazon Simple Queue Service (Amazon SQS). Amazon S3 event messages can also be supported for more real-time data transfer; whenever there is object uploaded to source bucket, the event message is sent to the same SQS queue.
-4. A JobWorker running in EC2 consumes the messages in SQS and transfers the object from the source bucket to the destination bucket. You can use an Auto Scaling Group to control the number of EC2 instances to transfer the data based on business need.
-5. A record with transfer status for each object is stored in Amazon DynamoDB.
-6. The Amazon EC2 instance will get (download) the object from the source bucket based on the SQS message.
-7. The EC2 instance will put (upload) the object to the destination bucket based on the SQS message.
+4. Amazon EC2 sends a message for each object that will be transferred to [Amazon Simple Queue Service (Amazon SQS)][sqs]. Amazon S3 event messages can also be supported for more real-time data transfer; whenever an object is uploaded to the source bucket, an event message is sent to the same Amazon SQS queue.
+5. A JobWorker running in Amazon EC2 consumes the messages in SQS and transfers the object from the source bucket to the destination bucket. You can use an Auto Scaling group to control the number of EC2 instances that transfer the data, based on business need.
+6. A record with transfer status for each object is stored in Amazon DynamoDB.
+7. The Amazon EC2 instance will get (download) the object from the source bucket based on the Amazon SQS message.
+8. The Amazon EC2 instance will put (upload) the object to the destination bucket based on the Amazon SQS message.


!!! note "Note"
    If an object (or part of an object) fails to transfer, the JobWorker releases the message in the queue, and the object is transferred again after the message is visible in the queue (default visibility timeout is set to 15 minutes). If the transfer fails again, the message is sent to the dead letter queue and a notification alarm is sent.
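
To make steps 5 to 8 concrete, the sketch below shows how such a JobWorker might poll the queue and move one object per message. It is a minimal illustration under assumed names (the queue URL, table name, and message shape are hypothetical), not the plugin's actual implementation.

```python
import json

import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")
table = boto3.resource("dynamodb").Table("TransferStatus")  # assumed table name

QUEUE_URL = "https://sqs.us-west-2.amazonaws.com/111122223333/transfer-queue"  # assumed

def run_worker() -> None:
    while True:
        # Long-poll the queue for transfer messages (step 5).
        resp = sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20
        )
        for msg in resp.get("Messages", []):
            task = json.loads(msg["Body"])  # assumed shape: srcBucket, dstBucket, key
            try:
                # Get the object from the source bucket (step 7) ...
                obj = s3.get_object(Bucket=task["srcBucket"], Key=task["key"])
                # ... and put it to the destination bucket (step 8).
                s3.put_object(
                    Bucket=task["dstBucket"], Key=task["key"], Body=obj["Body"].read()
                )
                # Record the transfer status in DynamoDB (step 6).
                table.put_item(Item={"ObjectKey": task["key"], "Status": "DONE"})
                sqs.delete_message(
                    QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"]
                )
            except Exception:
                # Do not delete the message on failure; it becomes visible
                # again after the visibility timeout and is retried, as
                # described in the note above.
                table.put_item(Item={"ObjectKey": task["key"], "Status": "RETRY"})
```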

## Amazon ECR plugin

-![ecr-architecture](./images/ecr-arch-global.png)
-
-Figure 3: Data Transfer Hub Amazon ECR plugin architecture
+![ecr-architecture](../images/ecr-arch-global.png)
+*Data Transfer Hub Amazon ECR plugin architecture*

The Amazon ECR plugin runs the following workflows:

@@ -62,14 +60,17 @@ The Amazon ECR plugin runs the following workflows:
6. After the copy completes, the status (either success or fail) is logged into DynamoDB for tracking purposes.


-[s3]:https://www.amazonaws.cn/s3/?nc1=h_ls
-[cloudfront]:https://www.amazonaws.cn/cloudfront/?nc1=h_ls
-[appsync]:https://www.amazonaws.cn/appsync/?nc1=h_ls
-[cognito]:https://www.amazonaws.cn/cognito/?nc1=h_ls
-[lambda]:https://www.amazonaws.cn/lambda/?nc1=h_ls
-[stepfunction]:https://www.amazonaws.cn/step-functions/?nc1=h_ls
-[cloudformation]:https://aws.amazon.com/cn/cloudformation/
-[ecs]:https://aws.amazon.com/cn/ecs/
-[ecr]:https://aws.amazon.com/cn/ecr/
-[dynamodb]:https://www.amazonaws.cn/dynamodb/?nc1=h_ls
-[waf]:https://aws.amazon.com/waf/
+
+[s3]:https://aws.amazon.com/s3/
+[cloudfront]:https://aws.amazon.com/cloudfront/
+[appsync]:https://aws.amazon.com/appsync/
+[cognito]:https://aws.amazon.com/cognito/
+[lambda]:https://aws.amazon.com/lambda/
+[stepfunction]:https://aws.amazon.com/step-functions/
+[cloudformation]:https://aws.amazon.com/cloudformation/
+[ecs]:https://aws.amazon.com/ecs/
+[ecr]:https://aws.amazon.com/ecr/
+[dynamodb]:https://aws.amazon.com/dynamodb/
+[waf]:https://aws.amazon.com/waf/
+[ec2]:https://aws.amazon.com/ec2/
+[sqs]:https://aws.amazon.com/sqs/
46 changes: 46 additions & 0 deletions docs/en-base/architecture-overview/design-considerations.md
@@ -0,0 +1,46 @@
This solution was designed with best practices from the [AWS Well-Architected Framework][well-architected-framework], which helps customers design and operate reliable, secure, efficient, and cost-effective workloads in the cloud.

This section describes how the design principles and best practices of the Well-Architected Framework were applied when building this solution.

## Operational excellence
This section describes how the principles and best practices of the [operational excellence pillar][operational-excellence-pillar] were applied when designing this solution.

The Data Transfer Hub solution pushes metrics to Amazon CloudWatch at various stages to provide observability into the infrastructure, Lambda functions, Amazon EC2 transfer workers, the Step Functions workflow, and the rest of the solution components. Data transfer errors are added to the Amazon SQS queue for retries and alerts.

## Security
This section describes how the principles and best practices of the [security pillar][security-pillar] were applied when designing this solution.

- Data Transfer Hub web console users are authenticated and authorized with Amazon Cognito.
- All inter-service communications use AWS IAM roles.
- All roles used by the solution follow least-privilege access. That is, they contain only the minimum permissions required so that the service can function properly.

## Reliability
This section describes how the principles and best practices of the [reliability pillar][reliability-pillar] were applied when designing this solution.

- Using AWS serverless services wherever possible (for example, Lambda, Step Functions, Amazon S3, and Amazon SQS) to ensure high availability and recovery from service failure.
- Data is stored in DynamoDB and Amazon S3, so it persists in multiple Availability Zones (AZs) by default.

## Performance efficiency
This section describes how the principles and best practices of the [performance efficiency pillar][performance-efficiency-pillar] were applied when designing this solution.

- The ability to launch this solution in any Region that supports the AWS services used in this solution, such as AWS Lambda, Amazon S3, Amazon SQS, Amazon DynamoDB, and Amazon EC2.
- Automatically testing and deploying this solution daily, and having solution architects and subject matter experts review the solution for areas to experiment and improve.

## Cost optimization
This section describes how the principles and best practices of the [cost optimization pillar][cost-optimization-pillar] were applied when designing this solution.

- Using an Auto Scaling group so that compute costs are related only to how much data is transferred.
- Using serverless services such as Amazon SQS and DynamoDB so that customers only get charged for what they use.

## Sustainability
This section describes how the principles and best practices of the [sustainability pillar][sustainability-pillar] were applied when designing this solution.

- The solution's serverless design (using Lambda, Amazon SQS and DynamoDB) and the use of managed services (such as Amazon EC2) are aimed at reducing carbon footprint compared to the footprint of continually operating on-premises servers.

[well-architected-framework]:https://aws.amazon.com/architecture/well-architected/?wa-lens-whitepapers.sort-by=item.additionalFields.sortDate&wa-lens-whitepapers.sort-order=desc&wa-guidance-whitepapers.sort-by=item.additionalFields.sortDate&wa-guidance-whitepapers.sort-order=desc
[operational-excellence-pillar]:https://docs.aws.amazon.com/wellarchitected/latest/operational-excellence-pillar/welcome.html
[security-pillar]:https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/welcome.html
[reliability-pillar]:https://docs.aws.amazon.com/wellarchitected/latest/reliability-pillar/welcome.html
[performance-efficiency-pillar]:https://docs.aws.amazon.com/wellarchitected/latest/performance-efficiency-pillar/welcome.html
[cost-optimization-pillar]:https://docs.aws.amazon.com/wellarchitected/latest/cost-optimization-pillar/welcome.html
[sustainability-pillar]:https://docs.aws.amazon.com/wellarchitected/latest/sustainability-pillar/sustainability-pillar.html
7 changes: 7 additions & 0 deletions docs/en-base/contributors.md
@@ -0,0 +1,7 @@
- Aiden Dai
- Eva Liu
- Kervin Hu
- Haiyun Chen
- Joe Shi
- Ashwini Rudra
- Jyoti Tyagi
13 changes: 13 additions & 0 deletions docs/en-base/deployment/deployment-overview.md
@@ -0,0 +1,13 @@
Use the following steps to deploy this solution on AWS. For detailed instructions, follow the links for each step.

Before you launch the solution, [review the cost](../../plan-deployment/cost), architecture, network security, and other considerations discussed in this guide. Follow the step-by-step instructions in this section to configure and deploy the solution into your account.


**Time to deploy**: Approximately 15 minutes

- Step 1. Launch the stack
- [(Option 1) Deploy the AWS CloudFormation template in AWS Regions](../deployment/#launch-cognito)
- [(Option 2) Deploy the AWS CloudFormation template in AWS China Regions](../deployment/#launch-openid)

- Step 2. [Launch the web console](../deployment/#launch-web-console)
- Step 3. [Create a transfer task](../deployment/#create-task)
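
Step 1 can also be scripted. The following sketch launches the solution's CloudFormation stack with boto3. The template URL, stack name, and parameter names here are placeholders; take the real values from the deployment instructions for your Region.

```python
import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

# Hypothetical template URL and parameters; use the launch link and
# parameters from the deployment page for your Region.
cfn.create_stack(
    StackName="DataTransferHub",
    TemplateURL="https://example-solutions-bucket.s3.amazonaws.com/data-transfer-hub/latest/DataTransferHub-cognito.template",
    Parameters=[
        {"ParameterKey": "AdminEmail", "ParameterValue": "admin@example.com"},
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],
)

# Deployment takes roughly 15 minutes; block until the stack is ready.
cfn.get_waiter("stack_create_complete").wait(StackName="DataTransferHub")
```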