
Commit 6a1817a
update post with permissions that work
NewGraphEnvironment committed May 25, 2024
1 parent 36ca0b1 commit 6a1817a
Showing 2 changed files with 60 additions and 18 deletions.
@@ -1,8 +1,8 @@
{
"hash": "9fbe25ef84f260edabf921f1987e3af8",
"hash": "7db62ebe8f950ed0dcfe48f8816b63be",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Setting aws bucket permissions with R\"\nauthor: \"al\"\ndate: \"2024-05-24\"\ndate-modified: \"2024-05-24\"\ncategories: [news, assets, aws, s3, r, paws]\nimage: \"image.jpg\"\nparams:\n repo_owner: \"NewGraphEnvironment\"\n repo_name: \"new_graphiti\"\nformat: \n html:\n code-fold: true\n theme: quartz\n---\n\n\nHere we will set up an s3 bucket with a policy that allows the public to read from the bucket, but not from a specific\ndirectory, and allows a particular `aws_account_id` to write to the bucket. Although we are stoked on the `s3fs` package\nfor working with s3 buckets, we will use the `paws` package more than perhaps necessary here - only to learn about how\nit all works. Seems like `s3fs` is the way to go for common moves but `paws` is the \"mom\" providing the structure and\nguidance to that package.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(paws)\nlibrary(here)\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\nhere() starts at /Users/airvine/Projects/repo/new_graphiti\n```\n\n\n:::\n\n```{.r .cell-code}\nlibrary(jsonlite)\nlibrary(stringr)\nlibrary(s3fs)\n```\n:::\n\n\nList our current buckets\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns3 <- paws::s3()\ns3$list_buckets()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$Buckets\n$Buckets[[1]]\n$Buckets[[1]]$Name\n[1] \"23cog\"\n\n$Buckets[[1]]$CreationDate\n[1] \"2023-03-17 00:07:12 GMT\"\n\n\n\n$Owner\n$Owner$DisplayName\n[1] \"al\"\n\n$Owner$ID\n[1] \"f5267b02e31758d1efea79b4eaef5d0423efb3e6a54ab869dc860bcc68ebae2d\"\n```\n\n\n:::\n:::\n\n\n# Create Bucket\n\nLet's create a bucket called the same name as this repository.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_bucket_name <- basename(here::here()) |> \n stringr::str_replace_all(\"_\", \"-\") \n\nbucket_path <- s3fs::s3_path(my_bucket_name)\n\ns3$create_bucket(Bucket = my_bucket_name,\n CreateBucketConfiguration = list(\n LocationConstraint = Sys.getenv(\"AWS_DEFAULT_REGION\")\n ))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$Location\n[1] \"http://new-graphiti.s3.amazonaws.com/\"\n```\n\n\n:::\n:::\n\n\n# Add the policy to the bucket.\n\n1. **Important** - First we need to allow \"new public policies\" to be added to the bucket.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns3$delete_public_access_block(\n Bucket = my_bucket_name\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nlist()\n```\n\n\n:::\n:::\n\n\n2. Write the policy for the bucket Here is a function to make a generic policy for an s3 bucket that allows public to \nread from the bucket, but not from a specific directory, and allows a particular `aws_account_id` to write to the bucket. \nPlus + it allows you to provide `Presigned URLs` so we can provide temporary access to private objects without having to \nchange the overall bucket or object permissions. 
\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\naws_policy_write <- function(bucket_name, bucket_dir_private, aws_account_id, write_json = FALSE, dir_output = \"policy\", file_name = \"policy.json\") {\n policy <- list(\n Statement = list(\n list(\n Effect = \"Allow\",\n Principal = \"*\",\n Action = \"s3:GetObject\",\n Resource = paste0(\"arn:aws:s3:::\", bucket_name, \"/*\")\n ),\n list(\n Effect = \"Deny\",\n Principal = \"*\",\n Action = \"s3:GetObject\",\n Resource = paste0(\"arn:aws:s3:::\", bucket_name, \"/\", bucket_dir_private, \"/*\"),\n Condition = list(\n StringNotEquals = list(\n \"aws:UserAgent\" = c(\"S3Console\", \"paws-storage\")\n )\n )\n ),\n list(\n Effect = \"Allow\",\n Principal = list(AWS = paste0(\"arn:aws:iam::\", aws_account_id, \":root\")),\n Action = c(\"s3:DeleteObject\", \"s3:PutObject\"),\n Resource = paste0(\"arn:aws:s3:::\", bucket_name, \"/*\")\n )\n )\n )\n \n json_policy <- jsonlite::toJSON(policy, pretty = TRUE, auto_unbox = TRUE)\n \n if (write_json) {\n dir.create(dir_output, showWarnings = FALSE)\n output_path <- file.path(dir_output, file_name)\n write(json_policy, file = output_path)\n message(\"Policy written to \", output_path)\n }else{\n return(json_policy)\n }\n}\n```\n:::\n\n\nNow we can write the policy to the bucket.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_policy <- aws_policy_write(bucket_name = my_bucket_name, \n bucket_dir_private = \"private\", \n aws_account_id = Sys.getenv(\"AWS_ACCOUNT_ID\"),\n write_json = FALSE\n )\n\ns3$put_bucket_policy(\n Bucket = my_bucket_name,\n Policy = my_policy,\n ExpectedBucketOwner = Sys.getenv(\"AWS_ACCOUNT_ID\")\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nlist()\n```\n\n\n:::\n:::\n\n\nCheck the policy was added correctly.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# this is cool\ns3$get_bucket_policy(my_bucket_name)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$Policy\n[1] \"{\\\"Version\\\":\\\"2008-10-17\\\",\\\"Statement\\\":[{\\\"Effect\\\":\\\"Allow\\\",\\\"Principal\\\":\\\"*\\\",\\\"Action\\\":\\\"s3:GetObject\\\",\\\"Resource\\\":\\\"arn:aws:s3:::new-graphiti/*\\\"},{\\\"Effect\\\":\\\"Deny\\\",\\\"Principal\\\":\\\"*\\\",\\\"Action\\\":\\\"s3:GetObject\\\",\\\"Resource\\\":\\\"arn:aws:s3:::new-graphiti/private/*\\\",\\\"Condition\\\":{\\\"StringNotEquals\\\":{\\\"aws:UserAgent\\\":[\\\"S3Console\\\",\\\"paws-storage\\\"]}}},{\\\"Effect\\\":\\\"Allow\\\",\\\"Principal\\\":{\\\"AWS\\\":\\\"arn:aws:iam::414155577829:root\\\"},\\\"Action\\\":[\\\"s3:DeleteObject\\\",\\\"s3:PutObject\\\"],\\\"Resource\\\":\\\"arn:aws:s3:::new-graphiti/*\\\"}]}\"\n```\n\n\n:::\n:::\n\n\n# Add some files to the bucket\n\nFirst we add a photo to the main bucket. Going to use `s3fs` for this since I haven't actually done just one file yet... 
We are using the `here` package to get the path to the image due to rendering complexities.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns3fs::s3_file_copy(\n path = paste0(here::here(), \"/posts/aws-storage-permissions/image.jpg\"),\n bucket_path\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"s3://new-graphiti/image.jpg\"\n```\n\n\n:::\n:::\n\n\nThen we add one to the private directory.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns3fs::s3_dir_create(\n path = paste0(bucket_path, \"/private\")\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"s3://new-graphiti/private\"\n```\n\n\n:::\n\n```{.r .cell-code}\ns3fs::s3_file_copy(\n path = paste0(here::here(), \"/posts/aws-storage-permissions/image.jpg\"),\n paste0(bucket_path, \"/private\")\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"s3://new-graphiti/private/image.jpg\"\n```\n\n\n:::\n:::\n\n\n# Access the bucket\n\nLet's see if we can add the images to this post.\n\nCreate the paths to the images.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# s3fs::s3_dir_info(bucket_path, recurse = TRUE)\nimage_path <- paste0(\"https://\", my_bucket_name, \".s3.amazonaws.com/image.jpg\")\nimage_path_private <- paste0(\"https://\", my_bucket_name, \".s3.amazonaws.com/private/image.jpg\")\n```\n:::\n\nAccess the public image.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nknitr::include_graphics(image_path)\n```\n\n::: {.cell-output-display}\n![](https://new-graphiti.s3.amazonaws.com/image.jpg)\n:::\n:::\n\n\nGood to go.\n\n\nAnd now access the private image.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nknitr::include_graphics(image_path_private)\n```\n\n::: {.cell-output-display}\n![](https://new-graphiti.s3.amazonaws.com/private/image.jpg)\n:::\n:::\n\n\n💣 Jackpot! We have the image in the bucket but can't access them from the post.\n\n \n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Provide temporary access to an object\n# Can't get this to work yet so will come back to it.\nknitr::include_graphics(\n s3fs::s3_file_url(\n paste0(bucket_path, \"/private\", \"/image.jpg\")\n )\n)\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\n# Delete the bucket\n# Burn down the bucket 🔥. If we try to use `s3$delete_bucket(Bucket = my_bucket_name)` we will get an error because the \n# bucket is not empty. `s3fs::s3_bucket_delete(bucket_path)` works fine though.\n\ns3fs::s3_bucket_delete(bucket_path)\n```\n:::\n",
"markdown": "---\ntitle: \"Setting aws bucket permissions with R\"\nauthor: \"al\"\ndate: \"2024-05-24\"\ndate-modified: \"2024-05-25\"\ncategories: [news, assets, aws, s3, r, paws]\nimage: \"image.jpg\"\nparams:\n repo_owner: \"NewGraphEnvironment\"\n repo_name: \"new_graphiti\"\nformat: \n html:\n code-fold: true\n---\n\n\nHere we will set up an s3 bucket with a policy that allows the public to read from the bucket, but not from a specific\ndirectory, and allows a particular `aws_account_id` to write to the bucket. Although we are stoked on the `s3fs` package\nfor working with s3 buckets, we will use the `paws` package more than perhaps necessary here - only to learn about how\nit all works. Seems like `s3fs` is the way to go for common moves but `paws` is the \"mom\" providing the structure and\nguidance to that package.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(paws)\nlibrary(here)\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\nhere() starts at /Users/airvine/Projects/repo/new_graphiti\n```\n\n\n:::\n\n```{.r .cell-code}\nlibrary(jsonlite)\nlibrary(stringr)\nlibrary(s3fs)\n```\n:::\n\n\nList our current buckets\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns3 <- paws::s3()\ns3$list_buckets()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$Buckets\n$Buckets[[1]]\n$Buckets[[1]]$Name\n[1] \"23cog\"\n\n$Buckets[[1]]$CreationDate\n[1] \"2023-03-17 00:07:12 GMT\"\n\n\n\n$Owner\n$Owner$DisplayName\n[1] \"al\"\n\n$Owner$ID\n[1] \"f5267b02e31758d1efea79b4eaef5d0423efb3e6a54ab869dc860bcc68ebae2d\"\n```\n\n\n:::\n:::\n\n\n# Create Bucket\n\nLet's create a bucket called the same name as this repository.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_bucket_name <- basename(here::here()) |> \n stringr::str_replace_all(\"_\", \"-\") \n\nbucket_path <- s3fs::s3_path(my_bucket_name)\n\ns3$create_bucket(Bucket = my_bucket_name,\n CreateBucketConfiguration = list(\n LocationConstraint = Sys.getenv(\"AWS_DEFAULT_REGION\")\n ))\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$Location\n[1] \"http://new-graphiti.s3.amazonaws.com/\"\n```\n\n\n:::\n:::\n\n\n# Add the policy to the bucket.\n\n1. **Important** - First we need to allow \"new public policies\" to be added to the bucket.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns3$delete_public_access_block(\n Bucket = my_bucket_name\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nlist()\n```\n\n\n:::\n:::\n\n\n2. Write the policy for the bucket Here is a function to make a generic policy for an s3 bucket that allows public to \nread from the bucket, but not from a specific directory, and allows a particular `aws_account_id` to write to the bucket. \nPlus + it allows you to provide `Presigned URLs` so we can provide temporary access to private objects without having to \nchange the overall bucket or object permissions. 
\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-walkthroughs-managing-access-example1.html\n#https://chatgpt.com/share/16106509-a34d-4f69-bf95-cd5eb2649707\naws_policy_write <- function(bucket_name, \n bucket_dir_private, \n aws_account_id, \n user_access_permission, \n write_json = FALSE, \n dir_output = \"policy\", \n file_name = \"policy.json\") {\n policy <- list(\n Version = \"2012-10-17\",\n Statement = list(\n list(\n Effect = \"Allow\",\n Principal = \"*\",\n Action = \"s3:GetObject\",\n Resource = paste0(\"arn:aws:s3:::\", bucket_name, \"/*\")\n ),\n list(\n Effect = \"Deny\",\n Principal = \"*\",\n Action = \"s3:GetObject\",\n Resource = paste0(\"arn:aws:s3:::\", bucket_name, \"/\", bucket_dir_private, \"/*\"),\n # IMPORTANT - Denies everyone from getting objects from the private directory except for user_access_permission\n Condition = list(\n StringNotEquals = list(\n \"aws:PrincipalArn\" = paste0(\"arn:aws:iam::\", aws_account_id, \":user/\", user_access_permission)\n )\n )\n ),\n list(\n Effect = \"Allow\",\n Principal = list(AWS = paste0(\"arn:aws:iam::\", aws_account_id, \":root\")),\n Action = c(\"s3:DeleteObject\", \"s3:PutObject\"),\n Resource = paste0(\"arn:aws:s3:::\", bucket_name, \"/*\")\n )\n # list(\n # Effect = \"Allow\",\n # Principal = list(AWS = paste0(\"arn:aws:iam::\", aws_account_id, \":user/\", user_access_permission)),\n # Action = c(\"s3:GetBucketLocation\", \"s3:ListBucket\"),\n # Resource = paste0(\"arn:aws:s3:::\", bucket_name)\n # ),\n # list(\n # Effect = \"Allow\",\n # Principal = list(AWS = paste0(\"arn:aws:iam::\", aws_account_id, \":user/\", user_access_permission)),\n # Action = \"s3:GetObject\",\n # Resource = paste0(\"arn:aws:s3:::\", bucket_name, \"/*\")\n # )\n )\n )\n \n json_policy <- jsonlite::toJSON(policy, pretty = TRUE, auto_unbox = TRUE)\n \n if (write_json) {\n dir.create(dir_output, showWarnings = FALSE)\n output_path <- file.path(dir_output, file_name)\n write(json_policy, file = output_path)\n message(\"Policy written to \", output_path)\n } else {\n return(json_policy)\n }\n}\n```\n:::\n\n\nNow we can write the policy to the bucket.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_policy <- aws_policy_write(bucket_name = my_bucket_name, \n bucket_dir_private = \"private\",\n aws_account_id = Sys.getenv(\"AWS_ACCOUNT_ID\"),\n user_access_permission = \"airvine\",\n write_json = FALSE\n )\n\ns3$put_bucket_policy(\n Bucket = my_bucket_name,\n Policy = my_policy,\n ExpectedBucketOwner = Sys.getenv(\"AWS_ACCOUNT_ID\")\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nlist()\n```\n\n\n:::\n:::\n\n\nCheck the policy was added correctly.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# this is cool\ns3$get_bucket_policy(my_bucket_name)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$Policy\n[1] 
\"{\\\"Version\\\":\\\"2012-10-17\\\",\\\"Statement\\\":[{\\\"Effect\\\":\\\"Allow\\\",\\\"Principal\\\":\\\"*\\\",\\\"Action\\\":\\\"s3:GetObject\\\",\\\"Resource\\\":\\\"arn:aws:s3:::new-graphiti/*\\\"},{\\\"Effect\\\":\\\"Deny\\\",\\\"Principal\\\":\\\"*\\\",\\\"Action\\\":\\\"s3:GetObject\\\",\\\"Resource\\\":\\\"arn:aws:s3:::new-graphiti/private/*\\\",\\\"Condition\\\":{\\\"StringNotEquals\\\":{\\\"aws:PrincipalArn\\\":\\\"arn:aws:iam::414155577829:user/airvine\\\"}}},{\\\"Effect\\\":\\\"Allow\\\",\\\"Principal\\\":{\\\"AWS\\\":\\\"arn:aws:iam::414155577829:root\\\"},\\\"Action\\\":[\\\"s3:DeleteObject\\\",\\\"s3:PutObject\\\"],\\\"Resource\\\":\\\"arn:aws:s3:::new-graphiti/*\\\"}]}\"\n```\n\n\n:::\n:::\n\n\n# Add some files to the bucket\n\nFirst we add a photo to the main bucket. Going to use `s3fs` for this since I haven't actually done just one file yet... We are using the `here` package to get the path to the image due to rendering complexities.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns3fs::s3_file_copy(\n path = paste0(here::here(), \"/posts/aws-storage-permissions/image.jpg\"),\n bucket_path\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"s3://new-graphiti/image.jpg\"\n```\n\n\n:::\n:::\n\n\nThen we add one to the private directory.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ns3fs::s3_dir_create(\n path = paste0(bucket_path, \"/private\")\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"s3://new-graphiti/private\"\n```\n\n\n:::\n\n```{.r .cell-code}\ns3fs::s3_file_copy(\n path = paste0(here::here(), \"/posts/aws-storage-permissions/image.jpg\"),\n paste0(bucket_path, \"/private\")\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"s3://new-graphiti/private/image.jpg\"\n```\n\n\n:::\n:::\n\n\n# Access the bucket\n\nLet's see if we can add the images to this post.\n\nCreate the paths to the images.\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# s3fs::s3_dir_info(bucket_path, recurse = TRUE)\nimage_path <- paste0(\"https://\", my_bucket_name, \".s3.amazonaws.com/image.jpg\")\nimage_path_private <- paste0(\"https://\", my_bucket_name, \".s3.amazonaws.com/private/image.jpg\")\n```\n:::\n\nAccess the public image.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nknitr::include_graphics(image_path)\n```\n\n::: {.cell-output-display}\n![](https://new-graphiti.s3.amazonaws.com/image.jpg)\n:::\n:::\n\n\nGood to go.\n\n\nAnd now access the private image.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nknitr::include_graphics(image_path_private)\n```\n\n::: {.cell-output-display}\n![](https://new-graphiti.s3.amazonaws.com/private/image.jpg)\n:::\n:::\n\n\n💣 Jackpot! We have the image in the \"private\" bucket so can't access them from the post without permission.\n\n\n# Provide temporary access to an object\nBecause we granted ourselves (airvine in this case) access to the private directory, we can create a `Presigned URL` to \nprovide temporary access to the private image. We will set the maximum of 7 days for the URL to be valid. 
That means that\non 2024-06-01 01:37 the URL will no longer work and the image below \nwill not appear.\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nknitr::include_graphics(\n s3fs::s3_file_url(\n s3_dir_ls(paste0(bucket_path, \"/private\")),\n 604800,\n \"get_object\"\n )\n)\n```\n\n::: {.cell-output-display}\n![](https://new-graphiti.s3.us-west-2.amazonaws.com/private/image.jpg?AWSAccessKeyId=AKIAWA3MYTHSXCSVLM5N&Expires=1717231041&Signature=cJN9CZgocOjkxHzVStCmbhIxFE4%3D)\n:::\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\n# this is the cmd line way\nurl_file_share <- s3_dir_ls(paste0(bucket_path, \"/private\"))\ncommand <- \"aws\"\nargs <- c('s3', 'presign', url_file_share, '--expires-in', '604800')\n\nworking_directory = here::here() #we could just remove from funciton to get the current wd but its nice to have so we leave\n\n# loaded this function from the other file. should put in functions file or package\nsys_call()\n```\n:::\n\n\nIn order to rerun our post we need to delete the bucket. When we do rerun - we use the `s3fs` package to do it\n\n\n::: {.cell}\n\n```{.r .cell-code}\n# Dont delete the bucket or the post wont render! ha\n# Burn down the bucket 🔥. If we try to use `s3$delete_bucket(Bucket = my_bucket_name)` we will get an error because the \n# bucket is not empty. `s3fs::s3_bucket_delete(bucket_path)` works fine though.\ns3fs::s3_bucket_delete(bucket_path)\n```\n:::\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
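The updated post generates the presigned URL with `s3fs::s3_file_url()`. For comparison, a sketch of the same move through `paws` directly, assuming a `paws` version that exposes `generate_presigned_url()` on the client and reusing the bucket and key from the post:

```r
library(paws)

s3 <- paws::s3()

# Presigned GET URL for the private object; 604800 seconds (7 days)
# is the S3 maximum, matching the expiry used in the post.
url <- s3$generate_presigned_url(
  client_method = "get_object",
  params = list(Bucket = "new-graphiti", Key = "private/image.jpg"),
  expires_in = 604800
)

url
```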

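The command-line cell in the updated post calls a `sys_call()` helper that is loaded from another file and not shown in this diff. A hypothetical stand-in, assuming it simply shells out with `system2()` from a given working directory - the real helper's signature may differ, and since the post calls it with no arguments it presumably picks up `command`, `args`, and `working_directory` from the calling environment:

```r
# Hypothetical stand-in for the sys_call() helper referenced in the post;
# the real helper lives in another file and may behave differently.
sys_call <- function(command, args, working_directory = getwd()) {
  old_wd <- setwd(working_directory)  # setwd() returns the previous wd
  on.exit(setwd(old_wd), add = TRUE)  # restore it when we're done
  system2(command, args = args, stdout = TRUE)
}

# e.g. sys_call(command, args, working_directory) with the objects built
# in the cell above
```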