Google Cloud Professional Certifications #217

Open
shon-button opened this issue Feb 13, 2023 · 15 comments

Comments

@shon-button
Contributor

shon-button commented Feb 13, 2023

Resource List:

Cloud Developer:

Training:

BALLARD:

GABRIEL:

JOSH:
Google Partner Skills boost
Awesome list

SHON:
Course: A Cloud Guru
Exams:
Udemy
Exam Topics

Cloud Academy (seems very good)
https://googlecloudcheatsheet.withgoogle.com/

Practice:

Google
ITExams

Exam:

Online Proctored Certification Testing
Webassessor Exam Registration

@grrlopes

grrlopes commented Feb 13, 2023

From my side, I am studying through this course:

udemy

Roadmap

localEnv-dev

@shon-button
Contributor Author

shon-button commented Feb 14, 2023

Professional Cloud Developer

A Professional Cloud Developer builds scalable and highly available applications using Google-recommended tools and best practices. This individual has experience with cloud-native applications, developer tools, managed services, and next-generation databases. A Professional Cloud Developer also has proficiency with at least one general-purpose programming language and instruments their code to produce metrics, logs, and traces.
Certification exam guide

GCP Computing Services

FaaS (functions as a service)
Cloud Functions in a minute
Cloud Functions best practice
cloudfunctions

PaaS (platform as a service)
App Engine in a minute
appengine

CaaS (containers as a service)
Cloud Run in a minute
CloudRun

Kubernetes Engine in a minute
GKE

IaaS (infrastructure as a service)
Compute Engine in a minute
GCE

Choosing compute options
GCP compute options decision tree
GCPComputeOptions

GCP DevOps

Infrastructure as Code

Infrastructure as code (IaC) is the process of managing and provisioning computer data centers through machine-readable definition files, often using the popular GitOps methodology. The key concept of GitOps is using a Git repository to store the environment state that you want. Terraform is a HashiCorp open source tool that enables you to predictably create, change, and improve your cloud infrastructure by using code. You can use Cloud Build, a Google Cloud continuous integration service, to automatically apply Terraform manifests to your environment.

Cloud Build

CloudBuild

Operations Suite for profiling and debugging in production

Cloud Operations Suite in a minute
CloudOps

GCP Design Patterns

Microservices

A microservices architecture enables you to break down a large application into smaller independent services, with each service having its own realm of responsibility.
MicroservicesArchitecture

Messaging Middlewares

Cloud Pub/Sub
Cloud Pub/Sub in a minute
pubsub

GCP Storage and Databases

Cloud Storage

GCS

Cloud SQL

cloud_sql_banner

Cloud Firestore

firestore

Cloud Bigtable

Bigtable

@shon-button
Contributor Author

shon-button commented Feb 16, 2023

GCP Development Setup

image

GCP dev tools:

  • GCP project: the organizing entity for what you're building
  • Google Cloud SDK: the gcloud command-line tool for interacting with GCP services
  • IDE: a development environment for your chosen programming language
    • Cloud Code extension for VS Code and other IDEs, which allows you to quickly build and deploy to Cloud Run with just a few clicks from your IDE
    • App Engine guide for getting started with your chosen language
  • Git: for source control management
  • Docker: for managing containers

@shon-button
Contributor Author

shon-button commented Feb 17, 2023

Google Course Series

Fundamentals of Application Development

Application Development Methodologies

Waterfall development method
waterfall-development

Agile development method
agile-development

DevOps Methodology
image

Application Development Best Practices

Best practices for developing cloud applications help you build secure, scalable, resilient applications with loosely coupled services that can be monitored and can fail gracefully on error.

image

Patterns for scalable and resilient apps
Scalability is the measure of a system’s ability to handle varying amounts of work by adding or removing resources from the system.
Resilience means designing to withstand failures. A resilient app is one that continues to function despite failures of system components.

Some key best practices concepts include:

  • Managing your application's code and environment
  • Dependency Management
  • Separate your application's configuration settings from your code
  • Implement automated testing
  • Implement build and release systems
  • Implement microservices-based architectures
  • Use event driven processing where possible
  • Design for loose coupling
  • Design each application so that it focuses on compute tasks only
  • Cache application data
  • Implement API gateways to make backend functionality available to consumer applications
  • Use Federated Identity Management for user management
  • Monitor the status of your application and services
  • Treat your logs as event streams
  • Implement retry logic with exponential back-off and fail gracefully if the errors persist
  • Identify failure scenarios and create disaster recovery plans
  • Consider data sovereignty and compliance requirements
  • Consider using the strangler pattern when re-architecting and migrating large applications

Managing your application's code and environment

image

Code Repository

Store your application's code in a code repository, such as Git or Subversion.
This will enable you to track changes to your source code and set up systems for continuous integration and delivery.

Dependency Management

Don’t store external dependencies such as JAR files or external packages in your code repository. Instead, depending on your application platform, explicitly declare your dependencies with their versions and install them using a dependency manager. For example, for a Node.js application, you can declare your application dependencies in a package.json file and later install them using the npm install command.

Separate your application's configuration settings from your code

Don’t store configuration settings as constants in your source code. Instead, specify configuration settings as environment variables. This enables you to easily modify settings between development, test, and production environments.
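
As a rough illustration of the point above, here is a minimal Python sketch that reads settings from environment variables instead of hard-coding them. The variable names (DATABASE_URL, LOG_LEVEL, REQUEST_TIMEOUT_S) and their defaults are made up for this example, not taken from the course.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    log_level: str
    request_timeout_s: float


def load_settings(env=os.environ) -> Settings:
    """Build settings from environment variables, falling back to dev defaults."""
    return Settings(
        # Hypothetical variable names; pick whatever your app needs.
        database_url=env.get("DATABASE_URL", "sqlite:///dev.db"),
        log_level=env.get("LOG_LEVEL", "INFO"),
        request_timeout_s=float(env.get("REQUEST_TIMEOUT_S", "5")),
    )
```

The same code then runs unchanged in development, test, and production; only the environment differs.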

Implement automated testing

Cloud Build

Integrated within source code
image

Implement build and release systems

Cloud Build

image

Build and release systems enable continuous integration and delivery. While it’s crucial that you have repeatable deployments, it’s also important that you can roll back to a previous version of the app in a few minutes if you catch a bug in production. Cloud Build is GCP's CI/CD service: it builds pipelines, constructs deployment artefacts, and has built-in testing and security. It's important to consider security throughout the continuous integration and delivery process. With a SecDevOps approach, you can automate security checks, such as confirming that you're using the most secure versions of third-party software and dependencies, scanning code for security vulnerabilities, confirming that resources have permissions based on the principle of least privilege, and detecting errors in production and rolling back to the last stable build.
By default, Cloud Build uses a special service account to execute builds on your behalf. This service account is called the Cloud Build service account and it is created automatically when you enable the Cloud Build API in a Google Cloud project. This service account has a number of permissions by default such as the ability to update builds or write logs.

Instead of using the default Cloud Build service account, you can specify your own service account to execute builds on your behalf. You can specify any number of service accounts per project. Maintaining multiple service accounts enables you to grant different permissions to these service accounts depending on the tasks they perform. For example, you can use one service account for building and pushing images to the Container Registry and a different service account for building and pushing images to Artifact Registry.

Implement microservices-based architectures

Microservices on GCP

image

Microservices enable you to structure your application components in relation to your business boundaries. In this example, the UI, payment, shipping, and order services are all broken up into individual microservices.
Because the code base for each service is modular, it’s easy to determine where code needs to be changed. Each service can be updated and deployed independently without requiring the consumers to change simultaneously.
Each service can be scaled independently depending on load.
Make sure to evaluate the costs and benefits of optimizing and converting a monolithic application into one that uses a microservices architecture.

Use event driven processing where possible

Cloud Functions

image

Remote operations can have unpredictable response times and can make your application seem slow. Keep the operations in the user thread to a minimum. Perform backend operations asynchronously. Use event-driven processing where possible.
For example, if your application processes images that are uploaded by a user, you can use a Cloud Storage bucket to store the uploaded images. You can then implement Cloud Functions that are triggered whenever a new image is uploaded. Cloud Functions can process the image and upload the results to a different Cloud Storage location.
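
A minimal sketch of what such an event handler could look like in Python. It assumes the storage event arrives as a dict with `bucket` and `name` fields and uses the `(event, context)` background-function signature; the output bucket name is hypothetical, and the actual image processing is left out.

```python
# Hypothetical destination bucket, for illustration only.
OUTPUT_BUCKET = "processed-images-example"

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg")


def on_image_uploaded(event, context=None):
    """Plan the processing of a newly uploaded object.

    `event` is assumed to carry the bucket and object name of the upload.
    Returns None for non-image uploads so they are simply ignored.
    """
    bucket = event["bucket"]
    name = event["name"]
    if not name.lower().endswith(IMAGE_SUFFIXES):
        return None
    # A real function would download the object, transform it, and upload
    # the result here; this sketch only computes source and destination.
    return {
        "source": f"gs://{bucket}/{name}",
        "destination": f"gs://{OUTPUT_BUCKET}/thumbnails/{name}",
    }
```

The upload request returns to the user immediately; the processing happens later, whenever the event fires.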

Design for loose coupling

Cloud Pub/Sub

image

Design application components so that they are loosely coupled at runtime. Tightly coupled components can make an application less resilient to failures, spikes in traffic, and changes to services. An intermediate component such as a message queue can be used to implement loose coupling, perform asynchronous processing, and buffer requests in case of spikes in traffic. You can use a Cloud Pub/Sub topic as a message queue: publishers publish messages to the topic, and subscribers subscribe to messages from it.
In the context of HTTP API payloads, consumers of HTTP APIs should bind loosely with the publishers of the API. In the example, the email service retrieves information about each customer from the customer service. The customer service returns the customer’s name, age, and email address in its payload. To send an email, the email service should reference only the name and email fields in the payload; it should not attempt to bind with all the fields. Loosely binding fields in this way enables the publisher to evolve the API and add fields to the payload in a backwards-compatible manner.
Implement application components so that they don’t store state internally or access a shared state.
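
To make the loose-binding idea concrete, here is a small Python sketch of the email service side. The payload field names (`name`, `email`) follow the example above; the function name and the shape of the returned message are assumptions for illustration.

```python
def build_welcome_email(customer_payload: dict) -> dict:
    """Build an email from a customer payload, reading only what is needed.

    The function touches `name` and `email` and ignores everything else,
    so the customer service can add fields without breaking this consumer.
    """
    name = customer_payload["name"]
    email = customer_payload["email"]
    return {
        "to": email,
        "subject": f"Welcome, {name}",
    }
```

If the customer service later adds, say, a `loyalty_tier` field, this consumer keeps working without any change.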

Design each application so that it focuses on compute tasks only

Cloud Functions

image

Designing each application so that it focuses on compute tasks only enables you to use a worker pattern to add or remove additional instances of the component for scalability. Application components should start up quickly to enable efficient scaling, and shut down gracefully when they receive a termination signal. For example, if your application needs to process streaming data from IoT devices, you can use a Cloud Pub/Sub topic to receive the data. You can then implement Cloud Functions that are triggered whenever a new piece of data comes in. Cloud Functions can process, transform, and store the data.
Alternatively, your application can subscribe to the Pub/Sub topic that receives the streaming data. Multiple instances of your application can spin up, process the messages in the topic, and split the workload. These instances can automatically be shut down when there are very few messages to process. To enable elastic scaling, you can use any compute environment, such as Compute Engine with Cloud Load Balancing, Google Kubernetes Engine, or App Engine. With any approach, you don’t have to develop code to manage concurrency or scaling. Your application scales automatically depending on the workload.

Cache application data

Cloud CDN
Cloud Storage

Caching content can improve application performance and lower network latency. Cache application data that is frequently accessed or that is computationally intensive to calculate each time. When a user requests data, the application component should check the cache first. If the data exists in the cache and its TTL has not expired, the application should return the previously cached data. If the data does not exist in the cache, or has expired, the application should retrieve the data from backend data sources and recompute results as needed.
The application should also update the cache with the new value. In addition to caching application data in a cache such as Memcached or Redis, you can also use a content delivery network to cache web pages. Cloud CDN can cache load-balanced frontend content that comes from Compute Engine VM instance groups, or static content that is served from Cloud Storage.
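
The check-the-cache-first flow described above can be sketched with a small in-process cache. This is only a stand-in for a real cache such as Memcached or Redis; the class and method names are made up for the example.

```python
import time


class TTLCache:
    """Tiny in-memory cache where every entry expires after a fixed TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._entries = {}   # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, TTL not expired
        value = compute()    # miss or expired: go to the backend
        self._entries[key] = (now + self._ttl, value)
        return value
```

A caller would wrap the expensive lookup, for example `cache.get_or_compute("report:2023-02", build_report)`, where `build_report` is whatever backend call is being cached.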

Implement API gateways to make backend functionality available to consumer applications

API Management

image

You can use Cloud Endpoints to develop, deploy, protect, and monitor APIs based on the OpenAPI Specification or gRPC. The API for your application can run on backends such as App Engine, GKE, or Compute Engine. If you have legacy applications that cannot be refactored and moved to the cloud, consider implementing APIs as a facade or adapter layer.
Each consumer can then invoke these modern APIs to retrieve information from the backend instead of implementing functionality to communicate using outdated protocols and disparate interfaces. Using the Apigee API platform, you can design, secure, analyze, and scale your APIs for legacy applications.

Use Federated Identity Management for user management

Delegate user authentication to external identity providers such as Google, Facebook, Twitter, or GitHub.
This will minimize your effort for user administration. With Federated Identity Management, you don't need to implement, secure, and scale a proprietary solution for authenticating users.

Monitor the status of your application and services

Cloud Operations

It's important to monitor the status of your application and services to ensure that they're always available and performing optimally. The monitoring data can be used to automatically alert operations teams as soon as the system begins to fail. Operations teams can then diagnose and address the issue promptly.
Implement a health check endpoint for each service. The endpoint handlers should check the health of all dependencies and infrastructure components required for the service to function properly. For example, the endpoint handler can check the performance and availability of storage, database, and network connections required by the service. The endpoint should return an HTTP response code of 200 for a successful health check. If the health check fails, the endpoint handler can return a 503.
Google Cloud offers configurable health checks for Google Cloud load balancer backends, Traffic Director backends, and application-based autohealing for managed instance groups.

Unless otherwise noted, Google Cloud health checks are implemented by dedicated software tasks that connect to backends according to parameters specified in a health check resource. Each connection attempt is called a probe. Google Cloud records the success or failure of each probe.
Based on a configurable number of sequential successful or failed probes, an overall health state is computed for each backend. Backends that respond successfully for the configured number of times are considered healthy. Backends that fail to respond successfully for a separately configurable number of times are unhealthy.
For any further detail: https://cloud.google.com/load-balancing/docs/health-check-concepts
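
A minimal sketch of a health check endpoint using only the Python standard library. The `/healthz` path and the two dependency probes are placeholders chosen for this example; a real service would plug in its own checks.

```python
import json
from http.server import BaseHTTPRequestHandler


def check_health(checks):
    """Run each probe and return (status_code, per-dependency results).

    `checks` maps a dependency name to a zero-argument callable that
    returns True when the dependency is healthy. A probe that raises
    counts as unhealthy.
    """
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, results


# Placeholder probes; replace with real database/storage checks.
DEPENDENCY_CHECKS = {"database": lambda: True, "storage": lambda: True}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":
            self.send_error(404)
            return
        status, results = check_health(DEPENDENCY_CHECKS)
        body = json.dumps(results).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it locally, something like `HTTPServer(("", 8080), HealthHandler).serve_forever()` (from `http.server`) would serve the endpoint on port 8080.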

Treat your logs as event streams

Logs constitute a continuous stream of events that keep occurring as long as the application is running. Don't manage log files in your application. Instead, write to an event stream such as standard output (stdout) and let the underlying infrastructure collate all events for later analysis and storage. With this approach, you can set up logs-based metrics and trace requests across different services in your application. With Google Cloud's operations suite, you can debug your application, set up error monitoring, set up logging and logs-based metrics, trace requests across services, and monitor applications running in a multi-cloud environment.
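
As a sketch of the "write to stdout" approach, the snippet below emits one JSON object per line. The `severity` and `message` field names are my assumption of what a managed logging agent picks up; the helper names are made up for this example.

```python
import json
import sys


def format_log_entry(severity, message, **fields):
    """Serialize one log event as a single JSON line."""
    entry = {"severity": severity, "message": message}
    entry.update(fields)
    return json.dumps(entry)


def log_event(severity, message, **fields):
    """Write the event to stdout; the platform handles collection and storage."""
    sys.stdout.write(format_log_entry(severity, message, **fields) + "\n")
    sys.stdout.flush()
```

For example, `log_event("ERROR", "payment failed", order_id="o-123")` writes a single line that downstream tooling can filter by field, with no log files managed by the application.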

Implement retry logic with exponential back-off and fail gracefully if the errors persist

When accessing services and resources in a distributed system, applications need to be resilient to temporary and long-lasting errors. Resources can sometimes become unavailable due to transient network errors. In this case, applications should implement retry logic with exponential back-off and fail gracefully if the errors persist.
The Google Cloud Client Libraries retry failed requests automatically. When errors are more long lasting, the application should not waste CPU cycles attempting to retry the request over and over again.
In this case, applications should implement a circuit breaker and handle the failure gracefully. For errors that are propagated back to the user, consider degrading the application gracefully instead of explicitly displaying the error message.
For example, if the recommendations engine is down, consider hiding the product recommendations area on the page instead of displaying error messages every time the page is displayed.
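
A minimal sketch of retry logic with exponential back-off and a graceful fallback, assuming a hypothetical `TransientError` exception that the wrapped operation raises for retryable failures. It is not a full circuit breaker; it only caps the number of attempts.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for errors that are worth retrying."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.5,
                      max_delay=30.0, sleep=time.sleep):
    """Call `operation`, retrying transient errors with exponential back-off."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # give up; let the caller fail gracefully
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))


def recommendations_or_empty(fetch):
    """Degrade gracefully: hide recommendations instead of showing an error."""
    try:
        return call_with_backoff(fetch, max_attempts=3)
    except TransientError:
        return []
```

The `sleep` parameter is injectable so the waiting can be replaced in tests.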

Identify failure scenarios and create disaster recovery plans

Identify people, processes and tools for disaster recovery. Initially, you can perform tabletop tests. These are tests in which teams discuss how they would respond in failure scenarios but don't perform any real actions. This type of test encourages teams to discuss what they would do in unexpected situations. Then, simulate failures in your test environment. After you understand the behaviour of your application under failure scenarios, address any problems and refine your disaster recovery plan. Then test the failure scenarios in your production environment.

Consider data sovereignty and compliance requirements

Some regions and industry segments have strict compliance requirements for data protection and consumer privacy.

Consider using the strangler pattern when re-architecting and migrating large applications

In the early phases of migration, you might replace smaller components of the legacy application with newer application components or services. You can incrementally replace more features of the original application with new services. A strangler facade can receive requests

Application Design Pattern:

The twelve-factor app methodology is an approach that helps programmers write modern apps in a declarative way, using clear contracts, deployed via the cloud.

12-Factor-app-FIN

Cloud Guru: Application Design Summary

Managed platforms, design and migration patterns

image

Different managed platforms available to us, their scaling velocity, and some of their trade-offs. Remember, we need scalability to handle varying demands without wasting money or compromising performance, and we also need resilience to protect us from the failure of any given component.
Some patterns for designing apps that take advantage of these platforms include the twelve-factor app design and microservices: loosely coupled components that we can scale and maintain individually, and that all communicate with each other via an API contract.
Common patterns for migrating applications to the cloud: lift and shift, move and improve, and rip and replace.

Cloud native software development

image

Agile principles for cloud native software development, which always uses source control management systems like Git.
Explored different types of tests that we can perform, including unit testing and integration testing, and how these should be automated as part of our build pipeline.
Identified that infrastructure can be built and deployed as code, in exactly the same way as an application.
Mentioned some observability tools for profiling and debugging.

Message buses to decouple microservices

image

Concept of message buses, Cloud Pub/Sub in particular, and how services like this can help us decouple our microservices into producers and consumers of data using topics and subscriptions.
Various patterns offered by the publish/subscribe model, including one-to-one, one-to-many, many-to-many, and many-to-one.
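
To illustrate how topics and subscriptions decouple producers from consumers, here is a toy in-memory model in Python. It is not the Pub/Sub client library; the class and method names are invented, and it only mimics the idea that every subscription gets its own copy of each message.

```python
from collections import defaultdict, deque


class InMemoryPubSub:
    """Toy message bus: each subscription on a topic has its own backlog."""

    def __init__(self):
        self._subscriptions = defaultdict(dict)  # topic -> {subscription: deque}

    def create_subscription(self, topic, subscription):
        self._subscriptions[topic][subscription] = deque()

    def publish(self, topic, message):
        # One-to-many: every subscription receives a copy of the message.
        for backlog in self._subscriptions[topic].values():
            backlog.append(message)

    def pull(self, topic, subscription):
        # Consumers sharing one subscription split its backlog between them.
        backlog = self._subscriptions[topic][subscription]
        return backlog.popleft() if backlog else None
```

The producer only knows the topic name; it has no idea how many consumers exist or what they do with the data.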

Deployment methodologies

image

Automated deployment methodologies with tools like Cloud Build, helping us to safely promote changes through our different environments and into production.
Deployment patterns like Blue/Green, and Canary deployments can help us to do this.

High-level best practices for securing the source of your compute and container images

image

Creating an auditable build pipeline for your changes, so you can be confident that what you put into production is really supposed to be there.

@shon-button
Contributor Author

shon-button commented Feb 17, 2023

GCP storage and database options

image

Non-structured data

Cloud Storage

Cloud Storage is the logical home for any unstructured data, such as binary blobs, videos and images, or other proprietary files.

Cloud Storage features

Cloud Storage provides object storage buckets where files are stored as objects inside these buckets.
Cloud Storage provides features to meet goals of scalability and resilience, such as optional geo-redundancy (where files in our buckets can be stored across multiple regions) and object versioning (where we can persist older versions of objects when they are updated).
Cloud Storage provides four choices of storage class for a bucket, based on how long you are going to store data and how frequently you are going to access it.

image

Standard class is the choice for most cases with one of the other options for backups or other long-term storage.

These storage classes will apply to a bucket and become the default storage class for objects inside that bucket. However, you can change the storage class of an individual object, using object lifecycle management.
You can create a lifecycle configuration based on conditions and actions and use these to create rules that will be applied to any object in a bucket where the configuration is run.
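
A sketch of what a lifecycle configuration could look like, built as a Python dict and serialized to JSON. The field names (`rule`, `action`, `condition`, `age`, `matchesStorageClass`) reflect my understanding of the lifecycle JSON format and should be checked against the documentation before use; the thresholds are arbitrary.

```python
import json


def lifecycle_rule(action_type, condition, storage_class=None):
    """Pair one action with the condition that triggers it."""
    action = {"type": action_type}
    if storage_class is not None:
        action["storageClass"] = storage_class
    return {"action": action, "condition": condition}


def build_lifecycle_config():
    """Example policy: move to Nearline after 30 days, delete after 365."""
    return {
        "rule": [
            lifecycle_rule(
                "SetStorageClass",
                {"age": 30, "matchesStorageClass": ["STANDARD"]},
                storage_class="NEARLINE",
            ),
            lifecycle_rule("Delete", {"age": 365}),
        ]
    }


def lifecycle_config_json():
    return json.dumps(build_lifecycle_config(), indent=2)
```

The resulting JSON could be saved to a file and applied to a bucket with whichever tool you use to manage lifecycle settings.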

image

Cloud storage also provides us with some useful data retention features.
A retention policy can be applied to a bucket which prevents deletion of objects until they've reached a minimum age.
For strict enforcement, you can also add a policy lock that stops the retention policy from being removed from a bucket.
You will only be allowed to remove the policy once every object in the bucket has met the minimum retention age. You can also configure object holds, which also prevent deletion of an object.
These can either be temporarily applied or applied automatically as part of the object creation event.
These features are especially useful for data that is subject to regulation or compliance.

image

Another great feature of cloud storage is signed URLs. Rather than make a storage object's URL public, you can configure a signed URL, which contains authentication information within the URL itself, allowing whoever has this URL the specific permission to read this object for a specific period of time only. Signed URLs can be generated programmatically or with the gsutil command line tool.

A Cloud Storage bucket can be configured to host a static website for a domain you own. Static web pages can contain client-side technologies such as HTML, CSS, and JavaScript. They cannot contain dynamic content such as server-side scripts like PHP. Because Cloud Storage doesn't support custom domains with HTTPS on its own, this tutorial uses Cloud Storage with HTTP(S) Load Balancing to serve content from a custom domain over HTTPS. For more ways to serve content from a custom domain over HTTPS, see troubleshooting for HTTPS serving. You can also use Cloud Storage to serve custom domain content over HTTP, which doesn't require a load balancer.

To ensure that Cloud Storage auto-scaling always provides the best performance, you should ramp up your request rate gradually for any bucket that hasn't had a high request rate in several days or that has a new range of object keys. If your request rate is less than 1000 write requests per second or 5000 read requests per second, then no ramp-up is needed. If your request rate is expected to go over these thresholds, you should start with a request rate below or near the thresholds and then double the request rate no faster than every 20 minutes.

If you run into any issues such as increased latency or error rates, pause your ramp-up or reduce the request rate temporarily in order to give Cloud Storage more time to scale your bucket. You should use exponential backoff to retry your requests when:

Receiving errors with 408 and 429 response codes.
Receiving errors with 5xx response codes.
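
The ramp-up guidance and the retryable response codes above can be turned into a small calculation. This Python sketch uses the numbers from the text (double no faster than every 20 minutes; retry on 408, 429, and 5xx); the function names are made up.

```python
def is_retryable(status_code: int) -> bool:
    """True for the response codes the text says to retry with backoff."""
    return status_code in (408, 429) or 500 <= status_code <= 599


def ramp_up_schedule(start_rate, target_rate, step_minutes=20):
    """Return (minute, requests_per_second) steps, doubling at each step."""
    schedule = []
    minute, rate = 0, start_rate
    while rate < target_rate:
        schedule.append((minute, rate))
        rate *= 2
        minute += step_minutes
    schedule.append((minute, target_rate))
    return schedule
```

For instance, going from 1000 to 8000 write requests per second takes three doublings, so the target rate is reached after 60 minutes.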

Lab
In this hands-on lab, we create and manage a Cloud Storage bucket and manipulate the objects it holds through both the Google Cloud console and the command line.

image

  1. Create a Bucket
    • In the Google Cloud console, go to the Cloud Storage Buckets page
    • Create a unique bucket name
    • Select the region closest to you (for example, us-east4 (Northern Virginia))
    • Make sure to set the storage class to Standard
  2. Upload Files to a Bucket
    To complete this objective, upload at least two files by both:
    • Dragging and dropping
    • Clicking "Upload files" and following the upload dialog
  3. Rename at least one of the uploaded files using the Context menu (the 3 vertical dots on the right-hand side of a file)
  4. Share an Object Publicly
    To make a file public:
    • Use the Context menu for a file and change it to Public
    • Once changed, check that the Bucket details page reports the file as "Public to internet."
  5. Interact with Cloud Storage Using the Cloud Shell Terminal and gsutil Commands
    • Open a terminal using the Activate Cloud Shell button (the square button with >_ inside it) located at the top of the page
    • Using gsutil, download a file from our bucket to the Cloud Shell terminal
    • Create a new bucket with gsutil
    • Upload our file to the second bucket using gsutil

Analytic Data

image

Cloud Bigtable is Google's petabyte-scale, wide-column NoSQL database designed for high throughput and scalability.
Bigtable is a NoSQL database designed for wide-column key-value data. Entries have a single key and multiple columns, and the whole system is designed for very high-volume writes, making it ideally suited to time-series, transactional, or Internet of Things (IoT) data.
Designing a Bigtable schema is different than designing a schema for a relational database. In Bigtable, a schema is a blueprint or model of a table, including the structure of the following table components:

  • Row keys
  • Column families, including their garbage collection policies
  • Columns
**Key Point:** A Bigtable schema is designed around the queries that the application needs to run.

BigQuery is Google's other petabyte-scale data platform. BigQuery is designed to be your big data analytics warehouse, storing incredible amounts of data but in a familiar relational way. This enables data analysts to query these enormous datasets with simple SQL statements. BigQuery can retain historical data for very little cost, the same as Cloud Storage itself, so you can use it for analytics that form the foundations of business intelligence systems or train machine learning models on its datasets. There are also multiple public datasets available, covering everything from baby names and taxi journeys to medical information and weather data.

Relational Structured Data

image

Cloud Spanner is Google's global SQL-based relational database. It's a proprietary product that provides horizontal scalability, high availability, and strong consistency. It's not cheap to run, but if your business needs all three of these things, such as in the financial sector, Spanner could be the answer.

Cloud SQL provides managed instances of MySQL, PostgreSQL, and Microsoft SQL Server and removes the requirement for you to provision and configure your own machines.
Cloud SQL is a fully managed service with built-in backups, replicas, and failover, providing high availability for your databases.
There's another important distinction between these two products with regard to scaling.
Cloud Spanner scales horizontally, adding more nodes that provide synchronized, consistent data.
Cloud SQL scales vertically: the database runs on a single host, and you can add more CPU and RAM to that host as demand for the database grows.

Choosing a primary key

Often your application already has a field that's a natural fit for use as the primary key. For example, for a Customers table, there might be an application-supplied CustomerId that serves well as the primary key. In other cases, you may need to generate a primary key when inserting the row. This would typically be a unique integer value with no business significance (a surrogate primary key).

In all cases, you should be careful not to create hotspots with the choice of your primary key. For example, if you insert records with a monotonically increasing integer as the key, you'll always insert at the end of your key space. This is undesirable because Spanner divides data among servers by key ranges, which means your inserts will be directed at a single server, creating a hotspot. There are techniques that can spread the load across multiple servers and avoid hotspots:
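
Two commonly cited techniques are using a random UUID as the key and bit-reversing a sequential value so that consecutive inserts land far apart in the key space. The Python sketch below only illustrates the key generation; the function names are made up, and the bit-reversed result is an unsigned integer that would still need mapping to a signed 64-bit column type.

```python
import uuid


def uuid_key() -> str:
    """Random (version 4) UUID: no ordering, so inserts spread across key ranges."""
    return str(uuid.uuid4())


def bit_reversed_key(sequence_value: int, bits: int = 64) -> int:
    """Reverse the bits of a sequential value.

    Consecutive inputs differ in their lowest bits; after reversal they
    differ in their highest bits, which scatters them across the key space.
    """
    reversed_value = 0
    for _ in range(bits):
        reversed_value = (reversed_value << 1) | (sequence_value & 1)
        sequence_value >>= 1
    return reversed_value
```

With 8 bits for readability, the sequence 1, 2, 3 becomes 128, 64, 192, so neighbouring inserts no longer hit the same end of the key range.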

NoSQL Data

image

Cloud Firestore is Google's fully managed NoSQL document database, designed for large collections of small JSON documents. Cloud Firestore offers features such as strong consistency and mobile SDKs that support offline data.
Cloud Memorystore offers out-of-the-box Redis and Memcached instances.
Memorystore is a fully managed service, so it saves you the bother of having to provision and configure your own machines. It also provides options for scaling and high availability.

Note:
Databases that offer strong consistency guarantee that when data is created or updated, the changes take effect immediately for anyone reading the data (as opposed to eventual consistency). This means that they have to make sure that the write completes on all of the nodes in the database as part of the same instruction, so that any reads that come in are guaranteed to see the updated data.

Connecting to managed databases

image

Connecting to a managed database works the same way as connecting to a non-managed one: via connection strings. In addition, Google provides client libraries for almost all of its services, and some have a direct HTTP API to interact with.
The Cloud SQL Auth proxy creates a secure connection to the Cloud SQL API from inside a VM or container, which you can then connect to on localhost.
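
As an illustration of connecting through the proxy on localhost, this Python sketch builds a connection URL. The `mysql+pymysql://` scheme is a SQLAlchemy-style convention that assumes the PyMySQL driver; the function name, port, and credentials are placeholders.

```python
from urllib.parse import quote_plus


def proxy_connection_url(user, password, database,
                         host="127.0.0.1", port=3306):
    """Build a connection URL that targets the local proxy listener.

    User and password are URL-encoded so special characters such as
    '@' or '/' cannot break the URL.
    """
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )
```

The application sees an ordinary local MySQL endpoint; the proxy takes care of the secure connection to the instance. In practice the password would come from an environment variable or a secret store, not from source code.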

LAB
Creating a Google Cloud SQL Instance and Loading Data

image

In this lab, we create a MySQL database and then securely connect to it with a service account using the [Cloud SQL Auth proxy](https://cloud.google.com/sql/docs/mysql/sql-proxy). We also upload some pre-generated data and run some simple queries. [GitHub repo](https://github.com/linuxacademy/content-google-certified-pro-cloud-developer)

Create a MySQL 2nd Generation Cloud SQL Instance

Our first task is to create our MySQL 2nd generation Cloud SQL instance:

  1. In your GCP console, under the Navigation menu (the three horizontal lines in the top left corner), scroll down to the Databases section and select SQL.
  2. Click CREATE INSTANCE, then select Choose MySQL.
  3. For the Instance ID, enter forumdb.
  4. Click Generate to create a root password, and copy and save the password for later use.
  5. Click Show configuration options.
  6. Change Machine Type to Lightweight 1 vCPU, 3.75 GB.
  7. Click Create to create the instance.

Note: It will take up to 10 minutes to create the database instance. You can complete the next objective while you wait.

Create the VM

Next, we need to create a virtual machine:

  1. From the Navigation menu, select Compute Engine then VM instances.
  2. Click Create.
  3. Change the name of the instance to mysql-client.
  4. Change Machine Type to e2-small.
  5. Make sure the Boot Disk is set to Debian GNU/Linux 10 (buster). If a different image is set, click Change and choose Debian GNU/Linux 10 (buster) from the list.
  6. Click Create.

Create the Service Account Used to Connect with Cloud SQL Securely

With our VM created, we now need to create the service account we'll use to connect with Cloud SQL securely:

  1. From the Navigation menu, select IAM & Admin then Service Accounts.
  2. Click CREATE SERVICE ACCOUNT.
  3. Under Service Account Name enter forumdb-access.
  4. Click Create.
  5. Under Grant this service account access to project, click inside the Select a role box, type SQL to search the roles, and select Cloud SQL Client.
  6. Click Continue, then click Done.
  7. In the row for our service account, click the 3 vertical dots in the Actions column.
  8. Select Manage Keys, then ADD KEY, then Create new key, and choose JSON.
  9. Click Create, and a JSON key will be downloaded to your computer.

Upload the Key to the VM

Now we'll upload the key to our VM and configure the mysql client and Cloud SQL Proxy:

  1. From the Navigation menu, select Compute Engine then VM instances.
  2. Click SSH next to our mysql-client VM, and follow the dialog to connect.
  3. From the cog menu in the top-right of the terminal window, select Upload file, then upload the JSON key file that we just downloaded.
  4. Using the cog menu, upload the forumdb.sql file we downloaded at the beginning of this lab.

Configure the MySQL Client and Cloud SQL Proxy

With our VM set, we can configure the MySQL client and Cloud SQL Proxy:

  1. Update packages on the VM with:
    sudo apt-get -y update
  2. Install the mysql-client with:
    sudo apt-get -y install default-mysql-client
  3. Download the Cloud SQL Proxy from Google:
    curl -o cloud_sql_proxy https://dl.google.com/cloudsql/cloud_sql_proxy.linux.amd64
  4. Make the file executable:
    chmod +x ./cloud_sql_proxy

Enable an API for the Cloud SQL Proxy

For this step, we must first enable an API for the Cloud SQL Proxy and grab the connection name for our DB:

  1. From the Navigation menu, select APIs & Services then Dashboard, and click + Enable APIs and Services.
  2. Search for Cloud SQL and select Cloud SQL Admin API when it appears, then click Enable.
  3. From the Navigation menu, select SQL.
  4. Select the forumdb instance.
  5. Under Connect to this instance locate the Connection name and click the Clipboard button to copy it.

Create a Secure Connection to the Database

Now we can run the Cloud SQL Auth proxy to create a secure connection to the database using the service account we made.

**Note** To use the Cloud SQL Auth proxy, you must meet the following requirements:

    • The Cloud SQL Admin API must be enabled.
    • You must provide the Cloud SQL Auth proxy with Google Cloud authentication credentials.
    • You must provide the Cloud SQL Auth proxy with a valid database user account and password.
    • The instance must either have a public IPv4 address, or be configured to use private IP.

  1. Back in your SSH terminal, run the following command. Make sure to replace <YOUR_CONNECTION_NAME> and <YOUR_KEY_FILE> with the information you've collected:
    ./cloud_sql_proxy -instances=<YOUR_CONNECTION_NAME>=tcp:3306 -credential_file=./<YOUR_KEY_FILE> &
    The command should run with the message: Ready for new connections
  2. Use the mysql client to connect to your database as root using the root password we saved from the first objective:
    mysql -u root -p --host 127.0.0.1
  3. Create a database called forum from within the mysql client:
    CREATE DATABASE forum;
  4. Type exit to return to the command line.
  5. Press the Up arrow key to retrieve the last command, and add a database name and SQL dump import to the end of the command:
    mysql -u root -p --host 127.0.0.1 forum < forumdb.sql
  6. Connect to the database again:
    mysql -u root -p --host 127.0.0.1
  7. Run some simple SQL commands to query the new database, such as the following:
    USE forum;

    SHOW TABLES;

    SELECT * FROM posts;
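The two commands used in the steps above differ only in the values you substitute, so it can help to see them assembled programmatically. This is a minimal Python sketch that only builds the argument lists; the helper names `proxy_command` and `mysql_command` and the sample connection name are assumptions for illustration, and the flags are the ones shown in this lab.

```python
import shlex
from typing import List, Optional


def proxy_command(connection_name: str, key_file: str, port: int = 3306) -> List[str]:
    """Build the Cloud SQL Proxy invocation used in this lab (hypothetical helper)."""
    return [
        "./cloud_sql_proxy",
        f"-instances={connection_name}=tcp:{port}",
        f"-credential_file={key_file}",
    ]


def mysql_command(user: str = "root", host: str = "127.0.0.1",
                  database: Optional[str] = None) -> List[str]:
    """Build the mysql client invocation that talks to the local proxy port."""
    cmd = ["mysql", "-u", user, "-p", "--host", host]
    if database:
        cmd.append(database)
    return cmd


if __name__ == "__main__":
    # "my-project:us-east1:forumdb" is a made-up connection name.
    print(shlex.join(proxy_command("my-project:us-east1:forumdb", "./key.json")))
    print(shlex.join(mysql_command(database="forum")))
```

The client always targets 127.0.0.1 because the proxy, not the database, is what listens on the VM.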

Lab
Creating Google Cloud Firestore Collections and Documents

In this lab, we create a Python Flask web application that adds, stores, and tracks the names and birth years of famous computer scientists. We store that data in a NoSQL document store so that we can interact with the database records as JSON documents. To make sure that the database is consistent and globally available, we create a Cloud Firestore database and connect it to our app, which runs in the Cloud Run serverless platform. [GitHub repo](https://github.com/linuxacademy/content-google-certified-pro-cloud-developer)

Activate Cloud Shell and the Required APIs

First, we need to set up our Cloud Shell:

  1. On the GCP dashboard, click the Activate Cloud Shell button located at the top right of the page (it is a square button with >_ inside it) and click Continue.
  2. Our project should already be active in the terminal (highlighted in yellow), but if it isn't, use [gcloud](https://cloud.google.com/sdk/gcloud) and enter the following code using our project ID, which is located next to the Google Cloud Platform title at the top of the page:
    gcloud config set project <YOUR_PROJECT_ID>
  3. Click Authorize.
  4. In the Cloud Shell terminal, activate the three required APIs for this lab:
    gcloud services enable run.googleapis.com cloudbuild.googleapis.com containerregistry.googleapis.com
  5. Configure these defaults for Cloud Run:
    gcloud config set run/region us-east1
    gcloud config set run/platform managed

Create a Firestore Collection and Documents

With our Cloud Shell set, we can move on to working with Firestore:

  1. From the GCP Navigation menu, under the DATABASES section, select Firestore.
  2. Choose SELECT NATIVE MODE. Note: To access all of the new Firestore features, you must use [Firestore in Native mode](https://cloud.google.com/datastore/docs/firestore-or-datastore).
  3. From the Select a location dropdown, choose the us-east1 region and click CREATE DATABASE.
  4. When you can see our empty database, click Start Collection.
  5. Enter compscientists as the Collection ID.
  6. We'll create the first document at the same time with the following under Add its first document:
    • Field name: first
    • Field type: string
    • Field value: Ada
  7. Click the + ADD FIELD button to add another line and enter the following:
    • Field name: last
    • Field type: string
    • Field value: Lovelace
  8. Click the + ADD FIELD button to add one more line and enter the following:
    • Field name: birthyear
    • Field type: number
    • Field value: 1815
  9. Click Save. We can now browse our new collection and see the document we added.
  10. Add another document by clicking + ADD DOCUMENT.
  11. Create a document for Alan Turing, born in 1912:
    • Field name: first
    • Field type: string
    • Field value: Alan
  12. Click the + ADD FIELD icon to add another line and enter the following:
    • Field name: last
    • Field type: string
    • Field value: Turing
  13. Click the + ADD FIELD icon to add one more line and enter the following:
    • Field name: birthyear
    • Field type: number
    • Field value: 1912
  14. Click Save.

Deploy and Test the Flask Application

In the following instructions, substitute <YOUR_PROJECT_ID> with your GCP project ID, which can be found next to the Google Cloud Platform title on the GCP dashboard. We will use it to deploy our application:

  1. In the Cloud Shell terminal, clone the GitHub repo that contains the flask-firestore application code:
    git clone https://github.com/linuxacademy/content-google-certified-pro-cloud-developer
  2. Change to the flask-firestore directory.
    cd content-google-certified-pro-cloud-developer/flask-firestore
  3. Build the demo app:
    gcloud builds submit --tag gcr.io/<YOUR_PROJECT_ID>/flask-firestore
  4. Deploy the app to Cloud Run:
    gcloud run deploy flask-firestore --image gcr.io/<YOUR_PROJECT_ID>/flask-firestore --allow-unauthenticated
  5. Copy the Service URL that is created, and paste it into a new browser window.
  6. In our app, add a new computer scientist (perhaps yourself!) to the collection using the form. Make sure you use proper names and a real year, as there is no exception handling in this web app.
  7. Click Submit, and our new addition appears on the web app.
  8. Go back to the Cloud Firestore page in the GCP dashboard. Our new document appears there as well.
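Because the lab app has no exception handling, it is worth seeing what input validation for these documents could look like. The sketch below is an assumption-laden illustration in plain Python, not code from the lab repository: `build_scientist_doc` is a hypothetical helper that turns form strings into the same three-field document shape created in the console steps above.

```python
import json
from datetime import date


def build_scientist_doc(first: str, last: str, birthyear: str) -> dict:
    """Validate form input and return a document with first, last, birthyear."""
    first, last = first.strip(), last.strip()
    if not first or not last:
        raise ValueError("first and last names are required")
    try:
        year = int(birthyear)
    except ValueError:
        raise ValueError(f"birthyear must be a whole number, got {birthyear!r}") from None
    if not 0 < year <= date.today().year:
        raise ValueError(f"birthyear {year} is out of range")
    # birthyear is stored as a number, matching the field type chosen in the console.
    return {"first": first, "last": last, "birthyear": year}


if __name__ == "__main__":
    print(json.dumps(build_scientist_doc("Ada", "Lovelace", "1815")))
```

A document built this way could then be handed to whichever Firestore client call the application uses to add it to the compscientists collection.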

Lab
Basics of Google Cloud Bigtable

In this lab, we will create a Cloud Bigtable instance, create and write data to our table, and then query that data with the HBase shell.

Create a Cloud Bigtable Instance

Our first step is to create a Bigtable instance:

  1. Open the menu by clicking on the icon with the three horizontal lines in the top left corner.
  2. Under Databases, select Bigtable.
  3. Click + Create Instance.
  4. In Instance name, enter vehicle-locations. This will populate the Instance ID field for you.
  5. Click Continue.
  6. Leave HDD selected and click Continue.
  7. For Region, choose the region that is geographically closest to you.
  8. Set the Zone as Any.
  9. Make sure the number of nodes is 1, and click Create.

Connect to Bigtable with HBase

  1. From the top of the GCP dashboard, click the Activate Cloud Shell button (the square button with >_ inside).

  2. Update the Cloud Shell:

    sudo apt-get -y update
  3. Install the Java 8 runtime for HBase:

    sudo apt-get -y install openjdk-8-jdk-headless
  4. Configure Java:

    sudo update-alternatives --config java
  5. Choose the /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java option.

  6. Export Java:

    export JAVA_HOME=$(update-alternatives --list java | tail -1 | sed -E 's/\/bin\/java//')
  7. Clone the Bigtable examples repo:

    git clone https://github.com/GoogleCloudPlatform/cloud-bigtable-examples.git
  8. Change to the cloud-bigtable-examples/quickstart directory:

    cd cloud-bigtable-examples/quickstart
  9. Run the quickstart.sh script to set up a connection:

    ./quickstart.sh
  10. When prompted to authorize the Cloud Shell, click Authorize. Don't worry about any errors or warnings; the script should pick up our project's Bigtable instance and drop us into a connected HBase shell.

Create and Write Data to a Table with HBase

  1. Enter the following command to create a vehicles table with loc and det column families:

    create 'vehicles', 'loc', 'det'
  2. Check that our table was created successfully:

    describe 'vehicles'
  3. Enter details for the first vehicle:

    put 'vehicles', 'M117-223', 'loc:lat', '40.781212'
    put 'vehicles', 'M117-223', 'loc:long', '-73.961942'
    put 'vehicles', 'M117-223', 'det:company', 'NYMT'
    put 'vehicles', 'M117-223', 'det:route', '86'
  4. Enter details for the second vehicle:

    put 'vehicles', 'M117-391', 'loc:lat', '40.780664'
    put 'vehicles', 'M117-391', 'loc:long', '-73.958357'
    put 'vehicles', 'M117-391', 'det:company', 'NYMT'
    put 'vehicles', 'M117-391', 'det:route', '88'

Query the Table's Data with HBase

  1. Use get to pull our first row using the row key (the value starting with M):

    get 'vehicles', 'M117-223'
  2. Scan the entire table:

    scan 'vehicles'
  3. Find all vehicles assigned to route 88 using a ValueFilter with a regexstring:

    scan 'vehicles', { COLUMNS => 'det:route', FILTER => "ValueFilter( =, 'regexstring:88')" }

    Note: This is an example of an extremely inefficient way to search the database, as the entire table needs to be read to find matching rows.

  4. Search for our second vehicle using the ROWPREFIXFILTER filter:

    scan 'vehicles', {ROWPREFIXFILTER => 'M117-3'}

    This returns all rows where the row key starts with M117-3. A better-designed row key system could provide more efficient database searching.

  5. Disconnect from the HBase shell:

    exit
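To see why the value filter is expensive while the row-prefix scan is cheap, it helps to model the table as rows kept sorted by row key. The following Python sketch is a toy in-memory model, not Bigtable or HBase client code; the function names are assumptions, and the value match is simplified to equality instead of a regular expression.

```python
from bisect import bisect_left

# The two rows written in this lab, keyed by row key.
ROWS = {
    "M117-223": {"loc:lat": "40.781212", "loc:long": "-73.961942",
                 "det:company": "NYMT", "det:route": "86"},
    "M117-391": {"loc:lat": "40.780664", "loc:long": "-73.958357",
                 "det:company": "NYMT", "det:route": "88"},
}
SORTED_KEYS = sorted(ROWS)  # rows are stored in row-key order


def scan_by_value(column: str, value: str):
    """Value filter: every row must be read and tested."""
    matches, examined = [], 0
    for key in SORTED_KEYS:
        examined += 1
        if ROWS[key].get(column) == value:
            matches.append(key)
    return matches, examined


def scan_by_prefix(prefix: str):
    """Prefix scan: jump to the first candidate key, stop at the first mismatch."""
    matches, examined = [], 0
    for key in SORTED_KEYS[bisect_left(SORTED_KEYS, prefix):]:
        examined += 1
        if not key.startswith(prefix):
            break
        matches.append(key)
    return matches, examined


if __name__ == "__main__":
    print(scan_by_value("det:route", "88"))
    print(scan_by_prefix("M117-3"))
```

With only two rows the difference is one row read, but the value scan grows with the table size while the prefix scan grows only with the number of matching rows.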

Contributor Author

shon-button commented Feb 17, 2023

Developing applications with GCE, or Google Compute Engine

When comparing GCP managed services, we talked about Compute Engine, (half-managed) Kubernetes Engine, Cloud Run, and Cloud Functions. Where does Compute Engine stand out?

With Compute Engine we're deploying virtual machine instances. Each VM is a complete virtual server that requires an operating system and software.
With Kubernetes Engine and Cloud Run, you've packaged your application into a Docker container, and that's the artifact you deploy. Although with Kubernetes Engine, there is still VM infrastructure under the hood.
With Cloud Functions, you just deploy a snippet of code that performs a single action and exits.
In terms of access, the major difference here is that, with Cloud Run and Cloud Functions, you basically get a built-in frontend. Google's cloud load balancer is set up for you whenever you deploy a service to one of these serverless platforms.
With Compute Engine, you can absolutely use Google's load balancer, but there is some configuration heavy lifting you need to do yourself to set this up.

Compute Engine

So why would you choose Compute Engine over other platforms?
If you need to run your workload on an actual virtual server, Compute Engine is your best-in-class option.
Compute Engine is also going to be a launching point for a lot of legacy applications being moved to the cloud, using the lift and shift pattern.

Compute Engine is simply managed virtual machines, usually running a version of Linux or sometimes Windows.
The operating system comes from the instance boot disk, which you choose at the time of creation.
Other characteristics of an instance include its machine type, which determines how many virtual CPU cores and how much RAM it has assigned, and its service account, which is the identity it runs as inside your project.
Compute Engine offers predefined machine types that you can use when you create a VM instance. A predefined machine type has a preset number of vCPUs and amount of memory, and is charged at a set price.
If predefined VMs don't meet your needs, you can create a VM instance with custom virtualized hardware settings. Specifically, you can create a VM instance with a custom number of vCPUs and amount of memory, effectively using a custom machine type. Custom machine types are available in the general-purpose machine family. When you create a custom VM, you're deploying a custom VM from the E2, N2, N2D, or N1 machine family.
Every project has a default Compute Engine service account, which has the Project Editor IAM role giving broad permissions across all of the resources in your project. So it's a good idea to create specific service accounts for apps deployed to Compute Engine to limit their permissions.
Note: Compute instances are zonal. They exist in a single zone within a single region. This is something you have to bear in mind when designing apps to be resilient.

Compute Engine also adds plenty of configuration and automation options to make your VMs first-class cloud-native citizens alongside all of those other services, such as custom disk images, firewall rules, network tags, the metadata server, and startup scripts to bootstrap an application.

Using Bootstrap Scripts in Google Compute Engine

In this lab, we automate the basic setup of a new server in Google Compute Engine, first with an inline startup script and then with a startup script stored in Cloud Storage that queries the Compute Engine metadata server.

Deploy Apache with a Startup Script

  1. In the GCP console, navigate to Compute Engine, VM instances, and click Create.
  2. Change the Machine type to e2-small.
  3. Under Firewall, check the box for Allow HTTP traffic.
  4. Expand the Management, security, disks, networking, sole tenancy section.
  5. Under Startup script, paste in the following script:
    #! /bin/bash
    apt update
    apt -y install apache2
    cat <<EOF > /var/www/html/index.html
    <html><body><h1>Hello Cloud Gurus</h1>
    <p>This page was created from a startup script.</p>
    </body></html>
    EOF
  6. Click Create to create the instance.
  7. Wait a minute or so, then click the link under External IP to view the web page created by your startup script.

Deploy a Startup Script from a Cloud Storage Bucket

  1. In the GCP console, navigate to Storage, Browser, and click Create Bucket.
  2. Create a unique bucket name to store a startup script. For example, use the last 8 digits of the project name to create a name like 836bd4db-scripts.
  3. Create a file on our local computer called startup.sh and paste in the following code:
    #! /bin/bash
    apt update
    apt -y install apache2
    ZONE=`curl -fs http://metadata/computeMetadata/v1/instance/zone -H "Metadata-Flavor: Google" | cut -d'/' -f4`
    cat <<EOF > /var/www/html/index.html
    <html><body><h1>Hello Cloud Gurus</h1>
    <p>This server is serving from ${ZONE}.</p>
    </body></html>
    EOF
  4. Upload the startup.sh file to your storage bucket.
  5. Select the file and copy its URI (it should start with gs://).
  6. Navigate to Compute Engine, VM instances, and click Create instance.
  7. Change the Machine type to e2-small.
  8. Under Firewall, check the box for Allow HTTP traffic.
  9. Expand the Management, security, disks, networking, sole tenancy section.
  10. In the Metadata section, add the Key startup-script-url and paste in the URI of your startup script as the Value.
  11. Click Create to create the instance.
  12. Wait a minute or so, then click the link under External IP for this instance to view the web page created by your startup script.
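The `cut -d'/' -f4` in the startup script works because the metadata server returns the zone as a path of the form projects/PROJECT_NUMBER/zones/ZONE. Here is a minimal Python sketch of the same lookup and parsing, assuming it runs on a Compute Engine VM; the function names and the sample project number are made up for illustration.

```python
import urllib.request

METADATA_URL = "http://metadata/computeMetadata/v1/instance/zone"


def fetch_zone_path(timeout: float = 2.0) -> str:
    """Query the metadata server; this only resolves from inside a Compute Engine VM."""
    request = urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def zone_from_path(zone_path: str) -> str:
    """Same idea as cut -d'/' -f4: keep the fourth '/'-separated field."""
    return zone_path.strip().split("/")[3]


def region_from_zone(zone: str) -> str:
    """Drop the trailing zone letter, e.g. us-east1-b becomes us-east1."""
    return zone.rsplit("-", 1)[0]


if __name__ == "__main__":
    sample = "projects/123456789/zones/us-east1-b"  # made-up project number
    zone = zone_from_path(sample)
    print(zone, region_from_zone(zone))
```

Unlike `cut`, the Python version raises an IndexError on an unexpected response instead of silently returning an empty string, which makes a misconfigured request easier to notice.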

Bootstrap an application in GCE through a custom image as a boot disk

You can prepare an image in advance based on a standard boot disk that also includes your application and any necessary startup scripts to get it working. Another useful tool you can use with custom disk images is the GCE metadata server. We can call the metadata server from within your startup scripts to configure your application in different ways, based on metadata about your instance. For example, you could set environment-specific variables or make decisions based on an instance's location. Let's take a quick look at this in action.
Connect to the instance with SSH and install any required packages. Once the instance is configured the way you want, you can make an image of it:

  1. Stop the instance.
  2. Go to the Images section.
  3. Click Create Image and choose the disk of the instance configured above as the source.

We now have a reusable Compute Engine image, which we can use in new instances by selecting it as the boot disk.

Private Google Access

VM instances that only have internal IP addresses (no external IP addresses) can use Private Google Access. They can reach the external IP addresses of Google APIs and services. The source IP address of the packet can be the primary internal IP address of the network interface or an address in an alias IP range that is assigned to the interface. If you disable Private Google Access, the VM instances can no longer reach Google APIs and services; they can only send traffic within the VPC network.

Managing Google Compute Engine Images and Instance Groups

A regional managed instance group automates the creation of the instances for you and distributes them across multiple zones in a region for high availability.

Managing groups of identical virtual machines can provide extra reliability and resilience in your infrastructure, as each individual machine comes as a disposable component—easily replaced from a template that you have previously defined. In this lab, we'll set up a "golden image" for our desired Compute Engine instance and use it to create an instance template. Then we'll deploy a group of managed instances based on this template that are distributed across a region for high availability.

  1. On the lab page, right-click Open Google Console and select the option to open it in a new private browser window. This option will read differently depending on your browser (e.g., in Chrome, it says Open Link in Incognito Window).
  2. Sign in to Google Cloud Platform using the credentials provided on the lab page.
  3. On the Welcome to your new account screen, review the text, and click Accept.
  4. In the Welcome Cloud Student! pop-up that appears once you're signed in, check to agree to the terms of service, choose your country of residence, and click Agree and Continue.

Create a Golden Image for a Web Server

  1. Click on the hamburger menu icon (the icon with three horizontal lines) in the top left corner of the console.

  2. In the left-hand navigation menu, select Compute Engine.

  3. From the dropdown menu, select VM instances.

  4. Under Machine type, select e2-small from the dropdown menu.

  5. Under Firewall, check the box next to Allow HTTP traffic.

  6. Click Create.

  7. Once the instance is up and running, click SSH under Connect.

  8. Update the system packages:

    sudo apt update
  9. Install the Apache web package:

    sudo apt -y install apache2
  10. Return to the Google Cloud console.

  11. Click the checkbox next to instance1.

  12. Click the square-shaped Stop icon to stop the instance.

  13. In the left-hand Compute Engine navigation menu, select Images.

  14. Click Create image.

  15. On the Create an image page, set the following parameters:

    • Name: Enter apache-gold.
    • Source disk: Select the instance1 instance.
    • Location: Select Regional.
  16. Click Create. This may take a few minutes.

Create an Instance Template

  1. In the left-hand Compute Engine navigation menu, select Instance Templates.

  2. Click Create instance template.

  3. On the Create an instance template page, set the following parameters:

    • Name: Enter apache-template.
    • Machine type: Select e2-small.
    • Firewall: Check the box next to Allow HTTP traffic.
  4. Under Boot disk, click Change.

  5. Select the Custom images tab.

  6. Under Image, select our apache-gold image.

  7. Click Select.

  8. Expand the Advanced options. Under Management > Automation, paste in the following startup script:

    #! /bin/bash
    ZONE=`curl -fs http://metadata/computeMetadata/v1/instance/zone -H "Metadata-Flavor: Google" | cut -d'/' -f4`
    cat > /var/www/html/index.html <<EOF
    <html><body><h1>Hello Cloud Gurus</h1><p>This server is serving from ${ZONE}.</p></body></html>
    EOF

  9. Click Create.

Create a Regional Managed Instance Group

  1. In the left-hand Compute Engine navigation menu, select Instance groups.

  2. Click Create instance group.

  3. On the Create an instance group page, set the following parameters:

    • Name: Enter apache.
    • Location: Select Multiple zones.
    • Instance template: Select apache-template.
    • Minimum number of instances: Enter 3.
    • Maximum number of instances: Enter 5.
  4. Click Create. It may take a few minutes for the instance group to set up.

  5. Once the instance group is set up, click apache for more information.

  6. Click any of the links under External IP to view the web page served by the instance.

Load Balancer

The final piece of the puzzle for GCE deployment is Google's load balancer service.

Global Load Balancing with Google Compute Engine

Once you have successfully set up a group of managed instances, you have moved from the "pets" model to the "cattle" model, looking after a template rather than an individual machine so instances themselves can come and go as necessary. The final piece of the puzzle is how you direct users and traffic to your instances — and only to the healthy ones. In this lab, we will set up a managed instance group and then use a Google Cloud load balancer to manage incoming requests from the outside world.

  1. On the lab page, right-click Open Google Console and select the option to open it in a new private browser window. This option will read differently depending on your browser (e.g., in Chrome, it says "Open Link in Incognito Window").
  2. Sign in to Google Cloud Platform using the credentials provided on the lab page.
  3. On the Welcome to your new account screen, review the text, and click Accept.
  4. In the "Welcome Cloud Student!" pop-up that appears once you're signed in, check to agree to the terms of service, choose your country of residence, and click Agree and Continue.

Set up a Firewall Rule

  1. From the left menu, scroll down to Networking and select VPC network > Firewall.
  2. Click Create Firewall Rule.
  3. Set the following values:
    • Name: allow-http
    • Target tags: http-server
    • Source IP ranges: 0.0.0.0/0
  4. Scroll down to Protocols and ports.
  5. In Specified protocols and ports, select TCP and enter "80" in the field to the right.
  6. Leave the rest as their defaults and click Create.

Note: If we weren't opening HTTP to the world, we would still need to add a rule to allow incoming HTTP connections from GCP's health check probes. For now, our "allow all" HTTP rule covers those connections already.

Set up an HTTP Health Check

  1. From the left menu, scroll down to Compute and select Compute Engine > Health checks.
  2. Click Create A Health Check.
  3. In Name, enter "http-port-80".
  4. Under Health criteria, set the following values:
    • Check interval: 10
    • Healthy threshold: 3
    • Unhealthy threshold: 3
  5. Leave the rest as their defaults and click Create.
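The healthy and unhealthy thresholds mean that a single probe result never flips an instance's state; it takes a run of consecutive results. The sketch below is a simplified Python model of that behaviour, assuming the thresholds of 3 set above. The function `track_health` is hypothetical and ignores timeouts and the check interval.

```python
from typing import Iterable, List


def track_health(probe_results: Iterable[bool],
                 healthy_threshold: int = 3,
                 unhealthy_threshold: int = 3,
                 initially_healthy: bool = False) -> List[bool]:
    """Return the health state after each probe (True means healthy)."""
    healthy = initially_healthy
    streak = 0  # consecutive probes that contradict the current state
    states = []
    for ok in probe_results:
        if ok == healthy:
            streak = 0  # the probe agrees with the current state
        else:
            streak += 1
            needed = healthy_threshold if ok else unhealthy_threshold
            if streak >= needed:
                healthy = ok
                streak = 0
        states.append(healthy)
    return states


if __name__ == "__main__":
    # One successful probe in the middle resets the failure streak.
    print(track_health([False, False, True, False, False, False],
                       initially_healthy=True))
```

With a 10-second check interval and a threshold of 3, this model suggests a failing instance is flagged after roughly three consecutive failed probes, so about 30 seconds.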

Create the Instance Template and Managed Instance Group

  1. From the left menu, select Compute Engine > Instance groups and click Create Instance Group.

  2. In Name, enter "apache".

  3. Under Location, select Multiple zones.

  4. In Instance template, click the drop-down and select Create a new instance template.

  5. In Name, enter "apache-template".

  6. In Machine Type, select e2-small.

  7. Scroll down to Firewall and select Allow HTTP traffic.

  8. Click Management, security, disks, networking, sole tenancy to expand the section.

  9. Under Automation, paste the following startup script:

    #! /bin/bash
    sudo apt update
    sudo apt -y install apache2
    ZONE=`curl -fs http://metadata/computeMetadata/v1/instance/zone -H "Metadata-Flavor: Google" | cut -d'/' -f4`
    cat <<EOF > /var/www/html/index.html
    <html><body><h1>Hello Cloud Gurus</h1>
    <p>This server is serving from ${ZONE}.</p>
    </body></html>
    EOF
  10. Click Save and continue.

  11. Under Autoscaling, set the Minimum number of instances to "3" and the Maximum number of instances to "5".

  12. Under Autohealing, select http-port-80.

  13. Click Create.

    Note: After a few minutes, the instance group should be ready.

Create an HTTP Load Balancer

  1. From the left menu, scroll down to Networking and select Network services > Load balancing.
  2. Click Create load balancer.
  3. In HTTP(S) Load Balancing, click Start configuration.
  4. Click Continue.
  5. In Name, enter "apache-lb".
  6. Click Backend configuration.
  7. In Backend services & backend buckets, select Backend services > Create a backend service from the dropdown.
  8. In Name, enter "apache-backend".
  9. Scroll down to New backend and select apache as the instance group.
  10. In Port numbers, enter "80" and click Done.
  11. In Health check, select http-port-80.
  12. Click Create.
  13. Click Frontend configuration.
  14. In Name, enter "apache-frontend".
  15. Leave the rest as their defaults and click Done.
  16. Click Create.

    Note: It will take several minutes for your new configuration to be updated.

  17. To verify the load balancer works, click the load balancer to open it, copy the IP address, and load it in your browser.

Logging

The Cloud Logging agent can gather logs from many common applications or from custom log files you define.
Cloud Monitoring can be used to generate alerts based on logs-based metrics that have been gathered by Cloud Logging.

Preemptive Shut Downs

Compute Engine sends a preemption notice to the instance in the form of an ACPI G2 Soft Off signal. You can use a shutdown script to handle the preemption notice and complete cleanup actions before the instance stops.
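As an illustration of the kind of cleanup a shutdown script might do, here is a minimal Python sketch that saves a small checkpoint file before the instance stops. It assumes the VM image has Python 3 and that the script is registered as the instance's shutdown script; the `save_checkpoint` helper, the `CHECKPOINT_PATH` environment variable, and the file name are all hypothetical.

```python
#!/usr/bin/env python3
import json
import os
import tempfile
import time


def save_checkpoint(state: dict, path: str) -> None:
    """Write state atomically so a half-written file is never left behind."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace swaps the finished file into place in one step.
    os.replace(tmp_path, path)


if __name__ == "__main__":
    # Hypothetical location; a real script would more likely copy state off the VM.
    default_path = os.path.join(tempfile.gettempdir(), "app-checkpoint.json")
    target = os.environ.get("CHECKPOINT_PATH", default_path)
    save_checkpoint({"stopped_at": time.time(), "reason": "shutdown"}, target)
```

The time available after a preemption notice is short, so cleanup like this should stay small and fast.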

Troubleshooting

Ex: You have created a managed instance group, but instances inside the group are constantly being deleted and recreated. What are some common issues that could cause this behaviour?

If you have defined a health check (for example, an HTTP check) for your group, you need to ensure you have sufficient firewall configuration to allow Google's probes to connect to your instances. Otherwise, the health checks will fail, and the instances will be marked as unhealthy and recreated.

Errors in the instance template will cause instance creation inside the group to fail. These can include specifying source images that no longer exist, or attempting to attach the same persistent disk in read/write mode to multiple instances.

Contributor Author

shon-button commented Feb 19, 2023

Developing Applications with GKE

THE ILLUSTRATED CHILDREN’S GUIDE TO KUBERNETES thanks, Josh!

Orchestrating containers in production

Kubernetes Cluster

At a basic level, a Kubernetes cluster contains two primary types of computer or VM. First you have masters, which provide the components that make up the control plane. This is responsible for making decisions about your cluster, such as scheduling containers.
Then you have nodes, which provide the runtime environment. They are the primary resources of the cluster where containers will actually run.
You can have multiples of either of these, but you'll normally have more nodes than masters.

Kubernetes Cluster Master

Kubernetes masters run components, which provide the control plane for the cluster.

First, there's the scheduler for scheduling workloads. When you want to deploy a container, the scheduler will pick a node to run that container on. The node it picks can be affected by all kinds of factors, such as the current load on each available node and the requirements of your container.
Next there's the Cloud Controller Manager, which is what allows Kubernetes to work with cloud platforms. This manager is responsible for handling things like networking and load balancing as they translate to the products and services of a cloud vendor.
Then we have the Kube Controller Manager, whose job it is to manage some other controllers in the cluster for things like nodes and deployments.
Kubernetes stores all of its configuration and state in a database called etcd and exposes an API server for all of its master functions. Every time you communicate with the master, or anything else communicates with the master, it will be through this API.
Most of the time you'll use the Google Cloud console or a command line tool, but in the background, it's always talking to this API.
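The scheduler's job described above can be pictured as two phases: filter out nodes that cannot fit the workload, then score the rest and pick the best. The Python sketch below is a toy model of that idea only; the real kube-scheduler weighs many more factors, and the `Node` class and `pick_node` function here are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    cpu_free: float  # free vCPUs
    mem_free: float  # free memory in GiB


def pick_node(nodes: List[Node], cpu_request: float, mem_request: float) -> Optional[Node]:
    """Filter nodes that fit the request, then prefer the least loaded one."""
    feasible = [n for n in nodes
                if n.cpu_free >= cpu_request and n.mem_free >= mem_request]
    if not feasible:
        return None  # nothing fits, so the workload stays unscheduled
    # Score: most free CPU first, free memory as the tie-breaker.
    return max(feasible, key=lambda n: (n.cpu_free, n.mem_free))


if __name__ == "__main__":
    cluster = [Node("node-a", 0.5, 4.0), Node("node-b", 2.0, 1.0), Node("node-c", 1.5, 8.0)]
    chosen = pick_node(cluster, cpu_request=1.0, mem_request=2.0)
    print(chosen.name if chosen else "unschedulable")
```

In this example node-b has the most free CPU but too little memory, so the request lands on node-c.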

Kubernetes Cluster Node

Nodes are a lot more straightforward than the master. There's the kubelet, which is an agent for Kubernetes. It communicates with the control plane and takes instructions,
such as deploying containers when it's told to. Next there's kube-proxy, which is responsible for managing network connections in and out of the node.
And finally, there's the actual container runtime, which is what runs your containers. Historically this was Docker itself; current GKE node images use containerd.

Note: The entire control plane and all of its components are fully managed for you by Google when you create a GKE cluster.

Containers, Pods, Replica Sets, and Deployment

Containers

image

Containers themselves contain all of our application code and libraries and are defined by a Dockerfile.
When you build a container on GCP, you use Cloud Build, which can use a Dockerfile or a Cloud Build config YAML file to configure a build. A build can also be triggered by a commit to a Git repository. The resulting artifact is a Docker container image stored in a container registry.
To make a container production ready, Kubernetes wraps it in a Pod.
Note: When you start working with containers, it's a common mistake to treat them as virtual machines that can run many different things simultaneously. A container can work this way, but doing so reduces most of the advantages of the container model. For example, take a classic Apache/MySQL/PHP stack: you might be tempted to run all the components in a single container. However, the best practice is to use two or three different containers: one for Apache, one for MySQL, and potentially one for PHP if you are running PHP-FPM.

Pods

image

A Pod is a logical application-centric unit of deployment, and it's the smallest thing that you can deploy to Kubernetes.
It might only comprise a single container, or it might contain multiple containers that all work together. Inside a Pod, the containers will share a file system and network IP address. This is a really useful design pattern when, for example, you need to proxy database connections for your application, which you can do inside a Pod using a sidecar container.
A Pod can be deployed on Kubernetes on its own, but to make a Pod production ready, Kubernetes wraps it in a Replica Set.

Replica Sets

image

Replica Sets introduce some scaling and resilience using some other simple Kubernetes objects. Normally we run multiple copies of a Pod called replicas inside something called a replica set. A replica set contains a template that is used to create each replica Pod and a definition of how many replicas it should run.
To make a replica set production ready, Kubernetes wraps it in a Deployment.

Deployment

image

A deployment is the most common way to deploy an application to Kubernetes, as it gives us lots of extra useful logic for getting our apps safely into production. For example, let's say we're running version 1 of our amazing app in this deployment. Our replica set has three replicas, each an identical Pod that can handle requests to our app. But now let's say we want to deploy version 2. How can we do that safely? Well, we simply update the configuration of our deployment and the deployment controller will do something really clever.
First it will create a new replica set with the updated Pod specification. Then it will start to spin up Pods in that set. As each new Pod becomes ready and reports that it's healthy, a Pod in the older replica set is terminated. This will carry on until the new replica set has completely replaced the old one, but there are safeguards in place that can halt an update if a certain percentage of Pods don't spin up, and you can easily roll back to previous configurations in your deployment.
We can add other fancy tricks to our deployment, like a horizontal Pod autoscaler that can automatically add new Pods to our replica set to scale it up when there's extra demand.
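
The rolling update described above can be sketched as a small simulation. This is only a toy model, not how the deployment controller is implemented: the function name `rolling_update` and the `max_surge` parameter are made up for illustration, and it assumes every new Pod becomes healthy.

```python
def rolling_update(replicas, max_surge=1):
    """Simulate replacing every old Pod with a new one, a few Pods at a time.

    Returns a list of (old_count, new_count) snapshots, one per step.
    """
    old, new = replicas, 0
    history = [(old, new)]
    while old > 0:
        # Start up to max_surge Pods in the new replica set and
        # (in this toy model) assume they all report healthy.
        started = min(max_surge, old)
        new += started
        history.append((old, new))
        # Only then terminate the same number of Pods in the old replica set.
        old -= started
        history.append((old, new))
    return history


steps = rolling_update(replicas=3)
for old_pods, new_pods in steps:
    print(f"old replica set: {old_pods}  new replica set: {new_pods}")
```

Note that the total number of Pods never drops below the desired replica count, which is why the app keeps serving requests during the update.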

Service

image

A Kubernetes service object exposes the groups of Pods to the network. It does this by creating a single fixed IP address and routing incoming traffic to a group of Pods. We add labels to our replica set -- things like app equals NGINX, env equals prod -- and then we use a selector in our service to match those labels and route traffic. Now, if you recall how deployments allow us to safely update Pods, you'll see that services are only going to route traffic to healthy Pods.
It's more built-in resilience. In front of our service, we can put something like a Cloud Load Balancer or more advanced options.

Note
To achieve Blue/Green Deployment methodology using Kubernetes:
Maintain a Blue deployment and a Green deployment. Label Pods in each deployment for their respective colour (such as deploy=blue) and for a common app name (such as app=frontend). Create a Service with a selector that requires labels to match the deploy colour and the app name. Switch the selector to a different deploy colour to create the necessary Blue/Green switching as required by the deployment method.

Note
To achieve Canary Deployment methodology using Kubernetes:
Create a new deployment for the canary release. Ensure that the old and new deployments share the same label for the application name. Make sure the size of the canary deployment is appropriate (e.g., 10% of the size of the current deployment). Use a Service with a selector that routes traffic to Pods based on the application label.
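
To see why label selectors make both methods work, here is a minimal sketch of selector matching in plain Python. The Pod names, the `healthy` flag, and the `select_pods` helper are hypothetical; a real Service does this matching inside the cluster.

```python
def select_pods(pods, selector):
    """Return names of healthy Pods whose labels contain every selector pair."""
    return [
        pod["name"]
        for pod in pods
        if pod["healthy"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


pods = [
    {"name": "blue-1", "labels": {"app": "frontend", "deploy": "blue"}, "healthy": True},
    {"name": "blue-2", "labels": {"app": "frontend", "deploy": "blue"}, "healthy": False},
    {"name": "green-1", "labels": {"app": "frontend", "deploy": "green"}, "healthy": True},
]

# Blue/Green: the selector names one colour, so changing it moves all traffic.
blue_targets = select_pods(pods, {"app": "frontend", "deploy": "blue"})
green_targets = select_pods(pods, {"app": "frontend", "deploy": "green"})

# Canary: selecting on the shared app label alone spreads traffic
# across the Pods of both deployments.
canary_targets = select_pods(pods, {"app": "frontend"})
```

With a canary, the share of traffic reaching the new version roughly follows the ratio of Pods in each deployment, which is why the size of the canary deployment matters.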

Workload Identity

Workload Identity is a way to help reduce the potential “blast radius” of a breach or compromise, as well as management overhead, while helping you enforce the principle of least privilege across your environment. It does so by automating best practices for workload authentication, removing the need for workarounds and making it easy to follow recommended security best practices.

By enforcing the principle of least privilege, your workloads only have the minimum permissions needed to perform their function. Because you don’t grant broad permissions (like when using the node service account), you reduce the scope of a potential compromise.
Since Google manages the namespace service account credentials for you, the risk of accidental disclosure of credentials through human error is much lower. This also saves you the burden of manually rotating these credentials.
Credentials actually issued to the Workload Identity are only valid for a short time, unlike the 10-year lived service account keys, reducing the blast radius in the event of a compromise.
Workload Identity is project wide, so you can use it to grant permissions to new or existing clusters in your projects that share Kubernetes namespaces and Kubernetes service account names. For example, after an add-iam-policy-binding call that binds these names, any Pod running under the Kubernetes namespace K8S_NAMESPACE and the Kubernetes service account KSA_NAME has permission to use the [GSA_NAME]@[PROJECT_NAME].iam.gserviceaccount.com IAM service account to access Google Cloud services.
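
As a rough illustration of what such a binding contains, the sketch below builds the member string and the IAM service account email from the placeholders above. The helper names and the sample values (`my-project`, `prod`, `app-ksa`, `app-gsa`) are made up; the member format and the `roles/iam.workloadIdentityUser` role are the ones Workload Identity uses.

```python
def workload_identity_member(project_id, k8s_namespace, ksa_name):
    """Build the IAM member string that identifies a Kubernetes service account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{k8s_namespace}/{ksa_name}]"


def gsa_email(gsa_name, project_id):
    """Build the email address of the Google (IAM) service account."""
    return f"{gsa_name}@{project_id}.iam.gserviceaccount.com"


# Hypothetical values standing in for PROJECT_NAME, K8S_NAMESPACE, KSA_NAME, GSA_NAME.
member = workload_identity_member("my-project", "prod", "app-ksa")
target = gsa_email("app-gsa", "my-project")

# The binding that gets added to the IAM service account's policy.
binding = {"role": "roles/iam.workloadIdentityUser", "members": [member]}
print(f"grant {binding['role']} on {target} to {member}")
```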

Provisioning

PersistentVolume resources are used to manage durable storage in a cluster. In GKE, a PersistentVolume is typically backed by a persistent disk. You can also use other storage solutions like NFS. Filestore is an NFS solution on Google Cloud.

A PersistentVolumeClaim is a request for and claim to a PersistentVolume resource. PersistentVolumeClaim objects request a specific size, access mode, and StorageClass for the PersistentVolume. If a PersistentVolume that satisfies the request exists or can be provisioned, the PersistentVolumeClaim is bound to that PersistentVolume.

Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for the Pod.
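
To make the relationship concrete, here is a sketch that builds a claim and the matching Pod volume entry as Python dictionaries and prints them as JSON (kubectl accepts JSON manifests as well as YAML). The names `mysql-data` and `ssd` and both helper functions are assumptions for illustration, not files from any lab.

```python
import json


def pvc_manifest(name, size, storage_class, access_mode="ReadWriteOnce"):
    """Build a PersistentVolumeClaim requesting a size, access mode, and StorageClass."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


def pod_volume(volume_name, claim_name):
    """Build the entry a Pod spec lists under 'volumes' to use a claim."""
    return {"name": volume_name, "persistentVolumeClaim": {"claimName": claim_name}}


claim = pvc_manifest("mysql-data", "10Gi", "ssd")
volume = pod_volume("data", claim["metadata"]["name"])
print(json.dumps(claim, indent=2))
print(json.dumps(volume, indent=2))
```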

Resource sharing and quotas:

A resource-sharing policy for applications used by different teams in a Google Kubernetes Engine cluster needs to ensure that all applications can access the resources they need to run. Two objects help with this:

A resource quota, defined by a ResourceQuota object, provides constraints that limit aggregate resource consumption per namespace. It can limit the quantity of objects that can be created in a namespace by type, as well as the total amount of compute resources that may be consumed by resources in that namespace.

A LimitRange is a policy to constrain the resource allocations (limits and requests) that you can specify for each applicable object kind (such as Pod or PersistentVolumeClaim) in a namespace.

Lab Deploying WordPress and MySQL to GKE

In this lab, we will create a reasonably complex application stack on Google Kubernetes Engine, creating deployments for WordPress and MySQL, utilizing persistent disks. To complete this lab, you should have some basic experience and familiarity with Google Cloud Platform and Google Kubernetes Engine.

Create the GKE Cluster and Storage Class
  1. From the sidebar menu, select Kubernetes Engine.

  2. Click ENABLE.

  3. In the Kubernetes clusters window, click CREATE.

  4. To the right of GKE Standard, click CONFIGURE.

  5. In the sidebar menu, select the default-pool node pool.

  6. Verify the Number of nodes is set to 3, then click CREATE.

    Note: This process can take a few minutes to complete.

  7. While the cluster is being created, select the Activate Cloud Shell icon (>_) to the right of the top menu bar.

  8. When prompted, click CONTINUE.

  9. In the Cloud Shell terminal, clone the Git repo provided for the lab:

    git clone https://github.com/linuxacademy/content-gke-basics
  10. After the cluster is created, use the three-dot menu to the right of the cluster details to select Connect.

  11. Below Command-line access in the pop-up, click RUN IN CLOUD SHELL.

  12. After the command populates in the Cloud Shell terminal, run it.

    Note: If prompted, click AUTHORIZE to authorize API calls.

  13. Change to the content-gke-basics directory:

    cd content-gke-basics/
  14. Use the ssd-storageclass.yaml file to create a new storage class object that utilizes SSD persistent disks:

    kubectl apply -f ssd-storageclass.yaml
Create Persistent Volumes
  1. Use the mysql-pvc.yaml file to create the persistent volume claim for MySQL:
    kubectl apply -f mysql-pvc.yaml
  2. Use the wordpress-pvc.yaml file to create the persistent volume claim for WordPress:
    kubectl apply -f wordpress-pvc.yaml
  3. In the sidebar menu, select Storage to verify the persistent volume claims were created.
Deploy MySQL
  1. In the Cloud Shell terminal, create a new base64 encoded password:

    echo -n mypassword | base64 -w 0
  2. Copy everything in the output up until cloud_user.

  3. Above the terminal, click Open Editor.

    Note: You may need to enable third-party cookies to open the editor. Follow the on-screen instructions to do this. Afterwards, you may need to reopen the Cloud Shell terminal and the editor.

  4. In the sidebar menu, expand the content-gke-basics folder.

  5. Select mysql-secret.yaml.

  6. Replace the current value for password with the value you copied.

  7. In the editor's top menu bar, select File > Save.

  8. On the right, click Open Terminal.

  9. Back in the terminal, create the secret using the mysql-secret.yaml file:

    kubectl apply -f mysql-secret.yaml
  10. Use the mysql-deployment.yaml file to create the MySQL deployment:

    kubectl apply -f mysql-deployment.yaml
  11. Use the mysql-service.yaml file to create the MySQL service:

    kubectl apply -f mysql-service.yaml
  12. Verify that MySQL is running and ready:

    kubectl get pods
  13. View the ClusterIP service for MySQL:

    kubectl get services
Deploy WordPress
  1. Use the wordpress-deployment.yaml file to create the WordPress deployment:

    kubectl apply -f wordpress-deployment.yaml
  2. Use the wordpress-service.yaml file to create the WordPress service:

    kubectl apply -f wordpress-service.yaml
  3. Select the hamburger menu icon in the top left corner, then navigate to Kubernetes Engine > Workloads to verify the WordPress deployment was successful.

  4. From the Kubernetes Engine sidebar menu, select Services & Ingress.

    It will take a few minutes for the load balancer to assign an external IP address.

  5. After the external IP link appears in the Endpoints column, click it and then click the IP link again on the redirect page to view the deployed app.

    Note: WordPress installations can be insecure, and you should not leave yours running for long before concluding and stopping this lab.

Managing Deployments

Without constraints, GKE will let a container use all of the available resources of a node. So it's important to understand the requirements of your container and plan accordingly.

CPU, Memory, Requests, Limits

image

Among other things, Kubernetes lets you define settings for the CPU and memory that a container can use. These definitions are done in two ways: using requests and limits.
Requests are used to determine where a Pod should be scheduled. For example, a Pod that makes a request for two gigabytes of RAM will only be scheduled onto a node that has that much room to spare. Once the Pod is scheduled, however, it can try to consume more RAM and won't be stopped, unless of course the node itself runs out of memory. To stop Pods from doing this, you use limits.
If a Pod has a limit of two gigabytes of memory specified and a process inside a container in the Pod attempts to use more than this, the process will be terminated with an out of memory error. You can use combinations of requests and limits to carefully plan where your Pods should run and make sure they don't consume more resources than you want them to.
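
The difference between the two settings can be sketched in a few lines. This is a deliberately simplified model: the real scheduler weighs many more factors, and choosing the node with the most free memory is just an assumption made for this example, as are the node names.

```python
GIB = 1024 ** 3


def pick_node(free_memory_by_node, request_bytes):
    """Requests: return a node with enough free memory for the request, or None."""
    candidates = {
        node: free for node, free in free_memory_by_node.items() if free >= request_bytes
    }
    if not candidates:
        return None  # the Pod stays pending until room becomes available
    # Toy policy: prefer the node with the most free memory.
    return max(candidates, key=candidates.get)


def exceeds_limit(usage_bytes, limit_bytes):
    """Limits: a container using more memory than its limit gets terminated."""
    return usage_bytes > limit_bytes


nodes = {"node-a": 1 * GIB, "node-b": 4 * GIB, "node-c": 3 * GIB}
chosen = pick_node(nodes, request_bytes=2 * GIB)
print(f"scheduled on {chosen}")
```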

Namespaces

image

When you deploy an object to Kubernetes without specifying a namespace, it runs in the default namespace, but you can create any number of additional custom namespaces as a logical separation for applications, environments, or even teams.
Using different namespaces can provide some convenience, like separating service names in DNS, but it's also really helpful when combined with resource requests and limits. For starters, you can define a default requests and limits policy that will apply to all Pods in the namespace, but you can also configure constraints that apply to a namespace, which contain a minimum and maximum value for these settings.
If a Pod does not meet the constraints -- for example, if it tries to set a request or limit higher than that allowed by the constraint -- Kubernetes will refuse to schedule it. Combined, these controls allow you to operate namespaces almost as virtual clusters within your cluster.
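
A namespace constraint check boils down to a range test plus a default. The sketch below assumes memory values in MiB, and both function names are hypothetical; in a cluster this logic is applied by Kubernetes itself.

```python
def apply_default(requested_mib, default_mib):
    """Use the namespace default when the Pod does not set a value itself."""
    return default_mib if requested_mib is None else requested_mib


def violates_constraint(requested_mib, minimum_mib, maximum_mib):
    """Return a reason if the value is outside the allowed range, else None."""
    if requested_mib < minimum_mib:
        return f"{requested_mib}Mi is below the minimum of {minimum_mib}Mi"
    if requested_mib > maximum_mib:
        return f"{requested_mib}Mi is above the maximum of {maximum_mib}Mi"
    return None


# A Pod that sets no memory limit picks up the namespace default.
effective = apply_default(None, default_mib=128)
accepted = violates_constraint(effective, minimum_mib=64, maximum_mib=1024)

# A Pod asking for 4 GiB in the same namespace is refused.
rejected = violates_constraint(4096, minimum_mib=64, maximum_mib=1024)
print(rejected)
```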

RBAC and IAM

image

To control security and access for people who operate or deploy to GKE, Kubernetes uses role-based access control, or RBAC, which you can combine with GCP's identity and access management, or IAM.
With RBAC, you create a role that can perform operations on specific resources -- for example, creating or listing Pods within a namespace. A cluster role is basically the same thing, but applied to the entire cluster instead of a specific namespace.
RBAC works with IAM by granting roles to either a person -- basically someone with a Google Cloud account -- or a service account. Also, roles can be granted to Google groups, saving you the trouble of having to assign them to individual users.

Workload Identity

image

The final piece is locking down the permissions of workloads actually running on the cluster, which we can do with Workload Identity.
So let's say you've deployed your application to a GKE cluster, but it needs access to other services and APIs within GCP, like Cloud Storage or maybe a Cloud SQL database.
Now it's very easy to run a Kubernetes deployment with a custom service account. This is a Kubernetes service account, a Kubernetes-native object that provides a unique identity for running Pods within the cluster. But with Workload Identity, you can easily map that Kubernetes service account to a GCP service account. Then you can assign specific IAM roles and permissions in GCP to allow the service account access to the resources it needs and nothing else, maintaining the security principle of least privilege.
Note: Spinnaker allows you to define complex deployment pipelines, each comprising multiple stages that themselves contain multiple sequential tasks. It is designed to facilitate the rapid release of software changes while automating any necessary infrastructure work under the hood.

Operating Kubernetes Engine

How to manage, maintain, and troubleshoot application deployments on GKE.

Pod and Container Lifecycle

image

To help you operate GKE effectively and know what's going on under the hood, it's important to understand the lifecycle of Pods and containers. Whenever a Pod is created, either on its own or part of a replica set or deployment, the Pod state goes through various different phases. First, the Pod is pending. This means its definition has been accepted by the cluster, but it's not yet ready to serve, usually because one or more of the containers inside the Pod is not yet ready.
Pods might also be pending if they can't immediately be scheduled due to a lack of resources in the cluster. So while our Pod is pending, let's take a look at container states. The container starts in the waiting state. While in this state, various things could be happening. A container image could be being pulled from the registry, or some other startup is taking place inside the container, like the application of secret data or the running of a postStart hook.
Once a container is ready, it enters the running state. When the containers in a Pod are running and the Pod has been bound to a node, the Pod also enters the running state. In most scenarios, this is where you want your Pods to stay until they are ready to be replaced, perhaps as part of a rolling update to your deployment.
If you're running a job or cron job object, your Pods will behave quite differently. In this case, you want them to complete some piece of work and terminate successfully.
A container will enter a terminated state if it either ran to completion successfully or it failed. If the container has a preStop hook configured, the hook runs before the container enters the terminated state.
If all of the containers in a Pod have terminated successfully, the Pod enters the succeeded state.
The job has been completed and no further action will be taken. Oh, and just so you know, when Pods are in the succeeded state, they may report their status as completed, just to confuse you. Now, if at least one of the containers has terminated in failure, the Pod will enter a failed state. Depending on the restart policy of the Pod, the kubelet will attempt to restart failed containers with an exponential backoff, which means it will wait longer and longer between each retry, up to a cap.
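
Exponential backoff is easy to picture with a few numbers. The sketch below assumes a 10-second starting delay that doubles each time and is capped at five minutes, which matches the defaults described in the Kubernetes documentation; treat both values and the function name as assumptions for illustration.

```python
def backoff_delays(attempts, base_seconds=10, cap_seconds=300):
    """Return the wait before each restart attempt: doubling, up to a cap."""
    return [min(base_seconds * 2 ** attempt, cap_seconds) for attempt in range(attempts)]


delays = backoff_delays(7)
print(delays)  # [10, 20, 40, 80, 160, 300, 300]
```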

Common Errors

image

Two of the most common errors to look for in a failed container are ErrImagePull and CrashLoopBackOff.
ErrImagePull means that the container image could not be retrieved from the registry. More often than not, this is due to a typo in the image name in a Pod manifest, or perhaps you're specifying an image tag that doesn't exist.
You might also see an ImagePullBackOff error if you're not authorized to pull an image from a private registry.
CrashLoopBackOff means that the container has failed and been restarted, then failed and been restarted, and this has happened so many times that Kubernetes is now waiting longer and longer before each new attempt. This is the exponential backoff we just talked about. When this happens, either there is a bug with the container itself or some configuration that it relies upon is not set up properly.

GKE Logging and Monitoring

image

GKE is tightly integrated with Cloud Operations for logging and monitoring. Logs can be viewed through the GKE dashboard by drilling down through individual deployments, Pods, and services, or in dedicated operations dashboards.

Custom and External Metrics

image

Custom and external metrics can also be used to determine the behavior of the horizontal Pod autoscaler, which, as we know, can add new Pods to a deployment. Custom metrics are metrics reported to Cloud Monitoring by our own application -- for example, things like queries per second, latency from dependencies, or anything else you like.
You create these metrics yourself within your application and report them to the Cloud Monitoring API.
External metrics originate from outside your cluster, but can still be used to influence cluster behavior.
These include things like other Cloud Monitoring metrics, perhaps the number of requests to some other part of your stack, or maybe the number of pending messages in a Pub/Sub topic.
Both of these options give you many different ways to make sure that your workloads can scale appropriately to meet demand.

Resources

Container design pattern


shon-button commented Feb 19, 2023

Developing Serverless Applications with GCP

image

App Engine

PaaS (platform as a service)

image

App Engine Standard

image

Let's take a quick look at an App Engine standard app in the App Engine section of the GCP console.
The first thing I need to do is create an application: pick a region for my app (which is a permanent choice), pick a language, and pick an environment, here the standard environment.
So now, App Engine is setting up and initializing all of its services, basically the platform part of this platform as a service.
At a minimum, an app has to have a single function, an app.yaml file to specify the runtime, and a requirements.txt file to specify any libraries needed.
Deploy this app by simply running gcloud app deploy.
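
As a rough idea of how small such an app can be, here is a sketch of a main.py that uses only the Python standard library. It assumes the Python standard runtime serves a WSGI callable named `app` from main.py (the usual default) and that app.yaml contains a runtime line such as `runtime: python39`; the greeting text is made up. Because nothing is imported from outside the standard library, requirements.txt could stay empty here, whereas a framework like Flask would be listed in it.

```python
# main.py
def app(environ, start_response):
    """Minimal WSGI application: answer every request with a plain-text greeting."""
    body = b"Hello from App Engine!"
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```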

Cloud Function

FaaS (function as a service)

image

One of the primary differentiators for cloud functions is that it is event driven.
At a very high level, that means that an event has to occur for the function to execute. And once the function has finished executing, it disappears. The entire life cycle of a cloud function serves to answer a single request, but that's okay because the platform itself can scale to run millions of functions at the same time if it has to.

There are many different events that can trigger a function.
image

Generally we group them into two categories for triggering two types of functions: HTTP functions and background functions.
HTTP functions are triggered by an HTTP request like a GET or a POST to a URL and should provide an HTTP response.
Background functions by comparison are invoked in response to some other kind of event, such as a message on a Pub/Sub topic, a change in a Cloud Storage bucket, a Firebase event, or many other supported triggers.

Writing Cloud Functions

image

To write a cloud function we have to conform to one of the supported Cloud Functions runtimes, which are Node versions 8 and 10, Python 3, Go, Java, .NET, or Ruby. We can't use our own runtimes here as these have been optimized by Google to run as efficiently as possible.
To write an HTTP function, the code must accept an HTTP request and provide an HTTP response. All of the HTTP verbs like GET, POST, PUT, and so on are supported.
Background functions receive a payload that consists of some data object for the event and a context object for the event. The data object varies depending on the type of event. For example, a cloud storage event would contain object metadata. The context will contain information such as the event type and a timestamp.
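
Here is a sketch of what a background function for a Cloud Storage event might look like, called locally with stand-in objects. The function name, bucket, and object name are invented, and the `SimpleNamespace` context only imitates the attributes mentioned above (event ID, event type, timestamp); on the platform these objects are supplied for you.

```python
from types import SimpleNamespace


def on_object_finalized(data, context):
    """Background function: report which storage object triggered the event."""
    message = (
        f"event {context.event_id} ({context.event_type}) at {context.timestamp}: "
        f"gs://{data['bucket']}/{data['name']}"
    )
    print(message)
    return message


# Local stand-ins for the payload the platform would pass in.
sample_data = {"bucket": "example-bucket", "name": "uploads/report.csv"}
sample_context = SimpleNamespace(
    event_id="1234",
    event_type="google.storage.object.finalize",
    timestamp="2023-02-19T12:00:00Z",
)
result = on_object_finalized(sample_data, sample_context)
```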

Use Cases

image

Create and deploy a cloud function in the GCP console

In the GCP console in the cloud functions section, click create function.
Specify the basics: the function's name, region, and trigger type (here HTTP). Under authentication, for this demo, make the function open to the world by selecting to allow unauthenticated invocations.
We'll save this configuration and click next.
The final thing we need to configure is the code of our function itself.
Because we're setting this up in the console, we actually get this little neat inline code editor, but we could alternatively upload a zip file of our code or specify a cloud storage or cloud repository location.

I can pick any of the supported runtimes and you can see that the editor prepares some boilerplate code for my function.
Note that there is a single function inside the code editor. That's the whole point. A cloud function isn't supposed to respond to multiple different routes or paths and perform multiple tasks. It should just do a single thing and do it well.
Note the function name, which will be formatted appropriately for the runtime you are using. Now this is important because we have to use that function name to define our entry point. When the cloud functions platform invokes our code, it will look for a function that matches the name in that entry point.
Deploy this function by clicking the "Deploy" button.

How to secure cloud functions

Like most GCP resources, functions run with their own identity, a dedicated custom service account that you create and manage. Using a custom service account allows you to specify granular permissions on who and what can access your function, and other things your function can access itself.

image

For example, when you deploy an HTTP function, it is secured by default and will only accept authenticated requests.
If I want to call this function, I need an identity that has the cloud functions invoker role, and this needs to be specifically granted for the identity of the function I'm calling. It doesn't just apply to all functions. Then I can get an OAuth ID token using my service account and include it in the authorization header of my HTTP request. This method is also how you configure secure functions to speak to other secure functions.
Note: To make an HTTP function public, as we saw in the demo, you assign the Cloud Functions Invoker role to the special allUsers member.
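
The "include it in the authorization header" step can be sketched with the standard library. How you obtain the ID token is left out on purpose; the URL and the token below are placeholders, and `build_authenticated_request` is a hypothetical helper. The request is only built here, not sent.

```python
import urllib.request


def build_authenticated_request(function_url, id_token):
    """Attach an ID token to a request as a bearer token."""
    request = urllib.request.Request(function_url)
    request.add_header("Authorization", f"Bearer {id_token}")
    return request


# Placeholder values: substitute your function's URL and a real ID token.
request = build_authenticated_request(
    "https://REGION-PROJECT_ID.cloudfunctions.net/hello_http",
    "PLACEHOLDER_ID_TOKEN",
)
print(request.get_header("Authorization"))
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`, which only succeeds if the caller's identity has been granted the invoker role on that function.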

Background functions can only be invoked by the event source to which they are subscribed, such as a Pub/Sub or Cloud Storage event.

image

If the function being called needs access to other GCP resources or APIs, it's a good idea to give it a custom identity to control the level of access that it has inside your project.

Best practices for using cloud functions.

First of all, don't start background activities in your function.
Next, only use the dependencies you need in your code.
And finally, always delete temporary files.
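
The last point deserves a small example. In Cloud Functions, files written to the temporary directory take up the function's memory and can linger between invocations, so it helps to clean up even when processing fails. The sketch below shows one way to do that with try/finally; `process_payload` and its byte-counting "work" are placeholders.

```python
import os
import tempfile


def process_payload(payload: bytes):
    """Write the payload to a scratch file, use it, and always delete it."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as scratch:
            scratch.write(payload)
        # Placeholder for real work on the file: here we just measure it.
        size = os.path.getsize(path)
        return size, path
    finally:
        # Runs on success and on error, so the file never outlives the call.
        os.remove(path)


size, scratch_path = process_payload(b"hello")
print(size, os.path.exists(scratch_path))
```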

Summary:

Cloud Functions must be written in a supported language and can be triggered by either an HTTP request or a background event. These are perfect for event driven workloads like processing Internet of Things data, acting as a web hook for another application, or just generally providing logic or glue between other things in your stack.

Lab: Create an HTTP Google Cloud Function

image

Google Cloud Functions are a fully managed and serverless way to run event-driven code. They react to demand from zero to planet-scale and come with integrated monitoring, logging, and debugging. All you need to do is plug in your code!

In this lab, we will introduce ourselves to Cloud Functions by writing our first function that will simply respond to an HTTP trigger; in other words, our function will run when you send a request to its URL.

Enable APIs and Set Up the Cloud Shell

  1. Take note of the Project ID in the top left corner of the page.

  2. In the GCP console, click the Activate Cloud Shell icon near the upper right.

  3. When prompted, click Continue.

  4. When the terminal is ready, run this command to set the project ID:

    gcloud config set project <PROJECT_ID>
  5. When prompted, click Authorize.

  6. Using the Cloud Shell, enable the Cloud Build and Cloud Functions APIs:

    gcloud services enable cloudbuild.googleapis.com cloudfunctions.googleapis.com
  7. Create a directory for your function:

    mkdir ~/helloworld
  8. Change to the new directory:

    cd ~/helloworld
  9. In the Cloud Shell, click Open Editor.

  10. Choose Open in a new window.

  11. When the editor has opened in a new tab, return to the original tab and click Open Terminal to get the terminal back.

Write the Hello World Function

  1. Go back to the Cloud Shell Editor.

  2. Click the helloworld directory.

  3. Using the File menu, click New File.

  4. Name the file main.py and click OK.

  5. Click the main.py file and paste in the following:

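    from flask import escape

    def hello_http(request):
        """Respond to an HTTP request with a greeting."""
        # The name may arrive in a JSON body or as a query parameter.
        request_json = request.get_json(silent=True)
        request_args = request.args

        if request_json and 'name' in request_json:
            name = request_json['name']
        elif request_args and 'name' in request_args:
            name = request_args['name']
        else:
            name = 'World'
        # Escape the value so user input cannot inject HTML into the response.
        return 'Hello {}!'.format(escape(name))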
    

  6. Using the File menu, click Save.

  7. Using the File menu, click New File.

  8. Name the file requirements.txt and click OK.

  9. Click the requirements.txt file and paste in the following:

    Flask==2.0.3
  10. Using the File menu, click Save.

Deploy and Test the Hello World Function

  1. Back in the Cloud Shell Terminal, deploy the function using the following command:

    gcloud functions deploy hello_http --runtime python38 --trigger-http --allow-unauthenticated
  2. In the console, using the main navigation menu, select Cloud Functions from the COMPUTE section.

  3. Select the hello_http function.

  4. Click the TRIGGER tab.

  5. Click the URL under Trigger URL.

  6. Click the redirect link to trigger your function and see the "Hello World!" response.

  7. To customize the response, add a query parameter to the end of the URL (for example, ?name=Cloud%20Gurus).

Lab: Triggering a Cloud Function with Cloud Pub/Sub

image

Cloud Functions can be triggered in 2 ways: through a direct HTTP request or through a background event. One of the most frequently used services for background events is Cloud Pub/Sub, Google Cloud’s platform-wide messaging service.

In this pairing, Cloud Functions becomes a direct subscriber to a specific Cloud Pub/Sub topic, which allows code to be run whenever a message is received on a specific topic. In this hands-on lab, we’ll walk through the entire experience, from setup to confirmation.

Enable Required APIs

  1. From the main console navigation, go to APIs & Services > Library.
  2. Search for Pub/Sub.
  3. Select the Cloud Pub/Sub API card.
  4. Click ENABLE, if displayed.
  5. Return to the API Library and search for functions.
  6. Select the Cloud Functions API and click ENABLE.
  7. Return to the API Library and search for build.
  8. Select the Cloud Build API and click ENABLE.

Create Pub/Sub Topic

  1. Using the main navigation menu, under Analytics go to Pub/Sub > Topics.
  2. Click CREATE TOPIC.
  3. For the Topic ID, enter a name for the topic (e.g., "greetings").
  4. Click CREATE TOPIC.

Create a Cloud Function

  1. Using the main navigation, under SERVERLESS, go to Cloud Functions.
  2. Click CREATE FUNCTION.
  3. Configure the function with the following values:
    • Name: acg-pubsub-function
    • Region: us-central1
    • Trigger: Cloud Pub/Sub
    • Topic: The topic you just created
  4. Make sure to expand out the "Runtime, build, connections and security settings" section, and set the "Maximum number of instances" to 1.
  5. Click SAVE.
  6. Click NEXT.
  7. For the Runtime, select Python 3.9.
  8. Using the Inline Editor, select the main.py file.
  9. Delete the existing code, and paste in the following:
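    import base64

    def greetings_pubsub(data, context):
        """Log a greeting for the name carried in a Pub/Sub message."""
        # Pub/Sub delivers the message body base64-encoded under the 'data' key.
        if 'data' in data:
            name = base64.b64decode(data['data']).decode('utf-8')
        else:
            name = 'folks'
        print('Greetings {} from Linux Academy!'.format(name))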

  10. Set Entry point to greetings_pubsub.
  11. Click DEPLOY.

    Note: This process can take up to two minutes to complete.

Publish Message to Topic From Console

  1. Click the newly created Cloud Function.
  2. Switch to the Trigger tab.
  3. Click the topic link to go to the Cloud Pub/Sub topic.
  4. Scroll down to the bottom of the page and switch to the MESSAGES tab.
  5. Click PUBLISH MESSAGE.
  6. Under Message body, enter "everyone around the world".
  7. Click PUBLISH.

Confirm Cloud Function Execution

  1. Return to the Cloud Functions dashboard and click the LOGS tab.
  2. Review the logs and confirm the function was executed successfully.

Trigger Cloud Function Directly From Command Line

  1. At the top of the page, click the Project field, and copy the project ID.

  2. Click the icon in the top right corner of the console to activate Cloud Shell.

  3. When prompted, click Continue.

  4. Enter the following to set the project ID:

    gcloud config set project <PROJECT_ID>
  5. When prompted, click Authorize.

  6. Using the following, set a variable called DATA:

    DATA=$(printf 'my friends' | base64)
  7. Using the DATA variable, trigger the function:

    gcloud functions call acg-pubsub-function --data '{"data":"'$DATA'"}'
  8. Review the logs and confirm the function was executed successfully.

Publish Message to Topic From Command Line

  1. In the Cloud Shell, enter the following command:

    gcloud pubsub topics publish greetings --message "y'all"
  2. Review the logs and confirm the function was executed successfully.

Cloud Run

CaaS (container as a service)

image

Cloud Run, at a very high level, takes the same serverless model as Cloud Functions, but changes the deployment artifact from a piece of code to a container. This gives us a containers as a service model.

Choosing between these two models depends on the type of serverless application you're deploying, so keep the constraints and differences of each in mind.

Cloud Functions has a limited set of supported runtimes.
By comparison Cloud Run, as a containers as a service model, lets you use any runtime language or framework you like so long as it runs in a Docker container. This gives you the most freedom, but at the cost of a slightly larger overhead.
Cloud Functions is purely event driven.
Cloud Run, while it can be event driven, is also designed for concurrency. A single instance of a Cloud Run service can serve multiple requests in the style of a traditional server.

Broadly speaking, Cloud Functions is perfect if and only if your use case fits into its model and requirements. More generally then, Cloud Run will give you the most choice but with slightly more overhead.
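To make the container model concrete, here is a minimal sketch of the kind of HTTP server a Cloud Run container could run. It assumes the platform passes the listening port in the `PORT` environment variable (8080 is used as a fallback here); `HelloHandler` and `make_server` are illustrative names, and a threaded server is used only to mirror the idea of one instance serving several requests at once.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answer every GET with a short plain-text body."""

    def do_GET(self):
        body = b"Hello from a container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=None):
    """Create a threaded server on the given port, or on $PORT if none is given."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return ThreadingHTTPServer(("0.0.0.0", port), HelloHandler)
```

A container entrypoint would call `make_server().serve_forever()`. Each request is handled on its own thread, so a slow request does not block the others.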

Cloud Run app in the GCP console

image

Cloud Run is a fully managed serverless execution environment that lets you run stateless HTTP-driven containers without worrying about the infrastructure.

Cloud Run for Anthos lets you deploy Cloud Run applications into an Anthos GKE cluster running on-prem or in Google Cloud.

Knative is the open API and runtime environment on which Cloud Run is based. It brings workload portability and the serverless developer experience to your Kubernetes clusters, wherever they may be.

Revisions and Traffic Splitting

image

Triggers and Schedule

image

Summary:

Cloud Run services can be invoked by HTTP requests or driven by events. A single instance can handle multiple concurrent connections, and the service scales back down to zero when there is no further demand.

Lab: Cloud Run Deployments with CI/CD

image

Introduction

In this lab, we’ll configure a continuous deployment pipeline for Cloud Run using Cloud Build. We'll set up Cloud Source Repositories and configure Cloud Build to automate our deployment pipeline. Then, we'll commit changes to Git and observe the fully automated pipeline as it builds and deploys our new image into service.

Enable APIs and Create the Git Repo

  1. Make note of the project ID once you're logged in — we'll need to remember it for a later step.
  2. Using the main navigation menu, go to APIs & Services > Dashboard.
  3. Click Enable APIs & Services.
  4. Search for Cloud Run.
  5. Select Cloud Run API.
  6. Click Enable.
  7. Once it's enabled, navigate back to APIs & Services > Dashboard.
  8. Click ENABLE APIS AND SERVICES.
  9. Search for Cloud Build.
  10. Select Cloud Build API.
  11. Click Enable.
  12. Using the main navigation menu, under TOOLS, click Source Repositories.
  13. Click Get started.
  14. Click Create repository.
  15. Select Create new repository and click Continue.
  16. For Repository name, enter "cddemo".
  17. Click the Project dropdown, and select your project ID (the one you made note of at the beginning of the lab).
  18. Click Create.
  19. Click the Google Cloud SDK tab.
  20. Copy the second command listed, which will clone the repo.
  21. Back on the GCP console tab, using the icon in the top right corner of the page, activate the Cloud Shell.
  22. When prompted, click Continue.
  23. Paste the command you copied into Cloud Shell.

Commit Application Code

  1. In the Cloud Shell terminal, download the sample app zip file:
    wget https://github.com/linuxacademy/content-google-cloud-run-deep-dive/raw/master/amazingapp.zip
  2. Unzip the file:
    unzip amazingapp.zip
  3. Change directory (there are three versions of the app, but we'll use the blue one throughout the lab guide here):
    cd amazingapp/blue
  4. Copy the files into our new empty Git repo:
    cp -r * ~/cddemo/
  5. Change to the directory of the repo:
    cd ~/cddemo/
  6. Configure your Git identity, replacing <YOUR_EMAIL> and <YOUR_NAME> with your information:
    git config --global user.email "<YOUR_EMAIL>"
    git config --global user.name "<YOUR_NAME>"
  7. Add the directory's files to your repo:
    git add .
  8. Commit the files to the repo:
    git commit -m "Initial commit"
  9. Push the files to the repo:
    git push -u origin master
  10. Once completed, reload the Source Repositories page in the browser. We should now see the files have been committed.

Set Up Cloud Build

  1. In the Cloud Shell, click Open Editor.
  2. Expand cddemo.
  3. Click File > New File.
  4. Name it cloudbuild.yaml.
  5. Enter the following contents into the file:
    steps:
    - name: 'gcr.io/cloud-builders/docker'
      args: ['build', '-t', 'gcr.io/$PROJECT_ID/amazingapp:$COMMIT_SHA', '.']
    - name: 'gcr.io/cloud-builders/docker'
      args: ['push', 'gcr.io/$PROJECT_ID/amazingapp:$COMMIT_SHA']
    - name: 'gcr.io/cloud-builders/gcloud'
      args:
      - 'run'
      - 'deploy'
      - 'amazingapp'
      - '--image'
      - 'gcr.io/$PROJECT_ID/amazingapp:$COMMIT_SHA'
      - '--region'
      - 'us-east1'
      - '--platform'
      - 'managed'
      - '--allow-unauthenticated'
    images:
    - gcr.io/$PROJECT_ID/amazingapp:$COMMIT_SHA
  6. Click File > Save.
  7. Click Open Terminal.
  8. Commit the file to the repo:
    git add cloudbuild.yaml
    git commit -m "Add cloudbuild.yaml"
    git push
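
The build file tags the image with `$PROJECT_ID` and `$COMMIT_SHA`, which Cloud Build fills in for each build. As a rough illustration of what the resulting image reference looks like, the sketch below performs the same kind of substitution with Python's `string.Template`; the project ID and commit hash are made-up values, and this is not how Cloud Build itself is implemented.

```python
from string import Template

# Image reference as written in cloudbuild.yaml.
IMAGE_TEMPLATE = "gcr.io/$PROJECT_ID/amazingapp:$COMMIT_SHA"


def resolve_image(project_id, commit_sha):
    """Fill in the two substitutions the way a build would for one commit."""
    return Template(IMAGE_TEMPLATE).substitute(
        PROJECT_ID=project_id, COMMIT_SHA=commit_sha
    )


# Hypothetical project ID and commit hash, for illustration only.
print(resolve_image("my-project", "3f2a9c1"))
```

Because the tag is the commit hash, every push produces a distinct image, which is what lets Cloud Run create a new revision per commit.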

Set Up Build Triggers

  1. In the GCP console, navigate to Cloud Build > Settings.

  2. Click the Disabled dropdown next to the Cloud Run Admin role, and select Enable. (This should also enable the Service Account User role, which is what we also want to enable.)

  3. Click GRANT ACCESS TO ALL SERVICE ACCOUNTS in the dialog that pops up.

  4. In the left-hand menu, click Triggers.

  5. Click the three dots icon on the right in the cddemo box, and select Add trigger.

  6. Set the following values on the Create trigger page:

    • Name: deployall
    • Event: Push to a branch
    • Branch: ^master$
    • Build configuration: Cloud Build configuration file
  7. Click CREATE.

  8. In Cloud Shell, click Open Editor.

  9. Open the homepage.html file.

  10. Change the header title to: "My Even More Amazing App!"

  11. Click File > Save.

  12. In Cloud Shell, click Open Terminal.

  13. Commit the change to Git:

    git commit -am "Update app"
    git push
  14. In the GCP console, navigate to Cloud Build > Dashboard to watch the build as it runs.

  15. Once the build is finished, using the main navigation menu, go to COMPUTE > Cloud Run.

  16. Click amazingapp.

  17. Click the service URL.
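
The trigger above uses the branch pattern `^master$`. The anchors matter: without them, the pattern would also match branches that merely contain the word. The sketch below uses Python's `re` module as a stand-in to show which branch names match; Cloud Build's own regex engine may differ in details, so treat this as an approximation.

```python
import re

# Branch pattern from the trigger configuration.
BRANCH_PATTERN = re.compile(r"^master$")


def trigger_fires(branch):
    """Return True when a push to this branch would match the pattern."""
    return BRANCH_PATTERN.search(branch) is not None


for name in ["master", "master-hotfix", "feature/master"]:
    print(name, trigger_fires(name))
```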

Serverless APIs

image

Cloud Endpoints

image

The first option uses the sidecar method with the Nginx version of the Extensible Service Proxy (ESP).
We set up Docker in Compute Engine, where we deployed our echo container as an example API backend and the ESP container as a sidecar. Requests for our service hit the ESP proxy first, and because we'd configured security in our OpenAPI spec, we needed a valid API key to get a response.

The second option uses the remote proxy method with the Envoy version of the proxy.
We deployed a hello world function to Cloud Functions, then ran the ESP proxy in Cloud Run. Requests were handled by the ESP proxy first, then routed to the backend service.

Both of these methods took quite a bit of configuration to set up, but the results are worth it.

API Gateway

image

As a fully managed service, API Gateway is somewhat simpler to use than Cloud Endpoints.
We deploy our API service to the serverless platform of our choice, then use an OpenAPI specification to create a managed gateway, which provides the frontend for API requests. Authentication and rate limiting are configured as part of the API specification.

Summary:

Cloud Endpoints provides an Nginx proxy service for App Engine, Compute Engine, and GKE, as well as a newer Envoy-based proxy for App Engine, Cloud Run, and Cloud Functions.
API Gateway is basically a managed version of that Envoy proxy, specifically designed for those serverless platforms.
Both services do essentially the same thing: they provide a gateway for our API backends, using an OpenAPI spec to add things like authentication and rate limiting, so we don't have to write our own middleware into our backends.

Google ML API

image

These are Google's pre-trained, off-the-shelf APIs, which use machine learning models that Google has already trained for millions of hours.

Cloud Vision API

image

Currently, the Cloud Vision API supports the following features:
Face detection -- detects faces and returns bounding polygons for their location, along with individual features such as eyes, ears, and noses, and even the likelihood of a general emotion on someone's face.
Landmark detection -- finds famous landmarks in pictures and returns their name, coordinates, and a confidence score for its guess.
Logo detection -- finds common brand-name logos and returns a text description, confidence score, and bounding polygon.
Label detection -- returns general metadata labels about the image, for example, this is a street, here are some people, and so on.
Object detection and localization -- provides the location in a bounding box and a label for each object it can recognize in a picture; again, each result is given a confidence score.
Web entity detection -- a similar feature that returns any web content it can find related to objects or images in your request.
Text detection -- converts text found in images or documents, with full OCR support for dense text and even handwriting.
Explicit content detection -- returns a confidence score for an image under the categories of adult, spoof, medical, violence, and racy.
The results you get from the Cloud Vision API are returned as JSON.
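
Requests are JSON as well. The sketch below builds a request body of the shape used by the REST `images:annotate` method, as far as I understand it: a list of requests, each with base64-encoded image content and a list of feature types. The field names and the feature type strings are written from memory and should be checked against the API reference; `build_annotate_request` is a hypothetical helper.

```python
import base64
import json


def build_annotate_request(image_bytes, feature_types, max_results=5):
    """Build a JSON-serializable body asking for the given features on one image."""
    return {
        "requests": [
            {
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                "features": [
                    {"type": feature, "maxResults": max_results}
                    for feature in feature_types
                ],
            }
        ]
    }


# Placeholder bytes stand in for a real image file.
body = build_annotate_request(b"not-a-real-image", ["LABEL_DETECTION", "LOGO_DETECTION"])
print(json.dumps(body, indent=2))
```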

Cloud Video Intelligence API

image

The Video Intelligence API is like the Vision API but for videos instead of images.
It has features such as detecting shot changes in a video, tracking objects, detecting logos, and transcribing speech from the soundtrack of a video. It can detect labels and text in the same way as the Vision API, which can be done on a frame-by-frame or shot-by-shot basis or for a segment that you specify. And it also has explicit content detection.
At the time of recording, there are also some new features in beta: face and person detection, and support for streaming the response to a recorded video or annotating a live video feed.
There's also a restricted beta feature: celebrity detection. This feature is intended for media and entertainment companies or their approved partners, and you need to submit an application to Google to be allowed to use it.

Cloud Speech to Text API

image

You simply provide the API with an audio file containing some speech, and it will return a text transcript.
There are multiple options for specifying different detection models, and you can request that the API return a confidence level for specific important words in the transcript.

Cloud Text to Speech API

image

You provide the API with text, either plain or marked up with SSML, and it returns synthesized speech as audio.
There are multiple voices and languages to choose from, and you can adjust properties such as speaking rate and pitch.

Translation API

image

You simply send it some text to convert from one language to another.
You can specify the languages to use or have them auto detected. An advanced version of the API also supports custom dictionaries and batch requests.

AutoML API

image

This is Google's managed machine learning service that lets you create your own models by using the existing pre-trained ones as a jumping-off point. AutoML supports video intelligence, vision, natural language, and translation models, with a few more in beta at the time of this recording. Using AutoML, you fine-tune the existing models using your own data or classifications with minimal effort and little to no machine learning expertise.

@shon-button
Contributor Author

shon-button commented Feb 20, 2023

GCP Application Security

Overview

image

You configure IAM by defining policies or rules that allow you to specify who has what access to which resource.
A policy combines two things: a member and a role. A member can be a Google account, a service account, a Google group, or a domain defined in either Google Workspace or Cloud Identity.
A role is simply a collection of permissions. Normally, you use a predefined role. Some examples would be Compute Instance Admin, Pub/Sub Publisher, or Storage Object Creator, but you can also create custom roles with only the permissions you choose.

An easy way to manage access for staff is to add them to an appropriate Google group and then grant the role to the group. Individual users can then be moved in and out of groups as required.
Another thing to bear in mind with regard to roles is their scope. Some of these roles are pretty far-reaching. For example, the Compute Instance Admin role affects the entire project. That means the permissions granted by this role affect every Compute Engine instance in the project. As an example of a more narrowly scoped role, the permissions granted by Pub/Sub Publisher can be limited to a specific Pub/Sub topic. Different roles have different options for scope, so check the IAM documentation to find out which one is best for your use case.
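
As a small illustration of the member-plus-role idea, the sketch below models a policy as a list of bindings, which is roughly how IAM policies appear when exported as JSON. The group and service account names are made up, and `roles_for` and `grant` are hypothetical helpers rather than IAM API calls.

```python
# A policy as a list of bindings: each binding ties one role to its members.
policy = {
    "bindings": [
        {
            "role": "roles/pubsub.publisher",
            "members": ["group:app-team@example.com"],
        },
        {
            "role": "roles/storage.objectCreator",
            "members": [
                "group:app-team@example.com",
                "serviceAccount:uploader@my-project.iam.gserviceaccount.com",
            ],
        },
    ]
}


def roles_for(policy, member):
    """List the roles that the policy grants directly to this member."""
    return sorted(b["role"] for b in policy["bindings"] if member in b["members"])


def grant(policy, role, member):
    """Add the member to the role's binding, creating the binding if needed."""
    for binding in policy["bindings"]:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return
    policy["bindings"].append({"role": role, "members": [member]})


print(roles_for(policy, "group:app-team@example.com"))
```

Granting roles to the group rather than to individuals means staff changes only touch group membership, not the policy.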

Service Accounts

image

One of the types of member we can use in an IAM policy is the service account. When you create a project in GCP, a service account is created in that project, called the Compute Engine default service account. Everything that runs in GCP does so with an associated identity and, by default, for most services, it's this service account that provides that identity. The problem is that this service account, again by default, has the project Editor role, which is basically the keys to the kingdom. If your deployed service gets compromised, it could be used to wreak havoc on the rest of your project.
So, instead of using the default account, you should create a custom service account for each deployed service in your stack. Grant only the specific roles that the service needs and nothing else. By following this principle of least privilege, you limit your blast radius in the event of a single component becoming compromised.

This is particularly good practice when dealing with either calls to Google APIs or when you're deploying microservices, or even when you're doing both. Each part of the stack should operate with its own individual identity, which can then be used to allow it to talk to the other parts of the stack that you want it to and only those parts, or the Google APIs that you have granted access to using its unique service account identity.

image

Security Vulnerabilities

Web security scanning and container analysis are two features that can come in really handy and add to your security toolkit when you're developing your applications.

The best way to prevent exploits of your application is to spot the vulnerabilities before you even deploy it, and that's what we can use container analysis for, at least if we're deploying container-based apps.
To use it, enable the Container Scanning API.
Container analysis is part of Artifact Registry, the new version of Container Registry.
Once scanning is enabled, Artifact Registry automatically scans any container pushed to it for a large number of known vulnerabilities, so pushing a container to the repository is all it takes to run the analysis.
Scanning containers for vulnerabilities can be a really useful and important part of your CI/CD process.

Note: scans are not a substitute for manual security checks, secure design, and good security best practices!

Best Practices

Organizational Security

image

Think about your organizational security in advance and how you can implement it using the resource hierarchy of GCP.
The top of the Google Cloud hierarchy is the organization level. The organization is associated with a domain from either Google Workspace or Cloud Identity.
Next we have folders and projects.
Folders are a neat way to combine projects together into logical groups.
One suggestion for this is to use company departments: engineering, HR, finance, and so on.
But folders are also very handy when you have multiple product teams.
Each team and each folder then gets its own set of projects.
Here is where it's recommended that you make the separation of environments.
So, each team gets a dev project, a staging project, and a production project.
Changes to infrastructure can be promoted from project to project using CI/CD tooling.
Separating environments into their own projects reduces the risks of changes in dev or staging interfering with production.
Finally, inside each project, you have the actual cloud resources, such as compute instances, storage, databases, you name it.
Now the purpose of designing your hierarchy like this is that you can apply IAM policies at any level: organization, folder, project, or resource. Where you set this policy establishes your trust boundary.
IAM inheritance moves down through the stack. So, applying a policy to the team B dev project implies that all of the resources inside that project have some level of trust of each other.
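
Inheritance can be sketched as a walk up the hierarchy that collects the roles granted at each level. The hierarchy, members, and bindings below are invented for illustration, and `effective_roles` is a hypothetical helper; real IAM evaluation has more rules than this simple union.

```python
# Hypothetical hierarchy: child -> parent (None marks the organization root).
PARENTS = {
    "example.com": None,
    "team-b": "example.com",
    "team-b-dev": "team-b",
    "team-b-dev/topic-greetings": "team-b-dev",
}

# Hypothetical bindings set directly at each level: node -> {member: roles}.
DIRECT_BINDINGS = {
    "example.com": {"group:security@example.com": {"roles/viewer"}},
    "team-b-dev": {"group:team-b@example.com": {"roles/editor"}},
    "team-b-dev/topic-greetings": {
        "serviceAccount:app@team-b-dev.iam.gserviceaccount.com": {
            "roles/pubsub.publisher"
        }
    },
}


def effective_roles(node, member):
    """Union of roles granted to the member on the node and on every ancestor."""
    roles = set()
    while node is not None:
        roles |= DIRECT_BINDINGS.get(node, {}).get(member, set())
        node = PARENTS[node]
    return roles


# The organization-level grant reaches a resource three levels down.
print(effective_roles("team-b-dev/topic-greetings", "group:security@example.com"))
```

A grant made high in the tree is visible everywhere below it, so the level at which a policy is set determines how far the trust extends.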

Network Security

image

Every VPC network in a project has its own set of firewall rules, and each firewall rule consists of seven components.
The direction -- this can be ingress to control incoming connections or egress to control outgoing connections.
The priority -- this defines the order in which rules are applied.
Rules with the highest priority will override conflicting rules with lower priority. Just remember that the lower the priority number, the higher the actual priority.
The action to take if the rule matches -- this can be either allow or deny.
The status of the rule -- whether or not it is enabled and being enforced.
The target of the rule -- this can be either all of the instances in a VPC or instances specified by network tag or service account.
The source of the rule -- this can be an IP range, a network tag or service account, or a combination of IP range and network tag or service account.
The protocol and port.

Examples.

Ingress

This is an ingress rule, so it controls traffic coming into our project.
Priority number is 1000. That's the default priority number.
The action is allow, so matching traffic is allowed.
Status is enabled for enforcement.
Target is a network tag, so this rule will apply to all compute instances with the network tag HTTP server.
Source is IP range 0.0.0.0/0, which basically means all IPv4 addresses.
Protocol is TCP and the ports to open are 80 and 443 for HTTP and HTTPS connections.
So, this is a pretty standard firewall rule to let traffic into our network to reach our web servers.
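
The way priority decides between rules can be sketched in a few lines. This is a simplified model, not the real firewall engine: it only handles ingress rules with a single target tag and source range, the tag spelling `http-server` is an assumption, and the final fallback stands in for the implied rule that denies ingress when nothing matches.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    priority: int       # lower number means higher priority
    action: str         # "allow" or "deny"
    target_tag: str
    source_range: str
    protocol: str
    ports: tuple
    enabled: bool = True


def evaluate_ingress(rules, instance_tags, source_ip, protocol, port):
    """Return the action of the highest-priority enabled rule that matches."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if not rule.enabled or rule.target_tag not in instance_tags:
            continue
        if ip not in ipaddress.ip_network(rule.source_range):
            continue
        if rule.protocol != protocol or port not in rule.ports:
            continue
        return rule.action
    return "deny"  # nothing matched: ingress is denied by default


rules = [
    Rule("allow-web", 1000, "allow", "http-server", "0.0.0.0/0", "tcp", (80, 443)),
]
print(evaluate_ingress(rules, {"http-server"}, "203.0.113.7", "tcp", 443))
```

Adding a deny rule with priority 800 for a narrower source range would override the allow rule for that range only, because it is evaluated first.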

Egress

This time it's an egress rule to control outgoing network traffic.
Its priority number is 800 and the lower number means it's actually a higher priority, so if it conflicts with other rules that are lower priority, it can override them.
The action here is deny; so, matching traffic will be dropped.
Status is enabled for enforcement.
Target this time is any instance with the tag private.
Source is any instance with the tag HTTP server.
The protocol and port is TCP 8080.
So in this example, you can imagine we're using a firewall rule to specifically deny traffic to a group of compute instances from another group of compute instances.

You can also create hierarchical firewall policies, which are sets of rules deployed at the organization or folder level.
Just like other hierarchical policies, these are then inherited by the projects underneath them.

VPC Service Controls

image

VPC Service Controls allows you to define a service perimeter around protected resources and enforce special rules at the border of this perimeter, which act independently of any existing IAM or firewall policies.
This extra layer of security requires that inbound access is only from authorized VPC networks and that outbound access can only travel to whitelisted IP ranges.
Data is also prevented from being copied outside of the service perimeter with tools like gsutil and bq.
You can also use these VPC service controls in a dry-run mode to simply report where data might be leaving your VPC, rather than prevent it, to help you get a better understanding of traffic in and out of your networks.

Combining IAM, firewall rules, and VPC Service Controls will give you the most protection against unwanted data exfiltration.

Summary:

Use custom service identities and service accounts, and grant granular permissions to only the resources required for your apps. Limit the blast radius of any potential attack.
Treat service account JSON keys as sensitive data. Store them wisely, and never commit them to code.
Cloud Secret Manager is a secure and convenient storage system for API keys, passwords, certificates, and other sensitive data.
To meet certain minimum standards of encryption, or when storing personally identifiable information, you might use Cloud Key Management Service to manage the encryption keys.

Resources

IAM Concepts
OAuth 2.0
Security Command Center
Overview of Web Security Scanner
Cloud KMS
Secret Manager
Google Cloud security best practices center

@shon-button
Contributor Author

shon-button commented Feb 21, 2023

Application Performance Monitoring

Operations Suite

image

Cloud Logging

There are many different types of logs available:
image

Platform logs -- logs emitted by Google managed services, such as Cloud SQL and Cloud Run.
User logs -- logs generated by user-specified services or applications.
Agent logs -- logs collected by the monitoring agents installed on Compute Engine VMs.
Security logs, which come in two types:
Audit logs provide details of administrative changes and accesses to Google Cloud resources.
Access Transparency logs show you when Google staff access your resources.
Multi-cloud logs -- logs ingested from other providers, such as Microsoft Azure or AWS, or from on-premises systems.

Logs are stored in buckets, and retention is based on a couple of factors.
Security logs are stored in a separate bucket, as their retention is pre-configured at 400 days and can't be changed.
Audit logs are always on. They can't be disabled.
Access transparency can be enabled if required, but can then only be disabled by Google support.
All other logs are stored in a default logs bucket, and their retention can be configured for anything from one day to 10 years. The default is 30 days.

Log Routing

All of these log entries, regardless of where they come from, go through the Cloud Logging API.
image

Logs in the Cloud Logging API are sorted by the logs router. The logs router can be configured to send logs to sinks. For example, the default log sink directs logs to the default logs bucket, while custom sinks can be created to send logs to BigQuery, Pub/Sub, or other storage buckets.
The logs router is also responsible for sending any logs-based metrics you have configured to cloud monitoring. But the logs router can also be configured with exclusion rules. You can create up to 50 rules that will allow you to filter what is being logged and potentially save on processing and storage fees. However, you can't exclude mandatory logs, such as security logs.
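
The routing idea can be sketched as a loop over sinks, each with an inclusion filter and optional exclusion rules. The entry fields, sink names, and the `route` helper below are all invented for illustration; real sinks use the Logging query language rather than Python functions.

```python
def route(entry, sinks):
    """Return the destinations that receive the entry."""
    delivered = []
    for sink in sinks:
        if not sink["filter"](entry):
            continue  # the sink's inclusion filter does not match
        if any(excluded(entry) for excluded in sink.get("exclusions", [])):
            continue  # an exclusion rule drops the entry for this sink
        delivered.append(sink["destination"])
    return delivered


# Hypothetical sinks: a default bucket that skips debug noise, and a
# BigQuery sink that only receives errors.
sinks = [
    {
        "destination": "default-log-bucket",
        "filter": lambda entry: True,
        "exclusions": [lambda entry: entry["severity"] == "DEBUG"],
    },
    {
        "destination": "bigquery:errors_dataset",
        "filter": lambda entry: entry["severity"] == "ERROR",
    },
]

print(route({"severity": "ERROR", "message": "boom"}, sinks))
print(route({"severity": "DEBUG", "message": "noise"}, sinks))
```

An entry can be delivered to several sinks at once, and an exclusion on one sink does not affect delivery to the others.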

Log Viewing

image

Most parts of the console have embedded logs via a "Logs" tab.
Following the link from an embedded view opens the Legacy Logs Viewer, the standalone logs viewer in Cloud Logging. It is slowly being deprecated in favor of the Logs Explorer, which is more responsive and has more features for viewing, analyzing, and visualizing log data.

The Logs Explorer easily allows you to build a query using the log fields on the left and visualize the results in a histogram view on the right. You can use the query builder to save queries and run them later.
The Legacy Logs Viewer is still available but does not offer much more than the embedded logs.

Cloud Monitoring

Workspaces

image

A workspace is the first thing you set up, and it's a single place for monitoring resources, not just in your own project, but in other GCP projects as well and even in AWS accounts.
This concept makes Cloud Monitoring slightly different from any other service in your project because a workspace is multi-project. In fact, in large organizations, it's common to run a host project specifically for the monitoring workspace that contains no other monitored resources.

Resources and their metrics

image

Resources are the hardware or software component being monitored -- for example, a Compute Engine disk or instance, a Cloud SQL instance, or a Cloud Tasks queue, or any one of dozens of different resources that can be created in GCP.
Each resource has many different supported metrics that are automatically collected by the platform. For example, we can see things like the utilization of a Compute Engine disk or CPU, the state of a Cloud SQL database, or the number of requests to the Cloud Tasks API. These are just a few examples from the hundreds of combinations of resource and metric available. Luckily, the Metrics Explorer in Cloud Monitoring makes it a bit easier to find exactly what you're looking for.

Custom Metrics

image

If you can't find the data you need with the thousands of built-in metrics offered by GCP, you can create your own custom metrics. For example, you have an app running in Compute Engine that makes calls to a third-party URL as part of its logic. You want to record the latency of those calls and log them as a time series metric, so that later you can set up an alerting policy to let you know if something is going wrong. We can easily add some code to our app to record that latency, but then the fun part is recording it as a metric for Cloud Monitoring.
To record custom metrics, we first need a custom metric descriptor, although this can be created automatically the first time we write a metric if we skip this stage. We use the descriptor to specify the name of our custom metric, along with how it will store data. Usually, our metric kind is gauge, which means the metric provides a single numerical value for a point in time. The other option would be cumulative, which provides a value accumulated over time.
The value type is double, which is a 64-bit floating point number. We also add a description explaining what the metric is for. We can now instruct our app to write metrics directly to the Cloud Monitoring API.
Each metric we send contains the resource type for where it originated -- in this example, a Compute Engine instance -- and then the actual point of data itself. The point is optionally added to a series, which is then sent to the Monitoring API.

There are quite a few moving parts to how custom metrics work, but you only really need to understand them from a theoretical point of view.
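
To tie the parts together, the sketch below assembles one time series entry for the latency example as a plain dictionary. The field names follow my understanding of the Monitoring REST API's time series format and should be verified against the reference; the metric name, instance ID, and zone are made up, and `build_time_series` is a hypothetical helper that builds the payload without sending it.

```python
import datetime


def build_time_series(metric_name, value, instance_id, zone, when=None):
    """Assemble one gauge data point for a custom metric on a Compute Engine VM."""
    if when is None:
        when = datetime.datetime.now(datetime.timezone.utc)
    return {
        # Custom metric types live under the custom.googleapis.com prefix.
        "metric": {"type": "custom.googleapis.com/" + metric_name},
        # The monitored resource the point originated from.
        "resource": {
            "type": "gce_instance",
            "labels": {"instance_id": instance_id, "zone": zone},
        },
        "metricKind": "GAUGE",
        "valueType": "DOUBLE",
        # A gauge point carries a single value at one point in time.
        "points": [
            {
                "interval": {"endTime": when.isoformat()},
                "value": {"doubleValue": float(value)},
            }
        ],
    }


series = build_time_series("third_party_latency", 0.182, "1234567890", "us-central1-a")
print(series["metric"]["type"], series["points"][0]["value"])
```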

Logs-based Metrics

image

You can also use logs-based metrics with Cloud Monitoring, which are simply metrics that originate in log entries.
You can create user-defined metrics based on any application logs picked up by the logging agents, including simple things like HTTP logs or even your own custom logs using filters that you define.

Lab: Install and Configure Monitoring Agent with Google Cloud Monitoring

This lab will guide you in the process of installing the optional Monitoring agent on a pre-created Apache web server. Installing and configuring the Monitoring agent allows us to collect more detailed metrics that would not normally be possible, such as memory utilization and application-specific metrics. In this case, we will collect metrics from our Apache web server application.

Initialize Monitoring Workspace
  1. Click the hamburger menu icon in the upper left to view the Navigation menu.
  2. Click Monitoring (under Operations). It will take about a minute or so for the Workspace to be created.
  3. Once the Workspace is initialized, go to the Compute Engine interface.
  4. In the left-hand menu, click Dashboards.
  5. Click VM Instances.
  6. Click the Overview and Memory tabs. We'll notice there aren't any metrics available.
Install Monitoring Agent on Compute Engine Instance
  1. Click Google Cloud Platform in the upper left part of the console.

  2. In the side navigation menu, click Compute Engine.

  3. In the Connect column, click SSH. (This will open a new browser tab.)

  4. Download the script that adds the agent's package repository:

    curl -sSO https://dl.google.com/cloudagents/add-monitoring-agent-repo.sh
  5. View the files:

    ls

    We should see the add-monitoring-agent-repo.sh file.

  6. Run the script:

    sudo bash add-monitoring-agent-repo.sh
  7. Update our repositories:

    sudo apt update
  8. Install the latest version of the Monitoring agent:

    sudo apt-get install stackdriver-agent -y
  9. Verify the agent is working as expected:

    sudo service stackdriver-agent status

    You should receive feedback that the agent is active and initialization succeeded.

  10. Download the Apache configuration file into the agent's directory (the parentheses are required):

    (cd /opt/stackdriver/collectd/etc/collectd.d/ && sudo curl -O https://raw.githubusercontent.com/Stackdriver/stackdriver-agent-service-configs/master/etc/collectd.d/apache.conf)
  11. Restart the agent:

    sudo service stackdriver-agent restart
  12. Exit SSH:

    exit
Open the Demonstration Web Page and Generate Some Traffic
  1. Back on the VM instances page in the GCP console, click the external IP listed for web-agent-server. This will take us to the web server's website.
  2. Refresh the page a few times to generate traffic.
Confirm Agent Installation and Apache Configuration Success in Monitoring Workspace
  1. Back in the GCP console, click to open the navigation menu.
  2. Under Operations, click Monitoring > Dashboards.
  3. You should see a new dashboard titled Apache HTTP Server.
    • If it does not appear, wait a few minutes and refresh your screen.
  4. Once the Apache HTTP Server dashboard appears, click to view it.
  5. Click the Host tab to view agent-less metrics.
  6. Click the Agent tab to view non-application-specific, agent-collected metrics.
  7. Click the Apache tab to view application-specific metrics.

Uptime Checks

image

An uptime check is simply an HTTP request made by the monitoring platform to your app. Uptime checks work on any HTTP URL, provided it has a fully qualified domain name. The check just looks for a successful HTTP response, but it can also detect expired SSL certificates and can optionally send custom headers and basic authentication.
You can also create an alerting policy alongside an uptime check to notify you of problems with your app via email, SMS, and many other integrations like Slack and PagerDuty.
An uptime check is one of the simplest approaches, but you can also use alerting policies on any other monitored resource, along with specific conditions that should trigger an alert.
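
To make the idea concrete, here is a minimal sketch of what an uptime probe does, written with the Python standard library only. It is not how Cloud Monitoring implements uptime checks; the function names `response_is_healthy` and `probe` and their parameters are made up for illustration.

```python
import urllib.error
import urllib.request


def response_is_healthy(status, body, must_contain=None):
    """Return True for a 2xx status whose body contains the optional marker."""
    if not 200 <= status < 300:
        return False
    return must_contain is None or must_contain in body


def probe(url, timeout=10.0, headers=None, must_contain=None):
    """Send one GET request and report whether the endpoint looks up."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return response_is_healthy(response.status, body, must_contain)
    except (urllib.error.URLError, TimeoutError, OSError):
        # DNS failures, refused connections, timeouts, HTTP error statuses,
        # and TLS problems (such as an expired certificate) count as "down".
        return False
```

A scheduler would call `probe` at a fixed interval and hand repeated failures to an alerting policy.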

Cloud Error Reporting

image

Cloud Error Reporting is powered by Cloud Logging. It aggregates occurrences of errors and provides an easy-to-read stack trace through a dedicated interface, which will hopefully reveal the bug.
Let's say you're running an app in App Engine, which has built-in support for Cloud Error Reporting.
All of your logs are sent to Cloud Logging, but when an error happens, you don't have to search and filter logs for individual occurrences: the errors are highlighted for you in the Error Reporting interface.
Errors are deduplicated and grouped together into an issue, giving you an operational overview of any live bugs within your app, along with a stack trace and list of occurrences.
Cloud Error Reporting supports the most popular programming languages across the application platforms.
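
The grouping idea can be sketched in a few lines of Python. This is only an illustration of deduplicating errors by a fingerprint (exception type plus the innermost frame that raised it). The real grouping logic of Error Reporting is not described in these notes, and `fingerprint`, `group_errors`, `handler`, and `collect` are hypothetical names.

```python
import traceback
from collections import defaultdict


def fingerprint(exc):
    """Group key: exception type plus the innermost frame that raised it."""
    frames = traceback.extract_tb(exc.__traceback__)
    if frames:
        location = f"{frames[-1].name}:{frames[-1].lineno}"
    else:
        location = "unknown"
    return f"{type(exc).__name__}@{location}"


def group_errors(exceptions):
    """Collapse individual occurrences into issues keyed by fingerprint."""
    groups = defaultdict(list)
    for exc in exceptions:
        groups[fingerprint(exc)].append(exc)
    return dict(groups)


def handler(request_id):
    """Toy request handler that fails in two different ways."""
    if request_id % 3 == 0:
        raise ValueError(f"bad request {request_id}")
    if request_id % 5 == 0:
        return {}["missing"]  # raises KeyError
    return "ok"


def collect(count):
    """Run the handler for request ids 1..count and keep every exception."""
    caught = []
    for request_id in range(1, count + 1):
        try:
            handler(request_id)
        except Exception as exc:
            caught.append(exc)
    return caught
```

Calling `group_errors(collect(15))` turns seven individual failures into two issues, each with its own list of occurrences.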

Lab Real-Time Troubleshooting with Google Cloud Error Reporting

You have been tasked with deploying your team's application to App Engine, as a proof-of-concept demo for Platform-as-a-Service technologies that you will present to the rest of your organization. The app works just great, most of the time. For some reason, it also occasionally returns an internal server error, although you can't see any bugs in your code when you run the app locally. In this lab, we will use GCP Error Reporting to see live errors and stack traces from our deployed application, to identify where the error is occurring.

Deploy the Demo Application
  1. Under "APIs & Services," "Library," search for "Cloud Build API." Click "ENABLE" to enable the Cloud Build API.

  2. Inside the GCP console, click the Activate Cloud Shell icon to the right of the search bar.

  3. Click the Continue button.

  4. In Cloud Shell, clone the Git repo from GitHub:

    git clone https://github.com/ACloudGuru-Resources/content-google-certified-pro-cloud-developer
  5. Move into the appengine-buggy directory:

    cd content-google-certified-pro-cloud-developer/appengine-buggy
  6. Deploy the application to App Engine:

    gcloud app deploy
  7. Click the Authorize button.

  8. Enter the numeric choice for the us-east1 region.

  9. Confirm the details to create the application by entering "Y".

    Note:  If the deploy fails, wait a moment, then retry the command. Make sure you use us-east1.

  10. Retrieve the URL of the deployed service:

    gcloud app browse
  11. Click the link to visit the application in your browser, and keep reloading the page until it breaks.

Use Cloud Error Reporting
  1. Return to the tab with the GCP console.
  2. Click the hamburger menu icon (the icon with three horizontal lines) in the top left corner.
  3. Under Compute, click App Engine.
  4. Scroll down to Application Errors to view error information.
  5. Click the Visit Error Reporting link. Alternatively, you can click the hamburger menu icon again and click Error Reporting under Operations.
  6. Click the link to the error under Errors in the last hour.
  7. View the stack trace under Stack trace sample.
Fix the Application and Redeploy
  1. Open Cloud Shell by clicking the Activate Cloud Shell icon to the right of the search bar.

  2. Click the Open Editor button.

  3. Click Open in a new window.

  4. In the left-hand file explorer menu, click the content-google-certified-pro-cloud-developer folder.

  5. Click appengine-buggy.

  6. Click templates.

  7. Click main.py.

  8. In the main.py file, remove the if condition on line 21 of main.py so that sample_text is always set.

  9. Change the text after sample_text = so that it no longer claims to fail some of the time.

  10. Click File in the top left.

  11. Select Save.

  12. Return to the tab with the Cloud Shell terminal open.

  13. Move into the appengine-buggy directory:

    cd content-google-certified-pro-cloud-developer/appengine-buggy
  14. Redeploy the app:

    gcloud app deploy
  15. Confirm the details by entering "y".

  16. Go to the tab containing the link to the application in your browser.

  17. Reload the application by refreshing the page.

  18. Go back to the tab with the Error Reporting console open.

  19. Click the Open dropdown menu near the top right.

  20. Select the Resolved option to mark the error as resolved.

Cloud Debugger

image

Snapshot Break Points

It's common practice when debugging applications to use breakpoints. Depending on your development environment, this normally involves choosing a line number where a break point should occur in a running program, which then halts execution and provides a snapshot of the program's current parameters and variables.
This practice can provide invaluable insights during the development process, helping you to debug what's going wrong when different parts of your application might not be working.
But sometimes bugs don't show up until the code has been deployed to production.
With Cloud Debugger, you can take live snapshots of your application that provide the same information as a break point in a traditional IDE, but without interrupting or affecting the performance of your running app.
Besides snapshots, Cloud Debugger lets us inject logpoints into a running application, which write the live state of the app to Cloud Logging.

Log Points

A logpoint adds a piece of logging code to the running application that logs the values of variables and other expressions to Cloud Logging. The logpoint expires after 24 hours.

Cloud Trace

image

Modern application stacks are often distributed across multiple layers where a frontend service relies on a middleware service, which in turn relies on a backend service.
When a user makes a request, each part of the distributed architecture makes its own request and returns a response, before a response is finally returned to the user.
Now, let's say that your monitoring shows that user requests suffer occasional lags in response time. You seem to have an intermittent issue in your stack, but how do you find it?
Any one of these three different stages may be the problem.
One way to diagnose this is to use distributed tracing. To do this, we instrument our code, which simply means adding libraries to it that collect tracing telemetry and send it to a collector. In this case the collector is Cloud Trace.
The nice thing is that the libraries used for collecting this data are open source.
OpenTelemetry and OpenCensus client libraries are recommended, or you can use Google client libraries to instrument your code. You can then record metrics such as the round trip time of each part of the stack to identify where the problem is.
A trace describes the time it takes an application to complete a single operation. Each trace consists of one or more spans, and a span describes how long it takes to perform a single part of that operation, or sub-operation.
So, in our scenario, a trace describes how long it takes to process the entire incoming request from a user and return a response. But an individual span is just how long it takes to perform the request at that stage of the stack.
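
The trace and span relationship can be sketched without any tracing library. The following is a simplified, standard-library-only model, not the OpenTelemetry or Cloud Trace API: `Trace`, `Span`, and `handle_checkout` are hypothetical names, the spans run one after another instead of nesting, and `time.sleep` stands in for real work.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    duration_ms: float = 0.0


@dataclass
class Trace:
    name: str
    spans: list = field(default_factory=list)

    @contextmanager
    def span(self, name):
        """Time the enclosed block and record it as one span."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            self.spans.append(Span(name, elapsed_ms))

    def slowest(self):
        """Return the span that took the longest."""
        return max(self.spans, key=lambda s: s.duration_ms)


def handle_checkout():
    """One user request passing through three stages of the stack."""
    trace = Trace("GET /checkout")
    with trace.span("frontend"):
        time.sleep(0.01)
    with trace.span("middleware"):
        time.sleep(0.05)  # the simulated intermittent lag
    with trace.span("backend"):
        time.sleep(0.01)
    return trace
```

Here `handle_checkout().slowest().name` points at the middleware stage, which is the kind of answer the trace timeline gives you visually.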

Resources

Cloud Logging Overview
Cloud Monitoring Overview
Create custom metrics
Debug Snapshots
Debug Logpoints
Cloud Trace Setup

@shon-button
Contributor Author

shon-button commented Feb 21, 2023

Exam Prep

Google References:

App Engine

https://cloud.google.com/appengine/docs/the-appengine-environments
https://cloud.google.com/appengine/docs/standard#standard_environment_languages_and_runtimes
https://cloud.google.com/appengine/docs/flexible/dotnet/mapping-custom-domains?hl=fa
https://cloud.google.com/appengine/docs/standard/python/application-security
https://cloud.google.com/appengine/docs/standard/python/splitting-traffic

BigQuery

https://cloud.google.com/bigquery/docs/running-queries#batch

Cloud Build

https://cloud.google.com/cloud-build/docs/configuring-builds/use-community-and-custom-builders#creating_a_custom_builder
https://cloud.google.com/build/docs/build-config-file-schema
https://cloud.google.com/build/docs/automating-builds/create-manage-triggers

Cloud Code

https://cloud.google.com/code/docs

Cloud Functions

https://cloud.google.com/functions/docs/troubleshooting
Runtimes on Cloud Functions include an operating system, software required to execute and/or compile code written for a specific programming language, and software to support your functions.
Google Cloud Functions applies updates to runtimes as the updates are made available by the maintainers of these runtime components. When a component is no longer actively maintained, Cloud Functions may deprecate and, eventually, remove the runtime.
This involves three aspects: a publication of the deprecation date, a deprecation period, and a decommission date. The deprecation date posted below indicates the start of the deprecation period and the decommission date.
During the deprecation period, you can generally continue to create new functions and update existing functions using the runtime. You should use this time to migrate functions that use the deprecated runtime to a more up-to-date runtime.
After the decommission date, you can no longer create new functions or update existing functions using the runtime. You must choose a more up-to-date runtime to deploy your functions. Functions that continue to use a decommissioned runtime may be disabled.
Note: <PROJECT_ID>@appspot.gserviceaccount.com is the default runtime service account for 1st gen. For 2nd gen, the default runtime service account is <PROJECT_NUMBER>-compute@developer.gserviceaccount.com.
The Cloud Functions service uses the Cloud Functions Service Agent service account (service-<PROJECT_NUMBER>@gcf-admin-robot.iam.gserviceaccount.com) when performing administrative actions on your project. By default this account is assigned the Cloud Functions cloudfunctions.serviceAgent role. This role is required for Cloud Pub/Sub, IAM, Cloud Storage and Firebase integrations. If you have changed the role for this service account, deployment fails.

Cloud KMS

https://cloud.google.com/kms/docs/separation-of-duties#using_separate_project

Cloud Profiler

To diagnose why an application runs slower on a Compute Engine instance than when it is tested locally, use Cloud Profiler. Cloud Profiler is a service that analyzes the performance of your application by showing where it spends the most time.
Here are the steps to use Cloud Profiler:
  1. Enable the Cloud Profiler API for your project and add the profiling agent for your language to your application, so that the agent starts together with the application.
  2. Once the Profiler agent is running, it collects performance data from your application and sends it to the Cloud Profiler service.
  3. Use the Cloud Profiler web UI or the Cloud Profiler API to view and analyze the performance data. You can see the CPU and memory usage of your application, as well as the functions that take the longest to execute.
  4. Use this information to identify the specific functions that cause the performance problem, then optimize them to improve performance on the Compute Engine instance.
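
As a local stand-in for the same idea, the sketch below uses Python's built-in `cProfile` and `pstats` modules to find which function takes the most time. This is not the Cloud Profiler agent; `slow_step`, `fast_step`, `handle_request`, and `cumulative_times` are hypothetical names used only for illustration.

```python
import cProfile
import pstats


def slow_step():
    return sum(i * i for i in range(200_000))


def fast_step():
    return sum(range(100))


def handle_request():
    slow_step()
    fast_step()


def cumulative_times(func):
    """Profile one call of func and map function name -> cumulative seconds."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (cc, nc, tottime, cumtime, callers)
    return {key[2]: value[3] for key, value in stats.stats.items()}
```

Sorting the result of `cumulative_times(handle_request)` by value shows `slow_step` near the top, which is the same kind of information a profiler flame graph presents.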

Cloud Run

https://cloud.google.com/blog/products/serverless/knative-based-cloud-run-services-are-ga
https://cloud.google.com/blog/topics/developers-practitioners/3-ways-optimize-cloud-run-response-times

Cloud Spanner

https://cloud.google.com/spanner/docs/data-types

Cloud Security Scanner

https://cloud.google.com/security-command-center/docs/concepts-web-security-scanner-overview

Cloud Storage

https://cloud.google.com/blog/products/storage-data-transfer/uploading-images-directly-to-cloud-storage-by-using-signed-url
https://cloud.google.com/storage/docs/json_api/v1/status-codes#504_Gateway_Timeout
https://cloud.google.com/storage/docs/folders
https://cloud.google.com/storage/docs/hosting-static-website
https://cloud.google.com/storage/docs/request-rate#ramp-up
To upload files from an on-premises virtual machine to Google Cloud Storage as part of a data migration, you should use the command gsutil cp [LOCAL_OBJECT] gs://[DESTINATION_BUCKET_NAME]/

Compute Engine

https://cloud.google.com/compute/docs/instance-groups/updating-migs#opportunistic_updates
https://cloud.google.com/compute/docs/disks/sharing-disks-between-vms#use-multi-instances
https://cloud.google.com/compute/docs/internal-dns#access_by_internal_DNS
https://cloud.google.com/compute/docs/storing-retrieving-metadata#custom
https://cloud.google.com/compute/docs/troubleshooting/vm-startup#identify_the_reason_why_the_boot_disk_isnt_booting
https://cloud.google.com/service-infrastructure/docs/service-metadata/reference/rest#service-endpoint
https://cloud.google.com/sql/docs/mysql/connect-compute-engine#connect-gce-private-ip

Dataflow

https://cloud.google.com/dataflow/docs/concepts/streaming-with-cloud-pubsub
Streaming with Pub/Sub is a conceptual overview of Dataflow's integration with Pub/Sub. The overview describes some optimizations that are available in the Dataflow runner's implementation of the Pub/Sub I/O connector. Pub/Sub is a scalable, durable event ingestion and delivery system. Dataflow complements Pub/Sub's scalable, at-least-once delivery model with message deduplication, exactly-once processing, and generation of a data watermark from timestamped events.
To use Dataflow, write your pipeline using the Apache Beam SDK and then execute the pipeline code on the Dataflow service.
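
To see why deduplication matters with at-least-once delivery, here is a minimal plain-Python sketch. It does not use the Apache Beam SDK and does not reflect how the Dataflow runner actually deduplicates; the `deduplicate` function and the `(message_id, payload)` tuple shape are assumptions for illustration.

```python
def deduplicate(messages):
    """Yield each payload once, even if the same message id is redelivered."""
    seen = set()
    for message_id, payload in messages:
        if message_id in seen:
            continue  # redelivery of a message we already processed
        seen.add(message_id)
        yield payload


# An at-least-once source may deliver message "a" twice.
delivered = [("a", "order created"), ("b", "order paid"), ("a", "order created")]
processed = list(deduplicate(delivered))
```

A real streaming system also has to bound the set of remembered ids, which this sketch leaves out.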

Docker

https://cloud.google.com/build/docs/optimize-builds/speeding-up-builds#using_a_cached_docker_image
https://cloud.google.com/build/docs/optimize-builds/increase-vcpu-for-builds
https://cloud.google.com/architecture/best-practices-for-building-containers#tagging_using_the_git_commit_hash

Firebase

https://firebase.google.com/docs/firestore/manage-data/enable-offline

Firestore

https://cloud.google.com/datastore/docs/firestore-or-datastore

GKE

https://cloud.google.com/istio/docs/istio-on-gke/overview
https://medium.com/google-cloud/how-to-securely-invoke-a-cloud-function-from-google-kubernetes-engine-running-on-another-gcp-79797ec2b2c6
https://cloud.google.com/kubernetes-engine/docs/tutorials/authenticating-to-cloud-platform#use_workload_identity
https://cloud.google.com/kubernetes-engine/docs/how-to/kubernetes-service-accounts
https://cloud.google.com/kubernetes-engine/docs/how-to/encrypting-secrets
https://cloud.google.com/kubernetes-engine/docs/concepts/horizontalpodautoscaler
https://cloud.google.com/kubernetes-engine/docs/concepts/types-of-clusters#regional_clusters
https://cloud.google.com/anthos/run
https://kubernetes.io/docs/concepts/services-networking/ingress/#name-based-virtual-hosting
https://cloud.google.com/kubernetes-engine/docs/concepts/daemonset#usage_patterns
https://cloud.google.com/blog/products/operations/troubleshoot-gke-faster-with-monitoring-data-in-your-logs
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-upgrades
https://cloud.google.com/kubernetes-engine/docs/how-to/node-upgrades-quota

Logging

https://cloud.google.com/logging/docs/routing/overview#logs-retention
https://cloud.google.com/logging/docs/agent/logging/installation
https://cloud.google.com/logging/docs/agent/configuration
https://cloud.google.com/sdk/gcloud/reference/logging/read

Microservices

https://cloud.google.com/architecture/migrating-a-monolithic-app-to-microservices-gke#choosing_an_initial_migration_effort

Miscellaneous

https://cloud.google.com/architecture/modernization-path-dotnet-applications-google-cloud#take_advantage_of_compute_engine
https://www.techonthenet.com/sql/union_all.php
https://open4tech.com/array-vs-linked-list-vs-hash-table/

Monitoring

https://cloud.google.com/monitoring/alerts/concepts-indepth
https://cloud.google.com/blog/products/gcp/drilling-down-into-stackdriver-service-monitoring
https://cloud.google.com/monitoring/api/metrics_agent#agent-memory
https://cloud.google.com/bigquery/docs/monitoring
A good technique to monitor the availability of your application and alert you when it is unavailable would be to use Stackdriver uptime checks.

Stackdriver uptime checks periodically send HTTP or HTTPS requests to a specified URL and verify that the response is received and matches an expected pattern. If an error occurs, such as a timeout or a non-200 response code, it will trigger an alert.

Pub/Sub

https://cloud.google.com/dataflow/docs/concepts/streaming-with-cloud-pubsub
https://cloud.google.com/pubsub/docs/pull

Scanning

https://cloud.google.com/container-analysis/docs/automated-scanning-howto#view-code
https://cloud.google.com/binary-authorization/docs

Security

https://cloud.google.com/iap/docs/concepts-overview
https://cloud.google.com/vpc/docs/serverless-vpc-access#how_it_works
https://cloud.google.com/functions/docs/samples/functions-http-cors

Testing

https://cloud.google.com/architecture/distributed-load-testing-using-gke
https://cloud.google.com/architecture/application-deployment-and-testing-strategies#choosing_the_right_strategy

Troubleshooting

https://cloud.google.com/compute/docs/troubleshooting/troubleshooting-using-serial-console
https://cloud.google.com/debugger/docs/using/logpoints
If you find that your existing monitoring platform is too slow for time critical problems after migrating your applications to Google Cloud Platform, one solution would be to replace your entire monitoring platform with Stackdriver.
For a "Too Many Requests" error, one resolution is to retry the request immediately after first receiving the "Too Many Requests" status code, then incrementally increase the amount of time between retries with each subsequent failure. This strikes a balance between quickly retrying the request and not overwhelming the service with too many requests in a short period of time.
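
The retry strategy described above can be sketched as follows. This is one possible shape under stated assumptions, not an official client: `TooManyRequests` is a stand-in exception for an HTTP 429 response, and `call_with_backoff` with its default delays is hypothetical. The doubling delay and the small random jitter are choices made for this example.

```python
import random
import time


class TooManyRequests(Exception):
    """Stand-in for an HTTP 429 response from the service being called."""


def call_with_backoff(operation, max_attempts=5, base_delay=1.0,
                      max_delay=32.0, sleep=time.sleep):
    """Retry operation: first retry immediately, then wait longer each time."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            if attempt == 0:
                delay = 0.0  # the first retry happens immediately
            else:
                delay = min(max_delay, base_delay * 2 ** (attempt - 1))
                delay += random.uniform(0, delay * 0.1)  # small jitter
            sleep(delay)
```

The `sleep` parameter exists so the waiting can be replaced in tests. Production code would leave it at the default.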

@shon-button
Contributor Author

shon-button commented Mar 3, 2023

Exam: Main Topics Covered

image

https://cloudacademy.com/course/google-professional-cloud-developer-exam-prep-introduction/google-professional-cloud-developer-exam-prep-introduction/?context_id=1335&context_resource=lp

Cloud Computing Fundamentals

image

For most applications, you need three core elements: compute, storage, and networking.
image

Compute Services

Infrastructure-as-a-Service: traditional IT infrastructure components offered as a service.

Compute Engine
One of the most common ways to run applications on GCP is to use virtual machines, or VMs for short. These are machines that run either Linux or Windows. Google’s service for running VMs is called Compute Engine. If you currently have an application running on a Windows or Linux server, then the most straightforward way to migrate it to GCP is to do what’s called a “lift and shift” migration. That is, you simply lift the application from your on-premises server and shift it to a virtual server in the cloud.
image
Note
To create further virtual machines automatically from a saved configuration, create an instance template: a saved configuration that GCP uses to automate the process. Instance templates define the machine type, image, identity tags, service accounts and other instance properties.
For cost savings, you can create preemptible instances in a managed instance group by setting the preemptible option in the instance template before you create or update the group. For further details: https://cloud.google.com/compute/docs/instances/preemptible

OS Login is the GCP feature that lets you use Compute Engine IAM roles to manage SSH access to Linux instances. You can add an extra layer of security by setting up OS Login with two-factor authentication, and manage access at the organization level by setting up organization policies. The steps are:
  1. Enable 2FA for your Google account or domain.
  2. Enable 2FA on your project or instance.
  3. Grant the necessary IAM roles to the correct users.
For further details: https://cloud.google.com/compute/docs/oslogin/setup-two-factor-authentication

Compute Engine offers several types of storage options for your instances. Each of the following storage options has unique price and performance characteristics:

Zonal persistent disk: Efficient, reliable block storage.
Regional persistent disk: Regional block storage replicated in two zones.
Local SSD: High performance, transient, local block storage.
Cloud Storage buckets: Affordable object storage.
Filestore: High performance file storage for Google Cloud users.
If you are not sure which option to use, the most common solution is to add a persistent disk to your instance.

Managed instance groups (MIGs) let you operate apps on multiple identical VMs. You can make your workloads scalable and highly available by taking advantage of automated MIG services, including: autoscaling, autohealing, regional (multiple zone) deployment, and automatic updating.

Platform-as-a-Service

App Engine
App Engine lets you host web and mobile applications without having to worry about the underlying infrastructure. After creating an App Engine app, you can just upload your code to it and let GCP take care of the details. It even scales the underlying resources up and down automatically. For example, if your app isn’t getting any traffic, App Engine will scale the number of underlying VMs down to zero, and you won’t get charged until your app starts getting traffic again.
In most cases, using App Engine is a better solution than using virtual machines, but there are times when it makes more sense to use VMs. For example, if you have an application that’s not a web or mobile app, then you can’t use App Engine, so you’ll have to use a VM.
Here is the hierarchical organization of an App Engine Application.

image

Each App Engine application is a top-level container that includes the service, version, and instance resources.
  • Services: App Engine services behave like microservices. You can run your whole app in a single service, or you can design and deploy multiple services to run as a set of microservices. For example, you may divide a big app into services that handle mobile and web requests and specialized backends.
  • Versions: each service can independently switch between different versions for rollbacks, testing, or other temporary events. You can route traffic to one or more specific versions of your app by migrating or splitting traffic.
  • Instances: the versions within your services run on one or more instances.
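
Traffic splitting between versions can be pictured as a weighted, sticky choice per client. The sketch below is an assumption-laden illustration, not App Engine's actual routing algorithm: `pick_version`, the `split` mapping, and the use of a SHA-256 hash of a client key are all hypothetical.

```python
import hashlib


def pick_version(client_key, split):
    """Map a client to a version; the same client always gets the same one.

    split maps version name -> share of traffic, with shares summing to 1.0.
    """
    if not split:
        raise ValueError("split must contain at least one version")
    digest = hashlib.sha256(client_key.encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2 ** 64  # uniform in [0, 1)
    cumulative = 0.0
    for version, weight in split.items():
        cumulative += weight
        if point < cumulative:
            return version
    return version  # guard against floating-point rounding in the shares
```

With `{"v1": 0.9, "v2": 0.1}`, roughly one client in ten lands on `v2`, and a given client key keeps seeing the same version across requests.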

By default, App Engine scales your app to match the load. Your apps will scale up the number of instances that are running to provide consistent performance, or scale down to minimize idle instances and reduce costs. For more information about instances, see How Instances are Managed. For further details: https://cloud.google.com/appengine/docs/standard/python/an-overview-of-app-engine

App Engine, in both its standard and flexible environments, is well suited to building microservices. In an App Engine project you can use any mix of standard and flexible environment services, written in any language. In addition, Cloud Endpoints lets you deploy, protect, and monitor your APIs. Using an OpenAPI Specification or API frameworks, Cloud Endpoints provides tools for API development and gives insight through Stackdriver Monitoring, Trace, and Logging. Cloud Functions with an HTTPS endpoint is not enough for enterprise integrated projects.

Container-as-a-Service

Containers are self-contained software environments. For example, a container might include a complete application plus all of the third-party packages it needs. Containers are somewhat like virtual machines except they don’t include the operating system. This makes it easy to deploy them because they’re very lightweight compared to virtual machines. In fact, containers run on virtual machines.
image

Cloud Run
The simplest way to run containers is to use Cloud Run. This service lets you run a container using a single command.

GKE
If you have a more complex application that involves multiple containers, then you’ll probably want to use Google Kubernetes Engine (or GKE for short), which is a container orchestrator. It makes it easy to deploy and manage multi-container applications.
Google Kubernetes Engine (GKE) is a managed, production-ready environment for deploying containerized applications.
Kubernetes provides: automatic management, monitoring and liveness probes for application containers, automatic scaling, rolling updates.
The correct steps to deploy a containerized application with Google Kubernetes Engine are:

  • Create a GKE cluster: A cluster consists of at least one cluster master machine and multiple worker machines called nodes. Nodes are Compute Engine virtual machine (VM) instances that run the Kubernetes processes necessary to make them part of the cluster.
  • Get authentication credentials to interact with the cluster: gcloud container clusters get-credentials cluster-name
  • Deploy an application to the cluster: Kubernetes provides the Deployment object for deploying stateless applications like web servers.
  • Expose the deployment to the internet: create a Service, a Kubernetes resource that defines rules and load balancing and exposes your application to external traffic so that users can access it.
    For further details: https://cloud.google.com/kubernetes-engine/docs/quickstart

To deploy an application from a Kubernetes Deployment file, use gcloud or Deployment Manager to create a cluster, then use kubectl to create the deployment.
https://cloud.google.com/kubernetes-engine/docs/tutorials/hello-app

GKE’s cluster autoscaler automatically resizes the number of nodes in a given node pool, based on the demands of your workloads. You don’t need to manually add or remove nodes or over-provision your node pools. Instead, you specify a minimum and maximum size for the node pool, and the rest is automatic.

If your node pool contains multiple managed instance groups with the same instance type, cluster autoscaler attempts to keep these managed instance group sizes balanced when scaling up. This can help prevent an uneven distribution of nodes among managed instance groups in multiple zones of a node pool. Cluster autoscaler considers the relative cost of the instance types in the various pools, and attempts to expand the least expensive possible node pool. The reduced cost of node pools containing preemptible VMs is taken into account.

Vertical pod autoscaling (VPA) is a feature that can recommend values for CPU and memory requests and limits, or it can automatically update the values. With vertical pod autoscaling:
  • Cluster nodes are used efficiently, because Pods use exactly what they need.
  • Pods are scheduled onto nodes that have the appropriate resources available.
  • You don’t have to run time-consuming benchmarking tasks to determine the correct values for CPU and memory requests.
  • Maintenance time is reduced, because the autoscaler can adjust CPU and memory requests over time without any action on your part.

With GKE you don’t have to use the scalability features of Compute Engine.
To perform an update to the application with minimal downtime on Google Kubernetes Engine (GKE), you can use a rolling update strategy, which updates the application incrementally, replacing pods a few at a time and checking that the updated pods are functioning properly before moving on to the next set. For example, changing the container image of a Deployment triggers a rolling update:

    kubectl set image deployment/echo-deployment echo=<new_image_tag>

For further details:
https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
https://cloud.google.com/kubernetes-engine/docs/concepts/scalability

You may notice that pods are similar to Compute Engine managed instance groups. A key difference is that pods are for executing applications in containers and may be placed on various nodes in the cluster, while managed instance groups all execute the same application code on each of the nodes. Also, you typically manage instance groups yourself by executing commands in Cloud Console or through the command line. Pods are usually managed by a controller.
Since pods are ephemeral and can be terminated by a controller, other services that depend on pods should not be tightly coupled to particular pods. For example, even though pods have unique IP addresses, applications should not depend on that IP address to reach an application. If the pod with that address is terminated and another is created, it may have another IP address. The IP address may be re-assigned to another pod running a different container.
Kubernetes provides a level of indirection between applications running in pods and other applications that call them: it is called a service. A service, in Kubernetes terminology, is an object that provides API endpoints with a stable IP address that allow applications to discover pods running a particular application. Services update when changes are made to pods, so they maintain an up-to-date list of pods running an application.
For further details:
https://cloud.google.com/kubernetes-engine/docs/concepts/kubernetes-engine-overview
https://cloud.google.com/kubernetes-engine/docs/concepts/node-pools
https://cloud.google.com/compute/docs/instance-groups/adding-an-instance-group-to-a-load-balancer
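
The indirection that a Service provides can be modelled in a few lines. This is a toy model, not the Kubernetes API: `Pod`, `Cluster`, `endpoints`, and the IP addresses are made up, and the label-selector matching is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    ip: str
    labels: dict


class Cluster:
    """Toy cluster that tracks running pods by name."""

    def __init__(self):
        self.pods = {}

    def start_pod(self, pod):
        self.pods[pod.name] = pod

    def terminate_pod(self, name):
        del self.pods[name]

    def endpoints(self, selector):
        """What a Service does: resolve a label selector to current pod IPs."""
        return sorted(
            pod.ip
            for pod in self.pods.values()
            if all(pod.labels.get(k) == v for k, v in selector.items())
        )


cluster = Cluster()
web = {"app": "web"}
cluster.start_pod(Pod("web-1", "10.0.0.4", {"app": "web"}))
before = cluster.endpoints(web)

# The controller replaces the pod; the new one gets a different IP.
cluster.terminate_pod("web-1")
cluster.start_pod(Pod("web-2", "10.0.0.9", {"app": "web"}))
after = cluster.endpoints(web)
```

Callers keep using the same selector (in Kubernetes, the same Service name and stable IP) while the set of pod addresses behind it changes.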

What is Google Kubernetes Engine (GKE)?

image
image
image
image
image
image
image
image
image

Function-as-a-Service

Cloud Functions
Cloud Functions is used to deploy individual functions. Cloud Functions is event-driven, which means the function gets executed when a particular event occurs. For example, you could configure a function to be triggered whenever a new file is uploaded to a particular storage location.
image

Storage Options (When is the right use case for each):

image

Files

Cloud Storage: flat unstructured
The simplest one is called Cloud Storage. It’s referred to as object storage, but really it’s just a collection of files. It’s not like a normal filesystem, though, because it doesn’t have a hierarchical folder structure. It has a flat structure. It’s typically used for unstructured data, such as images, videos, and log files.
One of the great things about it is that it has multiple storage classes: Standard, Nearline, Coldline, and Archive. Standard is for frequently accessed files. Nearline is for files you expect to access only about once a month or less. The advantage is that it costs less than Standard as long as you don’t access it frequently. Coldline is for files you expect to access at most once every three months. Archive is for files you expect to access less than once a year. It has the lowest cost.
All four storage classes give you immediate access to your files. This is different from some other cloud providers where the lowest cost storage can take hours to access.
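
The rule of thumb above can be written as a small helper. This is a sketch based only on the access frequencies listed in these notes. `suggest_storage_class` is a hypothetical function, and real decisions should also weigh retrieval costs and minimum storage durations.

```python
def suggest_storage_class(expected_days_between_reads):
    """Pick a Cloud Storage class from how rarely the data is read."""
    if expected_days_between_reads >= 365:
        return "ARCHIVE"    # accessed less than once a year
    if expected_days_between_reads >= 90:
        return "COLDLINE"   # accessed at most once every three months
    if expected_days_between_reads >= 30:
        return "NEARLINE"   # accessed about once a month or less
    return "STANDARD"       # frequently accessed
```

For example, daily-read assets map to `STANDARD`, while yearly compliance backups map to `ARCHIVE`.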
image
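The access-frequency guidance above can be turned into a small decision helper. This is only a sketch: the thresholds are read directly from the text (monthly, quarterly, yearly), the function name is hypothetical, and a real choice would also weigh retrieval fees and minimum storage durations.

```python
def suggest_storage_class(expected_accesses_per_year: float) -> str:
    """Map an expected access frequency to a Cloud Storage class name."""
    if expected_accesses_per_year < 1:
        return "ARCHIVE"    # accessed less than once a year
    if expected_accesses_per_year <= 4:
        return "COLDLINE"   # at most once every three months
    if expected_accesses_per_year <= 12:
        return "NEARLINE"   # about once a month or less
    return "STANDARD"       # frequently accessed


for rate in (0.5, 4, 12, 365):
    print(rate, suggest_storage_class(rate))
```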
Note that when you create a bucket you must declare whether it is regional or multi-regional, and you cannot change this afterwards. All other transitions are allowed.

Cloud Storage is the only option for multi-regional object storage. For better performance, it is advisable to use it together with Cloud CDN.

A Cloud Storage trigger enables a function to be called in response to changes in Cloud Storage. When you specify a Cloud Storage trigger for a function, you choose an event type and specify a Cloud Storage bucket. Your function will be called whenever a change occurs on an object (file) within the specified bucket.
The following Cloud Storage event types are supported:

  • Object finalized
  • Object deleted
  • Object archived
  • Object metadata updated

Dataflow provides a Cloud Storage Text to BigQuery (Stream) template. This pipeline streams text files stored in Cloud Storage, transforms them using a JavaScript User Defined Function (UDF) that you provide, and outputs the result to BigQuery.

Filestore: hierarchical, NFS-compatible file shares
image

Databases

Relational, Transactional:

Cloud SQL: fully managed, but hard to scale to high volume, high speed data
If you’re currently using MySQL, PostgreSQL, or Microsoft SQL Server, then Cloud SQL is your best bet. It’s a fully-managed service for each of those three database systems. These are all relational databases that are suitable for online transaction processing.
The problem with relational databases is that it’s very difficult to scale them to handle high-volume, high-speed data.
image
SQL databases use declarative statements that specify what data you want to retrieve. If you want to understand how the database obtains the results, you should look at execution plans. A query execution plan displays the cost associated with each step of the query. Using those costs, you can debug query performance issues and optimize your query.
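To make the idea of reading an execution plan concrete, here is a small sketch that uses Python's built-in `sqlite3` module as a local stand-in. It is not Cloud SQL: SQLite's `EXPLAIN QUERY PLAN` reports the access path (scan or index search) rather than numeric costs, whereas MySQL and PostgreSQL have their own `EXPLAIN` output. The table, index, and helper names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

query = "SELECT total FROM orders WHERE customer_id = 42"


def plan_details(connection, sql):
    """Return the human-readable detail text of each plan step."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# Without an index the engine has to scan the whole table.
plan_before = plan_details(conn, query)

# After adding an index on the filtered column, the plan should use it.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan_details(conn, query)

print(plan_before)
print(plan_after)
```

Comparing the two plans shows how a single schema change alters the way the same declarative query is executed.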

To avoid a single point of failure (SPOF), you have to use a managed database service or manage replicas yourself.
Cloud SQL is a managed service that handles high availability and failover out of the box. The alternative is to create transactional or merge replicas: a transactional replica keeps databases in sync at the transaction level, while a merge replica keeps them in sync at checkpoint times. For any further detail: https://cloud.google.com/sql/docs/mysql/ https://en.wikipedia.org/wiki/Distributed_database

Cloud Spanner: massively scalable, but more expensive, and a rewrite is needed when coming from a legacy SQL database
Cloud Spanner is a unique database because it seems to combine the best of both worlds. It’s a relational database that’s massively scalable.
Cloud Spanner is a scalable, enterprise-grade, globally distributed, and strongly consistent relational database built for the cloud. It combines the benefits and consistency of traditional databases with non-relational horizontal scale. Cloud Spanner uses industry-standard ANSI 2011 SQL for queries and has client libraries for many programming languages, including Python, JavaScript (Node.js), Go, Java, PHP, and Ruby.

image

Cloud Spanner provides a special kind of consistency, called external consistency. We are used to dealing with strong consistency, which ensures that, after an update, all queries receive the same result. In other words, the state of the database is always consistent, no matter how processing, partitions, and replicas are distributed. The problem with a global, horizontally scalable database such as Spanner is that transactions are executed across many distributed instances, so strong consistency is really difficult to guarantee. Spanner achieves it by means of TrueTime, a distributed clock available across Google’s computing systems. With TrueTime, Spanner manages the serialization of transactions and thereby achieves external consistency, which is the strictest concurrency-control guarantee for databases.

In Cloud Spanner it is necessary to be careful not to create hotspots with the choice of your primary key. For example, if you insert records with a monotonically increasing integer as the key, you’ll always insert at the end of your key space. This is undesirable because Cloud Spanner divides data among servers by key ranges, which means your inserts will be directed at a single server, creating a hotspot. The following techniques can spread the load across multiple servers and avoid hotspots:

  • Hash the key and store it in a column. Use the hash column (or the hash column and the unique key columns together) as the primary key.
  • Swap the order of the columns in the primary key.
  • Use a Universally Unique Identifier (UUID). Version 4 UUID is recommended, because it uses random values in the high-order bits. Don’t use a UUID algorithm (such as version 1 UUID) that stores the timestamp in the high-order bits.
  • Bit-reverse sequential values.
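Three of the key techniques mentioned above (a hash prefix, a version 4 UUID, and bit reversal) can be sketched in plain Python. This is illustrative only: the function names, the shard count of 16, and the 64-bit width are assumptions made for the example, and none of this calls the Cloud Spanner API.

```python
import hashlib
import uuid


def hashed_key(sequential_id: int, shards: int = 16) -> tuple:
    """Prefix a sequential id with a hash-derived shard number."""
    digest = hashlib.sha256(str(sequential_id).encode()).digest()
    shard = digest[0] % shards
    return (shard, sequential_id)


def bit_reversed_key(sequential_id: int, width: int = 64) -> int:
    """Reverse the bits of a non-negative id within a fixed width."""
    bits = format(sequential_id, f"0{width}b")
    return int(bits[::-1], 2)


def random_key() -> str:
    """Version 4 UUIDs are random, so they carry no timestamp prefix."""
    return str(uuid.uuid4())


# Consecutive ids no longer produce neighbouring keys.
for i in (1, 2, 3):
    print(i, hashed_key(i), bit_reversed_key(i), random_key())
```

In each case consecutive inserts land in different parts of the key space, so they are not all directed at the server that owns the end of the range.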

NoSQL:
Bigtable
Firestore/Datastore
Firebase
Memorystore
Google offers many NoSQL databases, including Bigtable, Firestore, Firebase Realtime Database, and Memorystore. Bigtable is best for running large analytical workloads. Firestore is ideal for building client-side mobile and web applications. Firebase Realtime Database is best for syncing data between users in real time, such as for collaboration apps. Memorystore is an in-memory datastore that’s typically used to speed up applications by caching frequently requested data.
image

Bigtable
Bigtable is a petabyte-scale NoSQL wide-column database. Wide-column databases store tables that can have a large and variable number of columns, which may be grouped into families.

Cloud Bigtable is a sparsely populated table with 3 dimensions (row, column, time) that can scale to billions of rows and thousands of columns, enabling you to store terabytes or even petabytes of data and to access it at very low latency (typically single-digit milliseconds). Each row is indexed by a single value, known as the row key, which makes Cloud Bigtable ideal for storing very large amounts of single-keyed data. It supports high read and write throughput, and it is an ideal data source for MapReduce operations. Columns that are related to one another are typically grouped together into a column family. Each column is identified by a combination of the column family and a column qualifier, which is a unique name within the column family. Each row/column intersection can contain multiple cells, or versions, at different timestamps, providing a record of how the stored data has been altered over time. Cloud Bigtable tables are sparse; if a cell does not contain any data, it does not take up any space. Cloud Bigtable scales in direct proportion to the number of machines in your cluster without any bottleneck.
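The three-dimensional model (row key, column, timestamp) may be easier to picture as nested dictionaries. The class below is a toy in-memory sketch with invented names. It is not the Bigtable client library and ignores storage, sorting of row keys, and garbage collection of old versions.

```python
from collections import defaultdict


class SparseTable:
    """Toy model: row key -> 'family:qualifier' -> versioned cells."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def write(self, row_key, family, qualifier, value, timestamp):
        cells = self._rows[row_key][f"{family}:{qualifier}"]
        cells.append((timestamp, value))
        # Keep the newest version first.
        cells.sort(key=lambda cell: cell[0], reverse=True)

    def read_versions(self, row_key, family, qualifier):
        row = self._rows.get(row_key, {})
        return list(row.get(f"{family}:{qualifier}", []))

    def read_latest(self, row_key, family, qualifier):
        versions = self.read_versions(row_key, family, qualifier)
        return versions[0][1] if versions else None


table = SparseTable()
table.write("user#1", "stats", "clicks", 3, timestamp=100)
table.write("user#1", "stats", "clicks", 5, timestamp=200)
print(table.read_latest("user#1", "stats", "clicks"))
print(table.read_versions("user#1", "stats", "clicks"))
```

Cells that were never written simply do not exist in the dictionaries, which mirrors the point that empty cells take up no space.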

Datastore/Firestore

Datastore manages relationships between entities (records) in a hierarchically structured space similar to the directory structure of a file system. When you create an entity, you can optionally designate another entity as its parent; the new entity is a child of the parent entity. An entity without a parent is a root entity. A transaction is a set of Datastore operations on one or more entities in up to 25 entity groups. Each transaction is guaranteed to be atomic, which means that transactions are never partially applied. Either all of the operations in the transaction are applied, or none of them are applied.

Data warehouse
BigQuery: aggregates data good for analytics (OLAP)
If you need a data warehouse, then BigQuery is the right solution. It’s something you use after data is collected, rather than being a transactional system. It’s best suited to aggregating data from many sources and letting you search it using SQL queries. In other words, it’s good for OLAP (that is, Online Analytical Processing) and business intelligence reporting.
Because BigQuery is an OLAP engine, it works best with denormalized data, even though it can handle normalized data and joins. In addition, BigQuery can manage nested and repeated columns and structures, as required. BigQuery can quickly analyze gigabytes to petabytes of data using ANSI SQL.
image
With BigQuery you can run SQL queries on external data from Cloud SQL, Cloud Storage, and Google Drive. An external data source (also known as a federated data source) is a data source that you can query directly even though the data is not stored in BigQuery. Instead of loading or streaming the data, you create a table that references the external data source. To query an external data source without creating a permanent table, you run a command that combines a query with one of the following:

  • A table definition file
  • An inline schema definition
  • A JSON schema definition file

The table definition file or supplied schema is used to create the temporary external table, and the query runs against the temporary external table. Querying an external data source using a temporary table is supported by the bq command-line tool and the API.

Networking

VPC

When you create a virtual machine on GCP, you have to put it in a Virtual Private Cloud, or VPC. A VPC is very similar to an on-premises network. Each virtual machine in a VPC gets an IP address, and it can communicate with other VMs in the same VPC.
image

You can also divide a VPC into subnets and define routes to specify how traffic should flow between them.
image
To restrict communication between VM instances within a VPC without relying on static IP addresses or subnets, you can use firewall rules based on network tags attached to the instances. The rules specify which tagged instances are allowed to communicate with each other and on which protocols and ports. You attach the relevant network tags to the instances when they are created.

By default, firewall rules allow all outbound traffic from a VM. If you also want to allow inbound traffic from the Internet, then you need to assign an external IP address to the VM and create a firewall rule that allows the traffic.
image

If you want VMs in one VPC to be able to communicate with VMs in another VPC, then you can connect the VPCs together using VPC Network Peering.
image

If you want to create a secure connection between a VPC and an on-premises network, then you can use Cloud VPN (Virtual Private Network), Cloud Interconnect, or Peering. A VPN sends encrypted traffic over the public Internet, whereas Cloud Interconnect and Peering communicate over a private, dedicated connection between your site and Google’s network. Cloud Interconnect is much more expensive than a VPN, but it provides higher speed and reliability since it’s a dedicated connection. Peering is free, but it’s not well-integrated with GCP, so you should usually use Cloud Interconnect instead.
image

Global networking services

CDN

One way to make your web applications respond more quickly to your customers is to use a Content Delivery Network. Google offers Cloud CDN for this purpose. It caches your content on Google’s global network, which reduces the time it takes for your users to retrieve it, no matter where they’re located in the world. This is especially important if your content includes videos.
image

Cloud Load Balancing

To make sure your application continues to be responsive when there’s a sudden increase in traffic, or even if one of Google’s data centers fails, you can use Cloud Load Balancing. It redirects application traffic to groups of VM instances distributed in different locations, and it can automatically scale the number of instances up or down as needed. All of this complexity is hidden behind a single IP address.
image

Cloud Armor

Load Balancing works well for normal increases in network traffic, but what about when you’re hit by a Distributed Denial of Service, or DDoS, attack? You can use Cloud Armor, which integrates with Cloud Load Balancing.
image

IAM

One of the most important layers of security in GCP is IAM, which stands for Identity and Access Management. Since identity is handled using an outside service, such as Cloud Identity or even Google accounts, IAM is really about access management. It lets you assign roles to users and applications. A role grants specific permissions, such as being able to create a VM instance.
image
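The relationship between members, roles, and permissions can be sketched as a lookup. This is a toy in-memory model, not the IAM API: the permission sets are a small illustrative subset of what the named roles grant, and the members and the `is_allowed` helper are hypothetical.

```python
# Illustrative subset only: real roles contain many more permissions.
ROLE_PERMISSIONS = {
    "roles/compute.viewer": {
        "compute.instances.get",
        "compute.instances.list",
    },
    "roles/compute.instanceAdmin.v1": {
        "compute.instances.get",
        "compute.instances.list",
        "compute.instances.create",
        "compute.instances.delete",
    },
}

# Hypothetical bindings of members (users or service accounts) to roles.
BINDINGS = {
    "user:alice@example.com": {"roles/compute.viewer"},
    "serviceAccount:deployer@example-project.iam.gserviceaccount.com": {
        "roles/compute.instanceAdmin.v1"
    },
}


def is_allowed(member: str, permission: str) -> bool:
    """Return True if any role bound to `member` grants `permission`."""
    roles = BINDINGS.get(member, set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed("user:alice@example.com", "compute.instances.create"))
```

Permissions are never attached to a member directly. Access always goes through a role, which is why granting the narrowest suitable role is the usual way to apply least privilege.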

Encryption

Another important security area is encryption. GCP handles this very well because everything is encrypted by default. However, many organizations need to manage the encryption keys that are used to encrypt their data, especially to comply with certain security standards.

image

Google provides Cloud Key Management Service (Cloud KMS) so that your organization can centrally manage its encryption keys. It also integrates with other Google Cloud services, which lets enterprises use those keys to implement cryptographic functions.

A similar service is Secret Manager, which is a central place to store your API keys, passwords, certificates, and other secrets. https://cloud.google.com/kms/ https://cloud.google.com/secret-manager/docs/

A hardware security module (HSM) is a physical computing device that stores and manages digital keys for strong authentication and provides crypto-processing. HSMs are usually plug-in cards or external devices that attach directly to a computer or network server. Cloud HSM is a managed HSM service that is fully integrated with KMS for creating and using customer-managed encryption keys. It is necessary only in special cases where an additional, hardware-enforced level of security is required. https://cloud.google.com/hsm/

Finally, the Data Loss Prevention service helps you protect sensitive data.
DLP uses information types—or infoTypes—to define what it scans for. An infoType is a type of sensitive data, such as name, email address, telephone number, identification number, or credit card number. For example, if your user records contain credit card numbers, you could configure DLP to remove them before responding to a database query.
Every infoType defined in Cloud DLP has a corresponding detector. Cloud DLP uses infoType detectors in the configuration for its scans to determine what to inspect for and how to transform findings. InfoType names are also used when displaying or reporting scan results. There is a large set of built-in infoTypes, and it is also possible to create custom infoType detectors.
image
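To show what "detect an infoType, then transform the finding" means in practice, here is a deliberately simplified sketch using regular expressions. It does not call the Cloud DLP API, and the patterns are far cruder than real detectors. Only the infoType names `EMAIL_ADDRESS` and `CREDIT_CARD_NUMBER` are taken from DLP; everything else is an assumption for illustration.

```python
import re

# Very rough stand-ins for infoType detectors.
DETECTORS = [
    ("EMAIL_ADDRESS", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
    ("CREDIT_CARD_NUMBER", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
]


def redact(text: str) -> str:
    """Replace each finding with its infoType name in brackets."""
    for info_type, pattern in DETECTORS:
        text = pattern.sub(f"[{info_type}]", text)
    return text


record = "Contact bob@example.com, card 4111 1111 1111 1111 on file."
print(redact(record))
```

Replacing a finding with its infoType name is only one possible transformation. Masking or tokenizing the value would follow the same detect-then-transform pattern.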

Interacting with GCP

There are many ways to interact with GCP. The Google Cloud Console runs in a browser, so you don’t need to install anything to use it. Alternatively, you can install the SDK, which stands for Software Development Kit. The SDK includes two types of tools. The first is what you’d expect in an SDK: a collection of client libraries that your applications can use to interact with GCP services. The second is a set of command-line tools, including gcloud, gsutil, bq, and kubectl. The one you’ll use the most is gcloud, which is for managing all services other than Cloud Storage, BigQuery, and Kubernetes.

It’s pretty easy to use the Google Cloud Console to create GCP resources, but if you know how to use the command-line interface, you can usually create resources more quickly, often with a single command. For example, to create a virtual machine called instance-1 in the us-central1-a zone, all you need to do is type “gcloud compute instances create instance-1 --zone=us-central1-a”. This will create the instance using defaults for everything. To specify particular options, you can just add them to the command.

Another option is Cloud Shell, a very small virtual machine that you can use to run commands. It already has the Cloud SDK installed, so you don’t need to install it yourself.

Migrating to GCP

Migrate VMware
Most organizations that start using GCP already have on-premises systems, and they typically want to move some of the applications running on these systems to the cloud. Google provides a number of services to help with this.
The most popular on-premises virtualization platform is, of course, VMware. The easiest way to migrate VMware workloads is to run them in Google Cloud VMware Engine, which is a complete VMware environment that runs on GCP.

image

Migrate for Compute Engine
If you’d rather run on Google’s standard compute services instead, then there are a couple of great options. First, Migrate for Compute Engine offers a very sophisticated way to migrate your local virtual machines from VMware to GCP. This service runs an instance on GCP while using the data attached to your local VM. It transfers the data it needs to GCP as it goes. Running the application on GCP before all of the data has been transferred makes the migration process much faster than other methods. Once the data transfer is finished, the instance reboots, and the migration is complete. In addition to VMware, Migrate for Compute Engine also supports physical machines and even VMs running on Amazon Web Services and Microsoft Azure.
image

Migrate for Anthos
If you want to move from virtual machines to containers, then you can use Migrate for Anthos. It will actually convert a VM into a container that’s managed by Google Kubernetes Engine. It currently supports Linux VMs running on VMware, AWS, or Azure. It also supports migrating both Linux and Windows VMs from Google Compute Engine. So, if you want to migrate from Windows or from physical servers, then you can use Migrate for Compute Engine to move them to GCP instances, and then you can use Migrate for Anthos to move them to containers.
image

Migrate Large Data
One common issue is how to move a large amount of data to GCP. Trying to transfer hundreds of terabytes of data over the internet or even Dedicated Interconnect would be slow and expensive. Google’s solution is called a Transfer Appliance. It’s a physical storage server that Google ships to your data center. Then you transfer your data to the appliance and ship it back to Google where it gets transferred to Cloud Storage.

Migrate Active Directory
One of the most important considerations when you’re moving to GCP is how to ensure that access to your resources is only given to the people who should have it. The most commonly used on-premises identity solution is Microsoft’s Active Directory, so as you would expect, Google has provided a way to integrate with it. Managed Service for Microsoft Active Directory lets you connect your on-premises Active Directory to one that’s hosted on GCP. This allows you to authenticate your users with your existing directory.
If you aren’t using Active Directory, then you can use Cloud Identity instead, a Google cloud-based identity solution.

image

Data Analytics

Google offers many data analytics services, which can be divided into four categories: Ingest, Store, Process, and Visualize.
image

Ingest

There are lots of ways to ingest data, but if you have a large amount of data streaming in, then you’ll likely need to use Pub/Sub. It essentially acts as a buffer for services that may not be able to handle such large spikes of incoming data.
image

Store

In the Store category, the main option for interactive analytics, that is, running queries on your data, is BigQuery. If you need high-speed automated analytics, then Bigtable is usually the right choice.
image

Process

The Process category is where Google has the most options. These services are used to clean and transform data. If you already have Hadoop or Spark-based code, then you can use Dataproc, which is a managed implementation of Hadoop and Spark. Alternatively, if you already have Apache Beam-based code, then you can use Dataflow. If you’re starting from scratch, you might want to choose Dataflow because Apache Beam has some advantages over Hadoop and Spark. If you’d like to do data processing without writing any code, you can use Dataprep, which actually uses Dataflow under the hood.
image

Visualize

To visualize or present your data with graphs, charts, etc., you can use Data Studio or Looker. Data Studio was Google’s original visualization solution, but then Google acquired Looker, which is a more sophisticated business intelligence platform. One big difference is that Data Studio is free, but Looker isn’t. So, if you need to do simple reporting, then Data Studio should be fine, but if you want to do something more complex, then Looker is your best bet.

Processing pipelines

Cloud Composer/Data Fusion
If you want to create a processing pipeline that runs tasks in multiple GCP services, then you can use Composer, which is a managed implementation of Apache Airflow. Not only can it run tasks in GCP services like Pub/Sub, Dataflow, Dataproc, and BigQuery, it can even run tasks in on-premises environments. Data Fusion is similar to Composer except that it has a graphical interface and doesn’t require you to write any code.
image

IoT Core
One common source of big data is the Internet of Things, or IoT. This refers to devices, such as thermostats and light switches, that are connected to the internet. Google provides a service called IoT Core that lets you manage and ingest data from your IoT devices.
image

Deployment Strategies

DevOps

DevOps services help you automate the building, testing, and releasing of application updates.

Cloud Build

The most important DevOps tool is Cloud Build. It lets you create continuous integration / continuous deployment pipelines. Cloud Build can define workflows for building, testing, and deploying across multiple environments such as VMs, serverless, Kubernetes, or Firebase. Cloud Build integrates with third-party code repositories, such as Bitbucket and GitHub, but you may want to use Google’s Cloud Source Repositories, which are private Git repositories hosted on GCP. If you’re deploying your applications using containers, then you can configure Cloud Build to put the code into a container and push it to Artifact Registry, which is a private Docker image store hosted on GCP.

image

Cloud Build provides a gke-deploy builder that enables you to deploy a containerized application to a GKE cluster. gke-deploy is a wrapper around kubectl, the command-line interface for Kubernetes. It applies Google’s recommended practices for deploying applications to Kubernetes by:

  • Updating the application’s Kubernetes configuration to use the container image’s digest instead of a tag.
  • Adding recommended labels to the Kubernetes configuration.
  • Retrieving credentials for the GKE clusters to which you’re deploying the image.
  • Waiting for the Kubernetes configuration that was submitted to be ready.

If you want to deploy your applications using kubectl directly and do not need additional functionality, Cloud Build also provides a kubectl builder that you can use to deploy your application to a GKE cluster.

A/B Testing
Canary deployments
Blue-Green deployments
Testing Strategies
Feature flags
Backward compatibility in API development
Unit Testing with emulators
Emulating Google Cloud services for local application development
Integration testing

Troubleshooting Applications

Cloud Operations Suite

Once you’ve deployed applications on GCP, you’ll need to maintain them. Google provides many services to help with that. One of the most important is the Cloud Operations suite, which was formerly known as Stackdriver.

Cloud Monitoring & Cloud Logging

Cloud Monitoring gives you a great overview of what’s happening with all of your resources. By default, it provides graphs showing metrics like CPU utilization, response latency, and network traffic. You can also create your own custom graphs and dashboards. But an even more critical feature is that you can set up alerts to notify you if there are problems. For example, you can set up an uptime check that alerts you if a virtual machine goes down.
Another useful service in the Cloud Operations suite is Cloud Logging. This is a central place where you can search all of the logs related to your resources, which can be very helpful for troubleshooting.
image

The suite also includes Error Reporting, Cloud Trace, Cloud Debugger, and Cloud Profiler to debug live applications and track down performance problems.
image

Security Command Center

In addition to monitoring performance, you’ll also need to monitor security and compliance. Security Command Center gathers this information in one place. Its overview dashboard shows you active threats and vulnerabilities, ordered by severity. For example, if one of your applications is vulnerable to cross-site scripting attacks, then that vulnerability will show up in the list. Security Command Center also includes a compliance dashboard that lets you know about violations of compliance standards, such as PCI-DSS, in your GCP environment.
image

Cloud Deployment Manager

Cloud Deployment Manager is Google’s solution to automated resource creation. To use it, you create a configuration file with all the details of the GCP resources you want to create, and then you feed it to Cloud Deployment Manager. What makes it really powerful is that you can define the configuration of multiple, interconnected resources, such as two VM instances and a Cloud SQL database. Then you can deploy all of them at once.
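Deployment Manager configurations can also be generated by Python templates that expose a `GenerateConfig(context)` function. The sketch below shows what a template for the example above (two VM instances and a Cloud SQL database) might look like. The resource names, machine type, image family, database tier, and the `zone` and `region` properties are all assumptions chosen for illustration, and the property sets are kept to a minimum rather than checked against a live deployment.

```python
COMPUTE_URL_BASE = "https://www.googleapis.com/compute/v1/"


def _vm(name, project, zone):
    """Build a minimal Compute Engine instance resource (illustrative)."""
    zone_url = f"{COMPUTE_URL_BASE}projects/{project}/zones/{zone}"
    return {
        "name": name,
        "type": "compute.v1.instance",
        "properties": {
            "zone": zone,
            "machineType": f"{zone_url}/machineTypes/e2-small",
            "disks": [{
                "deviceName": "boot",
                "type": "PERSISTENT",
                "boot": True,
                "autoDelete": True,
                "initializeParams": {
                    "sourceImage": COMPUTE_URL_BASE
                    + "projects/debian-cloud/global/images/family/debian-11",
                },
            }],
            "networkInterfaces": [{
                "network": f"{COMPUTE_URL_BASE}projects/{project}/global/networks/default",
            }],
        },
    }


def GenerateConfig(context):
    """Entry point that Deployment Manager calls for a Python template."""
    project = context.env["project"]
    zone = context.properties["zone"]
    database = {
        "name": "app-db",
        "type": "sqladmin.v1beta4.instance",
        "properties": {
            "region": context.properties["region"],
            "databaseVersion": "MYSQL_8_0",
            "settings": {"tier": "db-f1-micro"},
        },
    }
    resources = [_vm("web-1", project, zone), _vm("web-2", project, zone), database]
    return {"resources": resources}
```

Because the whole set of resources is returned from one function, all three are created together in a single deployment, which is the point made in the paragraph above.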

Security

Servers

Google server machines use a variety of technologies to ensure that they are booting the correct software stack. Google uses cryptographic signatures over low-level components like the BIOS, bootloader, kernel, and base operating system image. These signatures can be validated during each boot or update. The components are all Google-controlled, built, and hardened. With each new generation of hardware, Google strives to continually improve security: for example, depending on the generation of server design, the trust of the boot chain is rooted in either a lockable firmware chip, a microcontroller running Google-written security code, or a Google-designed security chip.

Services

Each service that runs on the infrastructure has an associated service account identity. A service is provided cryptographic credentials that it can use to prove its identity when making or receiving remote procedure calls (RPCs) to other services. These identities are used by clients to ensure that they are talking to the correct intended server, and by servers to limit access to methods and data to particular clients.

GFE

When a service wants to make itself available on the Internet, it can register itself with an infrastructure service called the Google Front End (GFE). The GFE ensures that all TLS connections are terminated using correct certificates and following best practices such as supporting perfect forward secrecy. The GFE additionally applies protections against Denial of Service attacks. The GFE then forwards requests for the service using the infrastructure’s RPC security protocol.

Microservice Architecture

IAM / Security Best Practices

Principle of least privilege
Kubernetes secrets
Identity-Aware Proxy
Service Accounts

Pub/Sub best practices

@shon-button
Contributor Author

shon-button commented Mar 6, 2023

@shon-button
Contributor Author

shon-button commented Mar 10, 2023

@shon-button
Contributor Author

Google Cloud – HipLocal Case Study

HipLocal is a community application designed to facilitate communication between people in close proximity. It is used for event planning and organizing sporting events, and for businesses to connect with their local communities. HipLocal launched recently in a few neighborhoods in Dallas and is rapidly growing into a global phenomenon. Its unique style of hyper-local community communication and business outreach is in demand around the world.

HipLocal Solution Concept

HipLocal wants to expand their existing service with updated functionality in new locations to better serve their global customers. They want to hire and train a new team to support these locations in their time zones. They will need to ensure that the application scales smoothly and provides clear uptime data, and that they analyze and respond to any issues that occur.

The key points here are that HipLocal wants to expand globally, scale smoothly, and have clear observability and alerting so that it can react to issues.

HipLocal Existing Technical Environment

HipLocal’s environment is a mixture of on-premises hardware and infrastructure running in Google Cloud. The HipLocal team understands their application well, but has limited experience in globally scaled applications. Their existing technical environment is as follows:

  • Existing APIs run on Compute Engine virtual machine instances hosted in Google Cloud.
  • State is stored in a single instance MySQL database in Google Cloud.
  • Release cycles include development freezes to allow for QA testing.
  • The application has no consistent logging.
  • Applications are manually deployed by infrastructure engineers during periods of slow traffic on weekday evenings.
  • There are basic indicators of uptime; alerts are frequently fired when the APIs are unresponsive.

Business requirements

HipLocal’s investors want to expand their footprint and support the increase in demand they are experiencing. Their requirements are:

  • Expand availability of the application to new locations.
    • Availability can be achieved using either
      • scaling the application and exposing it through Global Load Balancer OR
      • deploying the applications across multiple regions.
  • Support 10x as many concurrent users.
    • As the APIs run on Compute Engine, scaling can be implemented using Managed Instance Groups fronted by a load balancer, App Engine, or a container-based application deployment.
    • Scaling policies can be defined to scale as per the demand.
  • Ensure a consistent experience for users when they travel to different locations.
    • Consistent experience for the users can be provided using either
      • Google Cloud Global Load Balancer which uses GFE and routes traffic close to the users
      • multi-region setup targeting each region
  • Obtain user activity metrics to better understand how to monetize their product.
    • User activity data can also be exported to BigQuery for analytics and monetization
    • Cloud Monitoring and Logging can be configured for application logs and metrics to provide observability, alerting, and reporting.
    • Cloud Logging can be exported to BigQuery for analytics
  • Ensure compliance with regulations in the new regions (for example, GDPR).
    • Compliance is a shared responsibility: Google Cloud ensures the compliance of its services, while applications hosted on Google Cloud are the customer’s responsibility.
    • GDPR or other data residency regulations can be met using a per-region setup, so that the data resides within the region.
  • Reduce infrastructure management time and cost.
    • As the infrastructure is spread across on-premises and Google Cloud, it would make sense to consolidate the infrastructure into one place i.e. Google Cloud
    • Consolidation would help in automation, maintenance, as well as provide cost benefits.
  • Adopt the Google-recommended practices for cloud computing:
    • Develop standardized workflows and processes around application lifecycle management.
    • Define service level indicators (SLIs) and service level objectives (SLOs).
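As a worked example for the SLI and SLO item above, the sketch below computes an availability SLI and the share of the error budget that remains against an SLO target. The function names and the sample numbers are invented for illustration; in practice the request counts would come from Cloud Monitoring or a similar source.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """Fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # no traffic, so nothing failed
    return good_requests / total_requests


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Share of the error budget left: 1.0 is untouched, <= 0 is exhausted."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


# Hypothetical month: 100,000 requests, 50 of them failed, SLO of 99.9%.
sli = availability_sli(good_requests=99_950, total_requests=100_000)
remaining = error_budget_remaining(sli, slo_target=0.999)
print(f"SLI={sli:.4%}, error budget remaining={remaining:.0%}")
```

With these numbers roughly half of the budget is spent, which is the kind of signal a team can use to decide whether to keep releasing or to slow down.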

Technical requirements

  • Provide secure communications between the on-premises data center and cloud hosted applications and infrastructure
    • Secure communications can be enabled between the on-premises data center and the cloud using Cloud VPN or Cloud Interconnect.
  • The application must provide usage metrics and monitoring.
    • Cloud Monitoring and Logging can be configured for application logs and metrics to provide observability, alerting, and reporting.
  • APIs require authentication and authorization.
    • APIs can be configured with an authentication mechanism, such as API keys or JSON Web Tokens (JWTs).
    • APIs can be exposed through a centralized Cloud Endpoints gateway.
    • Internal applications can be exposed using Cloud Identity-Aware Proxy.
  • Implement faster and more accurate validation of new features.
    • QA testing can be improved using automated testing.
    • Production release cycles can be improved using canary deployments, which test a new version on a small share of users before rolling it out to everyone.
    • The application can be deployed to App Engine, which supports traffic splitting out of the box for canary releases.
  • Logging and performance metrics must provide actionable information to be able to provide debugging information and alerts.
    • Cloud Monitoring and Logging can be configured for application logs and metrics to provide observability, alerting, and reporting.
    • Cloud Logging can be exported to BigQuery for analytics
  • Must scale to meet user demand.
    • As the APIs run on Compute Engine, scaling can be implemented using managed instance groups fronted by a load balancer, with autoscaling policies that follow demand.
    • The single MySQL instance can be migrated as-is to Cloud SQL, without application code changes. Cloud SQL can then scale horizontally with read replicas and vertically by resizing the instance.
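
The scaling policy mentioned above for managed instance groups can be illustrated with a simplified proportional rule: resize the group so that average utilization moves back toward a target, clamped to a minimum and maximum size. This is a sketch under stated assumptions, not the actual Compute Engine autoscaler algorithm; the function name and all parameters are hypothetical.

```python
def recommended_size(current_size: int, current_util_pct: int,
                     target_util_pct: int, min_size: int, max_size: int) -> int:
    """Return a group size that would bring utilization near the target.

    Utilization values are whole percentages (e.g. 90 for 90% CPU) so the
    calculation stays in exact integer arithmetic.
    """
    if current_size < 1:
        raise ValueError("current_size must be at least 1")
    if not 0 < target_util_pct <= 100:
        raise ValueError("target_util_pct must be in the range 1..100")
    if current_util_pct < 0:
        raise ValueError("current_util_pct must not be negative")
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")

    # Total load is current_size * current_util_pct; divide by the target
    # utilization and round up (ceiling division on integers).
    load = current_size * current_util_pct
    desired = -(-load // target_util_pct)

    # Never go below the floor or above the ceiling configured for the group.
    return max(min_size, min(max_size, desired))


if __name__ == "__main__":
    # Hypothetical group: 4 instances at 90% CPU, 60% target, bounds 2..10.
    print(recommended_size(4, 90, 60, min_size=2, max_size=10))
```

A real autoscaler also applies stabilization periods and can combine several signals; the sketch covers only the core sizing arithmetic.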

                       

GCP Certification Exam Practice Questions

  1. Which database should HipLocal use for storing state while minimizing application changes?
    1. Firestore
    2. BigQuery
    3. Cloud SQL
    4. Cloud Bigtable
  2. Which architecture should HipLocal use for log analysis?
    1. Use Cloud Spanner to store each event.
    2. Start storing key metrics in Memorystore.
    3. Use Cloud Logging with a BigQuery sink.
    4. Use Cloud Logging with a Cloud Storage sink.
  3. HipLocal wants to improve the resilience of their MySQL deployment, while also meeting their business and technical requirements. Which configuration should they choose?
    1. Use the current single instance MySQL on Compute Engine and several read-only MySQL servers on Compute Engine.
    2. Use the current single instance MySQL on Compute Engine, and replicate the data to Cloud SQL in an external master configuration.
    3. Replace the current single instance MySQL instance with Cloud SQL, and configure high availability.
    4. Replace the current single instance MySQL instance with Cloud SQL, and Google provides redundancy without further configuration.
  4. Which service should HipLocal use to enable access to internal apps?
    1. Cloud VPN
    2. Cloud Armor
    3. Virtual Private Cloud
    4. Cloud Identity-Aware Proxy
  5. Which database should HipLocal use for storing user activity?
    1. BigQuery
    2. Cloud SQL
    3. Cloud Spanner
    4. Cloud Datastore
  6. A recent security audit discovers that HipLocal’s database credentials for their Compute Engine-hosted MySQL databases are stored in plain text on persistent disks. HipLocal needs to reduce the risk of these credentials being stolen. What should they do?

    1. Create a service account and download its key. Use the key to authenticate to Cloud Key Management Service (KMS) to obtain the database credentials.

    2. Create a service account and download its key. Use the key to authenticate to Cloud Key Management Service (KMS) to obtain a key used to decrypt the database credentials.

    3. Create a service account and grant it the roles/iam.serviceAccountUser role. Impersonate as this account and authenticate using the Cloud SQL Proxy.

    4. Grant the roles/secretmanager.secretAccessor role to the Compute Engine service account. Store and access the database credentials with the Secret Manager API.

  7. Which Google Cloud product addresses HipLocal’s business requirements for service level indicators and objectives?
    1. Cloud Profiler

    2. Cloud Monitoring

    3. Cloud Trace

    4. Cloud Logging

  8. HipLocal wants to reduce the latency of their services for users in global locations. They have created read replicas of their database in locations where their users reside and configured their service to read traffic using those replicas. How should they further reduce latency for all database interactions with the least amount of effort?
    1. Migrate the database to Bigtable and use it to serve all global user traffic.

    2. Migrate the database to Cloud Spanner and use it to serve all global user traffic.

    3. Migrate the database to Firestore in Datastore mode and use it to serve all global user traffic.

    4. Migrate the services to Google Kubernetes Engine and use a load balancer service to better scale the application.

  9. How should HipLocal increase their API development speed while continuing to provide the QA team with a stable testing environment that meets feature requirements?
    1. Include unit tests in their code, and prevent deployments to QA until all tests have a passing status.

    2. Include performance tests in their code, and prevent deployments to QA until all tests have a passing status.

    3. Create health checks for the QA environment, and redeploy the APIs at a later time if the environment is unhealthy.

    4. Redeploy the APIs to App Engine using Traffic Splitting. Do not move QA traffic to the new versions if errors are found.

  10. How should HipLocal redesign their architecture to ensure that the application scales to support a large increase in users?
    1. Use Google Kubernetes Engine (GKE) to run the application as a microservice. Run the MySQL database on a dedicated GKE node.

    2. Use multiple Compute Engine instances to run MySQL to store state information. Use a Google Cloud-managed load balancer to distribute the load between instances. Use managed instance groups for scaling.

    3. Use Memorystore to store session information and Cloud SQL to store state information. Use a Google Cloud-managed load balancer to distribute the load between instances. Use managed instance groups for scaling.

    4. Use a Cloud Storage bucket to serve the application as a static website, and use another Cloud Storage bucket to store user state information.

  11. Which service should HipLocal use to enable access to internal apps?
    1. Cloud VPN

    2. Cloud Armor

    3. Virtual Private Cloud

    4. Cloud Identity-Aware Proxy

  12. HipLocal's application uses Cloud Client Libraries to interact with Google Cloud. HipLocal needs to configure authentication and authorization in the Cloud Client Libraries to implement least privileged access for the application. What should they do?
    1. Create an API key. Use the API key to interact with Google Cloud.

    2. Use the default compute service account to interact with Google Cloud.

    3. Create a service account for the application. Export and deploy the private key for the application. Use the service account to interact with Google Cloud.

    4. Create a service account for the application and for each Google Cloud API used by the application. Export and deploy the private keys used by the application. Use the service account with one Google Cloud API to interact with Google Cloud.

  13. In order to meet their business requirements, how should HipLocal store their application state?
    1. Use local SSDs to store state.

    2. Put a memcache layer in front of MySQL.

    3. Move the state storage to Cloud Spanner.

    4. Replace the MySQL instance with Cloud SQL.

  14. Which service should HipLocal use for their public APIs?
    1. Cloud Armor

    2. Cloud Functions

    3. Cloud Endpoints

    4. Shielded Virtual Machines

  15. HipLocal wants to reduce the number of on-call engineers and eliminate manual scaling. Which two services should they choose? (Choose two.)
    • Use Google App Engine services.

    • Use serverless Google Cloud Functions.

    • Use Knative to build and deploy serverless applications.

    • Use Google Kubernetes Engine for automated deployments.

    • Use a large Google Compute Engine cluster for deployments.

  16. HipLocal's .NET-based auth service fails under intermittent load. What should they do?

    1. Use App Engine for autoscaling.

    2. Use Cloud Functions for autoscaling.

    3. Use a Compute Engine cluster for the service.

    4. Use a dedicated Compute Engine virtual machine instance for the service.

  17. HipLocal's APIs are having occasional application failures. They want to collect application information specifically to troubleshoot the issue. What should they do?
    1. Take frequent snapshots of the virtual machines.

    2. Install the Cloud Logging agent on the virtual machines.

    3. Install the Cloud Monitoring agent on the virtual machines.

    4. Use Cloud Trace to look for performance bottlenecks.

  18. HipLocal has connected their Hadoop infrastructure to GCP using Cloud Interconnect in order to query data stored on persistent disks. Which IP strategy should they use?
    1. Create manual subnets.

    2. Create an auto mode subnet.

    3. Create multiple peered VPCs.

    4. Provision a single instance for NAT.

  19. In order for HipLocal to store application state and meet their stated business requirements, which database service should they migrate to?
    1. Cloud Spanner

    2. Cloud Datastore

    3. Cloud Memorystore as a cache

    4. Separate Cloud SQL clusters for each region

  20. HipLocal is configuring their access controls. Which firewall configuration should they implement?
    1. Block all traffic on port 443.

    2. Allow all traffic into the network.

    3. Allow traffic on port 443 for a specific tag.

    4. Allow all traffic on port 443 into the network.

  21. HipLocal wants to improve the resilience of their MySQL deployment, while also meeting their business and technical requirements. Which configuration should they choose?
    1. Use the current single instance MySQL on Compute Engine and several read-only MySQL servers on Compute Engine.

    2. Use the current single instance MySQL on Compute Engine, and replicate the data to Cloud SQL in an external master configuration.


    3. Replace the current single instance MySQL instance with Cloud SQL, and configure high availability.

    4. Replace the current single instance MySQL instance with Cloud SQL, and Google provides redundancy without further configuration.

  22. Which database should HipLocal use for storing user activity?
    1. BigQuery

    2. Cloud SQL

    3. Cloud Spanner

    4. Cloud Datastore

  23. HipLocal's data science team wants to analyze user reviews. How should they prepare the data?
    1. Use the Cloud Data Loss Prevention API for redaction of the review dataset.

    2. Use the Cloud Data Loss Prevention API for de-identification of the review dataset.

    3. Use the Cloud Natural Language Processing API for redaction of the review dataset.

    4. Use the Cloud Natural Language Processing API for de-identification of the review dataset.

Reference

Case_Study_HipLocal
