Code Crunch Labs · Tier II · Sub-brand · GCP · 15 weeks · intensive · GPL-3.0

Crunch GCP.

Fifteen weeks of production-grade Google Cloud, built from IAM up. A multi-region GKE system that survives a zone loss, a Pub/Sub-to-BigQuery pipeline that replays cleanly, a Vertex AI endpoint with a Gemini fallback, and Cloud Armor at the edge, all instrumented with OpenTelemetry and runnable inside the free trial. Open-source-first. Free, forever.

15 weeks
Program length
540 hrs
Total workload
14+1
Labs + capstone
$0
Tuition · always

§ I · The Program

Google Cloud, as a platform discipline.

Crunch GCP is the Google Cloud track of the Code Crunch Labs tier — a 15-week intensive that uses Google Cloud as the substrate for a production-engineering course, not a tour of services. You leave with a multi-region, observable, Cloud Armor-protected, BigQuery-instrumented, Vertex AI-served system you built yourself, and a runbook you wrote that another engineer can read at three in the morning.

It assumes you have finished C1 and C15 (Crunch DevOps) or carry equivalent industry experience with Docker, Kubernetes, and Terraform, plus Linux fluency at the C14 level. This is not a “GCP for beginners” survey. By week three you are writing custom IAM roles, designing shared-VPC topologies, and wiring Workload Identity Federation from a CI runner. By week fifteen you have shipped a realtime event pipeline at scale and on-called it through a failover.

“Crunch GCP is not a Google PCA cram course. It is the lab the certification doesn’t give you.” (Crunch GCP, course README)

§ II · Who It’s For

Four engineers, one cloud.

Crunch GCP is opinionated about its audience. C15 (Crunch DevOps) is the floor: if you have never written a Terraform module from scratch or paged on a Saturday, do C15 first. Otherwise C18 (this course) will overwhelm you in week three.

No. 01

The DevOps Engineer Adding GCP

Already runs CI/CD on AWS or on-prem Kubernetes. The company acquired a team on Google Cloud or is weighing BigQuery against Kafka. Needs the platform vocabulary, the IAM model, and the failure modes.

No. 02

The Senior Backend Leveling Up

Shipped FastAPI for three years. Reads flame graphs, tunes Postgres. Now wants to own the system end-to-end: GKE, VPC, LB, OpenTelemetry, SLOs — and step into a platform or staff role.

No. 03

The SRE Prepping PCA

Can already pass the Professional Cloud Architect exam on theory. Wants the practical reps: a real multi-region failover, a real Cloud Armor rule that bites, a real Spanner migration, a real on-call.

No. 04

The Founder Choosing GCP

Picking a cloud and needs to know the billing surface, IAM mistakes, regional gotchas, AI/data primitives, and exit-cost shape. Leaves Week 15 with a defensible architecture, a cost model, and a runbook.

§ III · Four Phases

From landing zone to live capstone.

The program runs in four phases, each building on the last. Mini-projects compound: by Week 10 you are extending Week 06’s GKE cluster, not starting fresh.

Phase I · Wk. 01—04

Foundations

Resource hierarchy, IAM and Workload Identity, VPC topology and Cloud NAT, Terraform on the google provider. A locked-down landing zone you can hand to a junior engineer without flinching.

Phase II · Wk. 05—08

Compute & Networking

Compute Engine and MIGs, GKE Autopilot vs. Standard, Cloud Run and the serverless decision, Cloud Load Balancing and Cloud Armor. A multi-tier service architecture that survives a zone loss.

Phase III · Wk. 09—12

Data & AI

Pub/Sub and Dataflow, BigQuery deep, Spanner vs. Cloud SQL vs. AlloyDB, Vertex AI and Model Garden. An event-to-insight pipeline with a model-serving path and a documented fallback.

Phase IV · Wk. 13—15

Production & Capstone

OpenTelemetry, SLOs and burn-rate alerts, organization policies and VPC Service Controls, FinOps, the on-call drill. Capstone delivery, architecture review, and a PCA readiness gate.

§ IV · The Curriculum

Fifteen weeks, week by week.

Each entry corresponds to a folder in the GitHub repository with lecture notes, exercises, challenges, a quiz, homework, and a mini-project. Detailed acceptance criteria live in the syllabus.

01

The GCP Resource Hierarchy & Billing Discipline

Organization, folders, projects, billing accounts · resource hierarchy as a security model · quota model · the seven gcloud muscle-memory commands.

Lab 01

Three-folder, five-project Terraform landing zone
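
Treating the resource hierarchy as a security model comes down to one rule: an allow binding granted on a parent also applies to everything beneath it. The Python below is a toy model of that rule, not the IAM API. The `Node` class, the `effective_bindings` helper, and all node and principal names are hypothetical, and only allow-policy inheritance is modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One node in a toy hierarchy: organization, folder, or project."""
    name: str
    parent: Optional["Node"] = None
    # role -> set of principals granted that role directly on this node
    bindings: dict = field(default_factory=dict)


def effective_bindings(node: Node) -> dict:
    """Union of the bindings on the node and on every ancestor."""
    merged: dict = {}
    current: Optional[Node] = node
    while current is not None:
        for role, members in current.bindings.items():
            merged.setdefault(role, set()).update(members)
        current = current.parent
    return merged


# Hypothetical three-level hierarchy: org -> folder -> project.
org = Node("org", bindings={"roles/viewer": {"group:auditors@example.com"}})
prod = Node("folder-prod", parent=org,
            bindings={"roles/editor": {"group:platform@example.com"}})
app = Node("project-app", parent=prod,
           bindings={"roles/viewer": {"user:dev@example.com"}})

result = effective_bindings(app)
```

In this model a grant made at the organization node reaches every project, which is why broad roles high in the tree deserve the most scrutiny.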

02

IAM, Service Accounts & Workload Identity

Principals · roles (basic, predefined, custom) · IAM conditions · service-account impersonation · Workload Identity Federation — no more keyfiles.

Lab 02

Replace SA key file with WIF from GitHub Actions
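
Replacing a key file with Workload Identity Federation hinges on naming the external identity that is allowed to impersonate a service account. A minimal sketch of building that identifier in Python follows. It assumes the pool's provider maps `attribute.repository` from the GitHub OIDC token's `repository` claim; the project number, pool ID, and repository are placeholder values, and `github_principal_set` is a hypothetical helper.

```python
def github_principal_set(project_number: str, pool_id: str, repository: str) -> str:
    """Build the principalSet member string for one GitHub repository.

    Assumes the provider maps attribute.repository=assertion.repository.
    """
    return (
        "principalSet://iam.googleapis.com/"
        f"projects/{project_number}/locations/global/"
        f"workloadIdentityPools/{pool_id}/"
        f"attribute.repository/{repository}"
    )


# Placeholder values for illustration only.
member = github_principal_set("123456789012", "github-pool", "example-org/example-repo")
```

The resulting string is what you would bind to `roles/iam.workloadIdentityUser` on the target service account, so that only workflows from that one repository can impersonate it.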

03

VPC, Subnets, Routes & Cloud NAT

VPC topology · primary/secondary subnet ranges · shared VPC · firewall rules (legacy + hierarchical) · Cloud NAT · Cloud Router · Private Google Access.

Lab 03

Multi-region shared VPC with hierarchical firewall
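
Primary and secondary subnet ranges are easier to reason about once they are written down as a plan and checked for overlap. The sketch below uses Python's standard `ipaddress` module to carve one block into per-region ranges. The `10.0.0.0/16` block, the `/20` sizing, and the `plan` layout are illustrative assumptions, not course requirements; real GKE pod ranges are usually sized larger.

```python
import ipaddress

# Assumed address space for the shared VPC (illustrative).
vpc_block = ipaddress.ip_network("10.0.0.0/16")

# Split the /16 into sixteen /20 blocks of 4096 addresses each.
blocks = list(vpc_block.subnets(new_prefix=20))

# One primary range plus two secondary ranges (pods, services) per region.
plan = {
    "us-central1": {"primary": blocks[0], "pods": blocks[1], "services": blocks[2]},
    "us-east1": {"primary": blocks[3], "pods": blocks[4], "services": blocks[5]},
}


def overlaps_any(subnet_plan: dict) -> bool:
    """Return True if any two ranges in the plan overlap."""
    ranges = [r for region in subnet_plan.values() for r in region.values()]
    return any(
        a.overlaps(b)
        for i, a in enumerate(ranges)
        for b in ranges[i + 1:]
    )
```

Running `overlaps_any(plan)` before `terraform plan` catches range collisions on paper rather than in a failed apply.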

04

Terraform for GCP, End-to-End

terraform on google + google-beta · module structure · remote state in GCS with locking · terragrunt · Cloud Foundation Toolkit · plan-review workflow.

Lab 04

Reusable Terraform module library (org-bootstrap, vpc, iam-baseline)

05

Compute Engine, Instance Groups & Managed VMs

Machine families (E2/N2/N2D/C3/T2D) · regional & zonal MIGs · instance templates · OS Login · Shielded VM · spot & preemptible.

Lab 05

Regional MIG behind an internal TCP LB with autoscaling

06

GKE Autopilot vs. Standard

Cluster architecture · Autopilot constraints · Standard with private endpoint · in-cluster Workload Identity · PDBs, HPA, VPA · cluster upgrades and surge config.

Lab 06

FastAPI service on both Autopilot and Standard + spot pool
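
The HPA's core scaling decision is a single ratio, and it helps to have worked it by hand before tuning one. The sketch below follows the formula from the Kubernetes documentation, desired = ceil(current × currentMetric / targetMetric), with the default 10% tolerance band. The `desired_replicas` function and its min/max clamping arguments are a simplified illustration, not the controller's full behavior (it ignores stabilization windows and pod readiness).

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA decision: scale by the metric ratio, then clamp."""
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


scale_up = desired_replicas(4, current_metric=90, target_metric=60)
steady = desired_replicas(4, current_metric=62, target_metric=60)
scale_down = desired_replicas(10, current_metric=30, target_metric=60, min_replicas=2)
```

Four replicas at 90% CPU against a 60% target become six; at 62% nothing changes, because the ratio sits inside the tolerance band.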

07

Cloud Run, Cloud Functions & the Serverless Decision

Cloud Run (v2) services and jobs · concurrency, min-instances, CPU allocation · Cloud SQL via Private Service Connect · Cloud Functions gen2 · Eventarc.

Lab 07

Cloud Run + Cloud SQL Postgres over PSC, benchmarked

08

Cloud Load Balancing & Cloud Armor

External HTTPS LB (global) · regional internal HTTPS LB · backend services and NEGs · Cloud CDN · Cloud Armor WAF + rate limiting · IAP · PSC.

Lab 08

Global LB + CDN + Cloud Armor in front of Week 07's Cloud Run

09

Pub/Sub and Dataflow (Apache Beam)

Topics, subscriptions (push/pull), ordering keys, dead-letter topics, exactly-once delivery · Dataflow as managed Beam · streaming vs. batch · windowing, watermarks, triggers.

Lab 09

Synthetic events → Pub/Sub → Dataflow → BigQuery, with DLQ
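
Windowing and dead-lettering are the two ideas in this lab that are easiest to get wrong, so a plain-Python sketch of both follows. It is not the Beam API: `process` is a hypothetical stand-in for a fixed-window count plus a dead-letter route, the event shape (`ts`, `user`) is assumed, and late data and watermarks are ignored entirely.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed fixed-window size


def process(events, window_seconds=WINDOW_SECONDS):
    """Count events per (window_start, user); route malformed events aside."""
    counts = defaultdict(int)
    dead_letters = []
    for event in events:
        ts = event.get("ts")
        user = event.get("user")
        # Malformed events go to the dead-letter list instead of failing the run.
        if not isinstance(ts, (int, float)) or not user:
            dead_letters.append(event)
            continue
        # Fixed windows: each timestamp belongs to exactly one window.
        window_start = int(ts) - int(ts) % window_seconds
        counts[(window_start, user)] += 1
    return dict(counts), dead_letters


events = [
    {"ts": 5, "user": "a"},
    {"ts": 59, "user": "a"},
    {"ts": 61, "user": "a"},
    {"ts": "bad", "user": "b"},  # unparseable timestamp
    {"ts": 120},                 # missing user
]
counts, dead_letters = process(events)
```

The property worth carrying into the real pipeline is that a bad record never stops the good ones: it is set aside with enough context to replay later.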

10

BigQuery Deep

Storage model · partitioning (time, integer-range) · clustering · BI Engine · materialized views · BigQuery ML · INFORMATION_SCHEMA · slot reservations · query plans.

Lab 10

Partitioned-clustered NYC taxi table; five <1%-scan queries
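
A sub-1% scan target is mostly arithmetic on partitions. The sketch below estimates the scanned fraction of a day-partitioned table under two loud assumptions: data is spread evenly across days, and the query filters on the partition column so pruning applies. Clustering can reduce the scan further and is not modeled. Both function names are hypothetical.

```python
def scan_fraction(days_queried: int, days_in_table: int) -> float:
    """Fraction of the table scanned, assuming uniform daily partitions."""
    return days_queried / days_in_table


def max_days_under(threshold: float, days_in_table: int) -> int:
    """Largest number of daily partitions a query can touch below the threshold."""
    return max(
        k for k in range(days_in_table + 1)
        if scan_fraction(k, days_in_table) < threshold
    )


one_year = 365
three_days = scan_fraction(3, one_year)
budget = max_days_under(0.01, one_year)
```

With one year of daily partitions, a query can touch at most three days and stay under 1%; a fourth day pushes it to roughly 1.1%.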

11

Spanner, Cloud SQL, AlloyDB & the Database Decision

Cloud SQL HA · read replicas · AlloyDB columnar engine · Spanner architecture (Paxos, TrueTime) · Firestore vs. Bigtable · Memorystore.

Lab 11

Postgres → Spanner zero-downtime migration via Datastream

12

Vertex AI, Model Garden & Serving Inference

Workbench · Pipelines (Kubeflow) · custom training containers · online & batch Endpoints · Model Garden (open weights) · Gemini API · Document AI.

Lab 12

Model Garden endpoint with GPU autoscaling + Gemini fallback
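
A documented fallback is a small piece of control flow wrapped around two model clients. The sketch below shows the shape in plain Python. `primary` and `fallback` are hypothetical callables standing in for a Vertex AI endpoint client and a Gemini API client; no real SDK calls are shown, and the broad `except` would be narrowed to timeout and unavailability errors in a real service.

```python
from typing import Callable


def predict_with_fallback(
    prompt: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    retries: int = 1,
) -> dict:
    """Try the primary model (with retries), then fall back once."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return {"source": "primary", "output": primary(prompt)}
        except Exception as exc:  # narrow this in production code
            last_error = exc
    return {
        "source": "fallback",
        "output": fallback(prompt),
        "primary_error": repr(last_error),
    }


def healthy_primary(prompt: str) -> str:
    return f"primary:{prompt}"


def broken_primary(prompt: str) -> str:
    raise TimeoutError("endpoint did not respond")


def stub_fallback(prompt: str) -> str:
    return f"fallback:{prompt}"


ok = predict_with_fallback("hi", healthy_primary, stub_fallback)
degraded = predict_with_fallback("hi", broken_primary, stub_fallback)
```

Returning the `source` field alongside the output keeps the degradation visible to callers and to whatever metrics you attach later.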

13

Observability with OpenTelemetry

OTel SDKs (Python + Go) for traces, metrics, logs · Cloud Trace, Cloud Logging, Cloud Monitoring · log sinks (BigQuery, Pub/Sub, GCS) · SLOs and burn-rate alerts · Cloud Profiler.

Lab 13

Instrument every Phase 2–3 service; one SLO + burn-rate alert each
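
A burn-rate alert is a ratio: the observed error rate divided by the error budget the SLO allows. The sketch below works it through in Python. The 14.4 threshold is the commonly cited value for a one-hour window against a 30-day SLO (it spends 2% of the budget in that hour); the function names and the two-window check are an illustrative simplification, not Cloud Monitoring's alerting API.

```python
def burn_rate(errors: int, total: int, slo_target: float) -> float:
    """Observed error rate divided by the error budget (1 - SLO target)."""
    if total == 0:
        return 0.0
    error_budget = 1.0 - slo_target
    return (errors / total) / error_budget


def should_page(long_window, short_window, slo_target, threshold=14.4):
    """Page only when both the long and the short window burn too fast."""
    return (
        burn_rate(*long_window, slo_target) >= threshold
        and burn_rate(*short_window, slo_target) >= threshold
    )


SLO = 0.999  # assumed 99.9% availability target

# (errors, total) pairs for a one-hour and a five-minute window.
sustained = should_page((200, 10_000), (30, 1_000), SLO)
recovered = should_page((200, 10_000), (1, 1_000), SLO)
```

Requiring the short window to agree is what keeps the alert from paging after the incident has already recovered.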

14

Security Hardening, FinOps & the On-Call Drill

Organization Policy · VPC Service Controls · KMS + CMEK · Secret Manager · Binary Authorization · Security Command Center · commitments & sustained-use discounts · billing-export analysis.

Lab 14

Org-policy bundle + VPC SC perimeter + synthetic on-call drill

15

Capstone Delivery & Architecture Review

Final integration · architecture review presentation · recorded video walkthrough · resume & portfolio polish · PCA / Cloud DevOps Engineer practice exam · mock interview.

Capstone

Realtime Event Pipeline at Scale — multi-region delivery

§ V · The Toolchain

GCP-native, open-source-honest.

We use GCP services where they’re genuinely best-in-class. We name the open-source alternative every time (Trino + Iceberg, CockroachDB, NATS/Kafka, vLLM / TGI / Hugging Face). Almost nothing is done in the Cloud Console after Week 02.

Compute
GKE · Cloud Run
Autopilot, Standard, serverless
IaC
Terraform · OpenTofu
google & google-beta providers
Streaming
Pub/Sub · Dataflow
Apache Beam, exactly-once
Warehouse
BigQuery
columnar, partitioned, clustered
Database
Spanner · AlloyDB · Cloud SQL
strong-consistency to managed Postgres
AI / ML
Vertex AI · Model Garden
open weights · Gemini API fallback
CI/CD
Cloud Build · Skaffold
attestor-signed deploys
Observability
OpenTelemetry · Cloud Trace
SLOs & burn-rate alerts
Edge
Cloud Armor · Cloud CDN
WAF · rate limit · CEL rules
Identity
IAM · Workload Identity
federation · no keyfiles
Crypto
KMS · CMEK · Secret Manager
customer-managed keys
CLI
gcloud · bq · gsutil · kubectl
terminal-first, console-second

§ VI · Skills You Will Carry

What you walk away with.

By the end of Week 15, you are able to do each of the following — credibly, in front of a real architecture review, with a budget and an exit plan.

  • Design a multi-region GCP architecture from a blank diagram and defend every choice.
  • Provision an entire org with Terraform modules, remote GCS state, and locking.
  • Operate a GKE cluster — Autopilot and Standard — through upgrades and rolling deploys.
  • Build an event pipeline on Pub/Sub and Dataflow that replays cleanly after an outage.
  • Ship a stateless Cloud Run service over Private Service Connect to Cloud SQL.
  • Serve a Vertex AI Endpoint with autoscaling and a documented Gemini API fallback.
  • Instrument every service with OpenTelemetry, exporting to Cloud Trace and Cloud Monitoring.
  • Write an SLO and a burn-rate alert that pages on real risk, not noise.
  • Lock down a production project with VPC Service Controls and Binary Authorization.
  • Write a custom IAM role with the minimum permission set for a real job function.
  • Configure Workload Identity Federation from GitHub Actions — zero long-lived keys.
  • Read a BigQuery query plan and find the stage that costs the money.
  • Choose Cloud SQL, AlloyDB, or Spanner with a budget and a justification.
  • Run a no-drama on-call shift end-to-end: page, triage, mitigate, postmortem.
  • Cost-engineer a workload from billing-export-to-BigQuery analysis.
  • Sit a Google PCA practice exam at passing score and write an honest exit plan.

§ VII · The Capstone

One system. Shipped, on-called, postmortemed.

Week 15 is reserved for a single substantial system — the kind a real product team would scope across a quarter. Architecture diagram, live deploy, video walkthrough, chaos-drill postmortem, cost report, and an exit plan.

Capstone Brief

Realtime Event Pipeline at Scale

Architect, build, deploy, and on-call a multi-region realtime event pipeline on GCP. Edge: Global HTTPS LB → Cloud CDN → Cloud Armor. Ingest: Cloud Run validates and publishes to Pub/Sub. Process: Dataflow (Python Beam) windows, enriches via Memorystore, lands partitioned-clustered BigQuery tables. Serve: a GKE Standard regional cluster runs a gRPC “current state” service backed by Spanner and a Vertex AI Endpoint with a Gemini fallback. Primary us-central1, standby us-east1. Tear it down on demand. Defend every decision.

  • Live deploy in your own GCP project, reachable by the grader, with a 5-minute video walkthrough and a Mermaid architecture diagram.
  • OpenTelemetry traces, metrics, and logs from every service; one SLO per service with a burn-rate alert armed.
  • Workload Identity Federation for all deploys; VPC Service Controls around the data project; Binary Authorization on the GKE deploy path; CMEK on BigQuery and Spanner.
  • Postmortem of one chaos drill — region failover, certificate rotation, or 10× Pub/Sub overload — with mitigation, timeline, and follow-ups.
  • Cost report from billing-export-to-BigQuery, with three optimization moves and a projected run-rate under $500/month at 100 RPS sustained.
  • A 2-page exit plan describing what it would take to move this workload to AWS or to self-hosted Kafka + Trino + Iceberg + vLLM. Honest about effort.
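
The $500/month ceiling at 100 RPS sustained becomes easier to design against once it is restated as a per-request budget. The arithmetic below assumes a 30-day month; the variable names are illustrative.

```python
RPS = 100
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30          # assumed month length
MONTHLY_BUDGET_USD = 500

# 100 requests/s sustained for a 30-day month.
requests_per_month = RPS * SECONDS_PER_DAY * DAYS_PER_MONTH

# Budget expressed per million requests, across the whole stack.
budget_per_million = MONTHLY_BUDGET_USD / (requests_per_month / 1_000_000)

# The same ceiling expressed per year.
annual_ceiling_usd = MONTHLY_BUDGET_USD * 12
```

That is about 259 million requests a month, or roughly $1.93 per million requests to cover edge, ingest, processing, storage, and serving combined.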

§ VIII · Getting Started

Three steps. Then begin.

The setup is intentionally lightweight. If you have a Linux-ish terminal, the gcloud CLI, and a fresh GCP free-trial account, you can begin Week 1 today. Set your billing budget alerts as Exercise 1 of Week 01 — before you provision anything.

# 1. Clone the curriculum repository
git clone https://github.com/CODE-CRUNCH-WORLDWIDE/C18-CRUNCH-GCP.git
cd C18-CRUNCH-GCP

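# 2. Authenticate and pin the course project
# (set YOUR_COURSE_PROJECT_ID to your own project ID first)
gcloud auth login                        # user credentials for the gcloud CLI
gcloud auth application-default login    # Application Default Credentials for SDKs and Terraform
gcloud config set project "$YOUR_COURSE_PROJECT_ID"
gcloud config set compute/region us-central1

# 3. Open Week 01 README and arm your billing budget alerts
${EDITOR:-vi} curriculum/week-01-resource-hierarchy/README.md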

Need the cost-expectations rundown or the free-trial guidance? See the README.

§ IX · Frequently Asked

Questions, anticipated.

What will this actually cost me on GCP?

Nearly everything runs inside the GCP $300 free trial (90 days for new accounts) and the always-free tier, provided you tear down nightly. A few exercises (multi-region Dataflow in Week 09, the Spanner labs in Week 11, the capstone failover drill in Week 15) cost a few dollars each. Total course out-of-pocket beyond the trial is $30–50 if you follow teardown discipline. Billing budget alerts are Exercise 1 of Week 01; skipping them is how the course punishes you later.

Does this prep me for the Google Professional Cloud Architect cert?

Yes, but as a side-effect, not the goal. The course covers ~90% of the PCA blueprint and ~85% of the Cloud DevOps Engineer blueprint. A diagnostic practice exam runs in Week 13 and a readiness gate (>= 70% to clear) in Week 15. You'll be ready to sit the test once you clear that gate, but C18 does not teach to the exam.

Do I need Kubernetes experience before starting?

Yes. C18 picks up where C15 (Crunch DevOps) left off and assumes you can already write a multi-stage Dockerfile, read kubectl describe pod output to debug a CrashLoopBackOff, write a Helm chart from scratch, and author a Terraform module with for_each and a remote backend. If you cannot do those, take C15 first; otherwise C18 will overwhelm you in week three.

How does this relate to C19 (Crunch AWS)?

They're sibling clouds. The platform-engineering muscles transfer directly — IAM, VPC, IaC, container orchestration, observability, on-call. After C18 you can complete C19 in ~10 weeks rather than 15. We don't recommend taking them in parallel; the vocabulary collision is unkind.

What does the on-call commitment look like?

One synthetic on-call drill in Week 14: you receive a paged incident, diagnose via Cloud Logging and Cloud Trace, mitigate, write the postmortem, and adjust the alert that fired. The drill is graded on postmortem quality. The capstone ships with a production-runbook.md covering five expected pager alerts, alert hygiene rules, error budgets, and the cohort's no-blame postmortem template.

What hardware and accounts do I need?

Any laptop running Linux, macOS, or Windows with WSL2; a terminal; the gcloud, kubectl, terraform (or tofu), and helm CLIs. One credit card on file for a fresh GCP free-trial account and a dedicated billing account for the course. A GitHub account for Workload Identity Federation deploys. No GPU required locally — Vertex AI handles the model serving in Week 12.

§ X · Begin

Fifteen weeks from now,
you will have shipped a multi-region system.

Open the repository. Read Week 1. Arm your billing budget. Then begin.