Professional Summary

Cloud and DevOps engineer specializing in AWS, Infrastructure as Code, and CI/CD automation. Designs and implements scalable, secure cloud architectures with Terraform, GitHub Actions, and a serverless-first approach. AWS Certified Solutions Architect – Associate, with a focus on FinOps, observability, and production reliability.

Technical Skills

Cloud Platforms

  • Amazon Web Services (AWS)
  • S3, Lambda, API Gateway, CloudFront
  • ECS Fargate, EKS, RDS, DynamoDB
  • VPC, IAM, WAF, Secrets Manager
  • CloudWatch, EventBridge, Bedrock

Infrastructure as Code

  • Terraform (modular, multi-environment)
  • Remote state & workspace management
  • CloudFormation

DevOps & CI/CD

  • GitHub Actions (OIDC, multi-stage)
  • Docker & container orchestration
  • Kubernetes (EKS + Karpenter)
  • Git — branching, PRs, code review

FinOps & Cost Optimization

  • Automated resource cleanup pipelines
  • Monthly savings reporting
  • Right-sizing & Fargate Spot
  • RDS auto-pause & log retention

Security & Observability

  • IAM least-privilege design
  • VPC isolation, WAF, SSL/TLS
  • CloudWatch, Prometheus, Grafana
  • Health checks & alerting

Languages & Config

  • Python, Bash scripting
  • HCL (Terraform), YAML
  • Linux systems administration

Featured Projects

AI-Assisted SRE Incident Analysis System

  • Step Functions
  • Bedrock
  • Lambda
  • Terraform
  • Python 3.11

Event-driven AWS serverless pipeline that automatically detects infrastructure issues via CloudWatch Alarms, runs three parallel data collectors (metrics, logs, deploy context), and feeds the results to Amazon Bedrock (Claude) to generate root-cause hypotheses with confidence levels — advisory-only, no auto-remediation.

  • Parallel fan-out via Step Functions: Metrics Collector, Logs Collector & Deploy Context Collector run simultaneously to minimise latency
  • AI analysis with Bedrock (Claude): root-cause hypotheses, supporting evidence, and actionable remediation recommendations
  • Multi-channel notifications: Slack webhooks + SNS email with rich formatting
  • 90-day DynamoDB incident store with TTL, queryable by resource ARN, severity, or time range
  • Full observability: structured JSON logging, X-Ray distributed tracing, custom CloudWatch metrics
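
The parallel fan-out above maps naturally onto a Step Functions `Parallel` state. The following is a minimal sketch that builds such a definition as a Python dictionary. The state names, the account ID, and every Lambda ARN are hypothetical placeholders, not values from the project.

```python
import json

# Hypothetical ARNs for illustration only; the project's real function names are not given here.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"
COLLECTORS = {
    "MetricsCollector": ARN_PREFIX + "metrics-collector",
    "LogsCollector": ARN_PREFIX + "logs-collector",
    "DeployContextCollector": ARN_PREFIX + "deploy-context-collector",
}
ANALYZER_ARN = ARN_PREFIX + "bedrock-analyzer"


def build_definition(collectors, analyzer_arn):
    """Return an Amazon States Language definition with one branch per collector."""
    branches = [
        {
            "StartAt": name,
            "States": {name: {"Type": "Task", "Resource": arn, "End": True}},
        }
        for name, arn in collectors.items()
    ]
    return {
        "StartAt": "CollectContext",
        "States": {
            # All branches start together; the state completes when every branch has finished.
            "CollectContext": {
                "Type": "Parallel",
                "Branches": branches,
                "Next": "AnalyzeWithBedrock",
            },
            "AnalyzeWithBedrock": {
                "Type": "Task",
                "Resource": analyzer_arn,
                "End": True,
            },
        },
    }


definition = build_definition(COLLECTORS, ANALYZER_ARN)
print(json.dumps(definition, indent=2))
```

A `Parallel` state returns an array with one entry per branch, in branch order, so the analysis step in this sketch would receive the three collector outputs as a single list.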

MCP DevOps Mentor

  • Docker
  • Python
  • MCP
  • GitHub API
  • SQLite

Dockerized MCP (Model Context Protocol) server that acts as a senior DevOps & Cloud mentor inside your IDE. Parses live GitHub repos, CI/CD pipelines, and Terraform files to give structured, production-grade feedback — teaching how to think like a DevOps engineer, not just what to do.

  • 11 MCP tools: repo analysis, CI/CD review, Terraform audit, AWS cost advisor, skill tracker, learning path engine
  • 4 review modes — Mentor, Review, Debug, Interview — each changes tone and depth of feedback
  • CI/CD engine checks: OIDC vs long-lived creds, action pinning, missing permissions blocks, job timeouts, concurrency groups
  • Terraform engine (HCL2 parsing): flags hardcoded secrets, missing remote backend, IAM wildcards, S3 without encryption
  • Skill tracker persists progress in SQLite and generates a personalised learning roadmap
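
To illustrate the kind of CI/CD checks listed above, here is a minimal sketch that scans workflow text with regular expressions. It is not the project's engine: the function name, the finding messages, and the sample workflow are assumptions, and a real implementation would more likely parse the YAML properly.

```python
import re

# Matches "uses: owner/repo@ref" steps; local ("./path") and docker:// references are skipped.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([\w.\-/]+)@(\S+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def audit_workflow(text):
    """Return a list of human-readable findings for one workflow file (hypothetical helper)."""
    findings = []
    for action, ref in USES_RE.findall(text):
        # A tag or branch can be moved later; a full commit SHA cannot.
        if not FULL_SHA_RE.match(ref):
            findings.append(f"unpinned action: {action}@{ref}")
    if not re.search(r"^\s*permissions:", text, re.MULTILINE):
        findings.append("no permissions block")
    if "timeout-minutes:" not in text:
        findings.append("no job timeout")
    if not re.search(r"^\s*concurrency:", text, re.MULTILINE):
        findings.append("no concurrency group")
    return findings


# Sample workflow for the demo; the 40-character SHA is made up.
SAMPLE = """\
name: deploy
on: push
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@0123456789abcdef0123456789abcdef01234567
"""

findings = audit_workflow(SAMPLE)
for finding in findings:
    print(finding)
```

In this sample, only the `actions/checkout@v4` step is reported as unpinned, because the second step already references a full-length commit SHA.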

Aura — AI-Driven EKS GPU Autoscaler

  • EKS
  • Karpenter
  • Terraform
  • Kubernetes
  • GitHub Actions

Modular Terraform + GitHub Actions system for ephemeral AWS EKS clusters that spin up on demand, auto-scale GPU nodes with Karpenter, run batch AI workloads (Kubernetes Jobs), collect results, then self-destruct — eliminating idle cluster costs entirely.

  • Full lifecycle orchestration: Deploy → Run → Collect → Destroy via a single CI/CD pipeline
  • Karpenter-managed CPU and GPU NodePools for LLM inference workloads
  • Modular IaC: VPC, EKS, IAM, IAM_EKS, and OIDC provider as independent Terraform modules
  • OIDC identity federation — GitHub Actions as trusted identity provider, zero static AWS keys
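
The OIDC federation point can be made concrete with the IAM trust policy that lets a GitHub Actions workflow assume a role without stored keys. The sketch below builds that policy as a Python dictionary. The account ID and repository name are hypothetical, and restricting the role to a single branch is an assumption rather than a detail confirmed by the project.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id, repo, ref="refs/heads/main"):
    """Build a role trust policy for GitHub Actions OIDC (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience used when the workflow requests credentials from AWS STS.
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        # Only workflows running from this repository and ref may assume the role.
                        f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:ref:{ref}",
                    }
                },
            }
        ],
    }


# Placeholder account ID and repository name, for illustration only.
policy = github_oidc_trust_policy("123456789012", "example-org/aura")
print(json.dumps(policy, indent=2))
```

Scoping the `sub` claim to one repository and ref keeps other repositories, forks, and branches from assuming the same role.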

FinOps Zombie Hunter

  • Lambda
  • EventBridge
  • Terraform
  • Python 3.12
  • GitHub Actions

Automated AWS cost-optimization engine that hunts orphaned "zombie" resources (EBS volumes, NAT Gateways, RDS instances, Elastic IPs) across every enabled region, triggered by EventBridge on a weekly Sunday cron schedule.

  • CloudWatch metrics-based detection — flags resources by actual usage patterns, not just status
  • Cross-region: dynamically fetches all enabled AWS regions and audits each independently
  • Shift-left security: tfsec + flake8 in CI pipeline before any deployment
  • Modular IaC using Terraform moved blocks for zero-destroy refactoring
  • Sample output: {"estimated_monthly_savings": "$347.50"}
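
As a rough illustration of usage-based detection and the savings report, the sketch below flags EBS volumes that show no I/O over a lookback window and sums their monthly cost. The input shape, the `io_ops_14d` field, and the price per GiB are assumptions made for this example. Real pricing varies by volume type and region, and the project's own detection rules are not shown here.

```python
from decimal import Decimal

# Assumed flat price per GiB-month, for illustration only.
ASSUMED_PRICE_PER_GIB = Decimal("0.08")


def is_zombie(volume):
    """Treat a volume as a zombie when its CloudWatch datapoints show zero I/O operations."""
    return sum(volume["io_ops_14d"]) == 0


def estimate_savings(volumes):
    """Return a report in the same shape as the sample output above."""
    total = sum(
        (Decimal(v["size_gib"]) * ASSUMED_PRICE_PER_GIB for v in volumes if is_zombie(v)),
        Decimal("0"),
    )
    return {"estimated_monthly_savings": f"${total:.2f}"}


# Hypothetical inventory: one record per volume with 14 daily I/O operation counts.
VOLUMES = [
    {"id": "vol-idle-1", "size_gib": 100, "io_ops_14d": [0] * 14},
    {"id": "vol-busy", "size_gib": 500, "io_ops_14d": [1200, 950] + [0] * 12},
    {"id": "vol-idle-2", "size_gib": 50, "io_ops_14d": [0] * 14},
]

report = estimate_savings(VOLUMES)
print(report)
```

`Decimal` is used instead of `float` so that the currency total is formatted without binary rounding surprises.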

Enterprise 3-Tier Containerized App on AWS

  • ECS Fargate
  • RDS PostgreSQL
  • React
  • Node.js
  • Terraform
  • Docker

Production-ready containerized web app: React (TypeScript + Nginx) frontend, Node.js (Express) backend, and PostgreSQL 15 on RDS Multi-AZ — deployed on ECS Fargate behind an ALB with full VPC isolation, enterprise security, and complete observability.

  • 3-tier: ALB → ECS Fargate frontend → ECS Fargate backend → RDS PostgreSQL (private subnet)
  • Security: Secrets Manager, WAF, SSL/TLS termination at ALB, security groups per tier
  • Observability: Prometheus metrics, CloudWatch logs, health checks, and alerting
  • Cost-optimised: Fargate Spot, RDS auto-pause, log retention policies, right-sizing
  • Performance targets: <2s page load, <500ms API response, 99.9% uptime
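
To put the 99.9% uptime target in concrete terms, the short calculation below converts an availability percentage into a downtime budget. The 30-day window is an assumption; the resume does not state the measurement period.

```python
from fractions import Fraction


def downtime_budget_minutes(availability_percent, days=30):
    """Return the allowed downtime in minutes for a given availability target.

    availability_percent is passed as a string (for example "99.9") so that
    Fraction can represent it exactly.
    """
    unavailable = 1 - Fraction(availability_percent) / 100
    return float(unavailable * days * 24 * 60)


budget = downtime_budget_minutes("99.9")
print(f"{budget:.1f} minutes of downtime allowed per 30 days")
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes of downtime per 30-day month.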

Cloud Resume Challenge (This Website)

  • S3
  • CloudFront
  • Lambda
  • DynamoDB
  • Terraform

Serverless resume website with a live visitor counter. Static assets served from S3 via CloudFront OAC, visitor count stored in DynamoDB and incremented atomically by a Python Lambda behind API Gateway — full IaC with modular Terraform, OIDC-based CI/CD, and custom domain at www.ericchiu.page.
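
The atomic increment can be sketched with a DynamoDB `ADD` update expression, which applies the increment on the server side, so concurrent requests do not overwrite each other. In the sketch below, the key name `id`, the attribute name `visits`, and the `FakeTable` stand-in are assumptions for illustration. In Lambda, `table` would be a boto3 `Table` resource.

```python
import json


def increment_visits(table):
    """Atomically add 1 to the counter and return an API Gateway style response."""
    response = table.update_item(
        Key={"id": "resume"},                       # hypothetical partition key
        UpdateExpression="ADD #v :inc",             # ADD creates the attribute if it is missing
        ExpressionAttributeNames={"#v": "visits"},  # hypothetical attribute name
        ExpressionAttributeValues={":inc": 1},
        ReturnValues="UPDATED_NEW",
    )
    count = int(response["Attributes"]["visits"])
    return {"statusCode": 200, "body": json.dumps({"visits": count})}


class FakeTable:
    """In-memory stand-in so the sketch runs without AWS credentials."""

    def __init__(self):
        self.visits = 0

    def update_item(self, **kwargs):
        self.visits += kwargs["ExpressionAttributeValues"][":inc"]
        return {"Attributes": {"visits": self.visits}}


table = FakeTable()
first = increment_visits(table)
second = increment_visits(table)
print(second["body"])
```

Doing the increment inside a single `update_item` call avoids the read-modify-write race that a separate get and put would introduce.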

Professional Experience

Assistant Manager

OneZo, 2018 – 2021
  • Team Leadership: Managed daily operations and supervised a team of 6–10 staff, coordinating task assignments, scheduling, and performance feedback to maintain consistent service quality
  • Operational Decision-Making: Acted as the on-site decision-maker in the manager’s absence, triaging issues in real time and keeping operations running smoothly — a mindset directly applicable to incident response and on-call workflows
  • Process Improvement: Identified bottlenecks in daily workflows and implemented streamlined procedures for inventory, opening/closing, and order fulfillment — mirroring continuous improvement principles in DevOps
  • Training & Mentorship: Onboarded and trained new team members on SOPs, food safety standards, and customer service best practices — building habits around documentation and knowledge transfer
  • Cross-functional Coordination: Collaborated with the owner on supply chain logistics, staffing plans, and quality control, balancing competing priorities under tight timelines

Shift Leader

Teazzi, 2021 – 2024
  • Shift Ownership: Led end-to-end shift operations including opening/closing, cash handling, and quality assurance — taking full accountability for outcomes, similar to owning a deployment pipeline from build to production
  • Team Coordination: Directed a team of 4–6 baristas during high-volume periods, delegating tasks and adjusting roles on the fly to maintain throughput and service standards
  • Problem Solving Under Pressure: Resolved equipment failures, supply shortages, and customer escalations in real time while keeping the team focused — building the composure and triage skills essential for production incident management
  • Communication & Collaboration: Maintained clear handoff notes between shifts and communicated operational updates to management, reinforcing the structured communication habits valued in DevOps and SRE teams

Education & Certifications

Computer Engineering

University of Massachusetts Boston

Certifications

  • AWS Certified Solutions Architect – Associate (SAA-C03)
  • AWS Certified Cloud Practitioner
  • HashiCorp Terraform Associate — In Progress
