Senior DevOps / SRE

Secure, reliable infrastructure for cloud platforms, custody systems, and high-stakes production operations.

I’m Penghian Ang, a Senior Cold Storage Engineer at Coinbase with 7+ years across DevOps, SRE, and blockchain infrastructure. I build systems that stay calm under operational pressure, from institutional custody workflows to large-scale game infrastructure and validator fleets.

Production scale

Supported infrastructure serving 4M–10M daily concurrent users and 6,000+ VMs in live production environments.

Automation and reliability

Built automation for restoration workflows, alerting, deployments, and validator operations to reduce human error and accelerate response.

High-stakes systems

Owned work spanning institutional custody, staking infrastructure, incident response, security hardening, and observability.

Selected Impact

Fast proof that the work translates to senior infrastructure roles.

6,000+ VMs

Owned live infrastructure at scale

Served as a production point of contact for PUBG Mobile infrastructure across a large VM estate.

4M–10M

Supported massive daily concurrency

Worked on reliability for environments used by millions of daily concurrent players.

26+ metrics

Improved validator observability

Built a custom exporter and dashboards to make Solana validator health measurable and actionable.

600+ → 140

Reduced alert noise

Reclassified noisy monitoring rules into a smaller set of actionable alerts for operators.

Experience

Recent roles with the strongest signal for senior DevOps and SRE hiring.

Focused on reliability, automation, infrastructure hardening, and blockchain operations.

Site Reliability Engineer

Valigator

Jun 2025 – Oct 2025

  • Built Solana staking infrastructure with Ansible automation to provision and operate validator fleets more consistently.
  • Developed a custom Solana exporter with 26+ validator metrics and Grafana dashboards to tighten observability for operators.

Solana · Ansible · FastAPI · Prometheus · Grafana

Senior Infrastructure Support Engineer

Thoughtworks

Jun 2024 – Sep 2025

  • Led an enterprise AWS infrastructure build for SNTC.org.sg using Terraform and CI/CD pipelines.
  • Managed deployments, monitoring, and security hardening for client-facing production systems.

AWS · Terraform · CI/CD · Production Operations

Senior Blockchain Security DevOps Engineer

Crypto.com

Dec 2023 – Dec 2024

  • Built secure infrastructure to monitor and protect blockchain systems across multiple protocols and environments.
  • Standardized validator and RPC deployments by consolidating Cosmos-family playbooks into a reusable modular template.

Blockchain Security · Ansible · Validators · RPC Operations

DevOps Engineer

Tencent

May 2022 – Dec 2023

  • Served as a production point of contact for PUBG Mobile infrastructure supporting 6,000+ VMs and 4M–10M daily concurrent users.
  • Reduced alert noise from 600+ to 140 actionable alerts and built alert automation later used as a reference by Tencent Overseas Games.

Large-Scale Ops · FastAPI · Alerting · Monitoring

Earlier Experience

  • UP DevLabs — DevOps Engineer (Oct 2021 – May 2022): Managed full-lifecycle environments across AWS, Aliyun, and Tencent Cloud, and introduced self-service log search on ELK.
  • Kenrich Partners — System Analyst (Jul 2021 – Oct 2021): Responded to phishing and compliance incidents, then helped migrate on-prem infrastructure to Azure.
  • GovTech — SNOC Engineer (Nov 2020 – Jul 2021): Supported uptime for critical government systems with AWS CloudWatch, Grafana, Splunk, and SolarWinds.
  • Netpluz Asia — NOC Engineer (Feb 2020 – Nov 2020): Provisioned and troubleshot circuits and built Python tools that reduced manual reporting by 80%.

Projects

Three projects that show how I approach infrastructure, automation, and observability.

solana-ansible-kit

Problem
Validator provisioning and upgrades were operationally fragile and too dependent on manual setup.
Solution
Built an Ansible-driven workflow that takes Solana validator fleets from bare Linux to hardened production-ready nodes.
Result
Created a reusable operating model for repeatable deployments, safer upgrades, and stronger baseline security.

Ansible · Solana · Linux · Security Hardening

solana-repro-builds

Problem
Validator binary delivery needed stronger reproducibility and trust guarantees across releases.
Solution
Implemented a CI/CD pipeline with hermetic Docker builds and checksum verification for Solana validator binaries.
Result
Made release validation more repeatable and easier for operators to verify before rollout.

Docker · CI/CD · Solana · GitHub Actions
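
A minimal sketch of the checksum verification step, assuming SHA-256 digests published alongside each release artifact. The function names (`sha256_of`, `verify_artifact`) are illustrative and not taken from the project's pipeline.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with the published one (hypothetical helper)."""
    # Normalize the published value, then compare in constant time.
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

An operator would call `verify_artifact` on the downloaded binary before rollout and stop the upgrade if it returns `False`.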

solana-exporter

Problem
Validator operators lacked clean, actionable visibility into node health and protocol-specific metrics.
Solution
Built a FastAPI exporter that publishes 26+ Prometheus metrics and pairs them with a focused Grafana dashboard.
Result
Turned raw validator state into a monitoring surface that is easier to operate and troubleshoot.

FastAPI · Prometheus · Grafana · Python

This resume site is also an infrastructure project: static hosting on AWS S3, Terraform IaC, and Jenkins CI/CD.

Skills

Capabilities I use most often in production infrastructure work.

Infrastructure

  • Terraform
  • Kubernetes
  • Docker
  • Helm
  • Linux

Cloud

  • AWS
  • GCP
  • Azure
  • Tencent Cloud

Observability

  • Datadog
  • Prometheus
  • Grafana
  • ELK
  • Splunk

Automation

  • Ansible
  • Python
  • Bash
  • Jenkins
  • GitHub Actions
  • FastAPI

Blockchain

  • Solana
  • TON
  • Conflux
  • Oasis
  • Neutron
  • ZetaChain

Certifications

Validated across cloud, Kubernetes, infrastructure as code, and security.

Amazon Web Services

  • AWS Solutions Architect – Associate
  • AWS Cloud Practitioner

Kubernetes & IaC

  • Certified Kubernetes Administrator
  • HashiCorp Terraform Associate

Microsoft

  • Azure Fundamentals

Tencent Cloud

  • Solutions Architect Associate
  • SysOps Associate
  • Cloud Practitioner

Security

  • Sophos Certified Engineer

Career Timeline

2020 NOC Engineer Netpluz
2020 SNOC Engineer GovTech
2021 System Analyst Kenrich
2021 DevOps UP DevLabs
2022 DevOps Tencent
2023 Sr DevOps Crypto.com
2024 Sr Infra Thoughtworks
2025 SRE Valigator
2025 Sr Engineer Coinbase