
I design, build, automate, and operate reliable cloud infrastructure.
As a DevOps / Site Reliability Engineer, I work across cloud platforms, CI/CD automation, and containerized, production-grade Kubernetes environments to turn complex deployments into repeatable, automated workflows. I enjoy building platforms teams can trust — with strong observability, secure configurations, and minimal manual effort.
My goal is simple: fewer outages, faster releases, and calm production systems.
July 2025 – Present
Migrated BFSI applications from VMware/Tanzu to Azure Kubernetes Service (AKS).
Oversaw Terraform-based infrastructure provisioning using reusable modules. Managed the setup of AKS clusters, networking, environments, and remote state so that infrastructure changes stayed consistent and controlled.
Developed and maintained reusable GitHub Actions workflows for diverse application stacks. Implemented automated continuous integration processes, including SonarQube scanning, pre-commit validation, image scanning, and secure build pipelines.
Ran Helm-based Kubernetes deployments and managed ArgoCD for GitOps-driven continuous deployment. Supported end-to-end application deployment on AKS, from infrastructure provisioning to production rollout.
Ensured that infrastructure and deployments were automated, secure, and repeatable.
September 2023 – July 2025
Built and maintained CI/CD pipelines using Azure DevOps, handling end-to-end deployments of applications running on Azure App Service.
Implemented a canary deployment strategy to release new features safely with minimal production risk.
Owned the observability stack by consolidating monitoring using Prometheus, Grafana, OpenTelemetry, and ELK. Improved system visibility and reduced monitoring overhead by centralizing logs and metrics.
Automated cost optimization using Python scripts to identify and clean up unused Azure resources, reducing unnecessary cloud spend.
Handled production issues, troubleshooting live incidents and resolving performance bottlenecks to maintain platform stability.
Tools and technologies I use to build scalable, reliable systems
Google · Issued May 2025
Credential ID: 145046407
Skills: Site Reliability Engineering · Monitoring · Incident Management · Alerting
Microsoft · Issued Feb 2025 · Expires Feb 2027
Credential ID: 213A8DA82568A82A
Skills: Version Control · Azure DevOps · GitHub Actions · Continuous Integration and Continuous Delivery (CI/CD)
Microsoft · Issued Aug 2024 · Expires Aug 2026
Credential ID: B63507BCDC698910
Skills: Microsoft Azure · Compute · Storage · IAM · Networking