Role Description
You will be the primary owner of Mad Monkey's cloud infrastructure, deployment pipelines, and platform security. You will work closely with the Head of Technology and the engineering team to harden our current environment, move the platform to a properly isolated network architecture, and build the operational foundation that the product team builds on.
In your first 30 days, you will:
- Audit and remediate the most critical open infrastructure security items
- Establish visibility into all running workloads across Kubernetes, DigitalOcean droplets, and Cloudflare
- Get familiar with our CI/CD pipelines, deployment processes, and environment configuration
The role can be based anywhere in Southeast Asia.
Key Responsibilities
- Infrastructure & Cloud
  - Own and operate all DigitalOcean infrastructure: Kubernetes cluster (3-node SGP1), managed droplets, VPC configuration, firewall rules, and databases
  - Manage Cloudflare configuration across all zones: DNS, proxying, Zero Trust access, SSL/TLS, WAF rules, and Workers
  - Drive the VPC migration to move internal services off public network exposure
  - Maintain and improve infrastructure-as-code practices (Terraform or equivalent)
  - Manage all environment secrets, API key rotation schedules, and credential hygiene across services
- Security
  - Own the infrastructure security posture end-to-end: firewalls, network segmentation, access control, secrets management
  - Implement and maintain Zero Trust access for internal tools (n8n, Plane, Hasura, Kubernetes dashboard)
  - Define and enforce Cloudflare security policies across all domains
  - Establish server-level monitoring and intrusion detection
  - Conduct regular reviews of open ports, service exposure, and dependency vulnerabilities
  - Own the incident response process for infrastructure-level security events
- Kubernetes & Containers
  - Manage and harden the Kubernetes cluster: RBAC, network policies, pod security standards, ingress configuration
  - Build and maintain Docker images and container registries
  - Define resource requests/limits, HPA policies, and cluster autoscaling
  - Implement proper secrets management within the cluster (Sealed Secrets, External Secrets, or equivalent)
- CI/CD & Developer Experience
  - Own and improve CI/CD pipelines for all services (backend API, Next.js web app, mobile app builds)
  - Reduce deployment friction for the engineering team while maintaining gates for security and quality
  - Manage environment promotion across development, staging, and production
- Reliability & Observability
  - Implement and maintain monitoring, alerting, and log aggregation across all services
  - Define and report on uptime and error-rate metrics for critical services
  - Own backup schedules and disaster recovery procedures for databases and stateful services
  - Lead post-mortems on infrastructure incidents and drive preventative improvements
Qualifications
- 3-6 years of hands-on cloud infrastructure experience in a production environment
- Strong Kubernetes experience: cluster administration, RBAC, network policies, ingress, Helm
- Proven cloud security mindset: firewall rules, VPC design, secrets management, least-privilege access
- Experience with DigitalOcean, AWS, GCP or equivalent cloud providers
- Cloudflare configuration: DNS, proxying, SSL/TLS, WAF, Zero Trust Access
- Comfortable working across Linux servers, shell scripting, and infrastructure automation
- Experience managing CI/CD pipelines (GitHub Actions, GitLab CI, or equivalent)
Requirements
- Infrastructure as Code experience (Terraform, Pulumi, or equivalent)
- Experience hardening Kubernetes clusters in production environments
- Familiarity with VPN/Zero Trust tooling (Cloudflare Access, Tailscale, WireGuard)
- Monitoring and observability stack experience (Grafana, Prometheus, Wazuh, or equivalent)
- Understanding of PostgreSQL administration, backup strategy, and connection pooling
- Experience with RabbitMQ or other message brokers in production
- Knowledge of container security scanning and supply chain hardening
Nice to Have
- Experience in a startup or small engineering team where you were the primary infrastructure owner
- Familiarity with n8n or similar workflow automation platforms
- Security certifications (CKS, AWS Security Specialty, or equivalent)
Benefits
- Work with an international team of passionate, purpose-driven people.
- Be part of a fast-growing global travel brand with a social impact mission.
- Enjoy travel perks across our hostels and the chance to see your ideas come to life.
- Flexible work setup and a culture that values creativity, adventure, and community.