[Hiring] Staff Database Platform Reliability Engineer @Rackspace Technology
Salary: unspecified
Location: Remote (India)
Employment Type: full-time
Posted: 4d ago

4d ago - Rackspace Technology is hiring a remote Staff Database Platform Reliability Engineer. 💸 Salary: unspecified 📍 Location: India

Role Description

We are seeking an experienced SRE/DBRE to ensure reliability, performance, scalability, and operational excellence of our multi-cloud DBaaS platform across:

  • Microsoft Azure
  • Amazon Web Services
  • Google Cloud Platform

This role combines deep database expertise with SRE principles to build highly available, automated, and resilient database platforms. The DBRE Lead will drive operational standards, automation frameworks, and reliability engineering practices across distributed cloud environments.

Qualifications

  • 8-10+ years in DBA / Platform Engineering
  • Strong multi-cloud experience (Azure / AWS / GCP, at least two)
  • Deep HA/DR & performance tuning expertise
  • Automation-first mindset (Terraform, scripting, CI/CD)
  • Experience in SaaS/DBaaS environments preferred

Requirements

  • Database Administration (DBA) Skills
    • Primary Database: MySQL
    • Secondary Databases: PostgreSQL, SQL Server
    • Database Backup & Recovery: Tools and strategies for database backups and disaster recovery.
    • Performance Tuning: Query optimization, indexing strategies, and database performance troubleshooting.
    • Database Security: User management, roles, access control, and auditing.
  • Cloud Infrastructure Knowledge (DBaaS)
    • Cloud Platforms: AWS (RDS, Aurora), Azure (Cosmos DB, SQL Database), GCP (Cloud SQL, Firestore).
    • Infrastructure as Code (IaC): Terraform, CloudFormation, Kubernetes.
    • Kubernetes & Containers: Running databases in containers, orchestrated with platforms such as Kubernetes.
    • Observability Tools: ELK stack (Elasticsearch, Logstash, Kibana).
    • Database Migration: Migrating databases across different platforms or cloud environments.
    • Database Scaling: Vertical and horizontal scaling techniques in cloud environments.
  • SRE Principles (Site Reliability Engineering)
    • Incident Management: Handling database outages, incident response, and on-call rotations.
    • Monitoring and Alerting: Tools like Prometheus, Grafana, Datadog, CloudWatch.
    • Service Level Objectives (SLOs) / Service Level Agreements (SLAs): Ensuring uptime and performance targets.
    • Disaster Recovery Planning: Ensuring high availability (HA) and disaster recovery (DR) solutions.
  • Scripting and Automation
    • Scripting Languages: Python, Shell scripting, Bash, PowerShell.
    • Automation Tools: Ansible, Puppet, Chef.
    • Infrastructure Automation: Automating database deployment, patching, and scaling.
  • Networking and Infrastructure
    • Networking Basics: TCP/IP, DNS, Firewall, Load Balancers.
    • Database Connectivity: Connection pooling, failover strategies, and multi-region deployment.
    • Storage and Disk Management: Understanding IOPS, latency, and throughput.
  • Expertise in Linux OS (RHEL, Ubuntu, CentOS)
    • Understanding of file systems (ext4, XFS, etc.), permissions, and ownership (chmod, chown, ACLs).
    • Knowledge of process monitoring, management, and troubleshooting (ps, top, htop, kill, pkill, etc.).
    • Proficiency with tools like top, htop, vmstat, iostat, sar, and dstat to monitor CPU, memory, disk I/O, and network usage.
    • Ability to analyze system logs (/var/log/, journalctl, dmesg) for troubleshooting.
    • Understanding of resource limits (CPU, memory, disk, network) and how they impact database performance.
    • Knowledge of partitioning tools (fdisk, parted) and file system management (mkfs, mount, umount).
    • Understanding of RAID configurations and Logical Volume Management (LVM) for storage scalability.
  • Troubleshooting and Debugging
    • Log Analysis: Reading and analysing database and system logs.
    • Root Cause Analysis (RCA): Performing in-depth analysis after major incidents.
    • Query Performance: Analysing slow queries, deadlocks, and resource contention.
  • Soft Skills
    • Communication Skills: Clear communication with stakeholders and engineering teams.
    • Problem-Solving: Ability to troubleshoot complex database issues under pressure.
    • Collaboration: Working closely with DevOps, Infrastructure, and Engineering teams.

Company Description

We are the multicloud solutions experts. We combine our expertise with the world's leading technologies (across applications, data and security) to deliver end-to-end solutions. We have a proven record of advising customers based on their business challenges, designing solutions that scale, building and managing those solutions, and optimizing returns into the future.

Named a best place to work year after year by Fortune, Forbes and Glassdoor, we attract and develop world-class talent. Join us on our mission to embrace technology, empower customers and deliver the future.

Before You Apply

Be aware of the location restriction for this remote position: India