[Hiring] Designated Service Engineer - Ceph Expert @WEKA
Software Development
Salary unspecified
Remote Location
Employment Type full-time
Posted 1mth ago

1mth ago - WEKA is hiring a remote Designated Service Engineer - Ceph Expert. 💸 Salary: unspecified 📍Location: Argentina

Role Description

This is a customer-facing Premium Services role that combines deep, hands-on Ceph architecture and administration with the high-touch, outcome-driven approach of a Senior Designated Services Engineer (DSE). You will be the primary Ceph subject matter expert for assigned strategic customers and internal initiatives, owning the design, deployment, lifecycle operations, and performance of Ceph-based object storage environments. In parallel, you will play a key role in ensuring WEKA's Customer Success, contributing to our five-star Gartner reviews. You will work with cutting-edge technologies and top-tier customers, providing technical expertise and strengthening customer relationships.

Collaborating closely with Account Teams, you will gain deep insight into customers' business requirements, technical needs, and system environments. Your role involves resolving technical issues, bridging gaps between customers and Engineering, and ensuring the highest level of service.

Ceph Architecture & Operations

  • Architect, deploy, and operate large-scale production Ceph clusters supporting S3 with an emphasis on availability, performance, and operational simplicity.
  • Own cluster lifecycle activities: upgrades, patching, configuration management, routine health checks, and proactive risk remediation.
  • Troubleshoot complex issues across the Ceph stack, and lead incident response and root-cause analysis.
  • Establish and maintain runbooks, operational best practices, and customer-facing documentation; drive continuous improvement in reliability, observability, and automation.
  • Partner with customer teams on security and compliance requirements.
  • Advise on hardware and topology choices to meet workload requirements.

Designated Services Engineering

  • Serve as the primary technical liaison between customers and WEKA Engineering/Product to address feature gaps, reliability concerns, and documentation improvements.
  • Own, track, and document customer issues via the ticketing system; drive issues to resolution with clear, timely communication and executive-ready updates when needed.
  • Proactively monitor customer environments (Ceph and WEKA) using observability and remote monitoring tools to identify and remediate risks before they impact production.
  • Support account teams (Customer Success, Sales Engineering, Partners/Resellers) with deep technical expertise and credibility in front of senior customer stakeholders.
  • Contribute to knowledge sharing through internal and customer-facing documentation (FAQs, KB articles, runbooks) and repeatable troubleshooting playbooks.
  • Manage multiple engagements and cases concurrently, balancing urgency, impact, and long-term customer outcomes.
  • Participate in on-call and follow-the-sun support rotations as required; work occasional alternative hours (nights/weekends/holidays) and travel as needed.

Learning & Growth at WEKA

  • Ramp on WEKA’s architecture, tooling, and support model, and progressively take ownership of designated services engagements beyond Ceph.
  • Develop deeper expertise in S3-compatible object storage concepts and ecosystems (clients, load balancing, performance testing, multi-tenancy), with mentorship from WEKA SMEs.
  • Partner with internal teams to improve product supportability and operational excellence for object-storage use cases.

Qualifications

  • 10+ years in customer-facing technical roles solving complex enterprise infrastructure issues.
  • 5+ years of hands-on Ceph experience in production: cluster design, deployment, upgrades, and day-2 operations.
  • Strong understanding of Ceph internals and operational mechanics: MON quorum, MGR active/standby, OSD behavior, CRUSH and CRUSH maps, pools and placement groups (PGs), recovery/backfill and rebalancing.
  • Experience operating large-scale (multi-PB) Ceph environments and navigating the operational challenges of fleet size, PG scaling, and long-running recovery events.
  • Practical experience with Ceph RGW and S3 concepts (buckets, users/tenants, load balancing, scaling patterns, performance troubleshooting).
  • Expertise in Linux/Unix administration in multi-platform, distributed environments.
  • Strong troubleshooting skills across hardware, OS, networking, and distributed storage layers (including diagnosing performance bottlenecks and failure scenarios).
  • Deep understanding of networking (InfiniBand, Ethernet, DPDK, UCX), cloud computing, and distributed storage.
  • Experience with observability and monitoring stacks (Prometheus/Grafana and common log/metrics tooling).
  • Proficiency in Python and/or Bash; comfort building automation for monitoring, diagnostics, and repeatable operational tasks.
  • Excellent written and verbal communication skills, with the ability to explain complex technical topics to both technical and non-technical stakeholders.

Requirements

  • Experience with Kubernetes/Containers and/or cloud platforms (AWS, Azure, OCI, GCP) in storage-heavy environments.
  • Familiarity with Jira, Confluence, Slack, and collaborating across Support, Engineering, and Product teams.
  • Experience supporting HPC or AI/ML infrastructure (GPU clusters, high-throughput networking, performance benchmarking).
  • Experience with infrastructure-as-code or config management (Ansible, Terraform, etc.).
  • Strong technical writing skills and a habit of creating reusable runbooks and playbooks.

What Success Looks Like

  • Customers view you as a trusted technical leader for Ceph and object storage, and they proactively engage you for architectural guidance and operational reviews.
  • You can independently assess Ceph cluster health, identify risk, and lead remediation plans that improve availability and performance.
  • You deliver consistent, high-quality customer communications and drive issues to resolution while partnering effectively with internal teams (CS, Product, and Engineering).
  • You build durable artifacts (runbooks, dashboards, automation, postmortems) that raise the operational maturity of customer environments and WEKA’s supportability.
  • You ramp quickly on WEKA’s products and processes and expand your ownership beyond Ceph into broader designated services engagements.

The WEKA Way

  • We are Accountable: We take full ownership, always, even when things don’t go as planned. We lead with integrity, show up with responsibility, and hold ourselves and each other to the highest standards.
  • We are Brave: We question the status quo, push boundaries, and take smart risks when needed. We welcome challenges and embrace debates as opportunities for growth, turning courage into fuel for innovation.
  • We are Collaborative: True collaboration isn’t only about working together. It’s about lifting one another up to succeed collectively. We are team-oriented and communicate with empathy and respect. We challenge each other and resolve conflict constructively. We are transparent about our goals and results. And together, we’re unstoppable.
  • We are Customer Centric: Our customers are at the heart of everything we do. We actively listen and prioritize the success of our customers, and every decision we make is driven by how we can better serve, support, and empower them to succeed. When our customers win, we win.
Before You Apply
Be aware of the location restriction for this remote position: Argentina