Staff Software Engineer @Infinity
Software Development
Salary competitive sal..
Remote Location
Job Type full-time
Posted 2d ago

[Hiring] Staff Software Engineer @Infinity

2d ago - Infinity is hiring a remote Staff Software Engineer. 💸 Salary: competitive salary commensurate with experience 📍Location: USA timezones

Role Description

We're looking for a Staff/Principal Software Engineer to own and evolve the core platform that powers our AI employees. This is a technical leadership position responsible for the systems that enable our agents to scale reliably:

  • Django backend
  • Distributed task infrastructure
  • Event-driven architecture
  • Kubernetes deployments
  • Observability stack

You'll work across the full system—from database query optimization to Helm chart tuning to designing new platform abstractions. You'll be a force multiplier for the engineering team, driving architectural decisions, eliminating scaling bottlenecks, and establishing patterns that make the platform more robust and developer-friendly. This role reports to the Director of Engineering and involves significant autonomy in shaping technical direction.

What You'll Own

  • Drive platform architecture decisions and align the team on scalable patterns and long-term maintainability
  • Review a high volume of code, design docs, and architectural proposals for scalability, reliability, security, and operability
  • Be a technical mentor and force multiplier: unblock engineers, raise the bar on production readiness, and establish platform best practices
  • Own and evolve the core backend platform (Django/DRF/ASGI) performance and correctness
  • Scale async execution across Celery + Dramatiq + Temporal/Cortex; implement resilient workflow patterns (retries, circuit breakers, graceful degradation)
  • Optimize PostgreSQL/pgvector (query tuning, connection pooling) and caching strategies
  • Maintain and improve Kubernetes deployment infrastructure (GKE, Helm, Terraform/OpenTofu), CI/CD, and rollout strategies; own KEDA autoscaling policies and resource allocation across worker pools
  • Own reliability of RabbitMQ, Redis, and PostgreSQL infrastructure; lead incident response and post-mortems
  • Extend OpenTelemetry + Datadog instrumentation, dashboards, alerts, and SLOs; profile and reduce latency/memory bottlenecks

Qualifications

  • 10+ years building and operating production backend systems at scale
  • Deep expertise in Python (Django preferred) and relational databases (PostgreSQL)
  • Hands-on experience with Kubernetes, Helm, and cloud infrastructure (GCP preferred)
  • Strong background in distributed systems: message queues, event sourcing, workflow orchestration
  • Production experience with async task systems (Celery, Dramatiq, or similar)
  • Track record of debugging complex production issues across multiple services
  • Ability to work autonomously and drive technical initiatives without close supervision
  • Clear technical communication—able to explain tradeoffs and build consensus

Nice to Have

  • Experience with Temporal or similar workflow engines
  • Background in LLM infrastructure, RAG systems, or AI/ML platforms
  • Familiarity with OpenTelemetry, Datadog, or similar observability stacks
  • Experience with KEDA or other Kubernetes autoscaling solutions
  • Contributions to multi-tenant SaaS platform architecture
  • History of improving developer experience and platform abstractions

What Success Looks Like

  • Platform services maintain high availability with predictable performance under load
  • Scaling bottlenecks are identified and resolved proactively
  • New features ship faster because platform primitives are well-designed and documented
  • Incidents are rare, quickly detected, and thoroughly addressed
  • Engineers across the team adopt platform patterns and best practices
  • Technical debt is systematically identified and paid down
  • You're a trusted technical voice in architectural discussions

Compensation & Logistics

  • Compensation: Competitive salary commensurate with experience (Staff/Principal level)
  • Location: Remote
  • Type: Full-time
  • Requirements: Overlap with Americas timezones for collaboration; reliable high-speed internet