Role Description
A DevOps Engineer is a developer who thinks deeply about systems and how they behave in the wild. Your primary focus will be defining, building, and maintaining our robust, observable, and scalable infrastructure. You will collaborate closely with development teams to ensure seamless integration and deployment. Your objective at Benzinga is to create a reliable, high-performing platform that supports the amazing products our users have come to know and love.
Responsibilities
- Infrastructure Responsibilities
  - Radiate knowledge about the service's infrastructure and reliability to the rest of the development team.
  - Identify parts of the system that do not scale, provide immediate palliative measures, and drive systemic resolution of contributing root cause(s).
  - Plan the growth of Benzinga's infrastructure.
Development/Deployment Responsibilities
-
Document every action so your learnings turn into repeatable actions and then into automation.
-
Improve the deployment process to make it as boring as possible.
-
Define, provision, and manage our production infrastructure using Kubernetes and cloud-native serverless deployed by way of Terraform.
-
Security Responsibilities
-
Proactively identify and reduce security risks, in alignment with ongoing SOC2 auditing and reporting.
-
Develop security training and guidance to internal development teams.
-
Ability to discover and patch SQLi, XSS, CSRF, SSRF, authentication and authorization flaws, and other web-based security vulnerabilities.
-
Knowledge of common authentication technologies including OAuth, SAML, CAs, OTP/TOTP.
-
Production Responsibilities
-
Design, build and maintain core infrastructure pieces that allow Benzinga to scale, supporting thousands of concurrent users.
-
Be on an on-call rotation to respond to benzinga.com availability incidents and provide support for service engineers with customer incidents.
-
Debug production issues across all services and levels of the stack.
-
Monitoring Responsibilities
-
Make monitoring and alerting notify on symptoms and not on outages.
-
Manage day-to-day maintenance and evolution of Benzinga's Prometheus monitoring and alerting infrastructure.
-
Bundle Prometheus monitoring as an out-of-the-box monitoring solution for Benzinga products.
-
Build and maintain the benzinga.com public monitoring gateway.
-
Help migrate our current performance monitoring solution to Prometheus.
-
Improve coverage of Benzinga performance monitoring.
-
Create automated alerts to notify team members of regression.
Qualifications
- Strong communication skills.
- Self-motivated with strong organizational skills.
- Experience with some of these technologies is a must: AWS/GCP, Kubernetes, Terraform, CI/CD, OpenSearch/Elasticsearch, Postgres, MySQL, Kafka, BigQuery, Python, NodeJS, Go, Java, Prometheus, Grafana, Coralogix, Varnish, Nginx, Kong.
- You can reason about software, algorithms, and performance from a high level.
- You have experience thinking about systems: edge cases, failure modes, behaviors, and specific implementations.
- You have worked with distributed systems and have a solid understanding of how modern web stacks are built, and why.
- You know your way around a *nix shell.