Are you ready to join an established (but also growing and innovating!) tech company that’s making waves in the eLearning space? Are you ready to improve the continuing education process for millions of nurses, physicians, surgeons, attorneys, and other professionals across the country? We’re looking for a talented and experienced fully remote DevOps Engineer for our Path LMS team. Yes – we mean 100% remote, 100% of the time, no backsies. If you’re ready, then our team of 100+ seasoned yet friendly professionals is ready to welcome and collaborate with you!
A Little About Our Architecture
- Ruby / Rails
- PostgreSQL 12 / Redis / MongoDB
- Sentry, Coralogix, Better Uptime for logging/alerting/monitoring/error handling
- Atlassian suite of productivity and project management tools (JIRA, Confluence, etc.)
- GitHub for source control
- Heroku PaaS and AWS for app hosting, DB, etc.
- Docker / Kubernetes for app containerization
- CircleCI / ArgoCD for Build Pipelines and Kubernetes Delivery
- Cloudflare for Custom Domain Management
- Terraform for Infrastructure as Code
What You’ll Be Doing
We are currently migrating our core application off Heroku to AWS to improve scaling and reduce operational spend. Our application has a unique traffic pattern because multiple large live-streamed events occur nearly daily. We understand that the roots of many of these scaling issues are buried within our code base, but you’ll be able to help us scale in the meantime, giving us plenty of runway to fix the code. Additionally, you will help with all aspects of migrating our apps off Heroku and onto AWS.
Beyond our immediate needs, you’ll help us scale and fine-tune our architecture, infrastructure, and monitoring. You’ll use your skills to help us expand our automation capabilities and develop tooling to support our developers as we grow our team even more. We have multiple exciting initiatives on the horizon (like deploying Kubernetes, database scaling, and streamlining our CI/CD infrastructure), and we need a passionate and experienced pro who can help us scale and improve through infrastructure and automation.
Here’s a rough run-down of responsibilities:
- Design and implement a cloud-based infrastructure for existing and new applications, including our core application and supporting microservices.
- Develop automated solutions to monitor and alert on performance & stability in our cloud systems.
- Partner with engineers to envision, implement and improve the current deployment process and identify cross-project dependencies.
- Champion, implement, and enforce CI/CD best practices, and monitor for failures.
- Help set standards for services and software to streamline test and release cycles and improve system maintenance.
- Support our SDLC through automation, tooling, and monitoring, and help build and maintain comprehensive documentation of our infrastructure and tools.
- Constantly review and update our infrastructure to ensure we are scalable and handling end-user demand.
- Collaborate with the support team and engineers to troubleshoot production alerts, addressing them in the short term and preventing them in the long term.
- Constantly update alert thresholds to help identify problems and reduce noise.
- Lead and coach the team on how to better monitor solutions, to ensure we have a full understanding of how features/systems are performing.
- Create and modify dashboards to show overall platform health.
- Ensure frameworks and dependencies are up to date and have correct open-source licenses.
- Participate in project planning meetings to share your point of view of system options, impact, risk, and costs vs. benefits. Communicate current operational requirements and development predictions.
- Organize and participate in on-call duties for production issues. Don’t worry – you won’t be on-call all the time for all the things. We simply ask that you be included in the rotation. We activate our on-call system maybe twice a year as most incidents occur during business hours.
Requirements
- 3+ years of experience in DevOps, specifically for web-based SaaS products
- 2+ years of experience with AWS
- 2+ years of experience with Docker
- 2+ years of proficiency with Kubernetes
- 2+ years of experience with CircleCI or other industry-leading CI/CD
- 2+ years of experience with ArgoCD or other Kubernetes delivery framework
- 2+ years of experience with Terraform for Infrastructure as Code
- 1+ year of experience with Cloudflare for Custom Domain Management
- Experience working with and scaling PostgreSQL, Redis, Memcache, and MongoDB
- Strong experience with observability tools, APM tools, and cloud monitoring tools
- Experience migrating between cloud providers and/or building multi-cloud environments
- Concrete understanding of DevSecOps principles
- Enough Ruby knowledge to understand how our Ruby application runs in a Docker container
- Excellent communication skills
- Currently living in Latin America
What We’re Looking For (the intangibles that often don’t get mentioned):
- Have a strong sense of ownership over our infrastructure
- Bring new ideas and a fresh perspective to the table; provide guidance and leadership in regard to security and best practices
- Be able to absorb feedback, take direction, and operate within boundaries without ego
- Think critically and find repeatable solutions for issues and challenges
- Have excellent time management skills to meet deadlines while delivering exceptional work
- Demonstrate vision, courage, respect, and accountability
- Be a respectful, flexible team member, and overall good human
- Be authentic, engaged, and endlessly collaborative
The Nice-To-Haves (but not required):
- Experience deploying and scaling monolithic web applications built using the Ruby on Rails framework.
- Experience deploying, monitoring, and operating applications deployed on the Heroku PaaS.
- Experience with database administration, specifically PostgreSQL and maybe a little MongoDB.
- Experience monitoring and scaling background services built with Sidekiq Pro and using Redis as a FIFO queue for jobs.
Our Recruitment Process
- 15-minute Initial Call
- 20-minute take-home skills test
- 30-minute Call with Recruiter (project, benefits, etc.)
- Interviews directly with the client (the number of interviews may vary depending on the project and may include a code assessment)
- Final Offer!
Benefits
- Work Remote Monday - Friday, 40 hours a week (no weekends)
- Vacation: 10 business days a year
- Holidays: 5 National Holidays a year
- Company Holidays: 5 Company Holidays a year (Christmas Eve, Christmas Day, New Year’s Eve, New Year’s Day, Zipdev Day)
- Major Medical Insurance
- Active Lifestyle/Gym Reimbursement
- Quarterly Home Office Reimbursement
- Performance-based Bonus
- Continuous Education Bonus
- Access to Training and Professional Development Platforms
- Did we mention it’s REMOTE?