Senior Lead – Site Reliability Engineer Java / APIs @Kyndryl
DevOps / Sysadmin
Salary: unspecified
Location: Remote
Job type: Full-time
Posted: 2d ago

2d ago - Kyndryl is hiring a remote Senior Lead – Site Reliability Engineer Java / APIs. 💸 Salary: unspecified 📍Location: Worldwide

Role Description

Ensure the stability, availability, resilience, and scalability of critical systems so that Java applications, microservices, APIs, and batch chains run on schedule and meet their reliability objectives. The role covers evidence, traceability, proactive monitoring, and failure recovery, and includes leading operations under SRE, DevOps, and continuous improvement principles.

  • Reliability, Incident, and AMS Operations Management
    • Lead the management of critical incidents and major problems, ensuring root cause analysis (RCA) and remediation plans.
    • Oversee the operation and reliability of batch jobs and process chains (Java and SAP) using Control‑M, including dependencies, calendars, alerts, and SLAs.
    • Ensure continuous monitoring of critical transactional systems (Salesforce, Tandem, OmniPayments, or equivalent).
    • Define and validate escalation, communication, and resolution criteria according to the AMS operating model.
    • Ensure proper ticket management in ServiceNow / Jira, including evidence‑based closure and SLA compliance.
  • Reliability Engineering and Advanced Technical Analysis
    • Design, implement, and evolve Site Reliability Engineering practices, including SLIs, SLOs, and SLAs.
    • Drive operational automation and toil reduction.
    • Analyze complex failures across Java applications, microservices, integrations, and cloud platforms.
    • Validate end‑to‑end flows, system dependencies, and single points of failure.
    • Lead post‑incident reviews and propose structural reliability improvements.
  • Technical Leadership and Cross‑Functional Coordination
    • Act as the technical lead for the AMS service, guiding support, operations, and development engineers.
    • Coordinate with Java development teams, architects, DevOps, database, network, and security teams.
    • Supervise vendor and technology partner activities.
    • Ensure alignment between business needs, software architecture, and production operations.
    • Participate in release planning, change windows, and post‑deployment stabilization.
  • Platforms, Development, and Architecture
    • Provide expert support and technical leadership in Java 8/11/17, Spring Boot, Microservices, and REST/SOAP APIs.
    • Support messaging platforms (Kafka).
    • Integrate cloud applications (Azure / GCP) with legacy systems.
    • Apply design patterns, development best practices, and corporate standards.
    • Use quality and security tools such as SonarQube, BlackDuck, Fortify, AquaSec.
    • Manage version control and collaboration using Git and Bitbucket.
    • Implement and mature DevOps and CI/CD practices: Docker, Jenkins, Shell scripting.
  • Monitoring, Backups, and Service Continuity
    • Oversee monitoring, alerting, and observability strategies.
    • Ensure correct execution of backup and restore processes: CommVault Simpana, Veeam Backup & Replication.
    • Validate recovery testing and service continuity plans.
    • Ensure compliance with security and operational policies.
  • Documentation, Continuous Improvement, and Governance
    • Maintain up‑to‑date technical and operational documentation, including architecture diagrams, procedures, runbooks, and postmortems.
    • Identify improvement opportunities in reliability, performance, security, and cost optimization.
    • Drive automation and service standardization initiatives.
    • Ensure compliance with methodologies, best practices, and IT governance guidelines.

Qualifications

  • Technical Knowledge
    • Operating systems: Windows, Linux, Unix (commands, log analysis, shell).
    • Networking and security: Networking concepts, cybersecurity fundamentals, and access control.
    • Databases: SQL Server, Oracle, PL/SQL.
    • Development and platforms: Java 8/11/17, Spring Boot, REST/SOAP APIs, Kafka, Microservices architectures.
    • Cloud: Azure and/or GCP.
    • DevOps / SRE: Docker, Jenkins, CI/CD pipelines, automation.
    • Scheduling and operations: Control‑M.
    • ITSM: ServiceNow, Jira.
    • Quality and security: SonarQube, BlackDuck, Fortify, AquaSec.
  • Soft Skills
    • Technical leadership and decision‑making under pressure.
    • Clear communication with technical teams and business stakeholders.
    • Structured analysis and resolution of complex problems.
    • Organization, self‑management, and prioritization in 24/7 environments.
    • Collaborative mindset and cross‑functional influence.
    • Results‑driven approach with a focus on reliability and continuous improvement.

Requirements

  • Bachelor’s or Engineering degree in:
    • Systems Engineering
    • Computer Science
    • Information Technology
    • Software Engineering or related fields.
  • Preferred:
    • Java certifications
    • ITIL / ITSM
    • Cloud certifications (Azure / GCP)
    • DevOps / SRE certifications
  • Senior Lead / Expert profile: 8+ years of experience in IT, with strong focus on:
    • Production support
    • AMS operations
    • Java development and architecture
    • Reliability and availability of mission‑critical systems
  • Preferred:
    • Experience in retail / El Palacio de Hierro
    • Participation in implementations, stabilizations, and migrations
    • Availability for 24/7 operations, on‑call rotations, and change windows
    • Proven experience leading technical teams

Benefits

  • Dynamic, hybrid-friendly culture that supports well-being and empowers growth.
  • Be Well programs designed to support financial, mental, physical, and social health.
  • Impactful work that powers the systems customers rely on every day.
  • Personalized development goals aligned with ambitions and continuous feedback.
  • Access to cutting-edge learning opportunities, including certifications with Microsoft, Google, and Amazon.
  • Culture that values empathy, restless learning, and shared success.