
Staff Site Reliability & DevOps Engineer - Observability @Brandwatch

Devops

Salary: unspecified
Remote Location: Bulgaria
Employment Type: full-time
Posted: 3wks ago


3wks ago - Brandwatch is hiring a remote Staff Site Reliability & DevOps Engineer - Observability. Salary: unspecified. Location: Bulgaria.

Role Description

This role focuses on designing, operating, and evolving observability platforms with a strong emphasis on metrics, logging, and alerting. The primary tooling is Grafana and Prometheus, with responsibility for ensuring production systems are observable, reliable, and operable at scale. The role works closely with platform, infrastructure, and application teams.

Design, build, and operate observability platforms based on Grafana and Prometheus
Define and maintain metrics standards, dashboards, alerts, and SLOs
Improve signal quality: reduce alert noise, tune thresholds, and improve runbooks
Support incident response by providing actionable telemetry and post-incident analysis
Integrate metrics, logs, and traces across distributed systems
Work with engineering teams to instrument services correctly
Automate observability configuration using infrastructure as code
Contribute to reliability improvements through capacity planning and performance analysis

Qualifications

Strong experience with Prometheus (scraping, federation, recording rules, alerting)
Strong experience with Grafana (dashboards, alerting, templating, RBAC)
Solid Linux and networking fundamentals
Experience running observability stacks in Kubernetes environments
Infrastructure as code experience (Terraform preferred)
Familiarity with incident management and on-call practices
Ability to debug production systems using metrics and logs

Requirements

Experience with logs and traces (e.g. Loki, Tempo, OpenTelemetry)
Experience operating large-scale or multi-cluster Kubernetes platforms
Experience with cloud platforms (GCP, AWS, OCI)
Exposure to SRE concepts such as error budgets and SLO-driven prioritisation

Benefits

Engineers trust dashboards and alerts to reflect system health
Incidents are detected earlier and diagnosed faster
Alert fatigue is reduced and on-call quality improves
Observability is treated as a first-class platform capability


Before You Apply

Be aware of the location restriction for this remote position: Bulgaria
Beware of scams! When applying for jobs, you should NEVER have to pay anything.
