Coderio

Monitoring and Observability Analyst Sr. (Sat, Sun, Holidays)

Posted 7 Days Ago

5 Locations

In-Office or Remote

Mid level

Software • Design • App development

Accelerate Your Digital Transformation

The Role

As a Monitoring and Observability Analyst, you will design and maintain monitoring systems for IT infrastructure, collaborating with DevOps teams to enhance incident resolution and service availability.

Summary Generated by Built In

About Us

Coderio designs and delivers scalable digital solutions for global businesses. With a strong technical foundation and a product mindset, our teams lead complex software projects from architecture to execution. We value autonomy, clear communication, and technical excellence. We work closely with international teams and partners, building technology that makes a difference.

🌍 Learn more: http://coderio.com

In this role, as a Monitoring and Observability Analyst, you will design, implement, and maintain proactive monitoring and alerting systems to ensure the availability, performance, and health of IT infrastructure, applications, and services. Your main focus will be on designing end-to-end monitoring solutions using metrics, logs, and traces; configuring alert thresholds (SLIs/SLOs) based on business impact; and supporting incident resolution by providing detailed monitoring data for Root Cause Analysis (RCA). You will work closely with Operations and Development (DevOps) teams to minimize MTTR (Mean Time to Recovery) and support the continuous improvement of the ecosystem.

The role of Monitoring and Observability Engineer/Analyst is critical to our operation and requires continuous coverage (24/7).

Because we support infrastructure in the United States, all shifts and holidays follow U.S. time zones and the U.S. holiday calendar.

Saturdays, Sundays, and all U.S. holidays require 24-hour coverage, which is divided into two 12-hour shifts.

It is essential that you have the availability and willingness to work this shift pattern (evening/night and weekends/holidays) to ensure service continuity and compliance with our SLAs.

What to Expect in This Role (Responsibilities)

Contribute to the definition of the company's observability strategy, aligned with industry best practices (SRE/DevOps).

Design and implement end-to-end monitoring solutions.

Configure alert thresholds (SLIs/SLOs) based on business impact and minimize notification noise.

Develop and maintain informative and visually clear dashboards (e.g., Grafana, Kibana) for real-time visibility.

Implement and optimize monitoring automation, from agent deployment to automatic alert response (basic to intermediate AIOps).

Administer and maintain monitoring platforms (updates, patches, cost optimization).

Create and maintain technical documentation (runbooks, monitoring procedures, service maps).

Requirements

Minimum 3 years of experience in Monitoring, IT Operations, or SRE roles.

Advanced experience with one or more monitoring platforms: Prometheus/Grafana, ELK Stack, New Relic, Datadog or similar.

Proficiency in monitoring cloud environments (AWS/Azure/GCP) and containers (Docker, Kubernetes).

Solid understanding of log aggregation (Fluentd, Logstash, Loki) and distributed tracing (Jaeger, Zipkin, OpenTelemetry).

Practical experience with scripting languages (e.g., Python, Bash) for task automation and the development of custom checks.

Deep knowledge of Linux operating systems.

Strong analytical skills: the ability to correlate events and data from multiple sources to identify the root cause of complex problems.

Proactive mindset: the ability to anticipate problems instead of just reacting to alerts.

Excellent oral and written communication skills.

Experience in a collaborative work environment with a DevOps mindset.

Bachelor's degree in Systems Engineering, Computer Science, or a related field.

Nice to Have

Certifications related to Cloud (AWS, Azure).

Certifications related to Observability Platforms (Datadog, Dynatrace).

Certifications related to DevOps/SRE practices.

Understanding of basic networking concepts (TCP/IP, DNS, Load Balancers).

Benefits

100% remote

Long-term commitment, with autonomy and impact

Strategic and high-visibility role in a modern engineering culture

Collaborative international team and strong technical leadership

Clear path to growth and leadership within Coderio

Why join Coderio?

At Coderio, we value talent regardless of location. We are a remote-first company, passionate about technology, collaborative work, and fair compensation. We offer an inclusive, challenging environment with real opportunities for growth. If you are motivated to build solutions with impact, we are waiting for you.

Apply now.

Top Skills

AWS

Azure

Bash

Datadog

Docker

ELK Stack

Fluentd

GCP

Grafana

Jaeger

Kubernetes

Logstash

Loki

New Relic

OpenTelemetry

Prometheus

Python

Zipkin

The Company

HQ: Miami, FL

223 Employees

Year Founded: 2017

What We Do

Accelerate your digital transformation with our expert nearshore engineering teams, from experienced Software Engineers who augment your tech team to fully managed expert Development Squads. We design, engineer, and deliver customized technology solutions for companies of every size.

We can assemble your enterprise-level dev squad within 7 days. Scale fast with our on-demand timezone-aligned software development talent.

Contact us: [email protected]